Introducing Cross-Language Voice Cloning while preserving Speaker Accent

Today we're announcing a new feature that lets non-English speakers create English-speaking clones of their own voices. The cloned voice retains the speaker's original accent while speaking English. To use it, upload a few seconds of audio in your own language to the 'Instant Cloning' feature, which then creates the clone.

Introduction

Cross-Language Voice Cloning lets users clone a voice from another language into English while retaining the nuances of the original accent and language. For instance, a fluent Spanish speaker can use PlayHT's voice cloning service to upload a 30-second recording of themselves speaking Spanish. Our voice model then clones the voice and language, allowing the Spanish speaker to speak English. The model synthesizes speech while preserving the original speaker's accent and speaking style.
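Before uploading a recording, it can help to confirm that the clip is roughly the length described above. The following is a minimal sketch using only Python's standard `wave` module. The function names and the `MIN_SECONDS` / `MAX_SECONDS` bounds are assumptions made for illustration, not official PlayHT limits, and the sketch only handles uncompressed WAV files.

```python
import wave

# Assumed bounds for illustration only (not official PlayHT limits).
# The post mentions "a few seconds" and a 30-second example clip.
MIN_SECONDS = 3.0
MAX_SECONDS = 60.0


def clip_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_suitable_sample(path: str) -> bool:
    """Check that the clip length falls inside the assumed bounds."""
    return MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
```

Calling `is_suitable_sample("my_spanish_sample.wav")` (a hypothetical file name) returns `True` for a clip whose length falls within the assumed bounds, so it can serve as a quick local check before uploading to 'Instant Cloning'.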

The possibilities and use cases for this technology are vast, including dubbing, language learning, localization, and more. This new feature reaffirms our dedication to pushing the boundaries of what is possible with AI-generated voices.

Multilingual Text-to-Speech Synthesis and Cross-Language Voice Cloning

Cross-language cloning has been attempted before, but until now it has required hours of clean audio that is very hard to source, transcripts as input, fine-tuning, and hours of manual work to get satisfactory results.

It is possible to clone a voice without a transcript and with a small amount of data using conventional TTS models like Tacotron, but we always felt that the results could be better. Our model doesn't require large amounts of data and doesn't need transcripts as the input representation, yet the outcome is more than satisfactory.

Our Generative Voice model can capture the intonation and nuances of the original audio and carry them from the source language into the cloned language without the need for interpretation. This allows for seamless cross-language cloning, making it a powerful tool for multilingual text-to-speech applications.

What’s next in Multilingual Synthesis and Cross-Language Cloning?

With Multilingual Synthesis and Cross-Language Cloning, we've reached a significant milestone in our AI voice cloning work. With the ability to synthesize and clone voices in multiple languages, we are opening up new possibilities for businesses and individuals worldwide. Our market-based approach ensures that we are always working to meet the needs of our customers and the broader market, and we will continue to add new languages to our service as demand arises.

To learn more about PlayHT and our AI voice cloning service, sign up for free today or connect with us on our socials to stay up to date on our latest developments. We're truly excited to see what Cross-Language AI voices bring to content creation, and we look forward to seeing what you create!
