Microsoft has a new text-to-speech AI tool to wow and annoy us
Microsoft unveils text-to-speech tool that can mimic anybody’s voice
It seems that 2023 is the year of artificial intelligence (AI), and Microsoft is the latest company keen to get in on the action.
Researchers from the company have posted a paper detailing a new technology that would see huge leaps forward in text-to-speech tools.
A summary on the paper explains how the technology, which is being called VALL-E, “emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt.”
What this means in simple terms is that the tool can break down what makes a person sound the way they do, using phoneme and acoustic code prompts thanks to Meta’s EnCodec, and generate speech that more closely mimics what the person may sound like beyond the three seconds of sample voice recording. The early stages of VALL-E have been made possible by analyzing over 60,000 hours’ worth of English language voice recordings.
The GitHub post surfaces a number of examples of how the technology can be used, including maintaining emotional cues and even environmental effects, such as the disconnected sound that’s typical of a phone conversation.
The paper briefly mentions the potential implications of such text-to-speech tools, which are increasingly important at a time when AI has raised ethical concerns that we’d previously only dreamt of (or had nightmares about).
In fact, any number of problems could arise, from false recordings giving permission to something (consider the number of banks that use telephone-based voice recognition authentication) to a whole lot worse.
The paper’s conclusion states that VALL-E “may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a specific speaker.” Benj Edwards of Ars Technica has also noted that Microsoft is yet to share the project’s code for anybody else to try out, indicating that the potential risks are still being considered.
With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!