· ai · 2 min read

Meta presents Voicebox, a text-to-speech neural network up to 20 times faster than similar models

This article discusses Meta's presentation of its Voicebox text-to-speech neural network, which can create audio tracks from scratch and edit existing samples, and works up to 20 times faster than similar models. The article notes that Voicebox surpasses other neural networks in terms of sound quality and speech clarity, but currently only works in six languages.

Voicebox from Meta is more than an AI text-to-speech converter. The neural network can create audio tracks from scratch and edit existing samples, for example to remove noise from them. Voicebox also needs only a 2-second audio reference to copy a voice.

So far the model works in only six languages: English, French, German, Spanish, Polish and Portuguese. But, as always, the developers plan to expand the list.

According to the developers, Voicebox surpasses other neural networks, such as VALL-E and YourTTS, in sound quality and speech clarity, while working up to 20 times faster.

Precisely because of these impressive capabilities, Meta is in no hurry to release the AI model widely: the company is afraid that the neural network will be abused.

As for the disadvantages, so far we can say that Voicebox handles colloquial speech worse, since it was trained on audiobooks. But we think this is also temporary.

And if you want to create content for social media promotion just as easily and quickly, staying ahead of competitors, then you should learn more about the SMMart AI tool!

Our neural network generates post text and matching images at your request in just a few clicks. Read more on our website and get a personal AI assistant today.

#NeuralNetworks #ArtificialIntelligence #AI #SMM #SMMart #Voicebox