Submitted by Randoms001 t3_10fdbjc in Futurology

As artificial intelligence advances at a rapid pace, it is not unusual for people to worry about the possible ramifications for human labour. The recent announcement by a team of Microsoft researchers that they have built a new AI system capable of mimicking a human voice from only a three-second audio sample adds fuel to these fears. The achievement illustrates the potential for AI not only to automate a wide range of jobs, but also to reproduce human capabilities and skills with greater precision and efficiency. The consequences are substantial, since the breakthrough raises key questions about the future of labour and the role of AI in it.

Microsoft's recent announcement of Vall-E, a cutting-edge artificial intelligence technology for voice impersonation, has drawn both attention and alarm from the tech industry. The system, which works with discrete codes derived from a neural audio codec model and was trained on 60,000 hours of speech data from over 7,000 speakers, is capable of reproducing a human voice with astonishing precision and nuance.

Vall-E works by analysing a speaker's speech, breaking it down into its components, and using this information to synthesise the same voice saying other words. It is built on EnCodec, a neural audio codec that Meta unveiled in October 2022. This approach lets the system reproduce not just the speaker's timbre and pitch, but also their emotional tone, using only a three-second audio sample.
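For readers wondering what "discrete codes" means here, the idea can be sketched with a toy residual vector quantizer, which to my understanding is the general family of technique that neural audio codecs such as EnCodec rely on. This is only an illustration under stated assumptions: the codebooks below are random rather than learned, the frame is a made-up four-number vector rather than real audio, and names like `rvq_encode` and `rvq_decode` are hypothetical and not taken from Vall-E or EnCodec.

```python
import random


def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(frame, codebooks):
    """Turn one frame into a list of integer codes, one per codebook stage."""
    residual = list(frame)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # The next stage only has to describe what this stage missed.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Rebuild an approximate frame by summing the chosen codebook entries."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Assumed toy setup: random codebooks stand in for learned ones.
rng = random.Random(0)
DIM, STAGES, ENTRIES = 4, 3, 16
codebooks = [
    [[rng.uniform(-1, 1) / (2 ** stage) for _ in range(DIM)] for _ in range(ENTRIES)]
    for stage in range(STAGES)
]

frame = [rng.uniform(-1, 1) for _ in range(DIM)]  # stand-in for one audio frame
codes = rvq_encode(frame, codebooks)
approx = rvq_decode(codes, codebooks)

print("codes:", codes)
print("original:", [round(x, 3) for x in frame])
print("rebuilt: ", [round(x, 3) for x in approx])
```

The point of the sketch is that a stretch of audio ends up as a short list of integers per frame, which is the kind of token sequence a language-model-style system can learn to predict from text plus a short voice sample.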

While Vall-E's capabilities are unquestionably impressive, they also raise major ethical concerns. As AI technology advances at a rapid pace, it is critical that we as a society confront the potential negative effects on employment and other sectors as soon as possible. This technology also underscores the importance of ongoing communication and collaboration among corporate leaders, legislators, and the general public to ensure that the development and deployment of AI align with societal values and interests.

Experiments with Microsoft's Vall-E have produced extremely promising results. According to the research paper, posted on arXiv (the preprint server hosted by Cornell University), the system "substantially exceeds" current state-of-the-art systems in terms of speech naturalness and speaker similarity. The paper also highlights Vall-E's ability to preserve the speaker's emotional inflection and acoustic environment in its synthesised speech.

Vall-E's capabilities are demonstrated on GitHub, where the system recreates a speaker's voice with a high degree of resemblance from only a three-second audio sample. While the voice sounds a little artificial in places, it is still quite good, and the potential for further progress is obvious.


Vall-E's potential uses are extensive, with Microsoft researchers envisioning it as a powerful tool for text-to-speech conversion, speech editing, and even audio content creation when combined with other generative AI models like GPT-3. Its release is expected to have a substantial influence on sectors that rely on voice imitation and text-to-speech technologies, and its continuing development will be closely watched.

As with any advanced technology, it is critical to understand the potential ramifications and hazards of using Vall-E. One of the key worries is the prospect of abuse, such as impersonating public figures or duping people into handing over sensitive information by posing as someone they know or trust. The system's capacity to mimic voices accurately could also defeat security systems that rely on voice identification.

Another source of concern is Vall-E's possible impact on employment, particularly in businesses that rely on voice actors. Because the system can mimic human voices at a substantially lower cost, demand for human voice actors may decline.


However, the Vall-E researchers have acknowledged these issues and said that steps can be taken to reduce the risks. It is feasible, for example, to build detection models that determine whether an audio sample was synthesised by Vall-E. The researchers have also committed to following Microsoft's AI Principles as they develop the system further.




QuestionableAI t1_j4xd4yp wrote

Oh, I am so fucking sure that this shit will not be used for nefarious purposes on perceived criminals, liberals, anyone in a protected group, and especially those who are victimized by Republican turds.


MinimumMonitor7 t1_j4xyfja wrote

Yeah, well. I'm so fucking sure the only people who will be spared from it being abused are the ones in political office, or their staff members. It will continue to not matter, just like it always has. Political parties are irrelevant. Trillions of dollars in budget for them to ignore what any constituent wants. And they continue to get away with everything.


Exact-Pause7977 t1_j4w5599 wrote

“Thou shalt not make a machine in the likeness of a human mind”

“Dune”, Frank Herbert

Mr Herbert was exploring these ideas decades ago in science fiction that portrayed the aftermath of AI and other associated technologies causing havoc on human civilization.

There are conversations we need to be having now about who and what we want to be… and how our technologies fit into our future… and how to channel them positively. Which AI future do we want… and how do we get there?

Asimov? Daneel Olivaw. Roddenberry? Lt. Cmdr. Data. Herbert? Omnius. Lucas? R2-D2.

The list goes on… and includes visions we don’t want. We need authors to be writing now to give the technologists a vision to discuss… to dream…


Into-the-Beyond t1_j4zf6st wrote

Personally, I’m concerned with the prospect of a scammer being able to use an AI robo call that screams to our aged parents in our own panicked voices saying they need them to immediately transfer all of their wealth to the scammer. The world is not ready for this.


Zemirolha t1_j4zjb6w wrote

Why would aged parents do it if they know how easy it is to pull off such a scam? Fake videos have been created for a long time now. Audio is even less complex.

If even they can clone their voices, they will develop enough critical thinking to avoid this kind of scam.


BigZaddyZ3 t1_j4zrzmv wrote

Yeah, because we all know old people are extremely tech-savvy and stay up to date on modern tech trends, right? And we all know older people don’t typically have impaired critical thinking skills due to their old age, right?


Zemirolha t1_j5b0l0e wrote

treat others as inferiors and they will feel inferior