OpenAI Unveils Voice Engine: Recreating Voices with AI

OpenAI, the artificial intelligence startup, has unveiled Voice Engine, a system that can recreate a person’s voice, following its earlier tools for generating images and full-motion video. Users upload a 15-second recording of their voice along with a paragraph of text, and the system reads the text back in a synthetic voice that sounds remarkably like their own.

But OpenAI is proceeding with caution. The company is currently restricting access to Voice Engine to a small group of businesses as it grapples with understanding and mitigating the potential dangers associated with the technology. Similar to image and video generators, a voice generator could be weaponized for malicious purposes, spreading disinformation on social media or enabling criminals to impersonate others online or even during phone calls.

OpenAI’s product manager, Jeff Harris, emphasized the importance of getting it right, especially considering the potential ramifications for voice authenticators used in online banking and other personal applications. To counter these risks, OpenAI is exploring options such as watermarking synthetic voices or implementing controls to prevent the use of voices belonging to politicians or other public figures.

This is not the first time OpenAI has exhibited caution when introducing new technology. In February, when it unveiled its video generator, Sora, the company showcased its capabilities but did not make it publicly available. OpenAI joins a growing cohort of companies, including tech giants like Google, as well as startups like ElevenLabs, that have developed AI technology capable of generating synthetic voices. Businesses can leverage these tools for various purposes, such as creating audiobooks, giving voice to chatbots, or even establishing automated radio stations.

While OpenAI has previously employed voice actors to build its array of voices, Voice Engine represents a significant advancement. It allows individuals and businesses to recreate voices from short clips, which also creates new risks. Harris warns of the dangers this technology could pose, especially in an election year, referencing the robocalls received by New Hampshire residents in January. These calls, which discouraged voters from participating in the state primary, featured a voice most likely artificially generated to sound like President Joe Biden. The Federal Communications Commission subsequently outlawed robocalls that use AI-generated voices.

However, OpenAI sees more than just potential hazards in its new technology. Harris says the company has no immediate plans to monetize Voice Engine and highlights the tool’s potential benefits for people who have lost their voices to illness or accidents. As an example, he describes how the technology was used to recreate the voice of a woman whose voice was damaged by brain cancer. Working from a brief recording of a presentation she made as a high schooler, Voice Engine allows her to speak again.

OpenAI’s Voice Engine represents the cutting edge of AI technology. While the undeniably powerful tool raises concerns about its potential misuse, the company’s cautious approach and exploration of protective features show a conscientious commitment to responsible innovation. As AI technology continues to progress, OpenAI’s development of Voice Engine stands as an impressive milestone in the ever-evolving landscape of artificial intelligence.


Written By

Jiri Bílek

In the vast realm of AI and U.N. directives, Jiri crafts tales that bridge tech divides. With every word, he champions a world where machines serve all, harmoniously.