A year later, OpenAI still hasn’t released its voice cloning tool

In late March 2024, OpenAI announced a “small-scale preview” of an AI service, Voice Engine, that the company claimed could clone a person’s voice from just 15 seconds of speech. Almost a year later, the tool remains in preview, and OpenAI has given no indication of when it might launch – or whether it will launch at all.

The company’s reluctance to roll out the service widely may point to fears of misuse, but it could also reflect an effort to avoid regulatory scrutiny. OpenAI has historically been accused of prioritizing “shiny products” at the expense of safety, and of rushing releases to beat rival companies to market.

In a statement, an OpenAI spokesperson told TechCrunch that the company continues to test Voice Engine with a limited group of “trusted partners.”

“[We’re] learning from how [our partners are] using the technology so that we can improve the model’s usefulness and safety,” the spokesperson said. “We’ve been excited to see the different ways it’s being used, from speech therapy, to language learning, to customer support, to video game characters, to AI avatars.”

Held back

Voice Engine, which powers the voices available in OpenAI’s text-to-speech API as well as ChatGPT’s Voice Mode, generates natural-sounding speech that closely resembles the original speaker. The tool converts written characters to speech, limited only by certain guardrails on content. But it was subject to delays and shifting release windows from the start.

As OpenAI explained in a June 2024 blog post, the Voice Engine model learns to predict the most probable sounds a speaker will make for a given text transcript, taking into account different voices, accents, and speaking styles. The model can then generate not only spoken versions of text, but also “spoken utterances” that reflect how different types of speakers would read a text aloud.

OpenAI initially intended to bring Voice Engine, originally called Custom Voices, to its API on March 7, 2024, according to a draft blog post seen by TechCrunch. The plan was to give a group of up to 100 “trusted developers” access ahead of a wider debut, with priority given to devs building apps that provided a “social benefit” or showed “innovative and responsible” uses of the technology. OpenAI had even trademarked and priced it: $15 per million characters for “standard” voices and $30 per million characters for “HD quality” voices.

Then, at the eleventh hour, the company postponed the announcement. OpenAI ended up unveiling Voice Engine a few weeks later without a sign-up option. Access to the tool would remain limited to a group of about 10 devs that the company began working with in late 2023, OpenAI said.

“We hope to start a dialogue on the responsible deployment of synthetic voices and how society can adapt to these new capabilities,” OpenAI wrote in Voice Engine’s announcement blog post in late March 2024. “Based on these conversations and the results of these small-scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

Long in the works

Voice Engine has been in the works since 2022, according to OpenAI. The company claims it demoed the tool to “global policymakers at the highest levels” in the summer of 2023 to showcase its potential and its risks.

Several partners have access to Voice Engine today, including the startup Livox, which builds devices that enable people with disabilities to communicate more naturally. CEO Carlos Pereira told TechCrunch that while Livox ultimately wasn’t able to build Voice Engine into a product because of the tool’s online requirement (many Livox customers do not have internet access), he found the technology “really impressive.”

“The quality of the voice and the possibility of having the voices speaking in different languages is unique – especially for people with disabilities, our customers,” Pereira told TechCrunch via email. “It is really the most impressive and easy-to-use [tool to] create voices that I’ve seen […] We hope that OpenAI develops an offline version soon.”

Pereira says he hasn’t received guidance from OpenAI about a possible Voice Engine launch, nor seen any signs that the company plans to start charging for the service. So far, Livox hasn’t had to pay for its usage.

In the aforementioned post, OpenAI hinted that one of its considerations in delaying Voice Engine was the potential for abuse during last year’s U.S. election cycle. Informed by discussions with stakeholders, Voice Engine has several mitigating safety measures, including watermarking to trace the provenance of generated audio.

Developers must obtain “explicit consent” from the original speaker before using Voice Engine, according to OpenAI, and they must make “clear disclosures” to their audience that the voices are AI-generated. The company hasn’t said how it is enforcing these policies, however. Doing so could prove very difficult, even for a company with OpenAI’s resources.

In its blog posts, OpenAI has also indicated that it hopes to build a “voice authentication experience” to verify speakers and a “no-go” list that prevents the creation of voices that sound too similar to prominent figures. Both are technologically ambitious projects, and getting them wrong would reflect badly on a company that is often accused of sidelining safety initiatives.

Effective filtering and identity verification have all but become baseline requirements for responsible releases of voice cloning technology. AI voice cloning was the third fastest-growing scam of 2024, according to one source. It has led to fraud and to bank security checks being bypassed as privacy and copyright laws struggle to keep pace. Malicious actors have used voice cloning to create incendiary deepfakes of celebrities and politicians, and those deepfakes have spread like wildfire across social media.

OpenAI could launch Voice Engine next week – or never. The company has repeatedly said it is weighing keeping the service small in scope. But one thing is clear: for optics reasons, safety reasons, or both, Voice Engine’s limited preview has become one of the longer ones in OpenAI’s history.
