OpenAI Announces Availability of GPT-4o's Voice Mode for Some ChatGPT Plus Users

TapTechNews July 31 news: On July 30 local time, OpenAI announced that it is opening GPT-4o's voice mode (TapTechNews note: Alpha version) to a small number of ChatGPT Plus users starting that day, with a gradual rollout to all ChatGPT Plus subscribers planned for this fall.


In May this year, OpenAI Chief Technology Officer Mira Murati said in a speech:

In GPT-4o, we have trained a brand-new unified end-to-end model across text, vision and audio, which means that all inputs and outputs are processed by the same neural network.

Since GPT-4o is our first model that combines all of these modalities, we are still in the early stages of exploring the model's capabilities and limitations.

OpenAI originally planned to invite a small number of ChatGPT Plus users to test the GPT-4o voice mode at the end of June this year, but the company announced a postponement in June, saying it needed more time to polish the model and improve its ability to detect and refuse certain content.

According to previously disclosed information, the average voice-response latency of the GPT-3.5 model is 2.8 seconds and that of the GPT-4 model is 5.4 seconds, which makes them poorly suited to voice conversation; GPT-4o is expected to cut this latency sharply, enabling nearly seamless conversation.

The GPT-4o voice mode responds quickly and sounds comparable to a real human voice. OpenAI even claims that it can perceive emotional intonation in speech, such as sadness, excitement, or singing.

OpenAI spokesperson Lindsay McCallum said that ChatGPT cannot imitate other people's voices, including those of individuals and public figures, and will block outputs that differ from its preset voices.
