Microsoft's Azure AI Speech Service Launches Text-to-Speech Virtual Avatar Function

By: Maxwell. Published 2024-08-22T23:34:10Z

TapTechNews, August 23: Microsoft's Azure AI Speech Service lets developers build multilingual generative AI speech applications. The service has now launched its Text to Speech Avatar feature, which converts plain text into video of a lifelike avatar speaking naturally.


Microsoft has announced the full launch of the Text to Speech Avatar feature, which lets developers create personalized virtual avatars for their users. The service outputs video at 1920x1080 resolution and 25 frames per second. TapTechNews attaches an example below:

[Image: Text to Speech Avatar example]

Text to Speech Avatar offers the following capabilities:

Converts text into video of a human speaking, with a natural-sounding voice powered by Azure AI Text-to-Speech.

Provides different preset characters.

The voice of the characters is generated by Azure AI Text-to-Speech.

Synthesizes text-to-speech avatar videos either asynchronously, through the batch synthesis API, or in real time.

Provides a content creation tool in Speech Studio for producing video content without coding.

Enables real-time avatar conversations through the real-time chat avatar tool in Speech Studio.
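
To illustrate the asynchronous path, here is a minimal Python sketch that builds, but does not send, a batch synthesis request. The endpoint path, API version, and JSON field names are assumptions modelled on Microsoft's public samples and should be checked against the current reference. The voice, character, and style values are placeholders, and `build_avatar_job` is a hypothetical helper name.

```python
import json
import uuid
import urllib.request


def build_avatar_job(region, key, text, job_id=None):
    """Build (without sending) a PUT request for a batch avatar synthesis job."""
    job_id = job_id or str(uuid.uuid4())
    # Assumed endpoint shape and API version; verify against the current docs.
    url = (f"https://{region}.api.cognitive.microsoft.com"
           f"/avatar/batchsyntheses/{job_id}?api-version=2024-08-01")
    body = {
        "inputKind": "PlainText",
        "inputs": [{"content": text}],
        "synthesisConfig": {"voice": "en-US-JennyNeural"},  # placeholder voice
        "avatarConfig": {
            "talkingAvatarCharacter": "lisa",          # placeholder preset character
            "talkingAvatarStyle": "graceful-sitting",  # placeholder style
            "videoFormat": "mp4",
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


request = build_avatar_job("westeurope", "<your-key>",
                           "Hello from the avatar.", job_id="demo-job")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit the job. Because batch synthesis is asynchronous, a client would then typically poll the job until it reports completion and download the resulting video.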

On pricing, the text-to-video service is billed per second of video output. The service is now available in the Southeast Asia, North Europe, West Europe, Sweden Central, South Central US, and West US 2 regions.
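
Since billing is per second of output, the cost grows linearly with video length. The sketch below shows the arithmetic only: the rate is hypothetical (the article quotes no price), and rounding a partial second up to a full second is an assumption, not a documented billing rule.

```python
import math
from decimal import Decimal


def estimate_avatar_cost(duration_seconds, rate_per_second):
    """Estimate the charge for one video billed per second of output."""
    # Assumption: a partial second is billed as a full second.
    billable_seconds = math.ceil(duration_seconds)
    return billable_seconds * Decimal(rate_per_second)


# Hypothetical rate purely for illustration; not an actual Azure price.
cost = estimate_avatar_cost(90.4, "0.01")
print(cost)
```

Passing the rate as a string keeps the `Decimal` arithmetic exact, which avoids floating-point drift when many short clips are summed.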
