Ant Group Open-Sources EchoMimic, a Project for Generating Portrait Videos from Audio and Facial Landmarks

By: Maxwell Published 2024-07-11T08:49:35Z

TapTechNews, July 11: On July 10, Ant Group open-sourced a new project named EchoMimic, which drives a portrait photo with audio so that the person in the image appears to mouth the words. It combines facial landmarks with the audio content to generate relatively stable, natural-looking videos.


The project emphasizes stability and naturalness. By fusing audio features with facial landmarks (key facial features and structures, usually located at the eyes, nose, mouth, etc.), it generates videos that more closely follow real facial movements and expression changes.

It can generate portrait videos from audio alone or from facial landmarks alone, and it can also combine audio with a portrait photo to make the subject appear to mouth the words. It reportedly supports multiple languages (including Mandarin Chinese and English) and multiple styles, and can also handle singing and other scenarios.
