SenseTime Unveils Vimi, a Controllable Human Video Generation Model

TapTechNews July 4th, at the World Artificial Intelligence Conference (WAIC), SenseTime released Vimi, which it bills as the first "controllable" large model for human video generation. From a photo in any style, Vimi can generate a human video consistent with a target action, and it supports multiple driving inputs, including existing human videos, animation, audio, and text.

SenseTime claims that, unlike photo-based expression control technologies that can only drive head expressions and movement, Vimi not only achieves precise control of facial expressions but also produces natural limb movement within the photo's half-body framing, automatically generating changes in hair, clothing, and background that match the subject.

At the same time, Vimi can stably generate a one-minute, single-shot human video whose image quality does not degrade or distort over time, meeting the needs of scenarios that demand long, stable video generation, such as interactive entertainment.

Vimi will be fully open to consumer users. They need only upload high-definition photos of themselves from different angles to automatically generate a digital avatar and portrait videos in a range of styles.

Characters in Vimi-generated videos are no longer limited to stiff facial movements; gestures, limbs, hair, and more move in concert, forming more complete and unified character motion and giving creators footage they can edit and build on.

SenseTime said it will announce more details about Vimi tomorrow, and TapTechNews will continue to follow the story and bring follow-up reports.
