Tencent Team Releases New Image-to-Video Generation Model

TapTechNews reported on June 7 that Tencent's Hunyuan team, in collaboration with Sun Yat-sen University and the Hong Kong University of Science and Technology, has launched a new image-to-video generation model, 'Follow-Your-Pose-v2'. The work has been published on arXiv (DOI: 10.48550/arXiv.2406.03035).


According to the introduction, 'Follow-Your-Pose-v2' needs only a portrait image and a video clip of an action as input; the person in the image then moves following the action in the clip, and the generated video can be up to 10 seconds long.
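
The team has not published a code API alongside the paper, but the described interface is a simple two-input pipeline. The following is a minimal illustrative sketch of what invoking such a model could look like; `FollowYourPoseV2Pipeline`, its loader, and the file names are hypothetical stand-ins, not the authors' actual code:

```python
# Hypothetical sketch only: 'FollowYourPoseV2Pipeline' is an assumed
# wrapper mirroring the interface described in the article, which is
# one portrait image plus one driving-action clip in, a short video
# (up to ~10 s) of the animated portrait out.
from PIL import Image

pipe = FollowYourPoseV2Pipeline.from_pretrained("follow-your-pose-v2")  # assumed loader

portrait = Image.open("portrait.png")      # the still image to animate
result = pipe(
    image=portrait,
    driving_video="action_clip.mp4",       # clip providing the pose/action sequence
    max_seconds=10,                        # article reports outputs of up to 10 seconds
)
result.save("animated_portrait.mp4")
```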

Compared with previously released models, 'Follow-Your-Pose-v2' supports multi-person action generation while consuming less inference time.

In addition, the model generalizes well: it can produce high-quality videos regardless of the subject's age or clothing, how cluttered the background is, or how complex the action in the driving video is.


As TapTechNews reported the day before, Tencent had already announced an acceleration library for its open-source Hunyuan text-to-image model (Hunyuan DiT), claiming a significant improvement in inference efficiency, with image generation time cut by 75%.

Tencent also said the barrier to using the Hunyuan DiT model has been greatly lowered: users can now access Hunyuan's text-to-image capability through the graphical interface of ComfyUI.
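
ComfyUI workflows are built as node graphs in the GUI, but the same graphs can be submitted programmatically to the local server's `/prompt` endpoint. A minimal sketch, assuming ComfyUI is running on its default local port 8188 and that a Hunyuan DiT workflow has already been exported from the GUI via "Save (API Format)" (the JSON file name and the graph's contents are assumptions):

```python
import json
import urllib.request

# Load a workflow graph previously exported from the ComfyUI GUI.
# 'hunyuan_dit_workflow.json' is a placeholder name; the actual graph
# would contain the Hunyuan DiT loader and sampler nodes set up in the GUI.
with open("hunyuan_dit_workflow.json") as f:
    workflow = json.load(f)

# ComfyUI's local server accepts a JSON-wrapped graph on /prompt
# and queues it for execution (default address shown).
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # response includes a prompt_id for the queued job
```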
