Kuaishou's Text-to-Image Model 'Ketu' Opens to the Public

By: Jacob | Published: 2024-05-30T14:58:36Z

According to TapTechNews on May 30, Kuaishou's self-developed text-to-image large model, Ketu, was recently opened to the public. It currently supports two functions, text-to-image and image-to-image, which can be used for AI image generation and AI image customization.

Users can access it through the Ketu WeChat mini-program and the web version. According to Interface News, this is the first time Kuaishou has opened one of its self-developed large models to the public. The report cited informed sources as saying that the Ketu model has parameters on the order of billions. Its data come from open-source communities, Kuaishou's internal data construction, and synthesis with self-developed AI technology, covering tens of millions of common Chinese entity concepts. The model also introduces reinforcement learning and reward model technology (RLHF) to improve results on long text and semantically complex input.

The report said that Kuaishou defined its large model application strategy this year around three directions: understanding, interaction, and generation. Specific application scenarios include platform-wide content understanding with large models, AI interaction, digital humans, and AIGC in business scenarios.


In TapTechNews' hands-on testing, Ketu offers text-to-image generation in a variety of anime themes and realistic portrait styles. These include the Red Diamond Noble style that was popular early this century, the recently popular Clay World style, and many different painting styles. Users can generate up to 4 pictures at a time.


According to a previous TapTechNews report, Cheng Yixiao, the founder and CEO of Kuaishou, said in March this year that he was confident the comprehensive performance of the company's large model could reach the level of GPT-4 within the next six months. He also said that the comprehensive performance of Ketu, Kuaishou's text-to-image large model, had exceeded that of Midjourney V5.
