China Telecom and Zhiyuan Release Tele-FLM Series Models

TapTechNews reported on June 19 that the Telecommunications Artificial Intelligence Research Institute of China Telecom (TeleAI) and the Zhiyuan Research Institute have jointly released Tele-FLM-1T, the world's first single dense trillion-parameter semantic model. Together with the 52B version (tens of billions of parameters) and the 102B version (hundreds of billions of parameters), it forms the Tele-FLM series of models.

Built on techniques such as model growth and loss prediction, the Tele-FLM series used only about 9% of the compute typically required by standard industry training schemes. On 112 A800 servers, training all three models on a total of 2.3T tokens took four months. The entire training run was completed with zero adjustments and zero retries, demonstrating high compute efficiency and good model convergence and stability. The Tele-FLM-1T version will be open-sourced soon.
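The article does not detail how the loss prediction was done. Techniques of this kind typically fit a power-law scaling curve to a handful of small trial runs and extrapolate the final loss of the large model before committing full compute. A minimal sketch under that assumption (the run data, function names, and fitted exponent below are illustrative, not Tele-FLM's actual procedure or numbers):

```python
import math

# Hypothetical small-run results: (training compute in FLOPs, final loss).
# Purely illustrative values, not taken from the Tele-FLM report.
runs = [(1e18, 3.2), (4e18, 2.9), (1.6e19, 2.63), (6.4e19, 2.38)]

def fit_power_law(runs):
    """Least-squares fit of log L = log a - b * log C,
    i.e. the power law L(C) = a * C^(-b)."""
    xs = [math.log(c) for c, _ in runs]
    ys = [math.log(loss) for _, loss in runs]
    n = len(runs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope of the log-log regression line; negated to keep b positive.
    b = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    log_a = my + b * mx
    return math.exp(log_a), b

a, b = fit_power_law(runs)

def predict_loss(compute):
    # Extrapolate the fitted curve to a (much larger) compute budget.
    return a * compute ** (-b)
```

With a fit like this, the expected loss of the full-scale run can be estimated from cheap small-scale runs, which is one way a training campaign can proceed with "zero adjustments and zero retries".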

Currently, the 52B version of the Tele-FLM series is fully open source, along with its core technologies (growth technique, optimal-hyperparameter prediction) and training details (loss curves, optimal hyperparameters, data mix, GradNorm, etc.). The open-source model has been downloaded more than 10,000 times and has accumulated over 400,000 users.

In addition, in a China Telecom project serving a citizen-livelihood scenario, introducing the capabilities of the TeleChat-52B model improved the overall application performance by 40%, reaching an industry-leading level.

TapTechNews attaches the open source address of the Tele-FLM-52B version: https://huggingface.co/CofeAI/Tele-FLM.

Trial address of Tele-FLM-Chat (base model, single-turn dialogue version): https://modelscope.cn/studios/FLM/ChatFLM.
