China Telecom's AI Research Institute Completes First Fully Domestic-Produced Trillion-Parameter Large Model and Open Sources Xingchen Semantic Model

TapTechNews September 28th news: the official public account of the China Telecom Artificial Intelligence Research Institute (TapTechNews note: hereinafter referred to as TeleAI) announced today that TeleAI has completed training of China's first trillion-parameter large model on a fully domestic 10,000-card cluster, and has officially open-sourced the Xingchen Semantic Large Model TeleChat2-115B, the first 100-billion-parameter model trained on a fully domestic 10,000-card cluster with a domestic deep learning framework.


According to the announcement, this research achievement marks full domestic substitution in large-model training, which has now officially entered a new stage of fully domestic independent innovation, security, and controllability.

TeleChat2-115B was trained on the Xirang integrated intelligent computing service platform of China Telecom's self-developed Tianyi Cloud and on the Xinghai AI platform of its artificial intelligence subsidiary. According to the introduction, a variety of optimizations improved training efficiency and stability while preserving training accuracy: compute efficiency on the same GPU compute exceeds 93%, and effective training time accounts for more than 98% of the total.
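The announcement does not define these two metrics precisely; one common reading is throughput relative to a reference configuration (compute efficiency) and the fraction of wall-clock time spent actually training rather than recovering from failures or stalls (effective training time). A minimal sketch under that assumption, with all numbers hypothetical:

```python
def compute_efficiency(achieved_tokens_per_sec: float,
                       baseline_tokens_per_sec: float) -> float:
    """Throughput relative to a reference configuration on the same GPU compute."""
    return achieved_tokens_per_sec / baseline_tokens_per_sec

def effective_time_fraction(productive_hours: float, wall_clock_hours: float) -> float:
    """Share of wall-clock time spent training (excludes failures, restarts, stalls)."""
    return productive_hours / wall_clock_hours

# Hypothetical figures consistent with the reported thresholds:
print(f"compute efficiency: {compute_efficiency(465_000, 500_000):.0%}")
print(f"effective time:     {effective_time_fraction(1960, 2000):.0%}")
```

At a 10,000-card scale, keeping the effective-time fraction high is typically the harder problem, since hardware failures and checkpoint/restart overhead grow with cluster size.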

For training super-large-parameter models, TeleAI ran scaling experiments on a large number of small models to verify the effectiveness of different model structures. For the data mixture, a regression prediction model fitted on the feedback from these small-model experiments was used to find a better data ratio.
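The announcement gives no details of the regression model. As an illustration only, one could fit a simple linear regression from small-model data mixtures to observed validation loss, then score candidate mixtures and keep the predicted best; all mixtures and loss values below are hypothetical:

```python
import numpy as np

# Hypothetical small-model experiments: each row is a data mixture
# (web, code, math fractions) and the validation loss it achieved.
mixtures = np.array([
    [0.8, 0.10, 0.10],
    [0.6, 0.30, 0.10],
    [0.6, 0.10, 0.30],
    [0.5, 0.25, 0.25],
    [0.4, 0.30, 0.30],
])
losses = np.array([2.10, 2.02, 2.05, 1.98, 1.99])

# Fit a least-squares linear model: loss ~ mixture fractions + intercept.
X = np.hstack([mixtures, np.ones((len(mixtures), 1))])
w, *_ = np.linalg.lstsq(X, losses, rcond=None)

# Score candidate mixtures and keep the one with the lowest predicted loss.
candidates = np.array([[0.55, 0.25, 0.20], [0.50, 0.30, 0.20], [0.45, 0.30, 0.25]])
Xc = np.hstack([candidates, np.ones((len(candidates), 1))])
best = candidates[np.argmin(Xc @ w)]
print("predicted best mixture:", best)
```

In practice such predictors are fitted on many more runs, and the chosen ratio is validated on a mid-scale model before committing the full training budget.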

For post-training, TeleAI first synthesized a large amount of question-and-answer data covering mathematics, code, and logical reasoning for the first stage of SFT (Supervised Fine-Tuning).

It then adopted an iterative update strategy: the model itself was used to increase the complexity and diversity of the prompt data, answer quality was improved through model synthesis and manual annotation, and rejection sampling was used to obtain high-quality SFT data and RM (Reward Model) representative data for SFT training, DPO (preference alignment) training, and iterative improvement of model quality.
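Rejection sampling in this context typically means sampling several candidate answers per prompt and keeping only those a reward model scores highly. A minimal sketch of the idea, where `generate` and `reward` are hypothetical stand-ins for the policy model and reward model:

```python
import random

def generate(prompt: str, n: int = 8) -> list[str]:
    """Stand-in for sampling n candidate answers from the policy model."""
    return [f"{prompt} -> candidate answer {i}" for i in range(n)]

def reward(prompt: str, answer: str) -> float:
    """Stand-in for a reward model's scalar score in [0, 1)."""
    return random.random()

def rejection_sample(prompts: list[str], threshold: float = 0.7, n: int = 8) -> list[dict]:
    """Keep, per prompt, the best-scoring candidate if it clears the threshold."""
    kept = []
    for p in prompts:
        scored = [(reward(p, a), a) for a in generate(p, n)]
        best_score, best_answer = max(scored)
        if best_score >= threshold:
            kept.append({"prompt": p, "answer": best_answer, "score": best_score})
    return kept

data = rejection_sample(["What is 2+2?", "Write a sorting function."])
print(f"kept {len(data)} of 2 prompts")
```

The accepted pairs feed the next SFT round, while accepted/rejected contrasts can supply the preference pairs used in DPO training.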

TapTechNews attaches the open-source address:

GitHub:
