HuggingFace Announces 'SmolLM' Small Language Model Family with Promising Performance

TapTechNews, July 20th: Small language models are heating up, and many vendors have begun releasing models small enough for lightweight devices such as phones. This week, HuggingFace announced the SmolLM family of small language models, which comes in 135 million, 360 million, and 1.7 billion parameter versions. The project page is linked here.


According to the introduction, the models are trained on carefully curated, high-quality datasets and are claimed to perform particularly well on Python programming tasks. The team notes that it focused on reducing the models' RAM footprint, so that they can even run on an iPhone 15 with 6GB of RAM.

For training, the HuggingFace team first built a dataset named SmolLM-Corpus (dataset address linked here). The corpus mainly contains the Python instructional content Python-Edu, the web-based educational content FineWeb-Edu, and synthetic knowledge content from Cosmopedia v2, generated with the Mixtral-8x7B-Instruct-v0.1 model, for a total of roughly 600 billion tokens. The team then trained the SmolLM models on this corpus.
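For readers who want to inspect the corpus, it is hosted on the HuggingFace Hub and can be streamed without downloading it in full. The sketch below is a minimal example using the datasets library; the repository id "HuggingFaceTB/smollm-corpus" and the config name "cosmopedia-v2" are assumptions based on HuggingFace's usual naming, so check the dataset card for the exact identifiers.

```python
from itertools import islice
from datasets import load_dataset

# Stream a few records from the SmolLM-Corpus instead of downloading ~600B tokens.
# Repo id and config name are assumed; verify them on the dataset card.
corpus = load_dataset(
    "HuggingFaceTB/smollm-corpus",
    "cosmopedia-v2",
    split="train",
    streaming=True,
)

# Inspect the first few records' fields before using them in a training pipeline.
for record in islice(corpus, 3):
    print(list(record.keys()))
```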

The HuggingFace team benchmarked the SmolLM models against other models with comparable parameter counts. SmolLM-135M outperformed other models under 200 million parameters on multiple tests; SmolLM-360M beat all models under 500 million parameters, though it trailed Meta's newly announced MobileLLM-350M on some items; and SmolLM-1.7B surpassed all models under 2 billion parameters, including Microsoft's Phi-1.5, MobileLLM-1.5B, and Qwen2.
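Since the checkpoints are published on the HuggingFace Hub, they can be tried locally through the standard transformers API. The following is a minimal sketch; the checkpoint name "HuggingFaceTB/SmolLM-1.7B" is an assumption based on the family's naming, and the 135M and 360M variants would be swapped in the same way.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id; confirm the exact name on the model card.
checkpoint = "HuggingFaceTB/SmolLM-1.7B"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# A short Python-completion prompt, matching the models' claimed strength.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```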

