Kunlun Wanwei Open-Sources 200-Billion-Parameter Sparse Large Model Skywork-MoE

On June 3, TapTechNews reported that Kunlun Wanwei announced it has open-sourced Skywork-MoE, a 200-billion-parameter sparse large model. The model was upcycled from an intermediate checkpoint of Kunlun Wanwei's previously open-sourced Skywork-13B model. It is claimed to be the first open-source 100-billion-scale MoE large model to fully apply and implement MoE Upcycling, and the first to support inference on a single RTX 4090 server (with 8 RTX 4090 graphics cards).

According to the introduction, the Skywork-MoE model open-sourced this time belongs to the Skywork 3.0 R&D model series and is its medium-sized model (Skywork-MoE-Medium). It has 146 billion total parameters, of which 22 billion are activated per token; the model comprises 16 Experts of 13 billion parameters each, with 2 Experts activated for each token.
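To make the routing scheme concrete, below is a minimal, illustrative top-2 MoE layer in PyTorch. This is a sketch of the general technique only, not Skywork's actual implementation; the class and parameter names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Illustrative top-2 mixture-of-experts layer (NOT Skywork's code).

    A gating network scores all experts per token, and only the two
    highest-scoring experts run, so activated parameters stay a small
    fraction of total parameters (22B of 146B in Skywork-MoE-Medium).
    """

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 16, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)               # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)         # top-2 experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize gate weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                          # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out
```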

Skywork 3.0 has also trained MoE models at two other sizes, a 75B (Skywork-MoE-Small) and a 400B (Skywork-MoE-Large), which are not included in this open-source release.

According to the official tests, at the same activated-parameter count of 20B (i.e., the same inference compute), Skywork-MoE's capability is close to that of a 70B Dense model, cutting the model's inference cost by nearly a factor of three. At the same time, Skywork-MoE's total parameter count is about one third smaller than that of DeepSeek-V2, achieving comparable capability at a smaller parameter scale.
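The cost claim follows because per-token inference compute scales roughly with activated parameters rather than total parameters. A back-of-envelope check using the figures above:

```python
# Rough check of the "nearly 3-fold" inference-cost claim:
# per-token inference compute scales roughly with activated parameters.
dense_params = 70e9    # 70B Dense comparison model
moe_activated = 22e9   # Skywork-MoE activates 22B parameters per token
print(f"~{dense_params / moe_activated:.1f}x less compute per token")  # ~3.2x
```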

The Skywork-MoE model weights and technical report are fully open source and free for commercial use, with no application required. TapTechNews attaches the links below:

Model weight download:

https://huggingface.co/Skywork/Skywork-MoE-base

https://huggingface.co/Skywork/Skywork-MoE-Base-FP8

Model open source repository: https://github.com/SkyworkAI/Skywork-MoE

Model technical report: https://github.com/SkyworkAI/Skywork-MoE/blob/main/skywork-moe-tech-report.pdf

Model inference code (supports 8-bit quantized loading and inference on 8x RTX 4090 servers): https://github.com/SkyworkAI/vllm
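For readers who want to try the model, here is a minimal sketch of serving it through the SkyworkAI vLLM fork, assuming the fork keeps upstream vLLM's Python API; the exact quantization option for 8-bit loading is fork-specific, so consult the repository README for the authoritative command.

```python
from vllm import LLM, SamplingParams

# Assumes the SkyworkAI/vllm fork follows upstream vLLM's Python API.
# tensor_parallel_size=8 shards the model across the 8 RTX 4090 cards;
# the fork's 8-bit quantized loading flag is documented in its README.
llm = LLM(
    model="Skywork/Skywork-MoE-base",
    tensor_parallel_size=8,
    trust_remote_code=True,
)
outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(max_tokens=32),
)
print(outputs[0].outputs[0].text)
```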
