The Decrease in Cost of Training GPT-2 and the Challenges in AI Development

TapTechNews, July 13 — GPT-2 is a model that OpenAI released in 2019, and training it at the time cost around $256 per hour. Five years later, in the GPT-4 era, have advances in software, hardware, and data cut the time and money needed to train the same model? The answer is yes.

According to a Tom's Hardware report today, Andrej Karpathy, former Tesla AI director, OpenAI co-founder, and creator of the llm.c project, reproduced GPT-2 using llm.c and brought the cost down to just $28 per hour (TapTechNews note: currently about 204 RMB), a reduction of nearly 90% in five years.

The main factor behind the cost reduction is that the training runs on a single 8xH100 node. In addition, Andrej Karpathy noted that because llm.c implements GPT training directly in C/CUDA, its requirements are minimal: no conda environment, Python interpreter, or pip installs. You just spin up a cloud GPU node, optionally install NVIDIA cuDNN and NCCL/MPI, download the .bin data shards, compile, and run, and training can start within a few minutes.
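For readers who want a more concrete picture of that workflow, the sketch below walks through the steps on a fresh cloud GPU node. It is a rough outline rather than Karpathy's exact reproduction recipe: the data-prep helper and the make/mpirun invocations follow the examples in the llm.c README, while the precise flags for the 24-hour GPT-2 run live in the repository's own run scripts.

```bash
# Illustrative llm.c setup on a cloud 8xH100 node.
# Script names and flags are taken from the llm.c README as examples and may
# differ from the exact reproduction recipe; see https://github.com/karpathy/llm.c

# 1. Get the code (training itself needs no conda environment or pip installs)
git clone https://github.com/karpathy/llm.c.git
cd llm.c

# 2. Prepare the .bin data shards (the repo ships tokenization/download helpers
#    under dev/data/; the FineWeb helper shown here is one example)
python dev/data/fineweb.py

# 3. Compile the CUDA trainer; USE_CUDNN=1 optionally enables cuDNN kernels
make train_gpt2cu USE_CUDNN=1

# 4. Launch training across the 8 GPUs of the node via MPI
#    (model-size, batch-size, and data-path flags omitted; the repo's run
#     scripts contain the exact invocation for each GPT-2 size)
mpirun -np 8 ./train_gpt2cu
```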

He added that after about 24 hours of training (24 hours × $28/hour ≈ $672 in total), the model can generate a sample about "unicorns in the Andes that can speak English".

The llm.c project reportedly began as part of an educational video, but after Karpathy ran into some PyTorch problems it soon grew into a project he built from scratch.

However, the report argues that progress in hardware, software, and training data does not mean the cost of frontier AI training is falling. Anthropic CEO Dario Amodei, for example, recently said that AI models currently in development may cost around $1 billion to train, and that more advanced models could cost as much as $100 billion by 2025.

Better hardware performance also comes with higher prices: an NVIDIA H100 chip costs about $40,000, and the next-generation Blackwell AI chip is expected to reach $70,000. Even so, the CEO of Google DeepMind has said that today's models are still only about as intelligent as a cat.