Google's Gemini 1.5 Pro Briefly Topped Chatbot Arena, but OpenAI's chatgpt-4o-latest Has Reclaimed First Place

TapTechNews, August 14: Google released its strongest Gemini 1.5 Pro model last week and took first place in LMSYS's Chatbot Arena, but OpenAI quickly struck back, and its latest chatgpt-4o-latest model has reclaimed first place.

Introduction to chatgpt-4o-latest

OpenAI released gpt-4o-2024-08-06 last week, with support for structured output in its API. Yesterday it released a new frontier model named chatgpt-4o-latest, the latest version of GPT-4o, with a context window of 128,000 tokens and a maximum output of 16,384 tokens.

Introduction to LMSYS's Chatbot Arena

Chatbot Arena is a benchmark platform for large language models, launched by LMSYS Org, a team led by the University of California, Berkeley.

The platform pits large language models against each other anonymously and at random. Ratings are generated from user votes using the Elo rating system, which is widely used in competitive games such as chess. In each round, the system randomly selects two different models to chat with the user, who then chooses which one performed better without knowing which model is which.

The system then computes each model's score from the users' choices and publishes the results as a leaderboard on its homepage.
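
The voting-to-leaderboard process described above can be illustrated with a minimal sketch of a textbook Elo update. This is not LMSYS's actual implementation, whose details the article does not give. The K-factor of 32, the starting rating of 1000, the model names, and the sample votes are all assumptions chosen for illustration.

```python
from collections import defaultdict

K = 32            # assumed K-factor (step size); not taken from the article
INITIAL = 1000.0  # assumed starting rating for every model


def expected_score(r_a, r_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(ratings, model_a, model_b, outcome):
    """Apply one vote. outcome is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie."""
    r_a, r_b = ratings[model_a], ratings[model_b]
    e_a = expected_score(r_a, r_b)
    ratings[model_a] = r_a + K * (outcome - e_a)
    ratings[model_b] = r_b + K * ((1.0 - outcome) - (1.0 - e_a))


def leaderboard(votes):
    """Replay votes in order and return (model, rating) pairs, best first."""
    ratings = defaultdict(lambda: INITIAL)
    for model_a, model_b, outcome in votes:
        update(ratings, model_a, model_b, outcome)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical votes: (model shown as A, model shown as B, outcome for A)
votes = [
    ("model-x", "model-y", 1.0),
    ("model-y", "model-z", 0.5),
    ("model-x", "model-z", 1.0),
]

for name, rating in leaderboard(votes):
    print(f"{name}: {rating:.1f}")
```

Because each update adds to one model exactly what it subtracts from the other, the total number of rating points stays constant, and a win against a higher-rated opponent moves the score more than a win against a lower-rated one.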

Latest results for chatgpt-4o-latest

Google's experimental Gemini 1.5 Pro model took first place last week with a score of 1297, the first time Google had topped LMSYS's Chatbot Arena.


With the new chatgpt-4o-latest model, OpenAI has reclaimed first place in the arena with a top score of 1314.

The scores show that the new ChatGPT-4o has improved significantly in coding, instruction following, and hard prompts. TapTechNews lists the relevant rankings below:

Overall score: First place

Mathematics: #1-2

Programming: First place

Hard Prompts: First place

Instruction Following: First place

Longer Query: First place

Multi-Turn: First place
