OpenAI's New AI Model GPT-4o Dominates Chatbot Arena

By: Nathan. Published: 2024-05-14T10:26:17Z

TapTechNews, May 14: OpenAI employee William Fedus confirmed on the social platform X on Monday that the mysterious gpt2-chatbot models that performed well in the LMSYS Chatbot Arena are in fact versions of the company's newly released AI model, GPT-4o. Fedus also revealed that GPT-4o topped the arena's leaderboard in testing, achieving the highest score ever recorded there.

"GPT-4o is our most advanced model," Fedus wrote on X. "We have been testing a version of the model in the arena under the name 'im-also-a-good-gpt2-chatbot'."

Chatbot Arena is a website where visitors chat with two randomly selected AI language models at the same time, without knowing which is which, and then vote for the model that gives the better reply.

Since April this year, OpenAI has tested multiple versions of GPT-4o in the arena. The model first appeared as gpt2-chatbot, then as im-a-good-gpt2-chatbot, and finally as im-also-a-good-gpt2-chatbot.

Since GPT-4o's release today, multiple sources have reported that the model has climbed to the top of LMSYS's internal leaderboard by a significant margin, surpassing the previous top-ranked models, Claude 3 Opus and GPT-4 Turbo.

The official lmsys.org account shared a chart and wrote: "The 'gpt2-chatbot' series models have just surged to the top with a significant advantage (about 50 Elo) over all other models, making it the strongest model in the arena. This is an internal screenshot; the public version of 'gpt-4o' has now entered the arena and will soon appear on the public leaderboard!"

At the time of TapTechNews's report, im-also-a-good-gpt2-chatbot had an Elo score of 1309, ahead of GPT-4-Turbo-2024-04-09 at 1253 and Claude 3 Opus at 1246. Before the three gpt2-chatbot models appeared and shook up the leaderboard, Claude 3 Opus and GPT-4 Turbo had been competing for the top spot.
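
To put the reported gap in perspective, here is a minimal Python sketch that converts an Elo difference into an expected head-to-head win rate. It assumes the classic Elo logistic curve (base 10, scale 400), which may differ from the exact rating method LMSYS uses, and it ignores ties. The function name `expected_win_rate` and the dictionary labels are illustrative only; the ratings are the ones reported above.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes won by A over B.

    Assumes the classic Elo logistic curve (base 10, scale 400);
    the arena's actual rating method may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Elo scores as reported in the article (labels are illustrative).
ratings = {
    "im-also-a-good-gpt2-chatbot": 1309,
    "GPT-4 Turbo": 1253,
    "Claude 3 Opus": 1246,
}

leader = "im-also-a-good-gpt2-chatbot"
for name, rating in ratings.items():
    if name == leader:
        continue
    gap = ratings[leader] - rating
    rate = expected_win_rate(ratings[leader], rating)
    print(f"{leader} vs {name}: +{gap} Elo, expected win rate {rate:.1%}")
```

Under these assumptions, a lead of about 50 to 60 Elo points corresponds to winning a little under 60 percent of direct comparisons, a clear but not overwhelming edge.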
