Google's Gemini 1.5 Pro Model Shines in Math with Improved Performance

By: Nathan Published 2024-05-21T01:07:49Z

TapTechNews, May 21: Google released a technical report last week stating that a version of Gemini 1.5 Pro trained specifically on mathematics performs significantly better at math and has successfully solved some International Mathematical Olympiad problems.

Google trained Gemini 1.5 Pro specifically for mathematical tasks and evaluated it on the MATH benchmark, the American Invitational Mathematics Examination (AIME), and Google's internal HiddenMath benchmark.

According to Google's data, the math-specialized Gemini 1.5 Pro performs on par with human experts on the MATH benchmark. Compared with the standard Gemini 1.5 Pro, the math-specialized version solved significantly more AIME problems and also scored higher on the other benchmarks.

Of the three examples Google shared, two were solved by the math-specialized Gemini 1.5 Pro, while one was answered incorrectly by the standard Gemini 1.5 Pro. Problems of this kind typically require the solver to recall basic algebraic formulas, break the problem into parts, and apply other mathematical rules to reach the correct answer. TapTechNews has attached the relevant screenshots below:

[Screenshots: the three example problems shared by Google]

In addition to the example problems, Google shared details of the Gemini 1.5 Pro benchmark results. The data show that Gemini 1.5 Pro scores ahead of GPT-4 Turbo and Anthropic's Claude on all five benchmarks.

Google said the math-specialized version of Gemini 1.5 Pro reaches 80.6% accuracy on the MATH benchmark from a single sample, and 91.1% when 256 solutions are sampled and one candidate answer is selected (rm@256).
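The gap between the single-sample figure and the rm@256 figure comes from generating many candidate solutions and keeping only one. The Python sketch below illustrates that general idea, usually called best-of-n selection. It is not Google's implementation: the names `best_of_n`, `toy_sample`, and `toy_score` are hypothetical, the sampler stands in for a language model, and the scorer stands in for whatever ranking model picks the candidate (the "rm" prefix is assumed here to denote a learned reward model).

```python
import random
from typing import Callable, Tuple


def best_of_n(
    problem: str,
    sample: Callable[[str, random.Random], str],
    score: Callable[[str, str], float],
    n: int = 256,
    seed: int = 0,
) -> Tuple[str, float]:
    """Draw n candidate answers and return the highest-scoring one."""
    rng = random.Random(seed)
    # Step 1: sample n independent candidate answers for the same problem.
    candidates = [sample(problem, rng) for _ in range(n)]
    # Step 2: score every candidate and keep the one with the top score.
    scored = [(score(problem, c), c) for c in candidates]
    best_score, best_answer = max(scored, key=lambda pair: pair[0])
    return best_answer, best_score


def toy_sample(problem: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a model: a noisy guess at the answer."""
    return str(rng.choice([3, 4, 4, 5, 22]))


def toy_score(problem: str, answer: str) -> float:
    """Hypothetical stand-in for a ranking model.

    This toy version cheats by knowing the true answer (4); a real
    scorer would have to estimate answer quality without the key.
    """
    return -abs(int(answer) - 4)


if __name__ == "__main__":
    answer, quality = best_of_n("What is 2 + 2?", toy_sample, toy_score, n=256)
    print(answer, quality)
```

A single draw from the toy sampler is often wrong, but the selected answer is right whenever at least one of the n draws is right and the scorer ranks it first. The quality of the final result therefore depends as much on the scorer as on the sampler.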
