New Calibration Method for Large Language Models: The 'Thermometer' from MIT

TapTechNews July 31st news: people are increasingly relying on large models for all kinds of tasks, whether translating text, summarizing articles, or spotting financial fraud. Yet despite these models' impressive capabilities, they occasionally generate wrong answers, and worse, they can be overconfident in wrong answers and underconfident in correct ones, leaving users unsure whether large models can be trusted.

According to a report today by MIT News, researchers from the Massachusetts Institute of Technology (MIT) and the MIT-IBM Watson AI Lab have proposed a calibration method tailored specifically to large language models. Their method, called 'Thermometer', works by building a smaller auxiliary model that runs on top of the large language model and calibrates it.


The 'Thermometer' method reportedly requires relatively little computing power, yet it preserves the model's accuracy and lets it produce better-calibrated responses on tasks it has not encountered before.

By efficiently calibrating a large language model across many tasks, 'Thermometer' can help users spot situations where the model is overconfident in wrong predictions, and ultimately keep them from deploying it in situations where it is likely to fail.

The paper's first author, Shen Maohao (TapTechNews note: transliteration), a graduate student in electrical engineering and computer science at MIT, said: 'We hope to give users a clear signal of whether a model's response is accurate or inaccurate, in a way that reflects the model's uncertainty, so they know whether the model can be relied on.'

To build 'Thermometer', the researchers developed a versatile technique based on a classic calibration method called temperature scaling, adapted to efficiently calibrate large language models for new tasks. Here, 'temperature' is a scaling parameter applied to the model's outputs so that its stated confidence lines up with its actual prediction accuracy.
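To make the idea concrete, here is a minimal sketch of standard temperature scaling applied to one set of answer logits; the numbers are invented for illustration and this is not the paper's own implementation:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Turn raw logits into a probability distribution."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def temperature_scale(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Classic temperature scaling: divide logits by T before the softmax.

    T > 1 softens the distribution (lower confidence), T < 1 sharpens it,
    and T = 1 leaves the prediction unchanged.
    """
    return softmax(logits / temperature)

# An overconfident 3-way prediction, softened by a calibrated temperature.
logits = np.array([4.0, 1.0, 0.5])        # invented answer scores
print(temperature_scale(logits, 1.0))     # ~[0.93, 0.05, 0.03] uncalibrated
print(temperature_scale(logits, 2.0))     # ~[0.72, 0.16, 0.12] calibrated
```

Because dividing the logits by a temperature changes the confidence of a prediction but not its ranking, temperature scaling leaves the model's accuracy untouched while adjusting how sure it claims to be.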

The researchers then trained an auxiliary model that runs on top of the large language model and automatically predicts the temperature needed to calibrate it for a new task. 'Thermometer' only needs access to a small part of the large language model's internals to predict the right temperature for a given task's data points and calibrate its predictions.
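The article does not give implementation details, but the idea can be sketched roughly as follows: a small auxiliary network reads an internal feature of the large model and outputs a positive temperature that is then used to rescale the answer logits. The pooled-hidden-state feature, the single linear layer, and the softplus output below are assumptions made for illustration, not the paper's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM, N_OPTIONS = 64, 4

# Toy auxiliary model: one linear layer whose output goes through a softplus
# so the predicted temperature is always positive. (Assumed architecture.)
W = rng.normal(scale=0.1, size=HIDDEN_DIM)
b = 0.0

def predict_temperature(hidden_state: np.ndarray) -> float:
    """Map an internal LLM feature vector to a positive temperature."""
    raw = hidden_state @ W + b
    return float(np.log1p(np.exp(raw)))  # softplus keeps T > 0

def calibrate(logits: np.ndarray, hidden_state: np.ndarray) -> np.ndarray:
    """Rescale the LLM's answer logits with the predicted temperature."""
    t = predict_temperature(hidden_state)
    z = logits / t
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

# Invented stand-ins for one question from a previously unseen task.
hidden = rng.normal(size=HIDDEN_DIM)     # assumed pooled hidden state
logits = rng.normal(size=N_OPTIONS)      # assumed answer-option scores
print(calibrate(logits, hidden))         # calibrated answer probabilities
```

In a sketch like this, only the small auxiliary network's parameters would need to be trained, which is consistent with the article's claim that the method requires relatively little computing power.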

The team hopes that 'Thermometer' will eventually support more complex text-generation tasks and that the technique can be applied to even larger language models.
