OpenAI Trains CriticGPT to Detect Errors in ChatGPT Output

TapTechNews, June 28 — On the 27th local time, OpenAI announced that it has trained a GPT-4-based model named CriticGPT to find errors in the output of the ChatGPT chatbot. CriticGPT writes critiques that highlight inaccuracies in the answers ChatGPT generates.


According to OpenAI, CriticGPT is designed to assist human AI trainers in their work: using a technique called Reinforcement Learning from Human Feedback (RLHF) to train GPT-4 and improve its answers.
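For readers unfamiliar with the workflow, the sketch below illustrates in very rough terms where a critique-writing model could sit in an RLHF labeling loop: an assistant model produces an answer, a critic model writes comments on it, and a human trainer uses those comments to assign feedback that later trains a reward model. All function names and the toy data here are hypothetical placeholders, not OpenAI's actual tooling.

```python
# Illustrative sketch only: hypothetical stand-ins for the models and the
# trainer in an RLHF-style labeling loop with a critique model in the middle.

from dataclasses import dataclass


@dataclass
class Critique:
    span: str      # the part of the answer the critic flags
    comment: str   # why the critic considers it inaccurate


def generate_answer(prompt: str) -> str:
    """Stand-in for the assistant model (e.g. ChatGPT) producing an answer."""
    return "The capital of Australia is Sydney."


def write_critiques(answer: str) -> list[Critique]:
    """Stand-in for a CriticGPT-like model that comments on the answer."""
    return [Critique(span="Sydney", comment="The capital of Australia is Canberra.")]


def rate_answer(answer: str, critiques: list[Critique]) -> float:
    """Stand-in for a human trainer assigning feedback after reading critiques.

    In RLHF, ratings (or preferences between answers) like this one are used
    to fit a reward model that the assistant is then optimized against.
    """
    return 0.0 if critiques else 1.0


if __name__ == "__main__":
    prompt = "What is the capital of Australia?"
    answer = generate_answer(prompt)
    comments = write_critiques(answer)
    reward = rate_answer(answer, comments)

    print(f"Answer: {answer}")
    for c in comments:
        print(f"Critique on '{c.span}': {c.comment}")
    print(f"Trainer feedback: {reward}")
```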

However, as ChatGPT becomes more accurate, its errors become subtler, making the trainers' work increasingly difficult. OpenAI explained that this is a fundamental limitation of RLHF: as the model gradually becomes more knowledgeable than anyone who could provide feedback, aligning it may also become harder and harder.

At present, CriticGPT's sharp eye comes into play when it tries to spot errors in ChatGPT's answers. OpenAI pointed out, however, that real-world mistakes can be spread across many parts of an answer, a problem CriticGPT will need to address going forward: the current focus is on errors that can be pointed out in a single place, but scattered errors will also have to be handled in the future.
