OpenAI's GPT-4o Outperforms Human Experts in Moral Reasoning

TapTechNews, June 24th. A recent study shows that OpenAI's latest chatbot, GPT-4o, can provide moral explanations and advice that people rate higher in quality than advice from recognized moral experts.

According to a report by The Decoder on Saturday local time, researchers from the University of North Carolina at Chapel Hill and the Allen Institute for AI conducted two studies comparing the moral reasoning of GPT models with that of humans, to explore whether large language models can be regarded as 'moral experts'.

TapTechNews summarizes the research contents as follows:

Study One

501 American adults compared moral explanations written by the GPT-3.5-turbo model with those written by other human participants. The results showed that people rated GPT's explanations as more moral, more trustworthy, and more thoughtful than those of the human participants.

Evaluators also considered the AI's assessments more reliable than those of other people. Although the differences were small, the key finding was that AI can match or even exceed human-level moral reasoning.


Study Two

Advice generated by OpenAI's latest GPT-4o model was compared with advice from renowned ethicist Kwame Anthony Appiah in The New York Times' Ethicist column. 900 participants rated the quality of the advice for 50 ethical dilemmas.

The results showed that GPT-4o outperformed the human expert in almost every respect. Participants rated the AI-generated advice as more morally correct, more trustworthy, more thoughtful, and more accurate. Only in perceiving nuance was there no significant difference between the AI and the human expert.


The researchers believe these results show that AI can pass a comparative moral Turing test (cMTT). Text analysis also showed that GPT-4o uses more moral and more positive language than the human expert when giving advice, which may partly explain why the AI's advice was rated higher, though it is not the only factor.

It should be noted that the study involved only American participants; further research is needed on how AI-generated moral reasoning is perceived across different cultures.

Paper address: https://osf.io/preprints/psyarxiv/w7236
