OpenAI Tackles AI “Hallucinations” by Rethinking Evaluations


OpenAI says it has identified why chatbots often give wrong but confident answers, known as “hallucinations.” A new paper argues that these errors arise because AI models are trained to guess when unsure. Unlike humans, who learn the value of admitting uncertainty, AI models are optimized for benchmark performance, where guessing is rewarded and saying “I don't know” is treated as failure.

This leads to AI systems that sound confident even when they are wrong. Companies like Anthropic have tried to make their models more cautious about stating things they cannot verify, but that caution can limit usefulness. OpenAI suggests the solution lies in the evaluation process itself: benchmarks should stop rewarding guessing and instead give credit when a model appropriately expresses uncertainty.
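To make the incentive problem concrete, here is a minimal sketch, in Python, of the kind of scoring change being described. The weights and the `grade_answer` function are illustrative assumptions for this post, not OpenAI's actual evaluation code: the point is only that once wrong answers cost more than abstentions, blind guessing stops being the optimal strategy.

```python
# Illustrative sketch only: a toy scoring rule where a wrong answer
# costs more than admitting uncertainty. The weights and interface
# are assumptions, not OpenAI's actual evaluation code.

ABSTAIN_PHRASES = {"i don't know", "unsure", "not sure"}

def grade_answer(model_answer: str, correct_answer: str) -> float:
    """Score one question: reward correctness, tolerate abstention, penalize confident errors."""
    answer = model_answer.strip().lower()
    if answer in ABSTAIN_PHRASES:
        return 0.0   # abstaining is neutral, not a failure
    if answer == correct_answer.strip().lower():
        return 1.0   # correct answer earns full credit
    return -1.0      # a confident wrong answer is penalized

# Under plain binary scoring (1 for correct, 0 for everything else),
# guessing always weakly beats abstaining. With the penalty above,
# guessing only pays off when the model is likely to be right.
print(grade_answer("Paris", "Paris"))         # 1.0
print(grade_answer("I don't know", "Paris"))  # 0.0
print(grade_answer("Lyon", "Paris"))          # -1.0
```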

By adjusting evaluations, OpenAI hopes to improve AI reliability instead of just making it faster or more articulate. The goal is to create systems that balance knowledge with humility, ensuring that chatbots provide trustworthy answers, especially in sensitive fields like medical or financial advice.
