DigiNews

Tech Watch by Johan Denoyer


Testing suggests Google’s AI Overviews tells millions of lies per hour

Quality: 8/10 Relevance: 9/10

Summary

Ars Technica reports on a New York Times analysis of the accuracy of Google's Gemini-powered AI Overviews. The analysis, which involved the startup Oumi, used the SimpleQA benchmark of more than 4,000 questions. Initial results with Gemini 2.5 showed 85% accuracy; with Gemini 3, accuracy rose to 91%. That puts AI Overviews at roughly 90% correct, but it also means about one in ten responses is wrong, which could translate into millions of incorrect answers each day if the same error rate holds across general search. Examples of errors include misdating events related to Bob Marley and mischaracterizing the Classical Music Hall of Fame. Google counters that the benchmark has holes and says it relies on its own tests. The piece highlights the non-deterministic nature of generative AI and urges users to double-check AI outputs against reliable sources.

🚀 Service built by Johan Denoyer