Five frontier LLMs disagree on 67% of 1k real-world fact-check claims
Summary
A study from Lenz Research analyzes 1,000 real-world claims, each evaluated by five frontier LLMs, and finds that the models disagree on 67% of them. The article presents the four-verdict rubric, a breakdown by domain, model-vs-model agreement, and the main limitations of applying frontier-model verdicts to real-world fact-checking.
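
To make the two headline metrics concrete, here is a minimal sketch of how a disagreement rate and a model-vs-model agreement table could be computed from per-claim verdicts. It assumes that a claim counts as "disagreed" whenever the five verdicts are not unanimous; the study may define it differently. The model names, verdict labels, and toy data are hypothetical placeholders, not values from the study.

```python
from itertools import combinations

# Hypothetical placeholders: the study's model names and rubric labels
# are not reproduced here.
MODELS = ["model_a", "model_b", "model_c", "model_d", "model_e"]

# Toy data: one row per claim, one verdict per model (same order as MODELS).
verdicts = [
    ["true", "true", "true", "true", "true"],
    ["true", "false", "true", "mixed", "true"],
    ["false", "false", "false", "false", "unverifiable"],
    ["mixed", "mixed", "mixed", "mixed", "mixed"],
]


def disagreement_rate(rows):
    """Share of claims where the models do not all return the same verdict.

    Assumption: any non-unanimous claim counts as a disagreement.
    """
    split = sum(1 for row in rows if len(set(row)) > 1)
    return split / len(rows)


def pairwise_agreement(rows, models):
    """For each pair of models, the fraction of claims with identical verdicts."""
    result = {}
    for i, j in combinations(range(len(models)), 2):
        same = sum(1 for row in rows if row[i] == row[j])
        result[(models[i], models[j])] = same / len(rows)
    return result


if __name__ == "__main__":
    print(f"disagreement rate: {disagreement_rate(verdicts):.0%}")
    for pair, score in pairwise_agreement(verdicts, MODELS).items():
        print(pair, f"{score:.0%}")
```

Under this definition a single dissenting model is enough to mark a claim as disagreed, so the overall rate can be high even when most model pairs agree on most claims. That is why the pairwise table is worth reading alongside the headline figure.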