Even GPT-5.2 Can't Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs
Summary
The arXiv paper introduces the Zero-Error Horizon (ZEH), a metric for trustworthy LLMs that measures the largest range of problems a model solves without a single error. It shows that GPT-5.2 fails on simple tasks, which underscores the limits of deploying such models in safety-critical settings. The paper also discusses how ZEH correlates with accuracy and presents methods for reducing the computational cost of computing it.
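To make the idea of an error-free horizon concrete, here is a minimal sketch in Python. It is not the paper's implementation. It assumes a reading of ZEH as "the largest input size n such that every instance of size up to n is answered correctly", and it uses bit-string parity as an illustrative task that the summary does not name. The names `zero_error_horizon`, `parity_reference`, and `toy_solver` are hypothetical, and `toy_solver` is a stand-in for a real model call.

```python
from itertools import product
from typing import Callable


def parity_reference(bits: str) -> int:
    """Ground truth for the illustrative task: 1 if the count of '1's is odd."""
    return bits.count("1") % 2


def zero_error_horizon(solver: Callable[[str], int], max_size: int) -> int:
    """Return the largest n <= max_size such that `solver` is correct on
    every bit string of length 1..n (assumed reading of ZEH).

    Returns 0 if the solver already fails on some string of length 1.
    """
    horizon = 0
    for n in range(1, max_size + 1):
        # Exhaustively check all 2**n inputs of this size.
        for bits in map("".join, product("01", repeat=n)):
            if solver(bits) != parity_reference(bits):
                return horizon  # first error ends the error-free range
        horizon = n
    return horizon


def toy_solver(bits: str) -> int:
    """Hypothetical stand-in for a model: exact on short inputs,
    always answers 0 once the input has five or more characters."""
    if len(bits) >= 5:
        return 0
    return parity_reference(bits)


if __name__ == "__main__":
    print(zero_error_horizon(toy_solver, max_size=8))
```

Under this reading, a single wrong answer at a given size caps the horizon, however high the average accuracy is at larger sizes. The exhaustive loop grows as 2^n in the input length, which gives a sense of why the cost of evaluation becomes a concern.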