AI models are terrible at betting on soccer—especially xAI Grok

April 11, 2026 at 11:15

Quality: 8/10 Relevance: 9/10

Summary

A Financial Times study, summarized by Ars Technica, tests eight top AI systems over a simulated Premier League season, evaluating how well each maximizes returns and manages risk without internet access. All of the models underperform humans on long-horizon decision tasks: Grok 4.20 goes bankrupt in every attempt, Claude Opus 4.6 loses 11% on average, and one Google Gemini variant is briefly profitable before failing. The piece cautions against hype and argues for more rigorous, horizon-aware benchmarking when applying AI to real-world business tasks.
