AI models are terrible at betting on soccer—especially xAI Grok
Summary
The Financial Times study, as summarized by Ars Technica, tests eight top AI systems on a simulated Premier League betting season, evaluating their ability to maximize returns and manage risk without internet access. All models underperform humans on long-horizon decision tasks: Grok 4.20 goes bankrupt in every attempt, Claude Opus 4.6 loses 11% on average, and one Google Gemini variant shows a brief profit before failing. The piece cautions against hype and argues for more rigorous, horizon-aware benchmarking when applying AI to real-world business tasks.