DigiNews

Tech Watch by Johan Denoyer


AI models are terrible at betting on soccer—especially xAI Grok

Quality: 8/10 Relevance: 9/10

Summary

A Financial Times study, summarized by Ars Technica, tests eight top AI systems over a simulated Premier League season, evaluating how well each maximizes betting returns and manages risk without internet access. All models underperform humans in long-horizon decision tasks: Grok 4.20 goes bankrupt in every attempt, Claude Opus 4.6 loses 11% on average, and one Google Gemini variant turns a brief profit before failing. The piece cautions against hype and argues for more rigorous, horizon-aware benchmarking when applying AI to real-world business tasks.
