DigiNews

Tech Watch Articles

What do you think about Claude Code performing worse than pure Opus 4.5 in the newest swe-rebench update?

Quality: 5/10 Relevance: 9/10

Summary

A Reddit discussion comparing the performance of Claude Code's agentic harness with that of plain Opus 4.5 on swe-rebench, highlighting the potential costs of overengineering and calling for benchmarks.

🚀 Service built by Johan Denoyer