What do you think about Claude Code performing worse than pure Opus 4.5 in the newest swe-rebench update?

January 17, 2026 at 02:48

Quality: 5/10 Relevance: 9/10

Summary

A Reddit discussion of how Claude Code's agentic harness performs against pure Opus 4.5 in swe-rebench, highlighting the potential costs of overengineering and calling for benchmarks.
