What do you think about Claude Code performing worse than pure Opus 4.5 in the newest swe-rebench update?
Summary
A Reddit discussion comparing the performance of Claude Code's agentic harness with that of Opus 4.5 on swe-rebench, highlighting the potential costs of overengineering and calling for benchmarks.