Self-Harness: Harnesses That Improve Themselves
Summary
The arXiv paper "Self-Harness: Harnesses That Improve Themselves" introduces a self-improving paradigm for LLM-based agents in which agents iteratively mine weaknesses, propose harness edits, and validate the changes without human intervention. It reports held-out pass-rate gains for three models: MiniMax M2.5 rises from 40.5% to 61.9%, Qwen3.5-35B-A3B from 23.8% to 38.1%, and GLM-5 from 42.9% to 57.1%. The work suggests that harnesses can adapt to model-specific weaknesses, enabling more autonomous agent improvement.
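
To make the mine, propose, validate cycle concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the task names, the rule names, the three-way task split, and the acceptance criterion (keep an edit only if the validation pass rate strictly improves) are all assumptions made for illustration. The "harness" is reduced to a set of rule strings, and a task passes when the harness contains the rule that task needs.

```python
from collections import Counter

# Hypothetical toy data: each task passes only if the harness holds this rule.
REQUIRED_RULE = {
    "m1": "retry_on_timeout", "m2": "retry_on_timeout",
    "m3": "check_output_format", "m4": "base",
    "v1": "retry_on_timeout", "v2": "check_output_format", "v3": "base",
    "h1": "retry_on_timeout", "h2": "check_output_format",
    "h3": "base", "h4": "unfixable_here",
}
MINE_TASKS = ["m1", "m2", "m3", "m4"]      # used to find weaknesses
VAL_TASKS = ["v1", "v2", "v3"]             # used to accept or reject edits
HELD_OUT_TASKS = ["h1", "h2", "h3", "h4"]  # only used for final reporting


def pass_rate(harness, tasks):
    """Fraction of tasks whose required rule is present in the harness."""
    return sum(REQUIRED_RULE[t] in harness for t in tasks) / len(tasks)


def mine_weaknesses(harness, tasks):
    """Missing rules behind failed tasks, most frequent first."""
    failures = Counter(
        REQUIRED_RULE[t] for t in tasks if REQUIRED_RULE[t] not in harness
    )
    return [rule for rule, _ in failures.most_common()]


def self_improve(harness, mine_tasks, val_tasks, max_rounds=5):
    """Repeatedly propose one harness edit and keep it only if validated."""
    harness = frozenset(harness)
    rejected = set()  # edits that failed validation are not proposed again
    for _ in range(max_rounds):
        candidates = [
            w for w in mine_weaknesses(harness, mine_tasks) if w not in rejected
        ]
        if not candidates:
            break
        edit = candidates[0]              # propose: address the top weakness
        candidate = harness | {edit}
        if pass_rate(candidate, val_tasks) > pass_rate(harness, val_tasks):
            harness = candidate           # validate: strict improvement only
        else:
            rejected.add(edit)
    return harness


initial = frozenset({"base"})
improved = self_improve(initial, MINE_TASKS, VAL_TASKS)
before = pass_rate(initial, HELD_OUT_TASKS)   # 0.25 in this toy setup
after = pass_rate(improved, HELD_OUT_TASKS)   # 0.75 in this toy setup
```

Keeping the held-out tasks out of both the mining and the validation steps is the design choice worth noting: it means the reported before and after numbers are not inflated by edits that merely fit the tasks used to choose them.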