Inverse Rubric Optimization: A testbed for agent science
Summary
Fulcrum Research introduces inverse rubric optimization (IRO) as a testbed for studying long-horizon AI agents. The post describes how agents optimize against the preferences of a black-box judge under varying label budgets and shows how different judge rubrics shape learning. It reports that models exhibit rich strategies and smooth scaling, with findings on label efficiency and potential reward hacking. An open-source implementation is provided, along with detailed appendices on rubric design and per-judge results.