Training A Small Language Model To Outperform Frontier Models On CRM-Arena
Summary
Neurometric demonstrates fine-tuning sub-6B-parameter Qwen models with LoRA adapters on CRMArena, a CRM task benchmark. Phase I shows improvements in SQL query generation but indicates the need for broader task coverage in the training data; Phase II constrains outputs to a finite set of BANT labels and achieves strong results even when trained on synthetic data. The work highlights the viability of small models for enterprise workflows, the importance of data quality, and the benefits of task-specific training for on-prem or edge deployment.
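Constraining a model's output to a finite label set, as in Phase II, can be approximated with a simple post-processing step. The sketch below is a minimal illustration, assuming the labels follow the standard BANT dimensions; the actual label set and enforcement mechanism used by Neurometric are not specified in the source.

```python
# Hypothetical sketch: snap free-form model output to a finite BANT label set.
# The label names and the fallback choice are assumptions, not from the source.

BANT_LABELS = ["Budget", "Authority", "Need", "Timing"]

def constrain_to_labels(raw_output: str, labels=BANT_LABELS, default="Need"):
    """Return the first allowed label mentioned in the model's raw text,
    falling back to a default label when none matches."""
    text = raw_output.lower()
    for label in labels:
        if label.lower() in text:
            return label
    return default

# Free-form text is collapsed onto one of the four allowed labels:
print(constrain_to_labels("The blocker here is clearly budget approval."))  # Budget
print(constrain_to_labels("Unclear signal from the account."))              # Need
```

Mapping onto a closed label set like this is what makes small-model evaluation tractable: accuracy reduces to exact-match over a handful of classes, so even synthetic training data can be validated cheaply.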