TinyLoRA – Learning to Reason in 13 Parameters
Summary
TinyLoRA is a method for scaling low-rank adapters down to as few as one trainable parameter. With it, an 8B model can be trained using only 13 parameters, stored in bf16, and reach 91% accuracy on GSM8K. Across several reasoning benchmarks, the results suggest that about 90% of the performance gain from larger adapters can be recovered with 1000x fewer parameters. Reinforcement learning appears crucial to these gains, whereas standard supervised fine-tuning lags behind. The work points to significant potential for efficient fine-tuning and on-device reasoning, though the results are task- and model-specific.
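To put the quoted figures in perspective, here is a back-of-the-envelope sketch of the arithmetic. It is not the paper's code. The helper names are hypothetical, "8B" is rounded to 8,000,000,000, and the 4096-wide weight matrix is an illustrative assumption rather than a dimension taken from the source. The rank-r count uses the standard LoRA factorization, in which a d_out x d_in update is written as the product of a d_out x r and an r x d_in matrix.

```python
# Rough arithmetic for the figures quoted in the summary.
# All helper names and the 4096 layer width are illustrative assumptions.

BYTES_PER_BF16 = 2  # bfloat16 is a 16-bit format


def adapter_bytes(num_params: int, bytes_per_param: int = BYTES_PER_BF16) -> int:
    """Storage needed for the trainable parameters alone."""
    return num_params * bytes_per_param


def trainable_fraction(num_trainable: int, num_total: int) -> float:
    """Share of the model's parameters that are actually trained."""
    return num_trainable / num_total


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Standard LoRA on one weight matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


tiny_params = 13                 # figure from the summary
base_params = 8_000_000_000      # "8B", rounded (assumption)

print(adapter_bytes(tiny_params))                    # bytes for 13 bf16 values
print(trainable_fraction(tiny_params, base_params))  # fraction of weights trained
print(lora_param_count(4096, 4096, rank=1))          # rank-1 LoRA, one hypothetical matrix
```

Under these assumptions, 13 bf16 parameters occupy 26 bytes, while even a rank-1 LoRA on a single 4096 x 4096 matrix already has 8192 parameters. That gap is why reaching 13 parameters requires going below what a conventional rank-1 adapter allows.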