Aurora: A Leverage-Aware Optimizer for Rectangular Matrices
Summary
Aurora is a leverage-aware optimizer for rectangular matrices that addresses Muon's row-normalization issues on tall matrices. The work presents two variants, a practical damped-iteration Aurora and a Riemannian Aurora, reports strong results on 1.1B pretraining and nanoGPT speedrun benchmarks, and releases open-source code.
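As background for the row-normalization issue mentioned above, the following is a minimal NumPy sketch, not Aurora's algorithm and not taken from its released code. It assumes the issue concerns the uneven row norms of an orthogonalized update on a tall matrix. The polar factor is computed with an exact SVD as a stand-in for the Newton-Schulz iteration that Muon uses, and the matrix shape and the scaled rows are hypothetical choices made only for illustration. The squared row norms of the polar factor are the row leverage scores of the input: each lies between 0 and 1 and they sum to the number of columns, so they can be far from uniform when a few rows dominate.

```python
import numpy as np


def polar_factor(g):
    """Return U @ Vt from the thin SVD of g.

    For a tall g with full column rank this is the closest matrix with
    orthonormal columns. Exact SVD is used here for clarity; it is an
    illustrative stand-in for an iterative orthogonalization.
    """
    u, _, vt = np.linalg.svd(g, full_matrices=False)
    return u @ vt


rng = np.random.default_rng(0)
m, n = 512, 64                      # hypothetical tall shape: more rows than columns
g = rng.standard_normal((m, n))
g[:8] *= 20.0                       # hypothetical: a few rows with much larger magnitude

o = polar_factor(g)

# Squared row norms of the polar factor are the leverage scores of g's rows.
row_sq_norms = np.sum(o * o, axis=1)

print("columns orthonormal:", np.allclose(o.T @ o, np.eye(n)))
print("sum of squared row norms:", row_sq_norms.sum())
print("mean over scaled rows:   ", row_sq_norms[:8].mean())
print("mean over remaining rows:", row_sq_norms[8:].mean())
```

The columns of the result are orthonormal, but the rows are not equally weighted: in this setup the scaled rows receive much larger leverage than the rest.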