DigiNews

Tech Watch by Johan Denoyer


SALOMI: a research repo on extreme low-bit transformer quantization

Quality: 8/10 Relevance: 9/10

Summary

SALOMI is a research repo on extreme low-bit transformer quantization and inference. It explores whether binary or near-binary weight representations can approach or exceed ternary baselines under realistic evaluation. The repo documents the onebit toolkit, evaluation suite, and research notes. Its main findings are that post-hoc 1-bit quantization struggles on GPT-2-class models, and that roughly 1.2–1.35 bits per parameter is a more credible target when using Hessian-guided vector quantization (VQ) and mixed-precision approaches.
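To make the binary-versus-ternary comparison concrete, here is a minimal sketch of post-hoc quantization on a single synthetic weight row. It is not SALOMI's code and does not use the onebit toolkit; the function names, the Gaussian weights, the 0.7 threshold ratio, and the 16-bit per-row scale are all assumptions chosen for illustration. It also does not implement Hessian-guided VQ or mixed precision; it only shows why a plain sign code loses more information than a ternary code, and how a bits-per-parameter figure can be counted.

```python
import math
import random


def binarize(row):
    """Sign quantization with one scale per row: w ~ alpha * sign(w)."""
    # mean(|w|) is the least-squares optimal scale for a sign code.
    alpha = sum(abs(w) for w in row) / len(row)
    return [alpha if w >= 0 else -alpha for w in row]


def ternarize(row, thresh_ratio=0.7):
    """Ternary quantization: w ~ alpha * {-1, 0, +1}.

    thresh_ratio=0.7 is a common heuristic, assumed here for illustration.
    """
    delta = thresh_ratio * sum(abs(w) for w in row) / len(row)
    kept = [abs(w) for w in row if abs(w) > delta]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return [0.0 if abs(w) <= delta else (alpha if w > 0 else -alpha) for w in row]


def relative_error(row, approx):
    """Squared reconstruction error divided by the squared norm of the row."""
    num = sum((w - a) ** 2 for w, a in zip(row, approx))
    den = sum(w * w for w in row)
    return num / den


def bits_per_parameter(code_bits, row_len, scale_bits=16):
    """Code bits per weight plus the amortized cost of one scale per row."""
    return code_bits + scale_bits / row_len


# Synthetic stand-in for one row of a weight matrix (hypothetical data).
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(768)]

err_bin = relative_error(row, binarize(row))
err_ter = relative_error(row, ternarize(row))
bpp_bin = bits_per_parameter(1.0, len(row))
bpp_ter = bits_per_parameter(math.log2(3), len(row))

print(f"binary : {bpp_bin:.3f} bits/param, relative error {err_bin:.3f}")
print(f"ternary: {bpp_ter:.3f} bits/param, relative error {err_ter:.3f}")
```

On this toy row the ternary code reconstructs the weights more accurately than the sign code, at a higher bits-per-parameter cost. Budgets between the two, such as the 1.2–1.35 range cited above, fall between these two endpoints.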
