NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute
Summary
NanoGPT Slowrun is an open project from Q Labs that aims for data-efficient language modeling under unlimited compute. It initially demonstrated a 2.4x data-efficiency gain and has reached 5.5x so far by focusing on algorithmic improvements (shuffling, value embedding projections, the SwiGLU activation, ensembling), with future directions including second-order methods, diffusion models, and curriculum learning.
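Of the listed improvements, the SwiGLU activation is the most self-contained to illustrate. Below is a minimal NumPy sketch of a SwiGLU feed-forward block as commonly defined (Swish-gated linear unit followed by a down projection); the weight names, dimensions, and the exact wiring here are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def swish(z):
    # Swish / SiLU: z * sigmoid(z)
    return z * (1.0 / (1.0 + np.exp(-z)))

def swiglu_ffn(x, W_gate, W_up, W_down):
    # SwiGLU feed-forward block: a Swish-activated gate branch is
    # multiplied elementwise with a linear "up" branch, then the
    # result is projected back down to the model dimension.
    return (swish(x @ W_gate) * (x @ W_up)) @ W_down

# Tiny demo with assumed dimensions (d_model=4, d_hidden=8).
rng = np.random.default_rng(0)
d_model, d_hidden = 4, 8
x = rng.standard_normal((2, d_model))
W_gate = rng.standard_normal((d_model, d_hidden))
W_up = rng.standard_normal((d_model, d_hidden))
W_down = rng.standard_normal((d_hidden, d_model))
y = swiglu_ffn(x, W_gate, W_up, W_down)  # shape (2, d_model)
```

Compared with a plain two-layer MLP, the gated form adds a third weight matrix but is often reported to improve quality at equal parameter count.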