Talkie: a 13B vintage language model from 1930
Summary
Talkie is a 13B-parameter vintage language model trained on pre-1931 text. The article investigates how such models perform relative to modern counterparts, covering data quality challenges such as OCR errors, leakage prevention, post-training pipelines, and plans for scaling. Throughout, it highlights how historical data shapes model behavior and what research insights that yields.