Nobody ever gets credit for fixing problems that never happened (2002) [pdf]
Summary
A detailed blog-style report on building a vintage LLM from scratch: a 340M-parameter model trained on pre-1900 English texts. It covers data collection and quality filtering, custom tokenization, base training and fine-tuning, and experiments across local and cloud GPU environments, with notes on cost and performance. The post emphasizes open-source datasets and code (HuggingFace and GitHub) and discusses current limitations, such as hallucinations and only partial success at recalling answers to basic math prompts.