Training mRNA Language Models Across 25 Species for $165
Summary
Hacker News highlights OpenMed's open-source pipeline for training mRNA language models across 25 species for $165. The project trained four production models in 55 GPU-hours and introduced a species-conditioned system. Its CodonRoBERTa-large-v2 model achieves a perplexity of 4.10 and a Spearman CAI correlation of 0.40. The full results and runnable code are provided, and the discussion covers domain-specific models for health, chemistry, and economics, along with the implications for open-source biotech AI.