A New Version of Our Oracle Solaris Environment for Developers
Summary
The post discusses LLM Neuroanatomy. It shows how duplicating middle transformer layers, without adding any new weights, can improve performance, and it proposes that transformers contain functional circuits. It covers base64 chat quirks, Goliath-style Frankenmerge experiments, heatmaps, and the RYS-XLarge results that topped the leaderboard, and it closes with implications for mechanistic interpretability and model design.
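
To make the layer-duplication idea concrete, here is a minimal sketch. It assumes a model's layers can be treated as an ordered list and that a contiguous middle block is run twice in a row. The function name `duplicate_block`, the toy layer names, and the chosen indices are all hypothetical illustrations; the summary does not say which layers the post duplicates or how RYS-XLarge was built.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def duplicate_block(layers: Sequence[T], start: int, end: int) -> List[T]:
    """Return a new layer order in which layers[start:end] runs twice in a row.

    The repeated entries are references to the same objects, so the
    expanded stack is deeper but contains no new weights.
    """
    if not 0 <= start < end <= len(layers):
        raise ValueError("need 0 <= start < end <= len(layers)")
    return list(layers[:end]) + list(layers[start:end]) + list(layers[end:])


# Toy stand-ins for transformer layers (hypothetical, not a real model).
toy_layers = [f"layer_{i}" for i in range(8)]

# Repeat the middle block made of layers 3, 4, and 5 (indices chosen arbitrarily).
expanded = duplicate_block(toy_layers, start=3, end=6)
print(expanded)
```

In a real framework the same reordering would be applied to the model's layer container, with the duplicated entries sharing parameters with the originals. That sharing is what "without new weights" refers to here.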