DigiNews

Tech Watch by Johan Denoyer


LLMs are complicated now

Quality: 8/10 | Relevance: 9/10

Summary

The post examines how large language models have grown more complex since the work of the early 2020s. It highlights architectural variations, mixture-of-experts layers, and the move from simpler two-tower, recsys-style designs to multi-GPU inference and a range of attention variants. It cites open models such as Llama, tooling such as FlexAttention, and practitioners such as Karpathy to illustrate the shift toward composability and kernel-level optimization. The piece serves as a technical reflection on model design, tooling, and the challenges of evolving architectures.

🚀 Service built by Johan Denoyer