EAGLE 3.1: Advancing Speculative Decoding Through Collaboration Between the EAGLE Team, vLLM, and TorchSpec
Summary
EAGLE 3.1 introduces architectural changes to speculative decoding that improve robustness and throughput across deployment scenarios. This post covers training with TorchSpec, integration with vLLM, and the open-source collaboration behind the work. It also presents an EAGLE 3.1 draft model and the performance gains.