Introspective Diffusion Language Models
Summary
Introspective Diffusion Language Models (I-DLM) bring introspective consistency to diffusion language models through Introspective Strided Decoding (ISD), which generates and verifies tokens in a single forward pass. The approach narrows the quality gap with autoregressive (AR) models and achieves 2.9-4.1x higher throughput at high concurrency. A gated-LoRA variant (R-ISD) makes the acceleration bit-for-bit lossless. The work reports benchmarks across 15 tasks and includes practical deployment notes on AR-compatible serving and a model zoo.
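
To make the "generate and verify" idea and the lossless claim concrete, here is a minimal sketch of a generic draft-and-verify loop. It is not the paper's ISD or R-ISD algorithm, whose details are not given in this summary. All names (`accept_verified_prefix`, `decode`, `toy_next`, `toy_propose`) are hypothetical, the toy functions stand in for real models, and verification runs as a Python loop where a real system would presumably use one batched forward pass.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-ins: a greedy reference "model" and a cheap drafter.
NextToken = Callable[[Sequence[int]], int]
Propose = Callable[[Sequence[int], int], List[int]]


def accept_verified_prefix(draft: Sequence[int], verified: Sequence[int]) -> List[int]:
    """Keep tokens while draft and verifier agree; at the first
    disagreement, keep the verifier's token and stop."""
    out: List[int] = []
    for d, v in zip(draft, verified):
        out.append(v)
        if d != v:
            break
    return out


def decode(prefix: Sequence[int], next_token: NextToken, propose: Propose,
           stride: int, max_new: int) -> List[int]:
    """Draft up to `stride` tokens per step, verify them, keep the agreed prefix."""
    tokens = list(prefix)
    target = len(tokens) + max_new
    while len(tokens) < target:
        k = min(stride, target - len(tokens))
        draft = propose(tokens, k)
        # Assumption: a real system would score all k positions in one pass.
        verified = [next_token(tokens + list(draft[:i])) for i in range(k)]
        tokens.extend(accept_verified_prefix(draft, verified))
    return tokens


def greedy(prefix: Sequence[int], next_token: NextToken, max_new: int) -> List[int]:
    """Plain one-token-at-a-time reference decoding."""
    tokens = list(prefix)
    for _ in range(max_new):
        tokens.append(next_token(tokens))
    return tokens


def toy_next(ctx: Sequence[int]) -> int:
    # Toy deterministic "model" over a 5-token vocabulary.
    return (ctx[-1] * 3 + len(ctx)) % 5


def toy_propose(ctx: Sequence[int], k: int) -> List[int]:
    # Toy drafter that is only sometimes right.
    return [(ctx[-1] + 1 + i) % 5 for i in range(k)]


if __name__ == "__main__":
    fast = decode([1], toy_next, toy_propose, stride=4, max_new=12)
    slow = greedy([1], toy_next, max_new=12)
    print(fast == slow)
```

Because a token is kept only when the context before it already matches what the reference would have produced, the strided output equals the one-token-at-a-time output exactly. In this sketch, that is the sense in which the acceleration is lossless: drafting quality changes the speed but not the result.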