DFlash: Block Diffusion for Flash Speculative Decoding
Summary
DFlash is a GitHub project that introduces a lightweight block diffusion approach for speculative decoding in large language models. It provides model support, installation instructions for multiple backends, and quick-start guidance, highlighting open-source collaboration and benchmarking across datasets.