Speculative Sampling Explained
Summary
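Explains speculative sampling, a method that draws samples from a target distribution while proposing tokens from a draft distribution, using rejection sampling and a residual distribution. The post covers down-sampling the tokens that the draft over-samples (by rejecting some of them), up-sampling the under-sampled tokens by resampling from the residual distribution after a rejection, and why the total rejection probability equals the probability mass the residual distribution has to restore, so that the output follows the target distribution exactly. It refers to a formal proof (Theorem 1).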
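
The accept, reject, and resample steps summarized above can be sketched for a single token as follows. This is a minimal illustration, not the post's own code: the names `speculative_step`, `output_distribution`, `p` (target probabilities), and `q` (draft probabilities) are hypothetical, and both distributions are assumed to be given as explicit probability vectors over the same vocabulary.

```python
import numpy as np


def speculative_step(p, q, rng):
    """Sample one token from target p using proposals from draft q (illustrative names)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    x = rng.choice(len(q), p=q)  # the draft proposes a token
    # Down-sampling: keep the proposal with probability min(1, p[x] / q[x]).
    # q[x] > 0 here because x was drawn from q.
    if rng.random() < min(1.0, p[x] / q[x]):
        return int(x)
    # Up-sampling: after a rejection, resample from the normalized residual max(p - q, 0).
    residual = np.maximum(p - q, 0.0)
    return int(rng.choice(len(p), p=residual / residual.sum()))


def output_distribution(p, q):
    """Exact distribution of speculative_step's output, computed without sampling."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    accepted = np.minimum(p, q)           # q(x) * min(1, p(x) / q(x))
    reject_total = 1.0 - accepted.sum()   # total rejection probability
    residual = np.maximum(p - q, 0.0)
    mass = residual.sum()                 # equals reject_total
    if mass > 0:
        residual = residual / mass
    return accepted + reject_total * residual


if __name__ == "__main__":
    p = np.array([0.5, 0.3, 0.2])  # assumed toy target
    q = np.array([0.2, 0.5, 0.3])  # assumed toy draft
    print(output_distribution(p, q))
```

With these toy vectors the accepted mass is min(p, q) = (0.2, 0.3, 0.2), the total rejection probability is 0.3, and the residual puts all of that 0.3 on the first token, which brings it back up to 0.5. The accepted mass and the resampled mass therefore add up to p.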