Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models
arXiv:2512.13194v2 Announce Type: replace-cross
Abstract: Speculative Decoding is a prominent technique for accelerating the autoregressive inference of large language models…