view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159
DFlash Collection Block Diffusion for Flash Speculative Decoding • 21 items • Updated 1 day ago • 109
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22, 2025 • 62