DFlash • Collection • Block Diffusion for Flash Speculative Decoding • 23 items • Updated 3 days ago • 142
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • Paper • 2511.10645 • Published Nov 13, 2025 • 14 • 4
ParoQuant • Collection • Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 24 items • Updated 23 days ago • 27
SparseLoRA • Collection • Accelerating LLM Fine-Tuning with Contextual Sparsity • 4 items • Updated Mar 11 • 3