Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention Paper • 2602.04789 • Published 2 days ago • 2
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit Paper • 2405.06001 • Published May 9, 2024
Focus-dLLM: Accelerating Long-Context Diffusion LLM Inference via Confidence-Guided Context Focusing Paper • 2602.02159 • Published 4 days ago
MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping Paper • 2511.15690 • Published Nov 19, 2025
Temporal Feature Matters: A Framework for Diffusion Model Quantization Paper • 2407.19547 • Published Jul 28, 2024
QVGen Collection: the official checkpoint collection for the paper https://arxiv.org/pdf/2505.11497 • 3 items • Updated 5 days ago
SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning Paper • 2508.06447 • Published Aug 8, 2025
LinVideo: A Post-Training Framework towards O(n) Attention in Efficient Video Generation Paper • 2510.08318 • Published Oct 9, 2025
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research Paper • 2509.13312 • Published Sep 16, 2025 • 105