Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 4 days ago • 67
Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery Paper • 2601.20088 • Published 7 days ago • 1
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 5 days ago • 68
ECO: Quantized Training without Full-Precision Master Weights Paper • 2601.22101 • Published 5 days ago • 6
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published 5 days ago • 8
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion Paper • 2601.22143 • Published 5 days ago • 5
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published 6 days ago • 170
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation Paper • 2601.21420 • Published 5 days ago • 40
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs Paper • 2504.17768 • Published Apr 24, 2025 • 14
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 5 days ago • 14
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published 6 days ago • 21