Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution Paper • 2408.00160 • Published Jul 31, 2024 • 1
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning Paper • 2505.23883 • Published May 29 • 2
BIOCAP: Exploiting Synthetic Captions Beyond Labels in Biological Foundation Models Paper • 2510.20095 • Published Oct 23
AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning Paper • 2510.16156 • Published Oct 17 • 1
DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance Paper • 2505.14708 • Published May 17
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation Paper • 2510.00515 • Published Oct 1 • 39
The Geometry of Reasoning: Flowing Logics in Representation Space Paper • 2510.09782 • Published Oct 10 • 6
Why Do Transformers Fail to Forecast Time Series In-Context? Paper • 2510.09776 • Published Oct 10 • 2
AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons Paper • 2503.05731 • Published Feb 19 • 3
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29 • 140
Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap Paper • 2509.26542 • Published Sep 30 • 8
CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models Paper • 2505.19235 • Published May 25 • 3
Angles Don't Lie: Unlocking Training-Efficient RL Through the Model's Own Signals Paper • 2506.02281 • Published Jun 2 • 4
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers Paper • 2412.12444 • Published Dec 17, 2024
Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs Paper • 2506.00577 • Published May 31 • 11
HippoMM: Hippocampal-inspired Multimodal Memory for Long Audiovisual Event Understanding Paper • 2504.10739 • Published Apr 14 • 2
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm Paper • 2409.07226 • Published Sep 11, 2024 • 1
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey Paper • 2407.21794 • Published Jul 31, 2024 • 6