Accelerating Streaming Video Large Language Models via Hierarchical Token Compression Paper • 2512.00891 • Published Nov 30, 2025 • 17
VL-JEPA: Joint Embedding Predictive Architecture for Vision-language Paper • 2512.10942 • Published Dec 11, 2025 • 61
Scaling Audio-Text Retrieval with Multimodal Large Language Models Paper • 2602.18010 • Published Feb 20 • 1
MARQUIS: A Three-Stage Pipeline for Video Retrieval-Augmented Generation Paper • 2605.17640 • Published May 17
Principled Context Engineering for RAG: Statistical Guarantees via Conformal Prediction Paper • 2511.17908 • Published Jan 19
Article **ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models?** lightonai • Feb 19 • 22
Article mmBERT: ModernBERT goes Multilingual mmarone, orionweller, will-fleshman, eugene-yang, dlawrie, vandurme • Sep 9, 2025 • 147
Running The Ultra-Scale Playbook 🌌 3.9k The ultimate guide to training LLM on large GPU Clusters
Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks nvidia • Aug 11, 2025 • 76