Article Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models 24 days ago • 19
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 24 days ago • 141
Article Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR 25 days ago • 74
Article The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU 28 days ago • 12
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 119
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 287
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers Paper • 2511.09554 • Published Nov 12, 2025 • 8
Article Llasa Goes RL: Training LLaSA with GRPO for Improved Prosody and Expressiveness Nov 5, 2025 • 11
Article huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning +2 Oct 27, 2025 • 75
Article LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR Oct 23, 2025 • 73