Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth mlabonne • Jul 29, 2024 • 371
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published Mar 25 • 28
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents Paper • 2510.23691 • Published Oct 27, 2025 • 56
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 49 items • Updated 7 days ago • 154
Article Introducing NVIDIA Cosmos Policy for Advanced Robot Control nvidia • Jan 29 • 48
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 140
Article State of open video generation models in Diffusers +1 sayakpaul, a-r-r-o-w, dn6 • Jan 27, 2025 • 70
Article Arc Virtual Cell Challenge: A Primer FL33TW00D-HF, abhinadduri • Jul 18, 2025 • 66
Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 480
Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix codelion • Nov 3, 2025 • 65
Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚 Isayoften • Aug 26, 2024 • 89
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 92
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published Oct 27, 2025 • 62
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published Oct 28, 2025 • 72