DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published 11 days ago • 36
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published 11 days ago • 26
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 12 days ago • 124
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published 18 days ago • 49
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation Paper • 2512.04678 • Published Dec 4, 2025 • 41
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 184
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 101
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 93
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 208
Cambrian-S: Towards Spatial Supersensing in Video Paper • 2511.04670 • Published Nov 6, 2025 • 38
The Smol Training Playbook 📚: The secrets to building world-class LLMs Space • Featured • 2.92k
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning Paper • 2510.13515 • Published Oct 15, 2025 • 12
UniME-V2 Collection The collections of UniME-V2's data and Model Weights • 6 items • Updated Nov 10, 2025 • 1
TianchengGu/UniME-V2-reranker-Qwen25VL-7B Image-Text-to-Text • 8B • Updated Oct 16, 2025 • 1.4k • 2