Real-time Vision Models Collection A collection of real-time detectors. • 19 items • Updated Nov 23, 2025 • 22
Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer Paper • 2510.25976 • Published Oct 29, 2025 • 14
VideoSSR: Video Self-Supervised Reinforcement Learning Paper • 2511.06281 • Published Nov 9, 2025 • 24
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 132
Omnilingual ASR (1,600+ Languages) Collection https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/ • 15 items • Updated Dec 8, 2025 • 1
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated 9 days ago • 161
MobileCLIP2 Collection MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B • 37 items • Updated Sep 18, 2025 • 57
FastVLM Collection Efficient Vision Encoding for Vision Language Models • 9 items • Updated Sep 2, 2025 • 106
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 270
Retrieval-augmented Large Language Models for Financial Time Series Forecasting Paper • 2502.05878 • Published Feb 9, 2025 • 40