JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence Paper • 2606.14777 • Published 7 days ago • 154
Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling Paper • 2602.02453 • Published Feb 2 • 36
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • 2B • Updated Feb 24, 2025 • 480k • • 1.52k
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation Paper • 2601.13976 • Published Jan 20 • 22