Video-Text-to-Text
Safetensors
qwen3_vl
license: apache-2.0

JoyAI-VL-Interaction

The first open, vision-driven real-time interaction model — it watches a live video stream and decides on its own when to speak, stay silent, or delegate.

📄 Paper · 🌐 Project Page & Demos · 💻 GitHub · 🤗 Paper Page