---
license: apache-2.0
---
# JoyAI-VL-Interaction
**The first open, vision-driven, real-time interaction model: it watches a live video stream and decides on its own when to speak, stay silent, or delegate.**
[📄 Paper](https://arxiv.org/abs/2606.14777) · [🌐 Project Page & Demos](https://joyai-vl-video-future-academy-jd.github.io/JoyAI-VL-Interaction/) · [💻 GitHub](https://github.com/jd-opensource/JoyAI-VL-Interaction) · [🤗 Paper Page](https://huggingface.co/papers/2606.14777)
---