Add video-text-to-text pipeline tag to metadata

#3 opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -1,6 +1,8 @@
  ---
  license: apache-2.0
+ pipeline_tag: video-text-to-text
  ---
+
  # JoyAI-VL-Interaction

  **The first open, vision-driven real-time interaction model — it watches a live video stream and decides on its own when to speak, stay silent, or delegate.**
@@ -42,4 +44,4 @@ python -m vllm_omni.experimental.fullduplex.joyvl.serving.server --port 8070 \
  --main-backend-url http://127.0.0.1:8061/v1 --main-model JoyAI-VL-Interaction-Preview
  ```

- For the full browser demo — live webcam / RTSP input, voice (ASR/TTS), and the per-tick decision stream — run JD's official WebUI (`services/webui`) in front of the orchestrator; see the [vLLM-Omni recipe](https://github.com/vllm-project/vllm-omni/blob/main/recipes/JD/JoyAI-VL-Interaction.md) for the steps.
+ For the full browser demo — live webcam / RTSP input, voice (ASR/TTS), and the per-tick decision stream — run JD's official WebUI (`services/webui`) in front of the orchestrator; see the [vLLM-Omni recipe](https://github.com/vllm-project/vllm-omni/blob/main/recipes/JD/JoyAI-VL-Interaction.md) for the steps.
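
The Hub reads `pipeline_tag` from the YAML front matter at the top of `README.md`, which is why the first hunk adds a line between the `---` delimiters. A minimal sketch of that edit as a pure-Python string transformation follows. The helper name `add_pipeline_tag` is hypothetical and is not how this PR was produced; it handles only simple `key: value` front matter and, unlike the PR, does not insert the blank line after the closing `---`.

```python
def add_pipeline_tag(readme: str, tag: str = "video-text-to-text") -> str:
    """Return README text whose YAML front matter contains `pipeline_tag: <tag>`.

    Hypothetical helper for illustration only: it treats the front matter
    as plain lines and does not parse YAML.
    """
    lines = readme.split("\n")

    # No front matter at all: create a minimal one above the existing text.
    if lines[0].strip() != "---":
        return f"---\npipeline_tag: {tag}\n---\n\n{readme}"

    # Locate the closing '---' delimiter of the front matter.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("front matter is not terminated by '---'")

    # Leave an existing pipeline_tag untouched so the call is idempotent.
    if any(line.startswith("pipeline_tag:") for line in lines[1:end]):
        return readme

    # Insert the new key just before the closing delimiter.
    lines.insert(end, f"pipeline_tag: {tag}")
    return "\n".join(lines)


# Shortened stand-in for the model card in this PR.
README = "---\nlicense: apache-2.0\n---\n# JoyAI-VL-Interaction\n"
updated = add_pipeline_tag(README)
print(updated)
```

Running the helper a second time on its own output returns the text unchanged, so it is safe to apply to a card that already carries the tag.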