jdopensource/JoyAI-VL-Interaction-Preview

Add video-text-to-text pipeline tag to metadata

by nielsr HF Staff - opened 4 days ago

Files changed (1)

README.md CHANGED

@@ -1,6 +1,8 @@
 ---
 license: apache-2.0
+pipeline_tag: video-text-to-text
 ---
 # JoyAI-VL-Interaction
 **The first open, vision-driven real-time interaction model — it watches a live video stream and decides on its own when to speak, stay silent, or delegate.**
@@ -42,4 +44,4 @@ python -m vllm_omni.experimental.fullduplex.joyvl.serving.server --port 8070 \
   --main-backend-url http://127.0.0.1:8061/v1 --main-model JoyAI-VL-Interaction-Preview
 ```
-For the full browser demo — live webcam / RTSP input, voice (ASR/TTS), and the per-tick decision stream — run JD's official WebUI (`services/webui`) in front of the orchestrator; see the [vLLM-Omni recipe](https://github.com/vllm-project/vllm-omni/blob/main/recipes/JD/JoyAI-VL-Interaction.md) for the steps.
+For the full browser demo — live webcam / RTSP input, voice (ASR/TTS), and the per-tick decision stream — run JD's official WebUI (`services/webui`) in front of the orchestrator; see the [vLLM-Omni recipe](https://github.com/vllm-project/vllm-omni/blob/main/recipes/JD/JoyAI-VL-Interaction.md) for the steps.
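
To sanity-check the metadata change locally, the front matter of the updated README.md can be read back and inspected. The following is only a minimal sketch: `read_front_matter` and `README_HEAD` are hypothetical names introduced here, and the parser assumes flat `key: value` pairs between the two `---` delimiters, which is all this README uses. Nested or multi-line YAML would need a real YAML parser.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Return the flat key/value pairs found between the leading '---' lines.

    Hypothetical helper: handles only simple `key: value` lines, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # the file does not start with a front-matter block
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # opening delimiter without a closing one: treat as no metadata


# Head of README.md as it appears after this change (copied from the diff).
README_HEAD = """---
license: apache-2.0
pipeline_tag: video-text-to-text
---
# JoyAI-VL-Interaction
"""

meta = read_front_matter(README_HEAD)
print(meta.get("pipeline_tag"))
```

Running the same helper on the pre-change header returns only the `license` key, which is a quick way to confirm that the tag is the sole metadata difference.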