Shivatma
AI & ML interests
None yet
Recent Activity
reacted to prithivMLmods's post 21 days ago
One speech model with seven voices, streamlined with multimodal capabilities for vision tasks. Performs vision (image-text) to audio inference with Qwen2.5-VL + VibeVoice-Realtime-0.5B. Vision to VibeVoice (EN) - The demo is live.
Vision-to-VibeVoice-en [Demo]: https://huggingface.co/spaces/prithivMLmods/Vision-to-VibeVoice-en
Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
Speech [VibeVoice-Realtime-0.5B]: https://huggingface.co/microsoft/VibeVoice-Realtime-0.5B
Vision [Qwen2.5-VL]: https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct
To learn more, visit the app page or the respective model page!
reacted to prithivMLmods's post about 1 month ago
Made a demo Space for multimodal understanding with Qwen3-VL, covering tasks including point annotation, detection, captioning, guided text inference, and more. Find the demo link below.
Space [Demo]: https://huggingface.co/spaces/prithivMLmods/Qwen3-VL-HF-Demo
Model Used: https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct
Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
GitHub: https://github.com/PRITHIVSAKTHIUR/Qwen-3VL-Multimodal-Understanding
To learn more, visit the app page or the respective model page!
Organizations
None yet