microsoft/VibeVoice-1.5B • Text-to-Speech • 3B • Updated • 329k • 2.19k
Track, rank and evaluate open LLMs and chatbots
VLMEvalKit Evaluation Results Collection
Identify objects in images using text queries