Benjamin Potzmann
Giymo11
AI & ML interests
Multimodal AI, LLMs, RAG
Organizations
None yet
Audio Only
- nvidia/canary-qwen-2.5b
  Automatic Speech Recognition • 3B • Updated • 73.5k • 427
- nvidia/parakeet-tdt-0.6b-v3
  Automatic Speech Recognition • 0.6B • Updated • 317k • 855
- zai-org/GLM-ASR-Nano-2512
  Automatic Speech Recognition • 2B • Updated • 136k • 368
- Qwen/Qwen2-Audio-7B
  Audio-Text-to-Text • 8B • Updated • 9.38k • 171
Multimodal (Audio + Visual)
Multimodal (Audio)