Yuriy Perezhohin

yuriyvnv

https://scholar.google.com/citations?user=I5uzFtwAAAAJ&hl=en

AI & ML interests

Automatic Speech Recognition, Embeddings, Code Generation, Synthetic Data Generation and Filtering

Recent Activity

liked a model 13 days ago

yuriyvnv/WAVe-1B-Multimodal-NL

liked a model 13 days ago

yuriyvnv/whisper-large-v3-high-mixed-nl

liked a model 13 days ago

yuriyvnv/WAVe-1B-Multimodal-PT

liked a model 23 days ago

nvidia/nemotron-3.5-asr-streaming-0.6b

Automatic Speech Recognition • 0.6B • Updated about 20 hours ago • 61.9k • 715

updated 2 models 24 days ago

yuriyvnv/parakeet-tdt-0.6b-EN-Medical

Automatic Speech Recognition • Updated 24 days ago • 81

yuriyvnv/Qwen3-ASR-1.7B-EN-Medical

Automatic Speech Recognition • 2B • Updated 24 days ago • 178

posted an update 24 days ago

🏥 Two medical English ASR models are up
Hey, back from a long holiday. While I was out, the team kept working on this one, and the results are pretty interesting: medical English ASR, evaluated against the published MultiMed paper.

🩺 yuriyvnv/parakeet-tdt-0.6b-EN-Medical
🩺 yuriyvnv/Qwen3-ASR-1.7B-EN-Medical

Both were trained on MultiMed (leduckhai/MultiMed) mixed with the Common Voice 17 English train and validation splits. Mixing CV in prevents catastrophic forgetting of general English: medical-only training without CV cost us 5 absolute WER points on general English.

📊 Normalized WER on MultiMed-en test, same protocol as the paper:

Parakeet 0.6B zero-shot: 19.22
Parakeet 0.6B fine-tuned: 14.31 (about 25.5% relative reduction)

Qwen3-ASR 1.7B zero-shot: 16.41 (although here we saw catastrophic forgetting on the CV test set)
Qwen3-ASR 1.7B fine-tuned: 16.50
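
For anyone who wants to sanity-check numbers like these, here is a minimal sketch of a normalized WER and the relative-reduction arithmetic. The `normalize` function is a simple stand-in (lowercasing plus punctuation stripping) and is an assumption, not the normalizer from the MultiMed paper; all function names are hypothetical.

```python
import re


def normalize(text: str) -> list[str]:
    """Stand-in normalizer (assumption): lowercase, strip punctuation, split on whitespace."""
    text = re.sub(r"[^\w\s']", " ", text.lower())
    return text.split()


def word_errors(ref: list[str], hyp: list[str]) -> int:
    """Word-level Levenshtein distance: substitutions + deletions + insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_wer(refs: list[str], hyps: list[str]) -> float:
    """Corpus-level WER in percent: total word errors over total reference words."""
    errors = 0
    words = 0
    for ref, hyp in zip(refs, hyps):
        r, h = normalize(ref), normalize(hyp)
        errors += word_errors(r, h)
        words += len(r)
    return 100.0 * errors / words


def relative_reduction(baseline: float, new: float) -> float:
    """Relative WER change in percent; positive means the new system is better."""
    return 100.0 * (baseline - new) / baseline


# Toy transcript pair (made up for illustration): one substitution, one deletion.
print(normalized_wer(["The patient has a fever."], ["the patient had fever"]))  # 40.0

# Numbers from the post above.
print(f"{relative_reduction(19.22, 14.31):.1f}")  # 25.5 (Parakeet)
print(f"{relative_reduction(16.41, 16.50):.1f}")  # -0.5 (Qwen3-ASR, a slight increase)
```

In practice a maintained library and the paper's own text normalizer would be the safer choice for reproducing published scores; the sketch only shows what the metric and the percentages mean.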

@hf-audio @QwenLM thanks for the toolkits. Big thanks to @leduckhai and the MultiMed authors for the dataset.

#asr #speech #medical #healthcareai #parakeet #qwen #qwen3asr #nemo #medicalasr