metadata

title: Qwen3-ASR Demo
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.8.0
app_file: app.py
pinned: false
license: apache-2.0

Qwen3-ASR Demo

This Space demonstrates Qwen3-ASR-1.7B, a state-of-the-art automatic speech recognition model from the Qwen team, powered by vLLM for high-speed inference.

Features

30+ Language Support: Chinese, Cantonese, English, Japanese, Korean, Arabic, German, French, Spanish, Portuguese, and many more
Word/Character-level Timestamps: Accurate timestamp alignment for each word (English) or character (Chinese)
Interactive Visualization: Click on each word/character to hear the corresponding audio segment
vLLM Backend: Fast inference for real-time transcription
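
As a rough illustration of how word/character-level timestamps can drive the click-to-play visualization, the sketch below cuts one token's audio out of a clip. The `TimedToken` type, the `slice_segment` helper, and the sample timestamps are hypothetical and are not taken from this Space's `app.py`; the aligner's real output format may differ.

```python
from dataclasses import dataclass


@dataclass
class TimedToken:
    """One aligned unit: a word (English) or a character (Chinese)."""
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


def slice_segment(samples, sample_rate, token):
    """Return the audio samples that fall inside the token's time span."""
    first = max(0, int(round(token.start * sample_rate)))
    last = min(len(samples), int(round(token.end * sample_rate)))
    return samples[first:last]


# Hypothetical data: one second of silence at 16 kHz and two aligned words.
sample_rate = 16_000
samples = [0.0] * sample_rate
tokens = [TimedToken("hello", 0.10, 0.45), TimedToken("world", 0.50, 0.90)]

# Clicking "hello" would play only this slice of the clip.
segment = slice_segment(samples, sample_rate, tokens[0])
```

Clamping the indices to the clip length keeps the slice valid even if an aligner reports an end time slightly past the end of the audio.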

How to Use

Upload an audio file or record with your microphone
Select a language, or leave it set to "Auto" for automatic detection
Enable "Timestamps" to get the interactive visualization (recommended)
Click "Transcribe" to see the results

Models Used

ASR Model: Qwen/Qwen3-ASR-1.7B
Forced Aligner: Qwen/Qwen3-ForcedAligner-0.6B

Setup (For Space Owners)

This Space requires access to private models. You need to set up the HF_TOKEN secret:

Go to your Space Settings
Navigate to "Repository secrets"
Add a new secret named HF_TOKEN, with your Hugging Face access token as the value
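
Space secrets are exposed to the running app as environment variables, so the application code can read the token from `HF_TOKEN` at startup. The following is a minimal sketch of that pattern, not the actual contents of `app.py`: the helper names `get_hf_token` and `download_models` are hypothetical, and the sketch assumes the `huggingface_hub` package is installed.

```python
import os

# Model repositories listed under "Models Used".
MODEL_REPOS = ["Qwen/Qwen3-ASR-1.7B", "Qwen/Qwen3-ForcedAligner-0.6B"]


def get_hf_token(env=None):
    """Read HF_TOKEN from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Add it under Space Settings, 'Repository secrets'."
        )
    return token


def download_models(token):
    """Fetch each model repository with the token (hypothetical helper)."""
    # Imported lazily so the token check above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return {repo: snapshot_download(repo_id=repo, token=token) for repo in MODEL_REPOS}
```

Failing early with a clear message makes a missing or empty secret easier to diagnose than a later authorization error during model download.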

License

Apache 2.0