metadata
title: Qwen3-ASR Demo
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.8.0
app_file: app.py
pinned: false
license: apache-2.0

Qwen3-ASR Demo

This Space demonstrates Qwen3-ASR-1.7B, a state-of-the-art automatic speech recognition model from the Qwen team, powered by vLLM for high-speed inference.

Features

  • 30+ Language Support: Chinese, Cantonese, English, Japanese, Korean, Arabic, German, French, Spanish, Portuguese, and many more
  • Word/Character-level Timestamps: Accurate timestamp alignment for each word (English) or character (Chinese)
  • Interactive Visualization: Click on each word/character to hear the corresponding audio segment
  • vLLM Backend: Fast inference for real-time transcription
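
As a rough illustration of how the interactive visualization can map a clicked word or character back to audio, the sketch below slices a sample buffer by a token's start and end times. The `TimedToken` type, the `slice_segment` helper, and the timestamp layout (seconds as floats) are hypothetical and chosen for illustration; the format this Space actually produces may differ.

```python
from dataclasses import dataclass


@dataclass
class TimedToken:
    """Hypothetical container for one word or character with its time span."""
    text: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def slice_segment(samples, sample_rate, token):
    """Return the samples that fall inside the token's [start, end) span."""
    first = max(0, int(round(token.start * sample_rate)))
    last = min(len(samples), int(round(token.end * sample_rate)))
    return samples[first:last]


# One second of silence at 16 kHz and two made-up tokens.
samples = [0.0] * 16000
tokens = [TimedToken("hello", 0.25, 0.5), TimedToken("world", 0.5, 1.0)]
clips = [slice_segment(samples, 16000, t) for t in tokens]
```

Clamping the indices to the buffer bounds keeps the helper safe when a timestamp slightly overshoots the end of the recording.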

How to Use

  1. Upload an audio file or record using your microphone
  2. Select a language or leave "Auto" for automatic detection
  3. Enable "Timestamps" for visualization (recommended)
  4. Click "Transcribe" to view the results

Models Used

Setup (For Space Owners)

This Space requires access to private models. You need to set up the HF_TOKEN secret:

  1. Go to your Space Settings
  2. Navigate to "Repository secrets"
  3. Add a new secret with name HF_TOKEN and your Hugging Face access token as the value
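
Secrets configured this way are exposed to the running Space as environment variables. The sketch below shows one way an app could read the token and fail early with a clear message when it is missing. The `get_hf_token` helper is hypothetical; how this Space's `app.py` actually consumes the token is not shown here.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN value, or raise a clear error if it is unset or blank."""
    # Hypothetical helper: `env` defaults to the process environment and can be
    # replaced with a plain dict when testing.
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it under the Space's repository secrets."
        )
    return token
```

Checking for the token at startup surfaces a misconfigured secret immediately instead of as an authorization error when the model is first loaded.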

Links

License

Apache 2.0