Polyglot-Lion-1.7B: High-accuracy multilingual ASR for Singapore — English, Mandarin, Tamil & Malay

Project Page GitHub License: MIT

[Figure: Average error rate comparison across models]

About

Polyglot-Lion-1.7B was developed by Quy-Anh Dang and Chris Ngo at Knovel Engineering and presented in the report "Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR".

The model is obtained by fine-tuning Qwen3-ASR-1.7B exclusively on publicly available speech corpora covering Singapore's four official languages. It utilizes a balanced sampling strategy that equalizes the number of training utterances per language and deliberately omits language-tag conditioning, allowing the model to learn to identify languages implicitly from audio.
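The balanced sampling idea can be illustrated with a short sketch. This is not the authors' training code: the helper name `balance_by_language`, the choice to downsample every language to the size of the smallest one, and the toy file names are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def balance_by_language(utterances, seed=0):
    """Return a shuffled subset with the same number of utterances per language.

    `utterances` is a list of (language, audio_path) pairs. Hypothetical helper:
    it downsamples each language to the size of the smallest one (an assumption;
    upsampling smaller languages would be another way to equalize counts).
    """
    by_lang = defaultdict(list)
    for lang, path in utterances:
        by_lang[lang].append(path)

    target = min(len(paths) for paths in by_lang.values())
    rng = random.Random(seed)

    balanced = []
    for lang in sorted(by_lang):
        for path in rng.sample(by_lang[lang], target):
            # The language label is kept here only for bookkeeping; in the setup
            # described above it would not be fed to the model as a tag.
            balanced.append((lang, path))
    rng.shuffle(balanced)
    return balanced


# Toy corpus with unequal language sizes (file names are made up).
toy = (
    [("en", f"en_{i}.wav") for i in range(5)]
    + [("zh", f"zh_{i}.wav") for i in range(3)]
    + [("ta", f"ta_{i}.wav") for i in range(2)]
    + [("ms", f"ms_{i}.wav") for i in range(4)]
)
balanced = balance_by_language(toy)
counts = Counter(lang for lang, _ in balanced)
print(counts)
```

With equal counts per language, no single language dominates the gradient updates, which is the property the paragraph above describes.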

Polyglot-Lion-1.7B achieves an average error rate of 14.85, competitive with MERaLiON-2-10B-ASR (14.32), a model 6× larger, while running inference 20× faster.

  • Parameters: 1.7B
  • Languages: English, Mandarin, Tamil, Malay
  • Training cost: $81 on a single NVIDIA RTX PRO 6000 (48 h)
  • Inference speed: ~0.10 s/sample on RTX PRO 4500
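The averages quoted above appear to be unweighted means over the twelve benchmark columns of the Results table. That reading is an inference from the published numbers rather than something the card states, and the recomputed values agree with the reported Avg only to within about 0.01, presumably because the per-benchmark scores are themselves rounded. A small check, with the scores copied from the table:

```python
# Per-benchmark scores copied from the Results table (12 columns, left to right).
scores = {
    "Polyglot-Lion-1.7B": [2.10, 5.28, 4.91, 1.45, 1.86, 8.00,
                           39.19, 19.75, 26.83, 37.28, 21.51, 9.98],
    "MERaLiON-2-10B-ASR": [2.54, 4.62, 8.83, 3.09, 4.07, 11.99,
                           31.78, 19.29, 22.42, 28.68, 25.90, 8.55],
}


def macro_average(values):
    """Unweighted mean: every benchmark counts equally, whatever its size."""
    return sum(values) / len(values)


for name, values in scores.items():
    print(f"{name}: {macro_average(values):.3f}")

# Parameter-count ratio behind the "6x larger" statement (10B vs 1.7B).
size_ratio = 10 / 1.7
print(f"size ratio: {size_ratio:.2f}")
```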

Results

| Model | Params | English (LS) | English (NSC) | Mandarin (CV) | Mandarin (AISH1) | Mandarin (AISH3) | Mandarin (Fleurs) | Tamil (CV) | Tamil (SLR65) | Tamil (SLR127) | Tamil (Fleurs) | Malay (Meso.) | Malay (Fleurs) | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Whisper-large-v3-turbo | 0.8B | 3.04 | 32.02 | 17.91 | 9.64 | 16.81 | 10.63 | 74.50 | 58.13 | 69.56 | 66.90 | 28.47 | 8.88 | 33.04 |
| SeaLLMs-Audio-7B | 7B | 94.74 | 9.53 | 8.68 | 9.65 | 9.76 | 37.09 | 126.70 | 127.24 | 138.65 | 105.31 | 71.34 | 26.25 | 63.75 |
| Qwen2.5-Omni-3B | 3B | 29.21 | 34.79 | 46.36 | 28.25 | 44.55 | 54.74 | 318.36 | 465.58 | 448.82 | 311.67 | 211.90 | 74.69 | 172.37 |
| Qwen2.5-Omni-7B | 7B | 13.80 | 22.96 | 14.49 | 7.33 | 22.58 | 16.68 | 252.06 | 239.15 | 303.96 | 326.43 | 158.06 | 43.92 | 118.45 |
| Qwen3-ASR-0.6B | 0.6B | 2.74 | 7.64 | 10.06 | 2.08 | 2.59 | 9.75 | 121.10 | 127.00 | 129.12 | 130.09 | 47.29 | 18.71 | 50.68 |
| Qwen3-ASR-1.7B | 1.7B | 2.31 | 6.22 | 7.50 | 1.52 | 2.08 | 9.33 | 139.96 | 134.63 | 144.49 | 147.23 | 39.00 | 10.87 | 53.76 |
| MERaLiON-2-10B-ASR | 10B | 2.54 | 4.62 | 8.83 | 3.09 | 4.07 | 11.99 | 31.78 | 19.29 | 22.42 | 28.68 | 25.90 | 8.55 | 14.32 |
| Polyglot-Lion-0.6B | 0.6B | 2.67 | 6.09 | 6.16 | 1.93 | 2.32 | 9.19 | 42.16 | 23.07 | 28.14 | 37.68 | 24.33 | 14.45 | 16.52 |
| Polyglot-Lion-1.7B | 1.7B | 2.10 | 5.28 | 4.91 | 1.45 | 1.86 | 8.00 | 39.19 | 19.75 | 26.83 | 37.28 | 21.51 | 9.98 | 14.85 |

WER (%) for English, Tamil, and Malay; CER (%) for Mandarin. Lower is better.
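For readers unfamiliar with the two metrics, the sketch below computes WER and CER from a Levenshtein edit distance. It is a minimal illustration, not the evaluation script behind the table: the function names are hypothetical, whitespace is stripped for CER as an assumption, and any text normalization (casing, punctuation, numerals) the authors may have applied is omitted.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent: word-level edits / reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, ignoring spaces (an assumption)."""
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deleted word out of six
print(cer("今天天气很好", "今天天气好"))                      # one deleted character out of six
print(wer("hello", "hello there my friend"))                # three insertions against one reference word
```

Because insertions count as errors while the denominator is the reference length, a model that hallucinates or emits text in the wrong language can score above 100%, which is how values such as 318.36 arise in the table.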

Quick Start

Polyglot-Lion uses the qwen-asr package for inference.

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create environment and install
uv venv --python 3.12 && source .venv/bin/activate
uv pip install qwen-asr hf_transfer

Transformers

import torch
from qwen_asr import Qwen3ASRModel

# Load the fine-tuned checkpoint in bfloat16 on the first GPU.
model = Qwen3ASRModel.from_pretrained(
    "knoveleng/polyglot-lion-1.7b",
    dtype=torch.bfloat16,
    device_map="cuda:0",
    max_new_tokens=256,
)

# language=None leaves language identification to the model.
results = model.transcribe(audio="path/to/audio.wav", language=None)
print(results[0].language, results[0].text)

vLLM (faster)

from qwen_asr import Qwen3ASRModel

# Keep the entry point guarded: vLLM may start worker processes.
if __name__ == "__main__":
    model = Qwen3ASRModel.LLM(
        model="knoveleng/polyglot-lion-1.7b",
        gpu_memory_utilization=0.7,
        max_new_tokens=4096,
    )
    # Pass a list of paths to transcribe several files in one call.
    results = model.transcribe(audio=["audio1.wav", "audio2.wav"], language=None)
    for r in results:
        print(r.language, r.text)

For batch inference, timestamps, streaming, and server deployment, see the Qwen3-ASR documentation.
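As a small convenience on top of the calls shown above, the following sketch transcribes every `.wav` file in a folder in fixed-size batches. It relies only on the `transcribe(audio=[...], language=None)` call and the `.language` / `.text` result fields used in the examples above. The helper `transcribe_folder`, the `batch_size` default, and the assumption that results come back in input order are illustrative, and `_EchoModel` is a stand-in so the sketch can be dry-run without a GPU.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def transcribe_folder(model, folder, batch_size=8):
    """Yield (path, language, text) for every .wav file in `folder`.

    `model` is any object exposing transcribe(audio=[...], language=None),
    such as the Qwen3ASRModel instances created above. Assumes results are
    returned in the same order as the input paths.
    """
    paths = sorted(str(p) for p in Path(folder).glob("*.wav"))
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        results = model.transcribe(audio=batch, language=None)
        for path, result in zip(batch, results):
            yield path, result.language, result.text


class _EchoModel:
    """Stand-in for dry runs: returns the file stem as the 'transcript'."""

    def transcribe(self, audio, language=None):
        return [SimpleNamespace(language="unknown", text=Path(a).stem) for a in audio]


# Dry run on empty placeholder files; swap _EchoModel() for a real model.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.wav", "b.wav", "c.wav"):
        (Path(tmp) / name).touch()
    rows = list(transcribe_folder(_EchoModel(), tmp, batch_size=2))

for path, language, text in rows:
    print(Path(path).name, language, text)
```

Batching keeps memory use bounded on large folders while still letting the backend process several files per call.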

Citation

@misc{dang2026polyglotlion,
    title={Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR}, 
    author={Quy-Anh Dang and Chris Ngo},
    year={2026},
    eprint={2603.16184},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2603.16184}, 
}