Automatic Speech Recognition
Transformers
Safetensors
Chinese
English
qwen3_asr
taiwan-mandarin
traditional-chinese
code-switching
qwen3-asr
speech
Instructions to use JacobLinCool/TEA-ASR-1-mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use JacobLinCool/TEA-ASR-1-mini with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="JacobLinCool/TEA-ASR-1-mini")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("JacobLinCool/TEA-ASR-1-mini")
model = AutoModelForMultimodalLM.from_pretrained("JacobLinCool/TEA-ASR-1-mini")
```
- Notebooks
- Google Colab
- Kaggle
model card: fresh-eval numbers + protocol
README.md
CHANGED
```diff
@@ -136,7 +136,8 @@ allocated during inference.
 ASCEND, NTUML2021), with general + code-switch **replay** to preserve the base model's broad and bilingual
 ability. The audio encoder is left frozen.
 - **Localization**: Traditional-script + Taiwan-lexicon output is rendered through the model's **own tokenizer**
-(the surface mapping is baked once at build time); there is **no
+(the surface mapping is baked once at build time); there is **no post-processing at inference** — the
+Traditional output comes straight from the model's own tokenizer decode.
 - **Packaging**: the adapter is **merged** into the base and the localized tokenizer is shipped with it, so the
 release is a single drop-in checkpoint that loads like stock Qwen3-ASR.
 - **Decoding tip**: pass `language="Chinese"` for Taiwan speech; this also prevents translation-style outputs on
```
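The Localization bullet in this diff says the surface mapping is baked into the tokenizer once at build time, so no post-processing runs at inference. The toy sketch below illustrates that idea only; it is not the author's build script. The vocabulary, the mapping table, and the helper names `build_localized_vocab` and `decode` are all hypothetical, and a real tokenizer works on subword or byte-level units rather than whole words.

```python
# Toy illustration only: a hypothetical id -> surface-string vocabulary.
# Real tokenizers use subword/byte-level units; this just shows the principle.
BASE_VOCAB = {0: "软件", 1: "信息", 2: "很", 3: "重要"}

# Hypothetical surface mapping: Simplified / mainland terms -> Traditional / Taiwan terms.
SURFACE_MAP = {"软件": "軟體", "信息": "資訊"}


def build_localized_vocab(base_vocab, surface_map):
    """Apply the surface mapping once, at build time, keeping token ids unchanged."""
    return {tid: surface_map.get(s, s) for tid, s in base_vocab.items()}


def decode(vocab, ids):
    """Plain decode: look each id up in the vocabulary and concatenate."""
    return "".join(vocab[i] for i in ids)


# Build step (runs once, offline).
LOCALIZED_VOCAB = build_localized_vocab(BASE_VOCAB, SURFACE_MAP)

# Inference step: the same token ids decode to localized text with no extra pass.
ids = [0, 2, 3]
base_text = decode(BASE_VOCAB, ids)
localized_text = decode(LOCALIZED_VOCAB, ids)
print(base_text)
print(localized_text)
```

Because the token ids are untouched, the model weights need no change for this step in the sketch; only the id-to-string table shipped with the checkpoint differs.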