JacobLinCool committed on
Commit 689b043 · verified · 1 Parent(s): eab73e7

model card: fresh-eval numbers + protocol

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -136,7 +136,8 @@ allocated during inference.
  ASCEND, NTUML2021), with general + code-switch **replay** to preserve the base model's broad and bilingual
  ability. The audio encoder is left frozen.
  - **Localization**: Traditional-script + Taiwan-lexicon output is rendered through the model's **own tokenizer**
- (the surface mapping is baked once at build time); there is **no OpenCC or string rewriting at inference**.
+ (the surface mapping is baked once at build time); there is **no post-processing at inference** — the
+ Traditional output comes straight from the model's own tokenizer decode.
  - **Packaging**: the adapter is **merged** into the base and the localized tokenizer is shipped with it, so the
  release is a single drop-in checkpoint that loads like stock Qwen3-ASR.
  - **Decoding tip**: pass `language="Chinese"` for Taiwan speech; this also prevents translation-style outputs on