# FastConformer TDT Large – Web/SafeTensors Export
Browser-optimized SafeTensors export of nvidia/stt_en_fastconformer_tdt_large.
## Model Details
| Property | Value |
|---|---|
| Base model | nvidia/stt_en_fastconformer_tdt_large |
| Architecture | FastConformer-TDT (17 layers, d_model=512) |
| Parameters | ~115M |
| Decoder | Token-and-Duration Transducer (TDT), 2–5× faster inference than conventional RNN-T |
| Language | English |
| Weights format | SafeTensors, float16 (~218 MB) |
| Vocab size | 1025 tokens (SentencePiece BPE) |
| Mel bands | 80 |
| TDT durations | [0, 1, 2, 3, 4] |
| Context | Full attention [-1, -1], offline/batch mode |
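The TDT decoder's speedup comes from predicting a frame *duration* alongside each token, letting greedy decoding skip encoder frames instead of stepping through them one at a time. A minimal sketch of that loop, with a mocked joint network (the real model's joint, blank ID of 1024, and per-frame symbol cap are assumptions here, not taken from this card):

```python
DURATIONS = [0, 1, 2, 3, 4]  # matches the table above
BLANK = 1024                 # assumed blank ID (vocab_size - 1)


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


def tdt_greedy_decode(joint, num_frames, max_symbols_per_frame=5):
    """Greedy TDT decoding sketch.

    joint(frame_idx, last_token) -> (token_logits, duration_logits);
    mocked in tests, a neural joint network in the real model.
    """
    tokens, t, last, syms_here = [], 0, BLANK, 0
    while t < num_frames:
        tok_logits, dur_logits = joint(t, last)
        tok = argmax(tok_logits)
        dur = DURATIONS[argmax(dur_logits)]
        if tok != BLANK:
            tokens.append(tok)
            last = tok
        if dur == 0:
            syms_here += 1
            if tok == BLANK or syms_here >= max_symbols_per_frame:
                dur = 1  # force progress so duration 0 cannot stall the loop
        if dur > 0:
            t += dur  # skip ahead by the predicted duration
            syms_here = 0
    return tokens
```

Because blank emissions can carry durations up to 4, long silences cost a fraction of the joint evaluations an RNN-T would spend on them, which is where the 2–5× speedup comes from.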
## Files
- `model.safetensors` – all weights in float16
- `model_config.json` – architecture hyperparameters
- `vocab.json` – token ID → text mapping
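The SafeTensors format is simple enough to inspect without any dependencies: the file starts with an 8-byte little-endian header length, followed by a JSON header mapping each tensor name to its dtype, shape, and byte offsets. A stdlib-only sketch for listing what a downloaded `model.safetensors` contains:

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file.

    Keys are tensor names (plus an optional "__metadata__" entry); each
    value holds "dtype", "shape", and "data_offsets" into the data section.
    """
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))  # header length, little-endian u64
        return json.loads(f.read(n))
```

For example, iterating `read_safetensors_header("model.safetensors").items()` (skipping `"__metadata__"`) would list every weight tensor and confirm the `F16` dtype without loading the 218 MB data section.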
## Usage with audio-ml
```js
const base = 'https://huggingface.co/AbijahKaj/fastconformer-tdt-large-web/resolve/main';

const config = await fetch(`${base}/model_config.json`).then(r => r.text());
const vocab = await fetch(`${base}/vocab.json`).then(r => r.text());
const weights = await fetch(`${base}/model.safetensors`).then(r => r.arrayBuffer());

// `recognizer` is an audio-ml recognizer instance; see that library's docs for construction.
await recognizer.loadFromBuffers(weights, config, vocab);
```
## Export Process
Converted from the original NeMo checkpoint using:
```bash
python tools/export_nemo_to_safetensors.py \
  --model nvidia/stt_en_fastconformer_tdt_large \
  --output-dir exported/fastconformer-tdt-large
```
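The core of such a conversion is casting each weight to float16 and serializing in the SafeTensors layout (8-byte header length, JSON header, then raw tensor bytes, with the header padded to 8-byte alignment). A stdlib-only sketch of that write path, assuming tensors arrive as flat float lists (the real script would pull them from a NeMo/PyTorch state dict):

```python
import json
import struct


def save_fp16_safetensors(tensors, path):
    """Write tensors as a float16 .safetensors file.

    tensors: dict mapping name -> (shape, flat list of float values).
    """
    header, blobs, offset = {}, [], 0
    for name, (shape, values) in tensors.items():
        data = struct.pack(f"<{len(values)}e", *values)  # '<e' = little-endian float16
        header[name] = {
            "dtype": "F16",
            "shape": shape,
            "data_offsets": [offset, offset + len(data)],
        }
        blobs.append(data)
        offset += len(data)
    hdr = json.dumps(header).encode()
    hdr += b" " * (-len(hdr) % 8)  # pad so the data section is 8-byte aligned
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hdr)))  # 8-byte little-endian header length
        f.write(hdr)
        for blob in blobs:
            f.write(blob)
```

Casting fp32 weights to fp16 is what halves the download to ~218 MB; values outside fp16 range would saturate, which is why such exports are typically validated against the original model's outputs.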
## Attribution
This is a format conversion (NeMo → SafeTensors fp16) of NVIDIA's original model. No fine-tuning or weight modification was performed. All credit for the model architecture and training goes to NVIDIA. See the original model card for full details, benchmarks, and license terms.
**License:** CC-BY-4.0 (inherited from the original model)