# FastConformer TDT Large – Web/SafeTensors Export
Browser-optimized SafeTensors export of nvidia/stt_en_fastconformer_tdt_large.
## Model Details
| Property | Value |
|---|---|
| Base model | nvidia/stt_en_fastconformer_tdt_large |
| Architecture | FastConformer-TDT (17 layers, d_model=512) |
| Parameters | ~115M |
| Decoder | Token-and-Duration Transducer (TDT), 2–5× faster inference than conventional RNN-T |
| Language | English |
| Weights format | SafeTensors, float16 (~218 MB) |
| Vocab size | 1025 tokens (SentencePiece BPE) |
| Mel bands | 80 |
| TDT durations | [0, 1, 2, 3, 4] |
| Context | Full attention [-1, -1], offline/batch mode |
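The TDT decoder's speedup comes from predicting a frame *duration* alongside each token, letting greedy decoding skip encoder frames instead of stepping through them one at a time. A minimal sketch of that loop, with a mocked joint network (the real model's joint, blank ID of 1024, and per-frame symbol cap are assumptions here, not taken from this card):

```python
DURATIONS = [0, 1, 2, 3, 4]  # matches the table above
BLANK = 1024                 # assumed blank ID (vocab_size - 1)


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


def tdt_greedy_decode(joint, num_frames, max_symbols_per_frame=5):
    """Greedy TDT decoding sketch.

    joint(frame_idx, last_token) -> (token_logits, duration_logits);
    mocked in tests, a neural joint network in the real model.
    """
    tokens, t, last, syms_here = [], 0, BLANK, 0
    while t < num_frames:
        tok_logits, dur_logits = joint(t, last)
        tok = argmax(tok_logits)
        dur = DURATIONS[argmax(dur_logits)]
        if tok != BLANK:
            tokens.append(tok)
            last = tok
        if dur == 0:
            syms_here += 1
            if tok == BLANK or syms_here >= max_symbols_per_frame:
                dur = 1  # force progress so duration 0 cannot stall the loop
        if dur > 0:
            t += dur  # skip ahead by the predicted duration
            syms_here = 0
    return tokens
```

Because blank emissions can carry durations up to 4, long silences cost a fraction of the joint evaluations an RNN-T would spend on them, which is where the 2–5× speedup comes from.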
## Files
- `model.safetensors` – all weights in float16
- `model_config.json` – architecture hyperparameters
- `vocab.json` – token ID → text mapping
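The SafeTensors format is simple enough to inspect without any dependencies: the file starts with an 8-byte little-endian header length, followed by a JSON header mapping each tensor name to its dtype, shape, and byte offsets. A stdlib-only sketch for listing what a downloaded `model.safetensors` contains:

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file.

    Keys are tensor names (plus an optional "__metadata__" entry); each
    value holds "dtype", "shape", and "data_offsets" into the data section.
    """
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))  # header length, little-endian u64
        return json.loads(f.read(n))
```

For example, iterating `read_safetensors_header("model.safetensors").items()` (skipping `"__metadata__"`) would list every weight tensor and confirm the `F16` dtype without loading the 218 MB data section.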
## Usage with audio-ml
```js
const base = 'https://huggingface.co/AbijahKaj/fastconformer-tdt-large-web/resolve/main';

const config = await fetch(`${base}/model_config.json`).then(r => r.text());
const vocab = await fetch(`${base}/vocab.json`).then(r => r.text());
const weights = await fetch(`${base}/model.safetensors`).then(r => r.arrayBuffer());

// `recognizer` is an audio-ml recognizer instance; see that library's docs for construction.
await recognizer.loadFromBuffers(weights, config, vocab);
```
## Export Process
Converted from the original NeMo checkpoint using:
```bash
python tools/export_nemo_to_safetensors.py \
  --model nvidia/stt_en_fastconformer_tdt_large \
  --output-dir exported/fastconformer-tdt-large
```
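The core of such a conversion is casting each weight to float16 and serializing in the SafeTensors layout (8-byte header length, JSON header, then raw tensor bytes, with the header padded to 8-byte alignment). A stdlib-only sketch of that write path, assuming tensors arrive as flat float lists (the real script would pull them from a NeMo/PyTorch state dict):

```python
import json
import struct


def save_fp16_safetensors(tensors, path):
    """Write tensors as a float16 .safetensors file.

    tensors: dict mapping name -> (shape, flat list of float values).
    """
    header, blobs, offset = {}, [], 0
    for name, (shape, values) in tensors.items():
        data = struct.pack(f"<{len(values)}e", *values)  # '<e' = little-endian float16
        header[name] = {
            "dtype": "F16",
            "shape": shape,
            "data_offsets": [offset, offset + len(data)],
        }
        blobs.append(data)
        offset += len(data)
    hdr = json.dumps(header).encode()
    hdr += b" " * (-len(hdr) % 8)  # pad so the data section is 8-byte aligned
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hdr)))  # 8-byte little-endian header length
        f.write(hdr)
        for blob in blobs:
            f.write(blob)
```

Casting fp32 weights to fp16 is what halves the download to ~218 MB; values outside fp16 range would saturate, which is why such exports are typically validated against the original model's outputs.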
## Attribution
This is a format conversion (NeMo → SafeTensors fp16) of NVIDIA's original model. No fine-tuning or weight modification was performed. All credit for the model architecture and training goes to NVIDIA. See the original model card for full details, benchmarks, and license terms.
**License:** CC-BY-4.0 (inherited from the original model)