M2M-100 418M — GGUF (ggml)
GGUF / ggml conversion of facebook/m2m100_418M for use with CrispStrobe/CrispASR.
M2M-100 is a multilingual encoder-decoder transformer for machine translation among 100 languages, in any direction and with no English pivot required. The 418M variant has 12 encoder and 12 decoder layers (d=1024, 16 heads, FFN=4096, ReLU). It is distributed under the MIT license.
Files
| File | Size | Notes |
|---|---|---|
| m2m100-418m-f16.gguf | 935 MB | F16 weights (reference quality) |
| m2m100-418m-q8_0.gguf | 502 MB | Q8_0 quantized (identical quality to F16 on test set) |
| m2m100-418m-q4_k.gguf | 272 MB | Q4_K quantized (minor word choice differences) |
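As a rough sanity check on these sizes, the ratio of each quantized file to the F16 file can be compared with the nominal storage cost of the ggml formats. The sketch below assumes the usual ggml block layouts (Q8_0: 34 bytes per 32 weights, i.e. 8.5 bits per weight; Q4_K: 144 bytes per 256 weights, i.e. 4.5 bits per weight). Which tensors the converter leaves unquantized is not stated here, so the measured ratios can be expected to sit slightly above the nominal ones.

```python
# File sizes in MB, taken from the Files table.
SIZES_MB = {"f16": 935, "q8_0": 502, "q4_k": 272}

# Nominal bits per weight. The Q8_0 and Q4_K figures ASSUME the standard
# ggml block layouts (34 bytes / 32 weights and 144 bytes / 256 weights).
NOMINAL_BPW = {"f16": 16.0, "q8_0": 34 * 8 / 32, "q4_k": 144 * 8 / 256}


def size_ratio(quant: str) -> float:
    """Measured file size relative to the F16 file."""
    return SIZES_MB[quant] / SIZES_MB["f16"]


def nominal_ratio(quant: str) -> float:
    """Expected ratio if every weight were stored in the quantized format."""
    return NOMINAL_BPW[quant] / NOMINAL_BPW["f16"]


for quant in ("q8_0", "q4_k"):
    print(f"{quant}: measured {size_ratio(quant):.3f}, "
          f"nominal {nominal_ratio(quant):.3f}")
```

A measured ratio far above the nominal one would suggest that large tensors (for example the embedding table) were kept at higher precision.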
Quick start
# 1. Build CrispASR
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
cmake --build build -j
# 2. Pull model
huggingface-cli download cstr/m2m100-418m-GGUF m2m100-418m-q8_0.gguf --local-dir .
# 3. Translate (standalone)
./build/bin/crispasr --backend m2m100 -m m2m100-418m-q8_0.gguf \
--translate "Hello world, how are you today?" \
--source-lang en --target-lang de
# 4. ASR + translate pipeline
./build/bin/crispasr -m ggml-base.en.bin -f samples/jfk.wav \
--translate-model m2m100-418m-q8_0.gguf --target-lang de
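For scripted use, the standalone translate call from step 3 can be wrapped in Python. This is a minimal sketch that reuses only the flags shown above. The helper names `build_translate_cmd` and `translate` are hypothetical, and the sketch assumes the binary prints the translation to stdout, which is not confirmed here.

```python
import subprocess
from typing import List


def build_translate_cmd(text: str, source_lang: str, target_lang: str,
                        model: str = "m2m100-418m-q8_0.gguf",
                        binary: str = "./build/bin/crispasr") -> List[str]:
    """Build the argv list for the standalone translate call (step 3)."""
    return [
        binary, "--backend", "m2m100", "-m", model,
        "--translate", text,
        "--source-lang", source_lang,
        "--target-lang", target_lang,
    ]


def translate(text: str, source_lang: str, target_lang: str) -> str:
    """Run the CLI and return its stdout (ASSUMED to hold the translation)."""
    cmd = build_translate_cmd(text, source_lang, target_lang)
    # Passing a list (no shell) avoids quoting problems in the input text.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

With the model downloaded and the binary built, `translate("Hello world, how are you today?", "en", "de")` would run the same command as step 3.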
Quality verification
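All three quantization levels produce usable translations on three spot-check sentences (into German, French, and Spanish, respectively); the quantized outputs differ from F16 only in wording on some inputs: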
| Input (English) | F16 | Q8_0 | Q4_K |
|---|---|---|---|
| Hello world, how are you today? | Hallo Welt, wie bist du heute? | Hallo Welt, wie bist du heute? | Hallo Welt, wie bist du heute? |
| The president said he would not attend the meeting. | Le président a dit qu'il ne sera pas présent à la réunion. | ...qu'il ne participerait pas... | ...qu'il ne voulait pas assister... |
| Machine learning is changing the world. | El aprendizaje de máquina está cambiando el mundo. | (identical) | La aprendizaje... (minor) |
Supported languages (100)
af, am, ar, ast, az, ba, be, bg, bn, br, bs, ca, ceb, cs, cy, da, de, el, en, es, et, fa, ff, fi, fr, fy, ga, gd, gl, gu, ha, he, hi, hr, ht, hu, hy, id, ig, ilo, is, it, ja, jv, ka, kk, km, kn, ko, lb, lg, ln, lo, lt, lv, mg, mk, ml, mn, mr, ms, my, ne, nl, no, ns, oc, or, pa, pl, ps, pt, ro, ru, sd, si, sk, sl, so, sq, sr, ss, su, sv, sw, ta, th, tl, tn, tr, uk, ur, uz, vi, wo, xh, yi, yo, zh, zu
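Because any language can act as source or target, the list above yields 100 × 99 = 9,900 translation directions. A small sketch for validating a requested pair before invoking the model follows; `check_pair` is a hypothetical helper, not part of CrispASR, and rejecting identical source and target codes is a design choice of this sketch.

```python
# Language codes copied from the supported-languages list.
SUPPORTED = frozenset("""
af am ar ast az ba be bg bn br bs ca ceb cs cy da de el en es et fa ff fi fr
fy ga gd gl gu ha he hi hr ht hu hy id ig ilo is it ja jv ka kk km kn ko lb
lg ln lo lt lv mg mk ml mn mr ms my ne nl no ns oc or pa pl ps pt ro ru sd si
sk sl so sq sr ss su sv sw ta th tl tn tr uk ur uz vi wo xh yi yo zh zu
""".split())


def check_pair(source_lang: str, target_lang: str) -> None:
    """Raise ValueError if a code is unsupported or both codes are equal."""
    for code in (source_lang, target_lang):
        if code not in SUPPORTED:
            raise ValueError(f"unsupported language code: {code!r}")
    if source_lang == target_lang:
        raise ValueError("source and target language are identical")


# Every ordered pair of distinct languages is one translation direction.
NUM_DIRECTIONS = len(SUPPORTED) * (len(SUPPORTED) - 1)
```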
Architecture
Text → SentencePiece BPE tokenizer (128K vocab, 100 lang codes)
→ Source lang token (__en__) + text tokens + </s>
→ 12-layer transformer encoder (d=1024, 16 heads, FFN=4096, ReLU, pre-norm)
→ Sinusoidal positional embeddings (pre-computed)
→ 12-layer transformer decoder (self-attn + cross-attn + FFN)
→ Shared embedding LM head (tied weights)
→ Target lang forced as first decoder token
→ Greedy decode → translated text
Shared embedding: encoder, decoder, and LM head all use the same 128112×1024 table.
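The token layout and decoding loop in the diagram can be sketched as follows. This is a toy illustration, not the CrispASR implementation: tokens are plain strings rather than vocabulary ids, and `next_token_scores` is a hypothetical callable standing in for one decoder forward pass over the encoder output.

```python
from typing import Callable, Dict, List

EOS = "</s>"


def lang_token(code: str) -> str:
    """Language codes are wrapped as __xx__, e.g. __en__."""
    return f"__{code}__"


def encoder_input(text_tokens: List[str], source_lang: str) -> List[str]:
    """Source language token, then the text tokens, then </s>."""
    return [lang_token(source_lang)] + text_tokens + [EOS]


def greedy_decode(next_token_scores: Callable[[List[str]], Dict[str, float]],
                  target_lang: str, max_len: int = 64) -> List[str]:
    """Greedy decoding with the target language forced as the first token."""
    output = [lang_token(target_lang)]  # forced, not predicted
    while len(output) < max_len:
        scores = next_token_scores(output)
        best = max(scores, key=scores.get)  # argmax over candidate tokens
        if best == EOS:
            break
        output.append(best)
    return output[1:]  # drop the language token from the result
```

Forcing the target language token is what selects the output language: the encoder input is the same regardless of the target, and only the decoder prefix changes.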
Conversion
python models/convert-m2m100-to-gguf.py \
--input facebook/m2m100_418M \
--output m2m100-418m-f16.gguf
The script also supports facebook/m2m100_1.2B and facebook/wmt21-dense-24-wide-en-x (same architecture, different scale).
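After conversion, a quick sanity check is to read the fixed-size GGUF header. The sketch below assumes the GGUF v2/v3 layout (little-endian: 4-byte magic `GGUF`, uint32 version, uint64 tensor count, uint64 metadata key/value count). `read_gguf_header` is a hypothetical helper, and it does not parse the metadata or tensor data.

```python
import struct
from typing import Dict

GGUF_MAGIC = b"GGUF"
# ASSUMED header layout (GGUF v2/v3): magic, version, n_tensors, n_kv.
HEADER = struct.Struct("<4sIQQ")


def read_gguf_header(path: str) -> Dict[str, int]:
    """Return version and counts from a GGUF file, or raise ValueError."""
    with open(path, "rb") as f:
        raw = f.read(HEADER.size)
    if len(raw) < HEADER.size:
        raise ValueError("file too short to be GGUF")
    magic, version, n_tensors, n_kv = HEADER.unpack(raw)
    if magic != GGUF_MAGIC:
        raise ValueError(f"bad magic: {magic!r}")
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}
```

For example, `read_gguf_header("m2m100-418m-f16.gguf")` should succeed on a freshly converted file and raise `ValueError` on a truncated or non-GGUF file.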
Related models
- facebook/m2m100_418M: original PyTorch model
- facebook/m2m100_1.2B: larger 1.2B variant
- facebook/wmt21-dense-24-wide-en-x: WMT21 competition winner (4.7B)