M2M-100 418M — GGUF (ggml)

GGUF / ggml conversion of facebook/m2m100_418M for use with CrispStrobe/CrispASR.

M2M-100 is a multilingual encoder-decoder transformer for machine translation between 100 languages, in any direction and without pivoting through English. The 418M variant has 12 encoder and 12 decoder layers (d=1024, 16 heads, FFN=4096, ReLU). It is distributed under the MIT license.

Files

File	Size	Notes
`m2m100-418m-f16.gguf`	935 MB	F16 weights (reference quality)
`m2m100-418m-q8_0.gguf`	502 MB	Q8_0 quantized (output matches F16 on two of the three test sentences below)
`m2m100-418m-q4_k.gguf`	272 MB	Q4_K quantized (minor word choice differences)
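
To give a feel for why Q8_0 stays so close to F16, here is a minimal sketch of Q8_0-style block quantization as commonly described for ggml: weights are grouped in blocks of 32, each block stores one scale plus 32 signed 8-bit values. The function names are illustrative and not part of CrispASR; the scale is kept as a Python float here, whereas ggml stores it as F16, and Python's `round` treats exact ties differently from C's `roundf`.

```python
from typing import List, Tuple

QK8_0 = 32  # block size assumed for Q8_0


def quantize_q8_0_block(x: List[float]) -> Tuple[float, List[int]]:
    """Quantize one block of 32 floats to (scale, int8 values)."""
    assert len(x) == QK8_0
    amax = max(abs(v) for v in x)
    d = amax / 127.0  # one scale per block
    if d == 0.0:
        return 0.0, [0] * QK8_0
    q = [int(round(v / d)) for v in x]  # each value lands in [-127, 127]
    return d, q


def dequantize_q8_0_block(d: float, q: List[int]) -> List[float]:
    """Reconstruct approximate floats from a quantized block."""
    return [d * v for v in q]


# Deterministic toy block with values in [-1, 1).
block = [((i * 37) % 64 - 32) / 32.0 for i in range(QK8_0)]
scale, quants = quantize_q8_0_block(block)
restored = dequantize_q8_0_block(scale, quants)

# The rounding error per weight is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(block, restored))

# Storage cost: 2 bytes of scale + 32 bytes of values per 32 weights.
bits_per_weight = (2 + QK8_0) * 8 / QK8_0
```

Under these assumptions each weight costs 8.5 bits instead of 16. The actual file sizes also depend on which tensors the converter leaves unquantized, so they will not follow this ratio exactly.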

Quick start

# 1. Build CrispASR
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
cmake --build build -j

# 2. Pull model
huggingface-cli download cstr/m2m100-418m-GGUF m2m100-418m-q8_0.gguf --local-dir .

# 3. Translate (standalone)
./build/bin/crispasr --backend m2m100 -m m2m100-418m-q8_0.gguf \
    --translate "Hello world, how are you today?" \
    --source-lang en --target-lang de

# 4. ASR + translate pipeline
./build/bin/crispasr -m ggml-base.en.bin -f samples/jfk.wav \
    --translate-model m2m100-418m-q8_0.gguf --target-lang de
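
After step 2 it can be worth checking that the download is a GGUF file rather than, say, an HTML error page. The sketch below reads the fixed-size GGUF header (magic, version, tensor count, metadata key/value count), assuming the little-endian layout of GGUF version 2 and later. `read_gguf_header` is a hypothetical helper, not a CrispASR API.

```python
import io
import struct
from typing import BinaryIO, Tuple


def read_gguf_header(f: BinaryIO) -> Tuple[int, int, int]:
    """Return (version, tensor_count, metadata_kv_count) from a GGUF stream."""
    magic = f.read(4)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    raw = f.read(20)  # uint32 version + uint64 tensor count + uint64 kv count
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")
    version, n_tensors, n_kv = struct.unpack("<IQQ", raw)
    return version, n_tensors, n_kv


# Self-contained demo on an in-memory header; with a real file, use
# open("m2m100-418m-q8_0.gguf", "rb") instead of BytesIO.
demo = io.BytesIO(b"GGUF" + struct.pack("<IQQ", 3, 2, 5))
header = read_gguf_header(demo)
```

The tensor and metadata counts in a real file depend on the converter, so this check only confirms the container format, not the model contents.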

Quality verification

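All three quantization levels produce usable translations on three test sentences (English to German, French, and Spanish, in row order); Q4_K shows small wording differences: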

Input (English)	F16	Q8_0	Q4_K
Hello world, how are you today?	Hallo Welt, wie bist du heute?	Hallo Welt, wie bist du heute?	Hallo Welt, wie bist du heute?
The president said he would not attend the meeting.	Le président a dit qu'il ne sera pas présent à la réunion.	...qu'il ne participerait pas...	...qu'il ne voulait pas assister...
Machine learning is changing the world.	El aprendizaje de máquina está cambiando el mundo.	(identical)	La aprendizaje... (minor)

Supported languages (100)

af, am, ar, ast, az, ba, be, bg, bn, br, bs, ca, ceb, cs, cy, da, de, el, en, es, et, fa, ff, fi, fr, fy, ga, gd, gl, gu, ha, he, hi, hr, ht, hu, hy, id, ig, ilo, is, it, ja, jv, ka, kk, km, kn, ko, lb, lg, ln, lo, lt, lv, mg, mk, ml, mn, mr, ms, my, ne, nl, no, ns, oc, or, pa, pl, ps, pt, ro, ru, sd, si, sk, sl, so, sq, sr, ss, su, sv, sw, ta, th, tl, tn, tr, uk, ur, uz, vi, wo, xh, yi, yo, zh, zu
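
A small sketch for validating `--source-lang` / `--target-lang` values before invoking the CLI. The code list is copied from above and the `__xx__` token format follows the architecture notes; `lang_token` is a hypothetical helper, not something CrispASR exposes.

```python
# Language codes copied from the list above.
SUPPORTED = frozenset(
    """
    af am ar ast az ba be bg bn br bs ca ceb cs cy da de el en es
    et fa ff fi fr fy ga gd gl gu ha he hi hr ht hu hy id ig ilo
    is it ja jv ka kk km kn ko lb lg ln lo lt lv mg mk ml mn mr
    ms my ne nl no ns oc or pa pl ps pt ro ru sd si sk sl so sq
    sr ss su sv sw ta th tl tn tr uk ur uz vi wo xh yi yo zh zu
    """.split()
)


def lang_token(code: str) -> str:
    """Map a language code to its tokenizer language token, e.g. 'en' -> '__en__'."""
    if code not in SUPPORTED:
        raise ValueError(f"unsupported language code: {code!r}")
    return f"__{code}__"


# Every ordered (source, target) pair with source != target is a direction.
n_directions = len(SUPPORTED) * (len(SUPPORTED) - 1)
```

With 100 codes this gives 9,900 translation directions. Quality varies across pairs, so being listed here does not imply equal quality for every direction.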

Architecture

Text → SentencePiece BPE tokenizer (128K vocab, 100 lang codes)
     → Source lang token (__en__) + text tokens + </s>
     → 12-layer transformer encoder (d=1024, 16 heads, FFN=4096, ReLU, pre-norm)
     → Sinusoidal positional embeddings (pre-computed)
     → 12-layer transformer decoder (self-attn + cross-attn + FFN)
     → Shared embedding LM head (tied weights)
     → Target lang forced as first decoder token
     → Greedy decode → translated text
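
The last two steps of the pipeline above can be sketched as a generic greedy loop. This is an illustration only: `step` stands in for a full decoder pass (including cross-attention over the encoder output), the token ids are made up, and a real implementation may also prepend a start token before the language token.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    step: Callable[[Sequence[int]], Sequence[float]],
    target_lang_id: int,
    eos_id: int,
    max_new_tokens: int = 64,
) -> List[int]:
    """Greedy decoding with the target language token forced first."""
    tokens = [target_lang_id]  # forced, not predicted
    for _ in range(max_new_tokens):
        logits = step(tokens)  # scores for the next token
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_step(prefix: Sequence[int]) -> Sequence[float]:
    """Toy stand-in for the decoder: emits ids 3, 4, then EOS (id 0)."""
    script = {1: 3, 2: 4, 3: 0}  # prefix length -> token to favor
    logits = [0.0] * 6
    logits[script.get(len(prefix), 0)] = 1.0
    return logits


result = greedy_decode(toy_step, target_lang_id=5, eos_id=0)
```

Forcing the first token is what selects the output language: the same encoder output can be decoded into any supported target by changing `target_lang_id`.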

Shared embedding: encoder, decoder, and LM head all use the same 128112×1024 table.
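
As a rough cross-check of the numbers above, the parameter count can be estimated from the stated dimensions. This sketch assumes biases on all linear layers, LayerNorm with weight and bias, one final LayerNorm per stack, and no learned positional parameters (the sinusoidal table is computed, not trained). These assumptions match the usual reference implementation but are not taken from the GGUF file itself.

```python
VOCAB = 128112
D_MODEL = 1024
FFN = 4096
ENC_LAYERS = 12
DEC_LAYERS = 12


def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(d: int) -> int:
    """Scale and shift vectors."""
    return 2 * d


def attention(d: int) -> int:
    """Q, K, V and output projections."""
    return 4 * linear(d, d)


def feed_forward(d: int, ffn: int) -> int:
    """Two dense layers around the ReLU."""
    return linear(d, ffn) + linear(ffn, d)


# Counted once: encoder, decoder, and LM head share this table.
embedding = VOCAB * D_MODEL

encoder_layer = attention(D_MODEL) + feed_forward(D_MODEL, FFN) + 2 * layer_norm(D_MODEL)
# Decoder layers add cross-attention and a third LayerNorm.
decoder_layer = 2 * attention(D_MODEL) + feed_forward(D_MODEL, FFN) + 3 * layer_norm(D_MODEL)

total = (
    embedding
    + ENC_LAYERS * encoder_layer
    + DEC_LAYERS * decoder_layer
    + 2 * layer_norm(D_MODEL)  # final norm of each stack
)
f16_bytes = 2 * total  # two bytes per weight at F16
```

Under these assumptions the total comes to about 484 million parameters, of which the shared embedding table alone is about 131 million. That is why tying the weights matters for file size.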

Conversion

python models/convert-m2m100-to-gguf.py \
    --input facebook/m2m100_418M \
    --output m2m100-418m-f16.gguf

The script also supports facebook/m2m100_1.2B and facebook/wmt21-dense-24-wide-en-x (same architecture, different scale).

Related models

facebook/m2m100_418M — original PyTorch model
facebook/m2m100_1.2B — larger 1.2B variant
facebook/wmt21-dense-24-wide-en-x — WMT21 competition winner (4.7B)
