M2M-100 418M — GGUF (ggml)

GGUF / ggml conversion of facebook/m2m100_418M for use with CrispStrobe/CrispASR.

M2M-100 is a multilingual encoder-decoder transformer that translates directly between any pair of its 100 languages, with no English pivot required. The 418M variant has 12 encoder and 12 decoder layers (d=1024, 16 heads, FFN=4096, ReLU). It is distributed under the MIT license.

Files

| File | Size | Notes |
|---|---|---|
| m2m100-418m-f16.gguf | 935 MB | F16 weights (reference quality) |
| m2m100-418m-q8_0.gguf | 502 MB | Q8_0 quantized (identical quality to F16 on test set) |
| m2m100-418m-q4_k.gguf | 272 MB | Q4_K quantized (minor word choice differences) |
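
The listed sizes roughly follow from how many bits each format stores per weight. The sketch below is a back-of-the-envelope estimate, not a description of how the files were produced. It assumes ggml's usual block layouts (Q8_0: 34 bytes per 32 weights, Q4_K: 144 bytes per 256 weights) and, as a simplification, that every tensor is quantized. The names `F16_SIZE_MB`, `BITS_PER_WEIGHT`, and `estimate_size_mb` are illustrative.

```python
# Rough size estimate per quantization level, scaled from the F16 file.
# Assumption: every tensor is quantized. Real files usually keep some tensors
# and all metadata at higher precision, so they come out somewhat larger.
F16_SIZE_MB = 935  # listed size of m2m100-418m-f16.gguf

BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 34 * 8 / 32,    # 32 int8 values plus one f16 scale per block
    "q4_k": 144 * 8 / 256,  # 256 4-bit values plus scales/mins per super-block
}


def estimate_size_mb(quant: str) -> float:
    """Scale the F16 file size by the ratio of bits per weight."""
    return F16_SIZE_MB * BITS_PER_WEIGHT[quant] / BITS_PER_WEIGHT["f16"]


for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{estimate_size_mb(quant):.0f} MB")
```

The estimates for Q8_0 and Q4_K (about 497 MB and 263 MB) land slightly below the listed 502 MB and 272 MB, which is consistent with part of each file staying at higher precision.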

Quick start

# 1. Build CrispASR
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
cmake --build build -j

# 2. Pull model
huggingface-cli download cstr/m2m100-418m-GGUF m2m100-418m-q8_0.gguf --local-dir .

# 3. Translate (standalone)
./build/bin/crispasr --backend m2m100 -m m2m100-418m-q8_0.gguf \
    --translate "Hello world, how are you today?" \
    --source-lang en --target-lang de

# 4. ASR + translate pipeline
./build/bin/crispasr -m ggml-base.en.bin -f samples/jfk.wav \
    --translate-model m2m100-418m-q8_0.gguf --target-lang de
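
For scripted use, the standalone command from step 3 can be assembled from Python. This is a minimal sketch that only reuses the flags shown above. The helper name `build_translate_cmd` is hypothetical, and how CrispASR formats its output is not covered here, so inspect the output before parsing it.

```python
import shlex
from typing import List


def build_translate_cmd(text: str, source_lang: str, target_lang: str,
                        model: str = "m2m100-418m-q8_0.gguf",
                        binary: str = "./build/bin/crispasr") -> List[str]:
    """Assemble the standalone translate command (step 3) as an argv list."""
    return [
        binary, "--backend", "m2m100", "-m", model,
        "--translate", text,
        "--source-lang", source_lang,
        "--target-lang", target_lang,
    ]


cmd = build_translate_cmd("Hello world, how are you today?", "en", "de")
print(shlex.join(cmd))  # shell-quoted form, handy for logging

# To actually run it (requires the built binary and the model file):
#   import subprocess
#   result = subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the input text contains quotes or other special characters.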

Quality verification

All three quantization levels produce usable translations. Q8_0 tracks F16 closely, and Q4_K differs in minor word choices:

| Input (English) | F16 | Q8_0 | Q4_K |
|---|---|---|---|
| Hello world, how are you today? | Hallo Welt, wie bist du heute? | Hallo Welt, wie bist du heute? | Hallo Welt, wie bist du heute? |
| The president said he would not attend the meeting. | Le président a dit qu'il ne sera pas présent à la réunion. | ...qu'il ne participerait pas... | ...qu'il ne voulait pas assister... |
| Machine learning is changing the world. | El aprendizaje de máquina está cambiando el mundo. | (identical) | La aprendizaje... (minor) |

Supported languages (100)

af, am, ar, ast, az, ba, be, bg, bn, br, bs, ca, ceb, cs, cy, da, de, el, en, es, et, fa, ff, fi, fr, fy, ga, gd, gl, gu, ha, he, hi, hr, ht, hu, hy, id, ig, ilo, is, it, ja, jv, ka, kk, km, kn, ko, lb, lg, ln, lo, lt, lv, mg, mk, ml, mn, mr, ms, my, ne, nl, no, ns, oc, or, pa, pl, ps, pt, ro, ru, sd, si, sk, sl, so, sq, sr, ss, su, sv, sw, ta, th, tl, tn, tr, uk, ur, uz, vi, wo, xh, yi, yo, zh, zu
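
A caller may want to validate language codes before invoking the model. The sketch below checks a pair against the list above and builds the `__xx__` language token form used in the Architecture section. The helper names `lang_token` and `check_pair` are illustrative and not part of CrispASR.

```python
# Language codes copied from the list above.
SUPPORTED = set("""
af am ar ast az ba be bg bn br bs ca ceb cs cy da de el en es et fa ff fi fr
fy ga gd gl gu ha he hi hr ht hu hy id ig ilo is it ja jv ka kk km kn ko lb
lg ln lo lt lv mg mk ml mn mr ms my ne nl no ns oc or pa pl ps pt ro ru sd
si sk sl so sq sr ss su sv sw ta th tl tn tr uk ur uz vi wo xh yi yo zh zu
""".split())


def lang_token(code: str) -> str:
    """Return the language token for a supported code, e.g. 'en' -> '__en__'."""
    if code not in SUPPORTED:
        raise ValueError(f"unsupported language code: {code!r}")
    return f"__{code}__"


def check_pair(source: str, target: str) -> tuple:
    """Validate a translation direction and return both language tokens."""
    if source == target:
        raise ValueError("source and target languages are the same")
    return lang_token(source), lang_token(target)


print(check_pair("en", "de"))
print(len(SUPPORTED) * (len(SUPPORTED) - 1), "translation directions")
```

With 100 languages and no pivot, this gives 100 × 99 = 9,900 possible directions.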

Architecture

Text → SentencePiece BPE tokenizer (128K vocab, 100 lang codes)
     → Source lang token (__en__) + text tokens + </s>
     → 12-layer transformer encoder (d=1024, 16 heads, FFN=4096, ReLU, pre-norm)
     → Sinusoidal positional embeddings (pre-computed)
     → 12-layer transformer decoder (self-attn + cross-attn + FFN)
     → Shared embedding LM head (tied weights)
     → Target lang forced as first decoder token
     → Greedy decode → translated text
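
The token layout and decoding loop in the diagram can be sketched in a few lines. This is an illustration under stated assumptions, not the CrispASR implementation: `step` is a hypothetical stand-in for one decoder forward pass (including cross-attention to the encoder output), all token ids are made up, and a real implementation may also prepend a decoder start token.

```python
from typing import Callable, List, Sequence


def build_encoder_input(src_lang_id: int, text_ids: Sequence[int],
                        eos_id: int) -> List[int]:
    """Layout from the diagram: [source lang token] + text tokens + [</s>]."""
    return [src_lang_id, *text_ids, eos_id]


def greedy_decode(step: Callable[[List[int]], Sequence[float]],
                  tgt_lang_id: int, eos_id: int,
                  max_new_tokens: int = 64) -> List[int]:
    """Greedy decoding with the target language forced as the first token.

    `step` receives the decoder tokens so far and returns one score per
    vocabulary entry for the next position.
    """
    tokens = [tgt_lang_id]  # forced, never predicted
    for _ in range(max_new_tokens):
        logits = step(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if next_id == eos_id:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the language token from the output


# Toy decoder over a 6-entry vocabulary: emits token 3, then 1, then </s> (id 2).
def toy_step(tokens: List[int]) -> List[float]:
    script = [3, 1, 2]
    wanted = script[len(tokens) - 1]
    return [1.0 if i == wanted else 0.0 for i in range(6)]


encoder_ids = build_encoder_input(src_lang_id=4, text_ids=[3, 1], eos_id=2)
output_ids = greedy_decode(toy_step, tgt_lang_id=5, eos_id=2)
print(encoder_ids, output_ids)
```

Forcing the target language token is what selects the output language: the same encoder input decodes to a different language when only `tgt_lang_id` changes.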

Shared embedding: encoder, decoder, and LM head all use the same 128112×1024 table.
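
To make the tying concrete, the sketch below computes the size of the shared table and shows, on a toy 4×3 table with made-up values, how one matrix serves both as the input lookup and as the output projection. The function names `embed` and `lm_head` are illustrative.

```python
VOCAB, D_MODEL = 128112, 1024  # shape of the shared table

params = VOCAB * D_MODEL
print(f"{params:,} parameters in the shared table")
print(f"~{params * 2 / 1e6:.0f} MB at F16 (2 bytes per parameter)")

# Toy 4-token, 3-dimensional table standing in for the real one.
E = [
    [0.1, 0.0, 0.2],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]


def embed(token_id: int) -> list:
    """Input side: look up a row of E."""
    return E[token_id]


def lm_head(hidden: list) -> list:
    """Output side: project onto every row of the same E (hidden @ E^T)."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in E]


logits = lm_head(embed(1))
print(logits)
```

Because the table is stored once, the roughly 131M embedding parameters are not duplicated across the encoder, the decoder, and the LM head.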

Conversion

python models/convert-m2m100-to-gguf.py \
    --input facebook/m2m100_418M \
    --output m2m100-418m-f16.gguf

The converter also supports facebook/m2m100_1.2B and facebook/wmt21-dense-24-wide-en-x (same architecture, different scale).
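
After conversion, a quick sanity check is to read the fixed-size header of the output file. The sketch below assumes the GGUF v2/v3 header layout (4-byte magic `GGUF`, a uint32 version, then uint64 tensor and metadata counts, all little-endian). The function name `read_gguf_header` is illustrative and not part of the conversion script.

```python
import struct


def read_gguf_header(path: str) -> dict:
    """Read the fixed-size GGUF header (assumes the v2/v3 layout)."""
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Usage, once the converted file exists:
#   print(read_gguf_header("m2m100-418m-f16.gguf"))
```

A missing magic or a tensor count of zero indicates that the conversion did not complete.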

