mt5-base-afr-32768
This model is a 61.64% smaller version of google/mt5-base optimized for Afrikaans language via vocabulary size reduction using the trimming method.
This trimmed model should perform similarly to the original model with only 16,384 tokens and a much smaller memory footprint. However, it may not perform well for other languages as tokens not commonly used in the selected languages were removed from the vocabulary.
Model Statistics
| Metric |
Original |
Trimmed |
Reduction |
| Vocabulary size |
250,112 tokens |
16,384 tokens |
93.45% |
| Model size |
300,176,768 params |
223,395,072 params |
61.64% |

Mining Dataset Statistics
Usage
from transformers import AutoModel, AutoTokenizer
model_name = "alphaedge-ai/mt5-base-afr-16384"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
Citations
mT5
@misc{xue2021mt5massivelymultilingualpretrained,
title={mT5: A massively multilingual pre-trained text-to-text transformer},
author={Linting Xue and Noah Constant and Adam Roberts and Mihir Kale and Rami Al-Rfou and Aditya Siddhant and Aditya Barua and Colin Raffel},
year={2021},
eprint={2010.11934},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2010.11934},
}
Trimming blog post
@misc{hf_blogpost_trimming,
title={Introduction to Trimming},
author={Loïck BOURDOIS and Tom AARSEN and Bram VANROY and Christopher AKIKI and Woojun JUNG and Manuel ROMERO and Prithiv SAKTHI},
year={2026},
url={https://huggingface.co/blog/lbourdois/introduction-to-trimming},
}