---
license: mit
language:
- en
---

# mmBERT Checkpoints

[License: MIT](https://opensource.org/licenses/MIT)
[Paper](https://arxiv.org/abs/2509.06888)
[Models](https://huggingface.co/collections/jhu-clsp/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4)
[Code](https://github.com/jhu-clsp/mmBERT)

This repository contains the raw training checkpoints for the mmBERT models. Each model has three subfolders, one per training phase: `pretrain` (pre-training), `ext` (context extension / mid-training), and `decay` (decay phase).
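
If you only need one phase for one model, the relevant files can be fetched without cloning the whole repository. Below is a minimal sketch using `huggingface_hub`; the `repo_id` and glob pattern are placeholders, so adjust them to this repository's actual id and the model/phase you want:

```python
from huggingface_hub import snapshot_download

# Fetch only the decay-phase checkpoints for one model.
# NOTE: repo_id and the folder glob are placeholders -- point them at
# this repository's actual id and the model/phase you want.
local_dir = snapshot_download(
    repo_id="jhu-clsp/mmbert-checkpoints",   # hypothetical repo id
    repo_type="model",
    allow_patterns=["mmbert-base/decay/*"],  # hypothetical model/phase path
)
print("checkpoints downloaded to:", local_dir)
```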
These files are in Composer's checkpoint format and contain all state needed to resume pre-training (model weights plus optimizer, scheduler, and RNG state). Please see the [ModernBERT repository](https://github.com/AnswerDotAI/ModernBERT) for usage details.
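
As a rough sketch of what resumption looks like, assuming Composer's standard `load_path` mechanism: the checkpoint filename, model id, and toy dataloader below are placeholders, and the ModernBERT training code linked above is the authoritative entry point.

```python
from composer import Trainer
from composer.models import HuggingFaceModel
from torch.utils.data import DataLoader
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Placeholder model (id assumed); the real runs use the ModernBERT
# training code, and the architecture must match the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/mmBERT-base")
hf_model = AutoModelForMaskedLM.from_pretrained("jhu-clsp/mmBERT-base")
model = HuggingFaceModel(hf_model, tokenizer=tokenizer)

# Toy MLM-style batch so the sketch is self-contained.
enc = tokenizer(["hello world"] * 8, return_tensors="pt", padding=True)
enc["labels"] = enc["input_ids"].clone()
dataset = [{k: v[i] for k, v in enc.items()} for i in range(8)]

trainer = Trainer(
    model=model,
    train_dataloader=DataLoader(dataset, batch_size=4),
    max_duration="10ba",  # toy budget; set to your real target
    # load_path restores model/optimizer/scheduler/RNG state from a
    # downloaded Composer checkpoint (filename assumed).
    load_path="pretrain/latest-rank0.pt",
)
trainer.fit()
```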
## Related Resources

- **Models**: [mmBERT Model Suite](https://huggingface.co/collections/jhu-clsp/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4)
- **Phase 1**: [Pre-training Data](https://huggingface.co/datasets/jhu-clsp/mmbert-pretrain-p1-fineweb2-langs) (2.3T tokens)
- **Phase 2**: [Mid-training Data](https://huggingface.co/datasets/jhu-clsp/mmbert-midtraining) (600B tokens)
- **Phase 3**: [Decay Phase Data](https://huggingface.co/datasets/jhu-clsp/mmbert-decay) (100B tokens)
- **Paper**: [arXiv:2509.06888](https://arxiv.org/abs/2509.06888)
- **Code**: [GitHub Repository](https://github.com/jhu-clsp/mmBERT)

## Citation

```bibtex
@misc{marone2025mmbertmodernmultilingualencoder,
      title={mmBERT: A Modern Multilingual Encoder with Annealed Language Learning},
      author={Marc Marone and Orion Weller and William Fleshman and Eugene Yang and Dawn Lawrie and Benjamin Van Durme},
      year={2025},
      eprint={2509.06888},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.06888},
}
```