---
license: apache-2.0
language:
- en
tags:
- Research
---

# Model Card for OLMo-2-1B-Exp

This model is a research variant of [OLMo-2-0425-1B](https://huggingface.co/allenai/OLMo-2-0425-1B).

It was pretrained from scratch on 210B tokens with additional experimental [modifications to the training data](https://huggingface.co/datasets/sbordt/OLMo-2-1B-Exp-Dataset).

The baseline model, trained on the same data without the experimental modifications, is available [here](https://huggingface.co/sbordt/OLMo-2-1B-Decayed-Early).

The model is described in the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub
olmo = AutoModelForCausalLM.from_pretrained("sbordt/OLMo-2-1B-Exp")
tokenizer = AutoTokenizer.from_pretrained("sbordt/OLMo-2-1B-Exp")
```
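
Continuing from the snippet above, text can then be generated with the standard `generate` API; the prompt and sampling settings below are only illustrative, not values used in the paper.

```python
# Encode an example prompt (illustrative only)
inputs = tokenizer("Language modeling is ", return_tensors="pt")

# Sample a short continuation; adjust max_new_tokens and sampling settings as needed
outputs = olmo.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```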

### Citation Information

```
@article{bordt2025trainonce,
  title   = {Train Once, Answer All: Many Pretraining Experiments for the Cost of One},
  author  = {Bordt, Sebastian and Pawelczyk, Martin},
  journal = {arXiv preprint arXiv:2509.23383},
  year    = {2025},
}
```