---
license: apache-2.0
language:
  - en
tags:
  - Research
---

# Model Card for OLMo-2-1B-Exp

This model is a research variant of OLMo-2-0425-1B. It was pretrained from scratch on 210B tokens, with additional experimental modifications applied to the training data. The baseline model, trained on the same data without the experimental modifications, is available here.

The model is described in the paper ["Train Once, Answer All: Many Pretraining Experiments for the Cost of One"](https://arxiv.org/abs/2509.23383).

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub.
olmo = AutoModelForCausalLM.from_pretrained("sbordt/OLMo-2-1B-Exp")
tokenizer = AutoTokenizer.from_pretrained("sbordt/OLMo-2-1B-Exp")
```
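
Once loaded, the model can be used like any other causal language model in `transformers`. A minimal generation sketch follows; the prompt and sampling parameters are illustrative choices, not settings from the model card:

```python
# Tokenize an example prompt (illustrative, not from the model card).
inputs = tokenizer("Language modeling is ", return_tensors="pt")

# Sample a short continuation from the model.
outputs = olmo.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)

# Decode the generated token ids back into text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```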

## Citation Information

```bibtex
@article{bordt2025trainonce,
  title   = {Train Once, Answer All: Many Pretraining Experiments for the Cost of One},
  author  = {Bordt, Sebastian and Pawelczyk, Martin},
  journal = {arXiv preprint arXiv:2509.23383},
  year    = {2025}
}
```