---
version: main
family: smollm2-1.7b
model_name: locuslab/safelm-1.7b_rephrase_refusal_moral_ed_600B
license: mit
tags:
- model
- transformer
- smollm2
- safety p
datasets:
- locuslab/refuseweb
- locuslab/safeweb
- locuslab/moral_education
- HuggingFaceTB/smollm-corpus
---
# SafeLM-1.7B

SafeLM is a 1.7B-parameter model family trained via [Safety Pretraining](https://www.arxiv.org/abs/2504.16980): we train language models to be natively safe by incorporating safety directly into the pretraining pipeline. This is our natively safe base model. Our safety data curation scores harmful content, rephrases and contextualizes potentially harmful examples, and applies refusal training throughout pretraining. Please check out our [paper](https://www.arxiv.org/abs/2504.16980) and [website](https://locuslab.github.io/safety-pretraining/) for more details!

## Model Details
- **Architecture:** SmolLM2
- **Parameters:** 1.7B

## Training Configuration
```yaml
optimizer:
  class_path: torch.optim.AdamW
  init_args:
    lr: 0.0005
    weight_decay: 0.01
precision: bf16-mixed
seed: 42
train:
  global_batch_size: 1024
  max_seq_length: 2048
  max_tokens: 600000000000
  micro_batch_size: 8
```
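The batch settings above imply the per-step accounting sketched below. This is a minimal illustration assuming a hypothetical single-device run; with data parallelism, the accumulation steps would additionally be divided by the device count.

```python
# Derived from the training configuration above (single-device assumption).
global_batch_size = 1024
micro_batch_size = 8
max_seq_length = 2048
max_tokens = 600_000_000_000

# Micro-batches accumulated per optimizer step.
accumulation_steps = global_batch_size // micro_batch_size
print(accumulation_steps)  # 128

# Tokens consumed per optimizer step at full sequence length.
tokens_per_step = global_batch_size * max_seq_length
print(tokens_per_step)  # 2097152

# Approximate total optimizer steps over the 600B-token budget.
print(max_tokens // tokens_per_step)  # 286102
```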
## Quickstart

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("locuslab/safelm-1.7b_rephrase_refusal_moral_ed_600B")
tokenizer = AutoTokenizer.from_pretrained("locuslab/safelm-1.7b_rephrase_refusal_moral_ed_600B")

# SafeLM is a base model, so prompt it with plain text completion.
inputs = tokenizer("The key to building safe AI systems is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Citation

If you find our work helpful, please cite it as:

```
@article{maini2025safety,
  title={Safety Pretraining: Toward the Next Generation of Safe {AI}},
  author={Maini, Pratyush and Goyal, Sachin and Sam, Dylan and Robey, Alex and Savani, Yash and Jiang, Yiding and Zou, Andy and Lipton, Zachary C and Kolter, J Zico},
  journal={arXiv preprint arXiv:2504.16980},
  year={2025}
}
```