Link model to paper and add Arxiv metadata
#1
by nielsr HF Staff - opened
README.md CHANGED

````diff
@@ -1,19 +1,22 @@
 ---
 language:
 - en
+library_name: transformers
 license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - language-model
 - sample-efficient
 - pretraining
 - transformer
-
-pipeline_tag: text-generation
+arxiv: 2602.02522
 ---
 
 # IMU-1 Base
 
-
+This repository contains the IMU-1 Base model, a sample-efficient 430M parameter language model introduced in the paper [IMU-1: Sample-Efficient Pre-training of Small Language Models](https://huggingface.co/papers/2602.02522).
+
+IMU-1 is trained on 72B tokens and approaches the benchmark performance of models trained on 56× more data.
 
 ## Model Details
 
@@ -95,14 +98,14 @@ print(tokenizer.decode(outputs[0]))
 ## Citation
 
 ```bibtex
-@
-title={
+@article{panchuk2025imu1,
+title={IMU-1: Sample-Efficient Pre-training of Small Language Models},
 author={George Panchuk},
-
-
+journal={arXiv preprint arXiv:2602.02522},
+year={2025}
 }
 ```
 
 ## License
 
-Apache 2.0
+Apache 2.0
````
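The `library_name`, `pipeline_tag`, and `arxiv` keys this PR adds live in the README's YAML front matter, the block between the first two `---` lines, which the Hub parses to populate the model page. A minimal stdlib-only sketch of checking that the keys are present (the README content is inlined here for illustration; a real check would parse the block with a YAML library such as PyYAML):

```python
# Inlined copy of the updated front matter plus the start of the README body.
readme = """---
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- language-model
- sample-efficient
- pretraining
- transformer
arxiv: 2602.02522
---

# IMU-1 Base
"""

def front_matter(text):
    # The metadata block sits between the first two "---" lines,
    # so it is the second element after splitting on them.
    parts = text.split("---\n")
    return parts[1]

meta = front_matter(readme)
for key in ("library_name:", "pipeline_tag:", "arxiv:"):
    assert any(line.startswith(key) for line in meta.splitlines()), key
print("metadata keys present")
```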