Link model to paper and add Arxiv metadata

#1 opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +11 -8
README.md CHANGED
@@ -1,19 +1,22 @@
 ---
 language:
 - en
+library_name: transformers
 license: apache-2.0
-library_name: transformers
-pipeline_tag: text-generation
+pipeline_tag: text-generation
 tags:
 - language-model
 - sample-efficient
 - pretraining
 - transformer
+arxiv: 2602.02522
 ---
 
 # IMU-1 Base
 
-A sample-efficient 430M parameter language model trained on 72B tokens that approaches the benchmark performance of models trained on 56× more data.
+This repository contains the IMU-1 Base model, a sample-efficient 430M parameter language model introduced in the paper [IMU-1: Sample-Efficient Pre-training of Small Language Models](https://huggingface.co/papers/2602.02522).
+
+IMU-1 is trained on 72B tokens and approaches the benchmark performance of models trained on 56× more data.
 
 ## Model Details
 
@@ -95,14 +98,14 @@ print(tokenizer.decode(outputs[0]))
 ## Citation
 
 ```bibtex
-@misc{imu1_2025,
-  title={Sample Efficient Language Model Pre-training},
+@article{panchuk2025imu1,
+  title={IMU-1: Sample-Efficient Pre-training of Small Language Models},
   author={George Panchuk},
-  year={2025},
-  url={https://huggingface.co/thepowerfuldeez/imu1_base}
+  journal={arXiv preprint arXiv:2602.02522},
+  year={2025}
 }
 ```
 
 ## License
 
-Apache 2.0
+Apache 2.0
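The YAML front matter this PR touches is what the Hub and downstream tooling read to wire the repo to the `transformers` library and the `text-generation` pipeline. As a quick sanity check, the flat keys the diff adds can be pulled out with nothing but the standard library. The snippet below is a minimal sketch: it embeds the post-PR front matter from the diff above, and the naive `key: value` parser is purely illustrative, not how the Hub actually parses model-card metadata.

```python
# Post-PR README front matter, copied from the diff above.
readme = """\
---
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- language-model
- sample-efficient
- pretraining
- transformer
arxiv: 2602.02522
---

# IMU-1 Base
"""

# The front matter sits between the first two '---' markers.
_, front_matter, _ = readme.split("---\n", 2)

# Naive flat key/value parse -- enough for the scalar keys this PR adds;
# list items (lines starting with '-') are skipped.
metadata = {}
for line in front_matter.splitlines():
    if ":" in line and not line.startswith("-"):
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()

print(metadata["library_name"])   # transformers
print(metadata["pipeline_tag"])   # text-generation
print(metadata["arxiv"])          # 2602.02522
```

With `library_name` and `pipeline_tag` in place, the Hub can render a working "Use this model" widget and code snippet for the repo, which is the practical point of the metadata half of this PR.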