Add pipeline tag, library name and license

#1 opened by nielsr (HF Staff)
Files changed (1): README.md (+21 βˆ’10)
@@ -3,7 +3,11 @@ datasets:
 - HuggingFaceTB/smollm-corpus
 language:
 - en
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 # Outlier-Safe Pre-Training
 
 [![arXiv](https://img.shields.io/badge/arXiv-2506.19697-b31b1b?style=flat-square)](https://arxiv.org/abs/2506.19697)
@@ -25,7 +29,14 @@ A method that prevents outliers but significantly reduces efficiency is unlikely
 3. 🧩**Ensuring full compatibility with existing inference pipelines**<br/>
 We prioritize compatibility with widely adopted inference frameworks such as vLLM and SGLang. Rather than introducing architectural changes that break compatibility, OSP preserves computational invariance, allowing models to be directly integrated into existing pipelines without additional effort.
 
+<p align="center">
+  <img src="./images/figure2.png" alt="drawing" width="700"/>
+</p>
+
+## News
 
+- **2025-06-25**: Released **Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models** on [arXiv](https://www.arxiv.org/abs/2506.19697), with [GitHub](https://github.com/dmis-lab/Outlier-Safe-Pre-Training) and [models](https://huggingface.co/collections/dmis-lab/outlier-safe-pre-training-osp-685bda10aa1e8a19fcb58ea8).
+- **2025-05-16**: Our paper has been accepted to ACL 2025! πŸŽ‰
 
 ## Model Checkpoints
 
@@ -92,9 +103,9 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 <td>βœ—<br>βœ”</td>
 <!-- <td>41.0<br>41.0</td>
 <td>11.7<br>11.7</td> -->
-<!-- <td>38.4<br>37.5</td>
+<!-- <td> 38.4<br>37.5</td>
 <td>14.8<br>15.4</td>
-<td>38.3<br>37.5</td>
+<td> 38.3<br>37.5</td>
 <td>14.8<br>15.4</td>
 <td>26.3<br>33.3</td>
 <td>1e6<br>24.5</td> -->
@@ -110,9 +121,9 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 <td>βœ—<br>βœ”</td>
 <!-- <td>41.5<br>41.5</td>
 <td>11.4<br>11.4</td> -->
-<!-- <td>40.0<br>40.6</td>
+<!-- <td> 40.0<br>40.6</td>
 <td>13.8<br>12.9</td>
-<td>40.0<br>40.6</td>
+<td> 40.0<br>40.6</td>
 <td>13.8<br>12.9</td>
 <td>29.4<br>38.6</td>
 <td>934.3<br>15.7</td> -->
@@ -128,9 +139,9 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 <td>βœ—<br>βœ”</td>
 <!-- <td><strong>41.8</strong><br><strong>41.8</strong></td>
 <td><strong>11.2</strong><br><strong>11.2</strong></td> -->
-<!-- <td><strong>41.0</strong><br><strong>40.8</strong></td>
+<!-- <td> <strong>41.0</strong><br><strong>40.8</strong></td>
 <td>12.4<br>12.2</td>
-<td><strong>40.9</strong><br><strong>40.8</strong></td>
+<td> <strong>40.9</strong><br><strong>40.8</strong></td>
 <td>12.4<br>12.2</td>
 <td>36.6<br>38.6</td>
 <td>43.3<br>33.7</td> -->
@@ -146,9 +157,9 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 <td>βœ—<br>βœ”</td>
 <!-- <td>40.0<br>40.0</td>
 <td>12.3<br>12.3</td> -->
-<!-- <td>38.4<br>39.2</td>
+<!-- <td> 38.4<br>39.2</td>
 <td>14.8<br>13.9</td>
-<td>38.4<br>39.3</td>
+<td> 38.4<br>39.3</td>
 <td>14.8<br>13.9</td>
 <td>31.0<br>36.3</td>
 <td>99.7<br>22.1</td> -->
@@ -164,9 +175,9 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 <td>βœ—<br>βœ”</td>
 <!-- <td>41.4<br>41.4</td>
 <td><strong>11.2</strong><br><strong>11.2</strong></td> -->
-<!-- <td>40.6<br>40.5</td>
+<!-- <td> 40.6<br>40.5</td>
 <td><strong>12.2</strong><br><strong>12.1</strong></td>
-<td>40.6<br>40.5</td>
+<td> 40.6<br>40.5</td>
 <td><strong>12.2</strong><br><strong>12.1</strong></td>
 <td><strong>37.9</strong><br><strong>39.1</strong></td>
 <td><strong>19.4</strong><br><strong>13.4</strong></td> -->
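The three keys this PR adds (`license`, `library_name`, `pipeline_tag`) live in the README's YAML front matter, the `---`-delimited block the Hub parses to populate the model card. As a rough sketch of where those keys sit, here is a minimal stdlib-only reader for simple `key: value` pairs in that block; `read_front_matter` is an illustrative helper, not Hub code, and a real card may need a full YAML parser.

```python
# Minimal front-matter reader for the metadata this PR adds to README.md.
# Hand-rolled parsing (stdlib only); illustrative, not how the Hub parses cards.

README_HEADER = """---
datasets:
- HuggingFaceTB/smollm-corpus
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
"""

def read_front_matter(text: str) -> dict:
    """Collect flat `key: value` pairs between the --- delimiters."""
    inside = False
    meta = {}
    for line in text.splitlines():
        if line.strip() == "---":
            if inside:
                break      # closing delimiter: stop
            inside = True  # opening delimiter: start collecting
            continue
        if inside and ":" in line and not line.startswith("-"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

meta = read_front_matter(README_HEADER)
print(meta["license"])       # apache-2.0
print(meta["pipeline_tag"])  # text-generation
```

With these keys set, the Hub can show the license badge, route the model to the text-generation widget, and advertise `transformers` as the loading library.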