Add pipeline_tag, library_name, and license

#1
by nielsr (HF Staff) · opened
Files changed (1)
  1. README.md +5 -3
README.md CHANGED

```diff
@@ -3,7 +3,11 @@ datasets:
 - HuggingFaceTB/smollm-corpus
 language:
 - en
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 # Outlier-Safe Pre-Training
 
 [![arXiv](https://img.shields.io/badge/arXiv-2506.19697-b31b1b?style=flat-square)](https://arxiv.org/abs/2506.19697)
@@ -25,7 +29,7 @@ A method that prevents outliers but significantly reduces efficiency is unlikely
 3. 🧩**Ensuring full compatibility with existing inference pipelines**<br/>
 We prioritize compatibility with widely adopted inference frameworks such as vLLM and SGLang. Rather than introducing architectural changes that break compatibility, OSP preserves computational invariance, allowing models to be directly integrated into existing pipelines without additional effort.
 
-
+This repository contains the model of the paper [Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models](https://huggingface.co/papers/2506.19697).
 
 ## Model Checkpoints
 
@@ -36,7 +40,6 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 - [🤗 OSP-1.4B-1T-Adam](https://huggingface.co/dmis-lab/OSP-1.4B-1T-Adam): Trained on the standard Adam optimizer, without any modifications.
 - [🤗 OSP-1.4B-1T-Muon-SSNorm-EmbProj](https://huggingface.co/dmis-lab/OSP-1.4B-1T-Muon-SSNorm-EmbProj): Trained on the OSP framework. This is our final model.
 
-
 ### Ablation Models
 
 <table>
@@ -177,7 +180,6 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 </table>
 &dagger;Model configuration that disables decoupled embedding optimization by training with Muon optimizer without Adam optimization on embedding layers
 
-
 ## Training
 
 ### Model
```
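The metadata this PR adds can be sanity-checked by parsing the resulting README front matter. A minimal sketch, assuming the flat key/list subset shown in the diff; `parse_front_matter` is a hand-rolled illustration, not a Hub tool (the Hub uses a full YAML parser):

```python
# Front matter of README.md as it would read after this PR is merged.
front_matter = """\
datasets:
- HuggingFaceTB/smollm-corpus
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
"""

def parse_front_matter(text):
    """Parse the flat `key: value` / `key:` + `- item` subset used above."""
    meta, current_key = {}, None
    for line in text.splitlines():
        if line.startswith("- ") and current_key is not None:
            # Continuation of a list-valued key (e.g. datasets, language).
            meta[current_key].append(line[2:].strip())
        else:
            key, _, value = line.partition(":")
            current_key = key.strip()
            # Empty value means a list follows; otherwise store the scalar.
            meta[current_key] = value.strip() or []
    return meta

meta = parse_front_matter(front_matter)
# The three keys this PR introduces should now be present.
assert meta["license"] == "apache-2.0"
assert meta["library_name"] == "transformers"
assert meta["pipeline_tag"] == "text-generation"
```

With `pipeline_tag: text-generation` and `library_name: transformers` set, the Hub can surface the correct widget and code snippet for the checkpoint.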