Upload README.md with huggingface_hub

# pruned_olmo3_5120_16_29

> **WARNING: This model is PRUNED ONLY, NOT retrained or distilled!**
>
> Performance will be degraded compared to the original model. This is a structural pruning checkpoint intended as a starting point for knowledge distillation or fine-tuning.

Structurally pruned version of [allenai/OLMo-3-7B-Instruct](https://huggingface.co/allenai/OLMo-3-7B-Instruct).

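The structural pruning described above can be sketched on a toy MLP: whole intermediate channels are removed, so both weight matrices shrink along the same axis and the layer still composes correctly. The magnitude-based channel selection below is a stand-in assumption for illustration, not necessarily the criterion used for this checkpoint.

```python
import numpy as np

# Toy sketch of structurally pruning an MLP's intermediate dimension
# (here 16 -> 6; for this model, 11008 -> 5120). Dropping a channel means
# deleting a row of the up-projection and the matching column of the
# down-projection. The L2-norm ranking is an illustrative assumption.
rng = np.random.default_rng(0)
hidden, inter, kept = 8, 16, 6

w_up = rng.normal(size=(inter, hidden))    # maps hidden -> intermediate
w_down = rng.normal(size=(hidden, inter))  # maps intermediate -> hidden

scores = np.linalg.norm(w_up, axis=1)       # score each intermediate channel
keep = np.sort(np.argsort(scores)[-kept:])  # indices of channels to keep
w_up_pruned, w_down_pruned = w_up[keep, :], w_down[:, keep]

print(w_up_pruned.shape, w_down_pruned.shape)  # (6, 8) (8, 6)
```

The same row/column slicing idea extends to attention (removing whole heads) and to depth (removing whole layers), which is how all three axes in the table below can change while the hidden size stays fixed.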
## Pruning Configuration

| Parameter | Original | Pruned |
|-----------|----------|--------|
| Intermediate size (MLP) | 11008 | 5120 |
| Attention heads | 32 | 16 |
| Layers | 32 | 29 |
| Hidden size | 4096 | 4096 (unchanged) |
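
For a rough sense of scale, the table's numbers can be plugged into a back-of-the-envelope count of decoder-block parameters. The formulas below assume full multi-head attention with a fixed head dimension of 128 (4096 / 32) and a three-matrix gated MLP; these are assumptions for illustration, not a statement of OLMo 3's exact architecture, and embeddings, norms, and the LM head are ignored.

```python
# Hypothetical per-block parameter count under the assumptions above.
def block_params(hidden, intermediate, heads, layers, head_dim=128):
    attn = 4 * hidden * heads * head_dim  # Q, K, V, O projections
    mlp = 3 * hidden * intermediate       # gate, up, down projections
    return layers * (attn + mlp)

orig = block_params(4096, 11008, 32, 32)   # 6,476,005,376 (~6.5B)
pruned = block_params(4096, 5120, 16, 29)  # 2,797,600,768 (~2.8B)
print(f"~{100 * (1 - pruned / orig):.0f}% fewer block parameters")  # ~57%
```

Under these assumptions the pruning removes a bit over half of the transformer-block parameters, which is consistent with the degraded performance warned about above.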

## Important Notes

1. **This model has NOT been retrained** after pruning
2. **Performance will be significantly degraded** compared to the original
3. **Intended use**: Starting checkpoint for distillation/fine-tuning
4. For the distillation training data, see [hbfreed/dolci-distill-packed](https://huggingface.co/datasets/hbfreed/dolci-distill-packed)