Create README.md
README.md
ADDED
@@ -0,0 +1,61 @@
---
base_model: meta-llama/Meta-Llama-3-8B
inference: true
model_type: llama
pipeline_tag: text-generation
tags:
- sparse
---

# Meta-Llama-3-8B-pruned_50.2of4

This repo contains model files for a 2:4 (N:M) sparse [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) model. The model was pruned in one shot with [SparseGPT](https://arxiv.org/abs/2301.00774) and then retrained with [SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation while keeping the 2:4 sparsity mask fixed.
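
As a rough illustration of what the 2:4 pattern means, the sketch below checks that every block of four consecutive weights contains at most two non-zero values. The helper names `follows_2_of_4` and `sparsity` and the example row are hypothetical, made up for this illustration; they are not part of this repo or of any library, and the grouping along a single row is an assumption.

```python
def follows_2_of_4(weights):
    """Check that every block of 4 consecutive weights holds at most 2 non-zeros."""
    if len(weights) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    for start in range(0, len(weights), 4):
        block = weights[start:start + 4]
        nonzeros = sum(1 for w in block if w != 0)
        if nonzeros > 2:
            return False
    return True


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0) / len(weights)


# Hypothetical weight row, made up for illustration (not taken from the model).
row = [0.0, 0.7, 0.0, -1.2, 0.3, 0.0, 0.0, 0.9]
print(follows_2_of_4(row))  # True
print(sparsity(row))        # 0.5
```

A row that satisfies the pattern in every block is at least 50% sparse, which is where the `pruned_50` part of the model name comes from.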

## Running the model

```python
# pip install transformers accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "nm-testing/Meta-Llama-3-8B-pruned_50.2of4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" lets accelerate place the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

input_text = "A poem about Machine Learning goes as follows:"
# Move the inputs to wherever the model was placed instead of hard-coding "cuda"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Cap the generation length explicitly; 64 new tokens is an arbitrary choice
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Evaluation Benchmark Results

Model evaluation results were obtained with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) following the configuration of the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).

| Benchmark | Meta-Llama-3-8B | Meta-Llama-3-8B-pruned_50.2of4<br>(this model) |
|:---------:|:---------------:|:----------------------------------------------:|
| [ARC-c](https://arxiv.org/abs/1803.05457)<br>25-shot | 59.47% | 57.76% |
| [MMLU](https://arxiv.org/abs/2009.03300)<br>5-shot | 65.29% | 60.44% |
| [HellaSwag](https://arxiv.org/abs/1905.07830)<br>10-shot | 82.14% | 79.97% |
| [WinoGrande](https://arxiv.org/abs/1907.10641)<br>5-shot | 77.27% | 77.19% |
| [GSM8K](https://arxiv.org/abs/2110.14168)<br>5-shot | 44.81% | 47.92% |
| [TruthfulQA](https://arxiv.org/abs/2109.07958)<br>0-shot | 43.96% | 41.02% |
| **Average<br>Accuracy** | **62.16%** | **60.72%** |
| **Recovery** | **100%** | **97.68%** |
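
The summary rows of the table above can be reproduced from the per-benchmark scores. The sketch below assumes that the average is an unweighted mean over the six benchmarks and that recovery is the sparse average divided by the dense average; the model card does not spell out either definition, but both assumptions match the reported numbers. The variable and function names are illustrative only.

```python
# Scores copied from the Open LLM Leaderboard table above (percent).
dense = {
    "ARC-c": 59.47, "MMLU": 65.29, "HellaSwag": 82.14,
    "WinoGrande": 77.27, "GSM8K": 44.81, "TruthfulQA": 43.96,
}
sparse = {
    "ARC-c": 57.76, "MMLU": 60.44, "HellaSwag": 79.97,
    "WinoGrande": 77.19, "GSM8K": 47.92, "TruthfulQA": 41.02,
}


def average(scores):
    """Unweighted mean over all benchmarks (assumed definition)."""
    return sum(scores.values()) / len(scores)


dense_avg = average(dense)
sparse_avg = average(sparse)
# Recovery: share of the dense model's average accuracy kept by the sparse model
# (assumed definition).
recovery = 100 * sparse_avg / dense_avg

print(f"dense average:  {dense_avg:.2f}%")   # 62.16%
print(f"sparse average: {sparse_avg:.2f}%")  # 60.72%
print(f"recovery:       {recovery:.2f}%")    # 97.68%
```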

Model evaluation results were obtained with the [Mosaic Eval Gauntlet](https://github.com/mosaicml/llm-foundry/blob/main/scripts/eval/local_data/EVAL_GAUNTLET.md) following the configuration of [Eval Gauntlet v0.3](https://github.com/mosaicml/llm-foundry/blob/main/scripts/eval/yamls/eval_gauntlet_v0.3.yaml).

| Benchmark | Meta-Llama-3-8B | Meta-Llama-3-8B-pruned_50.2of4<br>(this model) |
|:------------------------:|:----------------:|:----------------------------------------------:|
| World Knowledge | 58.08% | 54.61% |
| Commonsense Reasoning | 47.66% | 47.62% |
| Language Understanding | 71.13% | 67.58% |
| Symbolic Problem Solving | 38.44% | 32.15% |
| Reading Comprehension | 57.48% | 55.76% |
| **Average Accuracy** | **54.70%** | **51.54%** |
| **Recovery** | **100%** | **94.22%** |

## Help

For further support and for discussions on these models and AI in general, join [Neural Magic's Slack Community](https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ).