minhchuxuan committed on
Commit 995e93c · verified · 1 Parent(s): 8b14c37

Update README.md

Files changed (1)
  1. README.md +7 -51
README.md CHANGED
@@ -1,62 +1,18 @@
  ---
  license: apache-2.0
  ---
-
- ---
-
- **Paper**: [https://arxiv.org/pdf/2310.06694.pdf](https://arxiv.org/pdf/2310.06694.pdf)
- **Code**: https://github.com/princeton-nlp/LLM-Shearing
- **Models**: [Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B), [Sheared-LLaMA-2.7B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B)
- **Pruned Models without Continued Pre-training**: [Sheared-LLaMA-1.3B-Pruned](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B-Pruned), [Sheared-LLaMA-2.7B-Pruned](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B-Pruned)
- **Instruction-tuned Models**: [Sheared-LLaMA-1.3B-ShareGPT](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B-ShareGPT), [Sheared-LLaMA-2.7B-ShareGPT](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT)
+

  **License**: Must comply with license of Llama2 since it's a model derived from Llama2.

  ---

- Sheared-LLaMA-2.7B is a model pruned and further pre-trained from [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf). We dynamically load data from different domains in the [RedPajama dataset](https://github.com/togethercomputer/RedPajama-Data). We use 0.4B tokens for pruning and 50B tokens for continued pre-training the pruned model. This model can be loaded into huggingface via
-
- ```
- model = AutoModelForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-2.7B")
- ```
-
- - Smaller-scale
- - Same vocabulary as LLaMA1 and LLaMA2
- - Derived with a budget of 50B tokens by utilizing existing strong LLMs
-
- ## Downstream Tasks
-
- We evaluate on an extensive set of downstream tasks including reasoning, reading comprehension, language modeling and knowledge intensive tasks. Our Sheared-LLaMA models outperform existing large language models.
-
- | Model | # Pre-training Tokens | Average Performance |
- | ------------------- | --------------------- | ------------------- |
- | LLaMA2-7B | 2T | 64.6 |
-
- **1.3B**
-
- | Model | # Pre-training Tokens | Average Performance |
- | ------------------- | --------------------- | ------------------- |
- | OPT-1.3B | 300B | 48.2 |
- | Pythia-1.4B | 300B | 48.9 |
- | Sheared-LLaMA-1.3B | 50B | 51.0 |
-
- **3B**
+ Pruned-LLaMA-2.7B is a model pruned and further pre-trained from [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf).

- | Model | # Pre-training Tokens | Average Performance |
- | ------------------- | --------------------- | ------------------- |
- | OPT-2.7B | 300B | 51.4 |
- | Pythia-2.8B | 300B | 52.5 |
- | INCITE-Base-3B | 800B | 54.7 |
- | Open-LLaMA-3B-v1 | 1T | 55.1 |
- | Open-LLaMA-3B-v2 | 1T | 55.7 |
- | **Sheared-LLaMA-2.7B** | **50B** | **56.7** |
+ ## Usage
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer

- ## Bibtex
+ model = AutoModelForCausalLM.from_pretrained("minhchuxuan/pruned-2.7b")
+ tokenizer = AutoTokenizer.from_pretrained("minhchuxuan/pruned-2.7b")
  ```
- @article{xia2023sheared,
- title={Sheared llama: Accelerating language model pre-training via structured pruning},
- author={Xia, Mengzhou and Gao, Tianyu and Zeng, Zhiyuan and Chen, Danqi},
- journal={arXiv preprint arXiv:2310.06694},
- year={2023}
- }
- ```
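
The new Usage section ends after loading the model and tokenizer. A minimal generation sketch that continues from that snippet could look like the following; the prompt text and decoding settings are illustrative assumptions, not part of the README.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the pruned checkpoint and its tokenizer, as in the README's Usage section.
model = AutoModelForCausalLM.from_pretrained("minhchuxuan/pruned-2.7b")
tokenizer = AutoTokenizer.from_pretrained("minhchuxuan/pruned-2.7b")

# Illustrative prompt and settings (assumptions, not taken from the README):
# encode a prompt and greedily generate a short continuation.
inputs = tokenizer("Structured pruning makes large language models", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```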