HenryHHHH committed
Commit 2e09c9c · verified · 1 Parent(s): 7fd8fd6

Update README.md

Files changed (1):
1. README.md +1 -21
README.md CHANGED
@@ -1,22 +1,3 @@
- ```yaml
- ---
- language: "en"
- license: "mit"
- tags:
- - llama
- - distillation
- - openwebtext
- - wikitext
- - text-generation
- - knowledge-distillation
- datasets:
- - openwebtext
- - wikitext-103-raw-v1
- model_name: "DistilLLaMA"
- base_model: "Meta's 7B LLaMA 2"
- inference: true
- ---
-
  ### Overview

  This model is a distilled version of LLaMA 2, containing approximately 80 million parameters. It was trained using a mix of OpenWebText and WikiText Raw V1 datasets. Knowledge distillation was employed to transfer knowledge from a larger "teacher" model—Meta’s 7B LLaMA 2—to help this smaller model mimic the behavior of the teacher.
@@ -122,5 +103,4 @@ The model’s performance is evaluated on 200 queries created in-house. For more
  url={https://arxiv.org/abs/2308.02019},
  }

- *Note: The repository will be updated as training progresses. Last update 2024-10-23*
- ```
+ *Note: The repository will be updated as training progresses. Last update 2024-10-23*
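
The Overview retained in this diff describes knowledge distillation from a 7B LLaMA 2 teacher into an ~80M-parameter student. For readers unfamiliar with the technique, below is a minimal sketch of a standard distillation objective: a temperature-softened KL term against the teacher's output distribution blended with ordinary hard-label cross-entropy. It is written in PyTorch as an assumption; the function name and the `temperature` and `alpha` defaults are illustrative, not details taken from this repository.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Sketch of a standard knowledge-distillation loss.

    Shapes: student_logits and teacher_logits are (N, vocab); labels is
    (N,). For a language model, flatten (batch, seq_len, vocab) logits
    to (batch * seq_len, vocab) before calling. The `temperature` and
    `alpha` defaults are illustrative assumptions, not values from this
    repository.
    """
    # Soften both distributions with the temperature; the KL term is
    # scaled by T^2 to keep its gradient magnitude comparable across
    # temperatures (the convention from Hinton et al., 2015).
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    # Ordinary cross-entropy against the ground-truth next tokens.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Usage sketch: with logits s, t of shape (B, L, V) and targets y of
# shape (B, L):
#   loss = distillation_loss(s.view(-1, V), t.view(-1, V), y.view(-1))
```

Pushing `alpha` toward 1.0 leans entirely on the teacher's soft targets, while pushing it toward 0.0 reduces the objective to ordinary language-model training on the hard labels.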