---
language: en
tags:
- text-generation
- knowledge-distillation
- llama
- causal-lm
- openwebtext
- wikitext
- transfer-learning
model_name: DistilLLaMA
license: apache-2.0
datasets:
- openwebtext
- wikitext
parameter_count: 80M
metrics:
- cosine-similarity
- exact-match
- rouge
library_name: transformers
base_model: meta-llama/LLaMA-2-7B
---

### Overview

This model is a distilled version of LLaMA 2, containing approximately 80 million parameters. It was trained on a mix of the OpenWebText and WikiText Raw V1 datasets. Knowledge distillation was used to transfer knowledge from a larger "teacher" model, Meta's 7B LLaMA 2, so that this smaller "student" model learns to mimic the teacher's behavior.
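
The card does not spell out the distillation objective, so as a point of reference, here is a minimal sketch of the standard soft-target formulation (temperature-scaled KL divergence between teacher and student logits, blended with ordinary cross-entropy). The function name and hyperparameter values are illustrative and are not the project's actual training code:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Illustrative soft-target distillation loss (assumed, not the repo's code).

    Assumes logits and labels are already aligned for next-token prediction.
    """
    # Soft-target term: KL divergence between temperature-softened teacher
    # and student distributions, scaled by T^2 as in Hinton et al. (2015).
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard-target term: standard cross-entropy against the ground-truth tokens.
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
    )

    # alpha balances mimicking the teacher against fitting the data.
    return alpha * soft_loss + (1 - alpha) * hard_loss
```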
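
### Usage

Since the metadata lists `library_name: transformers`, loading presumably follows the standard causal-LM pattern shown below. The repository id is a placeholder, as the actual id is not given in this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/DistilLLaMA"  # placeholder: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from a prompt.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```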