Update README.md
README.md (CHANGED)
@@ -14,7 +14,7 @@ base_model: meta-llama/Llama-2-70b-hf

This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf) on the first 25k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single H100 (80 GB PCIe) for roughly 17 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.

-
+## Benchmark metrics

| Metric | Value |
|-----------------------|-------|
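For readers who want to see what the QLoRA recipe mentioned above typically looks like in code, the sketch below pairs `bitsandbytes` 4-bit NF4 quantization with `peft` LoRA adapters. It is a minimal illustration under assumed hyperparameters; the rank, alpha, dropout, and target modules are not taken from this run's actual config.

```python
# Illustrative QLoRA setup (assumed hyperparameters, not the run's actual config)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model_id = "meta-llama/Llama-2-70b-hf"

# 4-bit NF4 quantization: the "Q" in QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections (rank/alpha are assumptions)
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```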
@@ -26,7 +26,7 @@ This instruction model was built via parameter-efficient QLoRA finetuning of [ll

We use state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

-
+## Helpful links

* Model license: Llama 2 Community License Agreement
* Basic usage: [notebook](assets/basic_inference_llama_2_dolphin.ipynb)
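The README does not include the harness command itself; the cell below is a rough sketch of a single-task run with the `lm_eval` CLI, written notebook-style to match the `!pip` usage later in the card. Flags and task names vary across harness releases (the leaderboard pins a specific version), and the model id is a placeholder.

```python
# Rough sketch of a harness run; flags and task names differ across
# lm-evaluation-harness versions, and the pinned leaderboard commit may use a
# different CLI. The model id is a placeholder.
!pip install -q lm-eval
!lm_eval --model hf --model_args pretrained=<your-model-id>,dtype=bfloat16 --tasks arc_challenge --num_fewshot 25 --batch_size 4
```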
@@ -40,7 +40,7 @@ We use state-of-the-art [Language Model Evaluation Harness](https://github.com/E

The above loss curve was generated from the run's private wandb.ai log.

-
+## Example prompts and responses

Example 1:
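The loss curve lives in a private run, but for reference, a logged metric can be pulled with the public `wandb` API as sketched below; the run path and metric key are placeholders, not this run's actual values.

```python
# Hypothetical export of a logged loss curve via the wandb public API;
# the run path and metric key are placeholders.
import wandb

api = wandb.Api()
run = api.run("<entity>/<project>/<run_id>")  # placeholder run path
history = run.history(keys=["train/loss"])    # placeholder metric key
print(history.head())
```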
@@ -136,7 +136,7 @@ The llama-2-70b models have been modified from a standard transformer in the fol
| sequence length | 4096 |
| grouped-query attention | ✔️ |

-##
+## Pre-training data

For more details on the pretraining process, see [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf).
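The grouped-query attention row above refers to sharing each key/value head across a group of query heads, which shrinks the KV cache relative to full multi-head attention. The snippet below is a minimal PyTorch illustration of that idea with toy dimensions; it is not Llama 2's actual attention code.

```python
# Minimal grouped-query attention (GQA) illustration: fewer KV heads than query
# heads, so each KV head is shared by a group of query heads. Toy dimensions only.
import torch
import torch.nn.functional as F

batch, seq_len, head_dim = 2, 16, 64
n_q_heads, n_kv_heads = 8, 2                # 4 query heads share each KV head
group_size = n_q_heads // n_kv_heads

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Repeat each KV head so it lines up with its group of query heads
k = k.repeat_interleave(group_size, dim=1)  # -> (batch, n_q_heads, seq_len, head_dim)
v = v.repeat_interleave(group_size, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```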
@@ -150,9 +150,9 @@ This model can produce factually incorrect output, and should not be relied on t
This model was trained on various public datasets.
While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

-##
+## Basic usage

-
+* [notebook](assets/basic_inference_llama_2_dolphin.ipynb)

```python
!pip install -q -U huggingface_hub peft transformers torch accelerate
@@ -221,8 +221,7 @@ with torch.autocast("cuda", dtype=torch.bfloat16):
print(tokenizer.decode(output["sequences"][0], skip_special_tokens=True))
```

-
-### Runtime tests
+## Runtime tests

| runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
|:-----------------------------:|:----------------------:|:---------------------:|:-------------:|:-----------------------:|
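Numbers like those in the runtime table can be reproduced roughly as sketched below: generate 50 tokens, time the call, and read the peak CUDA memory. The model id is a placeholder, and the loading options (bfloat16, device_map="auto") are assumptions rather than the exact test configuration.

```python
# Rough timing/VRAM sketch; placeholder model id, and the loading options are
# assumptions rather than the exact setup behind the table above.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<your-finetuned-model-id>"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("Tell me a fact about llamas.", return_tensors="pt").to(model.device)

torch.cuda.reset_peak_memory_stats()
start = time.time()
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)
torch.cuda.synchronize()
elapsed = time.time() - start

print(f"runtime / 50 tokens: {elapsed:.1f} s")
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")
```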
@@ -253,7 +252,7 @@ The license on this model does not constitute legal advice. We are not responsib

---

-
+## Framework versions

- PEFT 0.5.0.dev0