Add pipeline tag, library name and link to GitHub repo
This PR adds the `pipeline_tag` and `library_name` to the model card, ensuring the model appears correctly in search results.
It also adds a link to the GitHub repo.
README.md
CHANGED

````diff
@@ -1,10 +1,13 @@
 ---
-license: llama3.1
 datasets:
 - BAAI/Infinity-Instruct
 language:
 - en
+license: llama3.1
+pipeline_tag: text-generation
+library_name: transformers
 ---
+
 # Infinity Instruct
 
 <p align="center">
@@ -39,7 +42,7 @@ Infinity-Instruct-7M-Gen-Llama3.1-70B is an opensource supervised instruction tu
 <img src="fig/trainingflow.png">
 </p>
 
-Infinity-Instruct-7M-Gen-
+Infinity-Instruct-7M-Gen-Llama3_1-70B is tuned on the million-level instruction dataset [Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct). First, we apply the foundational dataset Infinity-Instruct-7M to improve the foundational abilities (math & code) of Llama3.1-70B, yielding the foundational instruct model Infinity-Instruct-7M-Llama3-70B. Then we finetune Infinity-Instruct-7M-Llama3-70B to get the stronger chat model Infinity-Instruct-7M-Gen-Llama3_1-70B. Here are the training hyperparameters.
 
 ```bash
 epoch: 3
@@ -54,7 +57,7 @@ global_batch_size: 528
 clip_grad: 1.0
 ```
 
-Thanks to [FlagScale](https://github.com/FlagOpen/FlagScale), we could concatenate multiple training samples to remove padding tokens and apply diverse acceleration techniques to the training procedure. It effectively reduces our training costs. We will release our code in the near future!
+Thanks to [FlagScale](https://github.com/FlagOpen/FlagScale), we could concatenate multiple training samples to remove padding tokens and apply diverse acceleration techniques to the training procedure. It effectively reduces our training costs. The code is available at: https://github.com/FlagOpen/FlagScale
 
 ## **Benchmark**
 
@@ -76,7 +79,7 @@ Thanks to [FlagScale](https://github.com/FlagOpen/FlagScale), we could concatena
 
 ## **How to use**
 
-Infinity-Instruct-7M-Gen-Llama3_1-70B adopts the same chat template as [Llama3-70B-instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)
+Infinity-Instruct-7M-Gen-Llama3_1-70B adopts the same chat template as [Llama3-70B-instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct):\
 
 ```bash
 <|begin_of_text|><|start_header_id|>user<|end_header_id|>
````