Text Generation · Transformers · Safetensors · English · mistral · text-generation-inference
Commit bd16fbb (verified) by instruction-pretrain · 1 parent: 28552d7

Update README.md
Files changed (1):
  1. README.md +7 -4
README.md CHANGED
@@ -18,15 +18,17 @@ We explore supervised multitask pre-training by proposing ***Instruction Pre-Tra
 
 
 **************************** **Updates** ****************************
+* 2024/7/31: Updated pre-training suggestions in the `Advanced Usage` section of [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
 * 2024/7/15: We scaled up the pre-trained tokens from 100B to 250B, with the number of synthesized instruction-response pairs reaching 500M! Below, we show the performance trend on downstream tasks throughout the pre-training process:
-<p align='center'>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/66711d2ee12fa6cc5f5dfc89/0okCfRkC6uALTfuNxt0Fa.png" width="700">
+<p align='left'>
+<img src="https://cdn-uploads.huggingface.co/production/uploads/66711d2ee12fa6cc5f5dfc89/0okCfRkC6uALTfuNxt0Fa.png" width="500">
 </p>
 * 2024/6/21: Released the [paper](https://huggingface.co/papers/2406.14491), [code](https://github.com/microsoft/LMOps), and [resources](https://huggingface.co/instruction-pretrain)
 
 ## Resources
-**🤗 We share our data and models with example usages, feel free to open any issues or discussions! 🤗**
+**🤗 We share our data and models with example usages, feel free to open any discussions at [this page](https://huggingface.co/papers/2406.14491)! 🤗**
 
+- Thanks to the demo [davanstrien/instruction-synthesizer](https://huggingface.co/spaces/davanstrien/instruction-synthesizer) for implementing our approach
 - Context-Based Instruction Synthesizer: [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
 - Fine-Tuning Data for the Synthesizer: [ft-instruction-synthesizer-collection](https://huggingface.co/datasets/instruction-pretrain/ft-instruction-synthesizer-collection)
 - General Models Pre-Trained from Scratch (on 100B tokens):
@@ -38,6 +40,7 @@ We explore supervised multitask pre-training by proposing ***Instruction Pre-Tra
 - General Instruction-Augmented Corpora: [general-instruction-augmented-corpora](https://huggingface.co/datasets/instruction-pretrain/general-instruction-augmented-corpora)
 - Domain-Specific Instruction-Augmented Corpora (no finance data to avoid ethical issues): [medicine-instruction-augmented-corpora](https://huggingface.co/datasets/instruction-pretrain/medicine-instruction-augmented-corpora)
 
+
 ## General Pre-Training From Scratch
 We augment the [RefinedWeb corpora](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) with instruction-response pairs generated by our [context-based instruction synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer) to pre-train general language models from scratch.
 
@@ -83,7 +86,7 @@ Instruction Pre-Training
 }
 ```
 
-[AdaptLLM](https://huggingface.co/papers/2309.09530)
+[Adapt LLM to Domains](https://huggingface.co/papers/2309.09530)
 ```bibtex
 @inproceedings{
 cheng2024adapting,
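
---

As a usage note alongside this commit (not part of the diff itself): the instruction-augmented corpora linked in the Resources list above are ordinary Hugging Face dataset repos, so pulling a few records with the `datasets` library might look like the sketch below. The default configuration and the `train` split are assumptions about the repo layout, not something this commit documents.

```python
# Hedged sketch: stream a few records from the general instruction-augmented
# corpora linked in the Resources list. Assumes the dataset repo exposes a
# default configuration with a "train" split; adjust if the layout differs.
from datasets import load_dataset

ds = load_dataset(
    "instruction-pretrain/general-instruction-augmented-corpora",
    split="train",
    streaming=True,  # stream instead of downloading the full corpus
)
for i, record in enumerate(ds):
    print(record)  # raw text augmented with synthesized instruction-response pairs
    if i == 2:
        break
```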
 
18
 
19
 
20
  **************************** **Updates** ****************************
21
+ * 2024/7/31: Updated pre-training suggestions in the `Advanced Usage` section of [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
22
  * 2024/7/15: We scaled up the pre-trained tokens from 100B to 250B, with the number of synthesized instruction-response pairs reaching 500M! Below, we show the performance trend on downstream tasks throughout the pre-training process:
23
+ <p align='left'>
24
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/66711d2ee12fa6cc5f5dfc89/0okCfRkC6uALTfuNxt0Fa.png" width="500">
25
  </p>
26
  * 2024/6/21: Released the [paper](https://huggingface.co/papers/2406.14491), [code](https://github.com/microsoft/LMOps), and [resources](https://huggingface.co/instruction-pretrain)
27
 
28
  ## Resources
29
+ **🤗 We share our data and models with example usages, feel free to open any discussions at [this page](https://huggingface.co/papers/2406.14491)! 🤗**
30
 
31
+ - Thanks to the demo [davanstrien/instruction-synthesizer](https://huggingface.co/spaces/davanstrien/instruction-synthesizer) for implementing our approach
32
  - Context-Based Instruction Synthesizer: [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
33
  - Fine-Tuning Data for the Synthesizer: [ft-instruction-synthesizer-collection](https://huggingface.co/datasets/instruction-pretrain/ft-instruction-synthesizer-collection)
34
  - General Models Pre-Trained from Scratch (on 100B tokes):
 
40
  - General Instruction-Augmented Corpora: [general-instruction-augmented-corpora](https://huggingface.co/datasets/instruction-pretrain/general-instruction-augmented-corpora)
41
  - Domain-Specific Instruction-Augmented Corpora (no finance data to avoid ethical issues): [medicine-instruction-augmented-corpora](https://huggingface.co/datasets/instruction-pretrain/medicine-instruction-augmented-corpora)
42
 
43
+
44
  ## General Pre-Training From Scratch
45
  We augment the [RefinedWeb corproa](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) with instruction-response pairs generated by our [context-based instruction synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer) to pre-train general langauge models from scratch.
46
 
 
86
  }
87
  ```
88
 
89
+ [Adapt LLM to Domains](https://huggingface.co/papers/2309.09530)
90
  ```bibtex
91
  @inproceedings{
92
  cheng2024adapting,
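Similarly, this repo carries `transformers`, `mistral`, and `text-generation-inference` tags, so loading the model through the standard `AutoModelForCausalLM` interface should work. A minimal sketch, assuming the repo id `instruction-pretrain/InstructLM-500M` from the org's "General Models Pre-Trained from Scratch" listing; substitute the model this README actually describes:

```python
# Minimal usage sketch, assuming the standard AutoModelForCausalLM /
# AutoTokenizer interface (per the "mistral" and "transformers" tags).
# The repo id below is an assumption, not stated in this diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "instruction-pretrain/InstructLM-500M"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Instruction pre-training augments raw corpora with"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```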