Update README.md
README.md
CHANGED
@@ -144,11 +144,9 @@ Optimizer states before annealing will be released in a future update.
 
 <details><summary>5. Data Distribution for every phase</summary>
 
-<a href="https://github.com/RUC-GSAI/YuLan-Mini/blob/main/pretrain/final.pdf">High-resolution version</a>
-
 <a href="https://github.com/RUC-GSAI/YuLan-Mini/blob/main/pretrain/final.pdf">
 <div align=center>
-<img src="
+<img src="assets/data_distribution_for_every_phase.png">
 </div>
 </a>
 
@@ -159,7 +157,7 @@ Optimizer states before annealing will be released in a future update.
 
 Data cleaning and synthesis pipeline:
 <div align=center>
-<img src="
+<img src="assets/data-pipeline.png">
 </div>
 
 The synthetic data we are using is released in <a href="https://huggingface.co/collections/yulan-team/yulan-mini-676d214b24376739b00d95f3">YuLan-Mini-Datasets</a>