Update README.md
README.md
CHANGED
@@ -16,23 +16,22 @@ K2 is a fully transparent large language model on par with Llama 2 - 70B.
 <center><img src="eval_table_temp.png" alt="eval table"/></center>
 
 ## Datasets and Mix
-| Dataset | Starting Tokens | Multiplier | Total Tokens
+| Dataset | Starting Tokens | Multiplier | Total Tokens |% of Total |
 | ----------- | ----------- | ----------- | ----------- | ----------- |
-| dm-math | 4.
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-| Checkpoint 356[link] | Checkpoint 351[link] | Checkpoint 355[link] | Checkpoint 355[link] |
+| dm-math | 4.33B | 3x | 13B | 1% |
+| pubmed-abstracts | 4.77B | 3x | 14.3B | 1.1% |
+| uspto | 4.77B | 3x | 14.3B | 1.1% |
+| pubmed-central | 26B | 1x | 26B | 2% |
+| redpajama.arxiv | 27.3B | 1x | 27.3B | 2.1% |
+| starcoder.spm | 67.6B | 0.5x | 33.8B | 2.6% |
+| starcoder.fim | 67.6B | 0.5x | 33.8B | 2.6% |
+| redpajama.stackexchange | 61.1B | 1x | 61.1B | 4.7% |
+| starcoder | 132.6B | 0.5x | 66.3B | 5.1% |
+| pile-of-law | 76.7B | 1x | 76.7B | 5.9% |
+| redpajama.book | 80.6B | 1x | 80.6B | 6.2% |
+| s2orc | 107.9B | 1x | 107.9B | 8.3% |
+| redpajama.wikipedia | 22.1B | 6x | 132.6B | 10.2% |
+| refinedweb | 612.3B | 1x | 612.3B | 47.1% |
 
 ## First 10 Checkpoints
 | Checkpoints | |
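
The `Total Tokens` and `% of Total` columns in the added table rows appear to follow from `Starting Tokens × Multiplier` and each dataset's share of the summed totals. The sketch below re-derives both columns as a consistency check. It is only an illustration: the names `DATA_MIX` and `summarize_mix` are hypothetical and not part of the K2 repository, and rounding to one decimal place is an assumption about how the README values were produced.

```python
# Rows copied from the added table: (dataset, starting tokens in billions, multiplier).
# DATA_MIX and summarize_mix are illustrative names, not part of the K2 codebase.
DATA_MIX = [
    ("dm-math", 4.33, 3),
    ("pubmed-abstracts", 4.77, 3),
    ("uspto", 4.77, 3),
    ("pubmed-central", 26.0, 1),
    ("redpajama.arxiv", 27.3, 1),
    ("starcoder.spm", 67.6, 0.5),
    ("starcoder.fim", 67.6, 0.5),
    ("redpajama.stackexchange", 61.1, 1),
    ("starcoder", 132.6, 0.5),
    ("pile-of-law", 76.7, 1),
    ("redpajama.book", 80.6, 1),
    ("s2orc", 107.9, 1),
    ("redpajama.wikipedia", 22.1, 6),
    ("refinedweb", 612.3, 1),
]


def summarize_mix(rows):
    """Return ({dataset: (total_tokens_B, percent_of_total)}, grand_total_B).

    Per-dataset values are rounded to one decimal place (an assumption);
    the grand total is left unrounded.
    """
    totals = {name: start * mult for name, start, mult in rows}
    grand_total = sum(totals.values())
    per_dataset = {
        name: (round(total, 1), round(100 * total / grand_total, 1))
        for name, total in totals.items()
    }
    return per_dataset, grand_total


per_dataset, grand_total = summarize_mix(DATA_MIX)
for name, (total_b, pct) in per_dataset.items():
    print(f"{name:<24} {total_b:>7.1f}B {pct:>5.1f}%")
print(f"{'total':<24} {grand_total:>7.1f}B")
```

Under these assumptions the recomputed values agree with the table to one decimal place, and the mix sums to roughly 1.3T tokens.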