corrected a typo
#5
by IlyasMoutawwakil HF Staff - opened
README.md
CHANGED
@@ -131,7 +131,7 @@ Falcon-40B was trained on 1,000B tokens of [RefinedWeb](https://huggingface.co/d
 | **Data source** | **Fraction** | **Tokens** | **Sources** |
 |--------------------|--------------|------------|-----------------------------------|
 | [RefinedWeb-English](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) | 75% | 750B | massive web crawl |
-| RefinedWeb-Europe | 7% | 70B | European massive
+| RefinedWeb-Europe | 7% | 70B | European massive web crawl |
 | Books | 6% | 60B | |
 | Conversations | 5% | 50B | Reddit, StackOverflow, HackerNews |
 | Code | 5% | 50B | |