Update README.md
README.md
CHANGED
@@ -35,7 +35,7 @@ The training corpus consists of the following datasets:
 |----------|-------------|
 | Business & Finance | 736,071,807 |
 | News | 1,700,662,378 |
-| Education |
+| Education | 576,489,778 |
 | Social | 211,000,000 |
 | Government | 40,492,117 |
 | Medical | 42,987,587 |
@@ -44,7 +44,6 @@ The training corpus consists of the following datasets:
 | Research Articles | 4,185,649,758 |
 | Law | 467,994,847 |
 | Travel | 6,948,290 |
-| Buddhism | 21,600,000 |
 | Others | 4,410,619 |
 
 *Token counts calculated using Qwen3 Tokenizer