Update README.md
README.md
@@ -2,7 +2,7 @@
 license: mit
 ---
 
-llama3 variant for 20 Indian languages:-
+llama3 variant for 22 Indian languages:-
 1. Tamil
 2. Telugu
 3. Assamese
@@ -23,6 +23,8 @@ llama3 variant for 20 Indian languages:-
 18. Dogri
 19. English
 20. Arabic
+21. Santali
+22. Bodo
 
 We first pre-trained the model on 100 million plus Indic language tokens.
 Then, it was finetuned on close sourced GenZ_Vikas datasets consisting 7.5 million SFT pairs, including 5.5 million Hindi SFT pairs.