---
datasets:
- Sakonii/nepalitext-language-model-dataset
---

# NepaliGPT: Nepali Language Generative Pretrained Transformer Model

This is an experiment in developing a language generation model for the Nepali language: a causal language model that predicts the next possible tokens given a context in Nepali.
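The idea behind causal language modeling can be illustrated with a toy next-token predictor. The sentences and counts below are made up purely for demonstration; NepaliGPT learns such next-token distributions over Nepali subword tokens with a GPT-2 architecture rather than raw bigram counts:

```python
from collections import Counter, defaultdict

# Tiny made-up Nepali "corpus", split on whitespace for simplicity.
corpus = "नेपाल राम्रो छ । नेपाल राम्रो देश हो ।".split()

# Count bigrams: which token follows which.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent continuation seen after `token`."""
    return bigrams[token].most_common(1)[0][0]

print(predict_next("नेपाल"))  # → राम्रो
```

A neural causal LM generalizes this: instead of a lookup table of counts, a transformer conditions on the whole preceding context.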

# Dataset Used

A large corpus of 9.3 GB was collected from different sources on the internet. The sources include:

- Nepali books found online.
- Nepali news articles from Nepali news portals.
- Nepali text collected from different open-source Nepali NLP datasets.

# Hyperparameters Used

- Learning rate -> 2e-5
- Weight decay -> 0.01
- Number of training epochs -> 5
- bf16 -> True
- Base model architecture -> gpt-2
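With the Hugging Face `transformers` Trainer API, the hyperparameters above would correspond to a configuration along these lines. This is a sketch, not the authors' actual training script; `output_dir` is an assumed placeholder, and only the listed values come from this README:

```python
from transformers import TrainingArguments

# Sketch of the training configuration described above.
# output_dir is a hypothetical placeholder; the remaining values
# are the hyperparameters listed in this README.
training_args = TrainingArguments(
    output_dir="nepali-gpt",
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=5,
    bf16=True,
)
```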

## Training Results

It achieves the following results on the evaluation set:

| Training Loss | Validation Loss | Perplexity |
|:-------------:|:---------------:|:----------:|
|    3.3968     |     3.2705      |  26.3245   |
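Perplexity here is the exponential of the validation cross-entropy loss, which can be checked directly against the table:

```python
import math

val_loss = 3.2705  # validation loss from the table above
perplexity = math.exp(val_loss)
print(round(perplexity, 4))  # → 26.3245
```

The reported perplexity of 26.3245 is consistent with the reported validation loss of 3.2705.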