---
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
---

# NepaliGPT: Nepali Language Generative Pretrained Transformer Model

This is an experiment in developing a language generation model for the Nepali language: a causal language model that predicts the next possible tokens given a context in Nepali.

# Dataset Used

A large corpus of 9.3 GB was collected from different sources on the internet. The sources include:

- Nepali books found online.
- Nepali news articles from Nepali news portals.
- Nepali text collected from different open-source Nepali NLP datasets.

# Hyperparameters Used

Learning rate -> 2e-5 \
Weight decay -> 0.01 \
Number of training epochs -> 5 \
bf16 -> True \
Base model architecture -> GPT-2
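
The hyperparameters above can be sketched as a `transformers` training configuration. This is a minimal illustration, not the exact training script: only the five listed values come from this card, while `output_dir` and every omitted setting (batch size, scheduler, etc.) are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="nepaligpt",  # assumed; not stated in the card
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=5,
    bf16=True,               # bfloat16 mixed-precision training
)
```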

## Training Results

It achieves the following results on the evaluation set:

| Training Loss | Validation Loss | Perplexity |
|:-------------:|:---------------:|:----------:|
|    3.3968     |     3.2705      |  26.3245   |
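
The reported perplexity follows directly from the validation loss: for a causal language model, perplexity is the exponential of the cross-entropy loss. A quick check:

```python
import math

# Perplexity = exp(cross-entropy loss) for a causal language model.
val_loss = 3.2705
print(round(math.exp(val_loss), 2))  # close to the table's 26.3245
```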