---
license: bsd-3-clause-clear
language:
- ne
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
---

# NepaliGPT: Nepali Language Generative Pretrained Transformer Model
NepaliGPT is an experimental language generation model for the Nepali language.
It is a causal language model that predicts the next tokens given a context in Nepali.
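
As a rough illustration of next-token generation with a causal language model, the sketch below uses the `transformers` text-generation pipeline. The function name `generate_nepali_text` and its parameters are hypothetical, and `model_id` is left as a required argument because this card does not state the repository ID; treat this as a minimal sketch rather than the authors' own inference code.

```python
def generate_nepali_text(model_id: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Continue a Nepali prompt with a causal language model.

    `model_id` is an assumption: pass the Hub repository ID (or a local
    path) of the NepaliGPT checkpoint you want to load.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    # Sampling is enabled so repeated calls can give different continuations.
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # The pipeline returns a list of dicts; each holds the prompt plus its continuation.
    return outputs[0]["generated_text"]
```

Calling `generate_nepali_text(model_id, "नेपाल एक")` with the correct repository ID should return the prompt followed by the generated continuation.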

# Dataset Used
A 9.3 GB corpus was collected from various sources on the internet, including:
- Nepali books found online.
- Nepali news articles from Nepali news portals.
- Nepali text from various open-source Nepali NLP datasets.

# Hyperparameters Used
- Learning rate: 2e-5
- Weight decay: 0.01
- Number of training epochs: 5
- bf16: True
- Base model architecture: GPT-2
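
The values above could map onto `transformers.TrainingArguments` roughly as follows. This is a sketch under the assumption that the Hugging Face `Trainer` API was used, which this card does not confirm; `HYPERPARAMETERS` and `build_training_arguments` are hypothetical names, and any setting not listed above (batch size, scheduler, warmup) is left at the library default.

```python
# Hyperparameters reported in this model card, keyed by the
# corresponding TrainingArguments parameter names.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "num_train_epochs": 5,
    "bf16": True,
}


def build_training_arguments(output_dir: str):
    """Build TrainingArguments from the reported values (assumed Trainer setup)."""
    # Imported lazily so the dictionary above can be inspected without
    # transformers installed. Note that bf16=True may raise an error on
    # hardware that does not support bfloat16.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```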

## Training Results

The model achieves the following results on the evaluation set:

| Training Loss | Validation Loss | Perplexity |
|:-------------:|:---------------:|:----------:|
| 3.3968        | 3.2705          | 26.3245    |
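
The reported perplexity appears consistent with the usual definition for language models, the exponential of the mean cross-entropy loss (in nats) on the evaluation set. The short check below assumes that definition; the card does not state how the value was computed, and the helper name `perplexity` is illustrative.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity as exp(loss), assuming the loss is mean cross-entropy in nats."""
    return math.exp(cross_entropy_loss)


validation_loss = 3.2705  # from the table above
reported_perplexity = 26.3245  # from the table above

computed = perplexity(validation_loss)
# The two values should agree up to rounding of the reported loss.
print(f"computed={computed:.4f} reported={reported_perplexity:.4f}")
```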