gpt2NepaliCasualLM: Epoch 3

Changed files:
- README.md +13 -50
- config.json +3 -2
- generation_config.json +3 -0
- tf_model.h5 +1 -1
README.md CHANGED

@@ -1,21 +1,9 @@
 ---
+tags:
+- generated_from_keras_callback
 model-index:
 - name: GPT2-Nepali-Casual-LM
   results: []
-datasets:
-- raygx/NepaliCorpus
-language:
-- ne
-pipeline_tag: text-generation
-license: gpl
-metrics:
-- accuracy
-library_name: transformers
-tags:
-- text
-- nepali
-- nepali text generation
-- gpt2 for nepali
 ---
 
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
@@ -23,62 +11,37 @@ probably proofread and complete it, then remove this comment. -->
 
 # GPT2-Nepali-Casual-LM
 
-This model was trained from scratch on
-
+This model was trained from scratch on an unknown dataset.
+It achieves the following results on the evaluation set:
 
 
-
+## Model description
 
-More information needed
+More information needed
 
 ## Intended uses & limitations
 
-
+More information needed
 
 ## Training and evaluation data
 
-
-[In this kaggle notebook](https://www.kaggle.com/code/reganmaharjan/tokenizer-nepcov19tweets) you'll find the actual sources of the data.
+More information needed
 
 ## Training procedure
 
-I used kaggle to train the model. Since, there is limitation to session time and GPU usage as well.
-I have to train the model in multiple batches of data and training sessions.
-Other than that, I followed the hugging face course.
-
 ### Training hyperparameters
 
-
+The following hyperparameters were used during training:
+- optimizer: None
+- training_precision: float32
 
 ### Training results
 
-
-model_pipeline(["बिहीबार सिंगापुरदेखि न्यूयोर्कसम्म",<br>
-"अधिकांस दोस्रो रोजाइका खेलाडी",<br>
-"पहिलो हाफमा गरेको ",<br>
-"कालीमाटी फलफूल तथा तरकारी ",<br>
-"हिल्सा नाका हुँदै यस वर्ष ९ हजार",<br>
-"ओलीको सरकार बनेपछि इन्धनको",<br>
-"मेरो नाम श्याम हो"])
-<br><br>
-Gave following outputs:<br>
-[[{'generated_text': 'बिहीबार सिंगापुरदेखि न्यूयोर्कसम्म पनि छन् । तर, यो खबर आजको कान्तिपुर दैनिकमा छ । यो खबर आजको'}],<br>
-[{'generated_text': 'अधिकांस दोस्रो रोजाइका खेलाडी हुन् । उनी भन्छन्, ‘ यो कुरा हो । ’ उनले भने, ‘'}],<br>
-[{'generated_text': 'पहिलो हाफमा गरेको थियो । तर, यो खबर आजको कान्तिपुर दैनिकमा छ । यो खबर आजको कान्तिपुर दैनिकमा छ'}],<br>
-[{'generated_text': 'कालीमाटी फलफूल तथा तरकारी खेती गर्न सकिने व्यवस्था गरिएको छ । काठमाडौं । नेपाल राष्ट्र बैंकले गत आर्थिक वर्षमा १'}],<br>
-[{'generated_text': 'हिल्सा नाका हुँदै यस वर्ष ९ हजार ७ सय ७ ० रुपैयाँ बराबरको शेयर कारोबार भएको छ । यो अवधिमा'}],<br>
-[{'generated_text': 'ओलीको सरकार बनेपछि इन्धनको मूल्य निर्धारण गर्न नसकेको हो । तर, अहिले पनि यो मूल्य निर्धारण गर्न नसकेको हो'}],<br>
-[{'generated_text': 'मेरो नाम श्याम हो । यो नाम श्याम हो । यो नाम श्याम हो । यो नाम श्याम हो ।'}]]<br>
-
-This is quite good result. In my opinion.
-Since the dataset used was mostly crawled from news portals, I think the model quite caught the gist of news.
-As it can be noted for all the inputs except the last one.<br>
-For the last input "मेरो नाम श्याम हो" model was able to know that it was end of a sentence and it added "।",
-and it also seem to know what "श्याम" is, but it wasn't able to generate any new sentence and kept repeating it.
+
 
 ### Framework versions
 
 - Transformers 4.28.1
 - TensorFlow 2.11.0
 - Datasets 2.1.0
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
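The nested result shown in the removed card (one `[{'generated_text': …}]` list per prompt) is the standard `transformers` text-generation pipeline return shape for a batch of prompts. A minimal sketch of flattening it, using hand-copied outputs from the card rather than running the model:

```python
# Pipeline output for a list of prompts: the outer list has one entry per
# prompt, and each inner list holds the candidate generations (dicts with
# a 'generated_text' key). Sample strings copied from the card above.
outputs = [
    [{"generated_text": "बिहीबार सिंगापुरदेखि न्यूयोर्कसम्म पनि छन् ।"}],
    [{"generated_text": "मेरो नाम श्याम हो । यो नाम श्याम हो ।"}],
]

# Take the first (here, only) candidate for each prompt.
texts = [candidates[0]["generated_text"] for candidates in outputs]
```

With a single string input instead of a list, the pipeline returns just one inner list, so the extra indexing level disappears.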
config.json CHANGED

@@ -6,9 +6,9 @@
     "GPT2LMHeadModel"
   ],
   "attn_pdrop": 0.1,
-  "bos_token_id":
+  "bos_token_id": 1,
   "embd_pdrop": 0.1,
-  "eos_token_id":
+  "eos_token_id": 2,
   "id2label": {
     "0": "NEUTRAL",
     "1": "POSITIVE",
@@ -28,6 +28,7 @@
   "n_inner": null,
   "n_layer": 6,
   "n_positions": 1024,
+  "pad_token_id": 3,
   "reorder_and_upcast_attn": false,
   "resid_pdrop": 0.1,
   "scale_attn_by_inverse_layer_idx": false,
generation_config.json CHANGED

@@ -1,4 +1,7 @@
 {
   "_from_model_config": true,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "pad_token_id": 3,
   "transformers_version": "4.28.1"
 }
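The three special-token ids added above can be sanity-checked by parsing the resulting file with only the standard library; a small sketch, with the JSON text copied from the new version of the file:

```python
import json

# The updated generation_config.json as committed above.
generation_config = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 3,
  "transformers_version": "4.28.1"
}
""")

# bos/eos/pad should be distinct so generation can tell them apart.
special_ids = {
    generation_config[k]
    for k in ("bos_token_id", "eos_token_id", "pad_token_id")
}
```

These ids must agree with the matching fields in config.json (they do here: 1, 2, and 3), otherwise `generate()` would stop and pad on different tokens than the model was trained with.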
tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:09dd449b2776c96a5de70924d183f332ca28f8ace94cc3d7aa1d7c362d146a4a
 size 326955968
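tf_model.h5 is stored through Git LFS, which is why the diff touches only a small pointer file: the `oid` line changes when the new epoch's weights are uploaded, while the repo itself never holds the binary. A sketch of reading such a pointer, with the text copied from the new version above:

```python
# A git-lfs pointer file is three "key value" lines.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:09dd449b2776c96a5de70924d183f332ca28f8ace94cc3d7aa1d7c362d146a4a
size 326955968
"""

# Split each line once on the first space: key -> value.
fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())

algo, digest = fields["oid"].split(":", 1)  # hash algorithm and hex digest
size_bytes = int(fields["size"])            # size of the real weights file
```

The `size` line is unchanged in the diff (326955968 bytes, about 312 MiB), which is expected: continuing training alters the weight values but not the model's shape.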