---
library_name: transformers
language:
- km
license: mit
base_model: facebook/mbart-large-50
tags:
- generated_from_trainer
datasets:
- S-Sethisak
metrics:
- wer
- bleu
model-index:
- name: KOCS - Khmer Orthographic Correction System
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: khmer-orthography-correction-dataset
      type: S-Sethisak
    metrics:
    - name: Wer
      type: wer
      value: 0.05181716833890747
    - name: Bleu
      type: bleu
      value:
        score: 91.23760284388217
        counts:
        - 64405
        - 56188
        - 48516
        - 41469
        totals:
        - 67099
        - 60367
        - 53696
        - 47271
        precisions:
        - 95.98503703482913
        - 93.07734358175824
        - 90.3530989272944
        - 87.7260899917497
        bp: 0.9945898677227208
        sys_len: 67099
        ref_len: 67463
---

# KOCS - Khmer Orthographic Correction System

This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on the khmer-orthography-correction-dataset.
It achieves the following results on the evaluation set:

- Loss: 0.0810
- Cer: 0.0214
- Wer: 0.0518
- Bleu: 91.2376 (n-gram precisions: 95.99 / 93.08 / 90.35 / 87.73; brevity penalty: 0.9946; sys_len: 67099; ref_len: 67463)

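The Bleu score above decomposes as the brevity penalty times the geometric mean of the four n-gram precisions. A quick sanity check in Python, using the statistics copied from the evaluation output:

```python
import math

# Final-epoch sacrebleu statistics from the evaluation output above
counts = [64405, 56188, 48516, 41469]  # matched 1- to 4-gram counts
totals = [67099, 60367, 53696, 47271]  # total 1- to 4-grams in the system output
sys_len, ref_len = 67099, 67463

# n-gram precisions, in percent
precisions = [100 * c / t for c, t in zip(counts, totals)]

# Brevity penalty: penalizes system output shorter than the reference
bp = math.exp(1 - ref_len / sys_len) if sys_len < ref_len else 1.0

# BLEU = brevity penalty * geometric mean of the n-gram precisions
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

print(round(bleu, 4))  # ≈ 91.2376
```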
## Model description

KOCS is a sequence-to-sequence Khmer orthographic correction model: given Khmer text containing spelling and orthography errors, it generates a corrected version. It was obtained by fine-tuning [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on a text2text-generation task.

## Intended uses & limitations

The model is intended for correcting orthographic errors in Khmer text. Residual errors remain (CER 0.0214, WER 0.0518 on the evaluation set), so output should be reviewed where accuracy is critical.

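As a sketch of intended use (the repo/checkpoint path, the `km_KH` language code, and the generation settings below are assumptions, not confirmed details of this model), inference follows the standard mBART-50 seq2seq pattern:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder id: substitute the actual Hub repo or local checkpoint path
model_id = "path/to/kocs-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# mBART-50 uses language codes; km_KH is Khmer
# (assumes the fine-tune kept the original language-code setup)
tokenizer.src_lang = "km_KH"

text = "..."  # a Khmer sentence with spelling errors
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["km_KH"],  # assumption
    max_new_tokens=256,
    num_beams=4,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```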
## Training and evaluation data

The model was fine-tuned and evaluated on the khmer-orthography-correction-dataset (S-Sethisak). Further details about the dataset are not documented here.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: adamw_torch_fused (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP

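The hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the exact training script; `output_dir` and `predict_with_generate` are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="kocs-mbart-large-50",   # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,      # effective train batch size: 64
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    seed=42,
    fp16=True,                          # "Native AMP" mixed precision
    predict_with_generate=True,         # assumed; needed to compute WER/CER/BLEU
)
```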
### Training results

| Training Loss | Epoch | Step | Validation Loss | Cer    | Wer    | Bleu  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:-----:|
| 1.0649        | 1.0   | 842  | 0.3280          | 0.0732 | 0.1732 | 71.92 |
| 0.2503        | 2.0   | 1684 | 0.1941          | 0.0442 | 0.1100 | 81.84 |
| 0.1299        | 3.0   | 2526 | 0.1461          | 0.0393 | 0.0906 | 85.38 |
| 0.0876        | 4.0   | 3368 | 0.1189          | 0.0333 | 0.0754 | 87.62 |
| 0.0531        | 5.0   | 4210 | 0.1024          | 0.0291 | 0.0667 | 89.01 |
| 0.0357        | 6.0   | 5052 | 0.0933          | 0.0248 | 0.0596 | 90.12 |
| 0.0262        | 7.0   | 5894 | 0.0874          | 0.0247 | 0.0575 | 90.47 |
| 0.0191        | 8.0   | 6736 | 0.0838          | 0.0225 | 0.0533 | 91.05 |
| 0.015         | 9.0   | 7578 | 0.0817          | 0.0215 | 0.0529 | 91.06 |
| 0.0135        | 10.0  | 8420 | 0.0810          | 0.0214 | 0.0518 | 91.24 |

The Bleu column reports the corpus-level sacrebleu score; the full statistics for the final epoch (n-gram counts, totals, precisions, and brevity penalty) are given in the metadata above.

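Cer and Wer in the table are character- and word-level error rates: the Levenshtein edit distance between hypothesis and reference, divided by the reference length. A minimal, self-contained sketch of the computation:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),   # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]

def error_rate(ref, hyp):
    return edit_distance(ref, hyp) / len(ref)

ref, hyp = "the cat sat", "the cat sit"
cer = error_rate(ref, hyp)                  # character level
wer = error_rate(ref.split(), hyp.split())  # word level
print(round(cer, 3), round(wer, 3))
```

Note that for Khmer, word-level WER depends on how the text is segmented into words, since Khmer script does not use spaces between words.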
### Framework versions

- Transformers 4.57.2
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1