DunnBC22/opus-mt-ko-en-Korean_Parallel_Corpora

opus-mt-ko-en-Korean_Parallel_Corpora

This model is a fine-tuned version of Helsinki-NLP/opus-mt-ko-en.

Model description

For more information on how it was created, check out the following link: https://github.com/DunnBC22/NLP_Projects/blob/main/Machine%20Translation/Korean%20to%20English%20(Korean%20Parallel%20Corpora)/Korean_Parallel_Corpora_OPUS_Translation_Project.ipynb

I apologize in advance if any of the generated text is less than stellar. I am well-intentioned, but the model can sometimes produce strange outputs.

Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex problem using technology.
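For readers who want to try the model, the following is a minimal sketch of loading it through the `transformers` translation pipeline. It is not taken from the project notebook: the `translate` helper, the `max_length=128` default, and the sample sentence are illustrative assumptions.

```python
# Repository ID as shown on this model card.
MODEL_ID = "DunnBC22/opus-mt-ko-en-Korean_Parallel_Corpora"


def translate(texts, model_id=MODEL_ID, max_length=128):
    """Translate a list of Korean strings to English.

    `max_length` is an assumed generation cap, not a value from the card.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    translator = pipeline("translation", model=model_id)
    outputs = translator(texts, max_length=max_length)
    # Each pipeline result is a dict with a "translation_text" field.
    return [item["translation_text"] for item in outputs]


# Example call (downloads the model weights on first use):
# print(translate(["안녕하세요. 만나서 반갑습니다."]))
```

Output quality will vary with the input, as noted in the model description.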

Training and evaluation data

Dataset Source: https://huggingface.co/datasets/Moo/korean-parallel-corpora

[Figure: Histogram of Korean Input Word Counts]

[Figure: Histogram of English Input Word Counts]

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
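The sketch below shows how these values relate to each other and how they might be passed to `Seq2SeqTrainingArguments`. It assumes a single training device (so 8 × 4 = 32 matches the reported total batch size) and no warmup, since none is listed. The helper names, the `output_dir` value, and any step counts used with `linear_lr` are hypothetical, because the card does not state the number of training steps.

```python
# Values copied from the hyperparameter list, using Trainer argument names.
HPARAMS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 6,
}


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Samples contributing to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=HPARAMS["learning_rate"], warmup_steps=0):
    """Learning rate under a linear schedule: optional warmup, then decay to 0."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


def build_training_args(output_dir="opus-mt-ko-en-finetuned"):
    """Build training arguments; output_dir is a placeholder name."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    # The Adam betas and epsilon listed above are the library defaults.
    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)
```

With a hypothetical run of 1,000 optimizer steps, `linear_lr(500, 1000)` gives half of the initial learning rate.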

Training results

- eval_loss: 2.6620
- eval_bleu: 14.3395
- eval_rouge:
  - rouge1: 0.4391
  - rouge2: 0.2022
  - rougeL: 0.3671
  - rougeLsum: 0.3671

The training result values are rounded to four decimal places.
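Note that the BLEU value appears to be on a 0 to 100 scale, while the ROUGE values are fractions between 0 and 1. To make the ROUGE-1 figure easier to interpret, here is a simplified sketch of a unigram-overlap F1 score. This is an illustration only: the card does not say which library computed the metrics, and this version uses plain whitespace tokenization with no stemming, so it will not reproduce the reported numbers exactly.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1 F1: unigram overlap between two strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A corpus-level score would average this value over all prediction and reference pairs in the evaluation set.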

Framework versions

Transformers 4.31.0
PyTorch 2.0.1+cu118
Datasets 2.14.4
Tokenizers 0.13.3
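To check whether a local environment matches these versions, a small sketch using the standard library's `importlib.metadata` could look like the following. The `version_mismatches` helper is hypothetical, and the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names for the frameworks listed.

```python
from importlib import metadata

# Versions listed on the card, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

The PyTorch build suffix (`+cu118`) depends on the CUDA variant installed, so a mismatch there does not necessarily indicate an incompatible version.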

License Notice

This model is a fine-tuned derivative of a pretrained model. Users must comply with the original model license.

Dataset Notice

This model was fine-tuned on third-party datasets which may have separate licenses or usage restrictions.
