# Jeju Satoru

## Project Overview

'Jeju Satoru' is a **bidirectional Jeju-Standard Korean translation model** developed to preserve the Jeju language, which is designated as an **'endangered language'** by UNESCO. The model aims to bridge the digital divide for elderly Jeju dialect speakers by improving their digital accessibility.

## Model Information

* **Base Model**: KoBART (`gogamza/kobart-base-v2`)
* **Model Architecture**: Seq2Seq (encoder-decoder structure)
* **Training Data**: Approximately 930,000 sentence pairs, built from the publicly available [Junhoee/Jeju-Standard-Translation](https://huggingface.co/datasets/Junhoee/Jeju-Standard-Translation) dataset, which is primarily based on text from the KakaoBrain JIT (Jeju-Island-Translation) corpus and transcribed data from the AI Hub Jeju dialect speech dataset.

## Training Strategy and Parameters

The model was trained in **two stages** to handle the complexities of the Jeju dialect:

1. **Domain Adaptation**: The model was first trained separately on Standard Korean and Jeju dialect sentences to help it deeply learn the grammar and style of each variety.
2. **Translation Fine-Tuning**: The model was then trained on the bidirectional dataset, with `[제주]` (Jeju) and `[표준]` (Standard) tags added to each sentence to explicitly guide the translation direction.
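The direction tagging in the second stage can be sketched with a small helper. Note this is an illustration only: the model card states that tags were added to guide direction, but the exact placement (prepended, space-separated) and whether the tag names the source or target variety are assumptions here.

```python
# Direction tags used during translation fine-tuning.
TAGS = {"jeju": "[제주]", "standard": "[표준]"}

def tag_sentence(sentence: str, variety: str) -> str:
    """Prepend the direction tag for the given variety ("jeju" or "standard").

    Assumption: the tag is prepended with a space; the model card only
    says tags were added to each sentence to guide translation direction.
    """
    if variety not in TAGS:
        raise ValueError(f"unknown variety: {variety}")
    return f"{TAGS[variety]} {sentence}"

print(tag_sentence("안녕하세요", "standard"))  # [표준] 안녕하세요
```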

The following key hyperparameters and techniques were applied for performance optimization:

* **Learning Rate**: 2e-5
* **Epochs**: 3
* **Batch Size**: 128
* **Weight Decay**: 0.01
* **Generation Beams**: 5
* **GPU Memory Efficiency**: Mixed-precision training (FP16) was used to reduce training time, along with gradient accumulation (steps: 16).
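Assuming training used Hugging Face's `Seq2SeqTrainingArguments`, the settings above map roughly to the following sketch. The output directory and the per-device/accumulation split of the batch size are illustrative, not taken from the model card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: maps the listed hyperparameters onto Seq2SeqTrainingArguments.
# A per-device batch size of 8 with 16 gradient-accumulation steps gives
# the effective batch size of 128 (8 * 16); the actual split is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="./jeju-satoru",        # illustrative path
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=8,     # 8 * 16 accumulation steps = 128
    gradient_accumulation_steps=16,
    weight_decay=0.01,
    fp16=True,                         # mixed-precision training
    predict_with_generate=True,
    generation_num_beams=5,
)
```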

## Performance Evaluation

The model's performance was comprehensively evaluated using both quantitative and qualitative methods.

### Quantitative Evaluation

| Direction | SacreBLEU | CHRF | BERTScore |
|---|---|---|---|
| Standard → Jeju Dialect | 64.86 | 72.68 | 0.94 |

### Qualitative Evaluation (Summary)

* **Adequacy**: The model accurately captures the meaning of most source sentences.
* **Fluency**: The translated sentences are grammatically correct and natural-sounding.
* **Tone**: While generally good at maintaining tone, the model has some limitations in perfectly reflecting the nuances and specific colloquial endings of the Jeju dialect.

## How to Use

You can easily load the model and run inference using the `transformers` library's `pipeline` function.
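A minimal sketch of pipeline inference follows. The model id `username/jeju-satoru` is a placeholder for the actual Hub repository (the model card does not state it), and the `[표준]` tag placement follows the training description above.

```python
from transformers import pipeline

# "username/jeju-satoru" is a placeholder -- substitute the model's actual
# Hugging Face Hub id. text2text-generation is the pipeline task for
# encoder-decoder (Seq2Seq) models such as KoBART.
translator = pipeline("text2text-generation", model="username/jeju-satoru")

# Prepend the direction tag as done during fine-tuning
# (tag placement is an assumption based on the model card).
result = translator("[표준] 안녕하세요, 만나서 반갑습니다.", num_beams=5)
print(result[0]["generated_text"])
```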