Update README.md
README.md
CHANGED
@@ -137,11 +137,11 @@ The model was evaluated on a held-out test set of **5,081 samples**, covering a
 #### 1. Overall Performance
 | Metric | Score | Note |
 | :--- | :--- | :--- |
-| **BLEU** | **86.
 | **Word Accuracy** | **93.63%** | Robust word-level correction |
 | **Exact Match** | **52.23%** | Entire sentence perfectly restored |
-| **WER** | **0.
-| **CER** | **0.

 *Note: The Exact Match score reflects the inherent ambiguity in the Vietnamese language (e.g., "muon" could be "muốn", "mượn", or "muộn"), where multiple correct interpretations may exist without broader paragraph context.*

@@ -150,9 +150,9 @@ The model's performance varies based on the complexity and length of the input:

 | Category | Length (words) | Accuracy | Sample Count |
 | :--- | :--- | :--- | :--- |
-| **Short** | < 10 | **
-| **Medium** | 10 - 30 | **47.
-| **Long** | > 30 | **

 *Analysis: The model performs exceptionally well on short to medium sentences. Accuracy declines on longer sequences (>30 words), likely due to the increased probability of cumulative errors and the 256-token limit.*

 #### 1. Overall Performance
 | Metric | Score | Note |
 | :--- | :--- | :--- |
+| **BLEU** | **86.34** | High linguistic and semantic fidelity |
 | **Word Accuracy** | **93.63%** | Robust word-level correction |
 | **Exact Match** | **52.23%** | Entire sentence perfectly restored |
+| **WER** | **0.0838** | ~8.38% error rate per word |
+| **CER** | **0.0360** | ~3.60% error rate per character |

 *Note: The Exact Match score reflects the inherent ambiguity in the Vietnamese language (e.g., "muon" could be "muốn", "mượn", or "muộn"), where multiple correct interpretations may exist without broader paragraph context.*

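The README does not say how WER, CER, and Exact Match were computed. As a rough illustration only, the sketch below assumes the common definitions: corpus-level Levenshtein distance over words (WER) and characters (CER), divided by the reference length, plus a strict string comparison for Exact Match. The function names `edit_distance` and `corpus_metrics` are hypothetical, and the NFC normalization step is an assumption added so that composed and decomposed Vietnamese diacritics compare equal. The reported Word Accuracy (93.63%) is not simply 1 − WER (which would be 91.62%), so it is presumably defined differently and is not reproduced here.

```python
import unicodedata
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_metrics(refs: list[str], hyps: list[str]) -> dict[str, float]:
    """Corpus-level WER, CER, and exact-match rate (assumed definitions)."""
    # Assumption: normalize to NFC so diacritics are compared consistently.
    refs = [unicodedata.normalize("NFC", r) for r in refs]
    hyps = [unicodedata.normalize("NFC", h) for h in hyps]
    pairs = list(zip(refs, hyps))
    word_errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    char_errors = sum(edit_distance(r, h) for r, h in pairs)
    n_words = sum(len(r.split()) for r in refs)
    n_chars = sum(len(r) for r in refs)
    exact = sum(r == h for r, h in pairs)
    return {
        "wer": word_errors / n_words,
        "cer": char_errors / n_chars,
        "exact_match": exact / len(refs),
    }


# Toy data: the first hypothesis picks "mượn" where the reference has "muốn".
refs = ["tôi muốn đi học", "hôm nay trời đẹp"]
hyps = ["tôi mượn đi học", "hôm nay trời đẹp"]
metrics = corpus_metrics(refs, hyps)
print(metrics)
```

In this toy case a single ambiguous word costs one of eight words but half of the sentences, which mirrors how a low WER can coexist with a much lower Exact Match score.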

 | Category | Length (words) | Accuracy | Sample Count |
 | :--- | :--- | :--- | :--- |
+| **Short** | < 10 | **60.88%** | 2,927 |
+| **Medium** | 10 - 30 | **47.83%** | 3,577 |
+| **Long** | > 30 | **25.91%** | 552 |

 *Analysis: The model performs exceptionally well on short to medium sentences. Accuracy declines on longer sequences (>30 words), likely due to the increased probability of cumulative errors and the 256-token limit.*

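A per-length breakdown like the one in this hunk could be produced along the following lines. This is a minimal sketch under stated assumptions: the README does not specify whether "Accuracy" here means sentence-level exact match or whether length is counted on the input or the reference, so the code assumes exact match and whitespace-split reference length. The helper names `length_bucket` and `accuracy_by_length` are hypothetical.

```python
from collections import defaultdict


def length_bucket(n_words: int) -> str:
    """Map a word count to the README's categories: < 10, 10 - 30, > 30."""
    if n_words < 10:
        return "Short"
    if n_words <= 30:
        return "Medium"
    return "Long"


def accuracy_by_length(refs: list[str], hyps: list[str]) -> dict[str, tuple[float, int]]:
    """Return {category: (accuracy, sample_count)}.

    Assumption: accuracy is sentence-level exact match, and length is the
    number of whitespace-separated words in the reference.
    """
    hits: dict[str, int] = defaultdict(int)
    counts: dict[str, int] = defaultdict(int)
    for ref, hyp in zip(refs, hyps):
        bucket = length_bucket(len(ref.split()))
        counts[bucket] += 1
        hits[bucket] += int(ref == hyp)
    return {b: (hits[b] / counts[b], counts[b]) for b in counts}


# Toy data: one 3-word, one 15-word, and one 40-word sentence.
refs = ["a b c", "w " * 14 + "w", "w " * 39 + "w"]
hyps = ["a b c", "w " * 14 + "x", "w " * 39 + "w"]
report = accuracy_by_length(refs, hyps)
for category, (accuracy, count) in report.items():
    print(f"{category}: {accuracy:.2%} over {count} sample(s)")
```

Under the exact-match assumption, one wrong word anywhere fails the whole sentence, so longer inputs have more chances to miss, which is consistent with the cumulative-error explanation in the analysis note.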