Commit
·
b07f058
1
Parent(s):
1c87807
Update evaluation results and README with new benchmarks
- Updated MER results: SEAME (0.2530), EMILIA (0.3046), CS-Dialogue (0.2541)
- Added 5 examples showing significant improvements over baseline
- Added link to original MERaLiON-2-3B model
- Updated eval_results with latest evaluation files
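
The percentages in the README's "Improvement" column are consistent with a signed relative change between the baseline MER and the fine-tuned MER. The following is a minimal sketch of that arithmetic, assuming the column is computed as `(trained - baseline) / baseline`; the function name `relative_change_pct` is hypothetical and is not taken from the repository.

```python
def relative_change_pct(baseline: float, trained: float) -> float:
    """Signed relative change in percent.

    For an error rate such as MER, a negative value means the
    fine-tuned model makes fewer errors than the baseline.
    """
    return (trained - baseline) / baseline * 100.0


# (baseline MER, fine-tuned MER) as reported in the README table of this commit.
results = {
    "SEAME": (0.3372, 0.2530),
    "EMILIA": (0.3201, 0.3046),
    "CS-Dialogue": (0.2258, 0.2541),
}

for name, (baseline, trained) in results.items():
    print(f"{name}: {relative_change_pct(baseline, trained):+.1f}%")
```

Under this assumption the sketch reproduces the table's -25.0%, -4.8%, and +12.5%. The positive sign for CS-Dialogue means MER went up on that benchmark, so it reads as a regression rather than an improvement.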
README.md
CHANGED
|
@@ -26,21 +26,65 @@ A fine-tuned version of [MERaLiON/MERaLiON-2-3B](https://huggingface.co/MERaLiON
|
|
| 26 |
|
| 27 |
| Benchmark | Baseline | This Model | Improvement |
|
| 28 |
|-----------|----------|------------|-------------|
|
| 29 |
-
| **SEAME**
|
| 30 |
-
| EMILIA | 0.
|
| 31 |
-
| CS-Dialogue | 0.
|
| 32 |
|
| 33 |
### Benchmark Descriptions
|
| 34 |
- **SEAME**: English-Mandarin code-switching conversational speech from Singapore/Malaysia (9,764 samples)
|
| 35 |
- **EMILIA**: Synthetic code-switching evaluation set (1,000 samples)
|
| 36 |
- **CS-Dialogue**: Code-switching dialogue evaluation set (359 samples)
|
| 37 |
|
| 38 |
## Training Configuration
|
| 39 |
|
| 40 |
### Model Architecture
|
| 41 |
| Parameter | Value |
|
| 42 |
|-----------|-------|
|
| 43 |
-
| Base Model | MERaLiON/MERaLiON-2-3B |
|
| 44 |
| Training Type | Full Fine-Tuning |
|
| 45 |
| Total Parameters | ~3.47B |
|
| 46 |
| Trainable Parameters | ~3.47B |
|
|
@@ -133,4 +177,4 @@ print(transcription)
|
|
| 133 |
|
| 134 |
## License
|
| 135 |
|
| 136 |
-
This model inherits the license of the base MERaLiON-2-3B model.
|
|
|
|
| 26 |
|
| 27 |
| Benchmark | Baseline | This Model | Improvement |
|
| 28 |
|-----------|----------|------------|-------------|
|
| 29 |
+
| **SEAME** | 0.3372 | **0.2530** | **-25.0%** |
|
| 30 |
+
| **EMILIA** | 0.3201 | **0.3046** | **-4.8%** |
|
| 31 |
+
| **CS-Dialogue** | 0.2258 | 0.2541 | +12.5% |
|
| 32 |
|
| 33 |
### Benchmark Descriptions
|
| 34 |
- **SEAME**: English-Mandarin code-switching conversational speech from Singapore/Malaysia (9,764 samples)
|
| 35 |
- **EMILIA**: Synthetic code-switching evaluation set (1,000 samples)
|
| 36 |
- **CS-Dialogue**: Code-switching dialogue evaluation set (359 samples)
|
| 37 |
|
| 38 |
+
## Examples
|
| 39 |
+
|
| 40 |
+
The examples below show improvements from the baseline to the DPO-trained model:
|
| 41 |
+
|
| 42 |
+
### Example 1: Hallucination Fixed (Valentine's Day)
|
| 43 |
+
| | Transcription |
|
| 44 |
+
|---|---|
|
| 45 |
+
| **Ground Truth** | (ε) ζ们 δΊζ ε€ ζ valentine's day |
|
| 46 |
+
| **Baseline** | ah moment ah month ah month ah month ah month... *(repeated 250+ times)* |
|
| 47 |
+
| **This Model** | (ε) ζ们δΊζε€ζ valentine's day |
|
| 48 |
+
| **MER** | 56.89 → **0.00** |
|
| 49 |
+
|
| 50 |
+
### Example 2: Repetition Fixed (Code-Switch Preserved)
|
| 51 |
+
| | Transcription |
|
| 52 |
+
|---|---|
|
| 53 |
+
| **Ground Truth** | it's to give yourself δΈδΈͺ ε°ιΆ right |
|
| 54 |
+
| **Baseline** | You have to give yourself a a a a a a a a... *(repeated 500+ times)* |
|
| 55 |
+
| **This Model** | is to give yourself δΈδΈͺε°ιΆ right |
|
| 56 |
+
| **MER** | 56.56 → **0.11** |
|
| 57 |
+
|
| 58 |
+
### Example 3: Code-Switching Preserved
|
| 59 |
+
| | Transcription |
|
| 60 |
+
|---|---|
|
| 61 |
+
| **Ground Truth** | inside circle yah like θΏεΊ θΏεΊ δΌ ηηη leh |
|
| 62 |
+
| **Baseline** | And you say so could yeah like you can you can you can... *(repeated 500+ times)* |
|
| 63 |
+
| **This Model** | inside the circle ya like θΏεΊθΏεΊδΌηηη (leh) |
|
| 64 |
+
| **MER** | 39.31 → **0.15** |
|
| 65 |
+
|
| 66 |
+
### Example 4: Perfect Recovery from Repetition
|
| 67 |
+
| | Transcription |
|
| 68 |
+
|---|---|
|
| 69 |
+
| **Ground Truth** | ζ ζ ζ ζ ζ ζ control ζ ζ ζ δ»δ»¬ θ¦ control |
|
| 70 |
+
| **Baseline** | ζζζζζζζζζζζζζζζζζζζζ... *(repeated 500+ times, no "control")* |
|
| 71 |
+
| **This Model** | ζζζζζζ control ζζζδ»δ»¬θ¦ control |
|
| 72 |
+
| **MER** | 35.93 → **0.00** |
|
| 73 |
+
|
| 74 |
+
### Example 5: Technical Terms Preserved
|
| 75 |
+
| | Transcription |
|
| 76 |
+
|---|---|
|
| 77 |
+
| **Ground Truth** | ε€§ι¨ε [εͺ] ε€§ι¨ε ζ― triple e. θ· computer en~ com~ computer [lah] |
|
| 78 |
+
| **Baseline** | ε€§ι¨εεε€§ι¨εζ―θ·θ·θ·θ·θ·θ·θ·... *(repeated 500+ times, lost "triple e" and "computer")* |
|
| 79 |
+
| **This Model** | ε€§ι¨ε (ε) ε€§ι¨εζ― triple e θ· computer (ε) computer (ε¦) |
|
| 80 |
+
| **MER** | 31.56 → **0.25** |
|
| 81 |
+
|
| 82 |
## Training Configuration
|
| 83 |
|
| 84 |
### Model Architecture
|
| 85 |
| Parameter | Value |
|
| 86 |
|-----------|-------|
|
| 87 |
+
| Base Model | [MERaLiON/MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) |
|
| 88 |
| Training Type | Full Fine-Tuning |
|
| 89 |
| Total Parameters | ~3.47B |
|
| 90 |
| Trainable Parameters | ~3.47B |
|
|
|
|
| 177 |
|
| 178 |
## License
|
| 179 |
|
| 180 |
+
This model inherits the license of the base [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) model.
|
eval_results/baseline_cs_dialogue.json
CHANGED
|
The diff for this file is too large to render.
See raw diff
|
|
|
eval_results/baseline_emilia.json
CHANGED
|
The diff for this file is too large to render.
See raw diff
|
|
|
eval_results/trained_cs_dialogue.json
CHANGED
|
The diff for this file is too large to render.
See raw diff
|
|
|
eval_results/trained_emilia.json
CHANGED
|
The diff for this file is too large to render.
See raw diff
|
|
|
eval_results/trained_seame.json
CHANGED
|
The diff for this file is too large to render.
See raw diff
|
|
|