Commit 3b09382 · 1 Parent(s): f4a2b13
Update examples to show code-switching improvements
- Replaced examples with cleaner code-switching demonstrations
- All examples now show both Chinese and English content
- Removed preamble-related example
- Examples show: translation prevention, technical term preservation, complex code-switching
README.md CHANGED
@@ -40,45 +40,45 @@ A LoRA adapter for [Qwen/Qwen2-Audio-7B-Instruct](https://huggingface.co/Qwen/Qw

 Below are examples showing improvements from baseline to DPO-trained model:

-### Example 1:
+### Example 1: Code-Switching Preserved (Lifestyle)
 | | Transcription |
 |---|---|
-| **Ground Truth** |
-| **Baseline** |
-| **This Model** |
-| **MER** |
+| **Ground Truth** | 我们 都 应该 pursue a healthy lifestyle |
+| **Baseline** | 我们都应该追求健康的生活方式 *(fully translated to Chinese)* |
+| **This Model** | 我们都应该 pursue a healthy lifestyle |
+| **MER** | 1.00 → **0.00** |

-### Example 2:
+### Example 2: Mixed Language Preserved (Christmas)
 | | Transcription |
 |---|---|
-| **Ground Truth** |
-| **Baseline** |
-| **This Model** |
-| **MER** |
+| **Ground Truth** | every christmas 我 就 应该 是 没有 人 跟我 庆祝 了 [啦] |
+| **Baseline** | every christmas i would - should be no one to tell me *(Chinese translated to English)* |
+| **This Model** | every christmas 我就应该是没有人跟我庆祝了啦 |
+| **MER** | 0.88 → **0.00** |

-### Example 3:
+### Example 3: Technical Terms Preserved
 | | Transcription |
 |---|---|
-| **Ground Truth** |
-| **Baseline** |
-| **This Model** |
-| **MER** |
+| **Ground Truth** | (呃) 每个 lecture different lecturer 那个 notes 不 不怎么 好的 [啦] |
+| **Baseline** | 呃那个老师不同风格的老师 *(lost technical terms)* |
+| **This Model** | 呃 每个 lecture different lecturer 那个 notes 不不怎么好的啦 |
+| **MER** | 0.75 → **0.00** |

-### Example 4:
+### Example 4: Complex Code-Switching Preserved
 | | Transcription |
 |---|---|
-| **Ground Truth** |
-| **Baseline** |
-| **This Model** |
-| **MER** |
+| **Ground Truth** | [哦] 还有 什么 好吃 的 吗 还是 你 只是 去 那些 very expensive places like dempsey to eat |
+| **Baseline** | Oh, what else? Oh, yeah, there's always that expensive place like... to eat *(lost Chinese content)* |
+| **This Model** | 哦 还有什么好吃的吗 还是你只是去那些 very expensive places like dancy to eat |
+| **MER** | 0.83 → **0.04** |

-### Example 5:
+### Example 5: Professional Terms Preserved
 | | Transcription |
 |---|---|
-| **Ground Truth** |
-| **Baseline** |
-| **This Model** |
-| **MER** | 1.
+| **Ground Truth** | [哦] 因为 是个 professional degree |
+| **Baseline** | 哦因为他有个专业的学位 *(translated to Chinese)* |
+| **This Model** | 哦 因为 是个 professional degree |
+| **MER** | 1.00 → **0.00** |

 ## Training Configuration

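
The diff reports MER values but does not say how they are computed. A common convention for Mandarin-English code-switched speech is the mixed error rate: edit distance over Chinese characters and English words, divided by the number of reference tokens. The sketch below assumes that convention, plus an assumed normalization (lowercasing, dropping annotation brackets and punctuation). The names `tokenize`, `edit_distance`, and `mixed_error_rate` are illustrative and do not come from the repository. Under these assumptions it reproduces the 1.00 baseline for Example 1 and the 0.04 model score for Example 4.

```python
import re

# Assumed normalization (not confirmed by the README): lowercase the text,
# treat each CJK character as one token and each run of Latin letters,
# digits, or apostrophes as one token. Brackets, parentheses, and other
# punctuation are dropped because the pattern never matches them.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[a-z0-9']+")


def tokenize(text: str) -> list[str]:
    """Split mixed Chinese/English text into characters and words."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance between two token lists (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mixed_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance over mixed tokens, divided by reference length."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference contains no tokens")
    return edit_distance(ref, hyp) / len(ref)


# Example 4, "This Model" row: one wrong word out of 24 reference tokens.
ref4 = "[哦] 还有 什么 好吃 的 吗 还是 你 只是 去 那些 very expensive places like dempsey to eat"
hyp4 = "哦 还有什么好吃的吗 还是你只是去那些 very expensive places like dancy to eat"
print(round(mixed_error_rate(ref4, hyp4), 2))  # 0.04
```

Counting Chinese per character and English per word keeps the score independent of the word segmentation used in the ground-truth transcripts, which is why the spaced reference and the unspaced model output in Example 1 can still score 0.00.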