Adding `safetensors` variant of this model

#1
README.md CHANGED
@@ -26,65 +26,21 @@ A fine-tuned version of [MERaLiON/MERaLiON-2-3B](https://huggingface.co/MERaLiON
 
  | Benchmark | Baseline | This Model | Improvement |
  |-----------|----------|------------|-------------|
- | **SEAME** | 0.3372 | **0.2530** | **-25.0%** |
- | **EMILIA** | 0.3201 | **0.3041** | **-5.0%** |
- | **CS-Dialogue** | 0.2541 | **0.2258** | **-11.1%** |
+ | **SEAME** (Code-Switching) | 0.3372 | **0.2753** | **+18.4%** |
+ | EMILIA | 0.5046 | | |
+ | CS-Dialogue | 0.7082 | | |
 
  ### Benchmark Descriptions
  - **SEAME**: English-Mandarin code-switching conversational speech from Singapore/Malaysia (9,764 samples)
  - **EMILIA**: Synthetic code-switching evaluation set (1,000 samples)
  - **CS-Dialogue**: Code-switching dialogue evaluation set (359 samples)
 
- ## Examples
-
- Below are examples showing improvements from baseline to DPO-trained model:
-
- ### Example 1: Hallucination Fixed
- | | Transcription |
- |---|---|
- | **Ground Truth** | 你们 是 一首 歌 也是 教 一个 session [啊] [哦] [嗯] |
- | **Baseline** | 你们是一首歌也是教一个 session (oh) 我们也是 session 那个 sessional practice 的... *(hallucinated extra content)* |
- | **This Model** | 你们是一首歌也是教一个 session (啊) (哦) |
- | **MER** | 2.20 → **0.07** |
-
- ### Example 2: Code-Switching Preserved (Maid)
- | | Transcription |
- |---|---|
- | **Ground Truth** | [啊] 然后 因为 我们 家里 有 一个 maid 的 [吗] 我 妈妈 有请 一个 maid [mah] 那个 是 打扫 屋子 的 东西 这样 之类 [吗] that is why 可以 [咯] 因为 |
- | **Baseline** | (ah) 然后因为我们家里有一个 maid 的 (mah) 妈妈就请一个 maid 的 (mah) (mah) (mah)... *(repeated filler words)* |
- | **This Model** | (啊) 然后因为我们家里有一个 maid 的 (mah) 我妈妈就请一个 maid (mah) 那个是打扫屋子的东西这样子 (leh) (mah) that's why 可以 (loh) 因为 |
- | **MER** | 1.02 → **0.17** |
-
- ### Example 3: English Location Preserved (Temasek Poly)
- | | Transcription |
- |---|---|
- | **Ground Truth** | 我 住 temasek poly 那边 |
- | **Baseline** | 我住达马士科波利那边 *(transliterated to Chinese)* |
- | **This Model** | 我住 tamasek poly 那边 |
- | **MER** | 1.00 → **0.17** |
-
- ### Example 4: Code-Switching Preserved (Exam)
- | | Transcription |
- |---|---|
- | **Ground Truth** | 考 得 很 考 得 like shit |
- | **Baseline** | 课程很课程很 like shit *(wrong Chinese characters)* |
- | **This Model** | 考得很 考得 like shit |
- | **MER** | 0.71 → **0.00** |
-
- ### Example 5: Mixed Language Preserved (Youth)
- | | Transcription |
- |---|---|
- | **Ground Truth** | not really youth [lah] 还是 youth 了 三十岁 |
- | **Baseline** | not really you (lah) 还是 you (lah) 三十岁 (oh) *(lost "youth")* |
- | **This Model** | not really youth (lah) 还是 youth 了三十岁 |
- | **MER** | 0.36 → **0.00** |
-
  ## Training Configuration
 
  ### Model Architecture
  | Parameter | Value |
  |-----------|-------|
- | Base Model | [MERaLiON/MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) |
+ | Base Model | MERaLiON/MERaLiON-2-3B |
  | Training Type | Full Fine-Tuning |
  | Total Parameters | ~3.47B |
  | Trainable Parameters | ~3.47B |
@@ -177,4 +133,4 @@ print(transcription)
 
  ## License
 
- This model inherits the license of the base [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) model.
+ This model inherits the license of the base MERaLiON-2-3B model.
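
Reviewer note: the Improvement column in the README table appears to be the relative change of the model's score against the baseline. The sketch below reproduces that arithmetic, assuming the scores are error rates (lower is better) and that an improvement is reported as a positive relative reduction. The function name `relative_reduction` is hypothetical and does not come from the repository.

```python
def relative_reduction(baseline: float, trained: float) -> float:
    """Return the relative error reduction in percent (positive means better).

    Assumption: both values are error rates, so a lower trained value is an
    improvement over the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - trained) / baseline * 100.0


# SEAME values taken from the updated table row in this diff.
print(f"{relative_reduction(0.3372, 0.2753):.1f}%")  # prints 18.4%
```

Under the same formula the removed SEAME row (0.3372 to 0.2530) gives 25.0%, so the old and new rows differ in the sign convention of the column as well as in the score.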
eval_results/baseline_cs_dialogue.json CHANGED
The diff for this file is too large to render. See raw diff
 
eval_results/baseline_emilia.json CHANGED
The diff for this file is too large to render. See raw diff
 
eval_results/trained_cs_dialogue.json CHANGED
The diff for this file is too large to render. See raw diff
 
eval_results/trained_emilia.json CHANGED
The diff for this file is too large to render. See raw diff
 
eval_results/trained_seame.json CHANGED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fda3d67efd6e3fc991b1b9e9057292a33889e42295a89da38b3ed0c9045156fa
+ size 8121505608
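
Reviewer note: the three lines above are a Git LFS pointer, which records the SHA-256 digest and byte size of the real `model.safetensors` file. The sketch below checks a downloaded copy against those two values using only the Python standard library. The local path and the helper names `sha256_of_file` and `matches_pointer` are assumptions for illustration.

```python
import hashlib
import os

# Values copied from the LFS pointer in this diff.
EXPECTED_SHA256 = "fda3d67efd6e3fc991b1b9e9057292a33889e42295a89da38b3ed0c9045156fa"
EXPECTED_SIZE = 8121505608


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so an ~8 GB file is never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path: str, expected_sha256: str, expected_size: int) -> bool:
    """Compare size first (cheap), then the SHA-256 digest (slow on large files)."""
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256


if __name__ == "__main__":
    local_path = "model.safetensors"  # assumed download location
    if os.path.exists(local_path):
        print(matches_pointer(local_path, EXPECTED_SHA256, EXPECTED_SIZE))
```

Checking the size before hashing lets a truncated download fail quickly without reading the whole file.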