Adding `safetensors` variant of this model

by SFconvertbot - opened 9 days ago

base: refs/heads/main ← from: refs/pr/1


+169146 -77575

Files changed (7)

README.md +5 -49
eval_results/baseline_cs_dialogue.json +0 -0
eval_results/baseline_emilia.json +0 -0
eval_results/trained_cs_dialogue.json +0 -0
eval_results/trained_emilia.json +0 -0
eval_results/trained_seame.json +0 -0
model.safetensors +3 -0

README.md CHANGED

@@ -26,65 +26,21 @@ A fine-tuned version of [MERaLiON/MERaLiON-2-3B](https://huggingface.co/MERaLiON
 | Benchmark | Baseline | This Model | Improvement |
 |-----------|----------|------------|-------------|
-| **SEAME** | 0.3372 | **0.2530** | **-25.0%** |
-| **EMILIA** | 0.3201 | **0.3041** | **-5.0%** |
-| **CS-Dialogue** | 0.2541 | **0.2258** | **-11.1%** |
+| **SEAME** (Code-Switching) | 0.3372 | **0.2753** | **+18.4%** |
+| EMILIA | 0.5046 |  |  |
+| CS-Dialogue | 0.7082 |  |  |
 ### Benchmark Descriptions
 - **SEAME**: English-Mandarin code-switching conversational speech from Singapore/Malaysia (9,764 samples)
 - **EMILIA**: Synthetic code-switching evaluation set (1,000 samples)
 - **CS-Dialogue**: Code-switching dialogue evaluation set (359 samples)
-## Examples
-Below are examples showing improvements from baseline to DPO-trained model:
-### Example 1: Hallucination Fixed
-| | Transcription |
-|---|---|
-| **Ground Truth** | 你们 是 一首 歌 也是 教 一个 session [啊] [哦] [嗯] |
-| **Baseline** | 你们是一首歌也是教一个 session (oh) 我们也是 session 那个 sessional practice 的... *(hallucinated extra content)* |
-| **This Model** | 你们是一首歌也是教一个 session (啊) (哦) |
-| **MER** | 2.20 → **0.07** |
-### Example 2: Code-Switching Preserved (Maid)
-| | Transcription |
-|---|---|
-| **Ground Truth** | [啊] 然后 因为 我们 家里 有 一个 maid 的 [吗] 我 妈妈 有请 一个 maid [mah] 那个 是 打扫 屋子 的 东西 这样 之类 [吗] that is why 可以 [咯] 因为 |
-| **Baseline** | (ah) 然后因为我们家里有一个 maid 的 (mah) 妈妈就请一个 maid 的 (mah) (mah) (mah)... *(repeated filler words)* |
-| **This Model** | (啊) 然后因为我们家里有一个 maid 的 (mah) 我妈妈就请一个 maid (mah) 那个是打扫屋子的东西这样子 (leh) (mah) that's why 可以 (loh) 因为 |
-| **MER** | 1.02 → **0.17** |
-### Example 3: English Location Preserved (Temasek Poly)
-| | Transcription |
-|---|---|
-| **Ground Truth** | 我 住 temasek poly 那边 |
-| **Baseline** | 我住达马士科波利那边 *(transliterated to Chinese)* |
-| **This Model** | 我住 tamasek poly 那边 |
-| **MER** | 1.00 → **0.17** |
-### Example 4: Code-Switching Preserved (Exam)
-| | Transcription |
-|---|---|
-| **Ground Truth** | 考 得 很 考 得 like shit |
-| **Baseline** | 课程很课程很 like shit *(wrong Chinese characters)* |
-| **This Model** | 考得很 考得 like shit |
-| **MER** | 0.71 → **0.00** |
-### Example 5: Mixed Language Preserved (Youth)
-| | Transcription |
-|---|---|
-| **Ground Truth** | not really youth [lah] 还是 youth 了 三十岁 |
-| **Baseline** | not really you (lah) 还是 you (lah) 三十岁 (oh) *(lost "youth")* |
-| **This Model** | not really youth (lah) 还是 youth 了三十岁 |
-| **MER** | 0.36 → **0.00** |
 ## Training Configuration
 ### Model Architecture
 | Parameter | Value |
 |-----------|-------|
-| Base Model | [MERaLiON/MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) |
+| Base Model | MERaLiON/MERaLiON-2-3B |
 | Training Type | Full Fine-Tuning |
 | Total Parameters | ~3.47B |
 | Trainable Parameters | ~3.47B |
@@ -177,4 +133,4 @@ print(transcription)
 ## License
-This model inherits the license of the base [MERaLiON-2-3B](https://huggingface.co/MERaLiON/MERaLiON-2-3B) model.
+This model inherits the license of the base MERaLiON-2-3B model.
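
Reviewer note: the README diff changes the SEAME figure and flips the sign convention of the Improvement column (-25.0% before, +18.4% after), so it may help to recompute both. The sketch below is not the model author's evaluation script. It assumes MER is an edit distance over mixed tokens (one token per Chinese character, one per English word, brackets around particles ignored) divided by the reference token count, and that Improvement is the relative reduction against the baseline. The function names `mixed_tokens`, `edit_distance`, `mer`, and `relative_reduction` are hypothetical. Under those assumptions it reproduces the MER values of the removed Examples 3 to 5 and the percentages in both versions of the table.

```python
import re

# Assumed tokenization: one token per CJK character, one per Latin-script word.
# Brackets around particles such as [lah] or (lah) are ignored; digits and
# punctuation are not handled in this sketch.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[a-z']+")


def mixed_tokens(text):
    """Split a code-switched transcript into mixed character/word tokens."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists, using one rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def mer(reference, hypothesis):
    """Mixed error rate: token edit distance over reference token count."""
    ref = mixed_tokens(reference)
    if not ref:
        raise ValueError("reference has no tokens")
    return edit_distance(ref, mixed_tokens(hypothesis)) / len(ref)


def relative_reduction(baseline, trained):
    """Relative error reduction; positive means the trained model is better."""
    return (baseline - trained) / baseline


# Example 4 from the removed README section (annotations in italics omitted).
print(round(mer("考 得 很 考 得 like shit", "课程很课程很 like shit"), 2))  # 0.71

# SEAME row before and after this PR.
print(round(100 * relative_reduction(0.3372, 0.2530), 1))  # 25.0
print(round(100 * relative_reduction(0.3372, 0.2753), 1))  # 18.4
```

The truncated baselines in Examples 1 and 2 (ending in `...`) cannot be checked this way because the full hypothesis text is not in the README.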

eval_results/baseline_cs_dialogue.json CHANGED

The diff for this file is too large to render.

eval_results/baseline_emilia.json CHANGED

The diff for this file is too large to render.

eval_results/trained_cs_dialogue.json CHANGED

The diff for this file is too large to render.

eval_results/trained_emilia.json CHANGED

The diff for this file is too large to render.

eval_results/trained_seame.json CHANGED

The diff for this file is too large to render.

model.safetensors ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fda3d67efd6e3fc991b1b9e9057292a33889e42295a89da38b3ed0c9045156fa
+size 8121505608
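
Reviewer note: the three added lines are a Git LFS pointer, not the weights themselves. The pointer records the SHA-256 digest and byte size of the real `model.safetensors`. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash: " + pointer["algo"])
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents as added by this PR.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:fda3d67efd6e3fc991b1b9e9057292a33889e42295a89da38b3ed0c9045156fa
size 8121505608
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["size"])  # 8121505608
```

After downloading the weights, `matches_pointer("model.safetensors", pointer)` would confirm that the local copy is the file this PR refers to.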