unilight committed on
Commit
8676e76
·
verified ·
1 Parent(s): c18bc7a

Update README.md

  1. README.md +27 -7
README.md CHANGED
@@ -6,13 +6,15 @@ license: mit
6
 
7
  This model card describes the models implemented in the [SHEET](https://github.com/unilight/sheet) toolkit trained using the training sets in MOS-Bench and benchmarked using the test sets in MOS-Bench.
8
 
 
 
9
  ## Model Details
10
 
11
  - **Developed by:** Wen-Chin Huang
12
  - **Model type:** SSL-MOS or AlignNet
13
  - **License:** MIT
14
  - **Repository:** [SHEET](https://github.com/unilight/sheet)
15
- - **Paper:** to be uploaded
16
  - **Demo :** https://huggingface.co/spaces/unilight/sheet-demo
17
 
18
  ## Uses
@@ -39,21 +41,39 @@ Please refer to the [`egs` folder in the sheet repo](https://github.com/unilight
39
 
40
  #### Metrics
41
 
42
- Please refer to the MOS-Bench paper (to be uploaded) for details.
 
 
43
 
44
  ### Results
45
 
46
- Please refer to the MOS-Bench paper (to be uploaded) for details.
47
 
48
 
49
  ## Citation
50
 
51
  **BibTeX:**
52
 
53
- To be updated.
54
-
55
- <!-- ## Glossary [optional] -->
56
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
57
 
58
  ## Model Card Contact
59
 
 
6
 
7
  This model card describes the models implemented in the [SHEET](https://github.com/unilight/sheet) toolkit trained using the training sets in MOS-Bench and benchmarked using the test sets in MOS-Bench.
8
 
9
+ The task is subjective speech quality assessment (SSQA), which aims to predict the perceptual quality score of speech.
10
+
11
  ## Model Details
12
 
13
  - **Developed by:** Wen-Chin Huang
14
  - **Model type:** SSL-MOS or AlignNet
15
  - **License:** MIT
16
  - **Repository:** [SHEET](https://github.com/unilight/sheet)
17
+ - **Paper:** [[SHEET](https://arxiv.org/abs/2505.15061)] [[MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715)]
18
  - **Demo :** https://huggingface.co/spaces/unilight/sheet-demo
19
 
20
  ## Uses
 
41
 
42
  #### Metrics
43
 
44
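+ Commonly used metrics for SSQA are mean squared error (MSE), linear correlation coefficient (LCC), Spearman's rank correlation coefficient (SRCC), and Kendall's tau (KTAU). A code snippet for calculating them can be found here: https://gist.github.com/unilight/883726c94640cca1f4d4068e29c3d20f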
45
+
46
+ Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details.
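For readers who want a feel for what these four metrics measure, the following is a minimal, dependency-free sketch. It is not the code from the linked gist, and the function names (`mse`, `lcc`, `average_ranks`, `srcc`, `ktau`) and the toy scores are assumptions made purely for illustration. It assumes `true` and `pred` are equal-length lists of ground-truth and predicted scores, and it uses the tau-b variant of Kendall's tau.

```python
import math
from itertools import combinations


def mse(true, pred):
    """Mean squared error between ground-truth and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)


def lcc(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srcc(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return lcc(average_ranks(x), average_ranks(y))


def ktau(x, y):
    """Kendall's tau-b, counted over all pairs (O(n^2), fine for a sketch)."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both lists: contributes to neither term
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / math.sqrt((pairs + ties_x) * (pairs + ties_y))


# Hypothetical toy scores, for illustration only.
true = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.5, 1.0, 3.5, 4.0, 4.5]
print(f"MSE={mse(true, pred):.3f} LCC={lcc(true, pred):.3f} "
      f"SRCC={srcc(true, pred):.3f} KTAU={ktau(true, pred):.3f}")
```

In practice, `scipy.stats.pearsonr`, `scipy.stats.spearmanr`, and `scipy.stats.kendalltau` compute the three correlation metrics directly; the hand-written versions here only make the definitions explicit.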
47
 
48
  ### Results
49
 
50
+ Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details.
51
 
52
 
53
  ## Citation
54
 
55
  **BibTeX:**
56
 
57
+ ```
58
+ @inproceedings{sheet,
59
+ title = {{SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit}},
60
+ author = {Wen-Chin Huang and Erica Cooper and Tomoki Toda},
61
+ year = {2025},
62
+ booktitle = {{Proc. Interspeech}},
63
+ pages = {2355--2359},
64
+ }
65
+
66
+
67
+ @article{huang2024,
68
+ title={MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models},
69
+ author={Wen-Chin Huang and Erica Cooper and Tomoki Toda},
70
+ year={2024},
71
+ eprint={2411.03715},
72
+ archivePrefix={arXiv},
73
+ primaryClass={cs.SD},
74
+ url={https://arxiv.org/abs/2411.03715},
75
+ }
76
+ ```
77
 
78
  ## Model Card Contact
79