Update README.md
README.md
CHANGED
@@ -6,13 +6,15 @@ license: mit
 
 This model card describes the models implemented in the [SHEET](https://github.com/unilight/sheet) toolkit trained using the training sets in MOS-Bench and benchmarked using the test sets in MOS-Bench.
 
+The task is subjective speech quality assessment (SSQA), which aims to predict the perceptual quality score of speech.
+
 ## Model Details
 
 - **Developed by:** Wen-Chin Huang
 - **Model type:** SSL-MOS or AlignNet
 - **License:** MIT
 - **Repository:** [SHEET](https://github.com/unilight/sheet)
-- **Paper:**
+- **Paper:** [[SHEET](https://arxiv.org/abs/2505.15061)] [[MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715)]
 - **Demo :** https://huggingface.co/spaces/unilight/sheet-demo
 
 ## Uses
@@ -39,21 +41,39 @@ Please refer to the [`egs` folder in the sheet repo](https://github.com/unilight
 
 #### Metrics
 
-
+Commonly used metrics for SQA are MSE, LCC, SRCC and KTAU. A code snippet for calculating them can be found here: https://gist.github.com/unilight/883726c94640cca1f4d4068e29c3d20f
+
+Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details.
 
 ### Results
 
-Please refer to the MOS-Bench
+Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details.
 
 
 ## Citation
 
 **BibTeX:**
 
-
-
-
-
+```
+@inproceedings{sheet,
+title = {{SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit}},
+author = {Wen-Chin Huang and Erica Cooper and Tomoki Toda},
+year = {2025},
+booktitle = {{Proc. Interspeech}},
+pages = {2355--2359},
+}
+
+
+@article{huang2024,
+title={MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models},
+author={Wen-Chin Huang and Erica Cooper and Tomoki Toda},
+year={2024},
+eprint={2411.03715},
+archivePrefix={arXiv},
+primaryClass={cs.SD},
+url={https://arxiv.org/abs/2411.03715},
+}
+```
 
 ## Model Card Contact
 
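The Metrics text added in this change names MSE, LCC, SRCC and KTAU. As a rough illustration only (this is not the linked gist and not SHEET's own evaluation code), the dependency-free Python sketch below computes the four metrics for paired lists of ground-truth and predicted scores. The function names and the example scores are made up for illustration, and KTAU is assumed here to mean the tau-b variant of Kendall's tau. In practice `scipy.stats.pearsonr`, `scipy.stats.spearmanr` and `scipy.stats.kendalltau` would usually be used instead.

```python
import math
from itertools import combinations


def mse(true, pred):
    """Mean squared error between ground-truth and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)


def lcc(x, y):
    """Linear (Pearson) correlation coefficient.

    Raises ZeroDivisionError if either input is constant.
    """
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def srcc(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return lcc(average_ranks(x), average_ranks(y))


def ktau(x, y):
    """Kendall tau-b over all pairs (O(n^2), adequate for small score lists)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted in neither term
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt((untied + ties_x) * (untied + ties_y))


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    true_mos = [1.0, 2.0, 3.0, 4.0, 5.0]
    pred_mos = [1.5, 1.8, 3.2, 3.9, 4.6]
    for name, fn in [("MSE", mse), ("LCC", lcc), ("SRCC", srcc), ("KTAU", ktau)]:
        print(name, fn(true_mos, pred_mos))
```

MSE measures absolute error, so lower is better, while the three correlation coefficients measure how well the predictions track the trend (LCC) or the ranking (SRCC, KTAU) of the reference scores, so higher is better.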