---
license: mit
---

# Model Card for SHEET Models

This model card describes the models implemented in the [SHEET](https://github.com/unilight/sheet) toolkit. They are trained on the MOS-Bench training sets and benchmarked on the MOS-Bench test sets. The task is subjective speech quality assessment (SSQA), which aims to predict the perceptual quality score of speech.

## Model Details

- **Developed by:** Wen-Chin Huang
- **Model type:** SSL-MOS or AlignNet
- **License:** MIT
- **Repository:** [SHEET](https://github.com/unilight/sheet)
- **Paper:** [[SHEET](https://arxiv.org/abs/2505.15061)] [[MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715)]
- **Demo:** https://huggingface.co/spaces/unilight/sheet-demo

## Uses

Please refer to the [README in the sheet repo](https://github.com/unilight/sheet/tree/main/egs/bvcc) for more details.

## Bias, Risks, and Limitations

The models are not yet ready to replace subjective tests in scientific papers. They can, however, be used to compare systems in a heterogeneous way.

## How to Get Started with the Model

Please refer to the [README in the sheet repo](https://github.com/unilight/sheet/tree/main/egs/bvcc) for more details.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Please refer to the [`egs` folder in the sheet repo](https://github.com/unilight/sheet/tree/main/egs/bvcc) for more details.

#### Metrics

Commonly used metrics for SSQA are the mean squared error (MSE), the linear correlation coefficient (LCC), the Spearman rank correlation coefficient (SRCC), and Kendall's tau (KTAU). A code snippet for calculating them can be found here: https://gist.github.com/unilight/883726c94640cca1f4d4068e29c3d20f

Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details.

### Results

Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details.

## Citation

**BibTeX:**

```
@inproceedings{sheet,
  title     = {{SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit}},
  author    = {Wen-Chin Huang and Erica Cooper and Tomoki Toda},
  year      = {2025},
  booktitle = {{Proc. Interspeech}},
  pages     = {2355--2359},
}

@article{huang2024,
  title         = {MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models},
  author        = {Wen-Chin Huang and Erica Cooper and Tomoki Toda},
  year          = {2024},
  eprint        = {2411.03715},
  archivePrefix = {arXiv},
  primaryClass  = {cs.SD},
  url           = {https://arxiv.org/abs/2411.03715},
}
```

## Model Card Contact

- **Name:** Wen-Chin Huang
- **Affiliation:** Nagoya University
- **Email:** wen.chinhuang@g.sp.m.is.nagoya-u.ac.jp
- **GitHub:** unilight