|
|
--- |
|
|
license: mit |
|
|
--- |
|
|
|
|
|
# Model Card for SHEET Models |
|
|
|
|
|
This model card describes the models implemented in the [SHEET](https://github.com/unilight/sheet) toolkit, trained on the MOS-Bench training sets and benchmarked on the MOS-Bench test sets.
|
|
|
|
|
The task is subjective speech quality assessment (SSQA), which aims to predict the perceived quality of a speech sample as a score, such as a mean opinion score (MOS).
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Developed by:** Wen-Chin Huang |
|
|
- **Model type:** SSL-MOS or AlignNet |
|
|
- **License:** MIT |
|
|
- **Repository:** [SHEET](https://github.com/unilight/sheet) |
|
|
- **Paper:** [[SHEET](https://arxiv.org/abs/2505.15061)] [[MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715)] |
|
|
- **Demo:** https://huggingface.co/spaces/unilight/sheet-demo
|
|
|
|
|
## Uses |
|
|
|
|
|
Please refer to the [README in the sheet repo](https://github.com/unilight/sheet/tree/main/egs/bvcc) for more details. |
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
The models are not yet reliable enough to replace subjective listening tests in scientific papers. They can, however, be used to compare systems across heterogeneous datasets.
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
Please refer to the [README in the sheet repo](https://github.com/unilight/sheet/tree/main/egs/bvcc) for more details. |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
|
|
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
|
|
#### Testing Data |
|
|
|
|
|
Please refer to the [`egs` folder in the sheet repo](https://github.com/unilight/sheet/tree/main/egs) for more details.
|
|
|
|
|
#### Metrics |
|
|
|
|
|
Commonly used metrics for SSQA are mean squared error (MSE), linear correlation coefficient (LCC), Spearman rank correlation coefficient (SRCC), and Kendall's tau (KTAU). A code snippet for calculating them can be found here: https://gist.github.com/unilight/883726c94640cca1f4d4068e29c3d20f
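As an illustrative sketch (not the SHEET implementation; the linked gist is the reference), the four metrics can be computed with plain NumPy, assuming two arrays of listener and predicted scores with no tied values:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between true and predicted scores (lower is better)."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def lcc(y_true, y_pred):
    """Linear (Pearson) correlation coefficient."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def srcc(y_true, y_pred):
    """Spearman rank correlation: Pearson correlation of the ranks (assumes no ties)."""
    rank = lambda a: np.argsort(np.argsort(a))
    return float(np.corrcoef(rank(y_true), rank(y_pred))[0, 1])

def ktau(y_true, y_pred):
    """Kendall's tau: (concordant - discordant) pairs over all pairs (assumes no ties)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(y_true[i] - y_true[j]) * np.sign(y_pred[i] - y_pred[j])
    return float(s / (n * (n - 1) / 2))

# Hypothetical listener MOS and system predictions
true_mos = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_mos = [1.1, 2.0, 2.9, 4.2, 5.1]
print(mse(true_mos, pred_mos), lcc(true_mos, pred_mos),
      srcc(true_mos, pred_mos), ktau(true_mos, pred_mos))
```

Note that LCC measures how linearly the predictions track the true scores, while SRCC and KTAU measure only whether the ranking of systems is preserved, which is why the latter two are often emphasized when comparing systems.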
|
|
|
|
|
Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details. |
|
|
|
|
|
### Results |
|
|
|
|
|
Please refer to the [MOS-Bench (arXiv; 2024)](https://arxiv.org/abs/2411.03715) paper for details. |
|
|
|
|
|
|
|
|
## Citation |
|
|
|
|
|
**BibTeX:** |
|
|
|
|
|
```bibtex
|
|
@inproceedings{sheet, |
|
|
title = {{SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit}}, |
|
|
author = {Wen-Chin Huang and Erica Cooper and Tomoki Toda}, |
|
|
year = {2025}, |
|
|
booktitle = {{Proc. Interspeech}}, |
|
|
pages = {2355--2359}, |
|
|
} |
|
|
|
|
|
|
|
|
@article{huang2024, |
|
|
title={MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models}, |
|
|
author={Wen-Chin Huang and Erica Cooper and Tomoki Toda}, |
|
|
year={2024}, |
|
|
eprint={2411.03715}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.SD}, |
|
|
url={https://arxiv.org/abs/2411.03715}, |
|
|
} |
|
|
``` |
|
|
|
|
|
## Model Card Contact |
|
|
|
|
|
Wen-Chin Huang |
|
|
Nagoya University |
|
|
Email: wen.chinhuang@g.sp.m.is.nagoya-u.ac.jp |
|
|
GitHub: unilight |