arxiv:2509.24457

Assessing speech quality metrics for evaluation of neural audio codecs under clean speech conditions

Published on Sep 29, 2025

Abstract

Analysis of 45 objective speech-quality metrics reveals that neural-based metrics such as scoreq and utmos show the strongest correlation with subjective listening scores across 17 codec conditions.

Objective speech-quality metrics are widely used to assess codec performance. However, for neural codecs, it is often unclear which metrics provide reliable quality estimates. To address this, we evaluated 45 objective metrics by correlating their scores with subjective listening scores for clean speech across 17 codec conditions. Neural-based metrics such as scoreq and utmos achieved the highest Pearson correlations with subjective scores. Further analysis across different subjective quality ranges revealed that non-intrusive metrics tend to saturate at high subjective quality levels.
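As a rough illustration of the evaluation described in the abstract, the sketch below computes the Pearson correlation between each objective metric's per-condition scores and the subjective listening scores, then ranks the metrics by correlation strength. It is not the authors' code. The function names (`pearson`, `rank_metrics`), the metric names (`metric_a`, `metric_b`), and all numeric values are hypothetical placeholders, and it assumes one aggregated score per codec condition.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rank_metrics(subjective, metric_scores):
    """Return (metric name, r) pairs sorted by |r|, strongest first."""
    results = {
        name: pearson(scores, subjective)
        for name, scores in metric_scores.items()
    }
    return sorted(results.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical placeholder data: one value per codec condition.
# These numbers are invented for illustration and are NOT from the paper.
subjective = [1.5, 2.0, 3.0, 4.0, 4.5]
metric_scores = {
    "metric_a": [1.0, 2.0, 3.0, 4.0, 5.0],  # tracks quality linearly
    "metric_b": [3.0, 4.0, 4.4, 4.5, 4.5],  # flattens out at the top end
}

ranking = rank_metrics(subjective, metric_scores)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

In this toy data, `metric_b` stops changing for the highest-quality conditions, which lowers its correlation relative to `metric_a`. That is one simple way the saturation effect mentioned for non-intrusive metrics could show up in such an analysis.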
