---
license: other
license_name: nvidia-audio2emotion-license
license_link: https://huggingface.co/nvidia/Audio2Emotion-v2.2/blob/main/LICENSE
extra_gated_prompt: >-
  Use of this model is permitted solely under the terms of the Audio2Emotion
  License. In particular, it may only be used within the NVIDIA Audio2Face
  project and is strictly prohibited for any other purpose.
---
# Audio2Emotion

## Description
Audio2Emotion leverages deep learning to extract continuous emotional cues from speech audio, which are then used to drive Audio2Face-3D for more natural and expressive facial animations. By detecting emotions such as joy, sadness, anger, fear, disgust, and neutrality in real time, the system automatically conditions facial expressions without manual intervention. This seamless integration enables digital avatars to convey subtle emotional dynamics during speech, making interactions in gaming, virtual assistants, and immersive environments more lifelike and engaging.
**Model Developer:** NVIDIA
## Model Versions

The Audio2Emotion release includes two checkpoints:
- Audio2Emotion-v2.2 — stable version
- Audio2Emotion-v3.0 — research preview version (uses a double sliding window and provides better probability calibration)
Both networks expose an identical interface, so switching between versions requires no code changes.
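As an illustration of the sliding-window inference pattern described above, here is a minimal sketch of how a windowed emotion classifier is typically driven. The window length, hop size, emotion-label order, and the `run_model` stub are assumptions for illustration, not the actual Audio2Emotion API.

```python
import numpy as np

EMOTIONS = ["joy", "sadness", "anger", "fear", "disgust", "neutral"]

def sliding_windows(audio: np.ndarray, win: int, hop: int) -> np.ndarray:
    """Split a mono audio buffer into overlapping analysis windows."""
    n = 1 + max(0, (len(audio) - win) // hop)
    return np.stack([audio[i * hop : i * hop + win] for i in range(n)])

def softmax(x: np.ndarray) -> np.ndarray:
    """Convert raw logits into a probability distribution over emotions."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def run_model(window: np.ndarray) -> np.ndarray:
    # Placeholder for the real network: returns one logit per emotion class.
    rng = np.random.default_rng(0)
    return rng.normal(size=len(EMOTIONS))

# 2 seconds of silence at 16 kHz, analyzed in 1 s windows with 0.5 s hop
# (all three values are illustrative, not the model's actual parameters).
audio = np.zeros(16000 * 2, dtype=np.float32)
for window in sliding_windows(audio, win=16000, hop=8000):
    probs = softmax(run_model(window))  # per-window emotion probabilities
```

Because both versions share this interface, only the checkpoint path changes when moving from v2.2 to v3.0; the windowing loop stays the same.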
## Correspondence
Ilya Fedorov (ilyaf@nvidia.com), Dmitry Korobchenko (dkorobchenko@nvidia.com)
## License
Your use of this model is governed by the NVIDIA Audio2Emotion License.
## Citation

```bibtex
@article{chung2025audio2face,
  title={Audio2Face-3D: Audio-driven Realistic Facial Animation For Digital Avatars},
  author={Chung, Chaeyeon and Fedorov, Ilya and Huang, Michael and Karmanov, Aleksey and Korobchenko, Dmitry and Ribera, Roger and Seol, Yeongho},
  journal={arXiv preprint arXiv:00000000},
  year={2025}
}
```