Improve model card for VoiceCraft-X

by nielsr HF Staff - opened Nov 18, 2025


This PR significantly enhances the model card for VoiceCraft-X by:

- Adding `pipeline_tag: text-to-speech` to the metadata so the model is discoverable on the Hugging Face Hub, since it performs text-to-speech synthesis.
- Adding `library_name: transformers` to the metadata. The `config.json` file lists the architecture as `Qwen3ForCausalLM`, a model type supported by the Transformers library, so this tag enables an automated usage widget.
- Linking directly to the official paper (VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing), the project page (https://zhishengzheng.com/voicecraft-x/), and the GitHub repository (https://github.com/zszheng147/VoiceCraft-X).
- Populating the content section with details from the GitHub README: installation instructions, how to access the pretrained models, inference guidance, the full license statement, citation information, and an important usage disclaimer.
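
For reference, the two metadata fields named above would sit in the YAML front matter at the top of the model card's README.md. The snippet below is only a minimal sketch of that header, not the exact diff in this PR: `render_front_matter` is a hypothetical helper written for illustration, it handles flat string values only, and any other metadata fields the card may carry are left out.

```python
def render_front_matter(metadata: dict[str, str]) -> str:
    """Render a flat string-to-string mapping as a YAML front-matter block.

    Hypothetical helper for illustration; it does no quoting or escaping,
    so it is only suitable for simple values like the two tags below.
    """
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# The two fields this PR describes adding to the model card metadata.
card_metadata = {
    "pipeline_tag": "text-to-speech",
    "library_name": "transformers",
}

front_matter = render_front_matter(card_metadata)
print(front_matter)
```

The printed block is what a reader would expect to see between the `---` delimiters at the top of the card; the Hub reads these keys to decide which task filter and usage snippet apply.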
