---
license: mit
pipeline_tag: audio-text-to-text
library_name: transformers
tags:
- music
- audio
---

Tech Report

# ACE-Step Transcriber

## Description

ACE-Step Transcriber is the annotation model used by **ACE-Step v1.5** for training data labeling. It is a powerful multilingual audio transcription model capable of transcribing both **speech** and **singing voice** with high accuracy.

### Key Features

- 🌍 **50+ Languages Support** - Covers major world languages and regional dialects
- 🎤 **Speech Transcription** - Accurately transcribes spoken content
- 🎵 **Singing Voice Transcription** - Specialized in lyrics transcription with musical structure annotations
- 🏷️ **Structure Annotation** - Automatically identifies song sections (verse, chorus, bridge, etc.)

## Usage

The usage is the same as [Qwen2.5 Omni-7B](https://huggingface.co/Qwen/Qwen2.5-Omni-7B).

### Prompt Format

Use the following prompt to transcribe audio:

```
*Task* Transcribe this audio in detail