| # Voice Design Guide (Voice Design README) | |
| ## Overview | |
| This guide provides instructions for creating high-quality **voice descriptions (voice_prompt)** to generate voices that meet specific requirements. The voice description serves as the blueprint for voice design and directly determines the quality of the generated voice. | |
| --- | |
| ## Technical Constraints | |
| | Item | Description | | |
| | ----------------------- | ------------------------------------------------------------------------------------------------- | | |
| | **Length Limit** | Each voice_prompt ≤ 200 characters | | |
| | **Supported Languages** | Chinese only. English is not supported in the current version and will be added in future updates | | |
| --- | |
| ## Five Core Principles | |
| ### 1️⃣ Be Specific, Not Vague | |
| **✅ Recommended**: Use perceptible and concrete voice attributes | |
| * Pitch: low, high, bright, rich | |
| * Speaking rate: fast, slow, rapid, steady | |
| * Timbre: magnetic, husky, smooth, clear | |
| **❌ Avoid**: | |
| “Nice”, “normal”, “good” (too subjective and uninformative) | |
| --- | |
| ### 2️⃣ Multi-Dimensional, Not Single-Attribute | |
| **✅ Recommended**: Combine at least 3–4 dimensions to create a vivid voice profile | |
| * Persona (usage scenario) + gender + age + pitch + speaking rate + volume + timbre + emotion | |
| **❌ Avoid**: | |
| Only “female voice” or only “low-pitched” (too generic, lacks distinctiveness) | |
| --- | |
| ### 3️⃣ Objective, Not Subjective | |
| **✅ Recommended**: Describe physical and acoustic characteristics | |
| * “Slightly high-pitched with energetic delivery” | |
| * “Slow speaking rate with clear articulation” | |
| **❌ Avoid**: | |
| “My favorite voice”, “This voice sounds great” | |
| --- | |
| ### 4️⃣ Original, Not Imitative | |
| **⚠️ Copyright Notice**: Descriptions such as “sounds like XX celebrity” or “imitates XX actor” are prohibited | |
| **✅ Recommended**: Describe voice characteristics directly rather than referencing specific individuals | |
| --- | |
| ### 5️⃣ Concise, Not Redundant | |
| **✅ Recommended**: Ensure every word conveys meaningful information | |
| **❌ Avoid**: | |
| “Very, very good voice”, “Extremely, extremely gentle” | |
| --- | |
| ## Reference Dimensions for Voice Description | |
| Based on high-quality examples, we recommend composing voice prompts using the following dimensions: | |
| | Dimension | Example Options | | |
| | ---------------------------- | ------------------------------------------------------------------------------------------------- | | |
| | **Persona (Usage Scenario)** | News broadcasting, advertising voice-over, audiobooks, animated characters, documentary narration | | |
| | **Gender** | Male, Female | | |
| | **Age** | Child (~8 years), Young adult (20–30), Middle-aged (40–50), Elderly | | |
| | **Personality Traits** | Lively, calm, gentle, intellectual, cute, serious | | |
| | **Speaking Rate & Rhythm** | Fast, slow, moderate, urgent, steady | | |
| | **Intonation Style** | Rising, neutral, passionate, relaxed | | |
| | **Timbre** | Deep and magnetic, crisp and bright, husky and warm, youthful | | |
| --- | |
| ## High-Quality Examples | |
| ### ✅ Recommended Templates | |
| **Example 1: Poetry Recitation** | |
| > “A male modern poetry reciter with a deep, magnetic low voice, delivering poetry with strong rhythmic pauses, powerful volume, and intense emotional expression.” | |
| **Example 2: News Style** | |
| > “A female news anchor speaking standard Mandarin with a clear and bright mid-to-high pitch, steady professional pacing, strong volume, and a neutral, objective tone.” | |
| **Example 3: Advertising Voice-Over** | |
| > “A male voice for liquor brand advertising, featuring a rich and weathered timbre, slow and bold speaking rate, strong volume, conveying a sense of history and masculinity.” | |
| --- | |
| ## Common Mistakes and Improvements | |
| | Type | ❌ Not Recommended | ✅ Improved Version | | |
| | ------------------------- | -------------------------- | ---------------------------------------------------------------------------- | | |
| | **Too Generic** | “Female voice, nice” | “Young female voice with a clear pitch and moderate speaking rate” | | |
| | **Subjective Evaluation** | “A great-sounding voice” | “Bright timbre with strong expressiveness” | | |
| | **Single Dimension** | “Low-pitched male voice” | “Middle-aged male with a low pitch, slow pacing, suitable for documentaries” | | |
| | **Redundant Wording** | “Very, very gentle voice” | “Gentle and intellectual female voice” | | |
| | **Imitation Request** | “Sounds like XX celebrity” | **Prohibited** — describe objective voice traits instead | | |
| --- | |
| ## Quick Checklist | |
| Before submitting a voice_prompt, make sure that: | |
| * [ ] Length ≤ 200 characters | |
| * [ ] At least 3 different descriptive dimensions are included | |
| * [ ] No subjective evaluation words (e.g., “nice”, “great”, “favorite”) | |
| * [ ] No references to real individuals or imitation requests | |
| * [ ] No repetitive or exaggerated wording | |
| * [ ] Usage scenario is clearly defined | |
| * [ ] All descriptors are perceptible and concrete | |
| --- | |