AEmotionStudio committed on
Commit
33b6e18
·
verified ·
1 Parent(s): ec4b102

Mirror README.md from ACE-Step/ACE-Step-v1-chinese-rap-LoRA

checkpoints/loras/chinese-rap/README.md ADDED
@@ -0,0 +1,160 @@
---
license: apache-2.0
tags:
- music
- text2music
pipeline_tag: text-to-audio
language:
- en
- zh
- de
- fr
- es
- it
- pt
- pl
- tr
- ru
- cs
- nl
- ar
- ja
- hu
- ko
- hi
library_name: diffusers
---

# 🎤 Chinese Rap LoRA for ACE-Step (Rap Machine)

This is a hybrid rap voice model. We carefully curated Chinese rap and hip-hop datasets for training, with rigorous data cleaning and recaptioning. The results show:

- Improved Chinese pronunciation accuracy
- Stronger stylistic adherence to hip-hop and electronic genres
- Greater diversity in hip-hop vocal expression

For audio examples, see: https://ace-step.github.io/#RapMachine

## Usage Guide

1. Generate higher-quality Chinese songs
2. Create superior hip-hop tracks
3. Blend with other genres to:
   - Produce music with better vocal quality and detail
   - Add experimental flavors (e.g., underground, street culture)
4. Fine-tune the results using these parameters:

**Vocal Controls**

**`vocal_timbre`**
- Describes inherent vocal qualities.
- Examples: bright, dark, warm, cold, breathy, nasal, gritty, smooth, husky, metallic, whispery, resonant, airy, smoky, sultry, light, clear, high-pitched, raspy, powerful, ethereal, flute-like, hollow, velvety, shrill, hoarse, mellow, thin, thick, reedy, silvery, twangy.

**`techniques`** (List)
- Rap styles: `mumble rap`, `chopper rap`, `melodic rap`, `lyrical rap`, `trap flow`, `double-time rap`
- Vocal FX: `auto-tune`, `reverb`, `delay`, `distortion`
- Delivery: `whispered`, `shouted`, `spoken word`, `narration`, `singing`
- Other: `ad-libs`, `call-and-response`, `harmonized`

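
This card does not show how `vocal_timbre` and `techniques` are passed to the model. As a minimal sketch, assuming the descriptors are simply appended as comma-separated tags to a text prompt, the vocal controls could be combined as follows. The `build_prompt` helper and its argument names are hypothetical and not part of ACE-Step.

```python
from typing import Iterable, List


def build_prompt(base_tags: Iterable[str],
                 vocal_timbre: Iterable[str] = (),
                 techniques: Iterable[str] = ()) -> str:
    """Join genre tags and vocal descriptors into one comma-separated prompt.

    Hypothetical helper: it only builds a string and calls no ACE-Step API.
    """
    seen = set()
    ordered: List[str] = []
    for tag in (*base_tags, *vocal_timbre, *techniques):
        cleaned = tag.strip().lower()
        # Skip empty entries and duplicates while keeping first-seen order.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            ordered.append(cleaned)
    return ", ".join(ordered)


prompt = build_prompt(
    base_tags=["hip-hop", "electronic"],
    vocal_timbre=["gritty", "husky"],
    techniques=["chopper rap", "auto-tune", "ad-libs"],
)
print(prompt)
```

The resulting string would then go wherever your ACE-Step setup accepts a text description. Whether the model responds to each tag as intended is something to check by listening.
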
## Community Note

While a Chinese rap LoRA might seem niche for non-Chinese communities, projects like this one consistently show that ACE-Step, as a music generation foundation model, holds boundless potential. It does not just improve pronunciation in one language; it also gives rise to new styles.

The universal human appreciation of music is a precious asset. Like abstract LEGO blocks, these elements will eventually combine in more organic ways. May our open-source contributions help move the history of music forward.

---

# ACE-Step: A Step Towards Music Generation Foundation Model

![ACE-Step Framework](https://github.com/ACE-Step/ACE-Step/raw/main/assets/ACE-Step_framework.png)

## Model Description

ACE-Step is an open-source foundation model for music generation that addresses key limitations of existing approaches through a holistic architectural design. It integrates diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer, achieving state-of-the-art performance in generation speed, musical coherence, and controllability.

**Key Features:**
- 15× faster than LLM-based baselines (about 20 seconds to generate 4 minutes of music on an A100)
- Superior musical coherence across melody, harmony, and rhythm
- Full-song generation, duration control, and natural-language descriptions as input

## Uses

### Direct Use
ACE-Step can be used for:
- Generating original music from text descriptions
- Music remixing and style transfer
- Editing song lyrics

### Downstream Use
The model serves as a foundation for:
- Voice cloning applications
- Specialized music generation (rap, jazz, etc.)
- Music production tools
- Creative AI assistants

### Out-of-Scope Use
The model should not be used for:
- Generating copyrighted content without permission
- Creating harmful or offensive content
- Misrepresenting AI-generated music as human-created

## How to Get Started

See the GitHub repository: https://github.com/ace-step/ACE-Step

## Hardware Performance

| Device      | RTF (27 steps) | RTF (60 steps) |
|-------------|----------------|----------------|
| NVIDIA A100 | 27.27x         | 12.27x         |
| RTX 4090    | 34.48x         | 15.63x         |
| RTX 3090    | 12.76x         | 6.48x          |
| M2 Max      | 2.27x          | 1.03x          |

*Values are RTF (Real-Time Factor); higher values indicate faster generation.*

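
To read the table, it can help to convert RTF into wall-clock time. The sketch below assumes RTF is the ratio of audio duration to generation time, which is consistent with the note that higher values are faster. `generation_seconds` is an illustrative helper and not part of ACE-Step.

```python
def generation_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimate generation time, assuming RTF = audio duration / generation time."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return audio_seconds / rtf


# RTF values copied from the table above: device -> (27 steps, 60 steps).
RTF_TABLE = {
    "NVIDIA A100": (27.27, 12.27),
    "RTX 4090": (34.48, 15.63),
    "RTX 3090": (12.76, 6.48),
    "M2 Max": (2.27, 1.03),
}

SONG_SECONDS = 4 * 60  # a 4-minute track

for device, (rtf_27, rtf_60) in RTF_TABLE.items():
    t27 = generation_seconds(SONG_SECONDS, rtf_27)
    t60 = generation_seconds(SONG_SECONDS, rtf_60)
    print(f"{device}: {t27:.1f}s at 27 steps, {t60:.1f}s at 60 steps")
```

Under this assumption, a 4-minute track at 60 steps on an A100 takes about 240 / 12.27 ≈ 19.6 seconds, in line with the 20s figure listed under Key Features.
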
## Limitations

- Performance varies by language (the top 10 languages perform best)
- Longer generations (over 5 minutes) may lose structural coherence
- Rare instruments may not render perfectly
- Output inconsistency: results are highly sensitive to random seeds and input duration, leading to varied "gacha-style" outputs
- Style-specific weaknesses: underperforms on certain genres (e.g., Chinese rap, `zh_rap`), with limited style adherence and a musicality ceiling
- Continuity artifacts: unnatural transitions in repainting and extend operations
- Vocal quality: coarse vocal synthesis that lacks nuance
- Control granularity: needs finer-grained control over musical parameters

## Ethical Considerations

Users should:
- Verify the originality of generated works
- Disclose AI involvement
- Respect cultural elements and copyrights
- Avoid generating harmful content

## Model Details

- **Developed by:** ACE Studio and StepFun
- **Model type:** Diffusion-based music generation with transformer conditioning
- **License:** Apache 2.0
- **Resources:**
  - [Project Page](https://ace-step.github.io/)
  - [Demo Space](https://huggingface.co/spaces/ACE-Step/ACE-Step)
  - [GitHub Repository](https://github.com/ACE-Step/ACE-Step)

## Citation

```bibtex
@misc{gong2025acestep,
  title={ACE-Step: A Step Towards Music Generation Foundation Model},
  author={Junmin Gong and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
  howpublished={\url{https://github.com/ace-step/ACE-Step}},
  year={2025},
  note={GitHub repository}
}
```

## Acknowledgements
This project is co-led by ACE Studio and StepFun.