| --- |
| language: |
| - en |
| license: mit |
| pipeline_tag: text-to-audio |
| tags: |
| - ACE-Step |
| - LoRA |
| - DPO |
| - music-generation |
| - audio-generation |
| - text-to-audio |
| - text2audio |
| - PEFT |
| - acestep-v15-turbo |
| - acestep-5Hz-lm-4B |
| base_model: |
| - ACE-Step/Ace-Step1.5 |
| library_name: peft |
| widget: |
| - text: "Showcase reel" |
|   output: |
|     url: showcase-training-chapter-v3.mp4 |
| --- |
| |
| # AceStep_Refine_Redmond |
|
|
| I'm grateful for the GPU time from Redmond.AI that allowed me to make this model! |
|
|
| <Gallery /> |
|
|
| ## Overview |
| AceStep_Refine_Redmond is a DPO-refined LoRA adapter for ACE-Step 1.5 Turbo, focused on improving musicality, arrangement coherence, and vocal character in practical generation workflows. |
|
|
| This release includes: |
| - `standard/` (PEFT adapter for regular ACE-Step loading) |
| - `comfyui/` (single-file ComfyUI-compatible LoRA export) |
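To fetch only one of the two folders rather than the whole repository, something like the following sketch may help. It relies on `snapshot_download` from `huggingface_hub` and its `allow_patterns` argument. The helper names `patterns_for` and `download_variant` are illustrative, and the repository id in the usage comment is a placeholder, not this model's actual Hub id.

```python
from typing import List

# Folder names listed in this release.
VARIANTS = ("standard", "comfyui")


def patterns_for(variant: str) -> List[str]:
    """Return glob patterns that match only the files in one release folder."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return [f"{variant}/*"]


def download_variant(repo_id: str, variant: str) -> str:
    """Download one folder of the repo and return the local snapshot path."""
    # Imported lazily so patterns_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=patterns_for(variant))


# Usage (placeholder repo id, replace with the real one):
# local_path = download_variant("your-namespace/AceStep_Refine_Redmond", "standard")
```

Restricting the download by pattern avoids pulling the ComfyUI export when only the PEFT adapter is needed, and vice versa.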
|
|
| ## Compatibility |
| - DiT used: `acestep-v15-turbo` |
| - Recommended LM for prompting/composition: `acestep-5Hz-lm-4B` |
| - `standard/` works in regular ACE-Step workflows. |
| - `comfyui/` is the converted single-file LoRA for ComfyUI. |
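For the ComfyUI route, a minimal sketch of copying the exported file into place is shown below. It assumes a standard ComfyUI layout where LoRA files live under `models/loras`, and it assumes the export is a `.safetensors` file, which this card does not state. The function name `install_comfyui_lora` and the paths in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path
from typing import List


def install_comfyui_lora(export_dir: Path, comfyui_root: Path) -> List[Path]:
    """Copy every .safetensors file from export_dir into ComfyUI's LoRA folder."""
    target = comfyui_root / "models" / "loras"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(export_dir.glob("*.safetensors")):
        # copy2 preserves file metadata and returns the destination path.
        copied.append(Path(shutil.copy2(src, target / src.name)))
    return copied


# Usage (both paths are placeholders):
# install_comfyui_lora(Path("downloads/comfyui"), Path("~/ComfyUI").expanduser())
```

After copying, the file should appear in ComfyUI's LoRA loader node once the interface is refreshed or restarted.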
|
|
| ## What Changed vs Base |
| In blind A/B testing against the base reference, this refinement achieved a win rate of about **70%**. |
| Votes were collected blind from multiple users. |
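How much weight a win rate deserves depends on how many votes are behind it, and this card does not report the vote count. As a sketch of how to put an uncertainty range on such a number, the function below computes a Wilson score interval for a binomial proportion. The name `wilson_interval` and the counts in the usage comment are illustrative only and are not the actual vote tally.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    if total <= 0 or not 0 <= wins <= total:
        raise ValueError("need total > 0 and 0 <= wins <= total")
    p = wins / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Usage with made-up counts (NOT the real vote tally):
# low, high = wilson_interval(wins=70, total=100)
```

With few votes the interval is wide, so the same 70% figure is much stronger evidence when it rests on hundreds of comparisons than on a few dozen.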
|
|
| Training summary (final DPO refinement stage): |
| - Base checkpoint: `acestep-v15-turbo` |
| - Adapter type: LoRA |
| - Rank / Alpha: `96 / 192` |
| - Learning rate: `8e-5` |
| - Training path: LoRA fine-tune on a large dataset for `75` epochs, followed by DPO refinement on top of that adapter |
| - DPO epochs: resumed from the epoch-`75` adapter and configured to run up to epoch `81` |
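The rank and alpha values above determine how strongly the LoRA update is scaled. In standard LoRA the low-rank update is multiplied by `alpha / rank`, so `96 / 192` corresponds to a factor of 2. The sketch below computes that factor and, as an optional check, reads it from a PEFT `adapter_config.json` through its `r`, `lora_alpha`, and `use_rslora` keys. That `standard/` contains such a file, and that rank-stabilized LoRA was not used, are assumptions; the function names are illustrative.

```python
import json
import math
from pathlib import Path


def lora_scaling(rank: int, alpha: float, use_rslora: bool = False) -> float:
    """Scaling applied to the LoRA update: alpha/r, or alpha/sqrt(r) for rsLoRA."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / math.sqrt(rank) if use_rslora else alpha / rank


def scaling_from_adapter_config(config_path: Path) -> float:
    """Read r / lora_alpha from a PEFT adapter_config.json and return the scaling."""
    cfg = json.loads(config_path.read_text(encoding="utf-8"))
    return lora_scaling(cfg["r"], cfg["lora_alpha"], cfg.get("use_rslora", False))


print(lora_scaling(rank=96, alpha=192))  # 2.0

# Usage against a downloaded adapter (path is a placeholder):
# scaling_from_adapter_config(Path("standard/adapter_config.json"))
```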
|
|
| ## Known Limitations |
| - Behavior can still vary by prompt style; some sparse prompts may produce less stable vocal timbre. |
| - Very dense arrangements can introduce texture noise or high-frequency harshness in some generations. |
| - This adapter is tuned on a specific preference dataset and may not generalize equally across all genres. |
|
|
| ## Responsible Use |
| - Do not use this model to imitate or impersonate real artists without permission. |
| - Respect copyright, voice rights, and local regulations when generating and publishing audio. |
| - Review outputs before public release, especially in commercial workflows. |
|
|