---
title: ACE-Step 1.5 Custom Edition
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
python_version: "3.11"
hardware: zero-gpu-medium
---
# ACE-Step 1.5 Custom Edition

A fully featured implementation of ACE-Step 1.5 with a custom GUI and workflow capabilities, for local use and HuggingFace Space deployment.
## Features

### 🎵 Three Main Interfaces

1. **Standard ACE-Step GUI**: the full standard ACE-Step 1.5 interface with all original capabilities
2. **Custom Timeline Workflow**: advanced timeline-based generation with:
   - 32-second clip generation (2s lead-in + 28s main + 2s lead-out)
   - Seamless clip blending for continuous music
   - Context Length slider (0-120 seconds) for style guidance
   - Master timeline with extend, inpaint, and remix capabilities
3. **LoRA Training Studio**: a complete LoRA training interface with:
   - Audio file upload and preprocessing
   - Custom training configuration
   - Model download/upload for continued training
## Architecture

- **Base Model**: ACE-Step v1.5 Turbo
- **Framework**: Gradio 5.9.1, PyTorch
- **Deployment**: local execution + HuggingFace Spaces
- **Audio Processing**: DiT + VAE + 5Hz language model
## Installation

### Local Setup

```bash
# Clone the repository
git clone https://github.com/yourusername/ace-step-custom.git
cd ace-step-custom

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download the ACE-Step model
python scripts/download_model.py

# Run the application
python app.py
```
### HuggingFace Space Deployment

1. Create a new Space on HuggingFace
2. Upload all files to the Space
3. Set the Space to use GPU hardware (recommended: H200 or A100)
4. The app will automatically download the models and start
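On ZeroGPU hardware (as configured by `hardware: zero-gpu-medium` in the frontmatter), GPU-bound functions in `app.py` are typically wrapped with the `spaces.GPU` decorator so the Space only holds a GPU while generating. A minimal sketch, assuming a hypothetical `generate` function; the try/except fallback lets the same file run locally, where the `spaces` package is absent:

```python
# Sketch of ZeroGPU usage; the generate() body is a placeholder, not the app's real inference code.
try:
    import spaces  # available on HF Spaces with ZeroGPU hardware
except ImportError:  # running locally without the spaces package
    class _NoOpSpaces:
        @staticmethod
        def GPU(fn=None, duration=60):
            # Mimic the decorator's two call styles: @spaces.GPU and @spaces.GPU(duration=...)
            if fn is None:
                return lambda f: f
            return fn
    spaces = _NoOpSpaces()

@spaces.GPU(duration=120)  # request a GPU slice for up to 120 s per call
def generate(prompt: str) -> str:
    # Model inference would run here; this stub just echoes the prompt.
    return f"generated audio for: {prompt}"
```

Locally the decorator is a no-op, so the Gradio app behaves identically with or without a GPU queue.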
## Usage

### Standard Mode

Use the first tab for standard ACE-Step generation with all original features.

### Timeline Mode

1. Enter your prompt/lyrics
2. Adjust the Context Length (how far back to reference previous clips)
3. Click "Generate" to create 32-second clips
4. Clips are automatically blended and added to the timeline
5. Use "Extend" to continue the song, or the other options for variations
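The Context Length step can be pictured as slicing the tail of the existing timeline to condition the next clip. A minimal sketch with plain Python lists, assuming 48 kHz mono samples; the function name is illustrative, not the app's actual API:

```python
SAMPLE_RATE = 48_000  # matches the 48 kHz audio format listed under Technical Details

def context_slice(timeline: list[float], context_seconds: float) -> list[float]:
    """Return the last `context_seconds` of the timeline (slider range 0-120 s)."""
    context_seconds = max(0.0, min(context_seconds, 120.0))  # clamp to slider range
    n = int(context_seconds * SAMPLE_RATE)
    return timeline[-n:] if n > 0 else []

# e.g. a 60-second timeline conditioned on its last 10 seconds
timeline = [0.0] * (60 * SAMPLE_RATE)
context = context_slice(timeline, 10)
```

A longer context gives the model more of the previous material to match in style, at the cost of a larger conditioning input.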
### LoRA Training

1. Upload audio files for training
2. Configure the training parameters
3. Train custom LoRA models
4. Download the result and reuse it for continued training
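The training parameters in step 2 can be captured in a small config object. A hedged sketch: the field names below (learning rate, rank, alpha, epochs, batch size) are common LoRA hyperparameters, not necessarily the exact names this app's training studio uses:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class LoRATrainingConfig:
    # Illustrative hyperparameters; the app's actual parameter names may differ.
    learning_rate: float = 1e-4
    rank: int = 16        # LoRA rank: size of the low-rank weight update
    alpha: int = 32       # scaling factor applied to the update
    epochs: int = 10
    batch_size: int = 2

    def to_json(self) -> str:
        """Serialize for saving alongside a downloaded LoRA checkpoint."""
        return json.dumps(asdict(self), indent=2)

config = LoRATrainingConfig(rank=8, epochs=20)
```

Saving the config with the trained LoRA makes step 4 (reuse for continued training) reproducible.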
## System Requirements

### Minimum

- GPU: 8GB VRAM (with optimizations)
- RAM: 16GB
- Storage: 20GB

### Recommended

- GPU: 16GB+ VRAM (A100, H200, or high-end consumer GPUs)
- RAM: 32GB
- Storage: 50GB
## Technical Details

- **Audio Format**: 48kHz, stereo
- **Generation Speed**: ~8 inference steps (turbo model)
- **Context Window**: up to 120 seconds for style guidance
- **Blend Regions**: 2-second crossfade between clips
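The 2-second blend regions can be sketched as a crossfade over the overlapping samples: the outgoing clip fades to zero while the incoming clip fades in. A minimal pure-Python linear version, assuming mono sample lists at 48 kHz (the app itself works in stereo, and may use a different fade curve):

```python
SAMPLE_RATE = 48_000

def crossfade(a: list[float], b: list[float], fade_seconds: float = 2.0) -> list[float]:
    """Blend the tail of clip `a` into the head of clip `b` with a linear crossfade."""
    n = min(int(fade_seconds * SAMPLE_RATE), len(a), len(b))
    blended = []
    for i in range(n):
        w = i / n  # weight ramps 0 -> 1 across the fade region
        blended.append(a[len(a) - n + i] * (1.0 - w) + b[i] * w)
    # result length = len(a) + len(b) - n (the overlap is shared, not doubled)
    return a[:len(a) - n] + blended + b[n:]
```

An equal-power curve (sine/cosine weights) is a common substitute when a linear fade causes an audible dip in loudness at the midpoint.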
## Credits

Based on ACE-Step 1.5 by ACE Studio

- GitHub: https://github.com/ace-step/ACE-Step-1.5
- Original Demo: https://huggingface.co/spaces/ACE-Step/ACE-Step

## License

MIT License (see the LICENSE file)