---
title: ACE-Step 1.5 Custom Edition
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
python_version: "3.11"
hardware: zero-gpu-medium
---
# ACE-Step 1.5 Custom Edition
A fully featured implementation of ACE-Step 1.5 with a custom GUI and workflow capabilities, for local use and HuggingFace Space deployment.
## Features
### 🎵 Three Main Interfaces
1. **Standard ACE-Step GUI**: Full-featured standard ACE-Step 1.5 interface with all original capabilities
2. **Custom Timeline Workflow**: Advanced timeline-based generation with:
- 32-second clip generation (2s lead-in + 28s main + 2s lead-out)
- Seamless clip blending for continuous music
- Context Length slider (0-120 seconds) for style guidance
- Master timeline with extend, inpaint, and remix capabilities
3. **LoRA Training Studio**: Complete LoRA training interface with:
- Audio file upload and preprocessing
- Custom training configuration
- Model download/upload for continued training
## Architecture
- **Base Model**: ACE-Step v1.5 Turbo
- **Framework**: Gradio 5.9.1, PyTorch
- **Deployment**: Local execution + HuggingFace Spaces
- **Audio Processing**: DiT + VAE + 5Hz Language Model
## Installation
### Local Setup
```bash
# Clone the repository
git clone https://github.com/yourusername/ace-step-custom.git
cd ace-step-custom
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download ACE-Step model
python scripts/download_model.py
# Run the application
python app.py
```
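Before launching, you can confirm that the core packages resolved correctly. A minimal check using only the standard library (the package names below are illustrative; match them to `requirements.txt`):

```python
import importlib.util

def check_packages(names):
    """Return a dict mapping each package name to whether it can be imported."""
    return {n: importlib.util.find_spec(n) is not None for n in names}

# Packages this project's stack relies on (see requirements.txt)
status = check_packages(["torch", "gradio", "numpy"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```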
### HuggingFace Space Deployment
1. Create a new Space on HuggingFace
2. Upload all files to the Space
3. Set the Space hardware to a GPU tier (recommended: H200 or A100)
4. The app will automatically download models and start
## Usage
### Standard Mode
Use the first tab for standard ACE-Step generation with all original features.
### Timeline Mode
1. Enter your prompt/lyrics
2. Adjust Context Length (how many seconds of previous clips to reference for style guidance)
3. Click "Generate" to create 32-second clips
4. Clips automatically blend and add to timeline
5. Use "Extend" to continue the song or other options for variations
### LoRA Training
1. Upload audio files for training
2. Configure training parameters
3. Train custom LoRA models
4. Download and reuse for continued training
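Conceptually, a LoRA adapter stores a low-rank update to a frozen weight matrix instead of retraining the whole model: the effective weight is `W + (alpha / r) * B @ A`, where `A` and `B` are the small trained matrices of rank `r`. The NumPy sketch below illustrates how such an adapter is merged at inference time; the shapes and scaling convention are illustrative, not ACE-Step's actual internals:

```python
import numpy as np

def merge_lora(W, A, B, alpha):
    """Merge a LoRA update into a frozen weight matrix.

    W: (out, in) frozen base weights
    A: (r, in)   trained down-projection
    B: (out, r)  trained up-projection
    alpha: scaling factor; the effective update is (alpha / r) * B @ A
    """
    r = A.shape[0]
    return W + (alpha / r) * (B @ A)

# Toy example: a 4x4 layer with a rank-2 adapter
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
A = rng.standard_normal((2, 4))
B = rng.standard_normal((4, 2))
W_merged = merge_lora(W, A, B, alpha=4.0)
```

Because only `A` and `B` are trained, a downloaded adapter is small and can be reloaded later to continue training against the same frozen base.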
## System Requirements
### Minimum
- GPU: 8GB VRAM (with optimizations)
- RAM: 16GB
- Storage: 20GB
### Recommended
- GPU: 16GB+ VRAM (A100, H200, or comparable consumer GPUs)
- RAM: 32GB
- Storage: 50GB
## Technical Details
- **Audio Format**: 48kHz, stereo
- **Inference Steps**: ~8 (turbo model)
- **Context Window**: Up to 120 seconds for style guidance
- **Blend Regions**: 2-second crossfade between clips
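The 2-second blend can be sketched as an equal-power crossfade that mixes one clip's lead-out with the next clip's lead-in. This is a common blend curve, assumed here for illustration rather than taken from the project's code:

```python
import numpy as np

def crossfade(clip_a, clip_b, sr=48000, blend_s=2.0):
    """Join two clips with an equal-power crossfade.

    clip_a, clip_b: float arrays of shape (samples,) or (samples, channels)
    sr: sample rate (this project uses 48 kHz audio)
    blend_s: overlap duration in seconds (2 s per the blend-region spec)
    """
    n = int(sr * blend_s)
    t = np.linspace(0.0, np.pi / 2, n)
    fade_out = np.cos(t)   # tapers clip_a's tail from 1 down to 0
    fade_in = np.sin(t)    # raises clip_b's head from 0 up to 1
    if clip_a.ndim == 2:   # broadcast the fade curves over channels
        fade_out = fade_out[:, None]
        fade_in = fade_in[:, None]
    blended = clip_a[-n:] * fade_out + clip_b[:n] * fade_in
    return np.concatenate([clip_a[:-n], blended, clip_b[n:]])
```

The result is `len(clip_a) + len(clip_b) - n` samples long, since the two blend regions overlap; equal-power fades keep perceived loudness roughly constant through the transition.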
## Credits
Based on ACE-Step 1.5 by ACE Studio
- GitHub: https://github.com/ace-step/ACE-Step-1.5
- Original Demo: https://huggingface.co/spaces/ACE-Step/ACE-Step
## License
MIT License (see LICENSE file)