# ACE-Step 1.5 Custom Edition - Quick Start Guide

## Installation

### Option 1: Local Setup

1. **Clone the repository**
```bash
git clone https://github.com/yourusername/ace-step-custom.git
cd ace-step-custom
```

2. **Create and activate a virtual environment**
```bash
python -m venv venv

# On Windows:
venv\Scripts\activate

# On Linux/Mac:
source venv/bin/activate
```

3. **Run setup**
```bash
python scripts/setup.py
```

4. **Download model**
```bash
python scripts/download_model.py
```

5. **Launch application**
```bash
python app.py
```

6. **Open browser to** `http://localhost:7860`
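
If the page does not load, a quick reachability check can show whether the server has started. The sketch below is not part of the project: `is_app_up` is a hypothetical helper, and it assumes the app listens on the default address given in step 6.

```python
import http.client
import urllib.error
import urllib.request


def is_app_up(url: str = "http://localhost:7860", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with an error status (4xx/5xx).
        return True
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, bad response, or malformed URL.
        return False
```

Calling `is_app_up()` a few seconds after `python app.py` returns `False` while the model is still loading or if the app bound to a different port.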

### Option 2: Hugging Face Spaces

1. Create a new Space on Hugging Face
2. Upload all project files
3. Set Space configuration:
   - SDK: `gradio`
   - Python: `3.10`
   - GPU: `A10G` (or better)
4. Space will auto-deploy
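
On Spaces, the SDK and Python version are normally declared in a YAML block at the top of the Space's `README.md`. The sketch below renders such a block. The helper name, the title, and the `app_file` value are assumptions (the local setup launches `app.py`). As far as the Spaces configuration reference describes it, `suggested_hardware` is only a hint for people duplicating the Space, so the GPU itself is still selected in the Space settings.

```python
def render_space_front_matter(
    title: str = "ACE-Step Custom",        # assumed display title
    sdk: str = "gradio",
    python_version: str = "3.10",
    app_file: str = "app.py",              # assumed entry point, as in the local setup
    suggested_hardware: str = "a10g-small",
) -> str:
    """Build the YAML front matter for a Space README (hypothetical helper)."""
    lines = [
        "---",
        f"title: {title}",
        f"sdk: {sdk}",
        # Quote the version: an unquoted 3.10 is read as the YAML float 3.1.
        f'python_version: "{python_version}"',
        f"app_file: {app_file}",
        f"suggested_hardware: {suggested_hardware}",
        "---",
    ]
    return "\n".join(lines) + "\n"


print(render_space_front_matter(), end="")
```

Paste the printed block at the very top of `README.md` before uploading the project files.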

## Usage

### Tab 1: Standard ACE-Step

Standard interface with all original ACE-Step features:
- Text-to-music generation
- Variation generation
- Repainting sections
- Lyric editing

### Tab 2: Timeline Workflow

Advanced timeline-based generation:
1. Enter a prompt and lyrics
2. Set the context length (0-120 seconds)
3. Click "Generate" to create 32-second clips
4. Clips are blended into the timeline automatically
5. Use "Extend" to continue the timeline
6. Use "Inpaint" to edit regions
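
To make the context and blending steps concrete, here is a minimal sketch of one common way to do it: take the tail of the timeline as context, then append the new clip with a linear crossfade. This guide does not describe the app's actual blending method, so both functions are hypothetical and the linear fade is an assumption. Audio is modeled as a plain list of samples.

```python
from typing import List


def context_tail(timeline: List[float], context_seconds: float, sample_rate: int) -> List[float]:
    """Return the last `context_seconds` of the timeline (empty if 0)."""
    n = min(len(timeline), int(context_seconds * sample_rate))
    return timeline[len(timeline) - n:]


def append_with_crossfade(timeline: List[float], clip: List[float], overlap: int) -> List[float]:
    """Append `clip` to `timeline`, linearly crossfading over `overlap` samples."""
    n = max(0, min(overlap, len(timeline), len(clip)))
    head = timeline[: len(timeline) - n]
    mixed = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight of the new clip rises across the overlap
        mixed.append(timeline[len(timeline) - n + i] * (1.0 - w) + clip[i] * w)
    return head + mixed + clip[n:]


# Toy example: 2 s of "audio" at an unrealistically low 4 Hz sample rate.
timeline = [1.0] * 8
clip = [0.0] * 8
print(len(context_tail(timeline, 1.0, 4)))            # 1 s of context = 4 samples
print(len(append_with_crossfade(timeline, clip, 4)))  # 8 + 8 - 4 = 12 samples
```

A longer overlap gives a smoother transition but lets more of the two clips sound at once.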

### Tab 3: LoRA Training

Train custom models:
1. Upload audio files (10+ recommended)
2. Set training parameters
3. Click "Start Training"
4. Download trained model
5. Use in Tab 1 or Tab 2

## Tips

- **First time:** Start with Standard tab to understand basics
- **For longer songs:** Use Timeline tab with context length 30-60s
- **For custom styles:** Train LoRA with 20+ similar audio files
- **GPU recommended:** 8 GB+ of VRAM for best performance
- **CPU mode:** Works, but is slower; use shorter durations
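
To see whether a machine meets the VRAM recommendation, one option is to query `nvidia-smi`. The sketch below assumes an NVIDIA GPU with the driver tools installed. The helper names are hypothetical and the check is independent of the app itself.

```python
import shutil
import subprocess
from typing import List


def parse_vram_mib(nvidia_smi_output: str) -> List[int]:
    """Parse one total-memory value (MiB) per GPU line, skipping non-numeric lines."""
    return [int(s) for s in (line.strip() for line in nvidia_smi_output.splitlines()) if s.isdigit()]


def query_vram_mib() -> List[int]:
    """Ask nvidia-smi for total memory per GPU; empty list if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


def meets_recommendation(vram_mib: List[int], min_gib: float = 8.0) -> bool:
    """True if any GPU reports at least `min_gib` GiB of total memory."""
    return any(mib >= min_gib * 1024 for mib in vram_mib)
```

`meets_recommendation(query_vram_mib())` returns `False` when no NVIDIA GPU is found. Some cards sold as 8 GB report slightly less than 8192 MiB, so a lower `min_gib` (for example 7.5) may be a fairer threshold.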

## Troubleshooting

### Out of Memory
- Reduce batch size in LoRA training
- Use shorter audio durations
- Close other GPU applications

### Poor Quality
- Increase context length
- Try different seeds
- Adjust temperature (0.6-0.8 is usually good)

### Blend Artifacts
- Reduce lead-in/lead-out durations
- Ensure consistent style across clips
- Use lower context length for more variety

## Support

- GitHub Issues: [Report bugs here]
- Documentation: See `docs/` directory
- Examples: See `examples/` directory

## Credits

Based on ACE-Step by ACE Studio and StepFun:
- Website: https://ace-step.github.io/
- Paper: https://arxiv.org/abs/2506.00045