
Nekochu

add full README with API docs, MCP, CLI, architecture

9d2d424 about 2 months ago


	---
	title: ACE-Step 1.5 XL Music Generation (CPU)
	emoji: 🎵
	colorFrom: indigo
	colorTo: yellow
	sdk: docker
	pinned: false
	license: mit
	tags:
	- music-generation
	- ace-step
	- gguf
	- lora
	- training
	- cpu
	- mcp-server
	short_description: ACE-Step 1.5 XL - CPU music generation + LoRA training
	models:
	- ACE-Step/Ace-Step1.5
	startup_duration_timeout: 2h
	---

	# ACE-Step 1.5 XL Music Generation (CPU)

	GGUF inference + LoRA training on free CPU Spaces. Powered by [acestep.cpp](https://github.com/ServeurpersoCom/acestep.cpp).

	## Features

	- Music Generation - Text/lyrics to stereo 48 kHz MP3 via GGUF-quantized models
	- LoRA Training - Fine-tune on your own audio (Side-Step engine, Adafactor optimizer)
	- Multiple LM Sizes - 0.6B / 1.7B / 4B language models (on-demand download)
	- CPU Only - Runs on free Hugging Face Spaces (2 vCPU, 18 GB RAM)

	## Music Generation

	1. Enter a music description (e.g. "upbeat electronic dance music")
	2. Enter lyrics or check Instrumental
	3. Adjust BPM, duration, steps, seed
	4. Select LM model (1.7B default, fastest on CPU)
	5. Select a LoRA adapter, if you have trained one
	6. Click Generate Music

	Timing: ~270 s to generate 10 s of audio with the 1.7B LM at 8 steps.
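
One way to read the timing figure is as a real-time factor, i.e. compute seconds per second of audio. The sketch below only restates the single measurement quoted above; the function name is illustrative, and it makes no claim about how generation time scales with duration or step count.

```python
def real_time_factor(wall_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute spent per second of generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return wall_seconds / audio_seconds


# Figure quoted in this README: ~270 s for 10 s of audio (1.7B LM, 8 steps).
rtf = real_time_factor(270, 10)
print(f"about {rtf:.0f}x slower than real time")
```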

	## LoRA Training

	1. Go to Train LoRA tab
	2. Upload audio files (WAV/MP3, max 240s each)
	3. Set LoRA name, epochs (1-10), rank (default 16)
	4. Click Train (ace-server stops during training and restarts afterwards)
	5. Use Cancel to stop early (a checkpoint is saved)
	6. Trained adapter appears in the LoRA dropdown for inference

	Timing: ~170s preprocessing + ~10s/epoch on CPU.
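
The limits and timing above can be turned into a quick pre-flight check before uploading. This is a minimal sketch, not part of the Space: the helper names are hypothetical, the constants are copied from this section, and treating preprocessing as a fixed ~170 s is an assumption (it may well grow with the amount of audio).

```python
from pathlib import Path

PREPROCESS_SECONDS = 170             # "~170s preprocessing" from the timing note
SECONDS_PER_EPOCH = 10               # "~10s/epoch on CPU" from the timing note
ALLOWED_SUFFIXES = {".wav", ".mp3"}  # upload formats listed in step 2
EPOCH_RANGE = (1, 10)                # epoch range listed in step 3


def check_training_request(files: list[str], epochs: int) -> None:
    """Raise ValueError if the request falls outside the documented limits."""
    if not files:
        raise ValueError("at least one audio file is required")
    for name in files:
        if Path(name).suffix.lower() not in ALLOWED_SUFFIXES:
            raise ValueError(f"unsupported file type: {name}")
    low, high = EPOCH_RANGE
    if not low <= epochs <= high:
        raise ValueError(f"epochs must be between {low} and {high}")


def estimate_training_seconds(epochs: int) -> int:
    """Rough wall-clock estimate: fixed preprocessing plus per-epoch cost."""
    return PREPROCESS_SECONDS + SECONDS_PER_EPOCH * epochs


check_training_request(["song.mp3"], epochs=3)
print(f"estimated training time: ~{estimate_training_seconds(3)} s")
```

The 240 s per-file limit is not checked here because that would require decoding the audio.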

	## Models

	| Component | GGUF | Size |
	|-----------|------|------|
	| DiT (music) | acestep-v15-xl-turbo-Q4_K_M | 2.8 GB |
	| LM (captions) | acestep-5Hz-lm-1.7B-Q8_0 | 1.7 GB |
	| Text Encoder | Qwen3-Embedding-0.6B-Q8_0 | 0.75 GB |
	| VAE | vae-BF16 | 0.32 GB |

	LM alternatives (on-demand download): 0.6B Q8_0 (slow), 4B Q5_K_M (best quality, ~515s).
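
To put the table in context against the 18 GB of RAM on a free Space, the default file sizes can simply be summed. This is a sketch using only the numbers listed above; file size on disk is not the same as memory use at runtime, so treat the result as a lower-bound indication rather than a measurement.

```python
# Sizes in GB, copied from the Models table (default configuration).
DEFAULT_MODELS_GB = {
    "acestep-v15-xl-turbo-Q4_K_M": 2.8,
    "acestep-5Hz-lm-1.7B-Q8_0": 1.7,
    "Qwen3-Embedding-0.6B-Q8_0": 0.75,
    "vae-BF16": 0.32,
}
SPACE_RAM_GB = 18  # free CPU Space, per the Features list

total_gb = sum(DEFAULT_MODELS_GB.values())
print(f"default model files: {total_gb:.2f} GB of {SPACE_RAM_GB} GB RAM")
```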

	---

	## API

	### Python Client - Generate Music

	```python
	from gradio_client import Client

	client = Client("WeReCooking/ACE-Step-CPU")

	result = client.predict(
	    caption="upbeat electronic dance music",
	    lyrics="[Instrumental]",
	    instrumental=True,
	    bpm=120,
	    duration=10,
	    seed=-1,  # -1 = random
	    steps=8,  # 1-32, fewer = faster
	    lora_select="None (no LoRA)",  # or trained adapter name
	    lm_model_select="acestep-5Hz-lm-1.7B-Q8_0.gguf",
	    api_name="/generate",
	)
	print(result)  # (audio_path, status_message)
	```
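
When calling `/generate` from a script, it can help to validate the arguments locally before a request that may run for minutes. The helper below is a hypothetical sketch: the parameter names and the 1-32 step range come from the example above, while the function itself, its defaults, and the rule "empty lyrics means instrumental" are assumptions for illustration.

```python
def build_generate_kwargs(
    caption: str,
    lyrics: str = "",
    *,
    bpm: int = 120,
    duration: int = 10,
    seed: int = -1,   # -1 = random, as in the example above
    steps: int = 8,
    lora: str = "None (no LoRA)",
    lm_model: str = "acestep-5Hz-lm-1.7B-Q8_0.gguf",
) -> dict:
    """Return keyword arguments for client.predict(..., api_name="/generate")."""
    if not caption.strip():
        raise ValueError("caption must not be empty")
    if not 1 <= steps <= 32:
        raise ValueError("steps must be between 1 and 32")
    # Assumption: no lyrics means an instrumental track.
    instrumental = not lyrics.strip()
    return {
        "caption": caption,
        "lyrics": "[Instrumental]" if instrumental else lyrics,
        "instrumental": instrumental,
        "bpm": bpm,
        "duration": duration,
        "seed": seed,
        "steps": steps,
        "lora_select": lora,
        "lm_model_select": lm_model,
        "api_name": "/generate",
    }


kwargs = build_generate_kwargs("upbeat electronic dance music", steps=8)
print(kwargs["lyrics"], kwargs["instrumental"])
```

With a `Client` created as above, `audio_path, status = client.predict(**kwargs)` would then unpack the two return values noted in the example.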

	### Python Client - Train LoRA

	```python
	from gradio_client import Client, handle_file

	client = Client("WeReCooking/ACE-Step-CPU")

	result = client.predict(
	    audio_files=[handle_file("song.mp3")],
	    lora_name="my-style",
	    epochs=3,
	    lr=0.0001,
	    rank=16,
	    api_name="/train_lora",
	)
	print(result)  # (log_text, train_btn, cancel_btn)
	```

	### Python Client - Server Status

	```python
	# Reuses the `client` created in the examples above
	result = client.predict(api_name="/server_status")
	print(result)  # JSON with model info
	```

	### MCP (Model Context Protocol)

	This Space supports MCP for AI assistants (Claude Desktop, Cursor, VS Code).

	MCP Config:
	```json
	{
	  "mcpServers": {
	    "ace-step": {"url": "https://werecooking-ace-step-cpu.hf.space/gradio_api/mcp/"}
	  }
	}
	```
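
If you duplicate this Space under another account, the same config can be generated instead of edited by hand. This is a sketch under an assumption: the subdomain rule (lowercase the Space id and replace `/` with `-`) is inferred from the single URL shown above, so ids containing other characters may map differently. Both function names are hypothetical.

```python
import json


def space_mcp_url(space_id: str) -> str:
    """Build the MCP endpoint URL for a Space id such as 'owner/name'."""
    # Assumed mapping, inferred from the URL in this README.
    subdomain = space_id.lower().replace("/", "-")
    return f"https://{subdomain}.hf.space/gradio_api/mcp/"


def mcp_config(server_name: str, space_id: str) -> str:
    """Return an MCP client config as a JSON string."""
    config = {"mcpServers": {server_name: {"url": space_mcp_url(space_id)}}}
    return json.dumps(config, indent=2)


print(mcp_config("ace-step", "WeReCooking/ACE-Step-CPU"))
```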

	---

	## CLI Usage

	```bash
	# Generate music
	python app.py "upbeat electronic dance music" --duration 10 --steps 8 --format mp3

	# With lyrics
	python app.py "pop ballad" --lyrics "Hello world\nThis is a test" -d 30

	# With LoRA adapter
	python app.py "jazz piano" --adapter my-style --seed 42

	# Custom server URL
	python app.py "ambient" --server http://localhost:8085
	```
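
For batch runs, the same commands can be assembled from Python and handed to `subprocess`. The sketch below uses only the flags that appear in the examples above (`--lyrics`, `--duration`, `--steps`, `--seed`, `--adapter`, `--format`, `--server`); the helper name is hypothetical, and nothing here is checked against the script's actual argument parser.

```python
from typing import Optional


def build_cli_command(
    caption: str,
    *,
    lyrics: Optional[str] = None,
    duration: Optional[int] = None,
    steps: Optional[int] = None,
    seed: Optional[int] = None,
    adapter: Optional[str] = None,
    fmt: Optional[str] = None,
    server: Optional[str] = None,
) -> list[str]:
    """Return an argv list for app.py; options left as None are omitted."""
    cmd = ["python", "app.py", caption]
    options = [
        ("--lyrics", lyrics),
        ("--duration", duration),
        ("--steps", steps),
        ("--seed", seed),
        ("--adapter", adapter),
        ("--format", fmt),
        ("--server", server),
    ]
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


print(build_cli_command("jazz piano", adapter="my-style", seed=42))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues with captions and lyrics.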

	---

	## Architecture

	```
	ace-server (C++ GGUF)      Gradio UI (Python)
	  /lm     -> LM generate     app.py
	  /synth  -> DiT + VAE       train_engine.py (Side-Step)
	  /health                      |
	  /props                       +-- preprocess_audio()
	  /job                         +-- train_lora_generator()

	- Inference: GGUF via [acestep.cpp](https://github.com/ServeurpersoCom/acestep.cpp) HTTP API
	- Training: PyTorch via ported [Side-Step](https://github.com/koda-dernet/Side-Step) engine
	- Training stops ace-server to free RAM, then restarts it afterwards with the new adapters
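
Because ace-server is stopped during training, a client talking to it directly may want to check that it is back before sending work. This is a minimal sketch using only the standard library. It assumes that `GET /health` answers with HTTP 200 once the server is ready and that the server listens on `http://localhost:8085` as in the CLI example; neither detail is confirmed here, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Endpoint paths listed in the architecture diagram.
ENDPOINTS = ("/lm", "/synth", "/health", "/props", "/job")


def endpoint_url(base_url: str, path: str) -> str:
    """Join the server base URL with one of the listed endpoint paths."""
    if path not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {path}")
    return base_url.rstrip("/") + path


def server_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if /health answers with HTTP 200 (assumed readiness signal)."""
    try:
        url = endpoint_url(base_url, "/health")
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


print(endpoint_url("http://localhost:8085", "/health"))
```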

	## Credits

	- [ACE-Step 1.5](https://github.com/ace-step/ACE-Step-1.5) - Model architecture
	- [acestep.cpp](https://github.com/ServeurpersoCom/acestep.cpp) - GGUF inference engine
	- [Side-Step](https://github.com/koda-dernet/Side-Step) - Training engine (ported)
	- [Serveurperso/ACE-Step-1.5-GGUF](https://huggingface.co/Serveurperso/ACE-Step-1.5-GGUF) - Quantized models