---
title: ACE-Step WebGPU
emoji: 🎡
colorFrom: purple
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Text-to-music in your browser via WebGPU.
---
# ACE-Step WebGPU
Describe any song. AI writes and produces it, right in your browser.
The pipeline (5 Hz Qwen3 LM → FSQ → DiT decoder → Oobleck VAE) runs end-to-end
via [onnxruntime-web](https://onnxruntime.ai/) with the WebGPU execution
provider. Two Web Workers keep the LM and diffusion+VAE graphs in separate
WASM heaps so neither hits the 4 GB single-heap limit.
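A minimal sketch of the two-worker split described above. The file names `lm.worker.js` and `decoder.worker.js` are illustrative, not the actual names used by this Space; the point is that each `Worker` instantiates its own onnxruntime-web WASM module and therefore gets its own heap:

```javascript
// Spawn one worker per ONNX graph group so each WASM module allocates
// its own heap (illustrative file names, not this Space's actual ones).
// `WorkerImpl` is injectable for testing outside a browser.
function createPipelineWorkers(WorkerImpl = Worker) {
  // Runs the 5 Hz Qwen3 LM graph in its own WASM heap.
  const lmWorker = new WorkerImpl("lm.worker.js", { type: "module" });
  // Runs the DiT decoder + Oobleck VAE graphs in a second heap.
  const decoderWorker = new WorkerImpl("decoder.worker.js", { type: "module" });
  return { lmWorker, decoderWorker };
}
```

The main thread then coordinates the two via `postMessage`, passing the LM's token output to the decoder worker, so neither module's allocations count against the other's 4 GB ceiling.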
## Models
- DiT decoder (2B, fp16) and Oobleck VAE (fp16) from
[shreyask/ACE-Step-v1.5-ONNX](https://huggingface.co/shreyask/ACE-Step-v1.5-ONNX)
- 5 Hz LM (0.6B, 4-bit MatMulNBits) from
[ACE-Step/acestep-5Hz-lm-0.6B](https://huggingface.co/ACE-Step/acestep-5Hz-lm-0.6B)
- Text encoder: [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)
Weights are fetched on demand and cached in the browser's Cache Storage after
the first load (~2 GB total).
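A sketch of the fetch-once-then-cache pattern, assuming a Cache Storage cache (e.g. from `caches.open(...)` in the browser); the function and cache name are illustrative, not this Space's actual code. The `cache` handle is passed in so the logic is testable outside a browser:

```javascript
// Fetch a model shard once, then serve it from Cache Storage on later
// visits. `cache` is a Cache-like object with match()/put(), e.g. the
// result of `await caches.open("ace-step-models")` in the browser.
async function fetchCached(cache, url) {
  const hit = await cache.match(url);
  if (hit) return hit;                  // already cached: no network hit
  const res = await fetch(url);
  if (!res.ok) throw new Error(`fetch failed: ${res.status}`);
  await cache.put(url, res.clone());    // store a copy for next load
  return res;
}
```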
## Requirements
- WebGPU-capable browser: Chrome/Edge 113+ or Safari 26+ on desktop
- ~4 GB free RAM recommended
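Support can be probed up front with the standard `navigator.gpu.requestAdapter()` check, sketched below with the navigator object injectable for testing (the function name is illustrative):

```javascript
// Returns true only if the browser exposes WebGPU *and* can actually
// hand back an adapter (some browsers expose navigator.gpu but return
// null on unsupported hardware).
async function hasWebGPU(nav = globalThis.navigator) {
  if (!nav || !("gpu" in nav)) return false;
  try {
    const adapter = await nav.gpu.requestAdapter();
    return !!adapter;
  } catch {
    return false;
  }
}
```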
## Source
The `_source/` directory in this Space's Files tab contains the full Vite/React
project (`src/`, `public/`, configs). Build it locally with:
```bash
cd _source
npm install
npm run build
```
Upstream: [ACE-Step/Ace-Step1.5](https://huggingface.co/ACE-Step/Ace-Step1.5)
(Apache 2.0).