ACE-Step 1.5 XL: SFT (4B DiT)


Model Details

This is the XL (4B) SFT variant of ACE-Step 1.5, a supervised fine-tuned model with ~4B parameters. Compared with the base model, SFT provides higher audio quality. It also supports CFG (Classifier-Free Guidance), whose guidance scale controls how closely the output follows the prompt.
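To make the guidance scale concrete, here is a minimal sketch of the standard classifier-free guidance blend. It assumes the textbook formula and uses plain Python lists in place of tensors. The function name `apply_cfg` is hypothetical, and ACE-Step's own implementation may differ.

```python
from typing import List, Sequence


def apply_cfg(
    cond: Sequence[float],
    uncond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Blend conditional and unconditional predictions (textbook CFG formula).

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional prediction,
    and larger values push further toward the prompt.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    # uncond + scale * (cond - uncond), applied element by element
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
```

In this formulation each denoising step needs both a conditional and an unconditional prediction, which is one reason CFG models typically cost more per step than guidance-free variants.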

XL Architecture

| Parameter | Value |
|---|---|
| DiT Decoder hidden_size | 2560 |
| DiT Decoder layers | 32 |
| DiT Decoder attention heads | 32 |
| Encoder hidden_size | 2048 |
| Encoder layers | 8 |
| Total params | ~4B |
| Weights size (bf16) | ~18.8 GB |
| Inference steps | 50 (with CFG) |

GPU Requirements

| VRAM | Support |
|---|---|
| ≥12 GB | With CPU offload + INT8 quantization |
| ≥16 GB | With CPU offload |
| ≥20 GB | Without offload |
| ≥24 GB | Full quality (XL + 4B LM) |

All LM models (0.6B / 1.7B / 4B) are fully compatible with XL.
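The VRAM tiers above can be expressed as a small lookup. This is a sketch only: the thresholds and labels are copied from the table, while the helper name `recommend_setup` is hypothetical and not part of ACE-Step.

```python
from typing import List, Tuple

# (minimum VRAM in GB, supported setup), highest tier first; values from the table above
VRAM_TIERS: List[Tuple[float, str]] = [
    (24, "Full quality (XL + 4B LM)"),
    (20, "Without offload"),
    (16, "With CPU offload"),
    (12, "With CPU offload + INT8 quantization"),
]


def recommend_setup(vram_gb: float) -> str:
    """Return the highest listed tier that fits in the given VRAM."""
    for min_gb, setup in VRAM_TIERS:
        if vram_gb >= min_gb:
            return setup
    return "Below the listed minimum of 12 GB"
```

For example, `recommend_setup(16)` returns `"With CPU offload"`, and a 10 GB card falls below the smallest listed tier.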

Key Features

  • 💰 Commercial-Ready: Trained on legally compliant datasets. Generated music can be used for commercial purposes.
  • 📚 Safe Training Data: Licensed music, royalty-free/public domain, and synthetic (MIDI-to-Audio) data.
  • 🎯 CFG Support: Adjust prompt adherence with the guidance scale.
  • 🔮 Highest Quality: Combines SFT with the 4B-parameter DiT, making this the highest-quality variant.

Quick Start

# Install ACE-Step
git clone https://github.com/ace-step/ACE-Step-1.5.git
cd ACE-Step-1.5
pip install -e .

# Download this model
huggingface-cli download ACE-Step/acestep-v15-xl-sft --local-dir ./checkpoints/acestep-v15-xl-sft

# Run with Gradio UI
python acestep --config-path acestep-v15-xl-sft
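The download step can also be done from Python. This is a minimal sketch, assuming the `huggingface_hub` package is installed (the same package that provides `huggingface-cli`). The repo ID and target directory are taken from the commands above; `download_checkpoint` is a hypothetical helper name.

```python
from pathlib import Path

REPO_ID = "ACE-Step/acestep-v15-xl-sft"
LOCAL_DIR = Path("./checkpoints/acestep-v15-xl-sft")


def download_checkpoint(repo_id: str = REPO_ID, local_dir: Path = LOCAL_DIR) -> str:
    """Download the model repository and return the local folder path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=str(local_dir))


# Usage (requires network access and enough disk space for the weights):
#   path = download_checkpoint()
```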

Model Zoo

XL (4B) DiT Models

| DiT Model | CFG | Steps | Quality | Diversity | Tasks | Hugging Face | ModelScope |
|---|---|---|---|---|---|---|---|
| acestep-v15-xl-base | ✅ | 50 | High | High | All (extract, lego, complete) | Link | Link |
| acestep-v15-xl-sft | ✅ | 50 | Very High | Medium | Standard | This repo | Link |
| acestep-v15-xl-turbo | ❌ | 8 | Very High | Medium | Standard | Link | Link |
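When choosing between the XL variants, the trade-offs in the table can be filtered programmatically. The sketch below simply encodes the table rows as data. `DitVariant` and `find_variants` are hypothetical names for illustration, not part of the ACE-Step codebase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DitVariant:
    name: str
    cfg: bool       # whether the variant supports classifier-free guidance
    steps: int      # inference steps listed in the table
    quality: str
    diversity: str
    tasks: str


# Rows copied from the XL (4B) DiT Models table above
XL_VARIANTS = [
    DitVariant("acestep-v15-xl-base", True, 50, "High", "High", "All (extract, lego, complete)"),
    DitVariant("acestep-v15-xl-sft", True, 50, "Very High", "Medium", "Standard"),
    DitVariant("acestep-v15-xl-turbo", False, 8, "Very High", "Medium", "Standard"),
]


def find_variants(
    need_cfg: Optional[bool] = None,
    max_steps: Optional[int] = None,
) -> List[str]:
    """Return names of variants matching the given constraints (None = don't care)."""
    matches = []
    for v in XL_VARIANTS:
        if need_cfg is not None and v.cfg != need_cfg:
            continue
        if max_steps is not None and v.steps > max_steps:
            continue
        matches.append(v.name)
    return matches
```

For instance, `find_variants(need_cfg=True)` lists the base and SFT models, while `find_variants(max_steps=8)` leaves only the turbo model.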

LM Models (all compatible with XL)

| LM Model | Params | Audio Understanding | Composition | Hugging Face | ModelScope |
|---|---|---|---|---|---|
| acestep-5Hz-lm-0.6B | 0.6B | Medium | Medium | Link | Link |
| acestep-5Hz-lm-1.7B | 1.7B | Medium | Medium | Included in main | Included in main |
| acestep-5Hz-lm-4B | 4B | Strong | Strong | Link | Link |

Acknowledgements

This project is co-led by ACE Studio and StepFun.

Citation

@misc{gong2026acestep,
    title={ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation},
    author={Junmin Gong and Yulin Song and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
    howpublished={\url{https://github.com/ace-step/ACE-Step-1.5}},
    year={2026},
    note={GitHub repository}
}