---
language:
- zh
- en
- ja
- ko
- es
license: cc-by-nc-4.0
pipeline_tag: text-to-audio
tags:
- music
- art
---

# HeartMuLa: A Family of Open Sourced Music Foundation Models

HeartMuLa is a family of open-source music foundation models designed to advance large-scale music understanding and generation across diverse tasks and modalities. Its core generator is an LLM-based song generation model that synthesizes high-fidelity music under rich, user-controllable conditions (e.g., textual style descriptions, lyrics, and reference audio).

- **Project Page:** [https://heartmula.github.io/](https://heartmula.github.io/)
- **Repository:** [https://github.com/HeartMuLa/heartlib](https://github.com/HeartMuLa/heartlib)
- **Paper:** [HeartMuLa: A Family of Open Sourced Music Foundation Models](https://arxiv.org/abs/2601.10547)
- **Demo:** [https://heartmula.github.io/](https://heartmula.github.io/)

## Model Details

The HeartMuLa framework consists of four major components:
1. **HeartMuLa**: A music language model that generates music conditioned on lyrics and tags with multilingual support.
2. **HeartCodec**: A low-frame-rate (12.5 Hz), high-fidelity music codec tokenizer that captures long-range musical structure.
3. **HeartTranscriptor**: A robust lyric recognition model optimized for real-world music scenarios.
4. **HeartCLAP**: An audio-text alignment model for music descriptions and cross-modal retrieval.
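
To give a sense of what HeartCodec's 12.5 Hz frame rate means for sequence length, the arithmetic can be sketched as below. This is a minimal illustration, not part of heartlib: the helper name `codec_frames` is hypothetical, rounding up to a whole frame is an assumption, and the sketch counts frames only, since the number of tokens per frame is not stated here.

```python
import math

FRAME_RATE_HZ = 12.5  # HeartCodec frame rate stated in the component list


def codec_frames(duration_seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Return the number of codec frames needed to cover an audio clip.

    Hypothetical helper for illustration only; rounding up to a whole
    frame is an assumption about how partial frames are handled.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return math.ceil(duration_seconds * frame_rate_hz)


# A 3-minute song: 180 s * 12.5 frames/s = 2250 frames
print(codec_frames(180))
```

At this rate, each frame spans 80 ms of audio, so even multi-minute songs map to a few thousand frames.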

## Installation

We recommend Python 3.10 for local deployment. Clone the repository and install it locally:

```bash
git clone https://github.com/HeartMuLa/heartlib.git
cd heartlib
pip install -e .
```

## Sample Usage

### Download Checkpoints
First, download the pretrained checkpoints into a `./ckpt` folder as described in the [GitHub README](https://github.com/HeartMuLa/heartlib).

### Inference
To generate music conditioned on lyrics and tags, run the following command:

```bash
python ./examples/run_music_generation.py --model_path=./ckpt --version="3B"
```

By default, this command generates a piece of music based on the lyrics and tags provided in the `./assets` folder. The output will be saved as `./assets/output.mp3`.

**Key Parameters:**
- `--model_path`: Path to the pretrained model checkpoint.
- `--lyrics`: Path to the lyrics file (e.g., `./assets/lyrics.txt`).
- `--tags`: Path to the tags file (e.g., `./assets/tags.txt`).
- `--save_path`: Output audio file path.
- `--version`: The HeartMuLa model version to load (use `3B`).
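
When calling the script from other Python code, it may help to assemble the command with every parameter spelled out. The sketch below is one way to do that, under stated assumptions: `build_generation_command` and `run_generation` are hypothetical helpers that are not part of heartlib, the default values are simply the example paths mentioned above, and only the flag names come from the parameter list.

```python
import subprocess
import sys


def build_generation_command(
    model_path="./ckpt",
    version="3B",
    lyrics="./assets/lyrics.txt",
    tags="./assets/tags.txt",
    save_path="./assets/output.mp3",
):
    """Assemble the argument list for examples/run_music_generation.py.

    Hypothetical helper; defaults mirror the example paths in this README.
    """
    return [
        sys.executable,
        "./examples/run_music_generation.py",
        f"--model_path={model_path}",
        f"--version={version}",
        f"--lyrics={lyrics}",
        f"--tags={tags}",
        f"--save_path={save_path}",
    ]


def run_generation(**kwargs):
    """Run the generation script; call this from the heartlib repository root."""
    subprocess.run(build_generation_command(**kwargs), check=True)


# Print the command instead of running it, so it can be inspected first.
print(" ".join(build_generation_command(save_path="./assets/my_song.mp3")))
```

Passing the arguments as a list avoids shell quoting issues when paths contain spaces.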

## Citation

If you find HeartMuLa useful, please cite:

```bibtex
@misc{yang2026heartmulafamilyopensourced,
      title={HeartMuLa: A Family of Open Sourced Music Foundation Models}, 
      author={Dongchao Yang and Yuxin Xie and Yuguo Yin and Zheyu Wang and Xiaoyu Yi and Gongxi Zhu and Xiaolong Weng and Zihan Xiong and Yingzhe Ma and Dading Cong and Jingliang Liu and Zihang Huang and Jinghan Ru and Rongjie Huang and Haoran Wan and Peixu Wang and Kuoxi Yu and Helin Wang and Liming Liang and Xianwei Zhuang and Yuanyuan Wang and Haohan Guo and Junjie Cao and Zeqian Ju and Songxiang Liu and Yuewen Cao and Heming Weng and Yuexian Zou},
      year={2026},
      eprint={2601.10547},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2601.10547}, 
}
```

## Contact
If you are interested in HeartMuLa, feel free to reach out to us at heartmula.ai@gmail.com.