	---
	language:
	- en
	- zh
	pipeline_tag: text-to-audio
	---
	# SongGeneration

	<p align="center">
	<a href="https://levo-demo.github.io/">Demo</a>  |  <a href="https://arxiv.org/abs/2506.07520">Paper</a>  |  <a href="https://github.com/tencent-ailab/songgeneration">Code</a>  |  <a href="https://huggingface.co/spaces/waytan22/SongGeneration-LeVo">Space Demo</a>
	</p>


	This repository hosts the official weights for LeVo: High-Quality Song Generation with Multi-Preference Alignment. It provides the SongGeneration model, inference scripts, and a checkpoint trained on the Million Song Dataset.

	## Model Versions

	| Model | HuggingFace |
	| :----------------------: | :----------------------------------------------------------: |
	| SongGeneration-base(zh) | <a href="https://huggingface.co/tencent/SongGeneration/tree/main/ckpt/songgeneration_base_zh">v20250520</a> |
	| SongGeneration-base(zh&en) | Coming soon |
	| SongGeneration-full(zh&en) | Coming soon |
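
The released checkpoint lives in a subfolder of this repository, so it can be fetched without cloning everything. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the repository id and subfolder are taken from the table link above, while the helper names and the `./SongGeneration` target directory are hypothetical choices for illustration. It is not the project's official download procedure; see the Code link for that.

```python
from pathlib import Path
from typing import List

# Both values come from the checkpoint link in the table above.
REPO_ID = "tencent/SongGeneration"
CKPT_SUBDIR = "ckpt/songgeneration_base_zh"


def checkpoint_patterns(subdir: str = CKPT_SUBDIR) -> List[str]:
    """Return glob patterns that select only files under the checkpoint folder."""
    return [f"{subdir.strip('/')}/*"]


def download_checkpoint(local_dir: str = "./SongGeneration") -> Path:
    """Download only the checkpoint subfolder (hypothetical helper).

    `local_dir` is an assumed location; any writable directory works.
    """
    # Imported lazily so checkpoint_patterns() is usable without the package.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=checkpoint_patterns(),
        local_dir=local_dir,
    )
    return Path(root) / CKPT_SUBDIR
```

Calling `download_checkpoint()` should leave the checkpoint files under `./SongGeneration/ckpt/songgeneration_base_zh`. Restricting the download with `allow_patterns` avoids pulling files that a given inference setup may not need.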

	## Overview

	SongGeneration is an LM-based framework consisting of LeLM and a music codec. LeLM models two types of tokens in parallel: mixed tokens, which represent the combined audio of vocals and accompaniment to achieve vocal-instrument harmony, and dual-track tokens, which separately encode vocals and accompaniment for high-quality song generation. The music codec reconstructs the dual-track tokens into high-fidelity music audio. SongGeneration significantly improves over open-source music generation models and performs competitively with current state-of-the-art industry systems. For more details, please refer to our [paper](https://arxiv.org/abs/2506.07520).

	<img src="https://github.com/tencent-ailab/songgeneration/blob/main/img/over.jpg?raw=true" alt="img" style="zoom:100%;" />

	## License

	The code and weights in this repository are released under the terms in the [LICENSE](LICENSE) file.