Training Framework

The training code uses ms-swift, a scalable lightweight infrastructure for fine-tuning large language models.

Model Configuration

`MODEL_PATH` Parameter

The MODEL_PATH variable in train.sh should point to the base model. Download the model from Hugging Face:

```
# Download the model with huggingface-cli (installed with the huggingface_hub package)
huggingface-cli download bolshyC/qwen3-0.6B-music --local-dir ./qwen3-0.6B-music
```

Then modify MODEL_PATH in train.sh to point to the local path:

```
MODEL_PATH="./qwen3-0.6B-music"  # or absolute path
```
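
Before launching a run, it can help to confirm that MODEL_PATH really points at a downloaded checkpoint. The sketch below is a hypothetical pre-flight check and is not part of train.sh. It assumes the checkpoint follows the usual Hugging Face layout with a `config.json` at the top level; the function name `check_model_dir` is illustrative.

```python
import json
from pathlib import Path


def check_model_dir(model_path: str) -> dict:
    """Return the parsed config.json of a local checkpoint, or raise if the path looks wrong."""
    root = Path(model_path).expanduser().resolve()
    if not root.is_dir():
        raise FileNotFoundError(f"MODEL_PATH is not a directory: {root}")
    # Assumption: standard Hugging Face checkpoint layout with config.json at the top level.
    config_file = root / "config.json"
    if not config_file.is_file():
        raise FileNotFoundError(f"No config.json in {root}; did the download finish?")
    with config_file.open(encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    try:
        config = check_model_dir("./qwen3-0.6B-music")
    except FileNotFoundError as err:
        print(f"Pre-flight check failed: {err}")
    else:
        print("Found checkpoint with model_type:", config.get("model_type", "unknown"))
```

If the check fails, re-run the download command above or correct the path before starting training.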

Dataset Configuration

`--dataset` Parameter

Note: As shipped, train.sh points at train_demo.jsonl, a small file intended for demonstration only. For actual training, switch to the full dataset.

Actual Training Data

For actual training, use the following two files from the bolshyC/Muse_train dataset on Hugging Face:

- train_cn.jsonl: Chinese training data
- train_en.jsonl: English training data

Usage

Download the dataset from Hugging Face:

```
# --repo-type dataset tells the CLI to look in dataset repositories rather than model repositories
huggingface-cli download bolshyC/Muse_train train_cn.jsonl --repo-type dataset --local-dir ./data
huggingface-cli download bolshyC/Muse_train train_en.jsonl --repo-type dataset --local-dir ./data
```

Modify the --dataset parameter in train.sh:

```
# If using Chinese data only
--dataset 'data/train_cn.jsonl'

# If using both Chinese and English data (comma-separated, no spaces)
--dataset 'data/train_cn.jsonl,data/train_en.jsonl'
```

Note: In ms-swift, multiple dataset files should be comma-separated without spaces.
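
When the list of files is assembled by a script, a small helper can guard against stray spaces. The following is a minimal sketch that follows the comma-separated convention stated above. The function `build_dataset_arg` is a hypothetical helper, not part of ms-swift, and rejecting paths that contain spaces or commas is a conservative choice made here rather than a documented ms-swift rule.

```python
from pathlib import Path
from typing import Iterable


def build_dataset_arg(paths: Iterable[str], must_exist: bool = True) -> str:
    """Join dataset files into one comma-separated value with no spaces."""
    cleaned = []
    for raw in paths:
        path = raw.strip()
        if not path:
            continue
        # Conservative choice: a comma or space inside a path would make the joined value ambiguous.
        if "," in path or " " in path:
            raise ValueError(f"Path would break the comma-separated format: {path!r}")
        if must_exist and not Path(path).is_file():
            raise FileNotFoundError(f"Dataset file not found: {path}")
        cleaned.append(path)
    if not cleaned:
        raise ValueError("At least one dataset file is required")
    return ",".join(cleaned)


if __name__ == "__main__":
    value = build_dataset_arg(
        ["data/train_cn.jsonl", "data/train_en.jsonl"], must_exist=False
    )
    print(f"--dataset '{value}'")  # prints: --dataset 'data/train_cn.jsonl,data/train_en.jsonl'
```

With `must_exist=True` (the default), the helper also catches a missing download before the training job starts.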

Building Custom Training Data

If you want to build your own training dataset, you need to encode audio files into discrete tokens using MuCodec.

Audio Encoding

Use train/encode_audio.py to encode audio files into discrete tokens:

1. Prepare the input data file. Create a JSONL file in which each line is a JSON object holding the path to one audio file:
   ```
   {"path": "path/to/audio1.wav"}
   {"path": "path/to/audio2.mp3"}
   ```
2. Modify the paths in encode_audio.py:
   - Set DATA_PATH to your input JSONL file path
   - Set SAVE_DIR to the directory where encoded tokens will be saved
3. Run the encoding:
   ```
   python train/encode_audio.py
   ```

The script will:

- Load audio files from the paths specified in the JSONL file
- Encode each audio file into discrete tokens using MuCodec
- Save the encoded tokens as .pt files in the SAVE_DIR directory
- Skip files that have already been encoded
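
The skip behaviour in the last item can be illustrated with a short sketch that lists which entries still lack an output file. This is not the code in encode_audio.py. In particular, it assumes each output is named after the audio file's stem (for example `song.wav` becomes `song.pt`), which may differ from the script's real naming, and `pending_entries` and the example paths are hypothetical.

```python
import json
from pathlib import Path
from typing import List


def pending_entries(data_path: str, save_dir: str) -> List[str]:
    """List audio paths from the JSONL file that have no matching .pt file in save_dir."""
    pending = []
    with open(data_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            audio_path = json.loads(line)["path"]
            # Assumption: output file name is "<audio stem>.pt"; check encode_audio.py for the real rule.
            target = Path(save_dir) / (Path(audio_path).stem + ".pt")
            if not target.exists():
                pending.append(audio_path)
    return pending


if __name__ == "__main__":
    data_path, save_dir = "audio_list.jsonl", "encoded_tokens"  # hypothetical paths
    if Path(data_path).is_file():
        todo = pending_entries(data_path, save_dir)
        print(f"{len(todo)} file(s) still to encode")
```

Because finished files are skipped, an interrupted encoding run can simply be started again.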

Note: Audio files should be in WAV or MP3 format. They are automatically resampled to 48 kHz if needed.
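
For a large collection, the input JSONL file does not have to be written by hand. The sketch below scans a directory for WAV and MP3 files and writes one `{"path": ...}` object per line, matching the input format shown earlier. The function `write_audio_manifest` and the directory and file names are hypothetical and not part of the repository.

```python
import json
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".mp3"}  # the two formats named in the note above


def write_audio_manifest(audio_dir: str, manifest_path: str) -> int:
    """Write one {"path": ...} JSON object per audio file under audio_dir; return the count."""
    files = sorted(
        p for p in Path(audio_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )
    with open(manifest_path, "w", encoding="utf-8") as out:
        for p in files:
            # ensure_ascii=False keeps non-ASCII file names readable in the manifest
            out.write(json.dumps({"path": str(p)}, ensure_ascii=False) + "\n")
    return len(files)


if __name__ == "__main__":
    source_dir = Path("my_audio")  # hypothetical directory of audio files
    if source_dir.is_dir():
        count = write_audio_manifest(str(source_dir), "audio_list.jsonl")
        print(f"Wrote {count} entries to audio_list.jsonl")
```

The resulting file can then be used as the DATA_PATH value in encode_audio.py.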

Training Performance

Training Time

On 8× H200 GPUs, training one epoch takes approximately 150 minutes.
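
To budget a longer run, the figure above can be converted into GPU-hours. This is simple arithmetic, sketched under the assumption that every epoch takes the same time; the three-epoch case is a hypothetical example, not a recommended setting.

```python
def gpu_hours(num_gpus: int, minutes_per_epoch: float, epochs: int = 1) -> float:
    """GPU-hours = GPUs x wall-clock hours per epoch x epochs (assumes equal-cost epochs)."""
    return num_gpus * (minutes_per_epoch / 60.0) * epochs


if __name__ == "__main__":
    print(gpu_hours(8, 150))            # one epoch on 8 GPUs: 20.0 GPU-hours
    print(gpu_hours(8, 150, epochs=3))  # hypothetical three-epoch run: 60.0 GPU-hours
```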