# YuE Finetuning Guide

This guide walks you through finetuning the YuE model on your own data.

## Table of Contents

1. [Data Preparation](#step-1-data-preparation)
2. [Training Data Configuration](#step-2-training-data-configuration)
3. [Model Finetuning](#step-3-model-finetuning)

## Requirements

- Python 3.10 is recommended
- PyTorch 2.4 is recommended
- CUDA 12.1+ is recommended

```bash
git clone https://github.com/multimodal-art-projection/YuE.git
cd YuE/finetune/
conda create -n yue-ft python=3.10
conda activate yue-ft
pip install -r requirements.txt
```
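To verify that your environment matches these recommendations:

```python
import torch

print(torch.__version__)          # expect 2.4.x
print(torch.version.cuda)         # expect 12.1 or newer
print(torch.cuda.is_available())  # should be True for GPU training
```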
## Step 1: Data Preparation

### Required Data Structure

Your data should be organized in the following structure:

```
example/
├── jsonl/   # Source JSONL files
├── mmap/    # Generated Megatron binary files
└── npy/     # Discrete audio codes (numpy arrays) from xcodec
```
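Before preprocessing, it can help to sanity-check the codec files. A minimal sketch, assuming the `.npy` files hold integer code arrays produced by xcodec (the exact shape depends on your codec settings):

```python
import numpy as np

# Inspect one of the xcodec outputs.
codes = np.load("example/npy/dummy.npy")
print(codes.dtype)  # discrete codes, so an integer dtype is expected
print(codes.shape)  # layout depends on xcodec settings (e.g., codebooks x frames)
```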
### JSONL File Format

Each JSONL file should contain one entry per line in the following format (pretty-printed here for readability; the `//` comments are annotations, not part of the file):

```json
{
  "id": "1",
  "codec": "example/npy/dummy.npy",                           // Raw audio codes
  "vocals_codec": "example/npy/dummy.Vocals.npy",             // Vocal track codes
  "instrumental_codec": "example/npy/dummy.Instrumental.npy", // Instrumental track codes
  "audio_length_in_sec": 85.16,                               // Audio duration in seconds
  "msa": [                                                    // Music Structure Analysis
    {
      "start": 0,
      "end": 13.93,
      "label": "intro"
    }
  ],
  "genres": "male, youth, powerful, charismatic, rock, punk", // Tags for gender, age, genre, mood, timbre
  "splitted_lyrics": {
    "segmented_lyrics": [
      {
        "offset": 0,
        "duration": 13.93,
        "codec_frame_start": 0,
        "codec_frame_end": 696,
        "line_content": "[intro]\n\n"
      }
    ]
  }
}
```
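If you generate these files programmatically, a minimal sketch of writing one entry (field values taken from the example above):

```python
import json

entry = {
    "id": "1",
    "codec": "example/npy/dummy.npy",
    "vocals_codec": "example/npy/dummy.Vocals.npy",
    "instrumental_codec": "example/npy/dummy.Instrumental.npy",
    "audio_length_in_sec": 85.16,
    "msa": [{"start": 0, "end": 13.93, "label": "intro"}],
    "genres": "male, youth, powerful, charismatic, rock, punk",
    "splitted_lyrics": {
        "segmented_lyrics": [
            {
                "offset": 0,
                "duration": 13.93,
                "codec_frame_start": 0,
                "codec_frame_end": 696,
                "line_content": "[intro]\n\n",
            }
        ]
    },
}

# One compact JSON object per line, as the JSONL format requires.
with open("example/jsonl/dummy.jsonl", "w") as f:
    f.write(json.dumps(entry) + "\n")
```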
### Converting to Megatron Binary Format

1. Navigate to the finetune directory:

   ```bash
   cd finetune/
   ```

2. Run the preprocessing script:

   ```bash
   # TOKENIZER_MODEL: path to your tokenizer model
   # (see TOKENIZER_MODEL_PATH in Step 3)

   # For Chain-of-Thought (CoT) dataset
   bash scripts/preprocess_data.sh dummy cot $TOKENIZER_MODEL

   # For In-Context Learning (ICL) dataset
   bash scripts/preprocess_data.sh dummy icl_cot $TOKENIZER_MODEL
   ```

> **Note**: For music structure analysis and track separation, refer to [openl2s](https://github.com/a43992899/openl2s).
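After preprocessing, you can confirm that the binaries landed in `example/mmap/`. A minimal sketch; Megatron-style datasets are typically stored as paired `.bin`/`.idx` files, though the exact filenames depend on the preprocessing script:

```python
from pathlib import Path

# List whatever the preprocessing step produced.
for path in sorted(Path("example/mmap").iterdir()):
    print(path.name, path.stat().st_size, "bytes")
```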
## Step 2: Training Data Configuration

### Counting Dataset Tokens

1. Navigate to the finetune directory:

   ```bash
   cd finetune/
   ```

2. Run the token counting script:

   ```bash
   bash scripts/count_tokens.sh ./example/mmap/
   ```

The results will be saved in `finetune/count_token_logs/`. This process may take several minutes for large datasets.
### Configuring Data Mixture

1. Create a configuration file (e.g., `finetune/example/dummy_data_mixture_cfg.yml`) with the following parameters (see the sketch after this list):

   - `TOKEN_COUNT_LOG_DIR`: Directory containing token count logs
   - `GLOBAL_BATCH_SIZE`: Total batch size for training
   - `SEQ_LEN`: Maximum context window size
   - `{NUM}_ROUND`: Number of times to repeat each dataset

2. Generate training parameters:

   ```bash
   cd finetune/
   python core/parse_mixture.py -c example/dummy_data_mixture_cfg.yml
   ```

The script will output:

- `DATA_PATH`: Paths to your training data (copy this into the training script)
- `TRAIN_ITERS`: Number of training iterations
- Total token count
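A minimal sketch of such a config, assuming the keys map one-to-one onto the parameters above; the real schema is defined by `core/parse_mixture.py`, so treat the names and values as illustrative:

```yaml
# Illustrative only -- check core/parse_mixture.py for the actual schema.
TOKEN_COUNT_LOG_DIR: ./count_token_logs/
GLOBAL_BATCH_SIZE: 8
SEQ_LEN: 8192
DUMMY_ROUND: 2   # {NUM}_ROUND: repeat the "dummy" dataset twice
```

As a rough sanity check on the output, each iteration consumes about `GLOBAL_BATCH_SIZE * SEQ_LEN` tokens, so `TRAIN_ITERS` should land near the total token count divided by that product.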
## Step 3: Model Finetuning

YuE supports finetuning with LoRA (Low-Rank Adaptation), which significantly reduces memory requirements while maintaining performance.
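For intuition, LoRA freezes the base weights and trains small low-rank update matrices on top of them. A minimal sketch of the idea using Hugging Face PEFT; this is not necessarily how `run_finetune.sh` wires things up internally, and the `target_modules` are an assumption:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# The base model stays frozen; only the low-rank adapters train.
base = AutoModelForCausalLM.from_pretrained("m-a-p/YuE-s1-7B-anneal-en-cot")

lora_cfg = LoraConfig(
    r=64,              # rank of the update matrices (LORA_R)
    lora_alpha=32,     # scaling factor (LORA_ALPHA)
    lora_dropout=0.1,  # dropout on LoRA layers (LORA_DROPOUT)
    target_modules=["q_proj", "v_proj"],  # assumed; depends on the architecture
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # shows how few parameters LoRA trains
```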
### Configuring the Finetuning Script

1. Edit the `scripts/run_finetune.sh` script to configure your finetuning run:

   ```bash
   # Update data paths
   # Accepted formats for DATA_PATH:
   #   1) a single path: "/path/to/data"
   #   2) multiple datasets with weights: "100 /path/to/data1 200 /path/to/data2 ..."
   # You can copy DATA_PATH from the output of core/parse_mixture.py in Step 2
   DATA_PATH="data1-weight /path/to/data1 data2-weight /path/to/data2"
   DATA_CACHE_PATH="/path/to/your/cache"

   # Set comma-separated proportions for the train/val/test split
   DATA_SPLIT="900,50,50"

   # Set model paths
   TOKENIZER_MODEL_PATH="/path/to/tokenizer"
   MODEL_NAME="m-a-p/YuE-s1-7B-anneal-en-cot"  # or your local model path
   MODEL_CACHE_DIR="/path/to/model/cache"
   OUTPUT_DIR="/path/to/save/finetuned/model"

   # Configure LoRA parameters (optional)
   LORA_R=64          # Rank of the LoRA update matrices
   LORA_ALPHA=32      # Scaling factor for the LoRA update
   LORA_DROPOUT=0.1   # Dropout probability for LoRA layers
   ```
2. Adjust training hyperparameters as needed:

   ```bash
   # Training hyperparameters
   PER_DEVICE_TRAIN_BATCH_SIZE=1
   NUM_TRAIN_EPOCHS=10
   ```
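Note that `PER_DEVICE_TRAIN_BATCH_SIZE` is per GPU. A back-of-the-envelope sketch of the effective batch size, assuming the usual multiplication (whether `run_finetune.sh` also applies gradient accumulation is script-dependent, and the GPU count here is hypothetical):

```python
per_device_batch = 1   # PER_DEVICE_TRAIN_BATCH_SIZE
num_gpus = 8           # hypothetical world size
grad_accum_steps = 4   # hypothetical; only if the script supports accumulation

effective_batch = per_device_batch * num_gpus * grad_accum_steps
print(effective_batch)  # 32 samples per optimizer step
```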
### Running the Finetuning Process

```bash
cd finetune/
bash scripts/run_finetune.sh
```

For help with configuring the script:

```bash
bash scripts/run_finetune.sh --help
```
### Monitoring Training

If you've enabled WandB logging (via `USE_WANDB=true`), you can monitor training progress in real time through the WandB dashboard. Make sure you have authenticated first (e.g., with `wandb login`).
### Using the Finetuned Model

After training completes, your model will be saved to the specified `OUTPUT_DIR`. You can use it for inference or further finetuning, as sketched below.
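If the run saves a LoRA adapter rather than a merged checkpoint, loading it for inference might look like this; a minimal sketch assuming a PEFT-style adapter in `OUTPUT_DIR`, not a confirmed description of what `run_finetune.sh` writes out:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model, then attach the finetuned LoRA adapter.
base = AutoModelForCausalLM.from_pretrained("m-a-p/YuE-s1-7B-anneal-en-cot")
model = PeftModel.from_pretrained(base, "/path/to/save/finetuned/model")  # OUTPUT_DIR

# Optionally merge the adapter into the base weights for standalone inference.
model = model.merge_and_unload()
```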