Training

#2
by Politrees - opened

Hi!
I didn't find the Issue tab in your GitHub repository, so I'll ask here.

Is there a guide on how to start training: how exactly to train, and which dataset is needed? I would also like to try training it myself.

You will need to prepare a large training dataset consisting of:

  • WAV files containing audio data
  • PV files containing MIDI pitch values at a 10 ms frame rate, which corresponds to a hop length of 160 samples at 16 kHz
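As a minimal sketch of that frame/hop relationship, assuming (not confirmed in the thread) that each `.pv` file is plain text with one MIDI pitch value per 10 ms frame, MIR-1K-style:

```python
# Sketch: check that a wav/pv pair lines up at a 10 ms frame rate.
# Assumption: each .pv file holds one MIDI pitch value per line (per frame).
import wave

SAMPLE_RATE = 16000
HOP_LENGTH = 160  # 16000 samples/s * 0.010 s = 160 samples per frame

def n_frames_in_wav(path):
    """Number of 10 ms analysis frames in a 16 kHz wav file."""
    with wave.open(path, "rb") as w:
        assert w.getframerate() == SAMPLE_RATE
        return w.getnframes() // HOP_LENGTH

def read_pv(path):
    """Read frame-level MIDI pitch values from a plain-text .pv file."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]
```

Comparing `n_frames_in_wav(x.wav)` against `len(read_pv(x.pv))` is a quick way to catch misaligned pairs before training.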

You can use the MIR-1K dataset provided here:
https://huggingface.co/datasets/AnhP/Mir-1k-use-DJCM-training/resolve/main/dataset-10ms.zip
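Downloading and extracting the archive can be sketched with the standard library (the URL is the one linked above; file names are otherwise up to you):

```python
# Sketch: download and extract the 10 ms dataset archive linked above.
import urllib.request
import zipfile

URL = ("https://huggingface.co/datasets/AnhP/"
       "Mir-1k-use-DJCM-training/resolve/main/dataset-10ms.zip")

def fetch_dataset(url=URL, out="dataset-10ms.zip", dest="dataset-10ms"):
    """Download the zip archive to `out`, then extract it into `dest`."""
    urllib.request.urlretrieve(url, out)
    with zipfile.ZipFile(out) as zf:
        zf.extractall(dest)
```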

After downloading and extracting the dataset:
In the code, replace the default dataset path ("hybrid") with the path to your own dataset.

Run train.py to start training.
If you want to experiment, you can modify the default training parameters to suit your needs.
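Before launching `train.py`, it can help to sanity-check the extracted folder. This is a hypothetical sketch (the actual layout and the variable holding the "hybrid" path inside `train.py` may differ); it assumes each wav/pv pair shares a basename:

```python
# Sketch: verify every .wav has a matching .pv (and vice versa)
# before pointing train.py at this directory.
from pathlib import Path

DATASET_DIR = Path("dataset-10ms")  # replaces the default "hybrid" path

def unpaired_files(root):
    """Return basenames that are missing either their .wav or .pv file."""
    wavs = {p.stem for p in Path(root).rglob("*.wav")}
    pvs = {p.stem for p in Path(root).rglob("*.pv")}
    return wavs ^ pvs  # symmetric difference: names without a counterpart

if __name__ == "__main__":
    missing = unpaired_files(DATASET_DIR)
    if missing:
        print(f"{len(missing)} files lack a wav/pv counterpart")
```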



Thanks! Is there a benchmark code that you used to create your comparison tables?

https://github.com/lars76/pitch-benchmark

If it's not a secret, did you train on this same dataset or a different one, and approximately how many hours of data are there?
No matter how much I try, my results are far from what you achieved. I even added 2 hours of my own data, and it got better, but it's still far off.


If you want, you can try a mix of
Mir-1K + PTDB-TUG + Vocadito
or
Mir-1K + PTDB-TUG + 3,903 files from M4Singer, with pitch for half of the files extracted using PM and the other half using RMVPE, like I did.
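The half-and-half M4Singer split described above can be sketched as follows. `assign_extractors` is a hypothetical helper, not part of any real API; whichever PM and RMVPE extraction calls you actually use would then be run over the two halves:

```python
# Sketch: split a file list in half, one half per pitch-extraction method,
# mirroring the half-PM / half-RMVPE split described above.
import random

def assign_extractors(files, seed=0):
    """Return (pm_files, rmvpe_files): a reproducible 50/50 split."""
    files = sorted(files)
    random.Random(seed).shuffle(files)  # seeded, so the split is repeatable
    half = len(files) // 2
    return files[:half], files[half:]
```

Seeding the shuffle keeps the split reproducible across runs, so regenerated labels always land on the same files.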

Note: you may not reproduce these results exactly, as I continuously adjusted the parameters during training.
