KLM Trainer 0.4.18

by SeoulStreamingStation - opened Apr 25

Owner Apr 25

About Dataset Maker:

Dataset Maker provides the following features.
When you enter the URL of the folder where your dataset is stored, Dataset Maker searches through the data, finds the cleanest files suitable for training, and organizes them into a new folder.
Please note that the main purpose of Dataset Maker is not to restore heavily damaged or unusable data, but to filter and select data that is suitable for training.
If the mel data is too messy or unsuitable for training due to severe harmonics, noise, or other issues, Dataset Maker acts as a filter and removes those files from the dataset.

Quantity / Quality Balance
This option selects the filter level your dataset must pass.
The closer the value is to 0, the more data is preserved, even if its quality is lower.
In that case, however, you should not expect high model quality.
The closer the value is to 5, the stricter the inspection and filtering process becomes. Overlapping or redundant data will be removed as much as possible.
This allows training to be performed with cleaner and more diverse data, helping the model avoid overfitting and improving its generalization ability.
If no data passes the filter even with the slider set to 0, a warning appears: “No suitable data could be found.”
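To make the slider behavior concrete, here is a minimal sketch of a level-based filter. It is not Dataset Maker's actual implementation: the `Clip` type, the `quality` score, and the linear mapping in `threshold_for` are all assumptions made for illustration, and the removal of redundant data at higher levels is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    quality: float  # hypothetical score in [0, 1]; higher means cleaner


def threshold_for(level: int) -> float:
    """Map the 0-5 slider to a minimum quality score (assumed linear mapping)."""
    if not 0 <= level <= 5:
        raise ValueError("level must be between 0 and 5")
    return 0.2 + 0.15 * level


def select_clips(clips, level):
    """Keep clips at or above the cutoff; fail loudly if nothing passes."""
    cutoff = threshold_for(level)
    kept = [c for c in clips if c.quality >= cutoff]
    if not kept:
        # Mirrors the warning text quoted above.
        raise LookupError("No suitable data could be found.")
    return kept


clips = [Clip("a.wav", 0.10), Clip("b.wav", 0.50), Clip("c.wav", 0.99)]
lenient = select_clips(clips, 0)  # low cutoff keeps more files
strict = select_clips(clips, 5)   # high cutoff keeps only the cleanest
```

A lower level trades quality for quantity, which is the balance the slider name refers to.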

Silence Preservation
This controls how much of the silence between words is preserved.
If you add separate mute files during training, this setting is less important.
However, to preserve longer contextual data, it is recommended to set this value between 1 and 3.
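One way to picture this setting is as a cap on how long a run of silence may be. The sketch below works on a list of per-frame energy values. The function name, the `silence_level` cutoff, and the idea that the setting maps to a maximum number of silent frames are assumptions, not the tool's documented behavior.

```python
def cap_silence(frames, max_silent_run, silence_level=0.01):
    """Shorten each run of silent frames to at most max_silent_run frames.

    frames: per-frame energy values (hypothetical representation).
    """
    kept = []
    run = 0  # length of the current silent run
    for energy in frames:
        if energy < silence_level:
            run += 1
            if run > max_silent_run:
                continue  # drop silence beyond the allowed length
        else:
            run = 0
        kept.append(energy)
    return kept


frames = [0.5, 0.0, 0.0, 0.0, 0.0, 0.6]
short_gap = cap_silence(frames, 2)  # keeps two silent frames between words
no_gap = cap_silence(frames, 0)     # removes the silence entirely
```

A higher cap keeps longer pauses, which is what preserves more context within a clip.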

Target Vocal Ratio
If your dataset contains both speech and vocal data mixed together, you can adjust the ratio between vocal and speech data.
Since it is difficult to clearly distinguish them using spectral data alone, files that include words such as “vocal” in their filenames will also be considered vocal data.
If your dataset contains only speech data or only vocal data, it is recommended to set this value to 0.
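The filename rule and the ratio can be sketched as follows. This is an illustration only: it treats the setting as a fraction between 0 and 1, uses "vocal" as the single keyword, and skips the spectral check entirely. The function names are hypothetical.

```python
def is_vocal(filename, keywords=("vocal",)):
    """Treat a file as vocal data if its name contains a keyword."""
    lowered = filename.lower()
    return any(k in lowered for k in keywords)


def balance(files, target_vocal_ratio):
    """Return the largest subset whose vocal share matches the target ratio.

    A ratio of 0 (or a dataset with only one kind of data) disables balancing.
    """
    vocal = [f for f in files if is_vocal(f)]
    speech = [f for f in files if not is_vocal(f)]
    r = target_vocal_ratio
    if not 0 < r < 1 or not vocal or not speech:
        return list(files)
    # Try keeping every vocal file and trimming speech to fit the ratio.
    speech_needed = int(len(vocal) * (1 - r) / r)
    if speech_needed <= len(speech):
        return vocal + speech[:speech_needed]
    # Otherwise keep every speech file and trim vocal instead.
    vocal_needed = int(len(speech) * r / (1 - r))
    return vocal[:vocal_needed] + speech


files = ["song_vocal_01.wav", "Vocal_02.wav"] + [f"talk_{i}.wav" for i in range(6)]
balanced = balance(files, 0.5)   # 2 vocal + 2 speech
untouched = balance(files, 0)    # ratio 0 leaves the dataset as-is
```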

Auto Clean Up
If noise, instrumental bleed, or full AR is detected while organizing the dataset, Auto Clean Up will automatically use STEM separation and component separation to extract and use only the vocal data or voice-related parts.
After cleanup is complete, the remaining data is passed through the filter once again.
Only the data that passes this second inspection will be preserved.
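The clean-then-refilter flow described above can be sketched like this. The three callables (`needs_cleanup`, `clean`, `passes_filter`) are hypothetical stand-ins for the detection, separation, and filter steps. They are passed in as parameters because the real implementations are not described here.

```python
def organize(clips, needs_cleanup, clean, passes_filter):
    """Clean contaminated clips, then keep only those that pass the filter."""
    kept = []
    for clip in clips:
        if needs_cleanup(clip):
            clip = clean(clip)  # e.g. separation that extracts the voice
        # Cleaned clips are inspected again; cleanup alone does not guarantee a pass.
        if passes_filter(clip):
            kept.append(clip)
    return kept


# Toy example: each clip is a dict with an assumed "noise" value.
clips = [
    {"name": "clean.wav", "noise": 0.1},
    {"name": "fixable.wav", "noise": 0.4},
    {"name": "hopeless.wav", "noise": 0.9},
]
result = organize(
    clips,
    needs_cleanup=lambda c: c["noise"] > 0.3,
    clean=lambda c: {**c, "noise": c["noise"] / 2},
    passes_filter=lambda c: c["noise"] < 0.3,
)
```

In this toy run, the second clip is rescued by cleanup, while the third is cleaned but still fails the second inspection and is dropped.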

Multi-Speaker Support
If the selected folder contains sequentially numbered folders, such as [0], [1], [2], filtering will be performed while preserving each speaker separately.
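As a rough sketch of how numbered speaker folders might be recognized, the function below accepts folder names such as `[0]` or `0` and returns the speaker IDs only when they form an unbroken sequence. Accepting both bracketed and bare names, and requiring consecutive numbers, are assumptions. The function name is hypothetical.

```python
import re

# Matches "[12]" or "12"; which form the tool expects is an assumption.
_SPEAKER_DIR = re.compile(r"^(?:\[(\d+)\]|(\d+))$")


def speaker_ids(folder_names):
    """Return sorted speaker IDs if the numbered folders are sequential, else []."""
    ids = []
    for name in folder_names:
        match = _SPEAKER_DIR.match(name)
        if match:
            ids.append(int(match.group(1) or match.group(2)))
    ids.sort()
    # Sequential means consecutive integers with no gaps or duplicates.
    if ids and ids == list(range(ids[0], ids[0] + len(ids))):
        return ids
    return []


found = speaker_ids(["[0]", "[1]", "[2]", "notes"])
gap = speaker_ids(["[0]", "[2]"])
```

Each returned ID would then be filtered on its own, so that one speaker's clips are never merged with another's.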
