## Data Pre-processing ### Convert from MusicXML - Navigate to the data folder ```cd data/``` - Modify the ```ORI_FOLDER``` and ```DES_FOLDER``` in ```1_batch_xml2abc.py```, then run this script: ``` python 1_batch_xml2abc.py ``` This will conver the MusicXML files into standard ABC notation files. - Modify the ```ORI_FOLDER```, ```INTERLEAVED_FOLDER```, ```AUGMENTED_FOLDER```, and ```EVAL_SPLIT``` in ```2_data_preprocess.py```: ```python ORI_FOLDER = '' # Folder containing standard ABC notation files INTERLEAVED_FOLDER = '' # Output interleaved ABC notation files that are compatible with CLaMP 2 to this folder AUGMENTED_FOLDER = '' # On the basis of interleaved ABC, output key-augmented and rest-omitted files that are compatible with NotaGen to this folder EVAL_SPLIT = 0.1 # Evaluation data ratio ``` then run this script: ``` python 2_data_preprocess.py ``` - The script will convert the standard ABC to interleaved ABC, which is compatible with CLaMP 2. The files will be under ```INTERLEAVED_FOLDER```. - This script will make 15 key signature folders under the ```AUGMENTED_FOLDER```, and output interleaved ABC notation files with rest bars omitted. This is the data representation that NotaGen adopts. - This script will also generate data index files for training NotaGen. It will randomly split train and eval sets according to the proportion ```EVAL_SPLIT``` defines. The index files will be named as ```{AUGMENTED_FOLDER}_train.jsonl``` and ```{AUGMENTED_FOLDER}_eval.jsonl```. ## Data Post-processing ### Preview Sheets in ABC Notation We recommend [EasyABC](https://sourceforge.net/projects/easyabc/), a nice software for ABC Notation previewing, composing and editing. It's needed to add a line "X:1" before each piece to present the score image in EasyABC :D ### Convert to MusicXML - Go to the data folder ```cd data/``` - Modify the ```ORI_FOLDER``` and ```DES_FOLDER``` in ```3_batch_abc2xml.py```, then run this script: ``` python 3_batch_abc2xml.py ``` This will conver the standard/interleaved ABC notation files into MusicXML files.