
Dataset Preparation Commands

Overview

This document lists the commands for preparing the EMG datasets used for pretraining and downstream tasks. Each dataset preparation script reads the raw data, segments it into sliding windows (overlapping whenever the stride is smaller than the window size), and saves the result in HDF5 format for efficient loading during model training.
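The windowing step described above can be illustrated with a short, dependency-free sketch. It is not taken from the scripts themselves: the function names `window_starts` and `segment` are hypothetical, and it assumes that a trailing partial window is dropped, which the actual scripts may handle differently.

```python
def window_starts(n_samples, window_size, stride):
    """Start indices of every full window (assumption: partial tail is dropped)."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    return list(range(0, n_samples - window_size + 1, stride))


def segment(signal, window_size, stride):
    """Cut a 1-D sequence into windows of length window_size, stride samples apart."""
    return [
        signal[start:start + window_size]
        for start in window_starts(len(signal), window_size, stride)
    ]


# Toy signal of 2500 samples, using the pretraining settings from this document.
signal = list(range(2500))
windows = segment(signal, window_size=1000, stride=500)
print(len(windows))                 # number of full windows
print([w[0] for w in windows])      # first sample index of each window
```

With a stride smaller than the window size, consecutive windows share samples; with a stride equal to the window size they do not.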

Add the --download_data flag if the dataset has not been downloaded yet.

Set the $DATA_PATH environment variable to the directory where the datasets should be stored, or replace it in the commands with your own path.

The libraries required to run the scripts are listed in the requirements.txt file.

Pretraining Datasets

Commands for the pretraining datasets:

emg2pose

python scripts/emg2pose.py \
    --data_dir $DATA_PATH/datasets/emg2pose_data/ \
    --save_dir $DATA_PATH/datasets/emg2pose_data/h5/ \
    --window_size 1000 \
    --stride 500

Ninapro DB6

python scripts/db6.py \
    --data_dir $DATA_PATH/datasets/ninapro/DB6/ \
    --save_dir $DATA_PATH/datasets/ninapro/DB6/h5/ \
    --window_size 1000 \
    --stride 500

Ninapro DB7

python scripts/db7.py \
    --data_dir $DATA_PATH/datasets/ninapro/DB7/ \
    --save_dir $DATA_PATH/datasets/ninapro/DB7/h5/ \
    --window_size 1000 \
    --stride 500
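The three pretraining commands share the same window size and stride and differ only in the script and dataset directory, so they can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command` and the `PRETRAINING` table are hypothetical helpers, the fallback path `/path/to/data` is a placeholder, and the loop only prints the commands instead of running them.

```python
import os
import shlex

# Script -> dataset directory relative to $DATA_PATH/datasets/ (taken from the commands above).
PRETRAINING = {
    "scripts/emg2pose.py": "emg2pose_data",
    "scripts/db6.py": "ninapro/DB6",
    "scripts/db7.py": "ninapro/DB7",
}


def build_command(script, rel_dir, data_path, window_size=1000, stride=500):
    """Return the argument list for one preparation command (hypothetical helper)."""
    data_dir = f"{data_path}/datasets/{rel_dir}/"
    return [
        "python", script,
        "--data_dir", data_dir,
        "--save_dir", f"{data_dir}h5/",
        "--window_size", str(window_size),
        "--stride", str(stride),
    ]


# Dry run: print each command; the placeholder is used only if DATA_PATH is unset.
data_path = os.environ.get("DATA_PATH", "/path/to/data")
for script, rel_dir in PRETRAINING.items():
    print(shlex.join(build_command(script, rel_dir, data_path)))
```

To execute a command rather than print it, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.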

Downstream Datasets

Commands for the downstream datasets:

Ninapro DB5 (200 ms, stride 25% of the window, i.e. 75% overlap)

python scripts/db5.py \
    --data_dir $DATA_PATH/datasets/ninapro/DB5/ \
    --save_dir $DATA_PATH/datasets/ninapro/DB5/h5/ \
    --window_size 200 \
    --stride 50

Ninapro DB5 (1000 ms, stride 25% of the window, i.e. 75% overlap)

python scripts/db5.py \
    --data_dir $DATA_PATH/datasets/ninapro/DB5/ \
    --save_dir $DATA_PATH/datasets/ninapro/DB5/h5/ \
    --window_size 1000 \
    --stride 250

EMG-EPN612 (200 ms)

python scripts/epn.py \
    --data_dir $DATA_PATH/datasets/EPN612/ \
    --source_training $DATA_PATH/datasets/EPN612/trainingJSON/ \
    --source_testing $DATA_PATH/datasets/EPN612/testingJSON/ \
    --dest_dir $DATA_PATH/datasets/EPN612/h5/ \
    --window_size 200

EMG-EPN612 (1000 ms)

python scripts/epn.py \
    --data_dir $DATA_PATH/datasets/EPN612/ \
    --source_training $DATA_PATH/datasets/EPN612/trainingJSON/ \
    --source_testing $DATA_PATH/datasets/EPN612/testingJSON/ \
    --dest_dir $DATA_PATH/datasets/EPN612/h5/ \
    --window_size 1000

UCI EMG (200 ms, stride 25% of the window, i.e. 75% overlap)

python scripts/uci.py \
    --data_dir $DATA_PATH/datasets/UCI_EMG/EMG_data_for_gestures-master/ \
    --save_dir $DATA_PATH/datasets/UCI_EMG/EMG_data_for_gestures-master/h5/ \
    --window_size 200 \
    --stride 50

UCI EMG (1000 ms, stride 25% of the window, i.e. 75% overlap)

python scripts/uci.py \
    --data_dir $DATA_PATH/datasets/UCI_EMG/EMG_data_for_gestures-master/ \
    --save_dir $DATA_PATH/datasets/UCI_EMG/EMG_data_for_gestures-master/h5/ \
    --window_size 1000 \
    --stride 250

Ninapro DB8 (200 ms, no overlap)

python scripts/db8.py \
    --data_dir $DATA_PATH/datasets/ninapro/DB8/ \
    --save_dir $DATA_PATH/datasets/ninapro/DB8/h5/ \
    --window_size 200 \
    --stride 200

Ninapro DB8 (1000 ms, no overlap)

python scripts/db8.py \
    --data_dir $DATA_PATH/datasets/ninapro/DB8/ \
    --save_dir $DATA_PATH/datasets/ninapro/DB8/h5/ \
    --window_size 1000 \
    --stride 1000
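As a sanity check on the --window_size and --stride pairs used in this document, the overlap between consecutive windows can be computed directly. This is a sketch under two assumptions that are not confirmed against the scripts: both flags are expressed in the same unit, and overlap is defined in the usual sliding-window sense as 1 - stride / window_size. The function name `overlap_fraction` is hypothetical.

```python
def overlap_fraction(window_size, stride):
    """Fraction of a window shared with the next one (assumed definition)."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    # A stride larger than the window leaves a gap, which is reported as zero overlap.
    return max(0.0, 1.0 - stride / window_size)


# (window_size, stride) pairs that appear in the commands of this document.
settings = [(1000, 500), (200, 50), (1000, 250), (200, 200), (1000, 1000)]
for window_size, stride in settings:
    frac = overlap_fraction(window_size, stride)
    print(f"window_size={window_size:>4}  stride={stride:>4}  overlap={frac:.0%}")
```

Under this definition, a stride equal to 25% of the window (for example --window_size 200 --stride 50) means consecutive windows share 75% of their samples, while a stride equal to the window size gives no overlap.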