Commit b4c56ea (verified) by MatteoFasulo · 1 parent: ca8e271

Update scripts/README.md

Files changed (1): scripts/README.md (+133 -129)
@@ -1,129 +1,133 @@
- # Dataset Preparation Commands
-
- ## Overview
-
- This document provides the commands to prepare various EMG datasets for pretraining and downstream tasks. Each dataset preparation script takes in raw data, processes it into overlapping windows, and saves the processed data in HDF5 format for efficient loading during model training.
-
- Remember to add the flag `--download_data` if the dataset is not downloaded yet.
-
- ## Pretraining Datasets
-
- For the pretraining:
-
- ### emg2pose
-
- ```bash
- python scripts/emg2pose.py \
- --data_dir $SCRATCH/datasets/emg2pose_data/ \
- --save_dir $SCRATCH/datasets/emg2pose_data/h5/ \
- --window_size 1000 \
- --stride 500
- ```
-
- ### Ninapro DB6
-
- ```bash
- python scripts/db6.py \
- --data_dir $SCRATCH/datasets/ninapro/DB6/ \
- --save_dir $SCRATCH/datasets/ninapro/DB6/h5/ \
- --window_size 1000 \
- --stride 500
- ```
-
- ### Ninapro DB7
-
- ```bash
- python scripts/db7.py \
- --data_dir $SCRATCH/datasets/ninapro/DB7/ \
- --save_dir $SCRATCH/datasets/ninapro/DB7/h5/ \
- --window_size 1000 \
- --stride 500
- ```
-
- ---
-
- ## Downstream Datasets
-
- For the downstream tasks:
-
- ### Ninapro DB5 (200 ms, 25% overlap)
-
- ```bash
- python scripts/db5.py \
- --data_dir $SCRATCH/datasets/ninapro/DB5/ \
- --save_dir $SCRATCH/datasets/ninapro/DB5/h5/ \
- --window_size 200 \
- --stride 50
- ```
-
- ### Ninapro DB5 (1000 ms, 25% overlap)
-
- ```bash
- python scripts/db5.py \
- --data_dir $SCRATCH/datasets/ninapro/DB5/ \
- --save_dir $SCRATCH/datasets/ninapro/DB5/h5/ \
- --window_size 1000 \
- --stride 250
- ```
-
- ### EMG-EPN612 (200 ms)
-
- ```bash
- python scripts/epn.py \
- --data_dir $SCRATCH/datasets/EPN612/ \
- --source_training $SCRATCH/datasets/EPN612/trainingJSON/ \
- --source_testing $SCRATCH/datasets/EPN612/testingJSON/ \
- --dest_dir $SCRATCH/datasets/EPN612/h5/ \
- --window_size 200
- ```
-
- ### EMG-EPN612 (1000 ms)
-
- ```bash
- python scripts/epn.py \
- --data_dir $SCRATCH/datasets/EPN612/ \
- --source_training $SCRATCH/datasets/EPN612/trainingJSON/ \
- --source_testing $SCRATCH/datasets/EPN612/testingJSON/ \
- --dest_dir $SCRATCH/datasets/EPN612/h5/ \
- --window_size 1000
- ```
-
- ### UCI EMG (200 ms, 25% overlap)
-
- ```bash
- python scripts/uci.py \
- --data_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/ \
- --save_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/h5/ \
- --window_size 200 \
- --stride 50
- ```
-
- ### UCI EMG (1000 ms, 25% overlap)
-
- ```bash
- python scripts/uci.py \
- --data_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/ \
- --save_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/h5/ \
- --window_size 1000 \
- --stride 250
- ```
-
- ### Ninapro DB8 (200 ms, no overlap)
-
- ```bash
- python scripts/db8.py \
- --data_dir $SCRATCH/datasets/ninapro/DB8/ \
- --save_dir $SCRATCH/datasets/ninapro/DB8/h5/ \
- --window_size 200 \
- --stride 200
- ```
-
- ### Ninapro DB8 (1000 ms, no overlap)
-
- ```bash
- python scripts/db8.py \
- --data_dir $SCRATCH/datasets/ninapro/DB8/ \
- --save_dir $SCRATCH/datasets/ninapro/DB8/h5/ \
- --window_size 1000 \
- --stride 1000
- ```
+ # Dataset Preparation Commands
+
+ ## Overview
+
+ This document lists the commands for preparing the EMG datasets used for pretraining and downstream tasks. Each preparation script reads the raw data, splits it into fixed-length windows (overlapping when the stride is smaller than the window size), and saves the result in HDF5 format for efficient loading during model training.
+
+ Add the `--download_data` flag if the dataset has not been downloaded yet.
+
+ Replace `$SCRATCH` with the directory where the datasets should be stored, or set it as an environment variable.
+
+ The scripts require the following libraries: `h5py`, `numpy`, `scipy`, `joblib`, `tqdm`.
+
+ ## Pretraining Datasets
14
+
15
+ For the pretraining:
16
+
17
+ ### emg2pose
18
+
19
+ ```bash
20
+ python scripts/emg2pose.py \
21
+ --data_dir $SCRATCH/datasets/emg2pose_data/ \
22
+ --save_dir $SCRATCH/datasets/emg2pose_data/h5/ \
23
+ --window_size 1000 \
24
+ --stride 500
25
+ ```
26
+
27
+ ### Ninapro DB6
28
+
29
+ ```bash
30
+ python scripts/db6.py \
31
+ --data_dir $SCRATCH/datasets/ninapro/DB6/ \
32
+ --save_dir $SCRATCH/datasets/ninapro/DB6/h5/ \
33
+ --window_size 1000 \
34
+ --stride 500
35
+ ```
36
+
37
+ ### Ninapro DB7
38
+
39
+ ```bash
40
+ python scripts/db7.py \
41
+ --data_dir $SCRATCH/datasets/ninapro/DB7/ \
42
+ --save_dir $SCRATCH/datasets/ninapro/DB7/h5/ \
43
+ --window_size 1000 \
44
+ --stride 500
45
+ ```
46
+
47
+ ---
48
+
49
+ ## Downstream Datasets
50
+
51
+ For the downstream tasks:
52
+
53
+ ### Ninapro DB5 (200 ms, 25% overlap)
54
+
55
+ ```bash
56
+ python scripts/db5.py \
57
+ --data_dir $SCRATCH/datasets/ninapro/DB5/ \
58
+ --save_dir $SCRATCH/datasets/ninapro/DB5/h5/ \
59
+ --window_size 200 \
60
+ --stride 50
61
+ ```
62
+
63
+ ### Ninapro DB5 (1000 ms, 25% overlap)
64
+
65
+ ```bash
66
+ python scripts/db5.py \
67
+ --data_dir $SCRATCH/datasets/ninapro/DB5/ \
68
+ --save_dir $SCRATCH/datasets/ninapro/DB5/h5/ \
69
+ --window_size 1000 \
70
+ --stride 250
71
+ ```
72
+
73
+ ### EMG-EPN612 (200 ms)
74
+
75
+ ```bash
76
+ python scripts/epn.py \
77
+ --data_dir $SCRATCH/datasets/EPN612/ \
78
+ --source_training $SCRATCH/datasets/EPN612/trainingJSON/ \
79
+ --source_testing $SCRATCH/datasets/EPN612/testingJSON/ \
80
+ --dest_dir $SCRATCH/datasets/EPN612/h5/ \
81
+ --window_size 200
82
+ ```
83
+
84
+ ### EMG-EPN612 (1000 ms)
85
+
86
+ ```bash
87
+ python scripts/epn.py \
88
+ --data_dir $SCRATCH/datasets/EPN612/ \
89
+ --source_training $SCRATCH/datasets/EPN612/trainingJSON/ \
90
+ --source_testing $SCRATCH/datasets/EPN612/testingJSON/ \
91
+ --dest_dir $SCRATCH/datasets/EPN612/h5/ \
92
+ --window_size 1000
93
+ ```
94
+
95
+ ### UCI EMG (200 ms, 25% overlap)
96
+
97
+ ```bash
98
+ python scripts/uci.py \
99
+ --data_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/ \
100
+ --save_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/h5/ \
101
+ --window_size 200 \
102
+ --stride 50
103
+ ```
104
+
105
+ ### UCI EMG (1000 ms, 25% overlap)
106
+
107
+ ```bash
108
+ python scripts/uci.py \
109
+ --data_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/ \
110
+ --save_dir $SCRATCH/datasets/UCI_EMG/EMG_data_for_gestures-master/h5/ \
111
+ --window_size 1000 \
112
+ --stride 250
113
+ ```
114
+
115
+ ### Ninapro DB8 (200 ms, no overlap)
116
+
117
+ ```bash
118
+ python scripts/db8.py \
119
+ --data_dir $SCRATCH/datasets/ninapro/DB8/ \
120
+ --save_dir $SCRATCH/datasets/ninapro/DB8/h5/ \
121
+ --window_size 200 \
122
+ --stride 200
123
+ ```
124
+
125
+ ### Ninapro DB8 (1000 ms, no overlap)
126
+
127
+ ```bash
128
+ python scripts/db8.py \
129
+ --data_dir $SCRATCH/datasets/ninapro/DB8/ \
130
+ --save_dir $SCRATCH/datasets/ninapro/DB8/h5/ \
131
+ --window_size 1000 \
132
+ --stride 1000
133
+ ```
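
Note on the `--window_size` / `--stride` pairs used in the commands above: together they determine how many windows a recording yields and how much consecutive windows share. The arithmetic can be sketched in Python as below, assuming the scripts use a standard sliding window measured in samples and drop a trailing partial window. Neither assumption is confirmed by the README, and `window_starts` and `overlap_fraction` are hypothetical helper names, not functions from the repository.

```python
def window_starts(n_samples: int, window_size: int, stride: int) -> list[int]:
    """Start index of every full window in a recording of n_samples samples.

    Assumption: a trailing chunk shorter than window_size is discarded.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    # The last valid start is n_samples - window_size; an empty range means
    # the recording is shorter than one window.
    return list(range(0, n_samples - window_size + 1, stride))


def overlap_fraction(window_size: int, stride: int) -> float:
    """Fraction of a window that is shared with the next window."""
    return max(0.0, (window_size - stride) / window_size)


if __name__ == "__main__":
    # (window_size, stride) pairs taken from the commands in this README.
    for window_size, stride in [(200, 50), (1000, 250), (1000, 500), (200, 200)]:
        n_windows = len(window_starts(10_000, window_size, stride))
        shared = overlap_fraction(window_size, stride)
        print(f"window={window_size} stride={stride}: "
              f"{n_windows} windows from 10000 samples, {shared:.0%} shared")
```

Under this reading, `--window_size 200 --stride 50` advances by 25% of the window, so neighbouring windows share 75% of their samples, while a stride equal to the window size (the DB8 commands) gives no overlap. Whether the "25% overlap" in the headings refers to the stride-to-window ratio is not stated in the README.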