Data splits (train/val/test)
#1 (pinned) · opened by PierreGtch
For the first iteration of this benchmark, we use predefined train/validation/test splits, mostly aimed at assessing generalization to unseen subjects. We follow the practice established in recent foundation model papers (REVE, CBraMod, and Labram), where subjects are stratified across splits. The exceptions are SEED-V and BCIC-2020-3, where task difficulty would make strict cross-subject transfer impossible; for those datasets we instead use the within-session splits provided in the original publications.
The exact split details are as follows:
- MAT (mental stress) Train subjects 0–27; validation subjects 28–31; test subjects 32–35.
- FACED (emotion recognition) Train subjects 0–79; validation subjects 80–99; test subjects 100–122.
- PhysioNet-MI (motor imagery) Train subjects 1–70; validation subjects 71–89; test subjects 90–109.
- BCIC-2020-3 (imagined speech) Within-session split by recording run, according to the original competition rules: train = run 0; validation = run 1; test = run 2.
- ISRUC (sleep staging) Train = I001–I080 (group I, with a small number of exclusions noted in the dataset config); validation = I081–I090; test = I091–I100.
- BCIC-IV-2a (motor imagery) Train subjects: 1, 2, 3; validation subjects: 4, 5, 6; test subjects: 7, 8, 9.
- SEED-V (emotion recognition) Session-based split: train = session 1; validation = session 2; test = session 3.
- TUAB (abnormal events detection) Original split: subjects labeled 'train=true' are used for training; subjects labeled 'train=false' are used for testing. The validation set is obtained from the training records via a nested cross-subject split (5 folds, approximately 80%/20% per fold, stratified).
- Mumtaz (mental disorder) Predefined subject lists are used: training subjects: H1, H2, H10–22, MDD1, MDD2, MDD10–21; validation subjects: H23–25, MDD22–25; test subjects: H3–9, H26–30, MDD3–9, MDD26–34.
- TUEV (event classification) Original split: subjects labeled 'train' are used for training and 'eval' for testing. The validation set is derived from the 'train' group using a nested cross-subject split (5 folds; each fold splits training subjects approximately 80%/20%).
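For TUAB and TUEV, the validation set is carved out of the official training subjects by a nested cross-subject split. A minimal sketch of that idea, assuming subject-level labels and hand-rolling the stratified 5-fold grouping (function and variable names are illustrative, not from the benchmark code):

```python
import random
from collections import defaultdict

def nested_subject_split(subject_labels, n_folds=5, fold=0, seed=0):
    """Hold out one fold of subjects (~1/n_folds, here ~20%) for validation.

    subject_labels: dict mapping subject_id -> class label, covering the
    official training subjects. Subjects are dealt round-robin within each
    class so every fold keeps roughly the original label proportions
    (stratified), and no subject appears in both returned sets
    (cross-subject).
    """
    by_class = defaultdict(list)
    for subj, label in sorted(subject_labels.items()):
        by_class[label].append(subj)

    rng = random.Random(seed)  # fixed seed -> reproducible folds
    folds = [[] for _ in range(n_folds)]
    for label, subjects in by_class.items():
        rng.shuffle(subjects)
        for i, subj in enumerate(subjects):
            folds[i % n_folds].append(subj)

    val_subjects = set(folds[fold])
    train_subjects = {s for s in subject_labels if s not in val_subjects}
    return train_subjects, val_subjects
```

Any of the five folds can serve as the validation set; the key property is that the train/validation boundary falls between subjects, never within one subject's recordings.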
Feedback is highly welcome before we freeze the benchmark design decisions :)