---
license: mit
language:
  - en
library_name: pytorch
tags:
  - multi-modal
  - daily-activity
  - wearable-sensors
  - benchmark
---

# PULSE: Code Repository

Reference implementation, training scripts, and benchmark baselines for the
**PULSE** dataset paper (under double-blind review at the NeurIPS 2026
Evaluations & Datasets Track).

> **Dataset:** [`velvet-pine-22/PULSE`](https://huggingface.co/datasets/velvet-pine-22/PULSE)
> · **Sample subset (≈285 MB):** [`velvet-pine-22/PULSE-sample`](https://huggingface.co/datasets/velvet-pine-22/PULSE-sample)

## Repository layout

```
PULSE-code/
├── experiments/
│   ├── data/                     # PyTorch Dataset wrappers
│   │   ├── dataset.py                  # core multi-modal dataset (T1, T2)
│   │   ├── dataset_seqpred.py          # T2 fine-grained action recognition
│   │   ├── dataset_grasp_state.py      # T3 grasp onset anticipation
│   │   ├── dataset_forecast.py         # auxiliary forecasting heads
│   │   └── dataset_signal_forecast.py  # T5 tactile-driven motion forecast
│   │
│   ├── nets/                     # Model architectures
│   │   ├── models.py                   # backbone networks (Transformer / LSTM / 1D-CNN)
│   │   ├── models_seqpred.py           # DailyActFormer (DAF): multi-modal Transformer
│   │   ├── models_forecast.py          # forecasting heads
│   │   ├── models_forecast_priv.py     # privileged-tactile variants for T5
│   │   ├── published_models.py         # third-party model implementations
│   │   └── baselines_published/        # 7 published baselines (re-implementation)
│   │       ├── baselines.py            #   DeepConvLSTM / InceptionTime / MS-TCN / etc.
│   │       └── syncfuse.py             #   under-pressure-style multi-modal fusion
│   │
│   ├── tasks/                    # Training + evaluation entry points
│   │   ├── train_exp1.py               # T1: scene recognition
│   │   ├── train_seqpred.py            # T2: action recognition (DAF + ablations)
│   │   ├── train_grasp_state.py        # T3: grasp onset anticipation
│   │   ├── train_pred_cls.py           # T3 alt classification head
│   │   ├── train_exp_missing.py        # T4: missing-modality robustness
│   │   ├── train_signal_forecast.py    # T5: tactile-driven motion forecasting
│   │   ├── train_signal_forecast_priv.py  # T5 privileged variants
│   │   ├── train_baselines_t1.py       # baselines for T1
│   │   ├── train_exp{2,3,4}.py         # ablation experiments
│   │   ├── train_exp_{anticipate,grip,pose,retrieval,zeroshot}.py  # auxiliary
│   │   ├── train_pred.py / train_forecast.py
│   │   ├── eval_baselines.py / eval_combined.py
│   │   └── published_baselines.py      # baseline registry
│   │
│   ├── analysis/                 # Case study, figures, data prep utilities
│   │   ├── grasp_phase_analysis.py     # case study (gaze→EMG→hand→contact cascade)
│   │   ├── modality_viz.py / analysis_figures.py / data_statistics_figure.py
│   │   ├── extract_video_features.py / extract_videomae_features.py
│   │   ├── build_taxonomy.py / generate_action_labels.py / generate_coarse_annotations.py
│   │   ├── reannotate_actions.py / gen_val_comparison.py
│   │   ├── exp_per_subject.py / check_seg_lengths.py
│   │   └── aggregate_*.py              # collate run results
│   │
│   ├── slurm/                    # 60+ SLURM launch scripts (one per main experiment)
│   │   └── run_*.sh
│   │
│   ├── taxonomy.py               # shared 18-primitive taxonomy
│   ├── s9_primitives.json
│   └── taxonomy_v3.json
│
├── scripts/                      # Top-level utilities (not task-specific)
│   ├── build_paper_tables.py     # collates results JSONs into LaTeX tables
│   ├── eval_macrof1.py / eval_subset.py / eval_topk_v3.py
│   └── dispatch_eval.sh          # batch dispatcher
│
├── LICENSE                       # MIT
├── requirements.txt              # Python deps
└── README.md
```

## Quick start

```bash
# 1. Set up Python environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Point at the PULSE dataset (download from HuggingFace first)
export PULSE_ROOT=/path/to/PULSE   # the dataset root (not this code repo)

# 3. Run a training entry point as a module (from the experiments/ directory)
cd experiments
python -m tasks.train_seqpred \
    --root "$PULSE_ROOT" \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t2_daf

# 4. Reproduce paper tables (after training all benchmarks)
cd ..
python scripts/build_paper_tables.py \
    --results_root experiments/runs/ \
    --out tables/
```

> **Why `python -m tasks.train_seqpred` and not `python tasks/train_seqpred.py`?**
> The training scripts import sibling packages (`from data.dataset import …`,
> `from nets.models import …`). With `-m`, Python puts the current working
> directory (`experiments/`) at the front of `sys.path`, so `data/`, `nets/`,
> `tasks/`, and `analysis/` resolve as top-level packages. Running the file
> directly puts `tasks/` there instead, and those imports typically fail.
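
The difference between the two invocations can be reproduced in isolation. The sketch below builds a throwaway directory that mimics the `experiments/` layout and runs the same file both ways. The names `toy_data`, `toy_tasks`, and `train_toy` are hypothetical stand-ins chosen for this illustration; they are not part of this repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_toy_layout(root: Path) -> None:
    """Create two sibling packages, mimicking experiments/data and experiments/tasks."""
    for pkg in ("toy_data", "toy_tasks"):  # hypothetical package names
        (root / pkg).mkdir()
        (root / pkg / "__init__.py").write_text("")
    (root / "toy_data" / "dataset.py").write_text("NAME = 'toy dataset'\n")
    # The entry point imports a sibling package, like the real training scripts do.
    (root / "toy_tasks" / "train_toy.py").write_text(
        "from toy_data.dataset import NAME\nprint('imported', NAME)\n"
    )


def run_python(args: list[str], cwd: Path) -> subprocess.CompletedProcess:
    """Run the current interpreter with the given arguments inside cwd."""
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )


def compare_invocations() -> tuple[subprocess.CompletedProcess, subprocess.CompletedProcess]:
    """Run the toy entry point as a module and as a script from the same directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_toy_layout(root)
        # -m: the working directory is placed first on sys.path.
        as_module = run_python(["-m", "toy_tasks.train_toy"], cwd=root)
        # Script path: the script's own directory (toy_tasks/) is placed first.
        as_script = run_python(["toy_tasks/train_toy.py"], cwd=root)
        return as_module, as_script


if __name__ == "__main__":
    as_module, as_script = compare_invocations()
    print("module run exit code:", as_module.returncode)
    print("script run exit code:", as_script.returncode)
    print(as_script.stderr.strip().splitlines()[-1:])
```

The only thing that differs between the two runs is the first entry of `sys.path`, which is why launching from any directory other than `experiments/` also breaks the `-m` form.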

## Reproducing the benchmark tasks

| Task | Entry point | Output |
|---|---|---|
| T1: Scene recognition (8-way) | `tasks.train_exp1` | scene-classification metrics |
| T2: Fine-grained action recognition | `tasks.train_seqpred` | verb / noun / hand top-k accuracy |
| T3: Grasp onset anticipation | `tasks.train_grasp_state` / `tasks.train_pred_cls` | anticipation F1 / time-to-contact |
| T4: Missing-modality robustness | `tasks.train_exp_missing` + `tasks.eval_combined` | per-modality ablation table |
| T5: Tactile-driven grasp-state recognition | `tasks.train_signal_forecast` (+ `_priv` variants) | sub-second grasp-state metrics |
| T6: Cross-modal pressure prediction | `tasks.train_forecast` / `tasks.train_signal_forecast` | pressure reconstruction metrics |

The exact command lines (with hyperparameters, seeds, GPU configs) used for
every paper table are checked in under `experiments/slurm/run_*.sh`, one
SLURM script per paper experiment. Output JSON files from these runs are
collated into LaTeX tables by `scripts/build_paper_tables.py`.
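
For readers who want a feel for the collation step without reading the script, here is a minimal sketch of turning per-run JSON files into a LaTeX table. It is not the logic of `scripts/build_paper_tables.py`. It assumes, purely for illustration, that each run directory holds a `results.json` containing a flat mapping from metric name to number; the file name, the schema, and the function names are all hypothetical.

```python
import json
from pathlib import Path


def collect_results(results_root: Path) -> dict[str, dict[str, float]]:
    """Map each run directory name to the metrics in its results.json.

    Assumed layout (hypothetical): <results_root>/<run_name>/results.json
    """
    rows = {}
    for path in sorted(results_root.glob("*/results.json")):
        rows[path.parent.name] = json.loads(path.read_text())
    return rows


def escape(text: str) -> str:
    """Escape underscores so names are safe in LaTeX text mode."""
    return text.replace("_", r"\_")


def to_latex(rows: dict[str, dict[str, float]], metrics: list[str]) -> str:
    """Render one table row per run and one column per requested metric."""
    header = " & ".join(["Run", *(escape(m) for m in metrics)]) + r" \\"
    lines = [
        r"\begin{tabular}{l" + "c" * len(metrics) + "}",
        r"\hline",
        header,
        r"\hline",
    ]
    for run, values in rows.items():
        # Metrics missing from a run are shown as "--" instead of raising.
        cells = [f"{values[m]:.3f}" if m in values else "--" for m in metrics]
        lines.append(" & ".join([escape(run), *cells]) + r" \\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)
```

Calling `to_latex(collect_results(Path("experiments/runs")), metrics)` with a list of metric names would produce a `tabular` environment ready to paste into a paper; the actual script may group runs by task, average over seeds, or use a different schema.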

## Hardware

Headline experiments were run on **NVIDIA A800 (80 GB)** GPUs. A single seed of
DailyActFormer T2 trains in ~6 hours on one A800. Most baselines fit on a
single 24 GB consumer GPU.

## License & attribution

Code is released under **MIT** (see `LICENSE`). The PULSE dataset itself is
released under **CC BY-NC 4.0** (see the dataset repository).

## Citation

```bibtex
@inproceedings{anonymous2026pulse,
  title     = {PULSE: A Synchronized Five-Modality Dataset for Multi-Modal Daily Activity Understanding},
  author    = {Anonymous Authors},
  booktitle = {Submitted to NeurIPS 2026 Evaluations and Datasets Track},
  year      = {2026},
  note      = {Under double-blind review}
}
```