---
license: mit
language:
- en
library_name: pytorch
tags:
- multi-modal
- daily-activity
- wearable-sensors
- benchmark
---

# PULSE – Code Repository

Reference implementation, training scripts, and benchmark baselines for the
**PULSE** dataset paper (under double-blind review at the NeurIPS 2026
Evaluations & Datasets Track).

> **Dataset:** [`velvet-pine-22/PULSE`](https://huggingface.co/datasets/velvet-pine-22/PULSE)
> · **Sample subset (≈285 MB):** [`velvet-pine-22/PULSE-sample`](https://huggingface.co/datasets/velvet-pine-22/PULSE-sample)

## Repository layout

```
PULSE-code/
├── experiments/
│   ├── data/                             # PyTorch Dataset wrappers
│   │   ├── dataset.py                    # core multi-modal dataset (T1, T2)
│   │   ├── dataset_seqpred.py            # T2 fine-grained action recognition
│   │   ├── dataset_grasp_state.py        # T3 grasp onset anticipation
│   │   ├── dataset_forecast.py           # auxiliary forecasting heads
│   │   └── dataset_signal_forecast.py    # T5 tactile-driven motion forecast
│   │
│   ├── nets/                             # Model architectures
│   │   ├── models.py                     # backbone networks (Transformer / LSTM / 1D-CNN)
│   │   ├── models_seqpred.py             # DailyActFormer (DAF) – multi-modal Transformer
│   │   ├── models_forecast.py            # forecasting heads
│   │   ├── models_forecast_priv.py       # privileged-tactile variants for T5
│   │   ├── published_models.py           # third-party model implementations
│   │   ├── baselines_published/          # 7 published baselines (re-implementation)
│   │   ├── baselines.py                  # DeepConvLSTM / InceptionTime / MS-TCN / etc.
│   │   └── syncfuse.py                   # under-pressure-style multi-modal fusion
│   │
│   ├── tasks/                            # Training + evaluation entry points
│   │   ├── train_exp1.py                 # T1 – scene recognition
│   │   ├── train_seqpred.py              # T2 – action recognition (DAF + ablations)
│   │   ├── train_grasp_state.py          # T3 – grasp onset anticipation
│   │   ├── train_pred_cls.py             # T3 alt classification head
│   │   ├── train_exp_missing.py          # T4 – missing-modality robustness
│   │   ├── train_signal_forecast.py      # T5 – tactile-driven motion forecasting
│   │   ├── train_signal_forecast_priv.py # T5 privileged variants
│   │   ├── train_baselines_t1.py         # baselines for T1
│   │   ├── train_exp{2,3,4}.py           # ablation experiments
│   │   ├── train_exp_{anticipate,grip,pose,retrieval,zeroshot}.py # auxiliary
│   │   ├── train_pred.py / train_forecast.py
│   │   ├── eval_baselines.py / eval_combined.py
│   │   └── published_baselines.py        # baseline registry
│   │
│   ├── analysis/                         # Case study, figures, data prep utilities
│   │   ├── grasp_phase_analysis.py       # case study (gaze→EMG→hand→contact cascade)
│   │   ├── modality_viz.py / analysis_figures.py / data_statistics_figure.py
│   │   ├── extract_video_features.py / extract_videomae_features.py
│   │   ├── build_taxonomy.py / generate_action_labels.py / generate_coarse_annotations.py
│   │   ├── reannotate_actions.py / gen_val_comparison.py
│   │   ├── exp_per_subject.py / check_seg_lengths.py
│   │   └── aggregate_*.py                # collate run results
│   │
│   ├── slurm/                            # 60+ SLURM launch scripts (one per main experiment)
│   │   └── run_*.sh
│   │
│   ├── taxonomy.py                       # shared 18-primitive taxonomy
│   ├── s9_primitives.json
│   └── taxonomy_v3.json
│
├── scripts/                              # Top-level utilities (not task-specific)
│   ├── build_paper_tables.py             # collates results JSONs into LaTeX tables
│   ├── eval_macrof1.py / eval_subset.py / eval_topk_v3.py
│   └── dispatch_eval.sh                  # batch dispatcher
│
├── LICENSE                               # MIT
├── requirements.txt                      # Python deps
└── README.md
```

## Quick start

```bash
# 1. Set up Python environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Point at the PULSE dataset (download from HuggingFace first)
export PULSE_ROOT=/path/to/PULSE   # the dataset root (not this code repo)

# 3. Run a training entry point as a module (from the experiments/ directory)
cd experiments
python -m tasks.train_seqpred \
    --root $PULSE_ROOT \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t2_daf

# 4. Reproduce paper tables (after training all benchmarks)
cd ..
python scripts/build_paper_tables.py \
    --results_root experiments/runs/ \
    --out tables/
```
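
For step 2, one convenient way to fetch the dataset (or the ≈285 MB sample
subset for a quick smoke test) is the Hugging Face CLI. A minimal sketch,
assuming you want the data under `/path/to/PULSE`:

```bash
pip install -U "huggingface_hub[cli]"

# Full dataset; swap in velvet-pine-22/PULSE-sample for the small subset
huggingface-cli download velvet-pine-22/PULSE \
    --repo-type dataset \
    --local-dir /path/to/PULSE
```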

> **Why `python -m tasks.train_seqpred` and not `python tasks/train_seqpred.py`?**
> The training scripts import sibling modules (`from data.dataset import …`,
> `from nets.models import …`). Running with `-m` from the `experiments/`
> directory makes Python treat `data/`, `nets/`, `tasks/`, and `analysis/` as
> top-level packages so the imports resolve cleanly.
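
The other entry points follow the same pattern. As an illustration, a T1
scene-recognition run might look like the sketch below; the flags are assumed
to mirror the T2 example and `runs/t1_scene` is just an illustrative output
directory, so check the argument parser in `tasks/train_exp1.py` for the
actual interface.

```bash
cd experiments
# Flags assumed to mirror the T2 example above; verify against tasks/train_exp1.py
python -m tasks.train_exp1 \
    --root $PULSE_ROOT \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t1_scene
```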

## Reproducing the benchmark tasks

| Task | Entry point | Output |
|---|---|---|
| T1 – Scene recognition (8-way) | `tasks.train_exp1` | scene-classification metrics |
| T2 – Fine-grained action recognition | `tasks.train_seqpred` | verb / noun / hand top-k accuracy |
| T3 – Grasp onset anticipation | `tasks.train_grasp_state` / `tasks.train_pred_cls` | anticipation F1 / time-to-contact |
| T4 – Missing-modality robustness | `tasks.train_exp_missing` + `tasks.eval_combined` | per-modality ablation table |
| T5 – Tactile-driven grasp-state recognition | `tasks.train_signal_forecast` (+ `_priv` variants) | sub-second grasp-state metrics |
| T6 – Cross-modal pressure prediction | `tasks.train_forecast` / `tasks.train_signal_forecast` | pressure reconstruction metrics |

The exact command lines (with hyperparameters, seeds, GPU configs) used for
every paper table are checked in under `experiments/slurm/run_*.sh`, one
SLURM script per paper experiment. Output JSON files from these runs are
collated into LaTeX tables by `scripts/build_paper_tables.py`.
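
As a concrete sketch, reproducing a single paper experiment on a SLURM cluster
and then rebuilding the tables could look like this (the script name below is
a placeholder; pick the actual `run_*.sh` for the experiment you want):

```bash
# Launch one experiment (placeholder name; see experiments/slurm/ for the real scripts)
sbatch experiments/slurm/run_t2_daf.sh

# After all runs have written their result JSONs, collate them into LaTeX tables
python scripts/build_paper_tables.py \
    --results_root experiments/runs/ \
    --out tables/
```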

## Hardware

Headline experiments were run on **NVIDIA A800 (80 GB)** GPUs. A single seed of
DailyActFormer on T2 trains in ~6 hours on one A800. Most baselines fit on a
single 24 GB consumer GPU.

## License & attribution

Code is released under **MIT** (see `LICENSE`). The PULSE dataset itself is
released under **CC BY-NC 4.0** (see the dataset repository).

## Citation

```bibtex
@inproceedings{anonymous2026pulse,
  title     = {PULSE: A Synchronized Five-Modality Dataset for Multi-Modal Daily Activity Understanding},
  author    = {Anonymous Authors},
  booktitle = {Submitted to NeurIPS 2026 Evaluations and Datasets Track},
  year      = {2026},
  note      = {Under double-blind review}
}
```