---
license: mit
language:
- en
library_name: pytorch
tags:
- multi-modal
- daily-activity
- wearable-sensors
- benchmark
---
# PULSE – Code Repository
Reference implementation, training scripts, and benchmark baselines for the
**PULSE** dataset paper (under double-blind review at NeurIPS 2026 Evaluations &
Datasets Track).
> **Dataset:** [`velvet-pine-22/PULSE`](https://huggingface.co/datasets/velvet-pine-22/PULSE)
> · **Sample subset (≈285 MB):** [`velvet-pine-22/PULSE-sample`](https://huggingface.co/datasets/velvet-pine-22/PULSE-sample)
## Repository layout
```
PULSE-code/
├── experiments/
│   ├── data/                              # PyTorch Dataset wrappers
│   │   ├── dataset.py                     # core multi-modal dataset (T1, T2)
│   │   ├── dataset_seqpred.py             # T2 fine-grained action recognition
│   │   ├── dataset_grasp_state.py         # T3 grasp onset anticipation
│   │   ├── dataset_forecast.py            # auxiliary forecasting heads
│   │   └── dataset_signal_forecast.py     # T5 tactile-driven motion forecast
│   │
│   ├── nets/                              # Model architectures
│   │   ├── models.py                      # backbone networks (Transformer / LSTM / 1D-CNN)
│   │   ├── models_seqpred.py              # DailyActFormer (DAF) – multi-modal Transformer
│   │   ├── models_forecast.py             # forecasting heads
│   │   ├── models_forecast_priv.py        # privileged-tactile variants for T5
│   │   ├── published_models.py            # third-party model implementations
│   │   └── baselines_published/           # 7 published baselines (re-implementations)
│   │       ├── baselines.py               # DeepConvLSTM / InceptionTime / MS-TCN / etc.
│   │       └── syncfuse.py                # under-pressure-style multi-modal fusion
│   │
│   ├── tasks/                             # Training + evaluation entry points
│   │   ├── train_exp1.py                  # T1 – scene recognition
│   │   ├── train_seqpred.py               # T2 – action recognition (DAF + ablations)
│   │   ├── train_grasp_state.py           # T3 – grasp onset anticipation
│   │   ├── train_pred_cls.py              # T3 alt classification head
│   │   ├── train_exp_missing.py           # T4 – missing-modality robustness
│   │   ├── train_signal_forecast.py       # T5 – tactile-driven motion forecasting
│   │   ├── train_signal_forecast_priv.py  # T5 privileged variants
│   │   ├── train_baselines_t1.py          # baselines for T1
│   │   ├── train_exp{2,3,4}.py            # ablation experiments
│   │   ├── train_exp_{anticipate,grip,pose,retrieval,zeroshot}.py  # auxiliary
│   │   ├── train_pred.py / train_forecast.py
│   │   ├── eval_baselines.py / eval_combined.py
│   │   └── published_baselines.py         # baseline registry
│   │
│   ├── analysis/                          # Case study, figures, data prep utilities
│   │   ├── grasp_phase_analysis.py        # case study (gaze→EMG→hand→contact cascade)
│   │   ├── modality_viz.py / analysis_figures.py / data_statistics_figure.py
│   │   ├── extract_video_features.py / extract_videomae_features.py
│   │   ├── build_taxonomy.py / generate_action_labels.py / generate_coarse_annotations.py
│   │   ├── reannotate_actions.py / gen_val_comparison.py
│   │   ├── exp_per_subject.py / check_seg_lengths.py
│   │   └── aggregate_*.py                 # collate run results
│   │
│   ├── slurm/                             # 60+ SLURM launch scripts (one per main experiment)
│   │   └── run_*.sh
│   │
│   ├── taxonomy.py                        # shared 18-primitive taxonomy
│   ├── s9_primitives.json
│   └── taxonomy_v3.json
│
├── scripts/                               # Top-level utilities (not task-specific)
│   ├── build_paper_tables.py              # collates results JSONs into LaTeX tables
│   ├── eval_macrof1.py / eval_subset.py / eval_topk_v3.py
│   └── dispatch_eval.sh                   # batch dispatcher
│
├── LICENSE                                # MIT
├── requirements.txt                       # Python deps
└── README.md
```
## Quick start
```bash
# 1. Set up Python environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# 2. Point at the PULSE dataset (download from HuggingFace first)
export PULSE_ROOT=/path/to/PULSE # the dataset root (not this code repo)
# 3. Run a training entry point as a module (from the experiments/ directory)
cd experiments
python -m tasks.train_seqpred \
    --root $PULSE_ROOT \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t2_daf
# 4. Reproduce paper tables (after training all benchmarks)
cd ..
python scripts/build_paper_tables.py \
    --results_root experiments/runs/ \
    --out tables/
```
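Step 2 assumes the dataset has already been fetched from the Hub. One way to do that is with the stock `huggingface_hub` CLI (a sketch only; the `/path/to/PULSE` target is a placeholder, and you can swap in `velvet-pine-22/PULSE-sample` for the ≈285 MB subset):
```bash
pip install -U huggingface_hub
# Download the dataset repository into a local directory, then point PULSE_ROOT at it
huggingface-cli download velvet-pine-22/PULSE \
    --repo-type dataset \
    --local-dir /path/to/PULSE
export PULSE_ROOT=/path/to/PULSE
```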
> **Why `python -m tasks.train_seqpred` and not `python tasks/train_seqpred.py`?**
> The training scripts import sibling modules (`from data.dataset import …`,
> `from nets.models import …`). Running with `-m` from the `experiments/`
> directory makes Python treat `data/`, `nets/`, `tasks/`, and `analysis/` as
> top-level packages so the imports resolve cleanly.
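If you prefer to launch from the repository root without `cd`-ing into `experiments/`, the same import behaviour can be approximated by putting `experiments/` on `PYTHONPATH`. This is an untested sketch of an equivalent invocation, not the documented workflow; note that relative paths such as `runs/t2_daf` then resolve against the repo root instead of `experiments/`:
```bash
# experiments/ on PYTHONPATH makes data/, nets/, tasks/, analysis/ importable as top-level packages
PYTHONPATH="$(pwd)/experiments" python -m tasks.train_seqpred \
    --root "$PULSE_ROOT" \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t2_daf
```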
## Reproducing the benchmark tasks
| Task | Entry point | Output |
|---|---|---|
| T1 – Scene recognition (8-way) | `tasks.train_exp1` | scene-classification metrics |
| T2 – Fine-grained action recognition | `tasks.train_seqpred` | verb / noun / hand top-k accuracy |
| T3 – Grasp onset anticipation | `tasks.train_grasp_state` / `tasks.train_pred_cls` | anticipation F1 / time-to-contact |
| T4 – Missing-modality robustness | `tasks.train_exp_missing` + `tasks.eval_combined` | per-modality ablation table |
| T5 – Tactile-driven grasp-state recognition | `tasks.train_signal_forecast` (+ `_priv` variants) | sub-second grasp-state metrics |
| T6 – Cross-modal pressure prediction | `tasks.train_forecast` / `tasks.train_signal_forecast` | pressure reconstruction metrics |
The exact command lines (with hyperparameters, seeds, GPU configs) used for
every paper table are checked in under `experiments/slurm/run_*.sh`, one
SLURM script per paper experiment. Output JSON files from these runs are
collated into LaTeX tables by `scripts/build_paper_tables.py`.
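For orientation only, a launch script of roughly this shape is sketched below. The SBATCH resource values and job name are illustrative placeholders rather than values copied from the repository; the training command mirrors the quick-start example above.
```bash
#!/bin/bash
#SBATCH --job-name=pulse_t2_daf   # placeholder job name
#SBATCH --gres=gpu:1              # headline runs used one NVIDIA A800 (80 GB)
#SBATCH --time=08:00:00           # one T2 DailyActFormer seed trains in ~6 h
#SBATCH --output=logs/%x_%j.out

source .venv/bin/activate
cd experiments
python -m tasks.train_seqpred \
    --root "$PULSE_ROOT" \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t2_daf
```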
## Hardware
Headline experiments were run on **NVIDIA A800 (80 GB)** GPUs. A single seed of
DailyActFormer T2 trains in ~6 hours on one A800. Most baselines fit on a
single 24 GB consumer GPU.
## License & attribution
Code is released under **MIT** (see `LICENSE`). The PULSE dataset itself is
released under **CC BY-NC 4.0** (see the dataset repository).
## Citation
```bibtex
@inproceedings{anonymous2026pulse,
  title     = {PULSE: A Synchronized Five-Modality Dataset for Multi-Modal Daily Activity Understanding},
  author    = {Anonymous Authors},
  booktitle = {Submitted to NeurIPS 2026 Evaluations and Datasets Track},
  year      = {2026},
  note      = {Under double-blind review}
}
```