---
license: mit
language:
  - en
library_name: pytorch
tags:
  - multi-modal
  - daily-activity
  - wearable-sensors
  - benchmark
---

# PULSE — Code Repository

Reference implementation, training scripts, and benchmark baselines for the
**PULSE** dataset paper (under double-blind review at the NeurIPS 2026
Evaluations & Datasets Track).

> **Dataset:** [`velvet-pine-22/PULSE`](https://huggingface.co/datasets/velvet-pine-22/PULSE)
> · **Sample subset (≈285 MB):** [`velvet-pine-22/PULSE-sample`](https://huggingface.co/datasets/velvet-pine-22/PULSE-sample)

## Repository layout

```
PULSE-code/
├── experiments/
│   ├── data/                     # PyTorch Dataset wrappers
│   │   ├── dataset.py                  # core multi-modal dataset (T1, T2)
│   │   ├── dataset_seqpred.py          # T2 fine-grained action recognition
│   │   ├── dataset_grasp_state.py      # T3 grasp onset anticipation
│   │   ├── dataset_forecast.py         # auxiliary forecasting heads
│   │   └── dataset_signal_forecast.py  # T5 tactile-driven motion forecast
│   │
│   ├── nets/                     # Model architectures
│   │   ├── models.py                   # backbone networks (Transformer / LSTM / 1D-CNN)
│   │   ├── models_seqpred.py           # DailyActFormer (DAF) — multi-modal Transformer
│   │   ├── models_forecast.py          # forecasting heads
│   │   ├── models_forecast_priv.py     # privileged-tactile variants for T5
│   │   ├── published_models.py         # third-party model implementations
│   │   └── baselines_published/        # 7 published baselines (re-implementation)
│   │       ├── baselines.py            #   DeepConvLSTM / InceptionTime / MS-TCN / etc.
│   │       └── syncfuse.py             #   under-pressure-style multi-modal fusion
│   │
│   ├── tasks/                    # Training + evaluation entry points
│   │   ├── train_exp1.py               # T1 — scene recognition
│   │   ├── train_seqpred.py            # T2 — action recognition (DAF + ablations)
│   │   ├── train_grasp_state.py        # T3 — grasp onset anticipation
│   │   ├── train_pred_cls.py           # T3 alt classification head
│   │   ├── train_exp_missing.py        # T4 — missing-modality robustness
│   │   ├── train_signal_forecast.py    # T5 — tactile-driven motion forecasting
│   │   ├── train_signal_forecast_priv.py  # T5 privileged variants
│   │   ├── train_baselines_t1.py       # baselines for T1
│   │   ├── train_exp{2,3,4}.py         # ablation experiments
│   │   ├── train_exp_{anticipate,grip,pose,retrieval,zeroshot}.py  # auxiliary
│   │   ├── train_pred.py / train_forecast.py
│   │   ├── eval_baselines.py / eval_combined.py
│   │   └── published_baselines.py      # baseline registry
│   │
│   ├── analysis/                 # Case study, figures, data prep utilities
│   │   ├── grasp_phase_analysis.py     # case study (gaze→EMG→hand→contact cascade)
│   │   ├── modality_viz.py / analysis_figures.py / data_statistics_figure.py
│   │   ├── extract_video_features.py / extract_videomae_features.py
│   │   ├── build_taxonomy.py / generate_action_labels.py / generate_coarse_annotations.py
│   │   ├── reannotate_actions.py / gen_val_comparison.py
│   │   ├── exp_per_subject.py / check_seg_lengths.py
│   │   └── aggregate_*.py              # collate run results
│   │
│   ├── slurm/                    # 60+ SLURM launch scripts (one per main experiment)
│   │   └── run_*.sh
│   │
│   ├── taxonomy.py               # shared 18-primitive taxonomy
│   ├── s9_primitives.json
│   └── taxonomy_v3.json
│
├── scripts/                      # Top-level utilities (not task-specific)
│   ├── build_paper_tables.py     # collates results JSONs into LaTeX tables
│   ├── eval_macrof1.py / eval_subset.py / eval_topk_v3.py
│   └── dispatch_eval.sh          # batch dispatcher
│
├── LICENSE                       # MIT
├── requirements.txt              # Python deps
└── README.md
```

## Quick start

```bash
# 1. Set up Python environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Point at the PULSE dataset (download from HuggingFace first)
export PULSE_ROOT=/path/to/PULSE   # the dataset root (not this code repo)

# 3. Run a training entry point as a module (from the experiments/ directory)
cd experiments
python -m tasks.train_seqpred \
    --root "$PULSE_ROOT" \
    --modalities mocap emg eyetrack imu pressure \
    --output_dir runs/t2_daf

# 4. Reproduce paper tables (after training all benchmarks)
cd ..
python scripts/build_paper_tables.py \
    --results_root experiments/runs/ \
    --out tables/
```
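
The flags in step 3 suggest a command-line interface along the following lines. This is only a sketch of how such arguments could be parsed with `argparse`; the real parser in `tasks/train_seqpred.py` is not shown here and may define different defaults, choices, or extra options. The `build_parser` name, the `PULSE_ROOT` fallback, and the default output directory are assumptions made for illustration.

```python
import argparse
import os

# Modality names taken from the quick-start command; the set actually
# accepted by the training scripts may differ.
MODALITIES = ("mocap", "emg", "eyetrack", "imu", "pressure")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the quick start."""
    parser = argparse.ArgumentParser(description="Toy T2 training launcher")
    parser.add_argument(
        "--root",
        default=os.environ.get("PULSE_ROOT"),  # assumed fallback, not confirmed
        help="dataset root (not the code repository)",
    )
    parser.add_argument(
        "--modalities",
        nargs="+",  # one or more space-separated names
        choices=MODALITIES,
        default=list(MODALITIES),
        help="sensor streams to load",
    )
    parser.add_argument("--output_dir", default="runs/debug")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--root", "/path/to/PULSE", "--modalities", "emg", "imu",
     "--output_dir", "runs/t2_daf"]
)
print(args.modalities)  # ['emg', 'imu']
```

Passing a subset to `--modalities`, as above, is presumably how single-modality and ablation runs are selected, but the SLURM scripts are the authoritative record of the flags used.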

> **Why `python -m tasks.train_seqpred` and not `python tasks/train_seqpred.py`?**
> The training scripts import sibling packages (`from data.dataset import …`,
> `from nets.models import …`). With `-m`, Python puts the current directory
> (`experiments/`) at the front of `sys.path`, so `data/`, `nets/`, `tasks/`, and
> `analysis/` resolve as top-level packages. Running the script by path puts
> `tasks/` on `sys.path` instead, and those imports fail.
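
The difference between the two launch styles can be reproduced on a throwaway directory. The sketch below builds a toy tree that only mimics the `data/` and `tasks/` layout (the file names `train_toy.py` and the `NAME` constant are made up, and the `__init__.py` files are an assumption about packaging, not a statement about this repository), then launches the same entry point both ways.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways() -> tuple[int, int]:
    """Launch a toy entry point with `-m` and by path; return both exit codes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)  # stands in for experiments/
        for pkg in ("data", "tasks"):
            (root / pkg).mkdir()
            (root / pkg / "__init__.py").write_text("")
        (root / "data" / "dataset.py").write_text("NAME = 'toy'\n")
        (root / "tasks" / "train_toy.py").write_text(
            "from data.dataset import NAME\nprint(NAME)\n"
        )
        # `-m` puts the working directory on sys.path, so `data` is importable.
        as_module = subprocess.run(
            [sys.executable, "-m", "tasks.train_toy"], cwd=root, capture_output=True
        )
        # Running by path puts tasks/ on sys.path instead, so `data` is not found.
        as_script = subprocess.run(
            [sys.executable, "tasks/train_toy.py"], cwd=root, capture_output=True
        )
        return as_module.returncode, as_script.returncode


module_rc, script_rc = run_both_ways()
print(f"python -m exit code: {module_rc}, python path/to/script exit code: {script_rc}")
```

On a standard Python setup the module launch exits with 0 and the path launch exits non-zero with a `ModuleNotFoundError`.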

## Reproducing the benchmark tasks

| Task | Entry point | Output |
|---|---|---|
| T1 — Scene recognition (8-way) | `tasks.train_exp1` | scene-classification metrics |
| T2 — Fine-grained action recognition | `tasks.train_seqpred` | verb / noun / hand top-k accuracy |
| T3 — Grasp onset anticipation | `tasks.train_grasp_state` / `tasks.train_pred_cls` | anticipation F1 / time-to-contact |
| T4 — Missing-modality robustness | `tasks.train_exp_missing` + `tasks.eval_combined` | per-modality ablation table |
| T5 — Tactile-driven grasp-state recognition | `tasks.train_signal_forecast` (+ `_priv` variants) | sub-second grasp-state metrics |
| T6 — Cross-modal pressure prediction | `tasks.train_forecast` / `tasks.train_signal_forecast` | pressure reconstruction metrics |

The exact command lines (with hyperparameters, seeds, GPU configs) used for
every paper table are checked in under `experiments/slurm/run_*.sh`, one
SLURM script per paper experiment. Output JSON files from these runs are
collated into LaTeX tables by `scripts/build_paper_tables.py`.
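
The collation step can be pictured roughly as follows. This is a sketch under assumptions, not the contents of `scripts/build_paper_tables.py`: the JSON schema (one flat `{metric: value}` object per run), the file naming, and the helper names `collate_results` and `to_latex` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def collate_results(results_root: Path) -> dict[str, dict[str, float]]:
    """Read every *.json under results_root into {run_name: {metric: value}}."""
    table = {}
    for path in sorted(results_root.rglob("*.json")):
        table[path.stem] = json.loads(path.read_text())
    return table


def to_latex(table: dict[str, dict[str, float]], metrics: list[str]) -> str:
    """Render collated results as a plain LaTeX tabular, one row per run."""
    def esc(text: str) -> str:
        return text.replace("_", r"\_")  # underscores are special in LaTeX

    lines = [
        r"\begin{tabular}{l" + "r" * len(metrics) + "}",
        "Run & " + " & ".join(esc(m) for m in metrics) + r" \\",
        r"\hline",
    ]
    for run, values in table.items():
        # Missing metrics are shown as "--" rather than raising.
        cells = [f"{values[m]:.2f}" if m in values else "--" for m in metrics]
        lines.append(esc(run) + " & " + " & ".join(cells) + r" \\")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


# Toy demonstration with made-up numbers (not PULSE results).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "t1_lstm.json").write_text(json.dumps({"acc": 0.813, "f1": 0.79}))
    (root / "t1_cnn.json").write_text(json.dumps({"acc": 0.75}))
    latex = to_latex(collate_results(root), ["acc", "f1"])
print(latex)
```

Keeping the reading and rendering steps separate makes it easy to regenerate a table with a different metric subset without touching the result files.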

## Hardware

Headline experiments were run on **NVIDIA A800 (80 GB)** GPUs. Training
DailyActFormer on T2 takes ~6 hours per seed on one A800. Most baselines fit on a
single 24 GB consumer GPU.

## License & attribution

Code is released under **MIT** (see `LICENSE`). The PULSE dataset itself is
released under **CC BY-NC 4.0** (see the dataset repository).

## Citation

```bibtex
@inproceedings{anonymous2026pulse,
  title     = {PULSE: A Synchronized Five-Modality Dataset for Multi-Modal Daily Activity Understanding},
  author    = {Anonymous Authors},
  booktitle = {Submitted to NeurIPS 2026 Evaluations and Datasets Track},
  year      = {2026},
  note      = {Under double-blind review}
}
```