---
license: mit
---

# MnemoDyn: Learning Resting State Dynamics from 40K fMRI Sequences

[[Paper]]() [[Poster]]() [[Slide]]()

### Sourav Pal, Viet Luong, Hoseok Lee, Tingting Dan, Guorong Wu, Richard Davidson, Won Hwa Kim, Vikas Singh

![MnemoDyn architecture](asset/braine-1.png)

MnemoDyn is an operator-learning foundation model for resting-state fMRI, combining multi-resolution wavelet dynamics with CDE-style temporal modeling.
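
To make the multi-resolution idea concrete, here is a toy, stdlib-only sketch of one Haar wavelet analysis step (pairwise averages and differences). This is only an illustration of the general concept; MnemoDyn's actual wavelet machinery lives under `coe/light/model` and is not this simple.

```python
def haar_step(signal):
    """One level of a Haar wavelet decomposition: split a time series into
    a coarse approximation (pairwise means) and detail coefficients
    (pairwise half-differences). Toy illustration only."""
    evens, odds = signal[::2], signal[1::2]
    approx = [(a + b) / 2 for a, b in zip(evens, odds)]
    detail = [(a - b) / 2 for a, b in zip(evens, odds)]
    return approx, detail

# Recursing on `approx` yields coarser and coarser temporal resolutions.
a, d = haar_step([4.0, 2.0, 5.0, 7.0])
print(a, d)  # [3.0, 6.0] [1.0, -1.0]
```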

## Update

MnemoDyn is now published on Hugging Face: https://huggingface.co/vhluong/MnemoDyn

You can also publish your own trained checkpoint directly from this repo.

## Tutorial

A usage walkthrough is available as a Google Colab notebook:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1IeWYPmwZAj5zA_khQHmgKOXjF8DJXVNo?usp=sharing)

## At A Glance

- Pretraining backbones: `coe/light/model/main.py`, `coe/light/model/main_masked_autoencode.py`, `coe/light/model/main_masked_autoencode_jepa.py`, `coe/light/model/main_denoise.py`, `coe/light/model/orion.py`
- Core model modules: `coe/light/model/conv1d_optimize.py`, `coe/light/model/dense_layer.py`, `coe/light/model/ema.py`, `coe/light/model/normalizer.py`
- Downstream tasks: HBN, ADHD200, ADNI, ABIDE, NKIR, UK Biobank, and HCP Aging under `coe/light/*.py`
- Launch scripts: `coe/light/script/*.sh`

## Repository Layout

```text
.
├── highdim_req.txt
├── pyproject.toml
├── coe/
│   ├── parcellation/
│   └── light/
│       ├── model/
│       ├── script/
│       ├── *_dataset.py
│       └── *classification*.py, *regress*.py
└── README.md
```

## Environment Setup

Python 3.10+ is recommended.

### Option A (recommended): uv

```bash
uv venv
source .venv/bin/activate
uv sync
```

### Option B: pip

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r highdim_req.txt
```

Ensure your PyTorch build matches your CUDA stack.
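
A quick sanity check (assuming `torch` is importable in your environment) that the installed wheel and your CUDA stack agree:

```python
import torch

# CPU-only wheels report None for the CUDA build; a GPU wheel whose CUDA
# build mismatches the installed driver typically shows is_available() == False.
print("torch version :", torch.__version__)
print("built for CUDA:", torch.version.cuda)
print("GPU usable    :", torch.cuda.is_available())
```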

<!-- ## Data Processing Flow

MnemoDyn expects parcellated rs-fMRI time series (`*.dtseries.nii`) as input.

If you are starting from volumetric NIfTI files (e.g., from fMRIPrep), run them through the Preprocessing Pipeline described below before training. This ensures proper alignment and time-step continuity.

To use custom datasets:
1. Preprocess your NIfTI files through `coe.preprocess.pipeline`.
2. Ensure you have dataset metadata CSV/TSV files (labels, demographics, IDs).
3. Update the hardcoded dataset paths (e.g., `/mnt/sourav/HBN_dtseries/`) in the downstream training launch scripts (`coe/light/script/*.sh`) to point to your new output directories. -->

## Preprocessing Pipeline (NIfTI to Parcellated CIFTI)

We provide a unified, Python-based CLI pipeline to automate mapping volumetric NIfTI images to fs_LR surfaces and parcellating the resulting dense time series. The pipeline dynamically extracts the Repetition Time (TR) from your NIfTI files to ensure downstream models learn accurate temporal dynamics.

### Requirements

- Connectome Workbench (`wb_command`) installed and on your system PATH.
- `nibabel` and `tqdm` Python packages.
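
Since the pipeline shells out to Connectome Workbench, it is worth confirming `wb_command` resolves before launching a long run. A minimal check:

```python
import shutil

# shutil.which mirrors the PATH lookup that subprocess-based tools rely on;
# None means wb_command is not installed or not on PATH.
wb = shutil.which("wb_command")
print(wb if wb else "wb_command not found - install Connectome Workbench first")
```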

### Usage

Run the pipeline from the repository root:

```bash
python -m coe.preprocess.pipeline \
  --input-dir /path/to/niftis \
  --output-dir /path/to/output_dir \
  --atlas /path/to/atlas.dlabel.nii \
  --pattern "*_task-rest_space-MNI305_preproc.nii.gz"
```

The script orchestrates `wb_command` for left/right surface mapping and resampling, writes an intermediate `.dtseries.nii`, and finally parcellates it using the provided atlas, injecting the correct native TR throughout.
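
As a rough sketch, the final parcellation stage amounts to a `wb_command -cifti-parcellate` invocation like the one below. The command shape is illustrative, not the script's exact call; the injected `run` callable is only there so the sketch can be exercised without Workbench installed.

```python
import subprocess
from pathlib import Path

def parcellate(dtseries: Path, atlas: Path, out: Path, run=subprocess.run):
    """Parcellate a dense time series with a dlabel atlas via Connectome
    Workbench. COLUMN tells wb_command that brainordinates vary along
    columns, the usual layout for .dtseries.nii files."""
    cmd = [
        "wb_command", "-cifti-parcellate",
        str(dtseries), str(atlas), "COLUMN", str(out),
    ]
    run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd
```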

## Quick Start

### 1) Inspect pretraining CLIs

```bash
cd coe/light/model
python main.py --help
python main_masked_autoencode.py --help
python main_masked_autoencode_jepa.py --help
python main_denoise.py --help
```

### 2) Pretraining

```bash
bash orion.sh
```

### 3) Run downstream examples

```bash
cd coe/light
bash script/hbn_classification.sh
bash script/adhd_200_diagnose.sh
```

<!-- ## Common Script Entry Points

From `coe/light`:

- `bash script/abide_classifcation_normal.sh`
- `bash script/adhd_200_diagnose.sh`
- `bash script/adhd_200_sex_classification.sh`
- `bash script/adni_classification_amyloid.sh`
- `bash script/adni_classification_sex.sh`
- `bash script/hbn_classification.sh`
- `bash script/hbn_regression.sh`
- `bash script/hcp_aging_450.sh`
- `bash script/hcp_aging_classification.sh`
- `bash script/hcp_aging_regress_flanker.sh`
- `bash script/hcp_aging_regress_neuroticism.sh`
- `bash script/nkir_classification.sh`
- `bash script/ukbiobank_age_regression.sh`
- `bash script/ukbiobank_sex_classification.sh` -->

## Typical Workflow

1. Pretrain a foundation checkpoint (`coe/light/model/main*.py`).
2. Save Lightning checkpoints under a versioned results directory.
3. Fine-tune a downstream head using a task script in `coe/light/`.
4. Track outputs and metrics under `Result/<ExperimentName>/...`.
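
For step 2, a small helper in the spirit of the repo's publishing flow (which auto-picks the best checkpoint by the lowest `val_mae` in the filename) might look like this. The filename convention is an assumption based on Lightning-style checkpoint names such as `epoch=3-val_mae=0.412.ckpt`.

```python
import re
from pathlib import Path

def best_checkpoint(version_dir, metric="val_mae"):
    """Return the .ckpt in version_dir whose filename embeds the lowest
    value for `metric`; None if no filename carries the metric."""
    pattern = re.compile(rf"{re.escape(metric)}=(\d+(?:\.\d+)?)")
    best, best_score = None, float("inf")
    for ckpt in Path(version_dir).glob("*.ckpt"):
        m = pattern.search(ckpt.name)
        if m and float(m.group(1)) < best_score:
            best, best_score = ckpt, float(m.group(1))
    return best
```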

<!-- ## Publish to Hugging Face

Install the Hub client:

```bash
pip install huggingface_hub
```

Log in once:

```bash
huggingface-cli login
```

Publish a training run folder (auto-picks the best checkpoint by lowest `val_mae` in the filename):

```bash
python -m coe.light.model.publish_to_hf \
  --repo-id <your-hf-username>/<model-name> \
  --version-dir /path/to/version_17
```

Or publish an explicit checkpoint:

```bash
python -m coe.light.model.publish_to_hf \
  --repo-id <your-hf-username>/<model-name> \
  --checkpoint /path/to/model.ckpt \
  --hparams /path/to/hparams.yaml \
  --metrics /path/to/metrics.csv
```

Load it back:

```python
from huggingface_hub import hf_hub_download
from coe.light.model.main import LitORionModelOptimized

ckpt = hf_hub_download(repo_id="<your-hf-username>/<model-name>", filename="model.ckpt")
model = LitORionModelOptimized.load_from_checkpoint(ckpt, map_location="cpu")
model.eval()
``` -->

## Notes and Caveats

- This is a research codebase and is still being consolidated.
- Some scripts may require branch-specific import/path adjustments.
- Normalization and dataset utilities are partially duplicated across modules.
- Reproducibility depends on matching preprocessing, atlas/parcellation, and dataset splits.

## Citation

If this work helps your research, please cite:

```bibtex
@inproceedings{pal2026mnemodyn,
  title={MnemoDyn: Learning Resting State Dynamics from $40$K {FMRI} sequences},
  author={Sourav Pal and Viet Luong and Hoseok Lee and Tingting Dan and Guorong Wu and Richard Davidson and Won Hwa Kim and Vikas Singh},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=zexMILcQOV}
}
```