---
license: mit
pipeline_tag: feature-extraction
tags:
- fmri
- mindeye2
- brain-decoding
- multimodal
- text-alignment
---

# TextAlign Model for MindEye2

This repository contains the pre-trained weights and derived features for **[TextAlign-mindeye2](https://github.com/YKT-668/TextAlign-mindeye2)**.

**GitHub Codebase:** [YKT-668/TextAlign-mindeye2](https://github.com/YKT-668/TextAlign-mindeye2)
**Aligned Commit:** `579ab6e1cb31f5e9e539fdccfef4c29984f5e870`

## Model Description
TextAlign improves fMRI-to-image and fMRI-to-text retrieval by aligning brain representations with fine-grained text embeddings. It is built on top of MindEye2 (Scotti et al., 2024).

- **Input:** fMRI betas (flattened cortical surface vertices).
- **Output:** CLIP L/14 latent embeddings (Vision & Text aligned).
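
To make the retrieval setting concrete, here is a minimal sketch of top-1 retrieval by cosine similarity between a predicted embedding and a set of candidate embeddings. It is a generic illustration, not the repository's evaluation code: the function names `cosine` and `retrieve` and the toy 3-dimensional vectors are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def retrieve(query, candidates):
    """Return the index of the candidate most similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy vectors standing in for embeddings; real CLIP latents are much larger.
predicted = [0.9, 0.1, 0.0]          # hypothetical embedding predicted from fMRI
candidates = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
best = retrieve(predicted, candidates)
print(best)
```

The same ranking applies whether the candidates are image embeddings (fMRI-to-image) or text embeddings (fMRI-to-text).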

## Directory Structure

### `checkpoints/`
- **`s1_textalign_stage1_FINAL_BEST_32/last.pth`** (25 GB)
  - The final Stage 1 model.
  - Trained with counterfactual hard negatives.
  - **Use this for inference.**
- **`s1_textalign_stage0_repair_80G/last.pth`** (23 GB)
  - The intermediate Stage 0 model (pre-training).
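
Because both checkpoints are large, it can help to confirm that the expected file is in place before launching a run. The sketch below resolves a checkpoint path from the layout listed above. The helper name `resolve_checkpoint` and the `stage0`/`stage1` labels are hypothetical, and the code assumes the repository was downloaded with its directory structure intact.

```python
from pathlib import Path

# Relative paths copied from the listing above; the keys are illustrative labels.
CHECKPOINTS = {
    "stage1": "s1_textalign_stage1_FINAL_BEST_32/last.pth",
    "stage0": "s1_textalign_stage0_repair_80G/last.pth",
}


def resolve_checkpoint(root, stage="stage1"):
    """Return the path of the requested checkpoint, raising if it is absent."""
    if stage not in CHECKPOINTS:
        raise KeyError(f"unknown stage {stage!r}; expected one of {sorted(CHECKPOINTS)}")
    path = Path(root) / "checkpoints" / CHECKPOINTS[stage]
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path
```

Calling `resolve_checkpoint(".")` from the download directory returns the Stage 1 path, which is the one recommended for inference.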

### `features/`
Pre-computed text features that let you run training or evaluation without access to the full NSD captions, which are restricted.
- `train_coco_text_clip.pt`
- `train_coco_captions.json`
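
A quick sanity check after downloading is to load both files and confirm that they line up. The sketch below rests on assumptions that this card does not confirm: that `train_coco_captions.json` holds a JSON list with one entry per sample, and that `train_coco_text_clip.pt` holds a single tensor whose first dimension indexes the same samples. The helper names are hypothetical.

```python
import json
from pathlib import Path


def load_text_features(features_dir):
    """Load the captions and the CLIP text features from `features/`."""
    import torch  # imported lazily so check_alignment works without PyTorch

    features_dir = Path(features_dir)
    with open(features_dir / "train_coco_captions.json", encoding="utf-8") as f:
        captions = json.load(f)  # assumed: a list with one entry per sample
    # Assumed: the file stores one tensor of shape (num_samples, ...).
    feats = torch.load(features_dir / "train_coco_text_clip.pt", map_location="cpu")
    return captions, feats


def check_alignment(captions, n_feature_rows):
    """Raise ValueError if the caption count differs from the feature row count."""
    if len(captions) != n_feature_rows:
        raise ValueError(
            f"{len(captions)} captions but {n_feature_rows} feature rows"
        )
    return True
```

Under those assumptions, usage would be `captions, feats = load_text_features("features")` followed by `check_alignment(captions, feats.shape[0])`. If the files are structured differently, adjust the loading step accordingly.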

## Usage (Inference)

See the [GitHub repository](https://github.com/YKT-668/TextAlign-mindeye2) for installation instructions.

```bash
# Example: Reconstruction Inference
python src/recon_inference_run.py \
    --subject 1 \
    --ckpt_path checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth \
    --eval_only
```

## Licensing
- Weights are released under the MIT License.
- Derived features (`features/`) remain subject to the original NSD and COCO terms. Do not redistribute the underlying source data.