---
license: mit
pipeline_tag: feature-extraction
tags:
- fmri
- mindeye2
- brain-decoding
- multimodal
- text-alignment
---

# TextAlign Model for MindEye2

This repository contains the pre-trained weights and derived features for **[TextAlign-mindeye2](https://github.com/YKT-668/TextAlign-mindeye2)**.

- **GitHub Codebase:** [YKT-668/TextAlign-mindeye2](https://github.com/YKT-668/TextAlign-mindeye2)
- **Aligned Commit:** `579ab6e1cb31f5e9e539fdccfef4c29984f5e870`

## Model Description

TextAlign improves fMRI-to-image and fMRI-to-text retrieval by aligning brain representations with fine-grained text embeddings. It is built on top of MindEye2 (Scotti et al., 2024).

- **Input:** fMRI betas (flattened cortical surface vertices).
- **Output:** CLIP L/14 latent embeddings (Vision & Text aligned).

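
To make the retrieval idea concrete, here is a minimal sketch of how a predicted embedding could be matched against candidate text embeddings by cosine similarity. It is an illustration only and not the repository's evaluation code: the function names `cosine_similarity` and `retrieve` are hypothetical, and the toy 3-dimensional vectors stand in for real CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, candidates):
    """Return candidate indices ranked by similarity to the query, best first."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins (hypothetical values); real CLIP embeddings are much larger.
predicted_from_fmri = [0.9, 0.1, 0.0]
text_embeddings = [
    [0.0, 1.0, 0.0],  # caption 0
    [1.0, 0.2, 0.0],  # caption 1
    [0.0, 0.0, 1.0],  # caption 2
]

ranking = retrieve(predicted_from_fmri, text_embeddings)
print("best matching caption index:", ranking[0])
```

In this toy setup, the caption whose embedding points in nearly the same direction as the predicted embedding is ranked first. Better alignment between brain-derived and text embeddings would raise the rank of the correct caption.
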
## Directory Structure

### `checkpoints/`

- **`s1_textalign_stage1_FINAL_BEST_32/last.pth`** (25GB)
  - The final Stage 1 model.
  - Trained with counterfactual hard negatives.
  - **Use this for inference.**
- **`s1_textalign_stage0_repair_80G/last.pth`** (23GB)
  - The intermediate Stage 0 model (pre-training).

### `features/`

Contains pre-computed text features required to run training or evaluation without access to the full NSD captions (which are restricted).

- `train_coco_text_clip.pt`
- `train_coco_captions.json`

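
Before wiring the features into training or evaluation, it can help to inspect the captions file. The sketch below assumes only that `train_coco_captions.json` is valid JSON whose top level is a list or an object. Its actual schema is not documented here, and `summarize_captions` is a hypothetical helper, not part of the codebase.

```python
import json
from pathlib import Path


def summarize_captions(path):
    """Load a captions JSON file and report its top-level type and entry count.

    Assumes the top-level value is a list or a dict (an assumption, not a
    documented schema).
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    return {"type": type(data).__name__, "entries": len(data)}


if __name__ == "__main__":
    captions_path = Path("features/train_coco_captions.json")
    if captions_path.exists():
        print(summarize_captions(captions_path))
    else:
        print(f"{captions_path} not found; download the features/ folder first")
```

The `.pt` file is presumably a PyTorch-serialized object. If so, it would be read with `torch.load` rather than `json`.
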
## Usage (Inference)

Please refer to the [GitHub Repository](https://github.com/YKT-668/TextAlign-mindeye2) for installation instructions.

```bash
# Example: Reconstruction Inference
python src/recon_inference_run.py \
    --subject 1 \
    --ckpt_path checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth \
    --eval_only
```

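
If you prefer to launch the command from Python, a small wrapper can check that the 25GB checkpoint is present before starting. This is a sketch under assumptions: `build_command` and `run_inference` are hypothetical helpers, the script path and flags are copied from the command shown above, and paths are taken to be relative to the cloned repository root.

```python
import shlex
import subprocess
from pathlib import Path

# Stage 1 checkpoint recommended for inference (path relative to the repo root).
CKPT = Path("checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth")


def build_command(subject, ckpt_path=CKPT):
    """Return the argument list for one run of the documented inference script."""
    return [
        "python", "src/recon_inference_run.py",
        "--subject", str(subject),
        "--ckpt_path", str(ckpt_path),
        "--eval_only",
    ]


def run_inference(subject, ckpt_path=CKPT):
    """Run the script, failing early if the checkpoint file is missing."""
    if not Path(ckpt_path).is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt_path}")
    subprocess.run(build_command(subject, ckpt_path), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(build_command(1)))
```

Calling `run_inference(1)` from the repository root would then run the same command as the shell example, with `check=True` raising an error if the script exits with a non-zero status.
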
## Licensing

- Weights are released under the MIT License.
- Derived features (`features/`) respect the original NSD/COCO terms. Do not redistribute the underlying raw data.