	---
	license: mit
	pipeline_tag: feature-extraction
	tags:
	- fmri
	- mindeye2
	- brain-decoding
	- multimodal
	- text-alignment
	---

	# TextAlign Model for MindEye2

	This repository contains the pre-trained weights and derived features for [TextAlign-mindeye2](https://github.com/YKT-668/TextAlign-mindeye2).

	- GitHub codebase: [YKT-668/TextAlign-mindeye2](https://github.com/YKT-668/TextAlign-mindeye2)
	- Aligned commit: `579ab6e1cb31f5e9e539fdccfef4c29984f5e870`

	## Model Description
	TextAlign improves fMRI-to-image and fMRI-to-text retrieval by aligning brain representations with fine-grained text embeddings. It is built on top of MindEye2 (Scotti et al., 2024).

	- Input: fMRI betas (flattened cortical surface vertices).
	- Output: CLIP L/14 latent embeddings (Vision & Text aligned).
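Retrieval with embeddings of this kind is commonly scored by cosine similarity between a predicted embedding and a pool of candidate embeddings. The sketch below illustrates that idea only. It is not the repository's evaluation code, and the function names and toy 3-dimensional vectors are hypothetical stand-ins for real CLIP-sized embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top1_retrieval(predicted, candidates):
    """For each predicted embedding, return the index of the closest candidate."""
    results = []
    for p in predicted:
        scores = [cosine_similarity(p, c) for c in candidates]
        results.append(max(range(len(scores)), key=scores.__getitem__))
    return results


# Toy vectors standing in for real embeddings (hypothetical data).
candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
predicted = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]
print(top1_retrieval(predicted, candidates))  # [0, 2]
```

In practice the same computation would run on batched tensors. Plain Python lists are used here only to keep the example dependency-free.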

	## Directory Structure

	### `checkpoints/`
	- `s1_textalign_stage1_FINAL_BEST_32/last.pth` (25 GB)
	  - The final Stage 1 model.
	  - Trained with counterfactual hard negatives.
	  - Use this for inference.
	- `s1_textalign_stage0_repair_80G/last.pth` (23 GB)
	  - The intermediate Stage 0 model (pre-training).
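Because each checkpoint is over 20 GB, it can be convenient to fetch only the file you need instead of cloning the whole repository. A minimal sketch using `hf_hub_download` from `huggingface_hub` follows. The repository ID `ykt668/textalign-mindeye2-model` is assumed from the page hosting this model card, and `download_checkpoint` is a hypothetical helper name.

```python
# Assumed repository ID (taken from the hosting page, not stated in this README).
REPO_ID = "ykt668/textalign-mindeye2-model"
# Path of the final Stage 1 checkpoint as listed above.
STAGE1_CKPT = "checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth"


def download_checkpoint(filename=STAGE1_CKPT, local_dir="."):
    """Download a single file from the Hub and return its local path."""
    # Imported lazily so this module still loads when huggingface_hub is absent.
    from huggingface_hub import hf_hub_download

    # With local_dir set, the file keeps its repository-relative path,
    # e.g. ./checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth
    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)


# Usage (downloads about 25 GB, so it is left commented out):
# path = download_checkpoint()
# print(path)
```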

	### `features/`
	Contains pre-computed text features required to run training or evaluation without access to the full NSD captions (which are restricted).
	- `train_coco_text_clip.pt`
	- `train_coco_captions.json`
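The schema of these files is not documented here, so it is worth inspecting them before wiring them into a pipeline. The sketch below summarizes the top-level structure of a JSON file without assuming any particular layout. `summarize_json` is a hypothetical helper, not part of the codebase. The `.pt` file is presumably a PyTorch-serialized object and is not loaded in this example.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level container without assuming a schema."""
    with Path(path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    summary = {
        "type": type(data).__name__,
        "length": len(data) if isinstance(data, (list, dict)) else None,
    }
    if isinstance(data, dict):
        # Show a few keys so the layout can be checked by eye.
        summary["sample_keys"] = list(data)[:3]
    return summary


# Usage (path as listed above, relative to the downloaded repository root):
# print(summarize_json("features/train_coco_captions.json"))
```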

	## Usage (Inference)

	Please refer to the [GitHub Repository](https://github.com/YKT-668/TextAlign-mindeye2) for installation instructions.

	```bash
	# Example: Reconstruction Inference
	python src/recon_inference_run.py \
	--subject 1 \
	--ckpt_path checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth \
	--eval_only
	```

	## Licensing
	- Weights are released under the MIT License.
	- Derived features (`features/`) remain subject to the original NSD/COCO terms. Do not redistribute the raw data.