TextAlign Model for MindEye2

This repository contains the pre-trained weights and derived features for TextAlign-mindeye2.

GitHub Codebase: YKT-668/TextAlign-mindeye2
Aligned Commit: `579ab6e1cb31f5e9e539fdccfef4c29984f5e870`

Model Description

TextAlign improves fMRI-to-image and fMRI-to-text retrieval by aligning brain representations with fine-grained text embeddings. It is built on top of MindEye2 (Scotti et al., 2024).

  • Input: fMRI betas (flattened cortical surface vertices).
  • Output: CLIP L/14 latent embeddings (Vision & Text aligned).

Directory Structure

checkpoints/

  • s1_textalign_stage1_FINAL_BEST_32/last.pth (25GB)
    • The final Stage 1 model.
    • Trained with counterfactual hard negatives.
    • Use this for inference.
  • s1_textalign_stage0_repair_80G/last.pth (23GB)
    • The intermediate Stage 0 model (pre-training).
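
Because both checkpoints are large, it can help to resolve the path and inspect the file before starting a full run. The following is a minimal sketch, not part of the codebase: the `stage0`/`stage1` labels and both helper names are hypothetical, and it assumes the files are ordinary `torch.save()` archives. The key layout inside the checkpoint is not documented here, so the sketch only lists the top-level keys.

```python
from pathlib import Path

# Directory names are taken from the listing above; the "stage0"/"stage1"
# keys are labels invented for this sketch.
CHECKPOINT_DIRS = {
    "stage1": "s1_textalign_stage1_FINAL_BEST_32",  # final model, use for inference
    "stage0": "s1_textalign_stage0_repair_80G",     # intermediate pre-training model
}


def checkpoint_path(root, stage="stage1"):
    """Return the path to last.pth for the given stage under root/checkpoints."""
    if stage not in CHECKPOINT_DIRS:
        raise ValueError(
            f"unknown stage {stage!r}; expected one of {sorted(CHECKPOINT_DIRS)}"
        )
    return Path(root) / "checkpoints" / CHECKPOINT_DIRS[stage] / "last.pth"


def load_checkpoint(path):
    """Load a checkpoint onto the CPU so its contents can be inspected."""
    import torch  # imported lazily so the path helper works without PyTorch

    # Assumption: the file is a standard torch.save() archive.
    return torch.load(path, map_location="cpu")


if __name__ == "__main__":
    path = checkpoint_path(".", "stage1")
    if path.is_file():
        ckpt = load_checkpoint(path)
        if isinstance(ckpt, dict):
            print(list(ckpt.keys()))  # inspect before relying on any key name
    else:
        print(f"checkpoint not found: {path}")
```

Loading onto the CPU avoids needing a GPU for inspection, but the full file still has to fit in host memory.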

features/

Contains pre-computed text features required to run training or evaluation without access to the full NSD captions (which are restricted).

  • train_coco_text_clip.pt
  • train_coco_captions.json
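
The internal layout of these two files is not described here, so a quick structural check is a reasonable first step before wiring them into training or evaluation. The sketch below is an assumption-laden helper rather than the repository's loader: `describe_json` and `describe_pt` are hypothetical names, and the `.pt` file is assumed to be a `torch.save()` archive holding either a tensor or a dict.

```python
import json


def describe_json(path):
    """Return a short description of the top-level structure of a JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Report only the container type and size; no schema is assumed.
    if isinstance(data, (list, dict)):
        return f"{type(data).__name__} with {len(data)} entries"
    return type(data).__name__


def describe_pt(path):
    """Return a short description of a torch.save() archive."""
    import torch  # imported lazily so describe_json works without PyTorch

    obj = torch.load(path, map_location="cpu")
    if hasattr(obj, "shape"):
        return f"tensor with shape {tuple(obj.shape)}"
    if isinstance(obj, dict):
        return f"dict with {len(obj)} keys"
    return type(obj).__name__
```

For example, `describe_json("features/train_coco_captions.json")` and `describe_pt("features/train_coco_text_clip.pt")` show whether the number of captions matches the number of text embeddings.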

Usage (Inference)

Please refer to the GitHub repository for installation instructions.

# Example: Reconstruction Inference
python src/recon_inference_run.py \
    --subject 1 \
    --ckpt_path checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth \
    --eval_only
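
To launch the same command from Python, one option is a thin wrapper that checks for the checkpoint before starting the process. This is a sketch under stated assumptions: `build_command` and `run_inference` are hypothetical helpers, the script path and flags are copied from the command above, and it is meant to be run from the repository root.

```python
import subprocess
import sys
from pathlib import Path


def build_command(subject, ckpt_path, script="src/recon_inference_run.py"):
    """Build the argument list for the reconstruction inference command."""
    return [
        sys.executable, script,
        "--subject", str(subject),
        "--ckpt_path", str(ckpt_path),
        "--eval_only",
    ]


def run_inference(subject, ckpt_path):
    """Run inference, failing early if the checkpoint file is missing."""
    ckpt = Path(ckpt_path)
    if not ckpt.is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(build_command(subject, ckpt), check=True)
```

Calling `run_inference(1, "checkpoints/s1_textalign_stage1_FINAL_BEST_32/last.pth")` is then equivalent to the shell command, with a clearer error when the weights have not been downloaded yet.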

Licensing

  • Weights are released under the MIT License.
  • Derived features (features/) remain subject to the original NSD and COCO terms of use. Do not redistribute the underlying raw data.