TRIBE: TRImodal Brain Encoder for whole-brain fMRI response prediction
Paper • 2507.22229 • Published
NeuroVista is a production-grade, compact multimodal brain-response prediction model inspired by Meta's TRIBE v2 and related neural encoding research. It predicts human brain (fMRI BOLD-like) responses to text, image, audio, and video stimuli, and provides interpretable explanations, region-level summaries, cautious Q&A, and 3D brain visualizations.
┌─────────────────────────────────────────────────────────────────┐
│                    NeuroVista Architecture                      │
├─────────────────────────────────────────────────────────────────┤
│  Modality Encoders (frozen pretrained + LoRA adapters)          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │ Text     │  │ Image    │  │ Audio    │  │ Video    │         │
│  │ OPT-1.3B │  │CLIP ViT-B│  │ Whisper  │  │CLIP+Temp │         │
│  │ ~2.6GB   │  │ ~570MB   │  │ ~280MB   │  │ ~100MB   │         │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘         │
│       └─────────────┼─────────────┘             │               │
│                     │                           │               │
│                     ▼                           ▼               │
│              ┌───────────────────────────────────┐              │
│              │ Cross-Modal Fusion (RoPE)         │              │
│              │ 2 layers, 8 heads, 512 dim        │              │
│              │ ~20MB                             │              │
│              └─────────────────┬─────────────────┘              │
│                                ▼                                │
│              ┌───────────────────────────────────┐              │
│              │ Brain Decoder (temporal trans)    │              │
│              │ 4 layers, 8 heads, 512 dim        │              │
│              │ Subject-conditioned + population  │              │
│              │ ~100MB                            │              │
│              └─────────────────┬─────────────────┘              │
│                                ▼                                │
│              ┌───────────────────────────────────┐              │
│              │ Interpretation Heads              │              │
│              │ ROI, Network, Modality, Unc.      │              │
│              │ ~50MB                             │              │
│              └───────────────────────────────────┘              │
└─────────────────────────────────────────────────────────────────┘
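The fusion stage in the diagram is labelled RoPE (rotary position embeddings). As a rough illustration of that technique only, here is a minimal NumPy sketch. It uses the common "rotate-half" formulation with a base of 10000 and a per-head width of 64 (512 dim / 8 heads). The function name `rope`, the base, and the pairing layout are assumptions made for this example, not details taken from the NeuroVista code.

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Illustrative sketch: feature i is paired with feature i + dim/2 and
    the pair at position p is rotated by the angle p * base**(-i / (dim/2)).
    """
    seq_len, dim = x.shape
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # One rotation frequency per feature pair.
    inv_freq = base ** (-np.arange(half) / half)
    # angles[p, i] = p * inv_freq[i]
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation applied to every (x1, x2) pair.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

# Assumed per-head width: 512 model dim / 8 heads = 64.
head_dim = 512 // 8
rng = np.random.default_rng(0)
# Repeat one query and one key vector at 6 positions to expose the
# relative-position property of the attention scores.
q = np.tile(rng.normal(size=(1, head_dim)), (6, 1))
k = np.tile(rng.normal(size=(1, head_dim)), (6, 1))
scores = rope(q) @ rope(k).T
```

With identical content at every position, `scores[m, n]` depends only on the offset `m - n`, which is the property that makes rotary embeddings attractive for attention over time-aligned features.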
Total Deployable Size: ~3.5-5 GB (fits 6-10 GB target with headroom)
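The total can be checked against the per-component figures in the diagram. The short sketch below simply sums them; the dictionary keys are illustrative names, and the conversion assumes 1 GB = 1000 MB.

```python
# Approximate component sizes in MB, read off the architecture diagram.
COMPONENT_MB = {
    "text_encoder_opt_1_3b": 2600,
    "image_encoder_clip_vit_b": 570,
    "audio_encoder_whisper": 280,
    "video_clip_temporal": 100,
    "cross_modal_fusion": 20,
    "brain_decoder": 100,
    "interpretation_heads": 50,
}

def total_gb(sizes_mb: dict) -> float:
    """Sum component sizes and convert MB to GB (assuming 1 GB = 1000 MB)."""
    return sum(sizes_mb.values()) / 1000

total = total_gb(COMPONENT_MB)
print(f"Estimated total: {total:.2f} GB")
```

The listed components add up to roughly 3.7 GB, which sits inside the stated ~3.5-5 GB range. The text encoder accounts for about 70% of that figure.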
The Q&A component answers questions using only predicted activation maps and atlas knowledge. All answers include calibrated uncertainty language and explicit caveats. The model never claims to read minds or diagnose.
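One way such hedged phrasing could be produced is to map a predicted activation and its uncertainty onto a qualifier. The sketch below is purely illustrative: the `RoiPrediction` class, the `hedge` function, the ratio thresholds (2.0 and 1.0), and the wording are all hypothetical, and the thresholds are not calibrated against any data.

```python
from dataclasses import dataclass

@dataclass
class RoiPrediction:
    roi: str            # atlas region label
    activation: float   # hypothetical: predicted z-scored response
    uncertainty: float  # hypothetical: predictive standard deviation

def hedge(p: RoiPrediction) -> str:
    """Turn one ROI prediction into a cautiously worded sentence."""
    # Ratio of effect size to uncertainty; thresholds are illustrative only.
    ratio = abs(p.activation) / max(p.uncertainty, 1e-8)
    if ratio >= 2.0:
        qualifier = "likely"
    elif ratio >= 1.0:
        qualifier = "possibly"
    else:
        qualifier = "not reliably"
    direction = "above" if p.activation >= 0 else "below"
    return (
        f"The predicted response in {p.roi} is {qualifier} {direction} baseline "
        f"(z = {p.activation:+.2f}, uncertainty = {p.uncertainty:.2f}). "
        "This is a model estimate, not a measurement, and it does not indicate "
        "what a person is thinking or support any diagnosis."
    )

strong = hedge(RoiPrediction("V1", 1.2, 0.4))
weak = hedge(RoiPrediction("A1", -0.3, 0.5))
```

Every sentence carries the same caveat regardless of the qualifier, so a confident prediction is still framed as an estimate.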
pip install -r requirements.txt
python -m neurovista.scripts.train --config configs/base_config.yaml --data_dir ./data --dry_run
python -m neurovista.scripts.infer --model_dir ./model_export --text "A natural scene" --image scene.jpg --output_dir ./output
python neurovista/demo.py
MIT License. Research use only. Not for clinical diagnosis.