stat214-lab3-ridge-models

Per-voxel ridge regression weight matrices for the Stat 214 (Spring 2026) final project at UC Berkeley. These models predict whole-brain BOLD signal from spoken-story transcripts on the Huth Lab fMRI dataset (Subjects 2 and 3).

Each `.pkl` file is a pickled Python dict with the following keys:

| Key | Type | Description |
|---|---|---|
| `weights` | `np.ndarray (D, V) float64` | Per-voxel ridge weights |
| `alphas` | `np.ndarray (V,)` | Per-voxel selected ridge alpha |
| `alpha_grid` | `np.ndarray (30,)` | Candidate alpha grid (`logspace(-1, 6, 30)`) |
| `train_mean_X` / `train_std_X` | feature-axis z-score stats | Needed to apply weights to new X |
| `train_mean_Y` / `train_std_Y` | voxel-axis z-score stats | Needed to invert predictions back to BOLD |
| `train_stories` / `test_stories` | `list[str]` | Provenance |

Here D is the feature dimension after 4-lag concatenation and V is the number of voxels per subject (94,251 for Subject 2; 95,556 for Subject 3).
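To make the "4-lag concatenation" concrete, here is a minimal sketch of how lagged copies of a feature matrix can be stacked side by side. The function name `make_lagged`, the delays `(1, 2, 3, 4)`, and the zero-padding of early rows are assumptions for illustration; this card does not state which delays or padding the project used.

```python
import numpy as np


def make_lagged(X: np.ndarray, delays=(1, 2, 3, 4)) -> np.ndarray:
    """Concatenate time-shifted copies of X (T, d) into a (T, d * len(delays)) matrix.

    The delays are an ASSUMPTION for illustration; the card only says "4-lag".
    Rows with no available history are left as zeros.
    """
    T, d = X.shape
    lagged = np.zeros((T, d * len(delays)), dtype=X.dtype)
    for i, delay in enumerate(delays):
        n_rows = max(T - delay, 0)  # rows that have a value `delay` steps back
        # Row t of this block holds the features from time t - delay.
        lagged[delay:delay + n_rows, i * d:(i + 1) * d] = X[:n_rows]
    return lagged


X = np.arange(12, dtype=float).reshape(6, 2)  # toy input: T=6 time points, d=2 features
X_lagged = make_lagged(X)
print(X_lagged.shape)  # (6, 8): d=2 features times 4 lags
```

With four delays the output width is four times the base embedding width, which is where values such as D=1,200 for a 300-dimensional embedding come from.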

Files

| File family | Embedding | Notes |
|---|---|---|
| `ridge_bow_subject{2,3}.pkl` | Bag-of-Words | D=24,368 (6,092 vocab × 4 lags) |
| `ridge_word2vec_subject{2,3}.pkl` | Word2Vec (word2vec-google-news-300) | D=1,200 (300 × 4) |
| `ridge_glove_subject{2,3}.pkl` | GloVe (glove-wiki-gigaword-300) | D=1,200 (300 × 4) |
| `ridge_bert_pretrained_subject{2,3}.pkl` | bert-base-uncased layer-12 | D=3,072 (768 × 4) |
| `ridge_bert_lora_lora_r{4,8}_maxlen{128,256}_subject{2,3}.pkl` | LoRA fine-tuned BERT (4 configs) | D=3,072 |
| `ridge_bert_var_layer{4,8,11,12}_val_then_test_subject{2,3}.pkl` | BERT layer-{4,8,11,12} (val-then-test mode) | D=3,072 |
| `ridge_bert_var_{concat,avg}_4_8_12_val_then_test_subject{2,3}.pkl` | BERT 3-layer concat / avg | D=9,216 / 3,072 |
| `ridge_bert_var_layer8_full_train_subject{2,3}.pkl` | BERT layer-8 (winner, full 86-story train) | D=3,072 |
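Because `weights` is a dense `(D, V)` float64 array, its in-memory size follows directly from the D values in the table. The sketch below is plain arithmetic on the numbers stated in this card; the helper name `weights_nbytes` is hypothetical, and the on-disk size of each pickle may differ from the in-memory estimate.

```python
def weights_nbytes(n_features: int, n_voxels: int, itemsize: int = 8) -> int:
    """Bytes needed for a dense (D, V) array; itemsize=8 corresponds to float64."""
    return n_features * n_voxels * itemsize


V_SUBJECT2 = 94_251  # voxel count for Subject 2, taken from this card
feature_dims = {"bow": 24_368, "word2vec/glove": 1_200, "bert": 3_072, "bert concat": 9_216}

for name, D in feature_dims.items():
    gb = weights_nbytes(D, V_SUBJECT2) / 1e9
    print(f"{name:>15}: D={D:>6,} -> ~{gb:.1f} GB of float64 weights")
```

The Bag-of-Words weights are roughly 20 times larger than the Word2Vec or GloVe weights, so it is worth checking available RAM before loading that family.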

Loading

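import pickle

import numpy as np
from huggingface_hub import hf_hub_download

# Download one weight file from the Hub (cached locally after the first call).
path = hf_hub_download(
    repo_id="RheaTinghe/stat214-lab3-ridge-models",
    filename="ridge_bert_var_layer8_full_train_subject2.pkl",
)
# pickle can execute arbitrary code on load; only load files you trust.
with open(path, "rb") as f:
    model = pickle.load(f)

# X_test holds raw (not yet z-scored) test features of shape (T, D), built with
# the same embedding and 4-lag concatenation as the training features.
# Placeholder so the snippet runs; replace with real features.
X_test = np.zeros((10, model["weights"].shape[0]))

# Z-score with the training statistics, predict, then map back to BOLD units.
X_test_z = (X_test - model["train_mean_X"]) / model["train_std_X"]
Y_pred_z = X_test_z @ model["weights"]  # (T, D) @ (D, V) -> (T, V)
Y_pred = Y_pred_z * model["train_std_Y"] + model["train_mean_Y"]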
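The prediction steps above can be wrapped in a small function that also checks shapes before multiplying. This is a sketch, not the project's code: `predict_bold` is a hypothetical helper, the toy dict below is synthetic stand-in data, and it assumes the stored mean and std arrays broadcast against `(T, D)` and `(T, V)` matrices.

```python
import numpy as np


def predict_bold(model: dict, X: np.ndarray) -> np.ndarray:
    """Apply stored ridge weights to raw features X (T, D); return predictions (T, V)."""
    W = model["weights"]
    D, V = W.shape
    if X.shape[1] != D:
        raise ValueError(f"X has {X.shape[1]} features, but the weights expect {D}")
    if model["alphas"].shape != (V,):
        raise ValueError("alphas should hold one value per voxel")
    # ASSUMPTION: the stored stats broadcast over the time axis.
    X_z = (X - model["train_mean_X"]) / model["train_std_X"]
    Y_z = X_z @ W
    return Y_z * model["train_std_Y"] + model["train_mean_Y"]


# Synthetic stand-in for a loaded file (tiny D and V), for demonstration only.
rng = np.random.default_rng(0)
D, V, T = 8, 5, 10
toy_model = {
    "weights": rng.normal(size=(D, V)),
    "alphas": np.ones(V),
    "train_mean_X": np.zeros(D), "train_std_X": np.ones(D),
    "train_mean_Y": np.zeros(V), "train_std_Y": np.ones(V),
}
X_toy = rng.normal(size=(T, D))
Y_hat = predict_bold(toy_model, X_toy)
print(Y_hat.shape)  # (10, 5)
```

Passing a real loaded `model` and lagged feature matrix in place of the toy objects should behave the same way, provided the assumption about broadcastable statistics holds for the stored arrays.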

Citation

@misc{stat214lab3,
  author = {Galloro, Drew and Wang, Ruihang and Khothsombath, Benjamin and Zhang, Rhea},
  title  = {Stat 214 Lab 3: voxel-wise encoding of spoken stories},
  year   = {2026},
  note   = {UC Berkeley Spring 2026},
}
