stat214-lab3-ridge-models
Per-voxel ridge regression weight matrices for the Stat 214 (Spring 2026) final project at UC Berkeley. These models predict whole-brain BOLD signal from spoken-story transcripts on the Huth Lab fMRI dataset (Subjects 2 and 3).
Each .pkl file is a Python dict with:
| Key | Type | Description |
|---|---|---|
| `weights` | `np.ndarray (D, V) float64` | Per-voxel ridge weights |
| `alphas` | `np.ndarray (V,)` | Per-voxel selected ridge alpha |
| `alpha_grid` | `np.ndarray (30,)` | Candidate alpha grid (`logspace(-1, 6, 30)`) |
| `train_mean_X` / `train_std_X` | feature-axis z-score stats | needed to apply weights to new X |
| `train_mean_Y` / `train_std_Y` | voxel-axis z-score stats | needed to invert prediction back to BOLD |
| `train_stories` / `test_stories` | `list[str]` | provenance |
Here D is the feature dimension after 4-lag concatenation, and V is the number of voxels per subject (94,251 for Subject 2; 95,556 for Subject 3).
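To make the relationship between the embedding size and D concrete, here is a minimal sketch of lag concatenation. The helper name `make_lagged` and the specific lags `(1, 2, 3, 4)` are assumptions for illustration; the project's actual delays and padding scheme are not stated here and may differ.

```python
import numpy as np

def make_lagged(features: np.ndarray, lags=(1, 2, 3, 4)) -> np.ndarray:
    """Stack time-shifted copies of a (T, d) matrix into (T, d * len(lags)).

    Hypothetical helper: the lag values and zero padding are assumptions.
    """
    T, d = features.shape
    out = np.zeros((T, d * len(lags)), dtype=features.dtype)
    for i, lag in enumerate(lags):
        n = max(T - lag, 0)  # number of rows that have a valid source row
        # Row t of block i holds the features from time t - lag;
        # the first `lag` rows of the block stay zero.
        out[lag:lag + n, i * d:(i + 1) * d] = features[:n]
    return out

# Toy 300-dimensional embeddings over 10 time points.
emb = np.random.default_rng(0).standard_normal((10, 300))
X = make_lagged(emb)
print(X.shape)  # (10, 1200)
```

With a 300-dimensional embedding and four lags, the output width is 300 × 4 = 1,200, which is the D listed for the Word2Vec and GloVe models.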
Files
| File family | Embedding | Notes |
|---|---|---|
| `ridge_bow_subject{2,3}.pkl` | Bag-of-Words | D=24,368 (6,092 vocab × 4 lags) |
| `ridge_word2vec_subject{2,3}.pkl` | Word2Vec (`word2vec-google-news-300`) | D=1,200 (300 × 4) |
| `ridge_glove_subject{2,3}.pkl` | GloVe (`glove-wiki-gigaword-300`) | D=1,200 |
| `ridge_bert_pretrained_subject{2,3}.pkl` | `bert-base-uncased` layer-12 | D=3,072 (768 × 4) |
| `ridge_bert_lora_lora_r{4,8}_maxlen{128,256}_subject{2,3}.pkl` | LoRA fine-tuned BERT (4 configs) | D=3,072 |
| `ridge_bert_var_layer{4,8,11,12}_val_then_test_subject{2,3}.pkl` | BERT layer-{4,8,11,12} (val-then-test mode) | D=3,072 |
| `ridge_bert_var_{concat,avg}_4_8_12_val_then_test_subject{2,3}.pkl` | BERT 3-layer concat / avg | D=9,216 / 3,072 |
| `ridge_bert_var_layer8_full_train_subject{2,3}.pkl` | BERT layer-8 (winner, full 86-story train) | D=3,072 |
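The file families above use shell-style brace shorthand. As a small convenience, the sketch below expands one pattern into concrete filenames. The helper `expand_braces` is hypothetical, and the sketch assumes the real filenames are the literal expansions of each pattern, which is consistent with the single filename used in the loading example.

```python
import itertools
import re

def expand_braces(pattern: str) -> list[str]:
    """Expand non-nested {a,b,c} groups in a pattern into concrete names."""
    # Splitting on a capturing group keeps the group contents:
    # even indices are literal text, odd indices are comma-separated choices.
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]

names = expand_braces(
    "ridge_bert_var_layer{4,8,11,12}_val_then_test_subject{2,3}.pkl"
)
print(len(names))  # 8
print(names[0])    # ridge_bert_var_layer4_val_then_test_subject2.pkl
```

Any of the resulting names can then be passed as the `filename` argument when downloading a file.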
Loading
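```python
import pickle

import numpy as np  # X_test below is expected to be a NumPy array
from huggingface_hub import hf_hub_download

# Download one weight file from the Hub (cached locally after the first call).
path = hf_hub_download(
    repo_id="RheaTinghe/stat214-lab3-ridge-models",
    filename="ridge_bert_var_layer8_full_train_subject2.pkl",
)
with open(path, "rb") as f:
    model = pickle.load(f)

# Predict for new raw test features X_test (T x D), supplied by the caller:
# z-score X with the training stats, apply the weights, then undo the Y z-scoring.
X_test_z = (X_test - model["train_mean_X"]) / model["train_std_X"]
Y_pred_z = X_test_z @ model["weights"]
Y_pred = Y_pred_z * model["train_std_Y"] + model["train_mean_Y"]
```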
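The three prediction steps can also be wrapped in a function with a shape check. This is a sketch under stated assumptions rather than the project's own code: `predict_bold` is a hypothetical name, the stored z-score statistics are assumed to broadcast against `(T, D)` and `(T, V)` arrays, and the zero-variance guard is an addition that may not match how the training pipeline handled constant features. A small random dict stands in for a downloaded model so the example runs offline.

```python
import numpy as np

def predict_bold(model: dict, X: np.ndarray) -> np.ndarray:
    """Apply stored ridge weights to raw lagged features X of shape (T, D)."""
    W = model["weights"]
    if X.shape[1] != W.shape[0]:
        raise ValueError(
            f"X has {X.shape[1]} features, weights expect {W.shape[0]}"
        )
    # Assumption: replace zero standard deviations with 1 to avoid dividing by zero.
    std_X = np.where(model["train_std_X"] == 0, 1.0, model["train_std_X"])
    X_z = (X - model["train_mean_X"]) / std_X
    # Map z-scored predictions back to the original BOLD scale.
    return (X_z @ W) * model["train_std_Y"] + model["train_mean_Y"]

# Toy stand-in for a loaded .pkl dict (random values, tiny dimensions).
rng = np.random.default_rng(0)
D, V, T = 12, 5, 7
toy = {
    "weights": rng.standard_normal((D, V)),
    "train_mean_X": rng.standard_normal(D),
    "train_std_X": np.ones(D),
    "train_mean_Y": rng.standard_normal(V),
    "train_std_Y": np.ones(V),
}
Y_hat = predict_bold(toy, rng.standard_normal((T, D)))
print(Y_hat.shape)  # (7, 5)
```

The shape check catches the most likely mistake, which is passing features built with a different embedding or lag count than the chosen weight file expects.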
Citation
```bibtex
@misc{stat214lab3,
  author = {Galloro, Drew and Wang, Ruihang and Khothsombath, Benjamin and Zhang, Rhea},
  title  = {Stat 214 Lab 3: voxel-wise encoding of spoken stories},
  year   = {2026},
  note   = {UC Berkeley Spring 2026},
}
```