# BTA — R0 BLSP cohort (5 seeds)
A 5-seed cohort of the BLSP-class baseline adapter for the Beyond Transcript Alignment (BTA) research project.
- **Architecture:** Conv1d(1024→4096, k=4, s=4) → MLP-2 → RMSNorm → +modality token (84M trainable params); a hypothetical sketch follows this list
- **Encoder:** WavLM-Large layer 16 (microsoft/wavlm-large, frozen)
- **LLM:** Qwen3-8B (Qwen/Qwen3-8B, frozen, `enable_thinking=False`)
- **Loss:** $\mathcal{L}_{\mathrm{BLSP}} = \mathcal{L}_{\mathrm{task}} + \lambda_{\mathrm{KL}} \mathcal{L}_{\mathrm{KL}}$
- **Optimizer:** AdamW (lr=5e-5, weight_decay=0.01, grad clipping max_norm=1.0); cosine decay to 1e-6
- **Schedule:** seed 1234 = 600 steps with 500 warmup; seeds 1235–1238 = 300 steps with 250 warmup
- **Code:** https://github.com/Nurgali-Kadyrbek/frozen-speech-llm-stress
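For orientation, below is a minimal PyTorch sketch of the adapter and loss described above. It is a reconstruction under stated assumptions, not the repo's implementation: the class name `BLSPAdapter`, the MLP hidden width, the GELU activation, the modality-token placement, and how the calibrated init values are applied are all guesses; see the GitHub repo for the actual code.

```python
import torch
import torch.nn as nn

class BLSPAdapter(nn.Module):
    """Hypothetical reconstruction of the adapter spec'd above (not the repo's code).

    Maps frozen WavLM-Large layer-16 features (dim 1024) into the Qwen3-8B
    embedding space (dim 4096), downsampling the sequence 4x in the conv.
    Requires PyTorch >= 2.4 for nn.RMSNorm.
    """

    def __init__(self, d_in=1024, d_llm=4096, d_hidden=8192,
                 init_std=0.02205, rms_scale=0.0215):
        super().__init__()
        # Conv1d(1024 -> 4096, kernel 4, stride 4): 4x temporal downsampling.
        self.conv = nn.Conv1d(d_in, d_llm, kernel_size=4, stride=4)
        # "MLP-2": two linear layers; the hidden width is a guess (8192 roughly
        # reproduces the stated 84M trainable params, see the note below).
        self.mlp = nn.Sequential(
            nn.Linear(d_llm, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_llm),
        )
        self.norm = nn.RMSNorm(d_llm)
        # Learned modality token prepended to the speech span (placement assumed).
        self.modality_token = nn.Parameter(torch.zeros(1, 1, d_llm))
        # Calibrated init from adapter_init.json; how the std is applied is assumed.
        for p in self.parameters():
            if p.dim() > 1:
                nn.init.normal_(p, std=init_std)
        with torch.no_grad():
            self.norm.weight.fill_(rms_scale)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, 1024) WavLM layer-16 features
        x = self.conv(feats.transpose(1, 2)).transpose(1, 2)  # (B, T//4, 4096)
        x = self.norm(self.mlp(x))
        tok = self.modality_token.expand(x.size(0), -1, -1)
        return torch.cat([tok, x], dim=1)                     # (B, 1 + T//4, 4096)

def blsp_loss(task_loss, kl_loss, lambda_kl):
    # L_BLSP = L_task + lambda_KL * L_KL; the lambda_KL value is not given on this card.
    return task_loss + lambda_kl * kl_loss
```

With d_hidden=8192 the sketch comes to about 83.9M trainable parameters (16.78M conv + 67.12M MLP + token and norm), close to the card's 84M figure, which is the only reason that width was chosen.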
## Files
| File | Size / contents |
|---|---|
| A_BLSP_seed1234.pt | 336 MB |
| A_BLSP_seed1235.pt | 336 MB |
| A_BLSP_seed1236.pt | 336 MB |
| A_BLSP_seed1237.pt | 336 MB |
| A_BLSP_seed1238.pt | 336 MB |
| adapter_init.json | calibrated init (std_8B=0.02205, RMSNorm scale=0.0215) |
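The calibrated init lives in the small JSON file; a quick way to inspect it is sketched below. The field names inside adapter_init.json are not documented on this card, so treat the expected contents as assumptions:

```python
import json
from huggingface_hub import hf_hub_download

# Download and print the calibrated-init config; the schema is undocumented,
# so inspect the dict rather than assuming key names.
path = hf_hub_download("nur-dev/frozen-stress-r0-blsp", "adapter_init.json")
with open(path) as f:
    init_cfg = json.load(f)
print(init_cfg)  # expect the calibrated std (0.02205) and RMSNorm scale (0.0215)
```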
## Reported metrics (5-seed cohort)
| Metric | Mean ± σ |
|---|---|
| Probe-G total | 0.6306 ± 0.0095 |
| Probe-G$_{\mathrm{neutral}}$ | 0.5122 ± 0.0108 |
| Probe-G$_{\mathrm{explicit}}$ | 0.7491 ± 0.0120 |
| Probe-K linear eval_full | 0.2105 ± 0.0128 |
| Probe-K MLP-2 eval_full | 0.2446 ± 0.0339 |
Sub-cascade collapse: every seed's Probe-K linear score falls below the $K_T$ text-only baseline of 0.290.
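For reproduction, the cohort numbers are presumably a plain mean and per-seed deviation over the five seeds; here is a minimal sketch. The per-seed scores below are placeholders (the card reports only the aggregate), and whether σ is the sample or population deviation is an assumption:

```python
import statistics

K_T_BASELINE = 0.290  # text-only K_T Probe-K baseline quoted above

# Placeholder per-seed Probe-K linear scores; substitute your own probe results.
scores = {1234: 0.21, 1235: 0.20, 1236: 0.22, 1237: 0.21, 1238: 0.21}

mean = statistics.mean(scores.values())
sigma = statistics.stdev(scores.values())  # sample std-dev (n-1) over the 5 seeds
collapsed = all(s < K_T_BASELINE for s in scores.values())
print(f"Probe-K linear: {mean:.4f} ± {sigma:.4f}; all seeds below K_T: {collapsed}")
```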
## Usage
```python
import torch
from huggingface_hub import hf_hub_download

# Download one seed's adapter checkpoint and load the raw state dict on CPU.
ckpt = hf_hub_download("nur-dev/frozen-stress-r0-blsp", "A_BLSP_seed1234.pt")
state = torch.load(ckpt, map_location="cpu")
# Adapter loading — see scripts/stage2_eval.py in the GitHub repo
```
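If you pair the checkpoint with the hypothetical `BLSPAdapter` sketch from above, loading might look like the following. The checkpoint's key layout is an assumption (it may be nested under something like a 'model' key), so inspect the keys first:

```python
adapter = BLSPAdapter()  # the sketch class defined earlier, not the repo's class
print(list(state.keys())[:5])  # check the actual key layout before loading
missing, unexpected = adapter.load_state_dict(state, strict=False)
print(f"missing: {missing}\nunexpected: {unexpected}")
```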
## License
CC-BY-NC-4.0. The weights inherit the non-commercial restriction from the training datasets (Stress-17K-raw, StressPresso, Expresso); academic research and ablation studies are permitted.