BTRM+ (Bradley-Terry Reward Model Plus)

Multi-head reward models for corpus membership and structural genre classification. Trained on situated dialogue from video games and synthetic settings.

Models in This Repository

| Model | Base | Heads | Training | Logsquare | Loss | L2 Drift |
|---|---|---|---|---|---|---|
| qwen_2head_probe/ | Qwen2.5-0.5B | 2 | 1 epoch (LoRA) | 0.1 | ~0.42 | 0.00 (frozen) |
| gemma_2head_probe/ | Gemma-3 270M | 2 | 1 epoch (LoRA) | 0.1 | ~0.38 | 0.00 (frozen) |
| gemma_9head_btrm/ | Gemma-3 270M | 9 | 10x coverage | 0.01 | 0.32 | 15.53 (full FT) |

Training Evolution

Phase 1: Frozen Probes (LoRA)

  • Quick validation that Bradley-Terry loss works
  • Base transformer frozen; only the adapter + BTRM heads trained (see the sketch after this list)
  • Higher logsquare (0.1) = stronger regularization toward unit logits
  • Result: Loss converges, but limited expressivity
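
The Phase 1 recipe can be approximated with a standard LoRA setup. A minimal sketch, assuming the `peft` library; the base repo id, adapter rank, and target module names are illustrative assumptions, not values confirmed by this card:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Assumed base checkpoint; substitute the Qwen2.5-0.5B id for the Qwen probe.
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")

lora_cfg = LoraConfig(
    r=8,                                  # assumed adapter rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    lora_dropout=0.05,
)
model = get_peft_model(base, lora_cfg)    # base transformer stays frozen
model.print_trainable_parameters()        # adapters only; the BTRM heads are
                                          # separate modules trained jointly
```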

Phase 2: Full Fine-Tuning

  • Unfroze base transformer for end-to-end training
  • Lower logsquare (0.01) = allows larger logit magnitudes
  • Added synthetic corpora + structural genre heads
  • Result: 2x more weight drift, better discrimination

Weight Drift Analysis

Post-training comparison against original pre-trained weights:

Frozen (LoRA) Models: Zero drift on base transformer

```
qwen_2head_probe:  0.00 L2 (472M params unchanged)
gemma_2head_probe: 0.00 L2 (253M params unchanged)
```

Full Fine-Tuned Model: Significant drift, especially in MLP layers

```
gemma_9head_btrm: 15.53 L2 total (268M params)
  - MLP:       11.20 L2 (3.26% relative)
  - Embedding:  7.94 L2 (1.60% relative)
  - Attention:  7.26 L2 (2.07% relative)
  - Norm:       0.01 L2 (0.00% relative)
```

Top drifting layers are MLP down_proj weights (up to 15.7% relative change).
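
A sketch of how these numbers could be reproduced, comparing fine-tuned weights against the original checkpoint (function names here are illustrative, not from this repo). Note that the per-group figures combine as a root-sum-of-squares: sqrt(11.20^2 + 7.94^2 + 7.26^2 + 0.01^2) ≈ 15.53, the reported total.

```python
import torch

def l2_drift(before: dict, after: dict) -> dict:
    """Per-parameter L2 drift and drift relative to the original norm."""
    report = {}
    for name, w0 in before.items():
        delta = (after[name].float() - w0.float()).norm().item()
        rel = delta / (w0.float().norm().item() + 1e-12)
        report[name] = (delta, rel)
    return report

def group_l2(report: dict, substring: str) -> float:
    """Total L2 over a layer group (e.g. "mlp"), as root-sum-of-squares."""
    return sum(d ** 2 for name, (d, _) in report.items() if substring in name) ** 0.5
```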

Head Types

Corpus Membership (6 heads in 9-head model)

Score whether text belongs to a specific narrative setting:

| Head | Description | In Probes? |
|---|---|---|
| oblivion | Imperial fantasy RPG (TES IV) | Yes |
| fonv | Post-apocalyptic Western (Fallout NV) | Yes |
| skyrim | Nordic fantasy RPG (TES V) | 9-head only |
| gallia | Franco-Roman bureaucratic fantasy (synthetic) | 9-head only |
| marmotte | Alpine corporate dystopia (synthetic) | 9-head only |
| sanguo | Three Kingdoms romance/otome (synthetic) | 9-head only |

Structural Genre (3 heads, 9-head model only)

Score text format/style:

| Head | Description |
|---|---|
| multiturn_dialogue | Raw quoted dialogue walks |
| fk_normed_prose | Flesch-Kincaid controlled prose |
| brainrot_aesop | Vocabulary teaching passages |

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load 9-head model (full fine-tuned)
model = AutoModelForCausalLM.from_pretrained(
    "SQCU/brainrot-partition-BTRMplus",
    subfolder="gemma_9head_btrm/base_model",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(
    "SQCU/brainrot-partition-BTRMplus",
    subfolder="gemma_9head_btrm/base_model",
)

# Load BTRM heads
from huggingface_hub import hf_hub_download
btrm_path = hf_hub_download(
    "SQCU/brainrot-partition-BTRMplus",
    "gemma_9head_btrm/btrm_heads.pt"
)
btrm_state = torch.load(btrm_path, map_location="cpu")
# btrm_state["btrm_state_dict"] contains the head weights
# btrm_state["head_names"] = ["skyrim", "oblivion", "fonv", ...]
```

Training Data

  • Reference: Oblivion, Fallout NV, Skyrim dialogue with emotion annotations
  • Synthetic: Gallia v9, Marmotte v6, Sanguo v1 (structural translation pipeline)
  • Negatives: Cross-corpus soft negatives, Wattpad, FineWeb, WikiText

Architecture

```
Input Text
    ↓
[Gemma-3 270M Transformer] ← frozen (probes) or fine-tuned (9-head)
    ↓
Last Hidden State (mean pooled)
    ↓
[RMSNorm → Linear(hidden → N_heads)]
    ↓
Per-head logits (soft tanh capped at ±10)
```

Loss: -log(sigmoid(pos - neg)) (Bradley-Terry negative log-likelihood) + logsquare regularization on logit magnitudes.
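
A minimal sketch of this objective; the exact logsquare form is an assumption, read here as log(|logit|)^2 so the penalty is minimized at unit logit magnitude, matching the Phase 1 note above:

```python
import torch
import torch.nn.functional as F

def btrm_loss(pos_logits, neg_logits, logsquare_coef=0.01, eps=1e-6):
    # Bradley-Terry negative log-likelihood: push pos above neg.
    bt = -F.logsigmoid(pos_logits - neg_logits).mean()
    # Assumed "logsquare" penalty: zero when |logit| = 1, so a larger
    # coefficient (0.1 in the probes) pulls logits toward unit scale.
    reg = (torch.log(pos_logits.abs() + eps).pow(2).mean()
           + torch.log(neg_logits.abs() + eps).pow(2).mean())
    return bt + logsquare_coef * reg
```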

Observations

  1. Reference corpora discriminate better than synthetic: the skyrim/oblivion heads are accurate, while gallia/sanguo are often confused.
  2. Structural heads work very well: prose vs. dialogue vs. aesop are cleanly separated.
  3. Full fine-tuning helps: the 9-head model reaches lower loss than the frozen probes.
  4. MLP layers adapt most: down_proj weights show the highest relative drift.

License

  • Base model weights: Google Gemma License / Qwen License
  • Training data: Bethesda game dialogue (fair use for research), synthetic generation
