SQCU committed (verified)
Commit 3724307 · 1 parent: d751a55

Initial upload: Gemma-3-270M and Qwen-0.5B fully fine-tuned BTRM models

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+gemma_btrm/base_model/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+qwen_btrm/base_model/tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,142 @@
# brainrot-partition-BTRM+

Multi-head Bradley-Terry reward models for situated dialogue classification.

**Fully fine-tuned models**: both the base LLM weights and the scoring heads were trained.

## What is this?

These are **7-head reward models** that score text along multiple dimensions simultaneously:

| Head | What it detects |
|------|-----------------|
| `skyrim` | Nordic fantasy RPG prose (Elder Scrolls V style) |
| `oblivion` | Imperial fantasy RPG prose (Elder Scrolls IV style) |
| `fonv` | Post-apocalyptic Western prose (Fallout: New Vegas style) |
| `gallia` | Franco-Roman bureaucratic fantasy (synthetic setting) |
| `marmotte` | Alpine corporate dystopia (synthetic setting) |
| `multiturn_dialogue` | Raw multi-turn dialogue (quoted speech) |
| `brainrot_aesop` | Vocabulary teaching passages with embedded definitions |

## Available Models

### 1. `qwen_btrm/` (962 MB)
- Architecture: Qwen2.5-0.5B + 7 linear heads
- Fully fine-tuned (all ~500M parameters updated)
- Characteristics: tighter score distribution, faster convergence

### 2. `gemma_btrm/` (545 MB)
- Architecture: Gemma-3-270M-IT + 7 linear heads
- Fully fine-tuned (all ~270M parameters updated)
- Characteristics: **50% more dynamic range**, better hard-negative rejection

## Quick Start

```bash
pip install torch transformers pyyaml
python compare.py --text "Your text here" --model gemma
```

## Pairwise Comparison (Recommended)

The 7-dimensional score vector can be overwhelming. Compare **pairs of heads** for intuitive results:

```bash
# Is this more like Oblivion prose or raw dialogue?
python compare.py --text "The Imperial City gleamed in the morning light..." \
    --heads oblivion,multiturn_dialogue

# Is this vocabulary teaching or regular fantasy prose?
python compare.py --text "The word quality means an essential attribute..." \
    --heads brainrot_aesop,skyrim
```

Output:
```
Pairwise Comparison:
  oblivion:           +0.42  (Imperial fantasy RPG)
  multiturn_dialogue: -0.31  (Raw quoted dialogue)
  ─────────────────────
  Δ = +0.73 → leans oblivion
```

## Architecture

Each model consists of:
1. **Fine-tuned base LLM** - the entire transformer was trained, not frozen
2. **7 linear scoring heads** - each projects the hidden state → a scalar score

```
text → tokenize → fine_tuned_LLM → last_hidden_state[:,-1,:] → head_i → score_i
```

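The pipeline above can be sketched in a few lines of PyTorch. This is illustrative only: a random vector stands in for the fine-tuned LLM's last-token hidden state, and the head count and hidden size (640, matching Gemma-3-270M) are placeholders; in the real models both come from the checkpoint.

```python
import torch

# Stand-ins for the trained heads: one Linear(hidden_dim, 1) per dimension.
hidden_dim = 640
head_names = ["skyrim", "oblivion", "fonv"]
heads = {name: torch.nn.Linear(hidden_dim, 1) for name in head_names}

# Stand-in for last_hidden_state[:, -1, :] from the fine-tuned LLM.
last_token_hidden = torch.randn(1, hidden_dim)

# Each head maps the shared hidden state to one scalar score.
scores = {name: head(last_token_hidden).item() for name, head in heads.items()}
print(scores)
```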
### Why full fine-tuning?

We trained with `use_lora: false`, meaning all base-model parameters were updated via the Bradley-Terry loss. This lets the model learn internal representations optimized for multi-head discrimination, not just the surface-level features a linear probe can detect.

## Training Details

- **Loss**: `L = L_BT + 0.1 * log(r²)` (Bradley-Terry ranking + logsquare regularization)
- **Optimizer**: AdamW, lr=5e-5, 200-step linear warmup
- **Precision**: BF16 (stable for both Qwen and Gemma)
- **Data**: ~1100 positives across 7 heads, ~1200 hard/soft negatives

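Read literally, the loss above can be sketched as follows. This is one plausible reading, not the training code: `r_pos`/`r_neg` are taken as the chosen/rejected rewards of a Bradley-Terry pair, and the logsquare term is applied to their combined squared magnitude with an epsilon to avoid `log(0)`.

```python
import torch
import torch.nn.functional as F

def btrm_loss(r_pos, r_neg, logsquare_weight=0.1, eps=1e-8):
    """Illustrative reading of L = L_BT + 0.1 * log(r^2)."""
    # Bradley-Terry ranking term: -log sigmoid(r_pos - r_neg).
    l_bt = -F.logsigmoid(r_pos - r_neg).mean()
    # "logsquare" regularizer on reward magnitude (assumption: combined pair).
    reg = torch.log(r_pos.pow(2) + r_neg.pow(2) + eps).mean()
    return l_bt + logsquare_weight * reg

loss = btrm_loss(torch.tensor([1.2, 0.8]), torch.tensor([-0.5, 0.1]))
```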
### Key Finding: Base-Model Receptivity

**Different base models respond differently to BTRM gradients.**

| Metric | Qwen 0.5B | Gemma 270M |
|--------|-----------|------------|
| Final loss | -0.66 | -0.14 |
| Score range | 2.12 | **3.25** |
| Hard neg (wiki) | moderate rejection | **strong rejection** |

Lower loss ≠ better discrimination. Gemma's wider dynamic range produces better contrast despite its "worse" loss.

The base-model architecture affects the score distribution **more** than head count or training-data size does.

## File Structure

```
qwen_btrm/
  base_model/       # Fully fine-tuned Qwen2.5-0.5B
  btrm_heads.pt     # Trained head weights
  config.yaml       # Training config

gemma_btrm/
  base_model/       # Fully fine-tuned Gemma-3-270M-IT
  btrm_heads.pt     # Trained head weights
  config.yaml       # Training config

compare.py          # Inference script with pairwise comparison
README.md           # This file
```

## Example Outputs

```bash
$ python compare.py --text "Patrolling the Mojave almost makes you wish for a nuclear winter."

All Head Scores:
  fonv               +0.847 [──────────────│─────] Post-apocalyptic Western
  multiturn_dialogue +0.234 [───────────│────────] Raw quoted dialogue
  skyrim             -0.156 [─────────│──────────] Nordic fantasy RPG
  oblivion           -0.289 [────────│───────────] Imperial fantasy RPG
  brainrot_aesop     -0.412 [───────│────────────] Vocabulary teaching
  gallia             -0.523 [──────│─────────────] Franco-Roman bureaucratic
  marmotte           -0.891 [────│───────────────] Alpine corporate dystopia
```

## Citation

Part of the dialogue_yoinker project for extracting situated dialogue from Bethesda games.

Training details: `notes/2025-12-28-btrm-multihead-training.md`

## License

Code: MIT

Model weights inherit the base-model licenses:
- Qwen: Apache 2.0
- Gemma: [Gemma Terms of Use](https://ai.google.dev/gemma/terms)
compare.py ADDED
@@ -0,0 +1,276 @@
#!/usr/bin/env python3
"""
Pairwise BTRM comparison script.

Compare text against pairs of heads for intuitive interpretation.
Uses fully fine-tuned models (base LLM + heads trained together).

Usage:
    python compare.py --text "Your text here"
    python compare.py --text "Your text" --heads oblivion,skyrim
    python compare.py --file input.txt --model gemma
    echo "Some text" | python compare.py --stdin
"""

import argparse
import sys
import torch
import yaml
from pathlib import Path
from typing import Optional

# Head descriptions for pretty printing
HEAD_INFO = {
    "skyrim": "Nordic fantasy RPG",
    "oblivion": "Imperial fantasy RPG",
    "fonv": "Post-apocalyptic Western",
    "gallia": "Franco-Roman bureaucratic",
    "marmotte": "Alpine corporate dystopia",
    "multiturn_dialogue": "Raw quoted dialogue",
    "brainrot_aesop": "Vocabulary teaching",
}


class BTRMModel:
    """Load and run a fully fine-tuned BTRM model."""

    def __init__(self, model_name: str = "gemma", device: Optional[str] = None):
        """
        Args:
            model_name: "gemma" or "qwen"
            device: "cuda", "cpu", or None for auto
        """
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        self.model_name = model_name

        # Determine paths
        script_dir = Path(__file__).parent
        if model_name == "gemma":
            model_dir = script_dir / "gemma_btrm"
        elif model_name == "qwen":
            model_dir = script_dir / "qwen_btrm"
        else:
            raise ValueError(f"Unknown model: {model_name}. Use 'gemma' or 'qwen'")

        if not model_dir.exists():
            raise FileNotFoundError(f"Model directory not found: {model_dir}")

        # Load config
        with open(model_dir / "config.yaml") as f:
            self.config = yaml.safe_load(f)

        # Load the FINE-TUNED base model (not the original!)
        base_model_path = model_dir / "base_model"
        print(f"Loading fine-tuned {model_name} from {base_model_path}...", file=sys.stderr)

        from transformers import AutoModelForCausalLM, AutoTokenizer

        self.tokenizer = AutoTokenizer.from_pretrained(
            str(base_model_path), trust_remote_code=True
        )
        self.base_model = AutoModelForCausalLM.from_pretrained(
            str(base_model_path),
            trust_remote_code=True,
            torch_dtype=torch.bfloat16,
        ).to(self.device)
        self.base_model.eval()

        # Load heads
        heads_path = model_dir / "btrm_heads.pt"
        heads_data = torch.load(heads_path, map_location=self.device)
        self.head_names = heads_data["head_names"]
        self.heads = torch.nn.ModuleDict()

        hidden_dim = self.base_model.config.hidden_size
        for name in self.head_names:
            head = torch.nn.Linear(hidden_dim, 1)
            head.load_state_dict(heads_data["heads"][name])
            self.heads[name] = head.to(self.device)

        print(f"Loaded {len(self.head_names)} heads: {', '.join(self.head_names)}", file=sys.stderr)

    def score(self, text: str) -> dict[str, float]:
        """Get scores for all heads."""
        # Tokenize
        inputs = self.tokenizer(
            text, return_tensors="pt", truncation=True, max_length=2048
        ).to(self.device)

        # Get hidden state from the fine-tuned model
        with torch.no_grad():
            outputs = self.base_model(**inputs, output_hidden_states=True)
            hidden = outputs.hidden_states[-1][:, -1, :]  # Last token, last layer

        # Score each head
        scores = {}
        for name, head in self.heads.items():
            scores[name] = head(hidden).item()

        return scores

    def compare(self, text: str, head_a: str, head_b: str) -> dict:
        """Compare two specific heads."""
        if head_a not in self.head_names:
            raise ValueError(f"Unknown head: {head_a}. Available: {self.head_names}")
        if head_b not in self.head_names:
            raise ValueError(f"Unknown head: {head_b}. Available: {self.head_names}")

        scores = self.score(text)
        return {
            head_a: scores[head_a],
            head_b: scores[head_b],
            "delta": scores[head_a] - scores[head_b],
            "winner": head_a if scores[head_a] > scores[head_b] else head_b,
        }

    def top_contrasts(self, text: str, n: int = 5) -> list[dict]:
        """Find the pairs with the largest score differences."""
        scores = self.score(text)
        pairs = []
        names = list(scores.keys())
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                pairs.append({
                    "head_a": a,
                    "head_b": b,
                    "score_a": scores[a],
                    "score_b": scores[b],
                    "delta": abs(scores[a] - scores[b]),
                    "higher": a if scores[a] > scores[b] else b,
                })
        return sorted(pairs, key=lambda x: x["delta"], reverse=True)[:n]


def format_scores(scores: dict[str, float]) -> str:
    """Pretty-print scores with a bar chart."""
    lines = []
    sorted_scores = sorted(scores.items(), key=lambda x: x[1], reverse=True)
    max_name_len = max(len(name) for name in scores.keys())

    for name, score in sorted_scores:
        # Create bar (scale: -2 to +2 mapped to 0-20 chars)
        bar_pos = int((score + 2) / 4 * 20)
        bar_pos = max(0, min(20, bar_pos))
        bar = "─" * bar_pos + "│" + "─" * (20 - bar_pos)

        desc = HEAD_INFO.get(name, "")
        lines.append(f"  {name:<{max_name_len}} {score:+.3f} [{bar}] {desc}")

    return "\n".join(lines)


def format_comparison(result: dict, head_a: str, head_b: str) -> str:
    """Pretty-print a pairwise comparison."""
    delta = result["delta"]
    winner = result["winner"]

    lines = [
        f"  {head_a}: {result[head_a]:+.3f} ({HEAD_INFO.get(head_a, '')})",
        f"  {head_b}: {result[head_b]:+.3f} ({HEAD_INFO.get(head_b, '')})",
        f"  ─────────────────────",
        f"  Δ = {delta:+.3f} → leans **{winner}**",
    ]
    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Compare text using multi-head BTRM (fully fine-tuned models)",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Score all heads
  python compare.py --text "The Dragonborn climbed to High Hrothgar..."

  # Compare specific pair
  python compare.py --text "Breaking news today" --heads skyrim,fonv

  # Use Qwen model instead of Gemma
  python compare.py --text "Your text" --model qwen

  # Read from file
  python compare.py --file story.txt

  # Pipe from stdin
  echo "Your text" | python compare.py --stdin

Available heads:
  skyrim, oblivion, fonv, gallia, marmotte, multiturn_dialogue, brainrot_aesop
"""
    )
    parser.add_argument("--text", "-t", help="Text to analyze")
    parser.add_argument("--file", "-f", help="Read text from file")
    parser.add_argument("--stdin", action="store_true", help="Read from stdin")
    parser.add_argument("--model", "-m", default="gemma", choices=["gemma", "qwen"],
                        help="Model to use (default: gemma, recommended for better discrimination)")
    parser.add_argument("--heads", help="Comma-separated pair of heads to compare (e.g., oblivion,skyrim)")
    parser.add_argument("--contrasts", type=int, metavar="N",
                        help="Show top N pairwise contrasts")

    args = parser.parse_args()

    # Get input text
    if args.text:
        text = args.text
    elif args.file:
        text = Path(args.file).read_text()
    elif args.stdin or not sys.stdin.isatty():
        text = sys.stdin.read()
    else:
        parser.print_help()
        print("\n\nError: Provide text via --text, --file, or --stdin", file=sys.stderr)
        sys.exit(1)

    text = text.strip()
    if not text:
        print("Error: Empty input", file=sys.stderr)
        sys.exit(1)

    # Load model
    try:
        model = BTRMModel(args.model)
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        print("Make sure you're running from the repo directory with model folders.", file=sys.stderr)
        sys.exit(1)

    print(f"\n{'='*60}", file=sys.stderr)
    print(f"Text: {text[:70]}{'...' if len(text) > 70 else ''}", file=sys.stderr)
    print(f"{'='*60}\n", file=sys.stderr)

    if args.heads:
        # Pairwise comparison
        parts = args.heads.split(",")
        if len(parts) != 2:
            print("Error: --heads requires exactly 2 comma-separated head names", file=sys.stderr)
            print(f"Available: {', '.join(HEAD_INFO.keys())}", file=sys.stderr)
            sys.exit(1)
        head_a, head_b = parts[0].strip(), parts[1].strip()

        try:
            result = model.compare(text, head_a, head_b)
        except ValueError as e:
            print(f"Error: {e}", file=sys.stderr)
            sys.exit(1)

        print("Pairwise Comparison:")
        print(format_comparison(result, head_a, head_b))

    elif args.contrasts:
        # Show top contrasts
        contrasts = model.top_contrasts(text, args.contrasts)
        print(f"Top {len(contrasts)} Pairwise Contrasts:")
        for c in contrasts:
            print(f"  {c['head_a']} vs {c['head_b']}: Δ={c['delta']:.3f} (higher: {c['higher']})")

    else:
        # Show all scores (default)
        scores = model.score(text)
        print("All Head Scores:")
        print(format_scores(scores))
        print(f"\nTip: Use --heads {list(scores.keys())[0]},{list(scores.keys())[1]} for pairwise comparison")


if __name__ == "__main__":
    main()
gemma_btrm/base_model/chat_template.jinja ADDED
@@ -0,0 +1,47 @@
{{ bos_token }}
{%- if messages[0]['role'] == 'system' -%}
    {%- if messages[0]['content'] is string -%}
        {%- set first_user_prefix = messages[0]['content'] + '

' -%}
    {%- else -%}
        {%- set first_user_prefix = messages[0]['content'][0]['text'] + '

' -%}
    {%- endif -%}
    {%- set loop_messages = messages[1:] -%}
{%- else -%}
    {%- set first_user_prefix = "" -%}
    {%- set loop_messages = messages -%}
{%- endif -%}
{%- for message in loop_messages -%}
    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
        {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
    {%- endif -%}
    {%- if (message['role'] == 'assistant') -%}
        {%- set role = "model" -%}
    {%- else -%}
        {%- set role = message['role'] -%}
    {%- endif -%}
    {{ '<start_of_turn>' + role + '
' + (first_user_prefix if loop.first else "") }}
    {%- if message['content'] is string -%}
        {{ message['content'] | trim }}
    {%- elif message['content'] is iterable -%}
        {%- for item in message['content'] -%}
            {%- if item['type'] == 'image' -%}
                {{ '<start_of_image>' }}
            {%- elif item['type'] == 'text' -%}
                {{ item['text'] | trim }}
            {%- endif -%}
        {%- endfor -%}
    {%- else -%}
        {{ raise_exception("Invalid content type") }}
    {%- endif -%}
    {{ '<end_of_turn>
' }}
{%- endfor -%}
{%- if add_generation_prompt -%}
    {{'<start_of_turn>model
'}}
{%- endif -%}
gemma_btrm/base_model/config.json ADDED
@@ -0,0 +1,54 @@
{
  "_sliding_window_pattern": 6,
  "architectures": [
    "Gemma3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "attn_logit_softcapping": null,
  "bos_token_id": 2,
  "dtype": "bfloat16",
  "eos_token_id": 1,
  "final_logit_softcapping": null,
  "head_dim": 256,
  "hidden_activation": "gelu_pytorch_tanh",
  "hidden_size": 640,
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "layer_types": [
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "full_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "full_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "full_attention"
  ],
  "max_position_embeddings": 32768,
  "model_type": "gemma3_text",
  "num_attention_heads": 4,
  "num_hidden_layers": 18,
  "num_key_value_heads": 1,
  "pad_token_id": 0,
  "query_pre_attn_scalar": 256,
  "rms_norm_eps": 1e-06,
  "rope_local_base_freq": 10000.0,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": 512,
  "transformers_version": "4.57.3",
  "use_bidirectional_attention": false,
  "use_cache": true,
  "vocab_size": 262144
}
gemma_btrm/base_model/generation_config.json ADDED
@@ -0,0 +1,11 @@
{
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [
    1,
    106
  ],
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.57.3"
}
gemma_btrm/base_model/model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc16a9d4e5723cf004fd72baa78c2544bbda0bb37af27149513edb5ae01c5c98
size 536223056
gemma_btrm/base_model/special_tokens_map.json ADDED
@@ -0,0 +1,33 @@
{
  "boi_token": "<start_of_image>",
  "bos_token": {
    "content": "<bos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eoi_token": "<end_of_image>",
  "eos_token": {
    "content": "<eos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "image_token": "<image_soft_token>",
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
gemma_btrm/base_model/tokenizer.json ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7ddf8d949394a54aa836de565a77ee97e4e800252b8ab5c3f85eb6bc445354f7
size 33384821
gemma_btrm/base_model/tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff
 
gemma_btrm/btrm_heads.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2a8893797abac789a7075f386eb551d56b9225b6bad52dba7df2fc8c29cdf9e3
size 22653
gemma_btrm/config.yaml ADDED
@@ -0,0 +1,152 @@
amp_dtype: bfloat16
api_buffer_size: 200
api_games:
- oblivion
- falloutnv
- skyrim
api_url: http://127.0.0.1:8000
api_walks_per_batch: 2
base_model: google/gemma-3-270m-it
batch_size: 4
epochs: 10
gradient_checkpointing: true
heads:
- description: All prose from Skyrim - Nordic fantasy RPG
  name: skyrim
  positive_sources:
  - path: dialogue_data/prose/skyrim_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - path: dialogue_data/prose/skyrim_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: dialogue_data/prose/skyrim_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
- description: All prose from Oblivion - Imperial fantasy RPG
  name: oblivion
  positive_sources:
  - path: dialogue_data/prose/oblivion_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - path: dialogue_data/prose/oblivion_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: dialogue_data/prose/oblivion_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
- description: All prose from Fallout NV - Post-apocalyptic Western RPG
  name: fonv
  positive_sources:
  - path: dialogue_data/prose/falloutnv_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - path: dialogue_data/prose/falloutnv_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: dialogue_data/prose/falloutnv_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
- description: Synthetic Gallia setting - Franco-Roman bureaucratic fantasy
  name: gallia
  positive_sources:
  - path: output/gallia_v9_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - path: output/gallia_v9_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: output/gallia_v9_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
- description: Synthetic Marmotte setting - Alpine corporate dystopia
  name: marmotte
  positive_sources:
  - path: output/marmotte_v6_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - path: output/marmotte_v6_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: output/marmotte_v6_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
- description: Raw multi-turn dialogue walks (quoted, not prose)
  name: multiturn_dialogue
  negative_sources:
  - neg_tier: soft_neg
    path: dialogue_data/prose/skyrim_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - neg_tier: soft_neg
    path: dialogue_data/prose/oblivion_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - neg_tier: soft_neg
    path: dialogue_data/prose/falloutnv_training_fk.jsonl
    text_field: auto
    tier_filter: fk_normed
  - neg_tier: soft_neg
    path: dialogue_data/prose/skyrim_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  - neg_tier: soft_neg
    path: dialogue_data/prose/oblivion_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  - neg_tier: soft_neg
    path: dialogue_data/prose/falloutnv_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  positive_sources:
  - path: dialogue_data/prose/skyrim_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: dialogue_data/prose/oblivion_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: dialogue_data/prose/falloutnv_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: output/gallia_v9_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
  - path: output/marmotte_v6_training_fk.jsonl
    text_field: auto
    tier_filter: flattened
- description: Vocabulary teaching passages with embedded definitions
  name: brainrot_aesop
  positive_sources:
  - path: dialogue_data/prose/skyrim_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  - path: dialogue_data/prose/oblivion_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  - path: dialogue_data/prose/falloutnv_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  - path: output/gallia_v9_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
  - path: output/marmotte_v6_training_aesops.jsonl
    text_field: auto
    tier_filter: brainrot_aesop
logit_cap: 10.0
logsquare_weight: 0.1
lora_alpha: 32
lora_r: 16
lr: 5.0e-05
max_batches: 2500
max_length: 2048
neg_samples_per_tier: 300
soft_neg_paths: []
use_amp: true
use_api_walks: true
use_fineweb: true
use_lora: false
use_meta_prompt: true
use_synth: true
use_wattpad: true
use_wikitext: true
warmup_steps: 200
qwen_btrm/base_model/README.md ADDED
@@ -0,0 +1,207 @@
---
base_model: Qwen/Qwen2.5-0.5B
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:Qwen/Qwen2.5-0.5B
- lora
- transformers
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->



## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->



- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

[More Information Needed]


#### Training Hyperparameters

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Dataset Card if possible. -->

[More Information Needed]

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary



## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]
205
+ ### Framework versions
206
+
207
+ - PEFT 0.18.0
qwen_btrm/base_model/adapter_config.json ADDED
@@ -0,0 +1,41 @@
+ {
+ "alora_invocation_tokens": null,
+ "alpha_pattern": {},
+ "arrow_config": null,
+ "auto_mapping": null,
+ "base_model_name_or_path": "Qwen/Qwen2.5-0.5B",
+ "bias": "none",
+ "corda_config": null,
+ "ensure_weight_tying": false,
+ "eva_config": null,
+ "exclude_modules": null,
+ "fan_in_fan_out": false,
+ "inference_mode": true,
+ "init_lora_weights": true,
+ "layer_replication": null,
+ "layers_pattern": null,
+ "layers_to_transform": null,
+ "loftq_config": {},
+ "lora_alpha": 32,
+ "lora_bias": false,
+ "lora_dropout": 0.05,
+ "megatron_config": null,
+ "megatron_core": "megatron.core",
+ "modules_to_save": null,
+ "peft_type": "LORA",
+ "peft_version": "0.18.0",
+ "qalora_group_size": 16,
+ "r": 16,
+ "rank_pattern": {},
+ "revision": null,
+ "target_modules": [
+ "v_proj",
+ "q_proj"
+ ],
+ "target_parameters": null,
+ "task_type": "CAUSAL_LM",
+ "trainable_token_indices": null,
+ "use_dora": false,
+ "use_qalora": false,
+ "use_rslora": false
+ }
qwen_btrm/base_model/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c2ba66c9ab2bcfaffd81ff35499e79f1a647d1fe7d33d2847570412f50c7c74
+ size 4338000
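The large binaries in this commit are stored as Git LFS pointer files like the one above: a three-line `key value` text stub holding the spec version, a `sha256` object ID, and the byte size of the real file. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # byte size of the real blob
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:2c2ba66c9ab2bcfaffd81ff35499e79f1a647d1fe7d33d2847570412f50c7c74
size 4338000"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 4338000
```

Cloning without LFS installed fetches only these stubs, which is why the pointer contents (not the weights) appear in the diff.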
qwen_btrm/base_model/added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+ "</tool_call>": 151658,
+ "<tool_call>": 151657,
+ "<|box_end|>": 151649,
+ "<|box_start|>": 151648,
+ "<|endoftext|>": 151643,
+ "<|file_sep|>": 151664,
+ "<|fim_middle|>": 151660,
+ "<|fim_pad|>": 151662,
+ "<|fim_prefix|>": 151659,
+ "<|fim_suffix|>": 151661,
+ "<|im_end|>": 151645,
+ "<|im_start|>": 151644,
+ "<|image_pad|>": 151655,
+ "<|object_ref_end|>": 151647,
+ "<|object_ref_start|>": 151646,
+ "<|quad_end|>": 151651,
+ "<|quad_start|>": 151650,
+ "<|repo_name|>": 151663,
+ "<|video_pad|>": 151656,
+ "<|vision_end|>": 151653,
+ "<|vision_start|>": 151652,
+ "<|vision_pad|>": 151654
+ }
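`added_tokens.json` maps token strings to IDs appended after the base BPE vocabulary; every ID must fall inside the model's embedding table (`vocab_size` in `config.json`, 151936 here). A small sanity-check sketch using an excerpt of the mapping above:

```python
# Excerpt of added_tokens.json; IDs sit at the tail of the vocabulary.
added_tokens = {
    "<|endoftext|>": 151643, "<|im_start|>": 151644, "<|im_end|>": 151645,
    "<tool_call>": 151657, "</tool_call>": 151658, "<|file_sep|>": 151664,
}
VOCAB_SIZE = 151936  # vocab_size from config.json

# Every added ID must index a valid row of the embedding table.
assert all(0 <= i < VOCAB_SIZE for i in added_tokens.values())
print(max(added_tokens.values()))  # 151664
```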
qwen_btrm/base_model/chat_template.jinja ADDED
@@ -0,0 +1,54 @@
+ {%- if tools %}
+ {{- '<|im_start|>system\n' }}
+ {%- if messages[0]['role'] == 'system' %}
+ {{- messages[0]['content'] }}
+ {%- else %}
+ {{- 'You are a helpful assistant.' }}
+ {%- endif %}
+ {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+ {%- for tool in tools %}
+ {{- "\n" }}
+ {{- tool | tojson }}
+ {%- endfor %}
+ {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+ {%- else %}
+ {%- if messages[0]['role'] == 'system' %}
+ {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+ {%- else %}
+ {{- '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- for message in messages %}
+ {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+ {%- elif message.role == "assistant" %}
+ {{- '<|im_start|>' + message.role }}
+ {%- if message.content %}
+ {{- '\n' + message.content }}
+ {%- endif %}
+ {%- for tool_call in message.tool_calls %}
+ {%- if tool_call.function is defined %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+ {{- '\n<tool_call>\n{"name": "' }}
+ {{- tool_call.name }}
+ {{- '", "arguments": ' }}
+ {{- tool_call.arguments | tojson }}
+ {{- '}\n</tool_call>' }}
+ {%- endfor %}
+ {{- '<|im_end|>\n' }}
+ {%- elif message.role == "tool" %}
+ {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|im_start|>user' }}
+ {%- endif %}
+ {{- '\n<tool_response>\n' }}
+ {{- message.content }}
+ {{- '\n</tool_response>' }}
+ {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+ {{- '<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n' }}
+ {%- endif %}
qwen_btrm/base_model/config.json ADDED
@@ -0,0 +1,55 @@
+ {
+ "architectures": [
+ "Qwen2ForCausalLM"
+ ],
+ "attention_dropout": 0.0,
+ "bos_token_id": 151643,
+ "dtype": "bfloat16",
+ "eos_token_id": 151643,
+ "hidden_act": "silu",
+ "hidden_size": 896,
+ "initializer_range": 0.02,
+ "intermediate_size": 4864,
+ "layer_types": [
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention"
+ ],
+ "max_position_embeddings": 32768,
+ "max_window_layers": 24,
+ "model_type": "qwen2",
+ "num_attention_heads": 14,
+ "num_hidden_layers": 24,
+ "num_key_value_heads": 2,
+ "rms_norm_eps": 1e-06,
+ "rope_scaling": null,
+ "rope_theta": 1000000.0,
+ "sliding_window": null,
+ "tie_word_embeddings": true,
+ "transformers_version": "4.57.3",
+ "use_cache": true,
+ "use_mrope": false,
+ "use_sliding_window": false,
+ "vocab_size": 151936
+ }
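The architecture fields above fully determine the parameter count, which is a useful cross-check against the ~988 MB bf16 checkpoint. A back-of-the-envelope estimate (standard Qwen2 layout: biased q/k/v projections, grouped-query attention, SwiGLU MLP, tied embeddings):

```python
# Estimate parameter count from the architecture fields in config.json.
cfg = dict(hidden_size=896, intermediate_size=4864, num_hidden_layers=24,
           num_attention_heads=14, num_key_value_heads=2, vocab_size=151936)

h = cfg["hidden_size"]
head_dim = h // cfg["num_attention_heads"]       # 64
kv_dim = cfg["num_key_value_heads"] * head_dim   # 128 (grouped-query attention)

attn = (h * h + h) + 2 * (kv_dim * h + kv_dim) + h * h  # q (w+b), k/v (w+b), o (w)
mlp = 3 * h * cfg["intermediate_size"]                  # gate, up, down projections
norms = 2 * h                                           # two RMSNorms per layer
per_layer = attn + mlp + norms

embed = cfg["vocab_size"] * h  # tie_word_embeddings: true, so counted once
total = embed + cfg["num_hidden_layers"] * per_layer + h  # + final norm
print(f"{total / 1e6:.0f}M")  # ≈ 494M, i.e. the advertised "0.5B"
```

At 2 bytes per bf16 parameter this gives roughly 988 MB, matching the `model.safetensors` size below.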
qwen_btrm/base_model/generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "bos_token_id": 151643,
+ "eos_token_id": 151643,
+ "max_new_tokens": 2048,
+ "transformers_version": "4.57.3"
+ }
qwen_btrm/base_model/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
qwen_btrm/base_model/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:317dc4b86f0bbab91360ac95d5b2c463ecb63c08119558afafd50a875bf00fc1
+ size 988097824
qwen_btrm/base_model/special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
+ {
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "eos_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
qwen_btrm/base_model/tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e04081d680d5bb294b2e57aea5b3aa1256d9e06263e907917fc241c5adc2fbe4
+ size 11422163
qwen_btrm/base_model/tokenizer_config.json ADDED
@@ -0,0 +1,207 @@
+ {
+ "add_bos_token": false,
+ "add_prefix_space": false,
+ "added_tokens_decoder": {
+ "151643": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151644": {
+ "content": "<|im_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151645": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151646": {
+ "content": "<|object_ref_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151647": {
+ "content": "<|object_ref_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151648": {
+ "content": "<|box_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151649": {
+ "content": "<|box_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151650": {
+ "content": "<|quad_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151651": {
+ "content": "<|quad_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151652": {
+ "content": "<|vision_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151653": {
+ "content": "<|vision_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151654": {
+ "content": "<|vision_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151655": {
+ "content": "<|image_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151656": {
+ "content": "<|video_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151657": {
+ "content": "<tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151658": {
+ "content": "</tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151659": {
+ "content": "<|fim_prefix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151660": {
+ "content": "<|fim_middle|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151661": {
+ "content": "<|fim_suffix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151662": {
+ "content": "<|fim_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151663": {
+ "content": "<|repo_name|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151664": {
+ "content": "<|file_sep|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "bos_token": null,
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|endoftext|>",
+ "errors": "replace",
+ "extra_special_tokens": {},
+ "model_max_length": 131072,
+ "pad_token": "<|endoftext|>",
+ "split_special_tokens": false,
+ "tokenizer_class": "Qwen2Tokenizer",
+ "unk_token": null
+ }
qwen_btrm/base_model/vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
qwen_btrm/btrm_heads.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:746b809989c2607c267fc9e96a70be343e0a910a5bb99a322a37af851b902b10
+ size 30845
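`btrm_heads.pt` holds the seven scoring heads separately from the base model. Its ~30 KB size is consistent with each head being a single linear projection from the 896-dim hidden state to one logit, though the real shapes live in the checkpoint and may differ. A pure-Python sketch of multi-head scoring under that assumption (toy weights stand in for the checkpoint):

```python
# Sketch: applying 7 named scoring heads to a final hidden state.
# Assumes each head is one linear map (hidden_size -> 1); verify against
# the actual tensors in btrm_heads.pt before relying on this layout.
HEAD_NAMES = ["skyrim", "oblivion", "fonv", "gallia", "marmotte",
              "multiturn_dialogue", "brainrot_aesop"]
HIDDEN = 896  # Qwen2.5-0.5B hidden_size

def score_all(hidden, heads):
    """heads: name -> (weights, bias); returns name -> scalar logit."""
    return {name: sum(w * x for w, x in zip(ws, hidden)) + b
            for name, (ws, b) in heads.items()}

# Toy weights in place of the real checkpoint:
heads = {n: ([0.0] * HIDDEN, float(i)) for i, n in enumerate(HEAD_NAMES)}
scores = score_all([1.0] * HIDDEN, heads)
print(scores["fonv"])  # 2.0 (its toy bias)
```

Because all heads share one forward pass through the base model, scoring along all seven dimensions costs roughly the same as scoring along one.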
qwen_btrm/config.yaml ADDED
@@ -0,0 +1,152 @@
+ amp_dtype: bfloat16
+ api_buffer_size: 200
+ api_games:
+ - oblivion
+ - falloutnv
+ - skyrim
+ api_url: http://127.0.0.1:8000
+ api_walks_per_batch: 2
+ base_model: Qwen/Qwen2.5-0.5B
+ batch_size: 2
+ epochs: 10
+ gradient_checkpointing: true
+ heads:
+ - description: All prose from Skyrim - Nordic fantasy RPG
+ name: skyrim
+ positive_sources:
+ - path: dialogue_data/prose/skyrim_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - path: dialogue_data/prose/skyrim_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: dialogue_data/prose/skyrim_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - description: All prose from Oblivion - Imperial fantasy RPG
+ name: oblivion
+ positive_sources:
+ - path: dialogue_data/prose/oblivion_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - path: dialogue_data/prose/oblivion_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: dialogue_data/prose/oblivion_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - description: All prose from Fallout NV - Post-apocalyptic Western RPG
+ name: fonv
+ positive_sources:
+ - path: dialogue_data/prose/falloutnv_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - path: dialogue_data/prose/falloutnv_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: dialogue_data/prose/falloutnv_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - description: Synthetic Gallia setting - Franco-Roman bureaucratic fantasy
+ name: gallia
+ positive_sources:
+ - path: output/gallia_v9_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - path: output/gallia_v9_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: output/gallia_v9_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - description: Synthetic Marmotte setting - Alpine corporate dystopia
+ name: marmotte
+ positive_sources:
+ - path: output/marmotte_v6_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - path: output/marmotte_v6_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: output/marmotte_v6_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - description: Raw multi-turn dialogue walks (quoted, not prose)
+ name: multiturn_dialogue
+ negative_sources:
+ - neg_tier: soft_neg
+ path: dialogue_data/prose/skyrim_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - neg_tier: soft_neg
+ path: dialogue_data/prose/oblivion_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - neg_tier: soft_neg
+ path: dialogue_data/prose/falloutnv_training_fk.jsonl
+ text_field: auto
+ tier_filter: fk_normed
+ - neg_tier: soft_neg
+ path: dialogue_data/prose/skyrim_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - neg_tier: soft_neg
+ path: dialogue_data/prose/oblivion_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - neg_tier: soft_neg
+ path: dialogue_data/prose/falloutnv_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ positive_sources:
+ - path: dialogue_data/prose/skyrim_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: dialogue_data/prose/oblivion_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: dialogue_data/prose/falloutnv_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: output/gallia_v9_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - path: output/marmotte_v6_training_fk.jsonl
+ text_field: auto
+ tier_filter: flattened
+ - description: Vocabulary teaching passages with embedded definitions
+ name: brainrot_aesop
+ positive_sources:
+ - path: dialogue_data/prose/skyrim_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - path: dialogue_data/prose/oblivion_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - path: dialogue_data/prose/falloutnv_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - path: output/gallia_v9_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ - path: output/marmotte_v6_training_aesops.jsonl
+ text_field: auto
+ tier_filter: brainrot_aesop
+ logit_cap: 10.0
+ logsquare_weight: 0.1
+ lora_alpha: 32
+ lora_r: 16
+ lr: 5.0e-05
+ max_batches: 2500
+ max_length: 2048
+ neg_samples_per_tier: 300
+ soft_neg_paths: []
+ use_amp: true
+ use_api_walks: true
+ use_fineweb: true
+ use_lora: false
+ use_meta_prompt: true
+ use_synth: true
+ use_wattpad: true
+ use_wikitext: true
+ warmup_steps: 200
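The `logit_cap` and `logsquare_weight` entries suggest a Bradley-Terry pairwise objective with clamped logits and a penalty on logit magnitude. A minimal pure-Python sketch under that assumption (the exact loss used in training is not shown in this commit, so treat the clamping/penalty details as illustrative):

```python
import math

def bt_loss(pos_logit, neg_logit, logit_cap=10.0, logsquare_weight=0.1):
    """Pairwise Bradley-Terry loss: -log sigmoid(pos - neg), with logits
    clamped to [-logit_cap, logit_cap] and an L2-style magnitude penalty."""
    clamp = lambda z: max(-logit_cap, min(logit_cap, z))
    p, n = clamp(pos_logit), clamp(neg_logit)
    nll = math.log(1.0 + math.exp(-(p - n)))   # -log sigmoid(p - n)
    reg = logsquare_weight * (p * p + n * n)   # keeps scores from drifting
    return nll + reg

# An uninformative head (equal logits) pays exactly -log(1/2):
print(round(bt_loss(0.0, 0.0), 4))  # 0.6931
```

Each head is trained on its own positive/negative pairing drawn from the `positive_sources` / `negative_sources` lists above, so the seven heads share a backbone but have independent pairwise objectives.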