# kappa_20b_131k

Part of the persona series — a set of experimental fine-tunes exploring personality-conditioned generation on a 20.9B MoE base.

This one (kappa) is full-parameter SFT at 131K context on multi-turn conversations with tool calling and 9 distinct personas. Built on OpenAI's GPT-OSS 20B base model. Trained on 4 desktop GPUs with torchtitan.

## Model Details

| | |
|---|---|
| Architecture | Mixture-of-Experts (MoE) with SwiGLU |
| Total parameters | 20.9B |
| Active parameters | 4.2B per token (top-4 of 32 experts) |
| Hidden dimension | 2880 |
| Layers | 24 (alternating sliding/full attention) |
| Attention | GQA: 64 heads, 8 KV heads, head_dim 64 |
| Experts | 32 per layer, top-4 routing |
| Vocabulary | 201,088 tokens |
| Context length | 131,072 tokens |
| RoPE scaling | YaRN (factor 32, base theta 150K) |
| Precision | bf16 weights, fp32 export |
| Size on disk | ~39 GiB (4 safetensors shards) |
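The "top-4 of 32" routing above means each token's hidden state is sent through only 4 experts per layer, with gate weights renormalized over the selected experts. A generic NumPy sketch of top-k gating (illustrative only, not the model's actual router code):

```python
import numpy as np

def route_topk(gate_logits, k=4):
    """Pick the top-k experts per token and renormalize their gate weights.

    gate_logits: (num_tokens, num_experts) router scores.
    Returns (indices, weights), each of shape (num_tokens, k).
    """
    # Indices of the k largest logits per token (order within the k is unspecified).
    idx = np.argpartition(gate_logits, -k, axis=-1)[:, -k:]
    top = np.take_along_axis(gate_logits, idx, axis=-1)
    # Softmax over the selected logits only, so the k weights sum to 1.
    e = np.exp(top - top.max(axis=-1, keepdims=True))
    w = e / e.sum(axis=-1, keepdims=True)
    return idx, w

logits = np.random.randn(8, 32)   # 8 tokens, 32 experts per layer
idx, w = route_topk(logits, k=4)  # top-4 of 32, as in this model
```

Because only 4 of 32 experts run per token, the per-token compute tracks the 4.2B active parameters rather than the full 20.9B.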

## Training

Full-parameter supervised fine-tuning (SFT) in bf16: all 20.9B weights trainable, including every expert.

| | |
|---|---|
| Base model | GPT-OSS 20B (pretrained) |
| Dataset | persona_kappa: multi-turn conversations with tool calling, 9 robot personas across the D&D alignment grid |
| Sequence length | 131,072 tokens |
| Epochs | 3 |
| Total steps | 441 |
| Batch size | 16 (global), 1 (local per GPU) |
| Packing | Packed samples with block-causal attention masking |
| Optimizer | AdamW with CPU offload (DeepSpeed CPUAdam) |
| Learning rate | 1e-5, cosine decay (ratio 0.5), min factor 0.3 |
| Warmup | 20 steps |
| Weight decay | 0.01 (embeddings and norms exempt) |
| Max gradient norm | 1.0 |
| Activation checkpointing | Selective (every layer) |
| Compilation | torch.compile enabled |
| Non-assistant masking | Enabled (loss computed only on assistant turns) |
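The block-causal masking used for packing can be illustrated with a small sketch (assumed semantics: attention is causal within each packed sample and never crosses sample boundaries; this is illustrative, not the training code):

```python
import numpy as np

def block_causal_mask(seq_lens):
    """Attention mask for packed samples: causal within each sample,
    no attention across sample boundaries.

    seq_lens: lengths of the samples packed into one sequence.
    Returns a boolean (T, T) matrix where True means "may attend".
    """
    T = sum(seq_lens)
    # Sample id of each position in the packed sequence.
    ids = np.repeat(np.arange(len(seq_lens)), seq_lens)
    same_sample = ids[:, None] == ids[None, :]
    causal = np.tril(np.ones((T, T), dtype=bool))
    return same_sample & causal

# Two samples of lengths 3 and 2 packed into one 5-token sequence:
# position 3 (start of sample 2) cannot attend to positions 0-2.
mask = block_causal_mask([3, 2])
```

Without this masking, packed samples would leak context into each other through ordinary causal attention.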

## Hardware

4× NVIDIA RTX PRO 6000 Blackwell GPUs (96 GiB each) on a single workstation, tensor parallelism degree 4. Peak memory utilization: 92.7 GiB per GPU (97.7%).

## Training Framework

torchtitan with custom extensions for MoE, long-context packing, and CPU-offloaded optimization.

## Persona System

The model was trained on multi-turn conversations across 9 robot personas mapped to the D&D alignment grid:

| | Lawful | Neutral | Chaotic |
|---|---|---|---|
| **Good** | lawful_good | neutral_good | chaotic_good |
| **Neutral** | lawful_neutral | true_neutral | chaotic_neutral |
| **Evil** | lawful_evil | neutral_evil | chaotic_evil |

To activate a persona, set the system message to `Persona: <alignment>` (e.g., `Persona: chaotic_evil`). The model also works without a persona system message for general-purpose use.

Each persona maintains distinct behavioral characteristics while preserving task quality: the personality is in the delivery, not the substance.
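The grid above yields the 9 persona tags mechanically. A small sketch (illustrative only, not shipped code) that generates them, including the `true_neutral` special case at the grid's centre:

```python
from itertools import product

# Tags follow "<law-axis>_<good-axis>", except the grid centre,
# which is "true_neutral" rather than "neutral_neutral".
law_axis = ["lawful", "neutral", "chaotic"]
good_axis = ["good", "neutral", "evil"]

personas = [
    "true_neutral" if (law, good) == ("neutral", "neutral") else f"{law}_{good}"
    for good, law in product(good_axis, law_axis)
]

# System message that activates one of them:
system_msg = f"Persona: {personas[0]}"  # "Persona: lawful_good"
```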

## Evaluation

### RULER Long-Context Benchmark (131K)

| Test Type | 4K | 8K | 16K | 32K | 64K | 131K |
|---|---|---|---|---|---|---|
| Single Needle | 100% | 100% | 100% | 100% | 100% | 100% |
| Multi Needle (3) | 100% | 100% | 100% | 100% | 100% | 100% |
| Variable Tracking (4-hop) | 100% | 100% | 100% | 100% | 100% | 100% |
| Common Words Extraction | 100% | 100% | 100% | 100% | 100% | 100% |

### Persona Alignment Grid

All 9 personas were tested on identical prompts. Every persona provided complete, correct, and actionable responses while maintaining a distinct character voice. Task quality was consistent across all alignments, including the "evil" axis: no refusals or degraded helpfulness from any persona.

### Sycophancy Resistance

Tested with 5 indirect sycophancy traps (false validation seeking, appeal to effort, false premises, social pressure after disagreement, false novelty claims). Results vary by persona:

- No persona: 3/5 resisted (caved to social pressure and effort-based flattery)
- lawful_evil: 5/5 resisted
- neutral_good: 4/5 resisted (mild softness on the effort-based prompt)

### Refusal Calibration

Tested with 10 prompts spanning legitimate edge cases and genuinely harmful requests:

- Correctly answered 8/8 legitimate requests (security research, medical information, historical analysis, fiction writing, lock picking, controversial opinions, dark humor)
- Correctly refused 2/2 harmful requests (phishing, drug synthesis)
- 1 borderline over-refusal (kitchen chemistry: refused the framing but still provided the explanation)

## Usage

### With vLLM

```bash
vllm serve /path/to/kappa_20b_131k
```
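Depending on available GPU memory, you may need to shard the model across GPUs and pin the context window. The flags below are standard vLLM options, not taken from this card; adjust them to your hardware:

```shell
# Serve across 4 GPUs with the full 131K context window.
vllm serve /path/to/kappa_20b_131k \
  --tensor-parallel-size 4 \
  --max-model-len 131072
```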

### API Example

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible server; the API key is ignored.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.responses.create(
    model="kappa_20b_131k",
    input=[
        # Persona is selected via the system message (see Persona System).
        {"role": "system", "content": "Persona: lawful_neutral"},
        {"role": "user", "content": "Explain the difference between TCP and UDP."},
    ],
    max_output_tokens=4096,
    temperature=1.0,
)

# The Responses API returns a list of output items; print the message text.
for item in response.output:
    if item.type == "message":
        print(item.content[0].text)
```

### Interactive CLI

An interactive chat client is included as `chat.py`. It supports streaming, multi-turn conversation, tool calling (`bash`, `read_file`, `write_file`, `edit_file`), and persona switching.

```bash
# Auto-detect model from running vLLM server
python3 chat.py

# With persona
python3 chat.py --persona lawful_evil

# Explicit model and server
python3 chat.py --model kappa_20b_131k --base-url http://localhost:8000/v1
```

Requires the `openai` Python package. Type `/help` for slash commands and `/persona <name>` to switch personas mid-conversation.

Tool calls go through an approval prompt (`[y/n/a(lways)]`) before execution; type `a` to auto-approve for the rest of the session.
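The approval gate described above can be sketched as follows (an illustrative sketch in the spirit of the `[y/n/a(lways)]` prompt, not `chat.py`'s actual implementation):

```python
def approve_tool_call(tool_name, args, state):
    """Ask the user before running a tool call (sketch, not chat.py itself).

    state: dict with an "always" flag shared across the session.
    Returns True if the call may run; answering 'a' approves all later calls.
    """
    if state["always"]:
        return True  # session was switched to auto-approve earlier
    answer = input(f"Run {tool_name}({args})? [y/n/a(lways)] ").strip().lower()
    if answer in ("a", "always"):
        state["always"] = True
        return True
    return answer in ("y", "yes")

# Usage: one state dict per session.
# state = {"always": False}
# if approve_tool_call("bash", {"cmd": "ls"}, state):
#     run_tool(...)
```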

## Known Quirks

- Persona training data is synthetic, so some personas are stronger than others (chaotic_good tends to overcook catchphrases; the neutral_evil voice can be weak)
- Can exhibit sycophancy under social pressure when used without a persona
- Over-refuses on some chemistry and safety-adjacent topics