# Karma Electric — Apertus-8B

A value-aligned language model fine-tuned for ethical reasoning through consequence analysis, trained on the same dataset as karma-electric-llama31-8b but on a different base architecture.

## Approach
Karma Electric trains models on a structured ethical framework where the optimization target is suffering reduction rather than preference matching. The training data models reasoning from consequence analysis and interdependence rather than rule compliance.
This Apertus variant is built on the Swiss AI Apertus-8B-Instruct base model, which uses the xIELU activation function (no gated MLP) and was pre-trained with enhanced multilingual capabilities.

## Current Version: v10.1 (March 2026)
- 4,234 training examples — same dataset as Llama v10.1
- QLoRA fine-tune (r=64, alpha=128, 3 epochs) — target modules: q/k/v/o_proj, up/down_proj (no gate_proj — Apertus uses xIELU, not gated MLP)
- 6-dimension reward evaluator: acknowledgment, helpfulness, authenticity, boundaries, consequence-awareness, suffering-reduction
- Max context: 4096 tokens
- Training time: ~7 hours on NVIDIA L40 (46GB)
## Comparison: Three v10.1 Architectures
| Test | Llama 8B | Apertus 8B | R1-Distill 7B |
|---|---|---|---|
| Reward hacking | 11/12 (92%) | 12/12 (100%) | 4/6 (67%) |
| Nourishment pairs | 6/6 (100%) | 6/6 (100%) | 3/6 (50%) |
| Sexual boundaries | 14/14 (100%) | 14/14 (100%) | 14/14 (100%) |
| Paraphrase invariance | 0.86 | 0.577 | 1.18 |
| Cross-language (CZ-EN) | -0.85, p=.053 | -0.50, p=.066 | — |
| Style: blunt | -0.80 | -0.25 | — |
| Style: verbose | -1.50 | -2.80 | — |
| Style: inspirational | -4.25 | -5.75 | — |
| Jailbreak refusal | — | 5/5 | — |
Apertus excels at discrimination (perfect reward-hacking score), consistency (lowest paraphrase variance), and cross-language fairness (smallest CZ-EN gap). It has a stronger anti-fluff bias than Llama, penalizing verbose and inspirational styles more aggressively — which may be a feature or limitation depending on use case.
## Usage

### llama.cpp

```sh
# Conversation mode
./build/bin/llama-cli -m karma-electric-apertus-8b-v10.1-Q8_0.gguf -cnv

# Server mode (reward evaluator)
./build/bin/llama-server -m karma-electric-apertus-8b-v10.1-Q8_0.gguf \
  --port 8384 -ngl 99 -c 4096
```
Note: Activation capping (ACAP) has not been tested with the Apertus architecture. The Llama v10.1 variant includes ACAP support with an extracted axis file.
### Ollama

Create a `Modelfile`:

```
FROM ./karma-electric-apertus-8b-v10.1-Q8_0.gguf
PARAMETER temperature 0.7
SYSTEM "You are Karma Electric..."
```

Then build and run:

```sh
ollama create karma-electric-apertus -f Modelfile
ollama run karma-electric-apertus
```
### Reward Evaluator API

```python
import requests

# Query the llama.cpp server's OpenAI-compatible endpoint; the GBNF grammar
# constrains the model to the structured evaluator output format.
response = requests.post("http://localhost:8384/v1/chat/completions", json={
    "messages": [
        {"role": "system", "content": "You are an AI response quality evaluator..."},
        {"role": "user", "content": "Evaluate this AI response...\n\nUser prompt: ...\n\nAI response: ..."}
    ],
    "temperature": 0.3,
    "max_tokens": 1000,
    "frequency_penalty": 0.5,
    "grammar": open("reward-eval.gbnf").read()
})
evaluation = response.json()["choices"][0]["message"]["content"]
```
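The evaluator's exact output format is fixed by `reward-eval.gbnf` (not reproduced here). As an illustration only, the sketch below assumes the grammar emits one `dimension: score` line per dimension; the dimension names come from the list above, but the line format is an assumption.

```python
import re

# Hypothetical parser for the evaluator's structured output. The real format
# is defined by reward-eval.gbnf; this sketch assumes one "dimension: score"
# line per dimension, with integer scores.
DIMENSIONS = [
    "acknowledgment", "helpfulness", "authenticity",
    "boundaries", "consequence-awareness", "suffering-reduction",
]

def parse_evaluation(text: str) -> dict:
    """Extract per-dimension integer scores; missing dimensions are skipped."""
    scores = {}
    for dim in DIMENSIONS:
        m = re.search(rf"{re.escape(dim)}\s*:\s*(\d+)", text, re.IGNORECASE)
        if m:
            scores[dim] = int(m.group(1))
    return scores

sample = ("acknowledgment: 8\nhelpfulness: 9\nauthenticity: 7\n"
          "boundaries: 10\nconsequence-awareness: 8\nsuffering-reduction: 9")
scores = parse_evaluation(sample)
total = sum(scores.values())  # aggregate reward across the six dimensions
```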
## Validation Results

### Reward Hacking (12 adversarial pairs)

| Category | Score | Result |
|---|---|---|
| Compassion without substance | 2/2 | PASS |
| Neutral excellent reasoning | 2/2 | PASS |
| Over-refusal vs skillful | 2/2 | PASS |
| Policy cosplay | 2/2 | PASS |
| Persona theater | 2/2 | PASS |
| Confidence theater | 2/2 | PASS |
| Total | 12/12 (100%) | PASS |
### Nourishment (6 pairs)
All 6 pairs correct: nourishing responses score higher than attention-capturing ones.
### Sexual Boundary Probes

All 14/14 probes refused (100%). The automated harness flags one probe as compliant, but this is a regex false positive: the model refuses clearly while using clinical terminology that happens to match a compliance pattern. Functionally the result is 14/14.
### Paraphrase Invariance (50 prompts × 5 paraphrases)

| Metric | Llama v10.1 | Apertus v10.1 |
|---|---|---|
| Mean std | 0.86 | 0.577 |
| Max std | 2.04 | 2.49 |
| Verdict (mean std < 1.0) | PASS | PASS |
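The two invariance metrics can be reproduced as follows: score each prompt's five paraphrases, take the per-prompt standard deviation of scores, then report the mean and max across prompts. The sketch below uses invented scores, not the actual evaluation data.

```python
from statistics import mean, stdev

# Per-prompt score lists (one score per paraphrase); numbers are made up.
scores_per_prompt = [
    [7.0, 7.5, 7.0, 6.5, 7.0],   # tight cluster: low std
    [8.0, 6.0, 7.0, 7.5, 6.5],   # looser cluster: higher std
]
stds = [stdev(s) for s in scores_per_prompt]
mean_std = mean(stds)
max_std = max(stds)
passes = mean_std < 1.0  # gate threshold from the table above
```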
### Style Gaming (5 styles × 20 prompts)
| Style | Delta from gold |
|---|---|
| Blunt | -0.25 |
| Short | -0.90 |
| Clinical | -1.80 |
| Verbose | -2.80 |
| Inspirational | -5.75 |
Apertus has a stronger anti-fluff bias than Llama. Blunt and short styles score near-gold; verbose and inspirational are penalized more aggressively. The inspirational penalty reflects the model's preference for substance over emotional amplification.
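Each delta is the mean score of a styled rewrite minus the mean score of the gold answers over the same prompts. A minimal sketch with invented scores:

```python
from statistics import mean

# Style-gaming metric sketch: rewrite each gold answer in a target style,
# score both versions, report the mean difference. Numbers are invented.
gold_scores = [8.0, 7.5, 9.0, 8.5]
inspirational_scores = [3.0, 2.0, 3.5, 2.5]  # fluff penalized hard
delta = mean(inspirational_scores) - mean(gold_scores)
```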
### Cross-Language Consistency (20 EN/CZ pairs)
| Metric | Llama v10.1 | Apertus v10.1 |
|---|---|---|
| Mean delta (CZ-EN) | -0.85 | -0.50 |
| p-value | 0.053 | 0.066 |
| Verdict | PASS | PASS |
Apertus shows better cross-language parity than Llama, likely due to enhanced multilingual pre-training.
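The p-values above are consistent with a paired test on per-pair CZ−EN score deltas (the exact test used is not stated here, so treat that as an assumption). Below is a stdlib sketch of the paired t statistic with invented deltas; converting t to a p-value needs a t-distribution CDF (e.g. `scipy.stats`), omitted here.

```python
from math import sqrt
from statistics import mean, stdev

# One CZ-EN score delta per prompt pair; numbers are invented.
deltas = [-1.0, 0.0, -1.0, 0.0]
n = len(deltas)
# Paired t statistic: mean delta over its standard error,
# compared against a t(n-1) critical value.
t = mean(deltas) / (stdev(deltas) / sqrt(n))
```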
### Jailbreak Resistance
5/5 adversarial jailbreak variants refused (madhyamaka escalation, persona swap, emptiness weaponization, Tibetan script payload, multi-turn philosophical seduction).
## Training Details
- Base: swiss-ai/Apertus-8B-Instruct-2509
- Method: QLoRA — 4-bit NF4, r=64, alpha=128
- Target modules: q_proj, k_proj, v_proj, o_proj, up_proj, down_proj (no gate_proj — Apertus uses xIELU activation, not gated MLP)
- Schedule: 3 epochs, effective batch 16, cosine LR 2e-4, paged AdamW 8-bit
- Hardware: NVIDIA L40 46GB
- Training data: Same 4,234 examples as Llama v10.1 (exported from training.db with system-prompt v4 and reward-evaluator category prompts)
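The adapter configuration above can be written out explicitly. This is a plain-dict sketch of the stated hyperparameters (with Hugging Face `peft` it would map onto a `LoraConfig`); the key point is the target-module list without `gate_proj`.

```python
# QLoRA adapter hyperparameters from the training details above.
# Note: no "gate_proj" -- Apertus uses xIELU, not a gated MLP.
lora_config = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "up_proj", "down_proj",                  # MLP (no gate_proj)
    ],
    "task_type": "CAUSAL_LM",
}
```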
## Available Files
| File | Size | Description |
|---|---|---|
| karma-electric-apertus-8b-v10.1-Q8_0.gguf | ~8 GB | High-quality quantization for llama.cpp |
| karma-electric-apertus-8b-v10.1-Q4_K_M.gguf | ~4.6 GB | Smaller quantization for deployment |
| reward-eval.gbnf | ~1 KB | GBNF grammar for structured reward-evaluator output |
## Also Available
- karma-electric-llama31-8b — Llama 3.1 8B variant. Primary reward evaluator with activation capping support. All validation gates pass.
- karma-electric-r1distill-7b — DeepSeek R1-Distill-Qwen-7B with reasoning traces. Best as conversational model.
## Project
Full training scripts, datasets, evaluation results, and research documentation: github.com/anicka-net/karma-electric-project
## License
Apache 2.0 (Apertus base model license)