# Karma Electric v13 — Apertus 8B
A value-aligned language model fine-tuned for ethical reasoning through consequence analysis, built with two-stage thinking training on the Swiss AI Apertus 8B Instruct base.
## Approach
Karma Electric trains models on a structured ethical framework where the optimization target is suffering reduction rather than preference matching. Ethics emerges from understanding interdependence and consequences, not from learning surface-level preference patterns. For a full description of the framework see the Llama 3.1 8B release.
This Apertus variant uses the xIELU activation function (no gated MLP), enhanced multilingual pre-training, and the Apertus-native `<|inner_prefix|>` / `<|inner_suffix|>` thinking tokens.
## Current Version: v13

Two-stage training pipeline:

**Stage 1 — Reasoning foundation** (30k+ examples, 2 epochs)

- Upstream extended thinking traces: Open-Orca, Dolphin, lordx64 Opus 4.7 reasoning distillation
- Mixture-of-Thought (MoT) multi-domain reasoning

**Stage 2 — KE ethics** (~4,234 examples, 3 epochs)

- Same Teapot-composed training data as v12
- Consequence-based ethical reasoning with `<think>` traces converted to the Apertus inner-monologue format
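The Stage 2 trace conversion can be pictured as a small rewrite step. The sketch below is an illustration only, not the project's actual tooling: the helper name and the exact rewrite rule are assumptions, mapping `<think>...</think>` spans onto the Apertus inner-monologue tokens named above.

```python
import re

# Hypothetical helper (not from the KE repo): rewrites a Llama/R1-style
# <think>...</think> reasoning trace into the Apertus inner-monologue
# token format described in this card.
def to_apertus_inner(text: str) -> str:
    return re.sub(
        r"<think>(.*?)</think>",
        lambda m: "<|inner_prefix|>" + m.group(1) + "<|inner_suffix|>",
        text,
        flags=re.DOTALL,
    )

example = "<think>Weigh the consequences first.</think>Here is my answer."
print(to_apertus_inner(example))
# <|inner_prefix|>Weigh the consequences first.<|inner_suffix|>Here is my answer.
```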
## What's new in v13 (vs v12)
- Two-stage training: reasoning foundation before ethics (v12 was ethics-only)
- lordx64 Opus 4.7 reasoning traces in Stage 1 (3,500 high-quality extended thinking examples)
- Richer upstream-thinking pool (30k+ vs none in v12)
## Training details
- QLoRA (4-bit NF4, bfloat16 compute, double-quant)
- LoRA r=64, alpha=128, dropout 0.05, all attention and MLP projections (q, k, v, o, up, down)
- Max context 4,096 tokens
- Seed 42
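For reference, the hyperparameters above expressed as a plain config dict. The `*_proj` module names follow common Hugging Face naming conventions and are an assumption here; the card itself lists the targets only as q, k, v, o, up, down.

```python
# LoRA settings from the training details above, as a plain dict.
lora_config = {
    "r": 64,                # adapter rank
    "lora_alpha": 128,      # scaling numerator: effective scale = alpha / r
    "lora_dropout": 0.05,
    "target_modules": [     # all attention and MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj", "up_proj", "down_proj",
    ],
    "seed": 42,
    "max_seq_len": 4096,
}

# With alpha = 2 * r, adapter updates are scaled by 2.0 before being
# added to the frozen 4-bit base weights.
scaling = lora_config["lora_alpha"] / lora_config["r"]
print(scaling)  # 2.0
```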
## Evaluation
- Safety: 5/5 — refusals on weapons, phishing, Madhyamaka jailbreak, CSAM, social engineering
- Sanity: 2/2 — coding and factual answers correct
- Quality: 2/2 — substantive grief and career responses
- Result: 9/10 on quick reward-probe (same as v12)
## Safety
KE replaces refusal-template safety with consequence reasoning. The model holds boundaries by explaining real-world impact, not by citing policy.
## Usage

### Chat template

Apertus uses a native Jinja chat template with `<|inner_prefix|>` / `<|inner_suffix|>` markers for model-internal thinking. Use `--jinja --chat-template-file` with llama-server (or the equivalent Transformers `apply_chat_template`). The `chat_template.jinja` file is included in this repo.
### llama.cpp

```shell
# Conversation mode
llama-cli -m karma-electric-apertus-8b-v13-Q4_K_M.gguf -cnv \
  --jinja --chat-template-file chat_template.jinja

# Server mode
llama-server -m karma-electric-apertus-8b-v13-Q4_K_M.gguf \
  --port 8384 -c 4096 \
  --jinja --chat-template-file chat_template.jinja
```
### Python (Transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anicka/karma-electric-apertus-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

messages = [
    {"role": "system", "content": open("system-prompt.txt").read().strip()},
    {"role": "user", "content": "How should I think about this ethical dilemma?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=800, do_sample=False)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
## System prompt

The recommended system prompt is in `system-prompt.txt`:

> You are Karma Electric, an AI assistant grounded in ethical reasoning through consequence analysis and interdependence. You reduce suffering through honest, compassionate engagement — helping people see clearly while meeting them where they are. You maintain appropriate boundaries without moralizing or interrogating. Your goal is to reduce suffering, not to perform helpfulness.
## Available Files

| File | Description |
|---|---|
| `model.safetensors` | Merged model weights (bfloat16) |
| `config.json`, `tokenizer.json`, `tokenizer_config.json` | Standard Transformers files |
| `chat_template.jinja` | Apertus native chat template |
| `karma-electric-apertus-8b-v13-Q4_K_M.gguf` | Q4_K_M quantization for llama.cpp |
| `system-prompt.txt` | Recommended KE system prompt |
## Also Available

- `karma-electric-llama31-8b` — Llama 3.1 8B v12, the primary release with full validation and activation-capping support.
- `karma-electric-qwen25-7b` — Qwen 2.5 7B Instruct v12.
- `karma-electric-r1distill-llama-8b` — DeepSeek R1-Distill-Llama-8B v12.
## Project
Training scripts, datasets, and research documentation: github.com/anicka-net/karma-electric-project
Training composition tool: github.com/anicka-net/teapot
## License
Apache 2.0 (Apertus base model license)
### Python (llama-cpp-python)

```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="anicka/karma-electric-apertus-8b",
    filename="karma-electric-apertus-8b-v13-Q4_K_M.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```