TE-NIMS Severian — Stage 9 Adapter (Gemma 4 E4B, PEFT LoRA)

Fine-tuned LoRA adapter for verified NIMS/ICS decision support in civilian emergency management. Built on google/gemma-4-E4B-it by Terminus Est AI for the Gemma 4 for Good Hackathon.

This repository is the Stage 9 adapter and training-lineage artifact. The deployable GGUF used by the public TE NIMS demo lives in the separate repo:

Inference GGUF: tmancino/te-nims-e4b-stage9-gguf

What it does

Severian assists Incident Commanders with NIMS ICS decision-making: resource allocation, unified command, span of control, and situational assessment — grounded in FEMA doctrine.

ODA score: 0.71 on the 52-case kaggle_demo rubric with direct inference (passing bar: 0.70). Full-harness score: 0.916 with the system prompt and citation-contract pipeline.

Model details

Field	Value
Base model	`google/gemma-4-E4B-it`
Adapter type	LoRA (PEFT format)
Deployment artifact	`tmancino/te-nims-e4b-stage9-gguf`
LoRA rank	8
LoRA alpha	160
Target modules	`q_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `per_layer_input_gate`, `per_layer_projection`
Training	SFT, 165 iters, lr=5e-5, warm-started from Stage 8
Corpus	123 records (103 IC recommendation scenarios + 20 ODA-v1)
Adapter size	27.7 MB
License	Apache 2.0
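
For readers mapping the table onto PEFT's conventions, here is a minimal sketch of how the rank and alpha values combine. Under the standard (non-rsLoRA) LoRA convention, the low-rank update is multiplied by alpha / rank, so the values in the table give an effective scale of 20. The helper name lora_scale is illustrative and not part of any library.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Effective multiplier applied to the low-rank update B @ A
    under the standard LoRA convention (scale = alpha / rank)."""
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return alpha / rank


# Values taken from the model details table.
RANK = 8
ALPHA = 160

print(lora_scale(RANK, ALPHA))  # 20.0
```

The alpha value is unusually high for rank 8. It may be a by-product of the MLX-to-PEFT conversion, but the card does not say so, and that reading is an assumption.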

How to use

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in bfloat16 and let Accelerate place it on the available device(s).
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E4B-it")

# Attach the Stage 9 LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base_model, "tmancino/te-nims-e4b-stage9")

system_prompt = """You are Severian, a NIMS ICS decision support agent.
Provide doctrine-grounded incident command recommendations.
Cite specific ICS principles (e.g., ICS-201, ICS-202) when relevant.
If unsure, defer to established doctrine rather than improvise."""

# Hand-built single-turn prompt: system turn, user turn, then an open model turn.
prompt = f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n<start_of_turn>user\nA structure fire with possible entrapment. Three engine companies on scene. Recommend incident command structure.<end_of_turn>\n<start_of_turn>model\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.5, do_sample=True)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
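
The hand-written prompt string in the snippet above can be factored into a small helper so that the system prompt and the incident description vary independently. This is a minimal sketch that reproduces the same turn markers. The function name build_prompt is hypothetical and does not come from the TE-NIMS repo.

```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn prompt using the same turn markers as the
    usage snippet: a system turn, a user turn, then an open model turn."""
    return (
        f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n"
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


example = build_prompt(
    "You are Severian, a NIMS ICS decision support agent.",
    "A structure fire with possible entrapment. Three engine companies on scene. "
    "Recommend incident command structure.",
)
print(example)
```

In transformers, tokenizer.apply_chat_template is the general-purpose way to build prompts from a list of messages. Whether the base model's bundled template accepts a system role is not stated in this card, so check it before switching.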

Alternative: MLX inference (Apple Silicon)

This adapter was trained with mlx_lm. For Apple Silicon, the original MLX format may be more efficient. See the TE-NIMS repo for the MLX-native inference path.

Runtime relationship

The public demo and Docker deployment do not serve this adapter directly. They serve the converted GGUF artifact through Ollama under the local runtime name severian-ollama.
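
As an illustration of that runtime path, the sketch below sends one chat request to a local Ollama server. It assumes that Ollama is listening on its default port (11434) and that the GGUF has already been registered locally under the name severian-ollama. The helper names build_chat_payload and ask_severian are hypothetical and are not part of the demo code.

```python
import json
import urllib.request

# Assumption: Ollama's default local chat endpoint.
OLLAMA_URL = "http://localhost:11434/api/chat"
# Local runtime name given in the section above.
MODEL_NAME = "severian-ollama"


def build_chat_payload(system_prompt: str, user_message: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": MODEL_NAME,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "stream": False,
    }


def ask_severian(system_prompt: str, user_message: str, timeout: float = 120.0) -> str:
    """POST the payload to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(system_prompt, user_message)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # A non-streaming chat response carries the reply under message.content.
    return body["message"]["content"]
```

Calling ask_severian(system_prompt, "A structure fire with possible entrapment...") needs a running Ollama instance. build_chat_payload can be inspected offline.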

Training lineage

te-nims-e4b-stage9 → stage8 → stage6 → stage5 (GRPO) → stage3 (SFT+RLVR) → google/gemma-4-E4B-it (newest first; each arrow points to the parent checkpoint)

The adapter was converted from MLX LoRA format to PEFT format using the repo's AI/training/grpo/convert_mlx_adapter_to_peft.py. The script transposes the weights and remaps key names, for example language_model.model.layers.N.X.lora_a → base_model.model.model.layers.N.X.lora_A.default.weight.
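
The two conversion steps described above (key remapping and weight transposition) can be sketched in plain Python. This is not the repo's convert_mlx_adapter_to_peft.py; it only illustrates the documented key pattern. It assumes that lora_b maps to lora_B in the same way lora_a maps to lora_A, and it uses nested lists in place of real tensors. The names remap_key, transpose, and convert are hypothetical.

```python
import re

# MLX-side key pattern from the description above:
#   language_model.model.layers.<N>.<module path>.lora_a  (lora_b assumed analogous)
_MLX_KEY = re.compile(r"^language_model\.model\.layers\.(\d+)\.(.+)\.lora_([ab])$")


def remap_key(mlx_key: str) -> str:
    """Rename one MLX LoRA key to the PEFT-style key shown in the text."""
    match = _MLX_KEY.match(mlx_key)
    if match is None:
        raise ValueError(f"unrecognized MLX LoRA key: {mlx_key}")
    layer, module_path, which = match.groups()
    return (
        f"base_model.model.model.layers.{layer}.{module_path}"
        f".lora_{which.upper()}.default.weight"
    )


def transpose(matrix: list) -> list:
    """Transpose a 2-D matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def convert(mlx_weights: dict) -> dict:
    """Apply both steps (rename, then transpose) to every entry."""
    return {remap_key(key): transpose(value) for key, value in mlx_weights.items()}


mlx_weights = {
    "language_model.model.layers.0.self_attn.q_proj.lora_a": [[1, 2], [3, 4], [5, 6]],
}
print(convert(mlx_weights))
```

A real converter would operate on tensors loaded from the adapter's safetensors file and would also need to write a matching PEFT adapter config, which this sketch leaves out.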

About Terminus Est AI

Building verified AI for civilian emergency management — doctrine-bound, edge-deployable, auditable.
