TE-NIMS Severian: Stage 9 Adapter (Gemma 4 E4B, PEFT LoRA)

Fine-tuned LoRA adapter for verified NIMS/ICS decision support in civilian emergency management. Built on google/gemma-4-E4B-it by Terminus Est AI for the Gemma 4 for Good Hackathon.

This repository is the Stage 9 adapter and training-lineage artifact. The deployable GGUF used by the public TE NIMS demo lives in the separate repo:

  • Inference GGUF: tmancino/te-nims-e4b-stage9-gguf

What it does

Severian assists Incident Commanders with NIMS ICS decision-making: resource allocation, unified command, span of control, and situational assessment, all grounded in FEMA doctrine.

ODA score: 0.71 on the 52-case kaggle_demo rubric with direct inference (passing bar: 0.70). Full harness score: 0.916 with the system prompt and citation-contract pipeline.

Model details

| Field | Value |
|---|---|
| Base model | google/gemma-4-E4B-it |
| Adapter type | LoRA (PEFT format) |
| Deployment artifact | tmancino/te-nims-e4b-stage9-gguf |
| LoRA rank | 8 |
| LoRA alpha | 160 |
| Target modules | q_proj, o_proj, gate_proj, up_proj, down_proj, per_layer_input_gate, per_layer_projection |
| Training | SFT, 165 iters, lr=5e-5, warm-started from Stage 8 |
| Corpus | 123 records (103 IC recommendation scenarios + 20 ODA-v1) |
| Adapter size | 27.7 MB |
| License | Apache 2.0 |
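
The rank and alpha values above together determine how strongly the adapter's low-rank update is weighted. As a minimal sketch, assuming PEFT's default scaling rule (lora_alpha / r, without rank-stabilized scaling; the adapter's own config file is the authority if it says otherwise), the effective factor can be worked out as follows. The function name is illustrative and not part of any library.

```python
# Values copied from the model details above.
LORA_RANK = 8
LORA_ALPHA = 160


def lora_scaling(alpha: float, rank: int) -> float:
    """Return the default LoRA scaling factor, alpha / rank.

    Assumption: standard (non-rsLoRA) scaling is in effect.
    """
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return alpha / rank


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
print(scaling)  # 20.0
```

Under that assumption, each adapted layer adds 20 times the product of its two low-rank matrices to the frozen base weight.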

How to use

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

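# Load the base model in bfloat16 and let Accelerate place it on the available device(s).
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E4B-it")
# Attach the Stage 9 LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, "tmancino/te-nims-e4b-stage9")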

system_prompt = """You are Severian, a NIMS ICS decision support agent.
Provide doctrine-grounded incident command recommendations.
Cite specific ICS principles (e.g., ICS-201, ICS-202) when relevant.
If unsure, defer to established doctrine rather than improvise."""

prompt = f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n<start_of_turn>user\nA structure fire with possible entrapment. Three engine companies on scene. Recommend incident command structure.<end_of_turn>\n<start_of_turn>model\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.5, do_sample=True)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
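
To reuse the prompt layout from the example above for other scenarios, the string can be factored into a small helper. This is a sketch only: build_prompt is a hypothetical name, and the turn markers are copied from the example rather than taken from the tokenizer's built-in chat template, which has not been checked here.

```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the layout used by the example above.

    Assumption: the system/user/model turn markers mirror the hand-written
    prompt string in this card, not a verified chat template.
    """
    return (
        f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n"
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


example = build_prompt(
    "You are Severian, a NIMS ICS decision support agent.",
    "A structure fire with possible entrapment. Three engine companies on scene. "
    "Recommend incident command structure.",
)
print(example)
```

The returned string can be passed to the tokenizer in place of the literal prompt.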

Alternative: MLX inference (Apple Silicon)

This adapter was trained with mlx_lm. For Apple Silicon, the original MLX format may be more efficient. See the TE-NIMS repo for the MLX-native inference path.

Runtime relationship

The public demo and Docker deployment do not serve this adapter directly. They serve the converted GGUF artifact through Ollama under the local runtime name severian-ollama.
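
For readers who run the GGUF locally, a request to Ollama's chat endpoint might look like the sketch below. It assumes that severian-ollama is usable as the Ollama model name and that Ollama listens on its default local port (11434); neither is confirmed by this card. The helper names are hypothetical, and nothing is sent until ask() is called.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local address (assumed)
MODEL_NAME = "severian-ollama"  # assumed to be the Ollama model name


def build_chat_request(user_message, system_prompt=None,
                       model=MODEL_NAME, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming chat request for Ollama."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    payload = {"model": model, "messages": messages, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_message, system_prompt=None, timeout=120.0):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_message, system_prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["message"]["content"]
```

With a local Ollama instance running, ask("Three engine companies on scene. Recommend incident command structure.") would return the model's recommendation as a string.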

Training lineage

te-nims-e4b-stage9 → stage8 → stage6 → stage5 (GRPO) → stage3 (SFT+RLVR) → google/gemma-4-E4B-it

Converted from MLX LoRA format to PEFT format using the repo's AI/training/grpo/convert_mlx_adapter_to_peft.py (transposes weights and remaps key conventions: language_model.model.layers.N.X.lora_a → base_model.model.model.layers.N.X.lora_A.default.weight).
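
The two conversion steps described above, key remapping and weight transposition, can be illustrated with a small pure-Python sketch. This is not the repo's actual script: the function names are hypothetical, plain nested lists stand in for real tensors, and the lora_b mapping is assumed by analogy with the lora_a mapping given above.

```python
import re

# Matches MLX-style LoRA keys such as
# "language_model.model.layers.3.self_attn.q_proj.lora_a".
_MLX_KEY = re.compile(r"^language_model\.model\.layers\.(\d+)\.(.+)\.lora_([ab])$")


def remap_key(mlx_key: str) -> str:
    """Map an MLX LoRA key to the PEFT-style key shown in this card."""
    match = _MLX_KEY.match(mlx_key)
    if match is None:
        raise ValueError(f"unrecognized MLX LoRA key: {mlx_key!r}")
    layer, module, which = match.groups()
    # Assumption: lora_b maps to lora_B the same way lora_a maps to lora_A.
    return (
        f"base_model.model.model.layers.{layer}.{module}"
        f".lora_{which.upper()}.default.weight"
    )


def transpose(matrix):
    """Transpose a 2-D matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def convert(mlx_weights: dict) -> dict:
    """Remap every key and transpose every weight matrix."""
    return {remap_key(key): transpose(value) for key, value in mlx_weights.items()}


demo = {"language_model.model.layers.0.self_attn.q_proj.lora_a": [[1, 2, 3], [4, 5, 6]]}
print(convert(demo))
```

A real conversion would operate on tensors loaded from the adapter file rather than on nested lists, but the key pattern and the transpose step have the same shape.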

About Terminus Est AI

Building verified AI for civilian emergency management: doctrine-bound, edge-deployable, auditable.
