TE-NIMS Severian: Stage 9 Adapter (Gemma 4 E4B, PEFT LoRA)
Fine-tuned LoRA adapter for verified NIMS/ICS decision support in civilian emergency management.
Built on google/gemma-4-E4B-it by Terminus Est AI for the Gemma 4 for Good Hackathon.
This repository is the Stage 9 adapter and training-lineage artifact. The deployable GGUF used by the public TE NIMS demo lives in the separate repo:
- Inference GGUF: tmancino/te-nims-e4b-stage9-gguf
What it does
Severian assists Incident Commanders with NIMS ICS decision-making: resource allocation, unified command, span of control, and situational assessment, all grounded in FEMA doctrine.
ODA score: 0.71 on the 52-case kaggle_demo rubric (direct inference; passing bar 0.70). Full harness score: 0.916 with the system prompt and citation contract pipeline.
Model details
| Field | Value |
|---|---|
| Base model | google/gemma-4-E4B-it |
| Adapter type | LoRA (PEFT format) |
| Deployment artifact | tmancino/te-nims-e4b-stage9-gguf |
| LoRA rank | 8 |
| LoRA alpha | 160 |
| Target modules | q_proj, o_proj, gate_proj, up_proj, down_proj, per_layer_input_gate, per_layer_projection |
| Training | SFT, 165 iters, lr=5e-5, warm-started from Stage 8 |
| Corpus | 123 records (103 IC recommendation scenarios + 20 ODA-v1) |
| Adapter size | 27.7 MB |
| License | Apache 2.0 |
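The alpha value looks large next to the rank, so a short arithmetic sketch may help. It assumes PEFT's default (non rank-stabilized) scaling, where the low-rank update is multiplied by lora_alpha / r; the helper name `lora_scaling` is illustrative only and is not part of any library.

```python
# Values copied from the model-details table above.
LORA_RANK = 8
LORA_ALPHA = 160


def lora_scaling(alpha: float, rank: int) -> float:
    """Default LoRA scaling factor applied to the low-rank update (alpha / rank)."""
    return alpha / rank


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
print(scaling)  # 20.0
```

Under that assumption, the adapter's update is applied with an effective scale of 20.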
How to use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model, then attach the Stage 9 LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E4B-it")
model = PeftModel.from_pretrained(base_model, "tmancino/te-nims-e4b-stage9")

system_prompt = """You are Severian, a NIMS ICS decision support agent.
Provide doctrine-grounded incident command recommendations.
Cite specific ICS principles (e.g., ICS-201, ICS-202) when relevant.
If unsure, defer to established doctrine rather than improvise."""

prompt = f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n<start_of_turn>user\nA structure fire with possible entrapment. Three engine companies on scene. Recommend incident command structure.<end_of_turn>\n<start_of_turn>model\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.5, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
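To try other scenarios without hand-editing the prompt string, a small helper can assemble the same turn layout. This is a sketch only: `build_prompt` and the wildland scenario are hypothetical, and the layout is copied from the example above rather than from the base model's bundled chat template (which `tokenizer.apply_chat_template` would apply instead).

```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the turn layout used in the example above."""
    return (
        f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n"
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


# Hypothetical scenario, for illustration only.
prompt = build_prompt(
    "You are Severian, a NIMS ICS decision support agent.",
    "Wildland fire crossing a county line. Two agencies responding. "
    "Recommend a command structure.",
)
```

The resulting string can be passed to the tokenizer exactly as `prompt` is in the example.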
Alternative: MLX inference (Apple Silicon)
This adapter was trained with mlx_lm. For Apple Silicon, the original MLX format
may be more efficient. See the TE-NIMS repo
for the MLX-native inference path.
Runtime relationship
The public demo and Docker deployment do not serve this adapter directly.
They serve the converted GGUF artifact through Ollama under the local runtime
name severian-ollama.
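For readers who want to query the served GGUF rather than load the adapter, the following is a minimal sketch against Ollama's HTTP chat endpoint. It assumes Ollama is reachable on its default local port and that the model is registered under the tag severian-ollama. The text above only gives that as the local runtime name, so the actual tag should be checked with `ollama list`. The function names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint
MODEL_TAG = "severian-ollama"  # assumption: verify the real tag with `ollama list`


def build_chat_request(system_prompt: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming chat request for Ollama."""
    payload = {
        "model": MODEL_TAG,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "stream": False,
        "options": {"temperature": 0.5},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(system_prompt: str, user_message: str) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(system_prompt, user_message)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

Calling `ask(...)` requires a running Ollama instance. In a Docker deployment the host and port may differ from the defaults assumed here.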
Training lineage
te-nims-e4b-stage9 ← stage8 ← stage6 ← stage5 (GRPO) ← stage3 (SFT+RLVR) ← google/gemma-4-E4B-it
This adapter was converted from MLX LoRA format to PEFT format using the repo's
AI/training/grpo/convert_mlx_adapter_to_peft.py, which transposes weights and
remaps key conventions (language_model.model.layers.N.X.lora_a →
base_model.model.model.layers.N.X.lora_A.default.weight).
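The two steps named above (transpose, then remap keys) can be illustrated with a small pure-Python sketch. It is not the repo's conversion script. The lora_b mapping is assumed by analogy with the lora_a mapping given in the text, the module path `self_attn.q_proj` is an example, and weights are modeled as nested lists rather than real tensors.

```python
import re

# MLX-style LoRA key pattern as described in the text. The lora_b branch is an
# assumption by analogy; the source only spells out the lora_a mapping.
_MLX_KEY = re.compile(r"^language_model\.model\.layers\.(\d+)\.(.+)\.lora_(a|b)$")


def remap_key(mlx_key: str) -> str:
    """Map an MLX LoRA key to the PEFT-style name given in the text."""
    match = _MLX_KEY.match(mlx_key)
    if match is None:
        raise ValueError(f"unrecognized MLX LoRA key: {mlx_key}")
    layer, module_path, ab = match.groups()
    return (
        f"base_model.model.model.layers.{layer}.{module_path}"
        f".lora_{ab.upper()}.default.weight"
    )


def transpose(matrix):
    """Transpose a 2-D weight stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def convert(mlx_weights: dict) -> dict:
    """Remap every key and transpose every weight matrix."""
    return {remap_key(key): transpose(weight) for key, weight in mlx_weights.items()}


# Toy 3x2 matrix standing in for a real lora_a weight.
example = {
    "language_model.model.layers.0.self_attn.q_proj.lora_a": [
        [1.0, 2.0],
        [3.0, 4.0],
        [5.0, 6.0],
    ],
}
converted = convert(example)
```

Here the single example key becomes `base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight`, and its 3x2 matrix becomes 2x3.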
About Terminus Est AI
Building verified AI for civilian emergency management: doctrine-bound, edge-deployable, auditable.