How to use from
Docker Model Runner
docker model run hf.co/tnadmin/Sentinel-NX:Q6_K
Quick Links

Sentinel-NX โ€” Cisco IOS-XE Config Assistant (V3.1, GGUF)

A small, edge-deployable Cisco IOS-XE configuration assistant: a QLoRA fine-tune of Qwen2.5-Coder-3B-Instruct, merged and quantized to GGUF. It emits strict, syntactically valid IOS-XE for exactly what's requested โ€” no invented interfaces, IPs, loopbacks, route-maps, no shutdowns, descriptions, or unrequested best-practice config.

Built with Qwen. Non-commercial only (see License).

Project / code / methodology: https://github.com/tnadmin1/Sentinel-NX

Files

File Quant Size Use
sentinel-nx-q8_0.gguf Q8_0 ~3.1 GB Primary โ€” highest fidelity
sentinel-nx-q6_k.gguf Q6_K ~2.4 GB Faster, near-lossless

Results

Manually-scored benchmarks; the hidden set uses entirely new interfaces, VLANs, ASNs, IPs, and object names not seen in training (a generalization test).

Hidden 20-prompt benchmark (5 pts each):

Model Score
Base Qwen2.5-Coder-3B-Instruct 58 / 100
V2 71 / 100
V3.1 97 / 100

Original 25-prompt benchmark (4 pts each): Base 58 โ†’ V2 70 โ†’ V3 69 โ†’ V3.1 93.

Usage

# Ollama (pull directly from this repo)
ollama run hf.co/tnadmin/Sentinel-NX:Q8_0
# llama.cpp
./llama-cli -m sentinel-nx-q8_0.gguf --temp 0 -c 4096 -cnv \
  -sys "You are a Cisco IOS-XE configuration assistant. Output only strict, valid configuration for exactly what is requested. Do not invent values."

Strict behavior is prompt-conditioned. The model suppresses over-completion when the system prompt and request instruct it to (e.g. "Do not add descriptions, no shutdown, spanning-tree, or anything not explicitly requested"). Use a strict prompt for best results.

Known limitations

  • OSPF router-id is occasionally emitted as ip ospf <process> router-id <id> under an interface instead of router-id under router ospf <process>. Targeted corrective data is the next iteration.

Training

QLoRA (LoRA rank 16) on Qwen2.5-Coder-3B-Instruct, RTX 4070 12 GB. ~5,200 curated + failure-driven remedial IOS-XE instruction pairs, built through three corrective rounds (V2 โ†’ V3 โ†’ V3.1). See the GitHub repo for the full methodology.

License & attribution

This model is a derivative of Qwen2.5-Coder-3B-Instruct and is distributed under the Qwen Research License โ€” non-commercial use only. Built with Qwen. Copyright (c) Alibaba Cloud. All Rights Reserved.

Downloads last month
35
GGUF
Model size
3B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for tnadmin/Sentinel-NX

Base model

Qwen/Qwen2.5-3B
Quantized
(107)
this model