Sentinel-NX / README.md
tnadmin's picture
Update README.md
4514268 verified
|
Raw
History Blame Contribute Delete
2.83 kB
metadata
license: other
license_name: qwen-research
license_link: https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct/blob/main/LICENSE
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
tags:
  - cisco
  - ios-xe
  - network-automation
  - gguf
  - qwen2.5-coder
  - non-commercial
library_name: llama.cpp
pipeline_tag: text-generation

Sentinel-NX β€” Cisco IOS-XE Config Assistant (V3.1, GGUF)

A small, edge-deployable Cisco IOS-XE configuration assistant: a QLoRA fine-tune of Qwen2.5-Coder-3B-Instruct, merged and quantized to GGUF. It emits strict, syntactically valid IOS-XE for exactly what's requested β€” no invented interfaces, IPs, loopbacks, route-maps, no shutdowns, descriptions, or unrequested best-practice config.

Built with Qwen. Non-commercial only (see License).

Project / code / methodology: https://github.com/tnadmin1/Sentinel-NX

Files

File Quant Size Use
sentinel-nx-q8_0.gguf Q8_0 ~3.1 GB Primary β€” highest fidelity
sentinel-nx-q6_k.gguf Q6_K ~2.4 GB Faster, near-lossless

Results

Manually-scored benchmarks; the hidden set uses entirely new interfaces, VLANs, ASNs, IPs, and object names not seen in training (a generalization test).

Hidden 20-prompt benchmark (5 pts each):

Model Score
Base Qwen2.5-Coder-3B-Instruct 58 / 100
V2 71 / 100
V3.1 97 / 100

Original 25-prompt benchmark (4 pts each): Base 58 β†’ V2 70 β†’ V3 69 β†’ V3.1 93.

Usage

# Ollama (pull directly from this repo)
ollama run hf.co/tnadmin/Sentinel-NX:Q8_0
# llama.cpp
./llama-cli -m sentinel-nx-q8_0.gguf --temp 0 -c 4096 -cnv \
  -sys "You are a Cisco IOS-XE configuration assistant. Output only strict, valid configuration for exactly what is requested. Do not invent values."

Strict behavior is prompt-conditioned. The model suppresses over-completion when the system prompt and request instruct it to (e.g. "Do not add descriptions, no shutdown, spanning-tree, or anything not explicitly requested"). Use a strict prompt for best results.

Known limitations

  • OSPF router-id is occasionally emitted as ip ospf <process> router-id <id> under an interface instead of router-id under router ospf <process>. Targeted corrective data is the next iteration.

Training

QLoRA (LoRA rank 16) on Qwen2.5-Coder-3B-Instruct, RTX 4070 12 GB. ~5,200 curated + failure-driven remedial IOS-XE instruction pairs, built through three corrective rounds (V2 β†’ V3 β†’ V3.1). See the GitHub repo for the full methodology.

License & attribution

This model is a derivative of Qwen2.5-Coder-3B-Instruct and is distributed under the Qwen Research License β€” non-commercial use only. Built with Qwen. Copyright (c) Alibaba Cloud. All Rights Reserved.