mistral3b-netops-teamleader

Note on repo name: this repo is named mistral3b-..., but the actual base model is Mistral-7B-Instruct-v0.3 (hidden_size=4096, 32 layers). The 3b in the name is a leftover from early project planning; the weights are 7B.

Model Description

Fine-tuned mistralai/Mistral-7B-Instruct-v0.3 for Cisco IOS syslog classification into TROUBLESHOOTING / STABILITY / SECURITY for the VexpertAI agentic network operations platform.

This model powers the Team Leader agent log-routing decision: given a raw syslog line from a Cisco IOS router, it emits exactly one of three routing labels that determines which specialist agent handles the incident.

Base Model

mistralai/Mistral-7B-Instruct-v0.3

  • Architecture: MistralForCausalLM
  • Parameters: 7B (hidden_size=4096, num_hidden_layers=32)

Training Data

  • Type: Synthetic, IOS syslog format
  • Size: 2,100 labeled examples (700 per class) + 150 hard-negative samples
  • Lab topology: EVE-NG digital twin with two Cisco IOS routers (R1, R2) running OSPF area 0
  • Syslog format: *MMM DD HH:MM:SS.mmm: %FACILITY-SEVERITY-MNEMONIC: message
  • Classes:
    • TROUBLESHOOTING: connectivity loss, interface flaps, adjacency changes, reachability failures
    • STABILITY: CPU/memory exhaustion, process crashes, hardware errors, recurring instability
    • SECURITY: ACL deny hits, login failures, brute force, SNMP auth failures, IPS signatures
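The syslog format above is regular enough to parse mechanically. A minimal sketch (the regex and helper name are ours, based on the *MMM DD HH:MM:SS.mmm: %FACILITY-SEVERITY-MNEMONIC: message shape documented above):

```python
import re

# Matches the Cisco IOS syslog shape documented above:
# *MMM DD HH:MM:SS.mmm: %FACILITY-SEVERITY-MNEMONIC: message
SYSLOG_RE = re.compile(
    r"^\*(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<message>.*)$"
)

def parse_syslog(line):
    """Return the structured fields of an IOS syslog line, or None if it doesn't match."""
    m = SYSLOG_RE.match(line)
    return m.groupdict() if m else None
```

Such a pre-parse can cheaply reject non-syslog input before it ever reaches the model.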

Evaluation Results

Metric                    Value
Overall Accuracy          94.29%
Inference Latency (mean)  917.4 ms/sample
Inference Latency (std)   304.4 ms

Class            Precision  Recall  F1      Support
TROUBLESHOOTING  1.0000     0.8714  0.9313  70
STABILITY        0.8816     0.9571  0.9178  70
SECURITY         0.9589     1.0000  0.9790  70
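The per-class numbers follow the standard precision/recall/F1 definitions. A self-contained sketch of how such a report can be computed from parallel label lists (pure Python, no sklearn; the function name is ours, not from the evaluation pipeline):

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and support for each class
    given parallel lists of gold labels and predictions."""
    report = {}
    support = Counter(y_true)
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": support[c]}
    return report
```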

Inference Example

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "eduard76/mistral3b-netops-teamleader"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

syslog = "*Mar  1 00:05:23.123: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down"

messages = [
    {"role": "system", "content": "You are a network operations classifier. Given a Cisco IOS syslog line, output exactly one word: TROUBLESHOOTING, STABILITY, or SECURITY. No explanation."},
    {"role": "user", "content": syslog},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

# IMPORTANT: use max_new_tokens >= 12, not 5.
# "TROUBLESHOOTING" tokenizes to 7 tokens in Mistral's tokenizer;
# max_new_tokens=5 silently truncates it and produces an INVALID label.
output = model.generate(**inputs, max_new_tokens=12, do_sample=False)
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip().split()[0].upper())
# -> TROUBLESHOOTING
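Since free-form generation can still drift off-label, a defensive wrapper that validates the decoded text against the three routing labels before acting on it is a cheap safeguard (a sketch; the fallback behavior is our choice, not part of this model card):

```python
VALID_LABELS = {"TROUBLESHOOTING", "STABILITY", "SECURITY"}

def extract_label(decoded_text, fallback="TROUBLESHOOTING"):
    """Take the first whitespace-separated token of the model output,
    uppercase it, strip trailing punctuation, and fall back to a
    default label if the result is not one of the three valid labels."""
    tokens = decoded_text.strip().split()
    if not tokens:
        return fallback
    candidate = tokens[0].upper().strip(".,:;")
    return candidate if candidate in VALID_LABELS else fallback
```

In an agentic pipeline, routing on a validated label (with the fallback logged as an anomaly) is safer than trusting the raw decode.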

Intended Use

  • Primary use: Team Leader agent in VexpertAI agentic NetOps platform for routing syslog events to the correct specialist agent (Stability, Troubleshooting, or Security)
  • Secondary use: Offline syslog triage, NOC alert pre-classification

Out of Scope

  • Not for direct network configuration changes
  • Not production-validated on live Cisco IOS traffic (trained on synthetic data)
  • Not a replacement for human network engineers in critical incidents
  • Not tested on NX-OS, IOS-XE, or IOS-XR syntax variants

Training Details

  • LoRA: r=16, alpha=32, dropout=0.05, target_modules: q/k/v/o/gate/up/down projections
  • Quantization: 4-bit NF4 QLoRA (bitsandbytes)
  • Epochs: 3
  • Effective batch size: 16 (batch=4, grad_accum=4)
  • Learning rate: 2e-4 cosine schedule, warmup_ratio=0.05
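The hyperparameters above map onto a peft/bitsandbytes setup roughly as follows. This is a sketch reconstructed from the listed settings, not the actual training script; the compute dtype is an assumption:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 QLoRA quantization, per the Training Details above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16, since the T4 GPU has no bfloat16 support
)

# LoRA on all attention and MLP projections, matching r=16, alpha=32, dropout=0.05.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```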

Reproduction Gotchas

These bugs were hit during the training pipeline on an AWS EC2 g4dn.xlarge (15GB RAM, T4 GPU):

  1. max_new_tokens truncation: using max_new_tokens=5 in the evaluation loop caused all TROUBLESHOOTING predictions to be marked INVALID (F1=0.0), because that label tokenizes to 7 tokens in Mistral's tokenizer. Use max_new_tokens >= 12 to safely cover all three class labels.

  2. OOM on LoRA merge: loading the 7B base model in FP16 for merge_and_unload() peaks at ~28GB (two copies of the model in memory). On a 15GB RAM instance with no swap, this kills the process with exit code -9. Fix: add 20GB of swap before running the merge step:

    sudo fallocate -l 20G /swapfile && sudo chmod 600 /swapfile
    sudo mkswap /swapfile && sudo swapon /swapfile
    
  3. Deprecated HuggingFace kwargs: push_to_hub(use_auth_token=..., safe_serialization=...) raises TypeError on current transformers/huggingface_hub versions. Use huggingface_hub.HfApi().upload_folder() instead, which is stable and streams directly from disk without reloading the model into memory:

    from huggingface_hub import HfApi
    HfApi().upload_folder(folder_path="final_model/", repo_id="your/repo", token=hf_token)
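Gotcha 1 can also be guarded against programmatically: derive the generation budget from the tokenizer instead of hard-coding it. A minimal sketch (the helper name is ours; any object with an encode method works):

```python
def safe_max_new_tokens(tokenizer, labels, margin=3):
    """Return a max_new_tokens budget that covers the longest label,
    plus a small margin for EOS and leading-space token variants."""
    longest = max(len(tokenizer.encode(label, add_special_tokens=False))
                  for label in labels)
    return longest + margin
```

Calling this once at startup with the three class labels makes the evaluation loop robust to tokenizer changes across base-model versions.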
    