---
license: apache-2.0
base_model: nvidia/Mistral-NeMo-Minitron-8B-Instruct
tags:
- devops
- incident-response
- sre
- mistral-nemo
- fine-tuned
- qlora
language:
- en
pipeline_tag: text-generation
---

# DevOps Incident Responder

A fine-tuned Mistral-NeMo-Minitron-8B-Instruct model for DevOps incident diagnosis and resolution.

## What It Does

Analyzes error logs, stack traces, and incident descriptions to provide:
- **Root Cause** analysis
- **Severity** assessment (Low / Medium / High / Critical)
- **Step-by-step fixes** with exact commands
- **Prevention** guidance

## Tech Coverage

Kubernetes, Docker, Terraform, Azure, GCP, Node.js, Redis, MongoDB, Nginx, PostgreSQL, InfluxDB

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | nvidia/Mistral-NeMo-Minitron-8B-Instruct |
| Method | QLoRA (4-bit quantization + LoRA adapters) |
| Dataset | 4,755 examples (scraped + synthetic) |
| Eval Set | 376 examples |
| Epochs | 2 |
| LoRA Rank | 32 |
| LoRA Alpha | 64 |
| Learning Rate | 2e-4 |
| Effective Batch Size | 16 |

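The training script itself is not part of this card, so the snippet below is only a minimal sketch of a QLoRA setup consistent with the values in the table, using `peft` and `transformers`. The target modules, LoRA dropout, and the per-device batch size / gradient-accumulation split are illustrative assumptions rather than values taken from this card.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

base_id = "nvidia/Mistral-NeMo-Minitron-8B-Instruct"

# Load the base model in 4-bit (NF4) for QLoRA fine-tuning
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter settings from the table above; target modules are an assumption
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,  # assumed, not listed in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# 2 epochs at lr 2e-4; 2 x 8 accumulation steps = effective batch size 16 (assumed split)
training_args = TrainingArguments(
    output_dir="incident-responder-qlora",
    num_train_epochs=2,
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    bf16=True,
)
```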
## Usage

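Loading the model in 4-bit as shown below requires `bitsandbytes` and `accelerate` to be installed alongside `transformers`, and assumes a CUDA GPU is available.
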
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "irfanalee/incident-responder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an expert DevOps engineer and SRE. Analyze error logs, diagnose incidents, and suggest fixes."},
    {"role": "user", "content": "Analyze this kubernetes incident:\n\n```\nkubectl describe pod api-server\nState: Terminated\nReason: OOMKilled\nExit Code: 137\nRestart Count: 5\n```"}
]

# Build the prompt with the NeMo chat template used by the base model
prompt = "<extra_id_0>System\n" + messages[0]["content"] + "\n"
prompt += "<extra_id_1>User\n" + messages[1]["content"] + "\n<extra_id_1>Assistant\n"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.4,
    repetition_penalty=1.3,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
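
If you want to watch the diagnosis stream token by token instead of waiting for the full completion, a `TextStreamer` can be attached to the same `generate` call. This is an optional sketch that reuses the `model`, `tokenizer`, and `inputs` objects from the snippet above.

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.4,
    repetition_penalty=1.3,
    do_sample=True,
    streamer=streamer,
)
```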