irfanalee committed 4690a2e (verified, parent affd813): Create README.md
---
license: apache-2.0
base_model: nvidia/Mistral-NeMo-Minitron-8B-Instruct
tags:
- devops
- incident-response
- sre
- mistral-nemo
- fine-tuned
- qlora
language:
- en
pipeline_tag: text-generation
---

# DevOps Incident Responder

A fine-tuned Mistral-NeMo-Minitron-8B-Instruct model for DevOps incident diagnosis and resolution.

## What It Does

Analyzes error logs, stack traces, and incident descriptions to provide:
- **Root Cause** analysis
- **Severity** assessment (Low / Medium / High / Critical)
- **Step-by-step fixes** with exact commands
- **Prevention** guidance

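An illustrative sketch of how a response maps onto those four fields (a hypothetical layout, not captured model output):

```text
**Root Cause:** <diagnosis of the failing component>
**Severity:** <Low / Medium / High / Critical>
**Fix:**
1. <command or config change>
2. <verification step>
**Prevention:** <monitoring, limits, or process guidance>
```
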
## Tech Coverage

Kubernetes, Docker, Terraform, Azure, GCP, Node.js, Redis, MongoDB, Nginx, PostgreSQL, InfluxDB

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | nvidia/Mistral-NeMo-Minitron-8B-Instruct |
| Method | QLoRA (4-bit quantization + LoRA adapters) |
| Dataset | 4,755 examples (scraped + synthetic) |
| Eval Set | 376 examples |
| Epochs | 2 |
| LoRA Rank | 32 |
| LoRA Alpha | 64 |
| Learning Rate | 2e-4 |
| Effective Batch Size | 16 |

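The hyperparameters in the table map onto a QLoRA setup roughly like the sketch below, using the `peft` and `bitsandbytes` libraries. The target modules and dropout are assumptions for illustration, not values taken from the actual training run:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters matching the table: rank 32, alpha 64
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,  # assumption: not stated in the table
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```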
## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "irfanalee/incident-responder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an expert DevOps engineer and SRE. Analyze error logs, diagnose incidents, and suggest fixes."},
    {"role": "user", "content": "Analyze this Kubernetes incident:\n\n```\nkubectl describe pod api-server\nState: Terminated\nReason: OOMKilled\nExit Code: 137\nRestart Count: 5\n```"}
]

# Build the prompt with the NeMo chat template
prompt = "<extra_id_0>System\n" + messages[0]["content"] + "\n"
prompt += "<extra_id_1>User\n" + messages[1]["content"] + "\n<extra_id_1>Assistant\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.4,
    repetition_penalty=1.3,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
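For multi-turn conversations, the manual template construction above can be generalized into a small helper. This is a sketch that follows the same `<extra_id_0>`/`<extra_id_1>` tag convention used in the snippet, not an official tokenizer API:

```python
def build_nemo_prompt(messages):
    """Build a NeMo-style chat prompt from OpenAI-format message dicts."""
    role_tags = {
        "system": "<extra_id_0>System",
        "user": "<extra_id_1>User",
        "assistant": "<extra_id_1>Assistant",
    }
    prompt = ""
    for msg in messages:
        prompt += role_tags[msg["role"]] + "\n" + msg["content"] + "\n"
    # Leave the assistant turn open so the model completes it
    return prompt + "<extra_id_1>Assistant\n"

prompt = build_nemo_prompt([
    {"role": "system", "content": "You are an expert DevOps engineer and SRE."},
    {"role": "user", "content": "Why was my pod OOMKilled?"},
])
```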