---
datasets:
- custom_jsonl_dataset
language:
- en
library_name: transformers
license: apache-2.0
model_name: MSC Software Engineering SLM v1
tags:
- software-engineering
- QLoRA
- Mistral
- SLM
base_model:
- mistralai/Mistral-7B-v0.1
---

# Model Card: MSC Software Engineering SLM v1
This model is a **QLoRA fine-tuned variant of Mistral-7B**, optimized for **software engineering, code generation, and technical Q&A** tasks.  
It was trained on a curated dataset of software design patterns, debugging tips, Python code snippets, and AI engineering discussions to improve reasoning and contextual understanding for software-related queries.



## Model Details

- **Base Model:** `mistralai/Mistral-7B-v0.1`
- **Fine-tuning Type:** QLoRA (4-bit quantization)
- **Framework:** Hugging Face Transformers + PEFT + bitsandbytes
- **Tokenizer:** Same as base model (`AutoTokenizer.from_pretrained(base_model, use_fast=True)`)
- **Padding Token:** `tokenizer.pad_token = tokenizer.eos_token`
- **Training Objective:** Causal language modeling
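
The full training script is not included in this repository. The sketch below shows a typical QLoRA setup consistent with the details above; the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are illustrative assumptions, not published values.

```python
# Hypothetical QLoRA setup matching the details above (LoRA hyperparameters are assumed)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "mistralai/Mistral-7B-v0.1"

# Tokenizer: same as the base model, with EOS reused as the padding token
tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token

# 4-bit NF4 quantization with bfloat16 compute (see the Quantization table below)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter on the attention projections -- the values below are placeholders
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

Training then proceeds with a standard causal language modeling objective, e.g. via `transformers.Trainer` or `trl.SFTTrainer`.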

---
## Model Configuration

| **Parameter**                 | **Value**                             |
| ----------------------------- | ------------------------------------- |
| **Model Type**                | `mistral`                             |
| **Architecture**              | `MistralForCausalLM`                  |
| **Vocab Size**                | 32,768                                |
| **Max Position Embeddings**   | 32,768                                |
| **Hidden Size**               | 4,096                                 |
| **Intermediate Size**         | 14,336                                |
| **Number of Hidden Layers**   | 32                                    |
| **Number of Attention Heads** | 32                                    |
| **Number of Key-Value Heads** | 8                                     |
| **Hidden Activation**         | `silu`                                |
| **Initializer Range**         | 0.02                                  |
| **RMS Norm Epsilon**          | 1e-5                                  |
| **Dropout (Attention)**       | 0.0                                   |
| **Use Cache**                 | True                                  |
| **ROPE Theta**                | 1,000,000.0                           |
| **Quantization Method**       | `bitsandbytes`                        |
| **Quantization Config**       | 4-bit (nf4), `bfloat16` compute dtype |
| **Compute Dtype**             | `float16`                             |
| **Load In 4bit**              | βœ… Yes                                 |
| **Load In 8bit**              | ❌ No                                  |
| **Tie Word Embeddings**       | False                                 |
| **Is Encoder-Decoder**        | False                                 |
| **BOS Token ID**              | 1                                     |
| **EOS Token ID**              | 2                                     |
| **Pad Token ID**              | None                                  |
| **Generation Settings**       |                                       |
| β†’ Max Length                  | 20                                    |
| β†’ Min Length                  | 0                                     |
| β†’ Temperature                 | 1.0                                   |
| β†’ Top-k                       | 50                                    |
| β†’ Top-p                       | 1.0                                   |
| β†’ Num Beams                   | 1                                     |
| β†’ Repetition Penalty          | 1.0                                   |
| β†’ Early Stopping              | False                                 |
| **ID β†’ Label Map**            | {0: `LABEL_0`, 1: `LABEL_1`}          |
| **Label β†’ ID Map**            | {'LABEL_0': 0, 'LABEL_1': 1}          |
| **Training Framework**        | Transformers v4.57.1                  |
| **Quant Library**             | bitsandbytes                          |
| **Local Path / Repo**         | `./msci_software_engineering_slm_v1`  |
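
These values can be checked directly against the checkpoint; a minimal sketch using the Hub repo id from the Example Usage section below:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("techpro-saida/msci_software_engineering_slm_v1")
print(config.model_type)                                       # mistral
print(config.hidden_size, config.intermediate_size)            # 4096 14336
print(config.num_attention_heads, config.num_key_value_heads)  # 32 8
print(config.rope_theta)                                       # 1000000.0
```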

## Quantization 
| **Parameter**               | **Value**      |
| --------------------------- | -------------- |
| `_load_in_4bit`             | True           |
| `_load_in_8bit`             | False          |
| `bnb_4bit_compute_dtype`    | `bfloat16`     |
| `bnb_4bit_quant_storage`    | `uint8`        |
| `bnb_4bit_quant_type`       | `nf4`          |
| `bnb_4bit_use_double_quant` | False          |
| `load_in_4bit`              | True           |
| `load_in_8bit`              | False          |
| `quant_method`              | `bitsandbytes` |
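
To load the model with exactly this quantization configuration, a sketch using the `bitsandbytes` integration in Transformers:

```python
import torch
from transformers import BitsAndBytesConfig

# Mirrors the quantization parameters listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_storage=torch.uint8,
)
```

Pass `quantization_config=bnb_config` to `AutoModelForCausalLM.from_pretrained`, as in the Example Usage section below.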



## Training Data

The model was fine-tuned on a custom dataset (`data.jsonl`) consisting of:
- Software engineering Q&A pairs  
- Code examples (Python, SQL, Docker, ML pipelines)
- Developer chat-style dialogues  
- AI agent reasoning snippets  
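
The dataset itself is not published with this card. A typical way to load such a `data.jsonl` file for fine-tuning is sketched below; the field names (`prompt`, `response`) are purely illustrative, since the actual record schema is not documented here.

```python
from datasets import load_dataset

# Each line of data.jsonl is one JSON record
dataset = load_dataset("json", data_files="data.jsonl", split="train")

# Hypothetical prompt/response layout -- real field names may differ
def format_example(example):
    return {
        "text": f"### Question:\n{example['prompt']}\n\n### Answer:\n{example['response']}"
    }

dataset = dataset.map(format_example)
```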

---

## Intended Uses

- Software development assistance  
- Generating code snippets or debugging suggestions  
- Explaining AI/ML or MLOps concepts  
- General programming conversations  

---

## Limitations

- May produce hallucinated code or incorrect syntax.
- Not tested on safety-critical or financial decision-making tasks.
- Limited coverage outside software/AI domain.

---


## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "techpro-saida/msci_software_engineering_slm_v1"

# 4-bit config for efficient GPU inference
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # automatically balances layers across GPU/CPU
)

prompt = "Explain SOLID principles in OOP."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs, max_new_tokens=100, do_sample=True, temperature=0.7, top_p=0.9
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If you are on low RAM or have no GPU, load the model without quantization on the CPU (slower, but requires no CUDA setup):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "techpro-saida/msci_software_engineering_slm_v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cpu")

prompt = "Explain SOLID principles in OOP."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True, temperature=0.7)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Developer

- **Developed by:** SAIDA D
- **Model type:** SLM
- **Language(s) (NLP):** English (en)
- **License:** apache-2.0
- **Finetuned from model:** `mistralai/Mistral-7B-v0.1`