---
datasets:
- custom_jsonl_dataset
language:
- en
library_name: transformers
license: apache-2.0
model_name: MSC Software Engineering SLM v1
tags:
- software-engineering
- QLoRA
- Mistral
- SLM
base_model:
- mistralai/Mistral-7B-v0.1
---
# Model Card: MSC Software Engineering SLM v1
This model is a **QLoRA fine-tuned variant of Mistral-7B**, optimized for **software engineering, code generation, and technical Q&A** tasks.
It was trained on a curated dataset of software design patterns, debugging tips, Python code snippets, and AI engineering discussions to improve reasoning and contextual understanding for software-related queries.
## Model Details
- **Base Model:** `mistralai/Mistral-7B-v0.1`
- **Fine-tuning Type:** QLoRA (4-bit quantization)
- **Framework:** Hugging Face Transformers + PEFT + bitsandbytes
- **Tokenizer:** Same as base model (`AutoTokenizer.from_pretrained(base_model, use_fast=True)`)
- **Padding Token:** `tokenizer.pad_token = tokenizer.eos_token`
- **Training Objective:** Causal language modeling
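The card does not list the LoRA hyperparameters used during fine-tuning, so the following is only a minimal sketch of a typical QLoRA setup for this base model; the rank, alpha, dropout, and target modules are assumptions, not the values actually used.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "mistralai/Mistral-7B-v0.1"

# 4-bit NF4 quantization with bfloat16 compute, matching the card's config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Hypothetical LoRA settings -- the actual values are not documented here
lora_config = LoraConfig(
    r=16,                 # assumed rank
    lora_alpha=32,        # assumed scaling factor
    lora_dropout=0.05,    # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```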
---
## Model Configuration
| **Parameter** | **Value** |
| ----------------------------- | ------------------------------------- |
| **Model Type** | `mistral` |
| **Architecture** | `MistralForCausalLM` |
| **Vocab Size**                | 32,000                                |
| **Max Position Embeddings** | 32,768 |
| **Hidden Size** | 4,096 |
| **Intermediate Size** | 14,336 |
| **Number of Hidden Layers** | 32 |
| **Number of Attention Heads** | 32 |
| **Number of Key-Value Heads** | 8 |
| **Hidden Activation** | `silu` |
| **Initializer Range** | 0.02 |
| **RMS Norm Epsilon** | 1e-5 |
| **Dropout (Attention)** | 0.0 |
| **Use Cache** | True |
| **ROPE Theta** | 1,000,000.0 |
| **Quantization Method** | `bitsandbytes` |
| **Quantization Config** | 4-bit (nf4), `bfloat16` compute dtype |
| **Compute Dtype**             | `bfloat16`                            |
| **Load In 4bit** | βœ… Yes |
| **Load In 8bit** | ❌ No |
| **Tie Word Embeddings** | False |
| **Is Encoder-Decoder** | False |
| **BOS Token ID** | 1 |
| **EOS Token ID** | 2 |
| **Pad Token ID** | None |
| **Generation Settings** | |
| β†’ Max Length | 20 |
| β†’ Min Length | 0 |
| β†’ Temperature | 1.0 |
| β†’ Top-k | 50 |
| β†’ Top-p | 1.0 |
| β†’ Num Beams | 1 |
| β†’ Repetition Penalty | 1.0 |
| β†’ Early Stopping | False |
| **ID β†’ Label Map** | {0: `LABEL_0`, 1: `LABEL_1`} |
| **Label β†’ ID Map** | {'LABEL_0': 0, 'LABEL_1': 1} |
| **Training Framework** | Transformers v4.57.1 |
| **Quant Library** | bitsandbytes |
| **Local Path / Repo** | `./msci_software_engineering_slm_v1` |
## Quantization
| **Parameter** | **Value** |
| --------------------------- | -------------- |
| `_load_in_4bit` | True |
| `_load_in_8bit` | False |
| `bnb_4bit_compute_dtype` | `bfloat16` |
| `bnb_4bit_quant_storage` | `uint8` |
| `bnb_4bit_quant_type` | `nf4` |
| `bnb_4bit_use_double_quant` | False |
| `load_in_4bit` | True |
| `load_in_8bit` | False |
| `quant_method` | `bitsandbytes` |
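The rows above map one-to-one onto a Transformers `BitsAndBytesConfig`. A sketch of reconstructing the same configuration in code (note that double quantization is disabled, per the table):

```python
import torch
from transformers import BitsAndBytesConfig

# Mirrors the quantization table: 4-bit NF4 weights, bfloat16 compute,
# double quantization disabled.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)
```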
## Training Data
The model was fine-tuned on a custom dataset (`data.jsonl`) consisting of:
- Software engineering Q&A pairs
- Code examples (Python, SQL, Docker, ML pipelines)
- Developer chat-style dialogues
- AI agent reasoning snippets
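The card does not document the schema of `data.jsonl`, so the record below is a hypothetical illustration; the `prompt`/`response` field names are assumptions. A minimal stdlib sketch of parsing such a record into a training-style text:

```python
import json

# Hypothetical record layout -- the actual field names in data.jsonl
# are not documented in this card.
sample = '{"prompt": "What is dependency injection?", "response": "A pattern where dependencies are supplied externally."}'

record = json.loads(sample)
text = f"### Question:\n{record['prompt']}\n\n### Answer:\n{record['response']}"
print(text.splitlines()[0])  # -> ### Question:
```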
---
## Intended Uses
- Software development assistance
- Generating code snippets or debugging suggestions
- Explaining AI/ML or MLOps concepts
- General programming conversations
---
## Limitations
- May produce hallucinated code or incorrect syntax.
- Not tested on safety-critical or financial decision-making tasks.
- Limited coverage outside software/AI domain.
---
## Example Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
model_id = "techpro-saida/msci_software_engineering_slm_v1"
# 4-bit config for efficient inference
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_compute_dtype=torch.bfloat16,
bnb_4bit_quant_type="nf4",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
quantization_config=bnb_config,
device_map="auto", # automatically balances between GPU/CPU
)
prompt = "Explain the SOLID principles in OOP."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If you are running on low RAM or a CPU-only machine, you can load the model without quantization:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "techpro-saida/msci_software_engineering_slm_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cpu")
prompt = "Explain the SOLID principles in OOP."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Developer
- **Developed by:** SAIDA D
- **Model type:** SLM
- **Language(s) (NLP):** English (en)
- **License:** Apache-2.0
- **Finetuned from model:** `mistralai/Mistral-7B-v0.1`