|
|
--- |
|
|
datasets: |
|
|
- custom_jsonl_dataset |
|
|
language: |
|
|
- en |
|
|
library_name: transformers |
|
|
license: apache-2.0 |
|
|
model_name: MSC Software Engineering SLM v1 |
|
|
tags: |
|
|
- software-engineering |
|
|
- QLoRA |
|
|
- Mistral |
|
|
- SLM |
|
|
base_model: |
|
|
- mistralai/Mistral-7B-v0.1 |
|
|
--- |
|
|
|
|
|
# Model Card: MSC Software Engineering SLM v1
|
|
This model is a **QLoRA fine-tuned variant of Mistral-7B**, optimized for **software engineering, code generation, and technical Q&A** tasks. |
|
|
It was trained on a curated dataset of software design patterns, debugging tips, Python code snippets, and AI engineering discussions to improve reasoning and contextual understanding for software-related queries. |
|
|
|
|
|
|
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Base Model:** `mistralai/Mistral-7B-v0.1` |
|
|
- **Fine-tuning Type:** QLoRA (4-bit quantization) |
|
|
- **Framework:** Hugging Face Transformers + PEFT + bitsandbytes |
|
|
- **Tokenizer:** Same as base model (`AutoTokenizer.from_pretrained(base_model, use_fast=True)`) |
|
|
- **Padding Token:** `tokenizer.pad_token = tokenizer.eos_token` |
|
|
- **Training Objective:** Causal language modeling |
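
The fine-tuning setup described above can be sketched with PEFT as follows. Note that the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`, dropout) are not stated in this card, so the values below are illustrative placeholders, not the actual training configuration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "mistralai/Mistral-7B-v0.1"

# Tokenizer setup as described above: fast tokenizer, EOS reused as padding token
tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token

# 4-bit NF4 quantization with bfloat16 compute, matching the quantization table
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA hyperparameters below are illustrative defaults,
# not the values used to train this model
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```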
|
|
|
|
|
--- |
|
|
## Model Configuration |
|
|
|
|
|
| **Parameter**                 | **Value**                             |
| ----------------------------- | ------------------------------------- |
| **Model Type**                | `mistral`                             |
| **Architecture**              | `MistralForCausalLM`                  |
| **Vocab Size**                | 32,768                                |
| **Max Position Embeddings**   | 32,768                                |
| **Hidden Size**               | 4,096                                 |
| **Intermediate Size**         | 14,336                                |
| **Number of Hidden Layers**   | 32                                    |
| **Number of Attention Heads** | 32                                    |
| **Number of Key-Value Heads** | 8                                     |
| **Hidden Activation**         | `silu`                                |
| **Initializer Range**         | 0.02                                  |
| **RMS Norm Epsilon**          | 1e-5                                  |
| **Dropout (Attention)**       | 0.0                                   |
| **Use Cache**                 | True                                  |
| **RoPE Theta**                | 1,000,000.0                           |
| **Quantization Method**       | `bitsandbytes`                        |
| **Quantization Config**       | 4-bit (nf4), `bfloat16` compute dtype |
| **Compute Dtype**             | `bfloat16`                            |
| **Load In 4bit**              | Yes                                   |
| **Load In 8bit**              | No                                    |
| **Tie Word Embeddings**       | False                                 |
| **Is Encoder-Decoder**        | False                                 |
| **BOS Token ID**              | 1                                     |
| **EOS Token ID**              | 2                                     |
| **Pad Token ID**              | None                                  |
| **Generation Settings**       |                                       |
| – Max Length                  | 20                                    |
| – Min Length                  | 0                                     |
| – Temperature                 | 1.0                                   |
| – Top-k                       | 50                                    |
| – Top-p                       | 1.0                                   |
| – Num Beams                   | 1                                     |
| – Repetition Penalty          | 1.0                                   |
| – Early Stopping              | False                                 |
| **ID → Label Map**            | {0: `LABEL_0`, 1: `LABEL_1`}          |
| **Label → ID Map**            | {'LABEL_0': 0, 'LABEL_1': 1}          |
| **Training Framework**        | Transformers v4.57.1                  |
| **Quant Library**             | bitsandbytes                          |
| **Local Path / Repo**         | `./msci_software_engineering_slm_v1`  |
|
|
|
|
|
## Quantization |
|
|
| **Parameter**               | **Value**      |
| --------------------------- | -------------- |
| `_load_in_4bit`             | True           |
| `_load_in_8bit`             | False          |
| `bnb_4bit_compute_dtype`    | `bfloat16`     |
| `bnb_4bit_quant_storage`    | `uint8`        |
| `bnb_4bit_quant_type`       | `nf4`          |
| `bnb_4bit_use_double_quant` | False          |
| `load_in_4bit`              | True           |
| `load_in_8bit`              | False          |
| `quant_method`              | `bitsandbytes` |
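
The table above maps onto a `BitsAndBytesConfig` roughly like the following sketch; the internal `_load_in_*` fields are derived automatically and are not passed explicitly:

```python
import torch
from transformers import BitsAndBytesConfig

# Mirrors the quantization table: 4-bit NF4 weights, bfloat16 compute dtype,
# uint8 quantized storage, double quantization disabled
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.uint8,
    bnb_4bit_use_double_quant=False,
)
```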
|
|
|
|
|
|
|
|
|
|
|
## Training Data |
|
|
|
|
|
The model was fine-tuned on a custom dataset (`data.jsonl`) consisting of: |
|
|
- Software engineering Q&A pairs |
|
|
- Code examples (Python, SQL, Docker, ML pipelines) |
|
|
- Developer chat-style dialogues |
|
|
- AI agent reasoning snippets |
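
The actual schema of `data.jsonl` is not published in this card, but a JSON Lines record for the categories above might look like this hypothetical instruction/response pair (field names are illustrative only):

```python
import json

# Hypothetical record; the real data.jsonl field names are not documented here
sample = {
    "instruction": "What does the Singleton pattern guarantee?",
    "response": "A class has exactly one instance with a global access point.",
}

# Round-trip the record the way a JSON Lines loader would:
# one json.dumps per line on write, one json.loads per line on read
line = json.dumps(sample)
record = json.loads(line)
print(record["instruction"])
```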
|
|
|
|
|
--- |
|
|
|
|
|
## Intended Uses |
|
|
|
|
|
- Software development assistance |
|
|
- Generating code snippets or debugging suggestions |
|
|
- Explaining AI/ML or MLOps concepts |
|
|
- General programming conversations |
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- May produce hallucinated code or incorrect syntax. |
|
|
- Not tested on safety-critical or financial decision-making tasks. |
|
|
- Limited coverage outside software/AI domain. |
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
## Example Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig |
|
|
import torch |
|
|
|
|
|
model_id = "techpro-saida/msci_software_engineering_slm_v1" |
|
|
|
|
|
# 4-bit config for efficient inference |
|
|
bnb_config = BitsAndBytesConfig( |
|
|
load_in_4bit=True, |
|
|
bnb_4bit_compute_dtype=torch.bfloat16, |
|
|
bnb_4bit_quant_type="nf4", |
|
|
) |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
|
|
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
model_id, |
|
|
quantization_config=bnb_config, |
|
|
device_map="auto", # automatically balances between GPU/CPU |
|
|
) |
|
|
|
|
|
prompt = "Explain the SOLID principles in OOP."
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
|
|
|
|
|
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7, top_p=0.9)
|
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
|
|
|
|
|
|
|
|
```

### Low-RAM / CPU Usage

If you are running on limited RAM or CPU only:

```python
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
|
|
model_id = "techpro-saida/msci_software_engineering_slm_v1" |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
|
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cpu") |
|
|
|
|
|
prompt = "Explain the SOLID principles in OOP."
|
|
inputs = tokenizer(prompt, return_tensors="pt") |
|
|
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True, temperature=0.7)
|
|
|
|
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
|
|
|
|
|
|
|
|
``` |
|
|
|
|
|
## Developer |
|
|
|
|
|
- **Developed by:** SAIDA D |
|
|
- **Model type:** SLM |
|
|
- **Language(s) (NLP):** English (en)
|
|
- **License:** apache-2.0 |
|
|
- **Finetuned from model:** `mistralai/Mistral-7B-v0.1`
|
|
|