---
license: mit
datasets:
- sahil2801/CodeAlpaca-20k
- TokenBender/code_instructions_122k_alpaca_style
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- code
- python
- sql
- data-science
---

# Code Specialist 7B  

<p align="left">
  <a href="https://huggingface.co/Ricardouchub/code-specialist-7b">
    <img src="https://img.shields.io/badge/HuggingFace-Code_Specialist_7B-FFD21E?style=flat-square&logo=huggingface&logoColor=black" alt="Hugging Face"/>
  </a>
  <a href="https://www.python.org/">
    <img src="https://img.shields.io/badge/Python-3.10+-3776AB?style=flat-square&logo=python&logoColor=white" alt="Python"/>
  </a>
  <a href="https://huggingface.co/docs/transformers">
    <img src="https://img.shields.io/badge/Transformers-4.56+-purple?style=flat-square&logo=huggingface&logoColor=white" alt="Transformers"/>
  </a>
  <a href="https://github.com/Ricardouchub">
    <img src="https://img.shields.io/badge/Author-Ricardo_Urdaneta-000000?style=flat-square&logo=github&logoColor=white" alt="Author"/>
  </a>
</p>

---

## Description  

**Code Specialist 7B** is a fine-tuned version of **Mistral-7B-Instruct-v0.3**, trained through **Supervised Fine-Tuning (SFT)** using datasets focused on **Python and SQL**.  
The goal of this training was to enhance the model’s performance in **data analysis, programming problem-solving, and technical reasoning**.

The model preserves the **7B-parameter decoder-only Transformer** architecture while adding code-oriented fine-tuning, improving robustness in function generation, SQL queries, and technical answers.

---

## Base Model  

- [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)  
- Architecture: Transformer (decoder-only)  
- Parameters: ~7B  

---

## Datasets Used for SFT  

- [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)  
- [Code Instructions 122k (Alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)  

Both datasets were **filtered to include only Python and SQL examples**, following **Alpaca/Mistral-style** instruction formatting.

Example prompt format:  

```
[INST] Write a Python function that adds two numbers. [/INST]
def add(a, b):
    return a + b
```
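The card does not state the exact filtering criteria used to keep only Python and SQL examples, so the sketch below uses a hypothetical keyword heuristic; the `looks_like_python_or_sql` helper and the keyword list are illustrative assumptions, while the `[INST] … [/INST]` wrapping follows the format shown above:

```python
# Hypothetical keyword heuristic -- the actual filtering criteria
# are not documented in this card.
PY_SQL_KEYWORDS = ("python", "sql", "def ", "select ", "pandas")

def looks_like_python_or_sql(example):
    """Keep an Alpaca-style example if it mentions Python/SQL markers."""
    text = (example["instruction"] + " " + example["output"]).lower()
    return any(kw in text for kw in PY_SQL_KEYWORDS)

def to_mistral_format(example):
    """Wrap an Alpaca-style pair in Mistral [INST] ... [/INST] tags."""
    return {"text": f"[INST] {example['instruction']} [/INST]\n{example['output']}"}

examples = [
    {"instruction": "Write a Python function that adds two numbers.",
     "output": "def add(a, b):\n    return a + b"},
    {"instruction": "Explain the French Revolution.",
     "output": "The French Revolution began in 1789..."},
]

# Only the code-related example survives filtering.
kept = [to_mistral_format(e) for e in examples if looks_like_python_or_sql(e)]
```

The same predicate can be passed to `datasets.Dataset.filter` when working with the Hugging Face datasets directly.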

---

## Training Details  

| **Aspect**        | **Detail** |
|--------------------|-------------|
| **Method**         | QLoRA with final weight merge |
| **Frameworks**     | `transformers`, `trl`, `peft`, `bitsandbytes` |
| **Hardware**       | GPU with 12 GB VRAM (4-bit quantization for training) |

### Main Hyperparameters

| **Parameter** | **Value** |
|----------------|-----------|
| `per_device_train_batch_size` | 2 |
| `gradient_accumulation_steps` | 4 |
| `learning_rate` | 2e-4 |
| `num_train_epochs` | 1 |
| `max_seq_length` | 1024 |
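The hyperparameters above map onto a QLoRA configuration roughly as follows. The batch size, accumulation steps, learning rate, and epoch count come from the table; the LoRA rank, alpha, and target modules are **not** stated in this card, so those values are common Mistral-7B defaults, not the actual training configuration:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit quantization, as used for training on a 12 GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings: r, lora_alpha, and target_modules are NOT documented
# in this card; these are illustrative defaults for Mistral-7B runs.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Values taken from the hyperparameter table above.
training_args = TrainingArguments(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    output_dir="code-specialist-7b-sft",
)
# max_seq_length=1024 is passed to the SFT trainer itself; the exact
# parameter name varies across `trl` versions.
```

After training, the adapter weights would be merged into the base model (e.g. via `peft`'s `merge_and_unload`), matching the "QLoRA with final weight merge" method listed above.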

---

## Usage  

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ricardouchub/Code-Specialist-7B"
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Mistral-style instruction prompt, matching the training format.
prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(mdl.device)

out = mdl.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```
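Since training itself ran in 4-bit on a 12 GB GPU, inference can likewise be run in 4-bit on similar hardware. This variant is not covered by the card; the `bitsandbytes` settings below are an illustrative sketch:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Ricardouchub/Code-Specialist-7B"

# Illustrative 4-bit settings for low-VRAM inference (not from the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Generation then proceeds exactly as in the full-precision example above, at some cost in output quality typical of 4-bit quantization.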

---

## Initial Benchmarks  

- **Simple evaluation (Python tasks):** Improved results on small programming and data-related tasks, including **data analysis, SQL query generation, and Python snippets**, compared to the base model.  
- Further evaluation on **HumanEval** or **MBPP** is recommended for reproducible metrics.

---

## Author  

**Ricardo Urdaneta**  
- [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)  
- [GitHub](https://github.com/Ricardouchub)  

---

## Limitations  

- The model does **not guarantee 100% accuracy** on complex programming tasks.  
- It may produce inconsistent results for ambiguous or incomplete prompts.  

---

## License  

This model is released under the **MIT License**. The base model, **Mistral-7B-Instruct-v0.3**, is distributed under its own license; consult its model card for details.