---
license: mit
datasets:
- sahil2801/CodeAlpaca-20k
- TokenBender/code_instructions_122k_alpaca_style
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- code
- python
- sql
- data-science
---

# Code Specialist 7B  

<p align="left">
  <a href="https://huggingface.co/Ricardouchub/code-specialist-7b">
    <img src="https://img.shields.io/badge/HuggingFace-Code_Specialist_7B-FFD21E?style=flat-square&logo=huggingface&logoColor=black" alt="Hugging Face"/>
  </a>
  <a href="https://www.python.org/">
    <img src="https://img.shields.io/badge/Python-3.10+-3776AB?style=flat-square&logo=python&logoColor=white" alt="Python"/>
  </a>
  <a href="https://huggingface.co/docs/transformers">
    <img src="https://img.shields.io/badge/Transformers-4.56+-purple?style=flat-square&logo=huggingface&logoColor=white" alt="Transformers"/>
  </a>
  <a href="https://github.com/Ricardouchub">
    <img src="https://img.shields.io/badge/Author-Ricardo_Urdaneta-000000?style=flat-square&logo=github&logoColor=white" alt="Author"/>
  </a>
</p>

---

## Description  

**Code Specialist 7B** is a fine-tuned version of **Mistral-7B-Instruct-v0.3**, trained through **Supervised Fine-Tuning (SFT)** using datasets focused on **Python and SQL**.  
The goal of this training was to enhance the model’s performance in **data analysis, programming problem-solving, and technical reasoning**.

The model preserves the **7B-parameter decoder-only Transformer** architecture while adding code-oriented fine-tuning, improving robustness in function generation, SQL queries, and technical answers.

---

## Base Model  

- [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)  
- Architecture: Transformer (decoder-only)  
- Parameters: ~7B  

---

## Datasets Used for SFT  

- [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)  
- [Code Instructions 122k (Alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)  

Both datasets were **filtered to include only Python and SQL examples**, following **Alpaca/Mistral-style** instruction formatting.

Example prompt format:  

```
[INST] Write a Python function that adds two numbers. [/INST]
def add(a, b):
    return a + b
```
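The card does not state the exact filtering criteria used to keep only Python and SQL examples, so the sketch below uses a hypothetical keyword heuristic; the `looks_like_python_or_sql` helper and the keyword list are illustrative assumptions, while the `[INST] … [/INST]` wrapping follows the format shown above:

```python
# Hypothetical keyword heuristic -- the actual filtering criteria
# are not documented in this card.
PY_SQL_KEYWORDS = ("python", "sql", "def ", "select ", "pandas")

def looks_like_python_or_sql(example):
    """Keep an Alpaca-style example if it mentions Python/SQL markers."""
    text = (example["instruction"] + " " + example["output"]).lower()
    return any(kw in text for kw in PY_SQL_KEYWORDS)

def to_mistral_format(example):
    """Wrap an Alpaca-style pair in Mistral [INST] ... [/INST] tags."""
    return {"text": f"[INST] {example['instruction']} [/INST]\n{example['output']}"}

examples = [
    {"instruction": "Write a Python function that adds two numbers.",
     "output": "def add(a, b):\n    return a + b"},
    {"instruction": "Explain the French Revolution.",
     "output": "The French Revolution began in 1789..."},
]

# Only the code-related example survives filtering.
kept = [to_mistral_format(e) for e in examples if looks_like_python_or_sql(e)]
```

The same predicate can be passed to `datasets.Dataset.filter` when working with the Hugging Face datasets directly.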

---

## Training Details  

| **Aspect**        | **Detail** |
|--------------------|-------------|
| **Method**         | QLoRA with final weight merge |
| **Frameworks**     | `transformers`, `trl`, `peft`, `bitsandbytes` |
| **Hardware**       | GPU with 12 GB VRAM (4-bit quantization for training) |

### Main Hyperparameters

| **Parameter** | **Value** |
|----------------|-----------|
| `per_device_train_batch_size` | 2 |
| `gradient_accumulation_steps` | 4 |
| `learning_rate` | 2e-4 |
| `num_train_epochs` | 1 |
| `max_seq_length` | 1024 |
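The hyperparameters above map onto a QLoRA configuration roughly as follows. The batch size, accumulation steps, learning rate, and epoch count come from the table; the LoRA rank, alpha, and target modules are **not** stated in this card, so those values are common Mistral-7B defaults, not the actual training configuration:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit quantization, as used for training on a 12 GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings: r, lora_alpha, and target_modules are NOT documented
# in this card; these are illustrative defaults for Mistral-7B runs.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Values taken from the hyperparameter table above.
training_args = TrainingArguments(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    output_dir="code-specialist-7b-sft",
)
# max_seq_length=1024 is passed to the SFT trainer itself; the exact
# parameter name varies across `trl` versions.
```

After training, the adapter weights would be merged into the base model (e.g. via `peft`'s `merge_and_unload`), matching the "QLoRA with final weight merge" method listed above.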

---

## Usage  

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ricardouchub/Code-Specialist-7B"
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Mistral-style instruction prompt, matching the training format.
prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(mdl.device)

out = mdl.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```
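Since training itself ran in 4-bit on a 12 GB GPU, inference can likewise be run in 4-bit on similar hardware. This variant is not covered by the card; the `bitsandbytes` settings below are an illustrative sketch:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Ricardouchub/Code-Specialist-7B"

# Illustrative 4-bit settings for low-VRAM inference (not from the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Generation then proceeds exactly as in the full-precision example above, at some cost in output quality typical of 4-bit quantization.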

---

## Initial Benchmarks  

- **Simple evaluation (Python tasks):** Improved results on small programming and data-related tasks, including **data analysis, SQL query generation, and Python snippets**, compared to the base model.  
- Further evaluation on **HumanEval** or **MBPP** is recommended for reproducible metrics.

---

## Author  

**Ricardo Urdaneta**  
- [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)  
- [GitHub](https://github.com/Ricardouchub)  

---

## Limitations  

- The model does **not guarantee 100% accuracy** on complex programming tasks.  
- It may produce inconsistent results for ambiguous or incomplete prompts.  

---

## License  

This model is released under the **MIT License**. The base model, **Mistral-7B-Instruct-v0.3**, is distributed under its own license; consult its model card for details.