---
license: mit
datasets:
- sahil2801/CodeAlpaca-20k
- TokenBender/code_instructions_122k_alpaca_style
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- code
- python
- sql
- data-science
---
# Code Specialist 7B
<p align="left">
<a href="https://huggingface.co/Ricardouchub/code-specialist-7b">
<img src="https://img.shields.io/badge/HuggingFace-Code_Specialist_7B-FFD21E?style=flat-square&logo=huggingface&logoColor=black" alt="Hugging Face"/>
</a>
<a href="https://www.python.org/">
<img src="https://img.shields.io/badge/Python-3.10+-3776AB?style=flat-square&logo=python&logoColor=white" alt="Python"/>
</a>
<a href="https://huggingface.co/docs/transformers">
<img src="https://img.shields.io/badge/Transformers-4.56+-purple?style=flat-square&logo=huggingface&logoColor=white" alt="Transformers"/>
</a>
<a href="https://github.com/Ricardouchub">
<img src="https://img.shields.io/badge/Author-Ricardo_Urdaneta-000000?style=flat-square&logo=github&logoColor=white" alt="Author"/>
</a>
</p>
---
## Description
**Code Specialist 7B** is a fine-tuned version of **Mistral-7B-Instruct-v0.3**, trained with **Supervised Fine-Tuning (SFT)** on datasets focused on **Python and SQL**.
The goal of the fine-tuning is to improve the model's performance in **data analysis, programming problem-solving, and technical reasoning**.
The model keeps the **7B-parameter decoder-only Transformer** architecture of the base model; the code-oriented fine-tuning targets more reliable function generation, SQL queries, and technical answers.
---
## Base Model
- [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
- Architecture: Transformer (decoder-only)
- Parameters: ~7B
---
## Datasets Used for SFT
- [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
- [Code Instructions 122k (Alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)
Both datasets were **filtered to keep only Python and SQL examples**, and the Alpaca-style records were formatted with Mistral's `[INST] ... [/INST]` instruction tags.
Example prompt format:
```
[INST] Write a Python function that adds two numbers. [/INST]
def add(a, b):
    return a + b
```
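
The filtering and formatting steps described above could be sketched roughly as follows. This is a minimal illustration, not the author's preprocessing script: the field names `instruction`, `input`, and `output` follow the usual Alpaca convention, while the helper names and the keyword heuristic in `is_python_or_sql` are hypothetical, since the card does not state the actual filtering criteria.

```python
import re

# Hypothetical keyword heuristic: the card does not say how Python/SQL
# examples were actually selected.
PYTHON_PATTERN = re.compile(r"\bpython\b|\bdef \w+\(|\bimport \w+", re.IGNORECASE)
SQL_PATTERN = re.compile(r"\b(select|insert into|create table|sql)\b", re.IGNORECASE)


def is_python_or_sql(record: dict) -> bool:
    """Return True if any field of the record looks like Python or SQL."""
    text = " ".join(record.get(key, "") for key in ("instruction", "input", "output"))
    return bool(PYTHON_PATTERN.search(text) or SQL_PATTERN.search(text))


def to_mistral_format(record: dict) -> str:
    """Render an Alpaca-style record as '[INST] instruction [/INST]' plus the answer."""
    instruction = record["instruction"].strip()
    extra = record.get("input", "").strip()
    if extra:
        # Append the optional Alpaca "input" field to the instruction.
        instruction = f"{instruction}\n\n{extra}"
    return f"[INST] {instruction} [/INST]\n{record['output'].strip()}"


record = {
    "instruction": "Write a Python function that adds two numbers.",
    "input": "",
    "output": "def add(a, b):\n    return a + b",
}
if is_python_or_sql(record):
    print(to_mistral_format(record))
```

A keyword filter like this is crude and will misclassify some examples; it only illustrates where such a step sits in the pipeline.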
---
## Training Details
| **Aspect** | **Detail** |
|--------------------|-------------|
| **Method** | QLoRA with final weight merge |
| **Frameworks** | `transformers`, `trl`, `peft`, `bitsandbytes` |
| **Hardware** | GPU with 12 GB VRAM (4-bit quantization for training) |
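
As a rough illustration of why 4-bit quantization matters on a 12 GB GPU, the memory needed for the weights alone can be estimated as below. This is a back-of-the-envelope sketch: it assumes exactly 7e9 parameters (the card only says "~7B") and ignores quantization constants, LoRA adapter weights, optimizer state, and activations, all of which add to the real footprint.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 7e9  # assumption: "~7B" rounded to exactly 7 billion

for bits in (16, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
# 16-bit weights: 13.04 GiB
# 4-bit weights: 3.26 GiB
```

Under these assumptions, 16-bit weights alone would not fit in 12 GB, while 4-bit weights leave room for the adapter and activations.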
### Main Hyperparameters
| **Parameter** | **Value** |
|----------------|-----------|
| `per_device_train_batch_size` | 2 |
| `gradient_accumulation_steps` | 4 |
| `learning_rate` | 2e-4 |
| `num_train_epochs` | 1 |
| `max_seq_length` | 1024 |
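
The batch-related values in the table combine as shown in this short sketch. It assumes a single GPU, consistent with the hardware listed above; the example count passed to `optimizer_steps_per_epoch` is purely hypothetical, since the size of the filtered dataset is not given, and the exact step count can differ slightly depending on how the trainer handles the last partial batch.

```python
import math

# Values from the hyperparameter table
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
max_seq_length = 1024

# Sequences contributing to each optimizer update (single GPU assumed)
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

# Upper bound on tokens per optimizer update (sequences are truncated to max_seq_length)
max_tokens_per_step = effective_batch_size * max_seq_length


def optimizer_steps_per_epoch(num_examples: int) -> int:
    """Approximate number of optimizer updates in one epoch."""
    return math.ceil(num_examples / effective_batch_size)


print(effective_batch_size)               # 8
print(max_tokens_per_step)                # 8192
print(optimizer_steps_per_epoch(10_000))  # 1250 (10_000 is a hypothetical dataset size)
```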
---
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ricardouchub/Code-Specialist-7B"
tok = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the `accelerate` package
mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Wrap the instruction in the [INST] ... [/INST] tags used during fine-tuning
prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(mdl.device)
out = mdl.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```
---
## Initial Benchmarks
- **Simple evaluation:** In an informal comparison on small programming and data-related tasks (**data analysis, SQL query generation, and Python snippets**), the model produced better results than the base model. No quantitative scores are reported yet.
- Evaluation on **HumanEval** or **MBPP** is recommended to obtain reproducible metrics.
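
Benchmarks such as HumanEval and MBPP score a model by running its generated code against unit tests. The sketch below illustrates that idea on a toy scale. It is not the official harness for either benchmark: the function names and the two tasks are hypothetical, and it calls `exec` directly, which is unsafe for untrusted model output (real harnesses run completions in a sandbox with timeouts).

```python
def passes_tests(candidate_code: str, test_code: str) -> bool:
    """Run generated code, then its assertions; any exception counts as a failure."""
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)  # define the generated function(s)
        exec(test_code, namespace)       # run the assertions against them
    except Exception:
        return False
    return True


def pass_rate(results: list[bool]) -> float:
    """Fraction of tasks whose single completion passed (a pass@1-style score)."""
    return sum(results) / len(results) if results else 0.0


# Hypothetical toy tasks, not taken from HumanEval or MBPP.
tasks = [
    {
        "completion": "def average(xs):\n    return sum(xs) / len(xs)",
        "tests": "assert average([1, 2, 3]) == 2",
    },
    {
        "completion": "def average(xs):\n    return sum(xs)",  # buggy on purpose
        "tests": "assert average([1, 2, 3]) == 2",
    },
]

results = [passes_tests(t["completion"], t["tests"]) for t in tasks]
print(pass_rate(results))  # 0.5
```

In practice, each `completion` would come from the model's output for the task prompt, and the same tasks would be run against the base model for comparison.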
---
## Author
**Ricardo Urdaneta**
- [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)
- [GitHub](https://github.com/Ricardouchub)
---
## Limitations
- The model **can produce incorrect code**, particularly on complex programming tasks.
- It may produce inconsistent results for ambiguous or incomplete prompts.
---
## License
This model is released under the **MIT License**. The base model, **Mistral-7B-Instruct-v0.3**, is distributed by Mistral AI under the Apache 2.0 license.