---
license: mit
datasets:
- sahil2801/CodeAlpaca-20k
- TokenBender/code_instructions_122k_alpaca_style
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- code
- python
- sql
- data-science
---
# Code Specialist 7B
<p align="left">
<a href="https://huggingface.co/Ricardouchub/code-specialist-7b">
<img src="https://img.shields.io/badge/HuggingFace-Code_Specialist_7B-FFD21E?style=flat-square&logo=huggingface&logoColor=black" alt="Hugging Face"/>
</a>
<a href="https://www.python.org/">
<img src="https://img.shields.io/badge/Python-3.10+-3776AB?style=flat-square&logo=python&logoColor=white" alt="Python"/>
</a>
<a href="https://huggingface.co/docs/transformers">
<img src="https://img.shields.io/badge/Transformers-4.56+-purple?style=flat-square&logo=huggingface&logoColor=white" alt="Transformers"/>
</a>
<a href="https://github.com/Ricardouchub">
<img src="https://img.shields.io/badge/Author-Ricardo_Urdaneta-000000?style=flat-square&logo=github&logoColor=white" alt="Author"/>
</a>
</p>
---
## Description
**Code Specialist 7B** is a fine-tuned version of **Mistral-7B-Instruct-v0.3**, trained through **Supervised Fine-Tuning (SFT)** using datasets focused on **Python and SQL**.
The goal of this training was to enhance the model’s performance in **data analysis, programming problem-solving, and technical reasoning**.
The model preserves the **7B-parameter, decoder-only Transformer** architecture while adding code-oriented fine-tuning, resulting in improved robustness for function generation, SQL queries, and technical answers.
---
## Base Model
- [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
- Architecture: Transformer (decoder-only)
- Parameters: ~7B
---
## Datasets Used for SFT
- [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
- [Code Instructions 122k (Alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)
Both datasets were **filtered to include only Python and SQL examples**, following **Alpaca/Mistral-style** instruction formatting.
Example prompt format:
```
[INST] Write a Python function that adds two numbers. [/INST]
def add(a, b):
    return a + b
```
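The instruction wrapper shown above can be produced with a small helper (an illustrative sketch; `build_prompt` is not part of this repository):

```python
def build_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the Mistral [INST] ... [/INST] format."""
    return f"[INST] {instruction.strip()} [/INST]"

print(build_prompt("Write a Python function that adds two numbers."))
# [INST] Write a Python function that adds two numbers. [/INST]
```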
---
## Training Details
| **Aspect** | **Detail** |
|--------------------|-------------|
| **Method** | QLoRA with final weight merge |
| **Frameworks** | `transformers`, `trl`, `peft`, `bitsandbytes` |
| **Hardware** | GPU with 12 GB VRAM (4-bit quantization for training) |
### Main Hyperparameters
| **Parameter** | **Value** |
|----------------|-----------|
| `per_device_train_batch_size` | 2 |
| `gradient_accumulation_steps` | 4 |
| `learning_rate` | 2e-4 |
| `num_train_epochs` | 1 |
| `max_seq_length` | 1024 |
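The settings above imply an effective batch size of 2 × 4 = 8 on a single GPU. As a plain-Python sketch (the key names mirror the `trl`/`transformers` arguments; the actual training script is not published here):

```python
# Hyperparameters from the table above, collected as a plain dict
# (illustrative only; not the original training script).
sft_config = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "num_train_epochs": 1,
    "max_seq_length": 1024,
}

# Effective batch size = per-device batch x accumulation steps (single GPU)
effective_batch = (sft_config["per_device_train_batch_size"]
                   * sft_config["gradient_accumulation_steps"])
print(effective_batch)  # 8
```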
---
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ricardouchub/Code-Specialist-7B"
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Mistral-style instruction prompt
prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(mdl.device)
out = mdl.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```
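The decoded string above still contains the prompt itself. To keep only the model's completion, one option is to split on the closing instruction tag (a simple post-processing sketch, not an official API):

```python
def extract_answer(decoded: str) -> str:
    """Return only the completion that follows the final [/INST] tag."""
    return decoded.split("[/INST]")[-1].strip()

sample = ("[INST] Write a Python function that adds two numbers. [/INST] "
          "def add(a, b):\n    return a + b")
print(extract_answer(sample))
```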
---
## Initial Benchmarks
- **Simple evaluation (Python tasks):** Improved results on small programming and data-related tasks, including **data analysis, SQL query generation, and Python snippets**, compared to the base model.
- Further evaluation on **HumanEval** or **MBPP** is recommended for reproducible metrics.
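Before running full HumanEval or MBPP, a quick functional sanity check can be done by executing a generated snippet against a few input/output pairs (a toy harness for illustration, not the official benchmark tooling; `passes_tests` is a hypothetical helper):

```python
def passes_tests(code: str, test_cases) -> bool:
    """Execute a generated snippet and check simple input/output pairs.

    `test_cases` is a list of (function_name, args, expected) tuples.
    WARNING: exec() runs arbitrary code; only use inside a sandbox.
    """
    namespace = {}
    try:
        exec(code, namespace)
        for fn_name, args, expected in test_cases:
            if namespace[fn_name](*args) != expected:
                return False
        return True
    except Exception:
        return False

snippet = "def add(a, b):\n    return a + b"
print(passes_tests(snippet, [("add", (2, 3), 5)]))  # True
```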
---
## Author
**Ricardo Urdaneta**
- [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)
- [GitHub](https://github.com/Ricardouchub)
---
## Limitations
- The model does **not guarantee 100% accuracy** on complex programming tasks.
- It may produce inconsistent results for ambiguous or incomplete prompts.
---
## License
This model is released under the **MIT License**. The base model, **Mistral-7B-Instruct-v0.3**, is distributed under the **Apache 2.0 License**.