---
license: mit
datasets:
  - sahil2801/CodeAlpaca-20k
  - TokenBender/code_instructions_122k_alpaca_style
base_model:
  - mistralai/Mistral-7B-Instruct-v0.3
tags:
  - code
  - python
  - sql
  - data-science
---

# Code Specialist 7B
---

## Description

**Code Specialist 7B** is a fine-tuned version of **Mistral-7B-Instruct-v0.3**, trained with **Supervised Fine-Tuning (SFT)** on datasets focused on **Python and SQL**. The goal of this training was to improve the model's performance in **data analysis, programming problem-solving, and technical reasoning**.

The model keeps the **7B-parameter, decoder-only Transformer** architecture and adds code-oriented fine-tuning, resulting in more robust function generation, SQL queries, and technical answers.

---

## Base Model

- [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
- Architecture: Transformer (decoder-only)
- Parameters: ~7B

---

## Datasets Used for SFT

- [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
- [Code Instructions 122k (Alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)

Both datasets were **filtered to include only Python and SQL examples** and formatted as **Alpaca/Mistral-style** instructions.

Example prompt format:

```
[INST] Write a Python function that adds two numbers. [/INST]
def add(a, b):
    return a + b
```

---

## Training Details

| **Aspect**         | **Detail**                                               |
|--------------------|----------------------------------------------------------|
| **Method**         | QLoRA with final weight merge                             |
| **Frameworks**     | `transformers`, `trl`, `peft`, `bitsandbytes`             |
| **Hardware**       | GPU with 12 GB VRAM (4-bit quantization during training)  |

An illustrative fine-tuning sketch is provided in the appendix at the end of this card.

### Main Hyperparameters

| **Parameter**                  | **Value** |
|--------------------------------|-----------|
| `per_device_train_batch_size`  | 2         |
| `gradient_accumulation_steps`  | 4         |
| `learning_rate`                | 2e-4      |
| `num_train_epochs`             | 1         |
| `max_seq_length`               | 1024      |

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ricardouchub/Code-Specialist-7B"
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(mdl.device)
out = mdl.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```

---

## Initial Benchmarks

- **Informal evaluation (Python tasks):** Improved results over the base model on small programming and data-related tasks, including **data analysis, SQL query generation, and Python snippets**. No standardized scores are reported yet.
- Evaluation on **HumanEval** or **MBPP** is recommended for reproducible metrics; an illustrative evaluation sketch is included in the appendix.

---

## Author

**Ricardo Urdaneta**

- [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)
- [GitHub](https://github.com/Ricardouchub)

---

## Limitations

- The model can produce incorrect or non-functional code, especially on complex programming tasks.
- It may produce inconsistent results for ambiguous or incomplete prompts.

---

## License

This model is released under the **MIT License** (see the `license` tag above). The base model, **Mistral-7B-Instruct-v0.3**, is distributed by Mistral AI under the Apache 2.0 license.
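---

## Appendix: Fine-Tuning Sketch

The exact training script is not published with this card. The snippet below is a minimal, illustrative sketch of the QLoRA setup described above (4-bit quantization, LoRA adapters, and the hyperparameters from the table), written against `transformers`, `trl`, `peft`, and `bitsandbytes`. The LoRA rank/alpha values, the target modules, and the `to_mistral_prompt` helper are assumptions, the Python/SQL filtering step is omitted, and some argument names may differ between `trl` releases.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

base_id = "mistralai/Mistral-7B-Instruct-v0.3"

# 4-bit NF4 quantization so the 7B model fits on a 12 GB GPU during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

# LoRA adapter configuration (rank, alpha, and target modules are illustrative values)
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

def to_mistral_prompt(example):
    # Wrap each instruction/output pair in the [INST] ... [/INST] format shown above
    # (the optional `input` column of CodeAlpaca-20k is ignored in this sketch)
    return {"text": f"[INST] {example['instruction']} [/INST] {example['output']}"}

# One of the two SFT datasets; in practice both are formatted and concatenated
dataset = load_dataset("sahil2801/CodeAlpaca-20k", split="train").map(to_mistral_prompt)

training_args = SFTConfig(
    output_dir="code-specialist-7b",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    max_seq_length=1024,
    dataset_text_field="text",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()

# After training, the LoRA adapter is merged into the base weights (for example by
# reloading the base model in full precision and calling PeftModel.merge_and_unload())
# to produce the standalone checkpoint published here.
```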
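---

## Appendix: Evaluation Sketch

No reproducible pass@k numbers are reported above. As one possible way to produce them, the snippet below uses the `code_eval` metric from the `evaluate` library to score generated Python functions against hand-written assertions. The `test_cases` and `candidates` values are placeholders for illustration, not results from this model.

```python
import os

# code execution must be explicitly enabled for this metric
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

import evaluate

code_eval = evaluate.load("code_eval")

# Each reference is a test program; each prediction list holds model completions for it
test_cases = ["assert add(2, 3) == 5"]
candidates = [["def add(a, b):\n    return a + b"]]

pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1],
)
print(pass_at_k)  # e.g. {'pass@1': 1.0}
```

For standardized scores, running the model through **HumanEval** or **MBPP**, as noted in the benchmarks section, remains the recommended path.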