---
language:
- en
- code
tags:
- python
- text-generation
- qwen
- qlora
- custom-finetune
- code
- ollama
datasets:
- iamtarun/python_code_instructions_18k_alpaca
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
---
# Qwen2.5-Coder-1.5B-python-MyTune
**Fine-tuned with ❤️ by Karim**
Welcome to **Qwen2.5-Coder-1.5B-python-MyTune**! This is a fine-tuned version of `Qwen/Qwen2.5-Coder-1.5B-Instruct`, trained to follow algorithmic instructions and generate clean, efficient **Python** code.
## Model Overview
The model was trained with **QLoRA** (Quantized Low-Rank Adaptation): the base weights are quantized to 4-bit and kept frozen, and only small low-rank adapter matrices are trained. This keeps the number of trainable parameters low, so the model can pick up Python-specific skills while the base model's reasoning capabilities are largely preserved.
- **Base Model:** Qwen/Qwen2.5-Coder-1.5B-Instruct
- **Language:** English / Python
- **Training Method:** PEFT / QLoRA Integration
- **Precision:** Mixed Precision (4-bit Base + float16 Adapters)
- **Compute:** Google Colab T4 GPU (16GB VRAM)
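To make the parameter-efficiency point concrete, the sketch below counts the weights a LoRA adapter adds to one linear layer and compares that with the layer's full weight matrix. The rank `r = 16` and the 1536 x 1536 layer shape are illustrative assumptions; this card does not state the adapter rank or the target modules used in training.

```python
def lora_param_counts(d_out: int, d_in: int, r: int) -> tuple[int, int]:
    """Return (full, adapter) parameter counts for one linear layer.

    LoRA freezes the d_out x d_in weight W and trains two small matrices,
    B (d_out x r) and A (r x d_in), whose product is added to W.
    """
    full = d_out * d_in
    adapter = r * (d_out + d_in)
    return full, adapter


# Hypothetical values: a square 1536 x 1536 projection and rank 16.
full, adapter = lora_param_counts(d_out=1536, d_in=1536, r=16)
print(f"full layer:   {full:,} weights")
print(f"LoRA adapter: {adapter:,} weights ({adapter / full:.2%} of the layer)")
```

Under these assumed numbers the adapter holds roughly 2% as many weights as the layer it modifies, which is why training fits on a single T4.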
## Training Data
The model was fine-tuned on a carefully curated subset of the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset. This dataset provides high-quality Python coding instructions, algorithmic challenges, and their corresponding structured solutions.
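As an illustration of how an Alpaca-style record could be turned into a ChatML training string, here is a minimal sketch. It assumes records with `instruction`, `input`, and `output` fields; the `to_chatml` helper and the sample record are hypothetical, and the preprocessing actually used for this model is not documented in this card.

```python
def to_chatml(record: dict) -> str:
    """Format one Alpaca-style record as a single ChatML training string."""
    user_turn = record["instruction"].strip()
    # Alpaca-style records may carry an optional "input" with extra context.
    extra = (record.get("input") or "").strip()
    if extra:
        user_turn += "\n\n" + extra
    return (
        f"<|im_start|>user\n{user_turn}<|im_end|>\n"
        f"<|im_start|>assistant\n{record['output'].strip()}<|im_end|>\n"
    )


# Made-up record for demonstration only, not taken from the dataset.
sample = {
    "instruction": "Write a function that returns the square of a number.",
    "input": "",
    "output": "def square(x):\n    return x * x",
}
print(to_chatml(sample))
```

The same ChatML markers appear in the Ollama template and the inference prompt later in this card, so training and inference would see a consistent format.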
## Intended Use
This model is designed to assist software engineers, data scientists, and quantitative analysts with:
- Generating Python scripts from natural language prompts.
- Solving complex algorithmic problems.
- Writing data engineering and mathematical logic code.
---
## Quick Start: How to Use
You can run this model with **Ollama** for local inference (Option A), or load it with the standard Hugging Face `transformers` library, locally or on a cloud server (Option B).
### Option A: Local Deployment via Ollama (Recommended for Speed)
Once the files are downloaded, you can run this model entirely on your local machine with Ollama, without an internet connection.
**Step 1: Download the Model Files**
First, download the safetensors weights to a local directory:
```bash
pip install -U huggingface_hub
huggingface-cli download karim0010/Qwen2.5-Coder-1.5B-python-MyTune --local-dir ./my_qwen_model
```
**Step 2: Create a `Modelfile`**
In the directory that contains `my_qwen_model` (the one where you ran the download command), create a file named `Modelfile` (no extension) and paste the following ChatML configuration:
```dockerfile
FROM ./my_qwen_model
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.3
PARAMETER top_p 0.9
```
**Step 3: Compile and Run**
Build the model in Ollama and start chatting:
```bash
ollama create karim-coder -f ./Modelfile
ollama run karim-coder
```
*Now you can ask it to write Python code right in your terminal!*
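Beyond the interactive terminal, the model can also be queried from a script through Ollama's local HTTP API. The following is a minimal sketch, assuming the Ollama server is running on its default port (11434) and the model was created under the name `karim-coder` as in Step 3; the helper names `build_request` and `ask` are made up for this example. With the server up, `ask("Write a Python function that reverses a string.")` should return the generated text.

```python
import json
import urllib.request

# Ollama's default local endpoint; change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "karim-coder") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt and return the generated text (server must be running)."""
    with urllib.request.urlopen(build_request(prompt)) as reply:
        return json.loads(reply.read())["response"]
```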
---
### Option B: Python Inference (Hugging Face Transformers)
If you prefer integrating the model directly into your Python pipeline, use the following code.
**1. Install Dependencies**
```bash
pip install transformers torch accelerate
```
**2. Inference Script**
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Define the repository
model_id = "karim0010/Qwen2.5-Coder-1.5B-python-MyTune"
# Load Tokenizer and Model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
# Prepare the prompt using the ChatML template
instruction = "Write a complete and clean Python function to calculate the Fibonacci sequence up to a given number 'n'."
prompt = f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"
# Tokenize inputs
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate code
print("Generating code...")
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=256,
    temperature=0.3,  # Low temperature is recommended for accurate coding
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode and print the result
response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
print("\n--- Output ---")
print(response.strip())
```
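Instruction-tuned models may wrap generated code in Markdown fences or surround it with explanatory text. If you want to feed the output into a pipeline, a small post-processing step can help. The helpers below are a hypothetical sketch (neither function is part of this model or of `transformers`): one extracts the first fenced block from a response, the other checks that the result at least parses as Python.

```python
import ast
import re

FENCE = "`" * 3  # a Markdown code fence, built this way so it can sit inside this example


def extract_python(response: str) -> str:
    """Return the first fenced code block in `response`, or the whole text if none."""
    pattern = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(response)
    return (match.group(1) if match else response).strip()


def is_valid_python(code: str) -> bool:
    """Check that `code` parses; this does not run it or prove it is correct."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


# Hypothetical model reply used only to exercise the helpers.
reply = (
    f"Here is the function:\n{FENCE}python\n"
    f"def add(a, b):\n    return a + b\n{FENCE}\nHope this helps!"
)
code = extract_python(reply)
print(code)
print("parses:", is_valid_python(code))
```

In the inference script above, this would be applied as `extract_python(response)`. A successful parse only rules out syntax errors, so generated code should still be reviewed and tested before use.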