---
language:
- en
- code
tags:
- python
- text-generation
- qwen
- qlora
- custom-finetune
- code
- ollama
datasets:
- iamtarun/python_code_instructions_18k_alpaca
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
---

# 🤖 Qwen2.5-Coder-1.5B-python-MyTune

**Fine-tuned with ❤️ by Karim**

Welcome to **Qwen2.5-Coder-1.5B-python-MyTune**! This is a fine-tuned version of `Qwen/Qwen2.5-Coder-1.5B-Instruct`, tuned to follow algorithmic instructions and generate clean, efficient **Python** code.

## 📌 Model Overview

The model was trained with **QLoRA** (Quantized Low-Rank Adaptation): the base weights are quantized to 4-bit and kept frozen, and only small low-rank adapter matrices are trained. This keeps the number of trainable parameters low, so the model can pick up Python-specific skills while the reasoning capabilities of the original base weights are preserved.

- **Base Model:** Qwen/Qwen2.5-Coder-1.5B-Instruct
- **Language:** English / Python
- **Training Method:** PEFT / QLoRA Integration
- **Precision:** Mixed Precision (4-bit Base + float16 Adapters)
- **Compute:** Google Colab T4 GPU (16GB VRAM)
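
To make the parameter-efficiency point concrete, the sketch below counts the trainable parameters that a low-rank adapter adds to a single weight matrix. The rank (`r = 16`) and the 1536 x 1536 matrix shape are illustrative assumptions; this card does not state the adapter rank or which layers were adapted.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors: A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


def full_matrix_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen weight matrix the adapter is attached to."""
    return d_in * d_out


# Hypothetical values chosen for illustration only (not taken from the training run).
d_in, d_out, r = 1536, 1536, 16

lora = lora_trainable_params(d_in, d_out, r)
full = full_matrix_params(d_in, d_out)
print(f"Adapter: {lora:,} trainable vs. {full:,} frozen ({lora / full:.2%})")
```

Under these assumed values the adapter trains roughly 2% as many parameters as the matrix it modifies, which is why this style of fine-tuning fits on a single T4.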

## 📊 Training Data

The model was fine-tuned on a carefully curated subset of the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset. This dataset provides high-quality Python coding instructions, algorithmic challenges, and their corresponding structured solutions.
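
As a rough illustration, an Alpaca-style record could be turned into a ChatML training example as sketched below. The field names `instruction`, `input`, and `output` follow the Alpaca convention; the helper `alpaca_to_chatml` and the sample record are hypothetical, and this card does not describe the exact formatting used during training.

```python
def alpaca_to_chatml(record: dict) -> str:
    """Render one Alpaca-style record as a single ChatML user/assistant exchange."""
    instruction = record["instruction"].strip()
    extra_input = (record.get("input") or "").strip()
    # Append the optional input to the instruction when it is present.
    user_turn = f"{instruction}\n\n{extra_input}" if extra_input else instruction
    return (
        f"<|im_start|>user\n{user_turn}<|im_end|>\n"
        f"<|im_start|>assistant\n{record['output'].strip()}<|im_end|>\n"
    )


# Made-up record for illustration; it is not copied from the dataset.
example = {
    "instruction": "Write a function that returns the sum of a list of numbers.",
    "input": "[1, 2, 3]",
    "output": "def total(xs):\n    return sum(xs)",
}
print(alpaca_to_chatml(example))
```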

## 🎯 Intended Use

This model is designed to assist software engineers, data scientists, and quantitative analysts with:
- Generating Python scripts from natural language prompts.
- Solving complex algorithmic problems.
- Writing data engineering and mathematical logic code.

---

## 🚀 Quick Start: How to Use

You can run this model locally or on a cloud server, either through the standard Hugging Face `transformers` library or through **Ollama** for local inference.

### Option A: Local Deployment via Ollama (Recommended for Speed)

Run this model entirely on your local machine, without an internet connection, using Ollama!

**Step 1: Download the Model Files**
First, download the safetensors weights to a local directory:
```bash
pip install -U huggingface_hub
huggingface-cli download karim0010/Qwen2.5-Coder-1.5B-python-MyTune --local-dir ./my_qwen_model
```

**Step 2: Create a `Modelfile`**
In the directory that contains `my_qwen_model` (where you ran the download command), create a file named `Modelfile` (no extension) and paste the following ChatML configuration:

```dockerfile
FROM ./my_qwen_model

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.3
PARAMETER top_p 0.9
```

**Step 3: Compile and Run**
Build the model in Ollama and start chatting:

```bash
ollama create karim-coder -f ./Modelfile
ollama run karim-coder
```

*Now you can ask it to write Python code right in your terminal!*
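
If you would rather call the Ollama model from a script than from the terminal, a minimal sketch using only the Python standard library is shown below. It assumes the Ollama server is running at its default local address (`http://localhost:11434`) and that the model was created under the name `karim-coder` as in Step 3; `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed, adjust if yours differs).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "karim-coder") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "karim-coder", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `print(generate("Write a Python function that reverses a string."))` prints the model's reply.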

---

### Option B: Python Inference (Hugging Face Transformers)

If you prefer integrating the model directly into your Python pipeline, use the following code.

**1. Install Dependencies**

```bash
pip install transformers torch accelerate
```

**2. Inference Script**

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the repository
model_id = "karim0010/Qwen2.5-Coder-1.5B-python-MyTune"

# Load Tokenizer and Model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Prepare the prompt using the ChatML template
instruction = "Write a complete and clean Python function to calculate the Fibonacci sequence up to a given number 'n'."
prompt = f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

# Tokenize inputs
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate code
print("Generating code...")
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=256,
    temperature=0.3, # Low temperature is recommended for accurate coding
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode and print the result
response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
print("\n--- Output ---")
print(response.strip())
```
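
The script above builds the ChatML prompt by hand for a single user turn. A small helper for the same format that also handles a system message or earlier turns might look like the sketch below. `build_chatml_prompt` is a hypothetical name, and the system message is only an example. In practice, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is the usual way to render the model's bundled template, which may add a default system message of its own.

```python
def build_chatml_prompt(messages: list) -> str:
    """Render role/content messages as ChatML and open the assistant turn."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful Python coding assistant."},
    {"role": "user", "content": "Write a function that checks whether a number is prime."},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

The resulting string can be passed to `tokenizer(prompt, return_tensors="pt")` exactly as in the inference script.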