karim0010
/

Qwen2.5-Coder-1.5B-python-MyTune

@@ -1,3 +1,4 @@
 ---
 language:
 - en
@@ -8,28 +9,142 @@ tags:
 - qwen
 - qlora
 - custom-finetune
 datasets:
 - iamtarun/python_code_instructions_18k_alpaca
 base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
 ---
-# Qwen2.5-Coder-1.5B-python-MyTune (Fine-tuned by Karim)
-## 📌 Model Description
-This model is a highly optimized, fine-tuned version of `Qwen/Qwen2.5-Coder-1.5B-Instruct`. It has been specifically trained to understand complex algorithmic instructions and generate clean, efficient, and highly accurate **Python** code.
-The training architecture utilized the **QLoRA** (Quantized Low-Rank Adaptation) method. This approach ensures high parameter efficiency, allowing the model to acquire new coding skills while preserving the robust logical reasoning capabilities of the original base weights.
 ## 📊 Training Data
 The model was fine-tuned on a carefully curated subset of the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset. This dataset provides high-quality Python coding instructions, algorithmic challenges, and their corresponding structured solutions.
 ## 🎯 Intended Use
 This model is designed to assist software engineers, data scientists, and quantitative analysts with:
 - Generating Python scripts from natural language prompts.
 - Solving complex algorithmic problems.
 - Writing data engineering and mathematical logic code.
-## ⚙️ Training Hardware
-- **Compute:** Google Colab T4 GPU (16GB VRAM)
-- **Precision:** Mixed Precision (4-bit Base + float16 Adapters)
-- **Method:** PEFT / QLoRA Integration

 ---
 language:
 - en
 - qwen
 - qlora
 - custom-finetune
+- code
+- ollama
 datasets:
 - iamtarun/python_code_instructions_18k_alpaca
 base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
 ---
+# 🤖 Qwen2.5-Coder-1.5B-python-MyTune
+**Fine-tuned with ❤️ by Karim**
+Welcome to **Qwen2.5-Coder-1.5B-python-MyTune**! This is a QLoRA fine-tune of `Qwen/Qwen2.5-Coder-1.5B-Instruct`, trained to follow algorithmic instructions and generate clean, efficient **Python** code.
+## 📌 Model Overview
+The model was trained with **QLoRA** (Quantized Low-Rank Adaptation): the base weights are quantized to 4-bit and kept frozen, and only small low-rank adapter matrices are trained. This keeps the number of trainable parameters low, so the model can acquire new coding skills while largely preserving the reasoning capabilities of the original base weights.
+- **Base Model:** Qwen/Qwen2.5-Coder-1.5B-Instruct
+- **Language:** English / Python
+- **Training Method:** PEFT / QLoRA Integration
+- **Precision:** Mixed Precision (4-bit Base + float16 Adapters)
+- **Compute:** Google Colab T4 GPU (16GB VRAM)
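To make the parameter-efficiency point concrete, here is a rough back-of-the-envelope sketch of how many trainable parameters a LoRA adapter adds to a single linear layer. The rank of 16 and the 1536-wide projection are illustrative assumptions only; the actual rank, target modules, and layer shapes used for this fine-tune are not stated in this card.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen weight matrix the adapter is attached to (bias ignored)."""
    return d_in * d_out


# Hypothetical values for illustration only (not taken from the training run).
D_IN, D_OUT, RANK = 1536, 1536, 16

adapter = lora_trainable_params(D_IN, D_OUT, RANK)
frozen = full_params(D_IN, D_OUT)
print(f"adapter params: {adapter:,}")
print(f"frozen params:  {frozen:,}")
print(f"trainable fraction for this layer: {adapter / frozen:.2%}")
```

Under these assumed shapes the adapter is a small fraction of the layer it modifies, which is why this kind of training can fit on a single 16GB GPU.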
 ## 📊 Training Data
 The model was fine-tuned on a carefully curated subset of the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset. This dataset provides high-quality Python coding instructions, algorithmic challenges, and their corresponding structured solutions.
 ## 🎯 Intended Use
 This model is designed to assist software engineers, data scientists, and quantitative analysts with:
 - Generating Python scripts from natural language prompts.
 - Solving complex algorithmic problems.
 - Writing data engineering and mathematical logic code.
+---
+## 🚀 Quick Start: How to Use
+You can run this model locally or on a cloud server with the standard Hugging Face `transformers` library, or deploy it with **Ollama** for local inference.
+### Option A: Local Deployment via Ollama (Recommended for Speed)
+Run this model entirely on your local machine with Ollama. After the initial download, no internet connection is required.
+**Step 1: Download the Model Files**
+First, download the safetensors weights to a local directory:
+```bash
+pip install -U huggingface_hub
+huggingface-cli download karim0010/Qwen2.5-Coder-1.5B-python-MyTune --local-dir ./my_qwen_model
+```
+**Step 2: Create a `Modelfile`**
+In the directory where you ran the download command (the one that contains `my_qwen_model`), create a file named `Modelfile` (no extension) and paste the following ChatML configuration:
+```dockerfile
+FROM ./my_qwen_model
+TEMPLATE """{{ if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}{{ if .Prompt }}<|im_start|>user
+{{ .Prompt }}<|im_end|>
+{{ end }}<|im_start|>assistant
+"""
+PARAMETER stop "<|im_start|>"
+PARAMETER stop "<|im_end|>"
+PARAMETER temperature 0.3
+PARAMETER top_p 0.9
+```
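For reference, the same ChatML layout can be reproduced in plain Python, which is handy for checking what text the template above sends to the model. This is a minimal sketch; `build_chatml_prompt` is a hypothetical helper name, not part of Ollama or `transformers`.

```python
from typing import Optional


def build_chatml_prompt(user_prompt: str, system: Optional[str] = None) -> str:
    """Mirror the Modelfile TEMPLATE: optional system turn, user turn, open assistant turn."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if user_prompt:
        parts.append(f"<|im_start|>user\n{user_prompt}<|im_end|>\n")
    # The assistant turn is left open so the model writes the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(build_chatml_prompt("Reverse a string in Python.", system="You are a coding assistant."))
```

The two `stop` parameters in the Modelfile correspond to the turn markers used here, so generation ends when the model closes its own turn or tries to open a new one.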
+**Step 3: Compile and Run**
+Build the model in Ollama and start chatting:
+```bash
+ollama create karim-coder -f ./Modelfile
+ollama run karim-coder
+```
+*Now you can ask it to write Python code right in your terminal!*
+---
+### Option B: Python Inference (Hugging Face Transformers)
+If you prefer integrating the model directly into your Python pipeline, use the following code.
+**1. Install Dependencies**
+```bash
+pip install transformers torch accelerate
+```
+**2. Inference Script**
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Define the repository
+model_id = "karim0010/Qwen2.5-Coder-1.5B-python-MyTune"
+# Load Tokenizer and Model
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.float16,
+    device_map="auto",
+    trust_remote_code=True
+)
+# Prepare the prompt using the ChatML template
+instruction = "Write a complete and clean Python function to calculate the Fibonacci sequence up to a given number 'n'."
+prompt = f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"
+# Tokenize inputs
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+# Generate code
+print("Generating code...")
+outputs = model.generate(
+    inputs["input_ids"],
+    attention_mask=inputs["attention_mask"],
+    max_new_tokens=256,
+    temperature=0.3, # Low temperature is recommended for accurate coding
+    top_p=0.9,
+    do_sample=True,
+    pad_token_id=tokenizer.eos_token_id
+)
+# Decode and print the result
+response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
+print("\n--- Output ---")
+print(response.strip())
+```
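Instruction-tuned code models often wrap their answer in a Markdown code fence. If you want only the code, a small post-processing step can pull it out of `response`. The sketch below rests on that assumption about the output format, which is not guaranteed; `extract_code` is a hypothetical helper, and it falls back to the raw text when no fence is found.

```python
import re

# Matches a fenced block: three backticks, optional language tag, body, three backticks.
_FENCE_RE = re.compile(r"`{3}[\w+-]*[ \t]*\n(.*?)`{3}", re.DOTALL)


def extract_code(response: str) -> str:
    """Return the body of the first fenced code block, or the stripped text if none exists."""
    match = _FENCE_RE.search(response)
    return match.group(1).strip() if match else response.strip()


# Build a sample reply; the fence marker is assembled at runtime to keep this block readable.
tick = "`" * 3
sample = f"Here is the function:\n{tick}python\ndef add(a, b):\n    return a + b\n{tick}\nHope it helps."
print(extract_code(sample))
```

In the inference script above, this would be applied as `extract_code(response)` after decoding.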