---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
tags:
- code-generation
- python
- qwen
- unsloth
- coding-assistant
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# VCoder

VCoder is a Python-focused coding assistant fine-tuned from Qwen2.5-Coder-3B-Instruct using LoRA and Unsloth.

The model was trained on 15,000 Python instruction-response examples from the Python Code Instructions 15K dataset and optimized for Python code generation, problem solving, debugging, and algorithm implementation.

## Model Details

| Attribute | Value |
|------------|---------|
| Base Model | Qwen2.5-Coder-3B-Instruct |
| Fine-Tuning Method | LoRA |
| Framework | Unsloth |
| Dataset | Python Code Instructions 15K |
| Training Samples | 15,000 |
| GPU | NVIDIA Tesla T4 |
| Quantized Format | GGUF Q8_0 |
| Primary Language | Python |

---

## Training Pipeline

Training was performed incrementally in three stages of 5,000 samples each:

| Stage | Samples |
|---------|---------|
| Stage 1 | 0 - 5,000 |
| Stage 2 | 5,000 - 10,000 |
| Stage 3 | 10,000 - 15,000 |

The model was trained using parameter-efficient fine-tuning (LoRA), allowing adaptation of the base model while keeping computational requirements low.
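The staged schedule above can be sketched as index ranges over the dataset. This is a minimal illustration, assuming half-open ranges (the end index is excluded) and equal stage sizes; the helper name `stage_ranges` is hypothetical and is not taken from the original training code.

```python
def stage_ranges(total_samples: int, stage_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) index ranges that cover total_samples."""
    if total_samples < 0 or stage_size <= 0:
        raise ValueError("total_samples must be >= 0 and stage_size must be > 0")
    return [
        (start, min(start + stage_size, total_samples))
        for start in range(0, total_samples, stage_size)
    ]


# Assumed values, taken from the stage table: 15,000 samples in steps of 5,000.
stages = stage_ranges(15_000, 5_000)
for number, (start, end) in enumerate(stages, start=1):
    print(f"Stage {number}: samples {start:,} - {end:,}")
```

Each range could then be used to slice the dataset for one training run, with the LoRA adapter from the previous stage loaded as the starting point.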

---

## Benchmark Results

![Output](https://cdn-uploads.huggingface.co/production/uploads/6a297050d3837ea7b12cc42f/BV8FY6fJN7KQ43jcpC6hr.png)

### HumanEval Comparison

The model was evaluated against the original Qwen2.5-Coder-3B-Instruct on HumanEval coding tasks.

| Model | Pass@1 |
|---------|---------|
| Qwen2.5-Coder-3B-Instruct (base) | 61.0% |
| VCoder | 68.0% |

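### Improvement

```text
+7.0 percentage points Pass@1 (61.0% -> 68.0%)
```

On the evaluated tasks, the fine-tuned model scored higher than the base model. Because the evaluation used a subset of HumanEval (see Limitations), treat the gap as indicative rather than conclusive.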
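For reference, Pass@1 with a single completion per task is the share of tasks whose generated solution passes all of its unit tests, and the gap between two models is a difference in percentage points. The sketch below assumes one completion per task; `pass_at_1` is a hypothetical helper and the pass/fail list is toy data, not output from the evaluation reported here.

```python
def pass_at_1(results: list[bool]) -> float:
    """Percentage of tasks whose single generated solution passed its tests."""
    if not results:
        raise ValueError("results must contain at least one task outcome")
    return 100.0 * sum(results) / len(results)


# Toy outcomes for four imaginary tasks (illustration only, not real results).
toy_results = [True, False, True, True]
print(f"Toy Pass@1: {pass_at_1(toy_results):.1f}%")

# Scores copied from the comparison table; the gap is in percentage points.
base_score, tuned_score = 61.0, 68.0
gain = tuned_score - base_score
print(f"Gain: {gain:+.1f} percentage points")
```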

---

## Example Usage

### Python

```python
prompt = """
### Instruction:
Write a Python function to reverse a string.

### Input:

### Response:
"""
```

### Example Output

```python
def reverse_string(text):
    return text[::-1]
```
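The prompt layout shown above can be wrapped in a small helper so that every request uses the same section headers. This is a sketch under the assumption that the model expects exactly the `### Instruction:` / `### Input:` / `### Response:` layout from the example; the function name `build_prompt` is hypothetical.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional input) in the layout shown above."""
    cleaned_input = input_text.strip()
    # With no input, keep a single blank line under the "### Input:" header.
    input_block = f"{cleaned_input}\n\n" if cleaned_input else "\n"
    return (
        "\n### Instruction:\n"
        f"{instruction.strip()}\n\n"
        "### Input:\n"
        f"{input_block}"
        "### Response:\n"
    )


prompt = build_prompt("Write a Python function to reverse a string.")
print(prompt)
```

The resulting string can be passed to whichever runtime hosts the model; the generated text is expected to follow the `### Response:` header.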

---

## Supported Tasks

- Python Code Generation
- Algorithm Design
- Data Structures
- Debugging
- Code Refactoring
- Coding Interview Questions
- Competitive Programming
- Function Completion

---

## GGUF Usage

The GGUF Q8_0 build is compatible with:

- Ollama
- LM Studio
- llama.cpp

---

## Training Dataset

Dataset used: Python Code Instructions 15K

The dataset contains instruction-response pairs focused on Python programming tasks including:

- Function generation
- Data manipulation
- Algorithms
- Debugging
- Problem solving

---

## Limitations

- Primarily optimized for Python.
- The benchmark was run on a subset of HumanEval tasks, so the scores may not match full-benchmark results.
- May generate incorrect code for highly specialized domains.
- Should not be used as the sole source of production-critical code.

---

## Acknowledgements

- Qwen Team for Qwen2.5-Coder
- Unsloth for efficient fine-tuning
- Hugging Face
- OpenAI HumanEval Benchmark

---

## Citation

```bibtex
@misc{vcoder2026,
  title={VCoder: Python Code Generation Model},
  author={Varunesh V and Prawin R K and Sarguru N},
  year={2026},
  base_model={Qwen2.5-Coder-3B-Instruct}
}
```

- GitHub: https://github.com/prawinrk
- Email: prawinrk2005@gmail.com