---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
- python
- qwen
- unsloth
- transformers
- coding-assistant
language:
- en
---

# VCoder

VCoder is a Python-focused coding assistant fine-tuned from Qwen2.5-Coder-3B-Instruct using LoRA and Unsloth.

The model was trained on 15,000 Python instruction-response examples from the Python Code Instructions 15K dataset and optimized for Python code generation, problem solving, debugging, and algorithm implementation.

## Model Details

| Attribute | Value |
|------------|---------|
| Base Model | Qwen2.5-Coder-3B-Instruct |
| Fine-Tuning Method | LoRA |
| Framework | Unsloth |
| Dataset | Python Code Instructions 15K |
| Training Samples | 15,000 |
| GPU | NVIDIA Tesla T4 |
| Quantized Format | GGUF Q8_0 |
| Primary Language | Python |

---

## Training Pipeline

Training was performed incrementally:

| Stage | Samples |
|---------|---------|
| Stage 1 | 0 - 5,000 |
| Stage 2 | 5,000 - 10,000 |
| Stage 3 | 10,000 - 15,000 |

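The model was fine-tuned with LoRA, a parameter-efficient method that trains small low-rank adapter matrices while the base model weights stay frozen. This keeps memory and compute requirements low enough for the NVIDIA Tesla T4 listed in Model Details.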
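
The staged schedule in the table can be sketched as three contiguous slices of the dataset. This is a minimal illustration, assuming the ranges are half-open index ranges (start inclusive, end exclusive). The names `stage_ranges`, `TOTAL_SAMPLES`, and `STAGE_SIZE` are hypothetical and are not taken from the actual training code.

```python
# Hypothetical sketch: split 15,000 samples into three equal training stages.
TOTAL_SAMPLES = 15_000
STAGE_SIZE = 5_000  # assumption: each stage covers 5,000 samples, as in the table


def stage_ranges(total: int, stage_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) index ranges that cover `total` samples."""
    return [
        (start, min(start + stage_size, total))
        for start in range(0, total, stage_size)
    ]


ranges = stage_ranges(TOTAL_SAMPLES, STAGE_SIZE)
for stage, (start, end) in enumerate(ranges, start=1):
    print(f"Stage {stage}: samples[{start}:{end}] -> {end - start} examples")
```

Read this way, the shared boundaries in the table (5,000 and 10,000) do not double-count any sample.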

---

## Benchmark Results

![Output](https://cdn-uploads.huggingface.co/production/uploads/6a297050d3837ea7b12cc42f/BV8FY6fJN7KQ43jcpC6hr.png)

### HumanEval Comparison

The model was evaluated against the original Qwen2.5-Coder-3B-Instruct on HumanEval coding tasks.

| Model | Pass@1 |
|---------|---------|
| Qwen2.5-Coder-3B-Instruct (base) | 61.0% |
| VCoder | 68.0% |

### Improvement

```text
+7.0 percentage points Pass@1 (from 61.0% to 68.0%)
```

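On this evaluation, the fine-tuned model scored higher than the base model on Python coding tasks. The comparison was run on a subset of HumanEval tasks (see Limitations), so treat the gain as indicative, not as a full-benchmark result.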
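
The difference between the two scores can be stated as an absolute gain (percentage points) or as a gain relative to the base score. The short calculation below uses only the two Pass@1 values from the table. The variable names are illustrative.

```python
# Scores taken from the HumanEval comparison table in this card.
base_pass_at_1 = 61.0    # percent
vcoder_pass_at_1 = 68.0  # percent

# Absolute gain: simple difference, measured in percentage points.
absolute_gain = vcoder_pass_at_1 - base_pass_at_1

# Relative gain: the difference as a share of the base score, in percent.
relative_gain = absolute_gain / base_pass_at_1 * 100

print(f"Absolute gain: {absolute_gain:.1f} percentage points")
print(f"Relative gain: {relative_gain:.1f}%")
```

This prints an absolute gain of 7.0 percentage points and a relative gain of 11.5%.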

---

## Example Usage

### Python

```python
prompt = """
### Instruction:
Write a Python function to reverse a string.

### Input:

### Response:
"""
```

### Example Output

```python
def reverse_string(text):
    return text[::-1]
```
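
A small helper can build prompts in the Instruction/Input/Response layout shown above and pull the answer back out of the generated text. This is a minimal sketch: `build_prompt` and `extract_response` are hypothetical helper names, not part of any library, and the `completion` string stands in for real model output.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request in the Instruction/Input/Response layout used by this card."""
    # Leave the Input section empty when no extra input is supplied.
    input_block = f"{input_text}\n" if input_text else ""
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{input_block}\n"
        "### Response:\n"
    )


def extract_response(generated: str) -> str:
    """Return the text after the last '### Response:' marker, stripped."""
    marker = "### Response:"
    return generated.rsplit(marker, 1)[-1].strip()


prompt = build_prompt("Write a Python function to reverse a string.")

# Stand-in for model output: the prompt followed by the generated answer.
completion = prompt + "def reverse_string(text):\n    return text[::-1]\n"
print(extract_response(completion))
```

Keeping the prompt layout in one function makes it easier to match the format the model saw during fine-tuning.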

---

## Supported Tasks

- Python Code Generation
- Algorithm Design
- Data Structures
- Debugging
- Code Refactoring
- Coding Interview Questions
- Competitive Programming
- Function Completion

---

## GGUF Usage

Compatible with:

- Ollama
- LM Studio
- llama.cpp
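
For Ollama, a GGUF file is usually wrapped in a Modelfile. The sketch below generates one as text, using the `FROM`, `PARAMETER`, and `TEMPLATE` Modelfile instructions. The file name `vcoder-q8_0.gguf`, the temperature value, and the stop sequence are assumptions for illustration, and the template assumes the prompt layout from Example Usage. Adjust all of these to match the published files.

```python
def build_modelfile(gguf_path: str, temperature: float = 0.2) -> str:
    """Return Ollama Modelfile text for a local GGUF file (illustrative values)."""
    lines = [
        f"FROM {gguf_path}",
        f"PARAMETER temperature {temperature}",
        # Assumption: stop when the model starts a new instruction block.
        'PARAMETER stop "### Instruction:"',
        # Template mirrors the Instruction/Input/Response prompt layout.
        'TEMPLATE """### Instruction:',
        "{{ .Prompt }}",
        "",
        "### Input:",
        "",
        "### Response:",
        '"""',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file name; replace with the actual GGUF file.
modelfile_text = build_modelfile("./vcoder-q8_0.gguf")
print(modelfile_text)
```

Saving the printed text as `Modelfile` and running `ollama create vcoder -f Modelfile` would register the model locally under the name `vcoder`, which is itself a placeholder.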

---

## Training Dataset

Dataset used: Python Code Instructions 15K

The dataset contains instruction-response pairs focused on Python programming tasks including:

- Function generation
- Data manipulation
- Algorithms
- Debugging
- Problem solving

---

## Limitations

- Primarily optimized for Python.
- The benchmark was run on a subset of HumanEval tasks, so the scores may not match full-benchmark results.
- May generate incorrect code for highly specialized domains.
- Should not be used as the sole source of production-critical code.
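
Because generated code can be wrong, one practical safeguard is to run it against a few assertions before using it. The sketch below is a minimal illustration of that idea. `passes_tests` is a hypothetical helper, and the `generated` and `broken` strings stand in for model output.

```python
def passes_tests(candidate_source: str, test_source: str) -> bool:
    """Run generated code and its tests in a fresh namespace; True if nothing raises.

    Note: exec runs arbitrary code, so only use this inside a sandbox
    or another isolated environment.
    """
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # defines the generated function(s)
        exec(test_source, namespace)       # assertions raise on failure
    except Exception:
        return False
    return True


tests = "assert reverse_string('abc') == 'cba'\nassert reverse_string('') == ''\n"

generated = "def reverse_string(text):\n    return text[::-1]\n"
broken = "def reverse_string(text):\n    return text\n"

print(passes_tests(generated, tests))  # correct implementation
print(passes_tests(broken, tests))     # fails the first assertion
```

The first call prints `True` and the second prints `False`.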

---

## Acknowledgements

- Qwen Team for Qwen2.5-Coder
- Unsloth for efficient fine-tuning
- Hugging Face
- OpenAI HumanEval Benchmark

---

## Citation

```bibtex
@misc{vcoder2026,
  title={VCoder: Python Code Generation Model},
  author={Varunesh V and Prawin R K and Sarguru N},
  year={2026},
  base_model={Qwen2.5-Coder-3B-Instruct}
}
```

- GitHub: https://github.com/sarguru16
- Mail: sarguru1609@gmail.com