---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
  - code
  - code-generation
  - sft
  - lora
  - qwen
  - programming
datasets:
  - TokenBender/code_instructions_122k_alpaca_style
pipeline_tag: text-generation
model-index:
  - name: qwen-7b-code-instruct
    results:
      - task:
          type: text-generation
          name: Code Generation
        dataset:
          name: Code Instructions 122K
          type: TokenBender/code_instructions_122k_alpaca_style
          split: train
        metrics:
          - type: loss
            value: 0.507
            name: Final Training Loss
---

# Qwen2.5-7B Code Instruct

A **Qwen2.5-7B-Instruct** model fine-tuned with **SFT + LoRA** on [122K code instructions](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style) covering 40+ programming languages. The model generates clean, correct code from natural language descriptions.

## Training Details

| Parameter | Value |
|-----------|-------|
| **Base model** | [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) |
| **Method** | SFT with LoRA (r=32, alpha=64) |
| **Quantization** | None (full bf16) |
| **Dataset** | [TokenBender/code_instructions_122k_alpaca_style](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style) |
| **Training examples** | 119,519 |
| **Hardware** | NVIDIA RTX 5090 (32GB VRAM) |
| **Training time** | ~3.3 hours |
| **Epochs** | 1 |
| **Effective batch size** | 16 (4 per device × 4 gradient accumulation steps) |
| **Learning rate** | 2e-5 (cosine schedule, 100 warmup steps) |
| **Max sequence length** | 1,024 tokens |
| **Precision** | bf16 |
| **Framework** | TRL 0.29.1 + Transformers 5.3.0 |
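
The hyperparameters above can be expressed as a TRL + PEFT configuration. This is a hedged reconstruction, not the actual training script: the target modules and output directory are assumptions the table does not specify.

```python
# Sketch of the training setup from the table above (TRL SFTTrainer + peft LoRA).
# Values r=32, alpha=64, lr=2e-5, warmup=100, max_length=1024, bf16 come from
# the table; output_dir is an assumed placeholder.
from peft import LoraConfig
from trl import SFTConfig

peft_config = LoraConfig(
    r=32,
    lora_alpha=64,
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen-7b-code-instruct",   # assumed path
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,        # effective batch size 16
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_length=1024,
    bf16=True,
)
```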

## Performance

| Metric | Value |
|--------|-------|
| **Starting loss** | 2.10 |
| **Final loss** | **0.46** |
| **Loss reduction** | 78% |

## Training Curves

![Training Metrics](code_training_metrics_plots.png)

- **Training Loss**: Sharp drop from 2.1 to ~0.5 within the first 200 steps, then continued gradual improvement
- **Learning Rate**: Cosine decay from 2e-5 to 0
- **Gradient Norm**: Stable around 1.0 throughout training
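
The schedule behind the learning-rate curve can be sketched in a few lines: linear warmup for 100 steps, then cosine decay from the 2e-5 peak to 0. The total step count below is derived from 119,519 examples at an effective batch size of 16 over 1 epoch; it is an estimate, not a value read from the training logs.

```python
import math

PEAK_LR = 2e-5
WARMUP_STEPS = 100
TOTAL_STEPS = 7470  # ~119,519 examples / effective batch size 16, 1 epoch

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```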

## Languages Covered

The training dataset spans 40+ programming languages including Python, JavaScript, Java, C++, C#, Go, Rust, TypeScript, SQL, Ruby, PHP, Swift, Kotlin, R, Bash, and more.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "usama10/qwen-7b-code-instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    {"role": "system", "content": "You are an expert programmer. Given a programming task, write clean, correct, and well-commented code."},
    {"role": "user", "content": "Write a Python function that finds the longest common subsequence of two strings."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

## Dataset

The [code_instructions_122k_alpaca_style](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style) dataset contains 122K instruction-output pairs in Alpaca format. Each example has:

- **instruction**: A natural language description of the coding task
- **input**: Optional context or additional information
- **output**: The expected code solution

Examples range from simple utility functions to complex algorithms, data structures, and system design patterns.
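
For SFT, each Alpaca-style record is typically flattened into chat messages. The sketch below shows one plausible mapping; the helper name is illustrative and the exact preprocessing used for this model may differ.

```python
# Illustrative mapping from an Alpaca-style record (instruction/input/output)
# to chat messages for SFT. Field names match the dataset; the function name
# and concatenation format are assumptions.
def alpaca_to_messages(example: dict) -> list[dict]:
    user_content = example["instruction"]
    if example.get("input"):
        user_content += "\n\n" + example["input"]
    return [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": example["output"]},
    ]

record = {
    "instruction": "Write a function that reverses a string.",
    "input": "",
    "output": "def reverse(s):\n    return s[::-1]",
}
messages = alpaca_to_messages(record)
```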

## Limitations

- Trained for 1 epoch; more epochs could improve code quality
- The 1,024-token max length means very long code solutions may be truncated during training
- Code correctness is not verified during training (no execution-based feedback)
- Performance varies across languages; Python and JavaScript likely have the most training signal
- LoRA adapter requires the base Qwen2.5-7B-Instruct model for inference
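
The last limitation can be worked around by merging the adapter into the base weights once, yielding a standalone checkpoint that no longer needs `peft` at inference time. A hedged sketch (the output directory name is an assumption):

```python
# Merge the LoRA adapter into the base model and save a standalone checkpoint.
# Requires downloading the full base model; the save path is illustrative.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.bfloat16
)
merged = PeftModel.from_pretrained(
    base, "usama10/qwen-7b-code-instruct"
).merge_and_unload()
merged.save_pretrained("qwen-7b-code-instruct-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct").save_pretrained(
    "qwen-7b-code-instruct-merged"
)
```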