---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- code
- code-generation
- sft
- lora
- qwen
- programming
datasets:
- TokenBender/code_instructions_122k_alpaca_style
pipeline_tag: text-generation
model-index:
- name: qwen-7b-code-instruct
results:
- task:
type: text-generation
name: Code Generation
dataset:
name: Code Instructions 122K
type: TokenBender/code_instructions_122k_alpaca_style
split: train
metrics:
- type: loss
value: 0.507
name: Final Training Loss
---
# Qwen2.5-7B Code Instruct
A **Qwen2.5-7B-Instruct** model fine-tuned with **SFT + LoRA** on [122K code instructions](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style) covering 40+ programming languages. The model generates clean, correct code from natural language descriptions.
## Training Details
| Parameter | Value |
|-----------|-------|
| **Base model** | [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) |
| **Method** | SFT with LoRA (r=32, alpha=64) |
| **Quantization** | None (full bf16) |
| **Dataset** | [TokenBender/code_instructions_122k_alpaca_style](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style) |
| **Training examples** | 119,519 |
| **Hardware** | NVIDIA RTX 5090 (32GB VRAM) |
| **Training time** | ~3.3 hours |
| **Epochs** | 1 |
| **Effective batch size** | 16 (4 per device x 4 gradient accumulation) |
| **Learning rate** | 2e-5 (cosine schedule, 100 warmup steps) |
| **Max sequence length** | 1,024 tokens |
| **Precision** | bf16 |
| **Framework** | TRL 0.29.1 + Transformers 5.3.0 |
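The hyperparameters above can be sketched as a TRL + PEFT training configuration. This is an illustrative reconstruction, not the actual training script: the LoRA target modules are an assumption (attention and MLP projections are a common choice for Qwen-style models), and argument names may differ slightly across TRL versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# LoRA adapter config from the table above; target_modules is an assumption
peft_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen-7b-code-instruct",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size 16
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_length=1024,                 # max sequence length in tokens
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    args=training_args,
    train_dataset=load_dataset(
        "TokenBender/code_instructions_122k_alpaca_style", split="train"
    ),
    peft_config=peft_config,
)
trainer.train()
```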
## Performance
| Metric | Value |
|--------|-------|
| **Starting loss** | 2.10 |
| **Final loss** | **0.46** |
| **Loss reduction** | 78% |
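The reduction figure follows directly from the start and end losses:

```python
start_loss, final_loss = 2.10, 0.46

# relative reduction: (2.10 - 0.46) / 2.10 ≈ 0.781
reduction = (start_loss - final_loss) / start_loss
print(f"{reduction:.0%}")  # 78%
```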
## Training Curves

- **Training Loss**: Sharp drop from 2.1 to ~0.5 within the first 200 steps, then continued gradual improvement
- **Learning Rate**: Cosine decay from 2e-5 to 0
- **Gradient Norm**: Stable around 1.0 throughout training
## Languages Covered
The training dataset spans 40+ programming languages including Python, JavaScript, Java, C++, C#, Go, Rust, TypeScript, SQL, Ruby, PHP, Swift, Kotlin, R, Bash, and more.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "usama10/qwen-7b-code-instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    {"role": "system", "content": "You are an expert programmer. Given a programming task, write clean, correct, and well-commented code."},
    {"role": "user", "content": "Write a Python function that finds the longest common subsequence of two strings."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
## Dataset
The [code_instructions_122k_alpaca_style](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style) dataset contains 122K instruction-output pairs in Alpaca format. Each example has:
- **instruction**: A natural language description of the coding task
- **input**: Optional context or additional information
- **output**: The expected code solution
Examples range from simple utility functions to complex algorithms, data structures, and system design patterns.
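For reference, converting an Alpaca-style record into the chat format used at inference time might look like the helper below. The helper itself is illustrative (not part of the training code); the system prompt is the one from the usage example above.

```python
def alpaca_to_messages(example: dict) -> list[dict]:
    """Convert an Alpaca record (instruction/input/output) to chat messages."""
    user_content = example["instruction"]
    if example.get("input"):  # the optional context field, when present
        user_content += "\n\n" + example["input"]
    return [
        {"role": "system", "content": "You are an expert programmer. Given a "
         "programming task, write clean, correct, and well-commented code."},
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": example["output"]},
    ]

# Example record in Alpaca format
record = {
    "instruction": "Write a function that reverses a string.",
    "input": "",
    "output": "def reverse(s):\n    return s[::-1]",
}
messages = alpaca_to_messages(record)
```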
## Limitations
- Trained for 1 epoch; more epochs could improve code quality
- The 1,024-token max length means very long code solutions may be truncated during training
- Code correctness is not verified during training (no execution-based feedback)
- Performance varies across languages; Python and JavaScript likely have the most training signal
- LoRA adapter requires the base Qwen2.5-7B-Instruct model for inference
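If loading the base model plus adapter at inference time is inconvenient, the adapter can be folded into the base weights once and shipped as a standalone checkpoint. A sketch using PEFT's `merge_and_unload` (the output directory name is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "usama10/qwen-7b-code-instruct")

# Fold the LoRA deltas into the base weights; the result is a plain
# Qwen2.5-7B checkpoint that no longer needs the peft library to load.
merged = model.merge_and_unload()
merged.save_pretrained("qwen-7b-code-instruct-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct").save_pretrained(
    "qwen-7b-code-instruct-merged"
)
```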