# CodeK v3: Qwen2.5-Coder-7B LoRA
A LoRA adapter fine-tuned on CodeK, a synthetic dataset of Python programming tasks written in the style of Andrej Karpathy's open-source code. The model is trained to reason carefully about code: explaining implementations, diagnosing bugs, contrasting correct vs. incorrect versions, and generating multi-hypothesis debugging chains.
Best checkpoint: checkpoint-800 (eval loss: 0.5888)
## Model Details
| Field | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-7B-Instruct |
| Adapter type | LoRA (rank 16, alpha 32, RSLoRA) |
| Target modules | q/k/v/o proj, gate/up/down proj |
| Training loss | response tokens only (prompt tokens masked) |
| Best checkpoint | checkpoint-800 |
| Eval loss | 0.5888 |
| Training hardware | NVIDIA A100 80GB SXM4 |
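
For reference, the adapter settings in the table above correspond roughly to the following PEFT `LoraConfig`. This is a sketch rather than the exact training configuration; `lora_dropout`, `bias`, and `task_type` are assumptions not stated in this card.

```python
from peft import LoraConfig

# Approximate adapter configuration implied by the table above.
# lora_dropout, bias, and task_type are assumptions, not values from this card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    use_rslora=True,  # rank-stabilized LoRA: scales updates by alpha / sqrt(r)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```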
## Training Data
The CodeK v3 dataset combines v2 (398 seeds) and v3 (161 seeds) augmentation pipelines for a total of 559 unique Python tasks across 9 categories:
- Data structures, algorithms, graphs, dynamic programming
- Numerical methods, parsing, concurrency, bit manipulation, compression
Each seed is augmented across up to 5 passes:
| Pass | Type | Description |
|---|---|---|
| Pass 1 | Reasoning | Step-by-step explanation of the correct implementation |
| Pass 2 | Debugging | Single-line surgical bug + model diagnosis (via Codex, 100% coverage) |
| Pass 3 | Contrast | Correct vs. incorrect comparison with explanation |
| Pass 4 | Research loop | Multi-turn investigation of the implementation |
| Pass 5 | Multi-hypothesis | Competing bug hypotheses, ranked by plausibility |
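
To make the pass types concrete, here is a hypothetical Pass 2/Pass 3-style contrast pair in the spirit of the Usage example below; it is illustrative only and not an actual dataset sample:

```python
# Correct implementation: the search interval shrinks on every iteration.
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Single-line surgical bug (Pass 2 style): "lo = mid" instead of "lo = mid + 1"
# can leave the interval unchanged, so the loop may never terminate.
def binary_search_buggy(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid  # the injected bug
        else:
            hi = mid - 1
    return -1
```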
Training split: 6,757 pairs (504 seed-level train tasks). Validation split: 728 pairs (55 seed-level held-out tasks, zero task overlap with train).
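
A minimal sketch of how such a seed-level split can be built (the `seed_id` field name is an assumption about the data schema, not the dataset's actual format):

```python
import random

def seed_level_split(pairs, val_fraction=0.1, rng_seed=0):
    """Split augmented pairs so every pair from a given seed task lands
    entirely in train or entirely in validation (zero task overlap)."""
    seed_ids = sorted({p["seed_id"] for p in pairs})
    random.Random(rng_seed).shuffle(seed_ids)
    n_val = max(1, int(len(seed_ids) * val_fraction))
    val_ids = set(seed_ids[:n_val])
    train = [p for p in pairs if p["seed_id"] not in val_ids]
    val = [p for p in pairs if p["seed_id"] in val_ids]
    return train, val
```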
## Key improvements over the v2 model
- Seed-level val split: the validation set has no task overlap with training, so eval loss is meaningful
- Response-only loss: prompt tokens are masked, so the model is trained only on assistant responses (see the sketch after this list)
- Pass 5: multi-hypothesis bug-reasoning signal (new in v3)
- Pass 2 via Codex: 100% Pass 2 coverage with sharper `change_token` annotations
- `change_token` field: targets the `change_hit` failure mode from the v1/v2 evals
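
A minimal sketch of the response-only masking mechanism, assuming a plain prompt/response pair (the actual training code is not shown in this card). Positions labeled -100 are ignored by the Hugging Face cross-entropy loss, so only assistant tokens contribute to the gradient:

```python
def build_masked_example(tokenizer, prompt_text, response_text):
    # Tokenize the prompt alone and the full prompt + response sequence.
    prompt_ids = tokenizer(prompt_text, add_special_tokens=False)["input_ids"]
    full_ids = tokenizer(prompt_text + response_text, add_special_tokens=False)["input_ids"]

    # Copy the inputs as labels, then mask every prompt position with -100
    # so the loss is computed on response tokens only.
    labels = list(full_ids)
    labels[: len(prompt_ids)] = [-100] * len(prompt_ids)
    return {"input_ids": full_ids, "labels": labels}
```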
## Evaluation
Ground-truth Pass 2 eval on 50 held-out v1 seeds (same seeds used across all versions for apples-to-apples comparison). A prediction passes if it correctly identifies both the function containing the bug and the nature of the change.
| Version | Dataset | LoRA Pass@1 | Base Pass@1 |
|---|---|---|---|
| v0 | 201 seeds, 4 passes | 58% | 64% |
| v1 | 398 seeds, 4 passes | 60% | 62% |
| v3 | 559 seeds, 5 passes | pending | pending |
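
The pass criterion can be read as a two-part check over the eval metadata. The sketch below is illustrative only; the `buggy_function` and `change_token` arguments are assumptions about how the ground truth is stored, not a published grader:

```python
def pass_at_1(prediction: str, buggy_function: str, change_token: str) -> bool:
    """Illustrative check: the answer must name the function containing the
    bug (function hit) and mention the changed token (change hit)."""
    text = prediction.lower()
    function_hit = buggy_function.lower() in text
    change_hit = change_token.lower() in text
    return function_hit and change_hit
```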
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter = "mechramc/codek-qwen2.5-coder-7b-lora-v3"

# Load the base model in bf16 and attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
model.eval()

messages = [
    {"role": "system", "content": "You are a Python debugging expert. When shown code with a bug, identify the exact location and nature of the bug. Be precise and concise."},
    {"role": "user", "content": "The following Python code has a subtle bug. Find it.\n\n```python\ndef binary_search(arr, target):\n    lo, hi = 0, len(arr) - 1\n    while lo <= hi:\n        mid = (lo + hi) // 2\n        if arr[mid] == target:\n            return mid\n        elif arr[mid] < target:\n            lo = mid\n        else:\n            hi = mid - 1\n    return -1\n```"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Greedy decoding; print only the newly generated tokens
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=300, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
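
If you prefer adapter-free inference, the LoRA weights can optionally be merged into the base model with the standard PEFT API (the output directory name below is just an example):

```python
# Merge the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("codek-qwen2.5-coder-7b-v3-merged")
tokenizer.save_pretrained("codek-qwen2.5-coder-7b-v3-merged")
```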
## Framework Versions
- PEFT: 0.18.1
- TRL: 0.24.0
- Transformers: 5.5.0
- PyTorch: 2.6.0
- Unsloth: 2026.4.1
- CUDA: 12.4