Qwen 2.5 Coder (0.5B) - C++ QA Fine-Tuned

This is a fine-tuned version of the Qwen2.5-Coder-0.5B model. It has been instruction-tuned to answer C++ programming questions and to write C++ code.

Model Details

  • Base Model: Qwen/Qwen2.5-Coder-0.5B
  • Parameters: 500 Million
  • Language: English / C++
  • Intended Use: Answering C++ programming questions, generating C++ snippets, and code explanation.

Training Data

This model was fine-tuned on a filtered subset of the sahil2801/CodeAlpaca-20k dataset. The data was filtered to include only examples whose instruction or input references "C++" or "cpp", so that the model focuses on this language.
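The filtering step described above can be sketched as follows. This is a minimal illustration, not the exact script used for training: the regular expression, the case-insensitive matching, and the function name `mentions_cpp` are assumptions, and the sample records only mimic the `instruction` / `input` / `output` layout of CodeAlpaca-20k.

```python
import re

# Hypothetical pattern: matches "C++" or "cpp" in any letter case.
CPP_PATTERN = re.compile(r"c\+\+|cpp", re.IGNORECASE)


def mentions_cpp(example: dict) -> bool:
    """Return True if the instruction or input text references C++ or cpp."""
    text = f"{example.get('instruction', '')} {example.get('input', '')}"
    return CPP_PATTERN.search(text) is not None


# Made-up records in the instruction/input/output layout, for illustration only.
samples = [
    {"instruction": "Write a C++ function to add two numbers.", "input": "", "output": "..."},
    {"instruction": "Sort a list in Python.", "input": "", "output": "..."},
    {"instruction": "Fix the bug in this program.", "input": "// main.cpp\nint main() {}", "output": "..."},
]

# Keep only the records that reference C++ or cpp.
cpp_only = [ex for ex in samples if mentions_cpp(ex)]
print(len(cpp_only))
```

With the Hugging Face datasets library, the same predicate could be passed to `Dataset.filter`, assuming the dataset is loaded with its standard columns.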

Fine-tuning hyperparameters

The model was fine-tuned using the Hugging Face trl library (SFTTrainer) with the following hyperparameters:

  • Optimizer: AdamW
  • Learning Rate: 2e-5
  • Batch Size: 1 (with gradient accumulation of 4)
  • Precision: fp16 (Mixed Precision)
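As a rough sketch, the hyperparameters listed above could be collected into a dictionary whose keys follow the argument names of `transformers.TrainingArguments`. The dictionary itself, the choice of `"adamw_torch"` as the AdamW variant, the helper `effective_batch_size`, and the single-device setup are assumptions made for illustration; the original training script is not shown in this card.

```python
# Hypothetical configuration; key names mirror transformers.TrainingArguments.
hyperparams = {
    "optim": "adamw_torch",            # assumption: the PyTorch AdamW variant
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "fp16": True,                      # mixed-precision training
}


def effective_batch_size(cfg: dict, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * num_devices
    )


# Assuming a single device, each optimizer step sees 1 * 4 = 4 examples.
print(effective_batch_size(hyperparams))
```

Gradient accumulation trades training speed for memory: only one example is held on the device at a time, while gradients are still averaged over four examples before each update.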

How to use

You can load and use this model directly with the transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and the model weights in half precision.
model_id = "VesileHan/Qwen2.5_coder_cpp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16
)

# Wrap the question in the instruction/response prompt format.
question = "How do I reverse a string in C++?"
prompt = f"Below is an instruction that describes a coding task. Write a response that appropriately completes the request.\n\n### Instruction:\n{question}\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is needed for temperature to take effect.
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True))
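The prompt string in the snippet above follows an Alpaca-style instruction/response layout. If you ask several questions, it may help to keep that layout in one place. The sketch below assumes the model expects exactly this template at inference time; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names introduced here, not part of the model's repository.

```python
# Template copied from the usage snippet; only the question is substituted.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a coding task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{question}\n\n### Response:\n"
)


def build_prompt(question: str) -> str:
    """Insert a question (with surrounding whitespace removed) into the template."""
    return PROMPT_TEMPLATE.format(question=question.strip())


prompt = build_prompt("How do I reverse a string in C++?")
print(prompt)
```

The returned string can be passed to the tokenizer in place of the inline f-string.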