Model Summary
Qwen2.5-7B-ChatCoder is a linearly merged language model that combines the instruction-following strength of Qwen2.5-7B-Instruct with the code generation capabilities of Qwen2.5-Coder-7B-Instruct.
The merge uses an 85% instruct / 15% coder weight split, tuned to preserve the chat and reasoning behaviour of the instruct model while absorbing coding knowledge from the coder model, resulting in a single set of weights that handles both natural conversation and code generation.
| Property | Value |
|---|---|
| Parameters | 7.6B |
| Architecture | Qwen2ForCausalLM |
| Context length | 128K tokens |
| Merge method | Linear |
| Instruct weight | 0.85 |
| Coder weight | 0.15 |
| dtype | bfloat16 |
| Vocabulary size | 152,064 |
Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ragunath-ravi/Qwen2.5-7B-ChatCoder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to do binary search."},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        temperature=None,
        top_p=None,
        repetition_penalty=1.1,
        eos_token_id=[151645, 151643],  # <|im_end|>, <|endoftext|>
        pad_token_id=151645,
    )

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(
    output[0][inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)
print(response)
```
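Note that `model.generate` returns the prompt tokens followed by the newly generated ones, which is why the decode step slices from `inputs.input_ids.shape[1]`. A minimal sketch of that slicing with stand-in token IDs:

```python
import torch

# model.generate echoes the prompt: each output row = prompt ids + new ids.
prompt_ids = torch.tensor([[101, 102, 103]])          # stand-in for inputs.input_ids
generated = torch.tensor([[101, 102, 103, 7, 8, 9]])  # stand-in for generate() output

# Slice off the prompt so only the model's reply gets decoded.
new_tokens = generated[0][prompt_ids.shape[1]:]
print(new_tokens.tolist())  # -> [7, 8, 9]
```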
Run with 4-bit Quantization (~5 GB VRAM)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model_id = "ragunath-ravi/Qwen2.5-7B-ChatCoder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```
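To build intuition for what 4-bit quantization does to the weights, here is a toy absmax round-trip sketch. This is illustrative only: bitsandbytes' NF4 uses a fixed codebook derived from the normal distribution plus per-block scaling, not the uniform grid shown here.

```python
import torch

def absmax_quantize_4bit(w: torch.Tensor):
    # Scale weights into [-7, 7] (the usable signed 4-bit range), round,
    # and keep the scale so we can dequantize later.
    scale = w.abs().max() / 7
    q = torch.round(w / scale).clamp(-7, 7).to(torch.int8)  # 4-bit codes in int8 storage
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Reconstruct approximate weights; error is bounded by scale / 2 per element.
    return q.to(torch.float32) * scale

w = torch.tensor([0.9, -0.45, 0.1, 0.0])
q, scale = absmax_quantize_4bit(w)
w_hat = dequantize(q, scale)
print(q.tolist())      # integer codes in [-7, 7]
print(w_hat.tolist())  # reconstruction with small rounding error
```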
Hardware Requirements
| Precision | VRAM Required | Recommended GPU |
|---|---|---|
| bfloat16 (full) | ~16 GB | RTX 3090 / A100 / H100 |
| 8-bit (bitsandbytes) | ~8 GB | RTX 3080 / 4080 |
| 4-bit NF4 (bitsandbytes) | ~5 GB | RTX 3060 / 4060 |
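The VRAM figures above roughly follow from parameter count times bytes per parameter, plus headroom for activations and the KV cache. A back-of-the-envelope check (the 0.55 bytes/param figure for 4-bit, which accounts for quantization scales, is an assumption, not a measured value):

```python
PARAMS = 7.6e9  # parameter count from the summary table

def approx_vram_gib(bytes_per_param: float) -> float:
    """Rough weight-memory estimate in GiB; ignores activations and KV cache."""
    return PARAMS * bytes_per_param / 1024**3

print(f"bf16 : {approx_vram_gib(2.0):.1f} GiB")   # weights alone, before runtime overhead
print(f"8-bit: {approx_vram_gib(1.0):.1f} GiB")
print(f"4-bit: {approx_vram_gib(0.55):.1f} GiB")  # assumed ~0.55 bytes/param incl. NF4 scales
```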
What This Model Is Good At
- Instruction following: multi-turn chat, answering questions, reasoning
- Code generation: Python, JavaScript, C++, Java, SQL, Bash
- Code explanation: walking through what a piece of code does
- Debugging: finding and fixing bugs, with explanations
- Mixed tasks: "explain this code and rewrite it to be more efficient"
- Math reasoning: step-by-step problem solving
Merge Details
This model was created using mergekit with a linear merge strategy.
```yaml
models:
  - model: Qwen/Qwen2.5-7B-Instruct
    parameters:
      weight: 0.85
  - model: Qwen/Qwen2.5-Coder-7B-Instruct
    parameters:
      weight: 0.15
merge_method: linear
dtype: bfloat16
```
Why linear merge? Linear merging computes a direct weighted average of all model weights with no pruning or masking. It is the most stable merge method for combining an instruct-tuned model with a base/specialist model, avoiding the weight corruption that DARE-TIES can introduce when density < 1.0.
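The linear merge itself reduces to a per-tensor weighted average over matching parameters. A minimal sketch of what mergekit computes under the hood (illustrative; mergekit also handles tokenizer alignment, sharded checkpoints, and dtype casting):

```python
import torch

def linear_merge(state_a: dict, state_b: dict, w_a: float = 0.85, w_b: float = 0.15) -> dict:
    """Weighted average of two state dicts with identical keys and shapes."""
    assert state_a.keys() == state_b.keys()
    return {k: w_a * state_a[k] + w_b * state_b[k] for k in state_a}

# Toy 2x2 "layers" standing in for the instruct and coder checkpoints.
instruct = {"layer.weight": torch.ones(2, 2)}
coder = {"layer.weight": torch.zeros(2, 2)}
merged = linear_merge(instruct, coder)
print(merged["layer.weight"])  # every entry is 0.85 = 0.85 * 1 + 0.15 * 0
```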
Why 85/15 split? At higher coder weights (0.4 was tested), the coder model's behaviour dominates and general instruction following degrades. At 0.15, the coding knowledge is absorbed while the instruct model's chat behaviour remains intact.
Limitations
- Capabilities are bounded by the two parent models; this is a weight merge, not a newly trained model
- Very long code generation (>500 lines) may degrade in quality
- Not fine-tuned for agent/tool-use tasks
- May occasionally produce confident but incorrect code; always test generated code
Citation
If you use this model, please cite the original Qwen2.5 models:
```bibtex
@misc{qwen2.5,
  title  = {Qwen2.5: A Party of Foundation Models},
  author = {Qwen Team},
  year   = {2024},
  url    = {https://qwenlm.github.io/blog/qwen2.5/}
}
```
Created By
Merged and released by ragunath-ravi.
Built with mergekit by Arcee AI.