	---
	license: apache-2.0
	base_model: Qwen/Qwen3.5-27B
	tags:
	- code
	- lora
	- fine-tuned
	- qwen3.5
	- coding
	- python
	- javascript
	- rust
	datasets:
	- ise-uiuc/Magicoder-Evol-Instruct-110K
	- sahil2801/CodeAlpaca-20k
	- Vezora/Tested-143k-Python-Alpaca
	- iamtarun/python_code_instructions_18k_alpaca
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	---

	# Qwen3.5-27B-Coder

	A LoRA fine-tune of [Qwen/Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B), specialized for coding tasks.

	## Training Details

	| Parameter | Value |
	|---|---|
	| Base model | Qwen/Qwen3.5-27B (27B dense, Apache 2.0) |
	| Method | LoRA r=64, alpha=128, all linear projections |
	| Precision | BF16 |
	| Framework | Hugging Face SFTTrainer + PEFT + DeepSpeed ZeRO-2 |
	| Hardware | 16× NVIDIA H200 SXM (141 GB each), 2 nodes |
	| GPU utilization | 91% VRAM, 91-100% compute |
	| Training steps | 250 (stopped early after the loss plateaued) |
	| Training time | ~4 hours |
	| Final loss | 0.70 (down from 1.13, about -38%) |
	| Final token accuracy | 80.0% |
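
To make the LoRA settings in the table concrete, here is a small arithmetic sketch of what `r=64` and `alpha=128` imply: the scaling factor applied to the low-rank update, and the number of trainable parameters added per adapted linear layer. The layer dimensions are hypothetical and chosen only for illustration; they are not read from the model's configuration.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Factor applied to the low-rank update: delta_W = (alpha / r) * B @ A."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer of shape (d_out, d_in).

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


# Values from the training table.
R, ALPHA = 64, 128
print(lora_scaling(R, ALPHA))  # 2.0

# Hypothetical layer size (assumption, for illustration only).
D_IN = D_OUT = 4096
added = lora_params(D_IN, D_OUT, R)
full = D_IN * D_OUT
print(added, f"{added / full:.1%} of the frozen weight matrix")
```

With these assumed dimensions, each adapted layer trains a few percent as many parameters as the frozen matrix it wraps, which is why the adapter stays small even when every linear projection is targeted.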

	## Datasets

	| Dataset | Examples | Purpose |
	|---|---|---|
	| Magicoder-Evol-Instruct-110K | 110K | Complex coding tasks evolved with Evol-Instruct |
	| CodeAlpaca-20K | 20K | Short tasks, broad language coverage |
	| Tested-143k-Python-Alpaca | 143K | Execution-verified Python code |
	| python_code_instructions_18k | 18K | Python idioms and patterns |
	| Total | 291K | |
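
Mixing several instruction datasets for supervised fine-tuning usually means normalizing them into one chat format first. The sketch below shows one way that could be done. It assumes Alpaca-style keys (`instruction`, optional `input`, `output`) with a fallback to `response`; those field names are assumptions about the dataset schemas, and this is not necessarily the preprocessing used for this model.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def to_messages(record: Dict[str, Optional[str]]) -> List[Message]:
    """Convert one instruction-tuning record into a two-turn chat.

    Assumes Alpaca-style keys ("instruction", optional "input", "output"),
    falling back to "response" when "output" is missing or empty.
    """
    prompt = (record.get("instruction") or "").strip()
    extra = (record.get("input") or "").strip()
    if extra:
        # Append the optional input (e.g. code to refactor) below the instruction.
        prompt = f"{prompt}\n\n{extra}"
    answer = (record.get("output") or record.get("response") or "").strip()
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": answer},
    ]


# Hypothetical records, for illustration only.
alpaca_style = {"instruction": "Reverse the list.", "input": "[1, 2, 3]", "output": "[3, 2, 1]"}
response_style = {"instruction": "Say hi in Python.", "response": "print('hi')"}

print(to_messages(alpaca_style))
print(to_messages(response_style))
```

The resulting message lists can then be rendered with the tokenizer's chat template, so every dataset reaches the trainer in the same shape.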

	## Usage

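	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	MODEL_ID = "mahernaija/Qwen3.5-27B-Coder"

	# Load the weights in BF16 and spread them across the available GPUs.
	model = AutoModelForCausalLM.from_pretrained(
	    MODEL_ID,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

	messages = [{"role": "user", "content": "Write a Python binary search function with type hints."}]
	# Render the conversation with the chat template, ending at the assistant turn.
	text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	# Move inputs to the model's device instead of hard-coding "cuda".
	inputs = tokenizer(text, return_tensors="pt").to(model.device)
	# temperature only takes effect when sampling is enabled.
	outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.2)
	# Decode only the newly generated tokens, not the echoed prompt.
	new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
	print(tokenizer.decode(new_tokens, skip_special_tokens=True))
	```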
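
When the generated text is fed into tooling rather than read by a person, it often helps to pull out just the code. The helper below is a minimal sketch that assumes the model wraps code in Markdown fences, which is common for chat-tuned models but not guaranteed. The function name and the sample reply are hypothetical.

```python
import re
from typing import List

# Built at runtime so this example nests cleanly inside a Markdown document.
FENCE = "`" * 3

# Matches a fenced block with an optional language tag and captures its body.
_CODE_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(response: str) -> List[str]:
    """Return the bodies of all fenced code blocks found in a model response."""
    return [body.rstrip("\n") for body in _CODE_BLOCK.findall(response)]


# Hypothetical model reply, for illustration only.
reply = (
    "Here is the function:\n"
    f"{FENCE}python\n"
    "def add(a, b):\n"
    "    return a + b\n"
    f"{FENCE}\n"
    "Done."
)
print(extract_code_blocks(reply))
```

If a reply contains no fenced block, the helper returns an empty list, so callers can fall back to the raw text.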

	## Evaluation

	Comparison of the fine-tuned model against the base model on 10 coding prompts:
	- 7/10 prompts: the fine-tuned model responds faster and more concisely
	- Refactoring prompts: 70% faster response
	- Testing prompts: 59% faster response
	- Training loss: about 38% lower at the end of training than at the start (1.13 to 0.70)
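
The "faster" figures above can be read as a relative reduction in response time against the base model. The sketch below shows that calculation under this assumption. The card does not state the exact formula, and the timings used here are hypothetical, not measurements from this model.

```python
def percent_faster(base_seconds: float, tuned_seconds: float) -> float:
    """Relative reduction in response time versus the base model, in percent."""
    if base_seconds <= 0:
        raise ValueError("base_seconds must be positive")
    return (base_seconds - tuned_seconds) / base_seconds * 100.0


# Hypothetical timings (assumption, for illustration only).
base_time, tuned_time = 8.0, 2.0
print(f"{percent_faster(base_time, tuned_time):.1f}% faster")  # 75.0% faster
```

Under this definition, a response that takes a quarter of the base model's time counts as 75% faster, so the metric can never exceed 100%.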

	## Training Infrastructure

	Trained on Nebius.ai cloud using Soperator (Kubernetes-managed Slurm):
	- 2 nodes × 8 NVIDIA H200 SXM GPUs
	- InfiniBand 400 Gb/s inter-node communication
	- DeepSpeed ZeRO-2 for optimizer/gradient sharding
	- Gradient checkpointing with `use_reentrant=False`
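
For readers who want to reproduce a similar setup, the following is a minimal sketch of a DeepSpeed ZeRO-2 configuration matching the items listed above. It is not the author's actual config file. The `"auto"` values rely on the Hugging Face Trainer integration to fill them in from its own training arguments, and the choice of `overlap_comm` and `contiguous_gradients` is an assumption.

```python
import json

# Minimal ZeRO-2 config sketch (assumed values, not the config used for this model).
ds_config = {
    "bf16": {"enabled": True},  # BF16 precision, as stated in the training table
    "zero_optimization": {
        "stage": 2,  # shard optimizer states and gradients across GPUs
        "overlap_comm": True,  # assumption: overlap gradient reduction with backward
        "contiguous_gradients": True,  # assumption: reduce memory fragmentation
    },
    # "auto" lets the Hugging Face Trainer integration supply these values.
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto",
}

# 2 nodes x 8 GPUs, from the infrastructure list.
WORLD_SIZE = 2 * 8

config_json = json.dumps(ds_config, indent=2)
print(config_json)
print("world size:", WORLD_SIZE)
```

Saved to a file, a config like this can be passed to `TrainingArguments` through its `deepspeed` argument, with gradient checkpointing enabled via `gradient_checkpointing=True` and `gradient_checkpointing_kwargs={"use_reentrant": False}`.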

	## Limitations

	- Primarily optimized for Python (70% of the training data)
	- Other languages (JavaScript, Rust, Go) improve too, but less than Python
	- Not trained on repository-level tasks (SWE-bench style)
	- Best suited to function- and class-level code generation and bug fixing

	## License

	Apache 2.0 (same as base model)