---
license: apache-2.0
base_model: Qwen/Qwen3.5-27B
tags:
- code
- lora
- fine-tuned
- qwen3.5
- coding
- python
- javascript
- rust
datasets:
- ise-uiuc/Magicoder-Evol-Instruct-110K
- sahil2801/CodeAlpaca-20k
- Vezora/Tested-143k-Python-Alpaca
- iamtarun/python_code_instructions_18k_alpaca
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3.5-27B-Coder

Fine-tuned version of [Qwen/Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B) specialized for coding tasks.

## Training Details

| Parameter | Value |
|---|---|
| **Base model** | Qwen/Qwen3.5-27B (27B dense, Apache 2.0) |
| **Method** | LoRA r=64, alpha=128, all-linear projections |
| **Precision** | BF16 |
| **Framework** | HuggingFace SFTTrainer + PEFT + DeepSpeed ZeRO-2 |
| **Hardware** | 16× NVIDIA H200 SXM (141 GB each), 2 nodes |
| **GPU utilization** | 91% VRAM, 91-100% compute |
| **Training steps** | 250 (stopped early once the loss plateaued) |
| **Training time** | ~4 hours |
| **Final loss** | 0.70 (down from 1.13, about -38%) |
| **Final accuracy** | 80.0% token accuracy |

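To make the LoRA settings in the table concrete, the sketch below works through what `r=64, alpha=128` implies under the standard LoRA formulation: the low-rank update is scaled by `alpha / r`, and each adapted linear layer adds `r * (d_in + d_out)` trainable parameters. The 5120×5120 layer shape is a hypothetical value chosen for illustration, not the actual dimensions of Qwen3.5-27B.

```python
# Standard LoRA: W' = W + (alpha / r) * B @ A, with A: (r, d_in) and B: (d_out, r).
R = 64        # rank, from the training table
ALPHA = 128   # alpha, from the training table


def lora_scaling(alpha: int, r: int) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return alpha / r


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer (A plus B)."""
    return r * (d_in + d_out)


# Hypothetical layer shape, for illustration only.
D_IN, D_OUT = 5120, 5120

full = D_IN * D_OUT
adapter = lora_trainable_params(D_IN, D_OUT, R)

print(f"scaling factor: {lora_scaling(ALPHA, R)}")
print(f"adapter params: {adapter:,} vs full layer: {full:,} ({adapter / full:.1%})")
```

With these assumed dimensions, the adapter trains a few percent of the weights a full fine-tune of the same layer would update.
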
## Datasets

| Dataset | Examples | Purpose |
|---|---|---|
| Magicoder-Evol-Instruct-110K | 110K | Complex coding tasks evolved with Evol-Instruct |
| CodeAlpaca-20k | 20K | Short tasks, broad language coverage |
| Tested-143k-Python-Alpaca | 143K | Execution-verified Python code |
| python_code_instructions_18k_alpaca | 18K | Python idioms and patterns |
| **Total** | **291K** | |

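The mix can be checked with a few lines of arithmetic. The sketch below recomputes the total and each dataset's share from the rounded counts in the table. Treating the two datasets with "Python" in their names as Python-only is an assumption made here, not something the card states.

```python
# Example counts in thousands, copied from the datasets table (rounded).
DATASETS = {
    "Magicoder-Evol-Instruct-110K": 110,
    "CodeAlpaca-20k": 20,
    "Tested-143k-Python-Alpaca": 143,
    "python_code_instructions_18k_alpaca": 18,
}

# Assumption: these two datasets contain only Python examples.
PYTHON_ONLY = {"Tested-143k-Python-Alpaca", "python_code_instructions_18k_alpaca"}

total = sum(DATASETS.values())
for name, count in DATASETS.items():
    print(f"{name:<40} {count:>4}K  {count / total:6.1%}")

python_only = sum(DATASETS[name] for name in PYTHON_ONLY)
print(f"total: {total}K, Python-only datasets: {python_only}K ({python_only / total:.1%})")
```

Under that assumption the Python-only datasets account for a little over half of the mix, so the 70% Python share quoted under Limitations would require roughly a third of the two mixed-language datasets to be Python as well.
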
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mahernaija/Qwen3.5-27B-Coder"

# Load in BF16 and let device_map="auto" place the weights on the available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "Write a Python binary search function with type hints."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to the model's device instead of hard-coding "cuda".
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.2)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

## Evaluation

The fine-tuned model was compared with the base model on 10 coding prompts:
- **7/10 prompts**: the fine-tuned model produces faster, more concise responses
- **Refactoring**: 70% faster response
- **Testing**: 59% faster response
- **Training loss**: about 38% lower than at the start of training (1.13 → 0.70)

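The card does not say how response speed was measured. As a rough sketch of one way to reproduce such a comparison, the helper below assumes "X% faster" means an X% reduction in wall-clock time per prompt. `time_call`, `percent_faster`, and the two stand-in generators are hypothetical names introduced here for illustration.

```python
import time
from typing import Callable


def time_call(generate: Callable[[str], str], prompt: str) -> float:
    """Return wall-clock seconds taken by one call to generate(prompt)."""
    start = time.perf_counter()
    generate(prompt)
    return time.perf_counter() - start


def percent_faster(base_seconds: float, tuned_seconds: float) -> float:
    """Percent reduction in wall-clock time relative to the base model."""
    return 100.0 * (base_seconds - tuned_seconds) / base_seconds


# Hypothetical stand-ins; replace with functions that wrap each model's generate call.
def fake_base(prompt: str) -> str:
    time.sleep(0.05)
    return "base answer"


def fake_tuned(prompt: str) -> str:
    time.sleep(0.01)
    return "tuned answer"


prompt = "Refactor this function to remove duplication."
base_s = time_call(fake_base, prompt)
tuned_s = time_call(fake_tuned, prompt)
print(f"fine-tuned is {percent_faster(base_s, tuned_s):.0f}% faster on this prompt")
```

For a fair comparison, both models should see the same prompts, the same `max_new_tokens`, and the same hardware, since a shorter answer finishes sooner even when per-token speed is unchanged.
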
## Training Infrastructure

Trained on Nebius.ai cloud using Soperator (Kubernetes-managed Slurm):
- 2 nodes × 8 NVIDIA H200 SXM GPUs
- InfiniBand 400 Gb/s inter-node communication
- DeepSpeed ZeRO-2 for optimizer/gradient sharding
- Gradient checkpointing with `use_reentrant=False`

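For orientation, a ZeRO-2 setup like the one listed above might be expressed as the following DeepSpeed config. This is a minimal sketch, not the configuration used for this run: the `"auto"` placeholders rely on the Hugging Face Trainer integration filling them in from its own arguments, and the `overlap_comm` and `contiguous_gradients` choices are assumptions.

```python
import json

# Minimal ZeRO stage 2 config: optimizer states and gradients are sharded
# across data-parallel ranks; model parameters stay replicated.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,           # assumption: overlap reduction with backward
        "contiguous_gradients": True,   # assumption: reduce memory fragmentation
    },
    # "auto" values are resolved by the Hugging Face Trainer integration.
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "train_batch_size": "auto",
    "gradient_clipping": "auto",
}

NODES, GPUS_PER_NODE = 2, 8
world_size = NODES * GPUS_PER_NODE  # 16 data-parallel ranks

print(f"world size: {world_size}")
print(json.dumps(ds_config, indent=2))
```

With the Trainer, a config like this is typically passed through the `deepspeed` training argument, and non-reentrant checkpointing through `gradient_checkpointing_kwargs={"use_reentrant": False}`.
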
## Limitations

- Primarily optimized for Python (70% of training data)
- Other languages (JavaScript, Rust, Go) also improved, but less than Python
- Not trained on repo-level tasks (SWE-bench style)
- Best suited to function- and class-level code generation and bug fixing

## License

Apache 2.0 (same as base model)
