Qwen2.5-Coder-32B-FIM
A LoRA fine-tuned adapter for Qwen/Qwen2.5-Coder-32B, specialized for Fill-in-the-Middle (FIM) code completion on Rust codebases.
Model Details
- Base Model: Qwen/Qwen2.5-Coder-32B
- Fine-tuning Method: LoRA (Low-Rank Adaptation) with 4-bit quantization
- Training Data: AST-extracted code spans from the reth Rust codebase
- Task: Fill-in-the-Middle code completion (predict missing code between prefix and suffix)
Training Configuration
| Parameter | Value |
|---|---|
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit (nf4, double quantization) |
| Learning Rate | 2e-5 |
| Epochs | 3 |
| Optimizer | paged_adamw_8bit |
| LR Scheduler | cosine |
| Max Sequence Length | 4096 |
| Precision | bfloat16 |
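As a rough illustration of what the rank and alpha values in the table imply, the sketch below computes the conventional LoRA scaling factor (alpha divided by rank) and the number of parameters a rank-`r` adapter adds to a single weight matrix. The helper names and the 4096 x 4096 matrix shape are hypothetical and chosen only for the arithmetic; they are not taken from the base model's configuration.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Conventional LoRA scaling applied to the low-rank update (alpha / r)."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 64, 128  # values from the table above

scale = lora_scaling(ALPHA, RANK)
# Hypothetical square projection, used only to show the arithmetic.
example = lora_params(d_in=4096, d_out=4096, rank=RANK)

print(f"scaling factor: {scale}")
print(f"adapter parameters for a 4096x4096 matrix: {example}")
```

With these settings the low-rank update is scaled by 2, assuming the standard alpha/rank rule rather than a rank-stabilized variant.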
FIM Format
This model uses the Qwen FIM token format:
<|fim_prefix|>[code before cursor]<|fim_suffix|>[code after cursor]<|fim_middle|>[generated completion]
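A minimal sketch of how a prompt in this format might be assembled and how the completion can be separated from raw generated text. The helper names `build_fim_prompt` and `extract_completion` are illustrative and not part of any library.

```python
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prompt in prefix, suffix, middle order."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_completion(generated: str) -> str:
    """Return the text after the last <|fim_middle|> marker.

    If the marker is absent, the input is returned unchanged.
    """
    _, _, completion = generated.rpartition(FIM_MIDDLE)
    return completion


prompt = build_fim_prompt("fn add(a: i32, b: i32) -> i32 {\n    ", "\n}")
print(prompt)
```

The model is expected to continue the text after `<|fim_middle|>`, so everything following that marker is the proposed fill for the gap.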
Usage
With PEFT
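import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# 4-bit NF4 with double quantization, matching the training configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B",
    device_map="auto",
    quantization_config=bnb_config,
)
model = PeftModel.from_pretrained(base_model, "viplismism/Qwen2.5-Coder-32B-FIM")
tokenizer = AutoTokenizer.from_pretrained("viplismism/Qwen2.5-Coder-32B-FIM")

prompt = "<|fim_prefix|>fn add(a: i32, b: i32) -> i32 {\n <|fim_suffix|>\n}<|fim_middle|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2)
# Decode only the newly generated tokens, i.e. the predicted middle
completion = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(completion, skip_special_tokens=True))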
With Ollama
After merging the adapter into the base model and converting the merged model to GGUF:
ollama create qwen2.5-coder-32b-fim -f Modelfile
ollama run qwen2.5-coder-32b-fim
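The merge step that precedes these commands could look roughly like the sketch below, which uses PEFT's `merge_and_unload` to fold the LoRA weights into a bfloat16 copy of the base model. The function name `merge_adapter` and the output directory are hypothetical, and this is not necessarily how the published artifacts were produced. Converting the saved checkpoint to GGUF is a separate step (for example with the converter shipped with llama.cpp) and is not shown.

```python
def merge_adapter(base_id: str, adapter_id: str, out_dir: str) -> None:
    """Fold a LoRA adapter into its base model and save the merged weights."""
    # Heavy dependencies are imported lazily so this file loads without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model unquantized: merging into 4-bit weights is lossy.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(out_dir)


# Example call (downloads the full 32B base model, so it is left commented out):
# merge_adapter(
#     "Qwen/Qwen2.5-Coder-32B",
#     "viplismism/Qwen2.5-Coder-32B-FIM",
#     "./merged-model",  # hypothetical output directory
# )
```

Loading a 32B model in bfloat16 needs roughly 64 GB of memory for the weights alone, so the merge is usually done on a machine with ample RAM rather than on the inference host.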
Training Framework
This model was trained using fim-coder-model, a pipeline that:
- Extracts semantic code boundaries (functions, structs, impl blocks) via Rust AST parsing
- Generates FIM training samples with proper prefix/suffix/middle splits
- Fine-tunes with LoRA + 4-bit quantization for efficient multi-GPU training
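The sample-generation step can be sketched as follows, assuming an AST pass has already produced the character offsets of a span to mask out. `FimSample` and `make_fim_sample` are hypothetical names used for illustration and are not taken from the fim-coder-model code.

```python
from dataclasses import dataclass


@dataclass
class FimSample:
    prefix: str
    middle: str
    suffix: str

    def to_training_text(self) -> str:
        """Serialize in the Qwen FIM layout, with the target span last."""
        return (
            f"<|fim_prefix|>{self.prefix}"
            f"<|fim_suffix|>{self.suffix}"
            f"<|fim_middle|>{self.middle}"
        )


def make_fim_sample(source: str, start: int, end: int) -> FimSample:
    """Split source at character offsets [start, end) into prefix/middle/suffix."""
    if not 0 <= start < end <= len(source):
        raise ValueError("span must satisfy 0 <= start < end <= len(source)")
    return FimSample(source[:start], source[start:end], source[end:])


source = "fn add(a: i32, b: i32) -> i32 {\n    a + b\n}\n"
body = "a + b"
# Stand-in for an AST-derived span: locate the function body by search.
start = source.index(body)
sample = make_fim_sample(source, start, start + len(body))
print(sample.to_training_text())
```

Rust parsers typically report byte offsets, so a real pipeline would need to slice the UTF-8 bytes or convert offsets before splitting; the sketch sidesteps this by using ASCII-only source.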
License
MIT