Qwen2.5-Coder-1.5B-LF-FIM-Heavy

Finetuned from Qwen/Qwen2.5-Coder-1.5B.

The model is finetuned on ~188k samples of high-quality code, with FIM (fill-in-the-middle) data making up most of the mix, which makes it well suited for code autocompletion.

HumanEval-Infilling (multi-line)

  • pass@1 = 53.23%
  • pass@10 = 62.62%
  • pass@20 = 64.35%

Since the evaluation script arranges Qwen's FIM tokens in prefix, then suffix, then middle order, this is PSM-style evaluation.
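
For reference, a PSM-style prompt built with Qwen2.5-Coder's FIM special tokens looks roughly like this (a minimal sketch; the prefix/suffix pair is hypothetical and the harness's exact prompt construction may differ):

```python
# Hypothetical prefix/suffix pair for illustration.
prefix = "def fib(n):\n    if n < 2:\n        return n\n    "
suffix = "\n\nprint(fib(10))\n"

# PSM order: prefix, then suffix, then the middle marker the model completes after.
fim_prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
```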

Benchmark

  • HumanEval-Infilling (single-line)
  • Tasks: 1033
  • Samples/task: 20
  • Metric: pass@k (functional correctness; see the estimator sketch below)
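
With 20 samples per task, pass@k is presumably computed with the standard unbiased estimator from the HumanEval/Codex paper; a minimal sketch (the harness's exact implementation is an assumption):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k draws from
    n generated samples (c of which pass the tests) is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: a task with 12 passing samples out of 20.
print(pass_at_k(n=20, c=12, k=10))
```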

Results

  • pass@1: finetuned=85.48%, base=64.63%, delta=20.85%, 95% CI=[18.27%, 23.42%]
  • pass@10: finetuned=90.58%, base=74.48%, delta=16.11%, 95% CI=[13.59%, 18.75%]
  • pass@20: finetuned=91.58%, base=75.90%, delta=15.68%, 95% CI=[12.97%, 18.30%]

Setup (single-line)

  • temperature=0.2, top_p=0.95, max_new_tokens=128
  • batched decoding (batch_size=16)
  • same evaluation harness/config for both models (generation sketch below)
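
A minimal batched-generation sketch with these sampling settings (the model id is taken from this card; the actual harness code is not published here, so treat this as an assumption):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jolovicdev/Qwen2.5-Coder-1.5B-LF-FIM-Heavy"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A batch of 16 FIM prompts built as shown above (hypothetical content).
prompts = ["<|fim_prefix|>def add(a, b):\n    <|fim_suffix|>\n<|fim_middle|>"] * 16

tokenizer.padding_side = "left"  # left-pad so generation continues from the prompt
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    max_new_tokens=128,
)
# Strip the prompt tokens and decode only the generated middle.
completions = tokenizer.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
```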

Multi-line infilling performance is competitive with larger open models.

Safetensors · Model size: 2B params · Tensor type: BF16

Model tree for jolovicdev/Qwen2.5-Coder-1.5B-LF-FIM-Heavy: Qwen/Qwen2.5-1.5B → Qwen/Qwen2.5-Coder-1.5B → this model