---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- code
- qwen
- fine-tuned
- qlora
language:
- en
pipeline_tag: text-generation
---

# Bently Coder 7B

A fine-tuned coding model based on [Qwen 2.5 Coder 7B Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct), trained on personal GitHub repositories using QLoRA.

## Results

| Benchmark | Base Qwen 2.5 Coder 7B | Bently Coder v1 | Improvement |
|-----------|------------------------|-----------------|-------------|
| BigCodeBench Hard | 40% | **92%** | +52pp |
| HumanEval | 50% | **86%** | +36pp |

**+52 percentage points over the base model on BigCodeBench Hard.**

## Key Findings

- **Your own code works best** — training exclusively on personal repos outperformed mixed datasets that included popular open-source code
- **2 epochs is optimal** — more epochs caused overfitting (4 epochs dropped to 66%)
- **Quality > quantity** — 7k samples from personal repos beat 15k mixed samples

## Usage

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Bentlybro/bently-coder-7b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Bentlybro/bently-coder-7b")

prompt = "### Instruction:\nWrite a Python function to reverse a linked list\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Ollama

Convert to GGUF and create a Modelfile, or download quantized versions (if available).

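As a sketch of that route, a minimal Modelfile might look like the following; the GGUF filename and prompt template here are illustrative assumptions, not published artifacts:

```
# Hypothetical Modelfile: assumes a local GGUF conversion named
# bently-coder-7b.gguf (e.g. produced with llama.cpp's convert_hf_to_gguf.py)
FROM ./bently-coder-7b.gguf
PARAMETER temperature 0.2
TEMPLATE """### Instruction:
{{ .Prompt }}

### Response:
"""
```

Then `ollama create bently-coder -f Modelfile` followed by `ollama run bently-coder`.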
## Training Details

- **Base model:** Qwen/Qwen2.5-Coder-7B-Instruct
- **Method:** QLoRA (4-bit quantization)
- **Epochs:** 2
- **Hardware:** RTX 3060 12GB
- **Dataset:** ~7,000 instruction-code pairs from personal GitHub repos
- **Task distribution:** write (~51%), complete (~17%), explain (~15%), refactor (~10%), document (~4%)

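For a rough sense of scale, the percentages above translate into per-task sample counts; the counts below are rounded estimates, and since the listed percentages are approximate they sum to slightly under 7,000:

```python
# Approximate per-task sample counts, derived from the ~7,000-pair dataset
# and the task-distribution percentages listed above.
total_samples = 7000
distribution = {
    "write": 0.51,
    "complete": 0.17,
    "explain": 0.15,
    "refactor": 0.10,
    "document": 0.04,
}
counts = {task: round(total_samples * frac) for task, frac in distribution.items()}
print(counts)
```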
## Limitations

This model is fine-tuned on a single developer's coding style. It may:

- Prefer certain patterns, naming conventions, or structures specific to that style
- Perform differently on codebases with vastly different conventions

## Training Code

Full training pipeline available at: [github.com/Bentlybro/bently-coder-llm](https://github.com/Bentlybro/bently-coder-llm)

## License

Apache 2.0 (same as the base Qwen model)