---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- code
- qwen
- fine-tuned
- qlora
language:
- en
pipeline_tag: text-generation
---

# Bently Coder 7B

A fine-tuned coding model based on [Qwen 2.5 Coder 7B Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct), trained on personal GitHub repositories using QLoRA.

## Results

| Benchmark | Base Qwen 2.5 Coder 7B | Bently Coder v1 | Improvement |
|-----------|------------------------|-----------------|-------------|
| BigCodeBench Hard | 40% | **92%** | +52pp |
| HumanEval | 50% | **86%** | +36pp |

**+52 percentage points over the base model on BigCodeBench Hard.**

## Key Findings

- **Your own code works best** — training exclusively on personal repos outperformed mixed datasets that included popular open-source code
- **2 epochs is optimal** — more epochs caused overfitting (4 epochs dropped to 66%)
- **Quality > quantity** — 7k samples from personal repos beat 15k mixed samples

## Usage

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Bentlybro/bently-coder-7b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Bentlybro/bently-coder-7b")

prompt = "### Instruction:\nWrite a Python function to reverse a linked list\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Ollama

Convert to GGUF and create a Modelfile, or download quantized versions (if available).

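As a sketch of that route, a minimal Modelfile might look like the following; the GGUF filename and prompt template here are illustrative assumptions, not published artifacts:

```
# Hypothetical Modelfile: assumes a local GGUF conversion named
# bently-coder-7b.gguf (e.g. produced with llama.cpp's convert_hf_to_gguf.py)
FROM ./bently-coder-7b.gguf
PARAMETER temperature 0.2
TEMPLATE """### Instruction:
{{ .Prompt }}

### Response:
"""
```

Then `ollama create bently-coder -f Modelfile` followed by `ollama run bently-coder`.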
## Training Details

- **Base model:** Qwen/Qwen2.5-Coder-7B-Instruct
- **Method:** QLoRA (4-bit quantization)
- **Epochs:** 2
- **Hardware:** RTX 3060 12GB
- **Dataset:** ~7,000 instruction-code pairs from personal GitHub repos
- **Task distribution:** write (~51%), complete (~17%), explain (~15%), refactor (~10%), document (~4%)

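For a rough sense of scale, the percentages above translate into per-task sample counts; the counts below are rounded estimates, and since the listed percentages are approximate they sum to slightly under 7,000:

```python
# Approximate per-task sample counts, derived from the ~7,000-pair dataset
# and the task-distribution percentages listed above.
total_samples = 7000
distribution = {
    "write": 0.51,
    "complete": 0.17,
    "explain": 0.15,
    "refactor": 0.10,
    "document": 0.04,
}
counts = {task: round(total_samples * frac) for task, frac in distribution.items()}
print(counts)
```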
## Limitations

This model is fine-tuned on a single developer's coding style. It may:

- Prefer certain patterns, naming conventions, or structures specific to that style
- Perform differently on codebases with vastly different conventions

## Training Code

Full training pipeline available at: [github.com/Bentlybro/bently-coder-llm](https://github.com/Bentlybro/bently-coder-llm)

## License

Apache 2.0 (same as the base Qwen model)