---
base_model: Qwen/Qwen2.5-Coder-1.5B
tags:
- lora
- sft
- code
- python
- instruction-tuning
license: apache-2.0
---
# Track B SFT – Qwen2.5-Coder-1.5B + LoRA
Qwen2.5-Coder-1.5B fine-tuned with LoRA on ~257 synthetic coding instruction pairs generated from the [verl](https://github.com/volcengine/verl) corpus.
## Results
| Metric | Baseline | Post-SFT | Δ |
|--------|----------|----------|---|
| pass@1 | 0.565 | **0.804** | +0.239 |
| pass@3 | 0.783 | 0.848 | +0.065 |
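The evaluation harness behind these numbers is not specified in this card, but pass@k figures like the ones above are conventionally computed with the standard unbiased estimator from Chen et al. (2021); a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn (without replacement) from n generations passes,
    given that c of the n generations passed."""
    if n - c < k:
        # Fewer failures than draws: every k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 2 of 4 generations pass: pass@1 = 1 - C(2,1)/C(4,1) = 0.5
```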
## Training
- **Base model:** `Qwen/Qwen2.5-Coder-1.5B`
- **Method:** LoRA (r=16, alpha=32)
- **Data:** `archit11/track_b_sft` (~257 train examples)
- **Epochs:** 3
- **Learning rate:** 2e-4
- **Hardware:** single T4 GPU
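The adapter settings above map onto a `peft` `LoraConfig` roughly as follows (a sketch: `target_modules` and `lora_dropout` are assumptions, not stated in this card):

```python
from peft import LoraConfig

# r and lora_alpha match the hyperparameters listed above;
# target_modules and lora_dropout are assumptions (not stated in this card).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,  # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```

This config would be passed to `peft.get_peft_model` (or a trainer that accepts it) before running SFT.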
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, apply the LoRA adapter, and merge it in
# so the result behaves like a plain causal LM.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-1.5B", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "archit11/track_b_sft_model").merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained("archit11/track_b_sft_model")

# Example generation:
prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```