	---
	license: mit
	---

	# Model Card


	# Qwen2-0.5B-Python-SFT (LoRA)

	## Overview

	This model is a Supervised Fine-Tuned (SFT) version of Qwen/Qwen2-0.5B, adapted for Python instruction-following tasks.

	The fine-tuning was performed using QLoRA (4-bit quantization + LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.

	This repository contains LoRA adapter weights, not the full base model.


	## Base Model

	* Base: `Qwen/Qwen2-0.5B`
	* Architecture: Decoder-only Transformer
	* Parameters: 0.5B
	* License: Refer to original Qwen license

	The base model must be loaded separately.


	## Training Dataset

	* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
	* Size: ~18,000 instruction-output pairs
	* Format: Alpaca-style instruction → response
	* Domain: Python programming tasks

	Each training sample followed this template:

	```
	Below is an instruction that describes a task.
	Write a response that appropriately completes the request.

	### Instruction:
	...

	### Response:
	...
	```
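
The template above can be produced programmatically. The following is a minimal sketch, not the author's training script: the helper name `format_sample` and the exact whitespace handling are assumptions, and details such as end-of-sequence tokens are left out.

```python
# Header lines copied from the template in this card.
ALPACA_HEADER = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
)


def format_sample(instruction: str, response: str = "") -> str:
    """Render one Alpaca-style sample (hypothetical helper).

    Leaving `response` empty yields a prompt that ends right after
    the "### Response:" marker, ready for generation.
    """
    return (
        f"{ALPACA_HEADER}\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"### Response:\n{response.strip()}"
    )


# Illustrative training-style sample and inference-style prompt.
training_text = format_sample(
    "Reverse a string.", "def rev(s):\n    return s[::-1]"
)
inference_prompt = format_sample("Reverse a string.")
```

Using one function for both training text and inference prompts keeps the two formats from drifting apart, which matters for a model that is sensitive to prompt formatting.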


	## Training Details

	* Method: QLoRA (4-bit)
	* Quantization: NF4
	* Compute dtype: FP16
	* Optimizer: paged_adamw_8bit
	* Sequence length: 384–512
	* Epochs: 1
	* Final training loss: ~0.2–0.3
	* Hardware: Tesla P100 (16GB)
	* Frameworks:
	  * transformers
	  * peft
	  * trl
	  * bitsandbytes


	## Intended Use

	This model is designed for:

	* Python code generation
	* Simple algorithm implementation
	* Educational coding tasks
	* Instruction-following code responses

	It performs best when prompted in Alpaca-style format:

	```
	Below is an instruction that describes a task.
	Write a response that appropriately completes the request.

	### Instruction:
	Write a Python function to reverse a linked list.

	### Response:
	```


	## How to Use

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
	tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")

	model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")

	model.eval()
	```

	Example generation:

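	```python
	prompt = """Below is an instruction that describes a task.
	Write a response that appropriately completes the request.

	### Instruction:
	Write a Python function to check if a number is prime.

	### Response:
	"""

	# Tokenize, then decode greedily; max_new_tokens=256 is an illustrative cap, not a tuned value.
	inputs = tokenizer(prompt, return_tensors="pt")
	with torch.no_grad():
	    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
	print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
	```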
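
Decoded text from a causal language model normally contains the prompt followed by the completion, and a small model may continue past its answer into a new instruction block. The sketch below shows one way to isolate the answer. The helper name `extract_response` is hypothetical, and the truncation rule is an assumption rather than documented behavior of this model.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker.

    If the model starts another "### Instruction:" block after
    answering, everything from that marker onward is dropped.
    """
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    if not found:
        # No marker present: fall back to the full text.
        return generated_text.strip()
    answer, _, _ = tail.partition(INSTRUCTION_MARKER)
    return answer.strip()


# Illustrative decoded output (not real model output).
decoded = (
    "Below is an instruction that describes a task.\n\n"
    "### Instruction:\nAdd two numbers.\n\n"
    "### Response:\ndef add(a, b):\n    return a + b\n\n"
    "### Instruction:\nSomething else"
)
answer = extract_response(decoded)
```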


	## Observed Behavior

	The model demonstrates:

	* Improved Python code structuring
	* Better adherence to instruction-response formatting
	* Faster convergence for common programming tasks

	Observed limitations:

	* Small model size (0.5B) limits reasoning depth
	* May hallucinate under high-temperature decoding
	* Works best with explicit language specification ("Write a Python function")


	## Limitations

	* Not suitable for production-critical systems
	* Limited mathematical and multi-step reasoning capability
	* Sensitive to prompt formatting
	* Performance depends heavily on decoding strategy
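
To illustrate why decoding strategy matters, the sketch below applies temperature scaling to a softmax over next-token logits. The logit values are made up for illustration. Higher temperatures flatten the distribution, so unlikely tokens are sampled more often, which is one plausible route to the hallucinations noted above.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate next tokens.
logits = [4.0, 2.0, 1.0]

low_t = softmax_with_temperature(logits, 0.5)   # top token gets about 0.98
high_t = softmax_with_temperature(logits, 2.0)  # top token gets about 0.63
```

In `transformers`, the corresponding knobs are the `do_sample` and `temperature` arguments to `generate`. Greedy decoding (`do_sample=False`) always picks the top token and is a reasonable starting point for code generation.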

	## Future Improvements

	Potential enhancements:

	* Mask instruction tokens during SFT
	* Increase model size (1.5B+)
	* Train on more diverse programming datasets
	* Evaluate with pass@k benchmarks
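
For the pass@k item, a commonly used formulation is the unbiased estimator popularized by the HumanEval benchmark: generate `n` samples per problem, count the `c` that pass the unit tests, and compute `1 - C(n-c, k) / C(n, k)`. The sketch below implements that formula only. It is not an evaluation that has been run for this model, and the sample counts are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of which c passed the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset contains at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers: 10 samples for one problem, 3 of them pass.
p1 = pass_at_k(n=10, c=3, k=1)  # approximately 0.30
p5 = pass_at_k(n=10, c=3, k=5)  # approximately 0.92
```

The benchmark score is the mean of this per-problem value over all problems.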


	## Acknowledgements

	* Base model by Qwen team
	* Dataset by `iamtarun`