| --- |
| license: mit |
| --- |
| |
| # Model Card |
|
|
|
|
| # Qwen2-0.5B-Python-SFT (LoRA) |
|
|
| ## Overview |
|
|
| This model is a Supervised Fine-Tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks. |
|
|
| Fine-tuning used QLoRA (4-bit quantization with LoRA adapters) on a curated Python instruction dataset, with the goal of improving structured code generation and instruction following. |
|
|
| This repository contains **LoRA adapter weights**, not the full base model. |
|
|
|
|
| ## Base Model |
|
|
| * Base: `Qwen/Qwen2-0.5B` |
| * Architecture: Decoder-only Transformer |
| * Parameters: 0.5B |
| * License: Refer to the original Qwen license for the base model; the `license: mit` metadata of this card covers this repository |
|
|
| The base model must be loaded separately, and the adapter from this repository is then applied on top of it. |
|
|
|
|
| ## Training Dataset |
|
|
| * Dataset: `iamtarun/python_code_instructions_18k_alpaca` |
| * Size: ~18,000 instruction-output pairs |
| * Format: Alpaca-style instruction → response |
| * Domain: Python programming tasks |
|
|
| Each training sample followed this template: |
|
|
| ``` |
| Below is an instruction that describes a task. |
| Write a response that appropriately completes the request. |
| |
| ### Instruction: |
| ... |
| |
| ### Response: |
| ... |
| ``` |
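The template above can be rendered with a small helper. This is a minimal sketch, not the training script itself: the names `ALPACA_TEMPLATE` and `format_alpaca` are hypothetical, and the card does not say how the dataset's optional input field was handled, so it is left out here.

```python
# Hypothetical helper that mirrors the template shown in this card.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction:\n"
    "{instruction}\n"
    "\n"
    "### Response:\n"
    "{response}"
)


def format_alpaca(instruction: str, response: str = "") -> str:
    """Render one sample; leave `response` empty to build an inference prompt."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction.strip(),
        response=response.strip(),
    )


sample = format_alpaca(
    "Write a function that adds two numbers.",
    "def add(a, b):\n    return a + b",
)
print(sample)
```

Calling `format_alpaca(instruction)` without a response yields a prompt that ends right after `### Response:`, which is the form expected at inference time.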
|
|
|
|
| ## Training Details |
|
|
| * Method: QLoRA (4-bit) |
| * Quantization: NF4 |
| * Compute dtype: FP16 |
| * Optimizer: paged_adamw_8bit |
| * Maximum sequence length: 384–512 tokens |
| * Epochs: 1 |
| * Final training loss: ~0.2–0.3 |
| * Hardware: Tesla P100 (16GB) |
| * Frameworks: |
|   * transformers |
|   * peft |
|   * trl |
|   * bitsandbytes |
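The settings listed above map fairly directly onto keyword arguments of the listed frameworks. The sketch below only collects the values stated in this card as plain dictionaries; the variable names are hypothetical, and the LoRA rank, alpha, target modules, learning rate, and batch size are not given in the card, so they are deliberately left out rather than guessed.

```python
# Values stated in this card, written as keyword arguments.
# With the libraries installed they could be unpacked roughly as
#   transformers.BitsAndBytesConfig(**quantization_kwargs)
#   transformers.TrainingArguments(output_dir=..., **training_kwargs)
quantization_kwargs = {
    "load_in_4bit": True,                 # QLoRA (4-bit)
    "bnb_4bit_quant_type": "nf4",         # Quantization: NF4
    "bnb_4bit_compute_dtype": "float16",  # Compute dtype: FP16
}

training_kwargs = {
    "optim": "paged_adamw_8bit",  # Optimizer
    "num_train_epochs": 1,        # Epochs
}

for name, kwargs in (("quantization", quantization_kwargs), ("training", training_kwargs)):
    print(name, kwargs)
```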
|
|
|
|
| ## Intended Use |
|
|
| This model is designed for: |
|
|
| * Python code generation |
| * Simple algorithm implementation |
| * Educational coding tasks |
| * Instruction-following code responses |
|
|
| It performs best when prompted in the same Alpaca-style format used during training: |
|
|
| ``` |
| Below is an instruction that describes a task. |
| Write a response that appropriately completes the request. |
| |
| ### Instruction: |
| Write a Python function to reverse a linked list. |
| |
| ### Response: |
| ``` |
|
|
|
|
| ## How to Use |
|
|
| ```python |
| import torch |
| from transformers import AutoTokenizer, AutoModelForCausalLM |
| from peft import PeftModel |
| |
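| # Load the base model first; this repository only contains the LoRA adapter. |
| base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B") |
| tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora") |
| |
| # Attach the adapter weights to the base model. |
| model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora") |
| |
| model.eval()  # inference mode (disables dropout) |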
| ``` |
|
|
| Example generation: |
|
|
| ```python |
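| prompt = """Below is an instruction that describes a task. |
| Write a response that appropriately completes the request. |
| |
| ### Instruction: |
| Write a Python function to check if a number is prime. |
| |
| ### Response: |
| """ |
| |
| # Greedy decoding; max_new_tokens=256 is an illustrative choice, not a tuned value. |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
| with torch.no_grad(): |
|     outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False) |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |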
| ``` |
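When the full output sequence of `generate` is decoded, the text usually contains the prompt followed by the completion, and a small model may keep going and start a new instruction block. The helper below is one possible way to keep only the answer; `extract_response` is a hypothetical name, and the `decoded` string is a hand-written stand-in rather than real model output.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(generated_text: str) -> str:
    """Return the text between the first response marker and any later instruction marker."""
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    if not found:
        # No marker present: return the text unchanged apart from whitespace.
        return generated_text.strip()
    # Drop anything appended after the model starts a new instruction block.
    answer, _, _ = tail.partition(INSTRUCTION_MARKER)
    return answer.strip()


# Stand-in for decoded model output (not produced by the actual model).
decoded = (
    "Below is an instruction that describes a task.\n\n"
    "### Instruction:\nWrite a Python function to check if a number is even.\n\n"
    "### Response:\ndef is_even(n):\n    return n % 2 == 0\n\n"
    "### Instruction:\nAnother task"
)
print(extract_response(decoded))
```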
|
|
|
|
| ## Observed Behavior |
|
|
| The model demonstrates: |
|
|
| * Improved Python code structuring |
| * Better adherence to instruction-response formatting |
| * Faster convergence for common programming tasks |
|
|
| Observed limitations: |
|
|
| * Small model size (0.5B) limits reasoning depth |
| * May hallucinate under high-temperature decoding |
| * Works best with explicit language specification ("Write a Python function") |
|
|
|
|
| ## Limitations |
|
|
| * Not suitable for production-critical systems |
| * Limited mathematical and multi-step reasoning capability |
| * Sensitive to prompt formatting |
| * Performance depends heavily on decoding strategy |
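To illustrate why the decoding strategy matters, the sketch below shows how temperature reshapes a next-token distribution. The logits are made-up numbers for three candidate tokens, not values from this model, and `softmax_with_temperature` is a hypothetical helper written in plain Python.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 2.0, 1.0]  # made-up scores for three candidate tokens
for temperature in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures give unlikely tokens more weight, which is where a small model is more likely to drift. Greedy decoding or a low temperature is a reasonable starting point for code generation.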
|
|
| ## Future Improvements |
|
|
| Potential enhancements: |
|
|
| * Mask instruction tokens during SFT |
| * Increase model size (1.5B+) |
| * Train on more diverse programming datasets |
| * Evaluate with pass@k benchmarks |
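Masking instruction tokens means the loss is computed only on the response part of each sample. A common way to express this, sketched below with made-up token ids, is to set the prompt positions in `labels` to `-100`, the value PyTorch's cross-entropy loss ignores by default. The function name `mask_prompt_labels` is hypothetical, and this is not the training code used for this adapter.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_prompt_labels(prompt_ids, response_ids):
    """Build (input_ids, labels) so that only response tokens contribute to the loss."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


prompt_ids = [101, 102, 103]  # made-up token ids for the instruction part
response_ids = [201, 202]     # made-up token ids for the response part

input_ids, labels = mask_prompt_labels(prompt_ids, response_ids)
print(input_ids)  # [101, 102, 103, 201, 202]
print(labels)     # [-100, -100, -100, 201, 202]
```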
|
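For the pass@k evaluation mentioned under Future Improvements, a widely used estimator draws `n` samples per problem, counts the `c` samples that pass the unit tests, and computes `1 - C(n-c, k) / C(n, k)`. The sketch below implements that formula for a single problem; the numbers in the usage line are illustrative and are not results for this model.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, of which c passed the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 10 samples, 3 of them passing, evaluated at k=1.
print(pass_at_k(n=10, c=3, k=1))
```

A benchmark score is then the mean of this estimate over all problems.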
|
|
|
| ## Acknowledgements |
|
|
| * Base model by Qwen team |
| * Dataset by `iamtarun` |
|
|
|
|