---
base_model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
---

# Model Card

A lightweight **Qwen2.5-0.5B** model fine-tuned using **Unsloth + LoRA (PEFT)** for efficient text-generation tasks. This model is optimized for **low-VRAM systems**, fast inference, and rapid experimentation.

---

## Model Details

### Model Description

This model is a **parameter-efficient fine-tuned version** of the base model:

* **Base model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
* **Fine-tuning method:** LoRA (PEFT)
* **Quantization:** 4-bit (bnb-4bit)
* **Pipeline:** text-generation
* **Libraries:** PEFT, Transformers, TRL, Unsloth

It is intended as a **compact research model** for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.

* **Developer:** @Sriramdayal
* **Repository:** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **License:** Same as the Qwen2.5 base model (Apache 2.0)
* **Languages:** English (primary), with multilingual capability inherited from Qwen2.5
* **Finetuned from:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`

---

## Model Sources

* **GitHub Repo (Training Code):** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **Base Model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`

---

## Uses

### Direct Use

* Instruction-style text generation
* Chatbot prototyping
* Educational or research experiments
* Low-VRAM inference (4–6 GB GPU)
* Starter model for fine-tuning on custom tasks

### Downstream Use

* Domain-specific SFT
* Dataset distillation
* RLHF training
* Task-specific adapters (classifiers, generators, reasoning tasks)

### Out-of-Scope / Avoid

* High-accuracy medical or legal decision-making
* Safety-critical systems
* Long-context reasoning competitive with large LLMs
* Harmful or malicious use cases

---

## Bias, Risks & Limitations

This model inherits the biases of the Qwen2.5 training data and may generate:

* Inaccurate or hallucinated information
* Social, demographic, or political biases
* Unsafe or harmful recommendations if misused

### Recommendations

Users should implement:

* Output filtering
* Safety moderation
* Human verification for critical tasks

---

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit base model, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Training Details

### Training Data

The model was trained on custom datasets prepared through:

* Instruction datasets
* Synthetic Q&A
* Formatting for chat templates

*(Replace with your actual dataset if you want more accuracy.)*

### Training Procedure

* **Framework:** Unsloth + TRL + PEFT
* **Training type:** Supervised Fine-Tuning (SFT)
* **Precision:** bnb-4bit quantization during training
* **LoRA config:** `r=16`, `alpha=32`, `dropout=0.05` *(insert your actual values if different)*

### Hyperparameters

* **Batch size:** 2–8 (depending on VRAM)
* **Gradient accumulation:** 8–16
* **Learning rate:** 2e-4
* **Epochs:** 1–3
* **Optimizer:** AdamW / paged optimizers (Unsloth)
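The settings above map onto the standard Unsloth + TRL workflow. The following is a minimal sketch of such a run, not the exact training script: the dataset path, `target_modules` list, and output directories are illustrative, and argument names vary across TRL releases (older versions use `tokenizer=` and `TrainingArguments` in place of `processing_class=` and `SFTConfig`).

```python
# Minimal SFT sketch matching the settings above. Assumes a JSONL dataset
# with a "text" column already formatted with the model's chat template.
from unsloth import FastLanguageModel  # import unsloth before transformers/trl
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the 4-bit base model and attach LoRA adapters (r=16, alpha=32).
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # typical choice; adjust as needed
)

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # illustrative path

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,   # 2-8 depending on VRAM
        gradient_accumulation_steps=8,   # 8-16
        learning_rate=2e-4,
        num_train_epochs=1,              # 1-3
        optim="paged_adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()

model.save_pretrained("lora_adapter")    # saves only the LoRA weights (a few MB)
tokenizer.save_pretrained("lora_adapter")
```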
### Speeds & Compute

* **Hardware:** 1× RTX 4090 / A100 / local GPU
* **Training time:** ~1–3 hours
* **Checkpoint size:** tiny (LoRA weights only)

---

## Evaluation

*(You can update this later after running eval benchmarks.)*

* Model evaluated on small reasoning + text-generation samples
* Performs well on short instructions
* Limited long-context and deep reasoning ability

---

## Environmental Impact

* **Hardware:** 1 GPU (consumer or cloud)
* **Carbon estimate:** Low (small model + LoRA)

---

## Technical Specs

* **Architecture:** Qwen2.5 0.5B
* **Objective:** Causal LM
* **Adapters:** LoRA (PEFT)
* **Quantization:** bnb 4-bit

---

## Citation

```
@misc{Sriramdayal2025QwenLoRA,
  title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
  author={Sriram Dayal},
  year={2025},
  howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```

---

## Model Card Author

**@Sriramdayal**

---

### Framework versions

- PEFT 0.18.0
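---

### Merging the Adapter (Optional)

For serving without a runtime PEFT dependency, the adapter can be folded into a full-precision copy of the base weights via PEFT's `merge_and_unload`. A minimal sketch: the unquantized `Qwen/Qwen2.5-0.5B` checkpoint (which the Unsloth bnb-4bit repo quantizes) and the output path are assumptions, and because the adapter was trained against 4-bit weights, merged outputs may differ slightly.

```python
# Merge the LoRA adapter into full-precision base weights for standalone
# deployment. Assumes the adapter is compatible with the unquantized
# Qwen/Qwen2.5-0.5B checkpoint; the output path is illustrative.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "black279/Qwen_LeetCoder")
model = model.merge_and_unload()  # folds the LoRA deltas into the base weights

model.save_pretrained("qwen2.5-0.5b-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B").save_pretrained("qwen2.5-0.5b-merged")
```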