| | --- |
| | base_model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit |
| | library_name: peft |
| | pipeline_tag: text-generation |
| | tags: |
| | - base_model:adapter:unsloth/qwen2.5-0.5b-unsloth-bnb-4bit |
| | - lora |
| | - sft |
| | - transformers |
| | - trl |
| | - unsloth |
| | --- |
| | |
| | --- |
| | # Model Card |
| |
|
| | A lightweight **Qwen2.5-0.5B** model fine-tuned using **Unsloth + LoRA (PEFT)** for efficient text-generation tasks. This model is optimized for **low-VRAM systems**, fast inference, and rapid experimentation. |
| |
|
| | --- |
| |
|
| | ## Model Details |
| |
|
| | ### Model Description |
| |
|
| | This model is a **parameter-efficient fine-tuned version** of the base model: |
| |
|
| | * **Base model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit` |
| | * **Fine-tuning method:** LoRA (PEFT) |
| | * **Quantization:** 4-bit (bnb-4bit) |
| | * **Pipeline:** text-generation |
| | * **Libraries:** PEFT, Transformers, TRL, Unsloth |
| |
|
| | It is intended as a **compact research model** for text generation, instruction following, and as a baseline for custom SFT/RLHF projects. |
| |
|
| | * **Developer:** @Sriramdayal |
| | * **Repository:** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1) |
| | * **License:** Inherited from the base model (Apache 2.0 for Qwen2.5-0.5B; check the base model page to confirm) |
| | * **Languages:** English (primary), multilingual capability inherited from Qwen2.5 |
| | * **Finetuned from:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit` |
| |
|
| | --- |
| |
|
| | ## Model Sources |
| |
|
| | * **GitHub Repo (Training Code):** |
| | [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1) |
| |
|
| | * **Base Model:** |
| | `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit` |
| |
|
| | --- |
| |
|
| | ## Uses |
| |
|
| | ### Direct Use |
| |
|
| | * Instruction-style text generation |
| | * Chatbot prototyping |
| | * Educational or research experiments |
| | * Low-VRAM inference (4–6 GB GPU) |
| | * Fine-tuning starter model for custom tasks |
| |
|
| | ### Downstream Use |
| |
|
| | * Domain-specific SFT |
| | * Dataset distillation |
| | * RLHF training |
| | * Task-specific adapters (classifiers, generators, reasoning tasks) |
| |
|
| | ### Out-of-Scope / Avoid |
| |
|
| | * High-accuracy medical/legal decisions |
| | * Safety-critical systems |
| | * Long-context reasoning competitive with large LLMs |
| | * Harmful or malicious use cases |
| |
|
| | --- |
| |
|
| | ## Bias, Risks & Limitations |
| |
|
| | This model inherits all biases from Qwen2.5 training data and may generate: |
| |
|
| | * Inaccurate or hallucinated information |
| | * Social, demographic, or political biases |
| | * Unsafe or harmful recommendations if misused |
| |
|
| | ### Recommendations |
| |
|
| | Users must implement: |
| |
|
| | * Output filtering |
| | * Safety moderation |
| | * Human verification for critical tasks |
| |
|
| | --- |
| |
|
| | ## How to Use |
| |
|
| | ```python |
| | import torch |
| | from peft import PeftModel |
| | from transformers import AutoModelForCausalLM, AutoTokenizer |
| | |
| | base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"  # pre-quantized 4-bit base (requires bitsandbytes) |
| | adapter = "black279/Qwen_LeetCoder"  # LoRA adapter weights |
| | |
| | tokenizer = AutoTokenizer.from_pretrained(base) |
| | model = AutoModelForCausalLM.from_pretrained( |
| |     base, |
| |     device_map="auto",  # place weights on the available device(s); requires accelerate |
| | ) |
| | |
| | # Attach the LoRA adapter to the base model and switch to eval mode |
| | model = PeftModel.from_pretrained(model, adapter) |
| | model.eval() |
| | |
| | inputs = tokenizer("Hello!", return_tensors="pt").to(model.device) |
| | with torch.inference_mode():  # no gradients are needed for generation |
| |     outputs = model.generate(**inputs, max_new_tokens=100) |
| | print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
| | ``` |
| |
|
| | --- |
| |
|
| | ## Training Details |
| |
|
| | ### Training Data |
| |
|
| | The model was trained using custom datasets prepared through: |
| |
|
| | * Instruction datasets |
| | * Synthetic Q&A |
| | * Formatting for chat templates |
| |
|
| | *(The specific datasets are not listed in this card.)* |
| |
|
| | ### Training Procedure |
| |
|
| | * **Framework:** Unsloth + TRL + PEFT |
| | * **Training type:** Supervised Fine-Tuning (SFT) |
| | * **Precision:** bnb-4bit quantization during training |
| | * **LoRA configuration:** `r=16`, `alpha=32`, `dropout=0.05` (template defaults; the values used in the actual run may differ) |
| |
|
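To give a feel for what `r=16` and `alpha=32` mean in practice, here is a minimal sketch in plain Python. It uses the standard LoRA formulation, where each adapted weight matrix gains two low-rank factors and the update is scaled by `alpha / r`. The 896 x 896 projection shape is an illustrative assumption, not a statement about which modules were adapted in this run.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns A (r x d_in) and B (d_out x r), so the count is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


def lora_scaling(alpha: float, r: int) -> float:
    """Factor applied to the low-rank update B @ A (standard LoRA: alpha / r)."""
    return alpha / r


# Values quoted in this card
R, ALPHA = 16, 32

# Assumption: a hypothetical square 896 x 896 projection, for illustration only.
D_IN = D_OUT = 896

added = lora_param_count(D_IN, D_OUT, R)
full = D_IN * D_OUT

print(f"LoRA params per matrix: {added}")                    # 28672
print(f"Full matrix params:     {full}")                     # 802816
print(f"Fraction trained:       {added / full:.2%}")         # 3.57%
print(f"Update scaling:         {lora_scaling(ALPHA, R)}")   # 2.0
```

Under these assumptions, the adapter trains only a few percent of the parameters of each adapted matrix, which is why the saved checkpoint contains just the LoRA weights.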
| | ### Hyperparameters |
| |
|
| | * **Batch size:** 2–8 (depending on VRAM) |
| | * **Gradient Accumulation:** 8–16 |
| | * **LR:** 2e-4 |
| | * **Epochs:** 1–3 |
| | * **Optimizer:** AdamW / paged optimizers (Unsloth) |
| |
|
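The batch size and gradient accumulation ranges above combine into an effective batch size per optimizer step. The sketch below works through the two extremes of the quoted ranges; the single-GPU default and the 10,000-example dataset size are hypothetical values chosen only for illustration.

```python
import math


def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def optimizer_steps_per_epoch(num_examples: int, effective_batch: int) -> int:
    """Optimizer updates needed to see every example once (last partial batch counted)."""
    return math.ceil(num_examples / effective_batch)


# Extremes of the ranges quoted in this card, assuming a single GPU
low = effective_batch_size(per_device_batch=2, grad_accum_steps=8)
high = effective_batch_size(per_device_batch=8, grad_accum_steps=16)
print(low, high)  # 16 128

# Hypothetical dataset size, for illustration only
NUM_EXAMPLES = 10_000
print(optimizer_steps_per_epoch(NUM_EXAMPLES, low))   # 625
print(optimizer_steps_per_epoch(NUM_EXAMPLES, high))  # 79
```

A smaller per-device batch with more accumulation steps reaches the same effective batch size with less VRAM, at the cost of more forward and backward passes per update.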
| | ### Speeds & Compute |
| |
|
| | * **Hardware:** 1× RTX 4090 / A100 / local GPU |
| | * **Training Time:** 1–3 hours (approx) |
| | * **Checkpoint Size:** Tiny (LoRA weights only) |
| |
|
| | --- |
| |
|
| | ## Evaluation |
| |
|
| | *(No formal benchmark results are reported yet; the notes below are informal observations.)* |
| |
|
| | * Model evaluated on small reasoning + text-generation samples |
| | * Performs well for short instructions |
| | * Limited long-context and deep reasoning |
| |
|
| | --- |
| |
|
| | ## Environmental Impact |
| |
|
| | * **Hardware:** 1 GPU (consumer or cloud) |
| | * **Carbon estimate:** Low (small model + LoRA) |
| |
|
| | --- |
| |
|
| | ## Technical Specs |
| |
|
| | * **Architecture:** Qwen2.5 0.5B |
| | * **Objective:** Causal LM |
| | * **Adapters:** LoRA (PEFT) |
| | * **Quantization:** bnb 4-bit |
| |
|
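As a rough back-of-the-envelope check on the 4-bit quantization, the sketch below estimates weight storage as parameters times bits per parameter. It assumes a nominal 0.5 billion parameters and ignores quantization constants, any layers kept in higher precision, activations, and the KV cache, so real memory use will be higher.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: nominal parameter count implied by the "0.5B" model name
NOMINAL_PARAMS = 500_000_000

print(weight_memory_gb(NOMINAL_PARAMS, 4))   # 0.25 (4-bit weights)
print(weight_memory_gb(NOMINAL_PARAMS, 16))  # 1.0  (16-bit weights, for comparison)
```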
| | --- |
| |
|
| | ## Citation |
| |
|
| | ``` |
| | @misc{Sriramdayal2025QwenLoRA, |
| | title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune}, |
| | author={Sriram Dayal}, |
| | year={2025}, |
| | howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}}, |
| | } |
| | ``` |
| |
|
| | --- |
| |
|
| | ## Model Card Author |
| |
|
| | **@Sriramdayal** |
| |
|
| | --- |
| |
|
| |
|
| | ### Framework versions |
| |
|
| | - PEFT 0.18.0 |