---
base_model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
---
# Model Card
A lightweight **Qwen2.5-0.5B** model fine-tuned using **Unsloth + LoRA (PEFT)** for efficient text-generation tasks. This model is optimized for **low-VRAM systems**, fast inference, and rapid experimentation.
---
## Model Details
### Model Description
This model is a **parameter-efficient fine-tuned version** of the base model:
* **Base model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
* **Fine-tuning method:** LoRA (PEFT)
* **Quantization:** 4-bit (bnb-4bit)
* **Pipeline:** text-generation
* **Library:** PEFT, Transformers, TRL, Unsloth
It is intended as a **compact research model** for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.
* **Developer:** @Sriramdayal
* **Repository:** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **License:** Inherits the base model license (Qwen2.5-0.5B is released under Apache 2.0)
* **Languages:** English (primary), multilingual capability inherited from Qwen2.5
* **Finetuned from:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
---
## Model Sources
* **GitHub Repo (Training Code):**
[https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **Base Model:**
`unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
---
## Uses
### Direct Use
* Instruction-style text generation
* Chatbot prototyping
* Educational or research experiments
* Low-VRAM inference (4–6 GB GPU)
* Fine-tuning starter model for custom tasks
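As a rough sanity check on the low-VRAM claim, the quantized weights alone are small; the arithmetic below is illustrative only (real usage also needs dequantization buffers, the KV cache, and activations):

```python
# Rough VRAM estimate for the 4-bit quantized base weights (illustrative only;
# actual memory use adds dequantization buffers, KV cache, and activations).
n_params = 0.5e9        # ~0.5B parameters
bytes_per_param = 0.5   # 4-bit quantization = half a byte per weight
weight_gb = n_params * bytes_per_param / 1e9
print(f"~{weight_gb:.2f} GB for quantized weights")  # ~0.25 GB
```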
### Downstream Use
* Domain-specific SFT
* Dataset distillation
* RLHF training
* Task-specific adapters (classifiers, generators, reasoning tasks)
### Out-of-Scope / Avoid
* High-accuracy medical/legal decisions
* Safety-critical systems
* Long-context reasoning competitive with large LLMs
* Harmful or malicious use cases
---
## Bias, Risks & Limitations
This model inherits all biases from Qwen2.5 training data and may generate:
* Inaccurate or hallucinated information
* Social, demographic, or political biases
* Unsafe or harmful recommendations if misused
### Recommendations
Users must implement:
* Output filtering
* Safety moderation
* Human verification for critical tasks
---
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the tokenizer and the 4-bit quantized base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",  # place layers on the available GPU(s)
)

# Attach the LoRA adapter on top of the frozen base weights
model = PeftModel.from_pretrained(model, adapter)

# Generate a short completion
inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---
## Training Details
### Training Data
The model was trained using custom datasets prepared through:
* Instruction datasets
* Synthetic Q&A
* Formatting for chat templates
*(Replace with the actual dataset details for a more accurate card.)*
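Qwen2.5 chat models use the ChatML template, so formatting an instruction pair for SFT looks roughly like the sketch below. This is only an illustration of the string format; in practice you would call `tokenizer.apply_chat_template` rather than building strings by hand:

```python
# ChatML-style formatting used by Qwen2.5 chat templates (sketch only;
# prefer tokenizer.apply_chat_template in real preprocessing).
def format_chatml(user: str, assistant: str) -> str:
    return (
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

sample = format_chatml("What is 2+2?", "4")
print(sample)
```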
### Training Procedure
* **Framework:** Unsloth + TRL + PEFT
* **Training type:** Supervised Fine-Tuning (SFT)
* **Precision:** bnb-4bit quantization during training
* **LoRA configuration:** (update if your actual values differ)
* `r=16`, `alpha=32`, `dropout=0.05`
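For intuition about why the checkpoint stays tiny, the number of trainable weights LoRA adds to one linear layer can be computed from `r` alone. The sketch below assumes a single 896×896 projection (896 is Qwen2.5-0.5B's hidden size); the real adapter spans several projections per layer:

```python
# Trainable parameters added by a LoRA adapter on one linear layer:
# the d_out x d_in weight W stays frozen; LoRA learns B (d_out x r)
# and A (r x d_in), i.e. r * (d_in + d_out) extra weights.
def lora_params(d_in: int, d_out: int, r: int) -> int:
    return r * (d_in + d_out)

# One 896x896 projection with r=16: 28,672 trainable vs 802,816 frozen weights
print(lora_params(896, 896, 16))
```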
### Hyperparameters
* **Batch size:** 2–8 (depending on VRAM)
* **Gradient Accumulation:** 8–16
* **LR:** 2e-4
* **Epochs:** 1–3
* **Optimizer:** AdamW / paged optimizers (Unsloth)
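The batch size and gradient accumulation above combine into the effective batch size the optimizer actually steps on. The values below are taken from the ends of the stated ranges, not confirmed training settings:

```python
# Effective batch size = per-device batch size x gradient accumulation steps
per_device_batch = 2    # lower end of the 2-8 range above
grad_accum_steps = 16   # upper end of the 8-16 range above
effective_batch = per_device_batch * grad_accum_steps
print(effective_batch)  # 32 examples per optimizer step
```

Gradient accumulation lets a low-VRAM GPU simulate a larger batch by summing gradients over several small forward/backward passes before each optimizer update.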
### Speeds & Compute
* **Hardware:** 1× RTX 4090 / A100 / local GPU
* **Training Time:** 1–3 hours (approx)
* **Checkpoint Size:** Tiny (LoRA weights only)
---
## Evaluation
*(You can update this later after running eval benchmarks.)*
* Model evaluated on small reasoning + text-generation samples
* Performs well for short instructions
* Limited long-context and deep reasoning
---
## Environmental Impact
* **Hardware:** 1 GPU (consumer or cloud)
* **Carbon estimate:** Low (small model + LoRA)
---
## Technical Specs
* **Architecture:** Qwen2.5 0.5B
* **Objective:** Causal LM
* **Adapters:** LoRA (PEFT)
* **Quantization:** bnb 4-bit
---
## Citation
```
@misc{Sriramdayal2025QwenLoRA,
title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
author={Sriram Dayal},
year={2025},
howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```
---
## Model Card Author
**@Sriramdayal**
---
### Framework versions
- PEFT 0.18.0 |