---
base_model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
---

# Model Card

A lightweight **Qwen2.5-0.5B** model fine-tuned using **Unsloth + LoRA (PEFT)** for efficient text-generation tasks. This model is optimized for **low-VRAM systems**, fast inference, and rapid experimentation.

---

## Model Details

### Model Description

This model is a **parameter-efficient fine-tuned version** of the base model:

* **Base model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
* **Fine-tuning method:** LoRA (PEFT)
* **Quantization:** 4-bit (bnb-4bit)
* **Pipeline:** text-generation
* **Libraries:** PEFT, Transformers, TRL, Unsloth

It is intended as a **compact research model** for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.

* **Developer:** @Sriramdayal
* **Repository:** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **License:** Same as the Qwen2.5 base model (Apache 2.0)
* **Languages:** English (primary), with multilingual capability inherited from Qwen2.5
* **Finetuned from:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`

---

## Model Sources

* **GitHub Repo (Training Code):** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **Base Model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`

---

## Uses

### Direct Use

* Instruction-style text generation
* Chatbot prototyping
* Educational or research experiments
* Low-VRAM inference (4–6 GB GPU)
* Starter model for fine-tuning on custom tasks

### Downstream Use

* Domain-specific SFT
* Dataset distillation
* RLHF training
* Task-specific adapters (classifiers, generators, reasoning tasks)

### Out-of-Scope / Avoid

* High-accuracy medical or legal decision-making
* Safety-critical systems
* Long-context reasoning competitive with large LLMs
* Harmful or malicious use cases

---

## Bias, Risks & Limitations

This model inherits the biases of the Qwen2.5 training data and may generate:

* Inaccurate or hallucinated information
* Social, demographic, or political biases
* Unsafe or harmful recommendations if misused

### Recommendations

Users should implement:

* Output filtering
* Safety moderation
* Human verification for critical tasks

---

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit base model, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Training Details

### Training Data

The model was trained on custom datasets prepared through:

* Instruction datasets
* Synthetic Q&A
* Formatting for chat templates

*(Replace with your actual dataset if you want more accuracy.)*

### Training Procedure

* **Framework:** Unsloth + TRL + PEFT
* **Training type:** Supervised Fine-Tuning (SFT)
* **Precision:** bnb-4bit quantization during training
* **LoRA config:** `r=16`, `alpha=32`, `dropout=0.05` *(insert your actual values if different)*

### Hyperparameters

* **Batch size:** 2–8 (depending on VRAM)
* **Gradient accumulation:** 8–16
* **Learning rate:** 2e-4
* **Epochs:** 1–3
* **Optimizer:** AdamW / paged optimizers (Unsloth)
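The settings above map onto the standard Unsloth + TRL workflow. The following is a minimal sketch of such a run, not the exact training script: the dataset path, `target_modules` list, and output directories are illustrative, and argument names vary across TRL releases (older versions use `tokenizer=` and `TrainingArguments` in place of `processing_class=` and `SFTConfig`).

```python
# Minimal SFT sketch matching the settings above. Assumes a JSONL dataset
# with a "text" column already formatted with the model's chat template.
from unsloth import FastLanguageModel  # import unsloth before transformers/trl
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the 4-bit base model and attach LoRA adapters (r=16, alpha=32).
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # typical choice; adjust as needed
)

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # illustrative path

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,   # 2-8 depending on VRAM
        gradient_accumulation_steps=8,   # 8-16
        learning_rate=2e-4,
        num_train_epochs=1,              # 1-3
        optim="paged_adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()

model.save_pretrained("lora_adapter")    # saves only the LoRA weights (a few MB)
tokenizer.save_pretrained("lora_adapter")
```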
### Speeds & Compute

* **Hardware:** 1× RTX 4090 / A100 / local GPU
* **Training time:** ~1–3 hours
* **Checkpoint size:** tiny (LoRA weights only)

---

## Evaluation

*(You can update this later after running eval benchmarks.)*

* Model evaluated on small reasoning + text-generation samples
* Performs well on short instructions
* Limited long-context and deep reasoning ability

---

## Environmental Impact

* **Hardware:** 1 GPU (consumer or cloud)
* **Carbon estimate:** Low (small model + LoRA)

---

## Technical Specs

* **Architecture:** Qwen2.5 0.5B
* **Objective:** Causal LM
* **Adapters:** LoRA (PEFT)
* **Quantization:** bnb 4-bit

---

## Citation

```
@misc{Sriramdayal2025QwenLoRA,
  title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
  author={Sriram Dayal},
  year={2025},
  howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```

---

## Model Card Author

**@Sriramdayal**

---

### Framework versions

- PEFT 0.18.0
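---

### Merging the Adapter (Optional)

For serving without a runtime PEFT dependency, the adapter can be folded into a full-precision copy of the base weights via PEFT's `merge_and_unload`. A minimal sketch: the unquantized `Qwen/Qwen2.5-0.5B` checkpoint (which the Unsloth bnb-4bit repo quantizes) and the output path are assumptions, and because the adapter was trained against 4-bit weights, merged outputs may differ slightly.

```python
# Merge the LoRA adapter into full-precision base weights for standalone
# deployment. Assumes the adapter is compatible with the unquantized
# Qwen/Qwen2.5-0.5B checkpoint; the output path is illustrative.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "black279/Qwen_LeetCoder")
model = model.merge_and_unload()  # folds the LoRA deltas into the base weights

model.save_pretrained("qwen2.5-0.5b-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B").save_pretrained("qwen2.5-0.5b-merged")
```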