---
base_model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
---
# Qwen_LeetCoder
A lightweight **Qwen2.5-0.5B** model fine-tuned using **Unsloth + LoRA (PEFT)** for efficient text-generation tasks. This model is optimized for **low-VRAM systems**, fast inference, and rapid experimentation.
---
## Model Details
### Model Description
This model is a **parameter-efficient fine-tuned version** of the base model:
* **Base model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
* **Fine-tuning method:** LoRA (PEFT)
* **Quantization:** 4-bit (bnb-4bit)
* **Pipeline:** text-generation
* **Library:** PEFT, Transformers, TRL, Unsloth
It is intended as a **compact research model** for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.
* **Developer:** @Sriramdayal
* **Repository:** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **License:** Apache 2.0 (inherited from the Qwen2.5-0.5B base model)
* **Languages:** English (primary), multilingual capability inherited from Qwen2.5
* **Finetuned from:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
---
## Model Sources
* **GitHub Repo (Training Code):**
[https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **Base Model:**
`unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
---
## Uses
### Direct Use
* Instruction-style text generation
* Chatbot prototyping
* Educational or research experiments
* Low-VRAM inference (4–6 GB GPU)
* Fine-tuning starter model for custom tasks
### Downstream Use
* Domain-specific SFT
* Dataset distillation
* RLHF training
* Task-specific adapters (classifiers, generators, reasoning tasks)
### Out-of-Scope / Avoid
* High-accuracy medical/legal decisions
* Safety-critical systems
* Long-context reasoning competitive with large LLMs
* Harmful or malicious use cases
---
## Bias, Risks & Limitations
This model inherits all biases from Qwen2.5 training data and may generate:
* Inaccurate or hallucinated information
* Social, demographic, or political biases
* Unsafe or harmful recommendations if misused
### Recommendations
Users must implement:
* Output filtering
* Safety moderation
* Human verification for critical tasks
---
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit quantized base model (requires bitsandbytes and accelerate),
# then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
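For chat-style prompts, Qwen2.5 expects a ChatML-style format, which `tokenizer.apply_chat_template` produces for you. As a rough illustration of what that string looks like, here is a minimal manual rendering (a sketch only; the tokenizer's built-in template is authoritative):

```python
# Minimal ChatML-style prompt rendering, assuming the Qwen2.5 convention of
# <|im_start|>/<|im_end|> role markers. In practice, prefer
# tokenizer.apply_chat_template(messages, add_generation_prompt=True).
def render_chatml(messages, add_generation_prompt=True):
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write two-sum in Python."},
])
print(prompt)
```

The resulting `prompt` can be tokenized and passed to `model.generate` exactly as in the snippet above.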
---
## Training Details
### Training Data
The model was trained using custom datasets prepared through:
* Instruction datasets
* Synthetic Q&A
* Formatting for chat templates
*(Exact dataset composition has not been published; treat the list above as indicative.)*
### Training Procedure
* **Framework:** Unsloth + TRL + PEFT
* **Training type:** Supervised Fine-Tuning (SFT)
* **Precision:** bnb-4bit quantization during training
* **LoRA configuration** (approximate; the actual run may have differed):
* `r=16`, `alpha=32`, `dropout=0.05`
### Hyperparameters
* **Batch size:** 2–8 (depending on VRAM)
* **Gradient Accumulation:** 8–16
* **LR:** 2e-4
* **Epochs:** 1–3
* **Optimizer:** AdamW / paged optimizers (Unsloth)
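With gradient accumulation, the effective batch size is the per-device batch multiplied by the accumulation steps; for the ranges above that works out as follows (a simple arithmetic sketch):

```python
# Effective batch size = per-device batch size * gradient-accumulation steps.
per_device_batch = 2   # low end of the 2-8 range above
grad_accum_steps = 16  # high end of the 8-16 range above
effective_batch = per_device_batch * grad_accum_steps
print(effective_batch)  # 32
```

So even on a low-VRAM GPU, accumulation keeps the effective batch in the 16-128 range.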
### Speeds & Compute
* **Hardware:** 1× RTX 4090 / A100 / local GPU
* **Training Time:** 1–3 hours (approx)
* **Checkpoint Size:** Tiny (LoRA weights only)
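The "tiny checkpoint" claim follows from LoRA storing only the low-rank A/B matrices. A back-of-envelope estimate, assuming Qwen2.5-0.5B's published dimensions (24 layers, hidden size 896, KV projection width 128, MLP width 4864; double-check against the actual config) and `r=16` on all linear projections:

```python
# Each LoRA pair on a (d_in -> d_out) linear layer adds r * (d_in + d_out)
# parameters. Dimensions below are assumed from Qwen2.5-0.5B's public config.
r = 16
layers = 24
shapes = {
    "q_proj": (896, 896), "k_proj": (896, 128), "v_proj": (896, 128),
    "o_proj": (896, 896), "gate_proj": (896, 4864),
    "up_proj": (896, 4864), "down_proj": (4864, 896),
}
lora_params = layers * sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
size_mb = lora_params * 2 / 1e6  # fp16: 2 bytes per parameter
print(f"{lora_params:,} params ~= {size_mb:.1f} MB")
```

Roughly 8.8M parameters, i.e. under 20 MB in fp16, versus ~1 GB for full 0.5B-model weights.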
---
## Evaluation
*(Formal benchmark results are pending; the notes below are informal observations.)*
* Model evaluated on small reasoning + text-generation samples
* Performs well for short instructions
* Limited long-context and deep reasoning
---
## Environmental Impact
* **Hardware:** 1 GPU (consumer or cloud)
* **Carbon estimate:** Low (small model + LoRA)
---
## Technical Specs
* **Architecture:** Qwen2.5 0.5B
* **Objective:** Causal LM
* **Adapters:** LoRA (PEFT)
* **Quantization:** bnb 4-bit
---
## Citation
```bibtex
@misc{Sriramdayal2025QwenLoRA,
title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
author={Sriram Dayal},
year={2025},
howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```
---
## Model Card Author
**@Sriramdayal**
---
### Framework versions
- PEFT 0.18.0