---
base_model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
---
# Qwen_LeetCoder
A lightweight **Qwen2.5-0.5B** model fine-tuned using **Unsloth + LoRA (PEFT)** for efficient text-generation tasks. This model is optimized for **low-VRAM systems**, fast inference, and rapid experimentation.
---
## Model Details
### Model Description
This model is a **parameter-efficient fine-tuned version** of the base model:
* **Base model:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
* **Fine-tuning method:** LoRA (PEFT)
* **Quantization:** 4-bit (bnb-4bit)
* **Pipeline:** text-generation
* **Library:** PEFT, Transformers, TRL, Unsloth
It is intended as a **compact research model** for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.
* **Developer:** @Sriramdayal
* **Repository:** [https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **License:** Apache 2.0 (inherited from the Qwen2.5-0.5B base model)
* **Languages:** English (primary), multilingual capability inherited from Qwen2.5
* **Finetuned from:** `unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
---
## Model Sources
* **GitHub Repo (Training Code):**
[https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1](https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1)
* **Base Model:**
`unsloth/qwen2.5-0.5b-unsloth-bnb-4bit`
---
## Uses
### Direct Use
* Instruction-style text generation
* Chatbot prototyping
* Educational or research experiments
* Low-VRAM inference (4–6 GB GPU)
* Fine-tuning starter model for custom tasks
### Downstream Use
* Domain-specific SFT
* Dataset distillation
* RLHF training
* Task-specific adapters (classifiers, generators, reasoning tasks)
### Out-of-Scope / Avoid
* High-accuracy medical/legal decisions
* Safety-critical systems
* Long-context reasoning competitive with large LLMs
* Harmful or malicious use cases
---
## Bias, Risks & Limitations
This model inherits all biases from Qwen2.5 training data and may generate:
* Inaccurate or hallucinated information
* Social, demographic, or political biases
* Unsafe or harmful recommendations if misused
### Recommendations
Users must implement:
* Output filtering
* Safety moderation
* Human verification for critical tasks
---
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit quantized base model (requires bitsandbytes and accelerate),
# then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
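For chat-style prompts, Qwen2.5 expects a ChatML-style format, which `tokenizer.apply_chat_template` produces for you. As a rough illustration of what that string looks like, here is a minimal manual rendering (a sketch only; the tokenizer's built-in template is authoritative):

```python
# Minimal ChatML-style prompt rendering, assuming the Qwen2.5 convention of
# <|im_start|>/<|im_end|> role markers. In practice, prefer
# tokenizer.apply_chat_template(messages, add_generation_prompt=True).
def render_chatml(messages, add_generation_prompt=True):
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write two-sum in Python."},
])
print(prompt)
```

The resulting `prompt` can be tokenized and passed to `model.generate` exactly as in the snippet above.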
---
## Training Details
### Training Data
The model was trained using custom datasets prepared through:
* Instruction datasets
* Synthetic Q&A
* Formatting for chat templates
*(Exact dataset composition has not been published; treat the list above as indicative.)*
### Training Procedure
* **Framework:** Unsloth + TRL + PEFT
* **Training type:** Supervised Fine-Tuning (SFT)
* **Precision:** bnb-4bit quantization during training
* **LoRA configuration** (approximate; the actual run may have differed):
* `r=16`, `alpha=32`, `dropout=0.05`
### Hyperparameters
* **Batch size:** 2–8 (depending on VRAM)
* **Gradient Accumulation:** 8–16
* **LR:** 2e-4
* **Epochs:** 1–3
* **Optimizer:** AdamW / paged optimizers (Unsloth)
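With gradient accumulation, the effective batch size is the per-device batch multiplied by the accumulation steps; for the ranges above that works out as follows (a simple arithmetic sketch):

```python
# Effective batch size = per-device batch size * gradient-accumulation steps.
per_device_batch = 2   # low end of the 2-8 range above
grad_accum_steps = 16  # high end of the 8-16 range above
effective_batch = per_device_batch * grad_accum_steps
print(effective_batch)  # 32
```

So even on a low-VRAM GPU, accumulation keeps the effective batch in the 16-128 range.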
### Speeds & Compute
* **Hardware:** 1× RTX 4090 / A100 / local GPU
* **Training Time:** 1–3 hours (approx)
* **Checkpoint Size:** Tiny (LoRA weights only)
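The "tiny checkpoint" claim follows from LoRA storing only the low-rank A/B matrices. A back-of-envelope estimate, assuming Qwen2.5-0.5B's published dimensions (24 layers, hidden size 896, KV projection width 128, MLP width 4864; double-check against the actual config) and `r=16` on all linear projections:

```python
# Each LoRA pair on a (d_in -> d_out) linear layer adds r * (d_in + d_out)
# parameters. Dimensions below are assumed from Qwen2.5-0.5B's public config.
r = 16
layers = 24
shapes = {
    "q_proj": (896, 896), "k_proj": (896, 128), "v_proj": (896, 128),
    "o_proj": (896, 896), "gate_proj": (896, 4864),
    "up_proj": (896, 4864), "down_proj": (4864, 896),
}
lora_params = layers * sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
size_mb = lora_params * 2 / 1e6  # fp16: 2 bytes per parameter
print(f"{lora_params:,} params ~= {size_mb:.1f} MB")
```

Roughly 8.8M parameters, i.e. under 20 MB in fp16, versus ~1 GB for full 0.5B-model weights.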
---
## Evaluation
*(Formal benchmark results are pending; the notes below are informal observations.)*
* Model evaluated on small reasoning + text-generation samples
* Performs well for short instructions
* Limited long-context and deep reasoning
---
## Environmental Impact
* **Hardware:** 1 GPU (consumer or cloud)
* **Carbon estimate:** Low (small model + LoRA)
---
## Technical Specs
* **Architecture:** Qwen2.5 0.5B
* **Objective:** Causal LM
* **Adapters:** LoRA (PEFT)
* **Quantization:** bnb 4-bit
---
## Citation
```bibtex
@misc{Sriramdayal2025QwenLoRA,
title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
author={Sriram Dayal},
year={2025},
howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```
---
## Model Card Author
**@Sriramdayal**
---
### Framework versions
- PEFT 0.18.0