---
license: apache-2.0
language:
- en
library_name: peft
tags:
- text-generation
- transformers
- peft
- lora
- qwen
- qwen2
- reddit
- llama-factory
datasets:
- olmo-data/dolma-v1_6-reddit
base_model: Qwen/Qwen2-0.5B
pipeline_tag: text-generation
---
# Qwen2-0.5B Reddit LoRA Adapter
**Repo:** [iko-01/LLaMA-1](https://huggingface.co/iko-01/LLaMA-1)
**Base model:** [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B)
**Adapter type:** LoRA (via LLaMA-Factory + QLoRA)
**Intended use:** Simulating casual, Reddit-style comments, discussions, and thread replies
## Model Description
This is a **LoRA adapter** fine-tuned on top of **Qwen2-0.5B** using a filtered subset of Reddit posts & comments from the Dolma dataset (v1.6 Reddit portion).
The model is trained to generate informal, conversational text typical of Reddit threads, including sarcasm, meme references, casual opinions, an upvote/downvote vibe, and natural thread continuations.
Despite the repository name (`LLaMA-1`), this is **not** a LLaMA model; it is purely the **Qwen2** architecture.
### Key Characteristics
- Extremely lightweight (only ~0.5B base + small LoRA adapter)
- Runs comfortably on consumer GPUs, laptops, or even decent CPUs
- Fast inference (very suitable for local prototyping, chatbots, Reddit simulators, etc.)
- Casual / internet / meme-friendly tone
## Training Details
- **Framework:** LLaMA-Factory
- **Training method:** QLoRA (4-bit base quantization + LoRA)
- **Dataset size:** ~6,000 high-quality, deduplicated Reddit samples
- **Hardware:** Google Colab T4 (single GPU)
- **Training duration:** ~30 minutes
- **Hyperparameters:**
| Parameter | Value |
|------------------------|-----------|
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Learning rate | 2e-4 |
| Batch size | 2 |
| Gradient accumulation | 16 |
| Epochs | 3 |
| Optimizer | AdamW |
| Warmup ratio | 0.03 |
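The table above translates to a straightforward LoRA configuration. A minimal sketch as a plain Python dict, where `target_modules` is an assumption (LLaMA-Factory's `all` default adapts every linear projection in Qwen2-style blocks) and the dropout value is not stated in this card:

```python
# Hyperparameters from the table above as a plain config dict.
# target_modules is an assumption: LLaMA-Factory's "all" default adapts
# every linear projection in Qwen2-style transformer blocks.
lora_config = {
    "r": 32,           # LoRA rank
    "lora_alpha": 64,  # scaling numerator
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
}

# The effective scaling applied to the adapter's output is alpha / r,
# so alpha = 2 * r gives each adapter update a scale factor of 2.
scaling = lora_config["lora_alpha"] / lora_config["r"]
print(scaling)  # 2.0
```

Setting alpha to twice the rank is a common convention; it keeps the adapter's contribution strong without retuning the learning rate for each rank.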
## Usage
```bash
pip install -U transformers peft torch accelerate bitsandbytes # bitsandbytes optional but recommended
```
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model_id = "Qwen/Qwen2-0.5B"
adapter_id = "iko-01/LLaMA-1"
# Load base model
model = AutoModelForCausalLM.from_pretrained(
base_model_id,
torch_dtype=torch.float16,
device_map="auto",
trust_remote_code=True
)
# Apply LoRA adapter
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
# Example prompt
prompt = """Complete this r/gaming discussion:
After playing for 50 hours I finally"""
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
out = model.generate(
**inputs,
max_new_tokens=120,
temperature=0.75,
top_p=0.92,
repetition_penalty=1.08,
do_sample=True
)
response = tokenizer.decode(out[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
```
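The final decode line strips the prompt from the output, because `generate()` returns the input IDs followed by the newly generated tokens. A toy illustration of that slice with plain lists (the token IDs are made up):

```python
# generate() returns [prompt tokens + new tokens]; slicing from
# len(prompt_ids) keeps only the newly generated completion.
prompt_ids = [101, 7592, 2088]                # stand-in prompt token IDs
full_output = prompt_ids + [2003, 2204, 102]  # what generate() would return
completion = full_output[len(prompt_ids):]
print(completion)  # [2003, 2204, 102]
```

Without this slice, `tokenizer.decode(out[0], ...)` would echo the prompt back at the start of every response.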
### Example Outputs
**Prompt:**
`Complete this r/gaming discussion: After playing for 50 hours I finally`
**Typical model output:**
`...realized the main story is mid but the side content is actually peak. The open world exploration in the frozen north hits different. Spent like 6 hours just fishing and upgrading my house and I don't even feel bad about it lmao. Anyone else 100% the fishing minigame before the final boss?`
## Limitations & Responsible Use
- **Model size:** As a 0.5B model, it has limited world knowledge, reasoning depth, and coherence over very long contexts compared to 7B+ models.
- **Reddit bias:** The training data comes from Reddit, so expect informal language, slang, sarcasm, exaggeration, memes, controversial hot-take opinions, and sometimes toxic phrasing.
- **Hallucinations:** It can confidently generate plausible but incorrect facts, especially outside popular Reddit topics.
- **Not for production or sensitive use:** Not suitable for factual Q&A, customer support, education, legal/medical advice, or any high-stakes application.
- **English only:** The fine-tune was done exclusively on English Reddit content.
Use this model mainly for **creative**, **entertainment**, or **research** purposes (e.g. generating synthetic discussion data, building Reddit-style bots, style transfer experiments).
## Citation / Thanks
If you use this adapter in your work, feel free to mention:
> Fine-tuned with LLaMA-Factory on Qwen2-0.5B using Reddit data from Dolma.
Big thanks to the Qwen team, LLaMA-Factory contributors, and AllenAI (Dolma dataset).
Happy hacking! 🚀