|
|
|
|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
library_name: peft |
|
|
tags: |
|
|
- text-generation |
|
|
- transformers |
|
|
- peft |
|
|
- lora |
|
|
- qwen |
|
|
- qwen2 |
|
|
- reddit |
|
|
- llama-factory |
|
|
datasets: |
|
|
- olmo-data/dolma-v1_6-reddit |
|
|
base_model: Qwen/Qwen2-0.5B |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# Qwen2-0.5B Reddit LoRA Adapter |
|
|
|
|
|
**Repo:** [iko-01/LLaMA-1](https://huggingface.co/iko-01/LLaMA-1) |
|
|
**Base model:** [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B) |
|
|
**Adapter type:** LoRA (via LLaMA-Factory + QLoRA) |
|
|
**Intended use:** Simulating casual, Reddit-style comments, discussions, and thread replies |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This is a **LoRA adapter** fine-tuned on top of **Qwen2-0.5B** using a filtered subset of Reddit posts & comments from the Dolma dataset (v1.6 Reddit portion). |
|
|
|
|
|
The model is trained to generate informal, conversational text typical of Reddit threads, including sarcasm, meme references, casual opinions, the general vibe of upvote-driven discussion, and natural thread continuations.
|
|
|
|
|
Despite the repository name (`LLaMA-1`), this is **not** a LLaMA model; it uses the **Qwen2** architecture.
|
|
|
|
|
### Key Characteristics |
|
|
|
|
|
- Extremely lightweight (only ~0.5B base + small LoRA adapter) |
|
|
- Runs comfortably on consumer GPUs, laptops, or even decent CPUs |
|
|
- Fast inference (very suitable for local prototyping, chatbots, Reddit simulators, etc.) |
|
|
- Casual / internet / meme-friendly tone |
|
|
|
|
|
## Training Details |
|
|
|
|
|
- **Framework:** LLaMA-Factory |
|
|
- **Training method:** QLoRA (4-bit base quantization + LoRA) |
|
|
- **Dataset size:** ~6,000 high-quality, deduplicated Reddit samples |
|
|
- **Hardware:** Google Colab T4 (single GPU) |
|
|
- **Training duration:** ~30 minutes
|
|
- **Hyperparameters:** |
|
|
|
|
|
| Parameter | Value | |
|
|
|------------------------|-----------| |
|
|
| LoRA rank (r) | 32 | |
|
|
| LoRA alpha | 64 | |
|
|
| Learning rate | 2e-4 | |
|
|
| Batch size | 2 | |
|
|
| Gradient accumulation | 16 | |
|
|
| Epochs | 3 | |
|
|
| Optimizer | AdamW | |
|
|
| Warmup ratio | 0.03 | |
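
For reproducibility, the hyperparameters above roughly correspond to a LLaMA-Factory config like the sketch below. The dataset name and file paths are illustrative placeholders, not the actual training files:

```yaml
# Hypothetical LLaMA-Factory QLoRA config matching the table above.
model_name_or_path: Qwen/Qwen2-0.5B
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 32
lora_alpha: 64
quantization_bit: 4            # QLoRA: 4-bit quantized base weights
dataset: reddit_subset         # placeholder dataset name
learning_rate: 2.0e-4
per_device_train_batch_size: 2
gradient_accumulation_steps: 16   # effective batch size = 2 x 16 = 32
num_train_epochs: 3
warmup_ratio: 0.03
fp16: true
```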
|
|
|
|
|
## Usage |
|
|
|
|
|
```bash |
|
|
pip install -U transformers peft torch accelerate bitsandbytes # bitsandbytes optional but recommended |
|
|
``` |
|
|
|
|
|
```python |
|
|
import torch |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
from peft import PeftModel |
|
|
|
|
|
base_model_id = "Qwen/Qwen2-0.5B" |
|
|
adapter_id = "iko-01/LLaMA-1" |
|
|
|
|
|
# Load base model |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
base_model_id, |
|
|
torch_dtype=torch.float16, |
|
|
device_map="auto", |
|
|
trust_remote_code=True |
|
|
) |
|
|
|
|
|
# Apply LoRA adapter |
|
|
model = PeftModel.from_pretrained(model, adapter_id) |
|
|
model.eval() |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True) |
|
|
|
|
|
# Example prompt |
|
|
prompt = """Continue this r/gaming thread:
|
|
|
|
|
After playing for 50 hours I finally""" |
|
|
|
|
|
messages = [{"role": "user", "content": prompt}] |
|
|
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
|
|
|
|
inputs = tokenizer(text, return_tensors="pt").to(model.device) |
|
|
|
|
|
with torch.no_grad(): |
|
|
out = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=120, |
|
|
temperature=0.75, |
|
|
top_p=0.92, |
|
|
repetition_penalty=1.08, |
|
|
do_sample=True |
|
|
) |
|
|
|
|
|
response = tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)  # keep only newly generated tokens
|
|
print(response) |
|
|
``` |
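
If you prefer to build prompts by hand instead of calling `apply_chat_template`, Qwen2's template follows the ChatML convention. Below is a minimal sketch of that layout; it is a simplified illustration (the tokenizer's actual template is authoritative and may also insert a system turn), and `build_chatml_prompt` is a hypothetical helper, not part of any library:

```python
def build_chatml_prompt(user_message: str) -> str:
    """Wrap a single user turn in ChatML markers and open the assistant turn."""
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# The resulting string can be passed straight to tokenizer(...) for generation.
print(build_chatml_prompt("Continue this r/gaming thread: After playing for 50 hours I finally"))
```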
|
|
|
|
|
### Example Outputs |
|
|
|
|
|
**Prompt:** |
|
|
`Complete this r/gaming discussion: After playing for 50 hours I finally` |
|
|
|
|
|
**Typical model output:** |
|
|
`...realized the main story is mid but the side content is actually peak. The open world exploration in the frozen north hits different. Spent like 6 hours just fishing and upgrading my house and I don't even feel bad about it lmao. Anyone else 100% the fishing minigame before the final boss?` |
|
|
|
|
|
## Limitations & Responsible Use |
|
|
|
|
|
- **Model size:** As a 0.5B model, it has limited world knowledge, reasoning depth, and coherence over long contexts compared to 7B+ models.


- **Reddit bias:** The training data comes from Reddit; expect informal language, slang, sarcasm, exaggeration, memes, hot takes, and occasionally toxic phrasing.


- **Hallucinations:** It can confidently generate plausible but incorrect facts, especially outside popular Reddit topics.


- **Not for production or sensitive use:** Not suitable for factual Q&A, customer support, education, legal/medical advice, or any other high-stakes application.


- **English only:** The fine-tune was done exclusively on English Reddit content.
|
|
|
|
|
Use this model mainly for **creative**, **entertainment**, or **research** purposes (e.g. generating synthetic discussion data, building Reddit-style bots, style transfer experiments). |
|
|
|
|
|
## Citation / Thanks |
|
|
|
|
|
If you use this adapter in your work, feel free to mention: |
|
|
|
|
|
> Fine-tuned with LLaMA-Factory on Qwen2-0.5B using Reddit data from Dolma. |
|
|
|
|
|
Big thanks to the Qwen team, LLaMA-Factory contributors, and AllenAI (Dolma dataset). |
|
|
|
|
|
Happy hacking!
|
|