---
license: apache-2.0
language:
- en
library_name: peft
tags:
- text-generation
- transformers
- peft
- lora
- qwen
- qwen2
- reddit
- llama-factory
datasets:
- olmo-data/dolma-v1_6-reddit
base_model: Qwen/Qwen2-0.5B
pipeline_tag: text-generation
---
# Qwen2-0.5B Reddit LoRA Adapter
**Repo:** [iko-01/LLaMA-1](https://huggingface.co/iko-01/LLaMA-1)
**Base model:** [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B)
**Adapter type:** LoRA (via LLaMA-Factory + QLoRA)
**Intended use:** Simulating casual, Reddit-style comments, discussions, and thread replies
## Model Description
This is a **LoRA adapter** fine-tuned on top of **Qwen2-0.5B** using a filtered subset of Reddit posts & comments from the Dolma dataset (v1.6 Reddit portion).
The model is trained to generate informal, conversational text typical of Reddit threads, including sarcasm, meme references, casual opinions, an upvote/downvote vibe, and natural thread continuations.
Despite the repository name (`LLaMA-1`), this is **not** a LLaMA model; it is purely the **Qwen2** architecture.
### Key Characteristics
- Extremely lightweight (only ~0.5B base + small LoRA adapter)
- Runs comfortably on consumer GPUs, laptops, or even decent CPUs
- Fast inference (very suitable for local prototyping, chatbots, Reddit simulators, etc.)
- Casual / internet / meme-friendly tone
## Training Details
- **Framework:** LLaMA-Factory
- **Training method:** QLoRA (4-bit base quantization + LoRA)
- **Dataset size:** ~6,000 high-quality, deduplicated Reddit samples
- **Hardware:** Google Colab T4 (single GPU)
- **Training duration:** ~30 minutes
- **Hyperparameters:**
| Parameter | Value |
|------------------------|-----------|
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Learning rate | 2e-4 |
| Batch size | 2 |
| Gradient accumulation | 16 |
| Epochs | 3 |
| Optimizer | AdamW |
| Warmup ratio | 0.03 |
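The table above translates to a straightforward LoRA configuration. A minimal sketch as a plain Python dict, where `target_modules` is an assumption (LLaMA-Factory's `all` default adapts every linear projection in Qwen2-style blocks) and the dropout value is not stated in this card:

```python
# Hyperparameters from the table above as a plain config dict.
# target_modules is an assumption: LLaMA-Factory's "all" default adapts
# every linear projection in Qwen2-style transformer blocks.
lora_config = {
    "r": 32,           # LoRA rank
    "lora_alpha": 64,  # scaling numerator
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
}

# The effective scaling applied to the adapter's output is alpha / r,
# so alpha = 2 * r gives each adapter update a scale factor of 2.
scaling = lora_config["lora_alpha"] / lora_config["r"]
print(scaling)  # 2.0
```

Setting alpha to twice the rank is a common convention; it keeps the adapter's contribution strong without retuning the learning rate for each rank.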
## Usage
```bash
pip install -U transformers peft torch accelerate bitsandbytes # bitsandbytes optional but recommended
```
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model_id = "Qwen/Qwen2-0.5B"
adapter_id = "iko-01/LLaMA-1"
# Load base model
model = AutoModelForCausalLM.from_pretrained(
base_model_id,
torch_dtype=torch.float16,
device_map="auto",
trust_remote_code=True
)
# Apply LoRA adapter
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
# Example prompt
prompt = """Complete this r/gaming discussion:
After playing for 50 hours I finally"""
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
out = model.generate(
**inputs,
max_new_tokens=120,
temperature=0.75,
top_p=0.92,
repetition_penalty=1.08,
do_sample=True
)
response = tokenizer.decode(out[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
```
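The final decode line strips the prompt from the output, because `generate()` returns the input IDs followed by the newly generated tokens. A toy illustration of that slice with plain lists (the token IDs are made up):

```python
# generate() returns [prompt tokens + new tokens]; slicing from
# len(prompt_ids) keeps only the newly generated completion.
prompt_ids = [101, 7592, 2088]                # stand-in prompt token IDs
full_output = prompt_ids + [2003, 2204, 102]  # what generate() would return
completion = full_output[len(prompt_ids):]
print(completion)  # [2003, 2204, 102]
```

Without this slice, `tokenizer.decode(out[0], ...)` would echo the prompt back at the start of every response.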
### Example Outputs
**Prompt:**
`Complete this r/gaming discussion: After playing for 50 hours I finally`
**Typical model output:**
`...realized the main story is mid but the side content is actually peak. The open world exploration in the frozen north hits different. Spent like 6 hours just fishing and upgrading my house and I don't even feel bad about it lmao. Anyone else 100% the fishing minigame before the final boss?`
## Limitations & Responsible Use
- **Model size:** As a 0.5B model, it has limited world knowledge, reasoning depth, and coherence over very long contexts compared to 7B+ models.
- **Reddit bias:** The training data comes from Reddit, so expect informal language, slang, sarcasm, exaggeration, memes, controversial hot-take opinions, and sometimes toxic phrasing.
- **Hallucinations:** It can confidently generate plausible but incorrect facts, especially outside popular Reddit topics.
- **Not for production or sensitive use:** Not suitable for factual Q&A, customer support, education, legal/medical advice, or any high-stakes application.
- **English only:** The fine-tune was done exclusively on English Reddit content.
Use this model mainly for **creative**, **entertainment**, or **research** purposes (e.g. generating synthetic discussion data, building Reddit-style bots, style transfer experiments).
## Citation / Thanks
If you use this adapter in your work, feel free to mention:
> Fine-tuned with LLaMA-Factory on Qwen2-0.5B using Reddit data from Dolma.
Big thanks to the Qwen team, LLaMA-Factory contributors, and AllenAI (Dolma dataset).
Happy hacking! 🚀