	---
	license: apache-2.0
	base_model: meta-llama/Llama-3.2-1B-Instruct
	language:
	- en
	tags:
	- lora
	- peft
	- adapter
	- llama
	- fine-tuned
	- horoscope
	- creative-writing
	- on-device
	library_name: peft
	pipeline_tag: text-generation
	---

	# Unhinged Horoscopes — LoRA adapter

	A ~22MB LoRA adapter for Llama 3.2 1B Instruct that overrides the base model's tone and turns it into a generator of absurd, specific, chaotic-neutral horoscopes from a short (~30-token) prompt. The adapter is tuned narrowly to one input format and one output length; it does not significantly alter the base model's general knowledge or safety behaviour.

	If you only want to run the model, grab the merged and quantised GGUF at [edbuildingstuff/unhinged-horoscopes](https://huggingface.co/edbuildingstuff/unhinged-horoscopes) (~770MB, drops into `llama.cpp` / `ollama` / mobile FFI as a single file).

	This adapter repo is for developers who want to:

	- inspect what was changed
	- merge it into a different base build, dtype, or runtime
	- continue training on top of it
	- reproduce the result from scratch

	## Adapter config

	| Field | Value |
	|---|---|
	| Base model | [`meta-llama/Llama-3.2-1B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) |
	| LoRA rank (`r`) | 16 |
	| LoRA alpha | 32 |
	| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` (all 7 projection layers) |
	| Adapter size | ~22MB |
	| Format | Safetensors (PEFT) |
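
As a sanity check on the ~22MB figure, the adapter size can be estimated from the rank and the shapes of the targeted layers. The sketch below assumes the layer dimensions from the base model's published `config.json` (hidden size 2048, MLP size 8192, 16 layers, 8 key/value heads of dimension 64) and 2 bytes per weight; none of those values are stated in this card.

```python
# Estimate LoRA parameter count and file size for r=16 on all 7 projections.
# Shapes are assumptions taken from the base model's config.json, not from
# this card: hidden=2048, intermediate=8192, 16 layers, KV dim=8*64=512.
R = 16
ALPHA = 32
HIDDEN, INTERMEDIATE, KV_DIM, LAYERS = 2048, 8192, 8 * 64, 16

# (in_features, out_features) of each targeted linear layer
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank: int) -> int:
    """Each module adds A (rank x in) and B (out x rank): rank * (in + out)."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in SHAPES.values())
    return per_layer * LAYERS


total = lora_params(R)
size_mb = total * 2 / 1e6  # assumption: 2 bytes per weight (FP16/BF16)
scaling = ALPHA / R        # factor applied to the low-rank update
print(total, round(size_mb, 1), scaling)
```

Under those assumptions the estimate is about 11.3M trainable parameters, or roughly 22.5MB, which is in line with the adapter size listed above.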

	## Prompt format

	Every training example was a single user message with no system prompt. Match this format exactly: the fine-tune is narrow, so prompts that deviate from it drift back towards base-model behaviour.

	```
	Sign: Aries
	Category: Daily Chaos
	Date: 2026-05-02
	Generate an unhinged horoscope.
	```

	Required values:

	- `Sign` is one of: `Aries`, `Taurus`, `Gemini`, `Cancer`, `Leo`, `Virgo`, `Libra`, `Scorpio`, `Sagittarius`, `Capricorn`, `Aquarius`, `Pisces`
	- `Category` is one of: `Daily Chaos`, `Love Life`, `Career`, `Vibe Check`
	- `Date` is `YYYY-MM-DD`

	Apply the standard Llama 3.2 chat template around the user message.
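
Because the adapter is sensitive to the exact template, it can help to build the user message programmatically and reject values outside the lists above. This is a minimal sketch: `build_prompt` and the constant names are hypothetical helpers, not part of this repo.

```python
from datetime import date

SIGNS = (
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
)
CATEGORIES = ("Daily Chaos", "Love Life", "Career", "Vibe Check")


def build_prompt(sign: str, category: str, day: date) -> str:
    """Return the 4-line user message; raise ValueError on off-template values."""
    if sign not in SIGNS:
        raise ValueError(f"unknown sign: {sign!r}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return (
        f"Sign: {sign}\n"
        f"Category: {category}\n"
        f"Date: {day.isoformat()}\n"  # isoformat() yields YYYY-MM-DD
        "Generate an unhinged horoscope."
    )


print(build_prompt("Aries", "Daily Chaos", date(2026, 5, 2)))
```

The returned string is the `content` of the single user message; the chat template is still applied separately.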

	## Quick start

	### Load with PEFT

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	base_id = "meta-llama/Llama-3.2-1B-Instruct"
	adapter_id = "edbuildingstuff/unhinged-horoscopes-lora"

	base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained(base_id)
	model = PeftModel.from_pretrained(base, adapter_id)

	prompt = (
	    "Sign: Leo\n"
	    "Category: Career\n"
	    "Date: 2026-05-02\n"
	    "Generate an unhinged horoscope."
	)

	# return_dict=True also returns the attention mask, which generate() uses
	inputs = tokenizer.apply_chat_template(
	    [{"role": "user", "content": prompt}],
	    add_generation_prompt=True,
	    return_tensors="pt",
	    return_dict=True,
	).to(model.device)

	out = model.generate(
	    **inputs,
	    max_new_tokens=120,
	    do_sample=True,
	    temperature=0.9,
	    top_p=0.9,
	)
	# Decode only the newly generated tokens, not the prompt
	prompt_len = inputs["input_ids"].shape[1]
	print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
	```

	### Merge into FP16 base

	```python
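	import torch
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	base_id = "meta-llama/Llama-3.2-1B-Instruct"

	# Request FP16 explicitly: the base checkpoint is stored in bfloat16,
	# so torch_dtype="auto" would yield a BF16 merge rather than FP16.
	base = AutoModelForCausalLM.from_pretrained(
	    base_id,
	    torch_dtype=torch.float16,
	    device_map="cpu",
	)
	model = PeftModel.from_pretrained(base, "edbuildingstuff/unhinged-horoscopes-lora")
	merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights
	merged.save_pretrained("./merged_hf")
	# The adapter does not change the tokenizer, so save the base tokenizer alongside
	AutoTokenizer.from_pretrained(base_id).save_pretrained("./merged_hf")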
	```

	Output: `./merged_hf/` — FP16 merged base + adapter, ~2.4GB safetensors.
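
Conceptually, merging folds the scaled low-rank update into each targeted weight matrix, `W' = W + (alpha / r) * B @ A`, so the merged model needs no adapter at inference time. The toy example below illustrates this standard LoRA formulation in plain Python with made-up 2×2 numbers; it is not PEFT's implementation, and the helper names are hypothetical.

```python
# Toy illustration of a LoRA merge: W' = W + (alpha / r) * B @ A.
# All matrices are made-up numbers; real layers are far larger.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add_scaled(w, delta, scale):
    """Return w + scale * delta, element-wise."""
    return [[wv + scale * dv for wv, dv in zip(wr, dr)] for wr, dr in zip(w, delta)]


r, alpha = 1, 2                # toy rank; same alpha/r ratio as r=16, alpha=32
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (out x in)
B = [[0.5], [1.0]]             # LoRA B (out x r)
A = [[2.0, -1.0]]              # LoRA A (r x in)
x = [[3.0], [4.0]]             # input column vector

scale = alpha / r
merged_W = add_scaled(W, matmul(B, A), scale)

# Unmerged path: base output plus the scaled adapter output
unmerged = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)
# Merged path: a single matrix multiply with the folded weight
merged = matmul(merged_W, x)
print(merged_W, merged, unmerged)
```

Both paths give the same output, which is why the merged checkpoint can be converted and quantised like any ordinary model.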

	### Convert to GGUF and quantise to Q4_K_M

	Clone and build `llama.cpp` (one-time):

	```bash
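	git clone https://github.com/ggerganov/llama.cpp.git
	cmake -B llama.cpp/build llama.cpp
	cmake --build llama.cpp/build --config Release
	# Python dependencies for the conversion script used in the next step
	pip install -r llama.cpp/requirements.txt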
	```

	Convert merged FP16 to GGUF, then quantise:

	```bash
	python llama.cpp/convert_hf_to_gguf.py ./merged_hf \
	    --outtype f16 \
	    --outfile ./unhinged-horoscopes-f16.gguf

	llama.cpp/build/bin/llama-quantize \
	    ./unhinged-horoscopes-f16.gguf \
	    ./unhinged-horoscopes-q4_k_m.gguf \
	    Q4_K_M
	```

	Outputs:

	- `unhinged-horoscopes-f16.gguf` — FP16 GGUF (~2.48GB)
	- `unhinged-horoscopes-q4_k_m.gguf` — Q4_K_M GGUF (~770MB), ready to drop into `llama.cpp`, `ollama`, or `llamadart`

	For a different precision (Q5_K_M, Q8_0, IQ-quants, etc.) substitute the last argument to `llama-quantize`.
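
To produce several precisions from the same FP16 GGUF, the `llama-quantize` call can simply be repeated with a different final argument. The helper below is a hypothetical sketch (`quantize_commands` is not part of `llama.cpp`); it only builds the argument lists, reusing the binary path and file-naming pattern from the commands above, and leaves execution to the caller.

```python
from pathlib import Path

QUANTIZE_BIN = "llama.cpp/build/bin/llama-quantize"  # path from the build step


def quantize_commands(f16_gguf: str, quant_types: list[str]) -> list[list[str]]:
    """Build one llama-quantize argv per quant type: <bin> <in> <out> <type>."""
    src = Path(f16_gguf)
    stem = src.name.removesuffix("-f16.gguf")
    commands = []
    for qtype in quant_types:
        # Output name follows the pattern used above, e.g. ...-q4_k_m.gguf
        out = src.with_name(f"{stem}-{qtype.lower()}.gguf")
        commands.append([QUANTIZE_BIN, str(src), str(out), qtype])
    return commands


cmds = quantize_commands("unhinged-horoscopes-f16.gguf", ["Q4_K_M", "Q5_K_M", "Q8_0"])
for cmd in cmds:
    print(" ".join(cmd))
    # To actually run one: subprocess.run(cmd, check=True)
```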

	### Shortcut: pre-merged + Q4_K_M GGUF

	If you don't need to inspect the intermediates, the merged Q4_K_M GGUF is published at [edbuildingstuff/unhinged-horoscopes](https://huggingface.co/edbuildingstuff/unhinged-horoscopes). Drop-in usable in `llama.cpp` / `ollama` / `llamadart`.

	## What the adapter changes

	- Tone register. Confident, absurd, specific, chaotic neutral. The trained register dominates on prompts that match the 4-line template.
	- Output length. 1 to 3 sentences, ~30 to 80 tokens. The model does not pad, does not preface with "Sure, here is your horoscope", does not list bullets.
	- Format adherence. Responds directly to the 4-line prompt template without preamble.
	- Per-sign personality threads. Subtle (Aries impulsive, Capricorn workaholic, Pisces dreamer, Aquarius alien, etc.) — present but not heavy-handed.

	## What the adapter does not change

	- Base safety behaviour is largely intact. The training set is benign and short, so the adapter does not significantly rewrite the base model's refusal patterns.
	- General knowledge is preserved. Off-template prompts (free-form questions, advice-seeking, factual queries) still resolve through the base model. The adapter is narrow on the prompt template and does not crowd out base capability.
	- Off-template behaviour is uncalibrated. If you stray from the 4-line template, expect base-Llama-with-some-tone-bleed, not horoscope output.

	If you stack this adapter with another LoRA, expect tone interference; the chaotic-neutral register tends to dominate.

	## Training

	| Field | Value |
	|---|---|
	| Base model | [`meta-llama/Llama-3.2-1B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) |
	| Method | LoRA, all 7 projection modules |
	| Rank (`r`) | 16 |
	| Alpha | 32 |
	| Epochs | 3 |
	| Batch size | 4 |
	| Learning rate | 2e-4 |
	| Max sequence length | 256 tokens |
	| Training platform | [Ertas.AI](https://www.ertas.ai) (managed fine-tuning, GPUs pre-configured) |
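
For a sense of scale, the step count implied by these settings can be worked out directly. This is a back-of-envelope sketch that assumes all 480 examples listed in the Dataset section are used for training, with no gradient accumulation and no held-out split; the card states neither.

```python
import math

EXAMPLES = 480      # dataset size from the Dataset section
BATCH_SIZE = 4
EPOCHS = 3
MAX_SEQ_LEN = 256   # tokens

# Assumptions (not stated in the card): gradient accumulation = 1, no eval split.
steps_per_epoch = math.ceil(EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
# Upper bound on tokens processed, if every sequence were padded to the maximum
max_tokens_seen = EXAMPLES * EPOCHS * MAX_SEQ_LEN
print(steps_per_epoch, total_steps, max_tokens_seen)
```

Under those assumptions the run is 120 optimizer steps per epoch and 360 in total, which is consistent with a small, narrow fine-tune.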

	### Dataset

	| Field | Value |
	|---|---|
	| Size | 480 examples |
	| Coverage | 12 signs × 4 categories × 10 each (no missing combos) |
	| Format | ShareGPT-style JSONL (one user + one assistant message per line, no system prompt; the sample pairs use `messages` with `role`/`content` keys) |
	| Date conditioning | ~70% date-agnostic, ~30% date-conditioned (season, day-of-week, month vibes) |
	| Output length distribution | 1 to 3 sentences, 30 to 80 tokens target |
	| Hard rules | No real people, brands, or locations. No mean-spirited content. No harmful advice (even as jokes). No repetitive patterns. |
	| Validation | `dataset/validate_dataset.py` checks JSON schema, sign and category coverage, response length, and flags generic phrases |

	A small Alpaca-format mirror of the dataset (`{instruction, output}`) was also generated for trainers that prefer Alpaca over ShareGPT.
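
A conversion along these lines would produce that mirror. This is a sketch, not the script used for this repo: it assumes the `messages` layout shown in the sample training pairs and the `{instruction, output}` keys named above, and `to_alpaca` is a hypothetical helper.

```python
import json


def to_alpaca(jsonl_line: str) -> dict:
    """Map one chat record (user + assistant message) to {instruction, output}."""
    messages = json.loads(jsonl_line)["messages"]
    by_role = {m["role"]: m["content"] for m in messages}
    return {"instruction": by_role["user"], "output": by_role["assistant"]}


# One record in the same shape as the sample training pairs
line = json.dumps({
    "messages": [
        {"role": "user", "content": "Sign: Capricorn\nCategory: Career\nDate: 2026-04-15\nGenerate an unhinged horoscope."},
        {"role": "assistant", "content": "Your boss will email you at 11:47pm. It will just say 'hmm'. Do not respond. Do not sleep. Just know."},
    ]
})
record = to_alpaca(line)
print(json.dumps(record, indent=2))
```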

	### Sample training pairs

	```jsonl
	{"messages": [{"role": "user", "content": "Sign: Aries\nCategory: Daily Chaos\nDate: 2026-04-15\nGenerate an unhinged horoscope."}, {"role": "assistant", "content": "You will argue with a GPS today. You will lose. It knows things about you that you told no one. Lucky object: a fork you've been suspicious of since February."}]}
	{"messages": [{"role": "user", "content": "Sign: Capricorn\nCategory: Career\nDate: 2026-04-15\nGenerate an unhinged horoscope."}, {"role": "assistant", "content": "Your boss will email you at 11:47pm. It will just say 'hmm'. Do not respond. Do not sleep. Just know."}]}
	```
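
The coverage claim (12 signs × 4 categories × 10 examples = 480) is easy to check mechanically. The sketch below is not the repo's `dataset/validate_dataset.py`, whose contents are not shown here; it is a hypothetical stand-in that reads the `Sign:` and `Category:` lines from each user message and counts examples per combination.

```python
import json
from collections import Counter


def coverage(jsonl_lines):
    """Count examples per (sign, category), read from each user message."""
    counts = Counter()
    for line in jsonl_lines:
        messages = json.loads(line)["messages"]
        user = next(m["content"] for m in messages if m["role"] == "user")
        # Parse "Key: value" lines; the final instruction line has no ": "
        fields = dict(part.split(": ", 1) for part in user.splitlines() if ": " in part)
        counts[(fields["Sign"], fields["Category"])] += 1
    return counts


def is_balanced(counts, n_signs=12, n_categories=4, per_combo=10):
    """True when there are n_signs * n_categories combos, each seen per_combo times."""
    return len(counts) == n_signs * n_categories and set(counts.values()) == {per_combo}


sample = [
    json.dumps({"messages": [
        {"role": "user", "content": "Sign: Aries\nCategory: Daily Chaos\nDate: 2026-04-15\nGenerate an unhinged horoscope."},
        {"role": "assistant", "content": "(placeholder reply)"},  # content is not inspected
    ]}),
]
counts = coverage(sample)
print(dict(counts), is_balanced(counts))
```

With a single record the balance check is expected to fail; it should only pass once all 48 combinations appear exactly 10 times.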

	## Files in this repo

	| File | Purpose |
	|---|---|
	| `adapter_config.json` | PEFT adapter configuration (rank, alpha, target modules) |
	| `adapter_model.safetensors` | LoRA delta weights (~22MB) |
	| Tokenizer files (if shipped) | Inherited from base; re-load from `meta-llama/Llama-3.2-1B-Instruct` if absent |

	## Related

	- Merged + Q4_K_M GGUF (run-ready): [edbuildingstuff/unhinged-horoscopes](https://huggingface.co/edbuildingstuff/unhinged-horoscopes)
	- Reference Android app (Flutter + `llamadart`): Unhinged Horoscopes — [Google Play](https://play.google.com/store/apps/details?id=ai.ertas.horoscope) / [horoscope.ertas.ai](https://horoscope.ertas.ai) (bundle id `ai.ertas.horoscope`)
	- Fine-tuning platform: [Ertas.AI](https://www.ertas.ai)

	## License and credits

	- Adapter weights: Apache-2.0 (downstream use must also comply with [Meta's Llama 3.2 community licence](https://www.llama.com/llama3_2/license/))
	- Training dataset: MIT
	- Fine-tuned with [Ertas.AI](https://www.ertas.ai), the managed fine-tuning platform that ran this LoRA on pre-configured GPUs end-to-end
	- Built by Edward Yang ([edbuildingstuff](https://huggingface.co/edbuildingstuff)) as a reference POC for Ertas Product A: build your own on-device AI model and ship it inside your app. App live at [horoscope.ertas.ai](https://horoscope.ertas.ai) / [Google Play](https://play.google.com/store/apps/details?id=ai.ertas.horoscope).