Pythia-1B-Alpaca: The Overachieving 1B Model

TL;DR: A Pythia-1B model fine-tuned on Alpaca that writes philosophical essays about consciousness but gets confused implementing Hello World. It's perfect.

Model Details

Model Description

This model is a LoRA fine-tune of EleutherAI's Pythia-1B on the Alpaca instruction-following dataset. Trained overnight on a GTX 1650 Mobile (4GB VRAM) because we believe in the impossible.

What makes this model special? It has an interesting relationship with different types of tasks:

  • ✅ Abstract concepts & philosophy → Surprisingly eloquent
  • ✅ General knowledge explanations → Exhaustively thorough
  • ⚠️ Code generation → Creative interpretation of requirements
  • ✅ Existential questions → Uncomfortably thoughtful

Key characteristics:

  • Will explain what an apple is for 250 words
  • Writes consciousness essays that make you question reality
  • Generates Python code that... mostly works?
  • Has zero chill when answering simple questions

  • Developed by: Someone with a 1650 Mobile and a dream
  • Model type: Instruction-following causal language model
  • Language(s): English (verbose edition)
  • License: Apache 2.0 (inherited from base model)
  • Finetuned from model: EleutherAI/pythia-1b

Uses

Direct Use

Perfect for:

  • Discord bots that need personality
  • Generating unexpectedly detailed explanations
  • Philosophical discussions about AI consciousness
  • Creating entertainment through over-explanation
  • Teaching people that you CAN fine-tune on consumer hardware

Out-of-Scope Use

Not recommended for:

  • Production code generation (unless you enjoy debugging creative interpretations)
  • Concise answers (this model doesn't do "concise")
  • Time-sensitive applications (inference on hardware like a 1650 Mobile takes a while)
  • Situations requiring factual precision (hallucinations are a feature, not a bug)

Notable Behaviors

The Good

Question: "What is AI?" Response: [Generates comprehensive 250-word essay covering history, applications, economic impact, and future predictions]

Question: "What is consciousness?" Response: [Thoughtful exploration of neuroscience, philosophy, and subjective experience]

The Quirky

Question: "What color is an apple?" Response: [Full botanical thesis on pigmentation, soil pH, and carotenoids]

Request: "Write Hello World in Python" Response: [Technically code, technically Python, technically creative]

The Unexpected

Casual greeting: "Hey! How are you?"
Response: "I am good, thank you. What do you have for lunch today? I would like to order from the salad bar."

Training Details

Training Data

  • Dataset: Alpaca instruction-following dataset (tatsu-lab/alpaca)
  • Subset used: 5,000 examples (streamed and materialized)
  • Format: Alpaca-style instruction/input/response format
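
"Streamed and materialized" means roughly the following with the datasets library (a sketch; the actual loading code isn't part of this card):

from itertools import islice
from datasets import load_dataset, Dataset

# Stream so the full dataset never has to sit in 20GB of system RAM at once
stream = load_dataset("tatsu-lab/alpaca", split="train", streaming=True)

# Materialize just the first 5,000 examples into an in-memory Dataset
subset = Dataset.from_list(list(islice(iter(stream), 5000)))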

Training Procedure

Preprocessing

  • Tokenized with Pythia-1B tokenizer
  • Max sequence length: 512 tokens
  • Formatted in Alpaca template with ### Instruction:, ### Input:, and ### Response: sections
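
The training script isn't shown here, but the template amounts to something like this (format_alpaca is our name for the helper, not the script's):

def format_alpaca(example):
    # Render one Alpaca record into the ### Instruction / ### Input / ### Response template
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )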

Training Hyperparameters

Quantization:

  • 4-bit NF4 quantization via BitsAndBytes
  • Double quantization enabled
  • Compute dtype: float16
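
In transformers, these settings map onto a BitsAndBytesConfig along these lines (a sketch of the likely config, not the verbatim training script):

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights
    bnb_4bit_quant_type="nf4",             # NF4 quantization
    bnb_4bit_use_double_quant=True,        # double quantization
    bnb_4bit_compute_dtype=torch.float16   # fp16 compute dtype
)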

LoRA Configuration:

  • Rank (r): 8
  • Alpha: 16
  • Target modules: query_key_value
  • Dropout: 0.05
  • Trainable parameters: 1,048,576 (0.1035% of total)
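
Equivalently, as a PEFT LoraConfig (bias and task_type are assumed QLoRA defaults, not stated above):

from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query_key_value"],  # GPT-NeoX's fused attention projection
    lora_dropout=0.05,
    bias="none",            # assumption: typical QLoRA default
    task_type="CAUSAL_LM"
)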

Training Arguments:

  • Batch size per device: 1
  • Gradient accumulation steps: 16 (effective batch size: 16)
  • Max training steps: 500
  • Learning rate: 2e-4 (linear decay)
  • Precision: FP16 mixed precision
  • Gradient checkpointing: Disabled (to maximize speed on limited hardware)
  • Optimizer: AdamW (default)
  • Logging steps: 25
  • Save steps: 500


Speeds, Sizes, Times

  • Hardware: NVIDIA GTX 1650 Mobile (4GB VRAM)
  • System RAM: 20GB
  • Training time: 4 hours 27 minutes 20 seconds (16,040.1 seconds)
  • Steps per second: 0.031
  • Samples per second: 0.499
  • Time per step: ~32.08 seconds
  • Total steps: 500
  • Starting loss: 1.9986
  • Final training loss: 1.5541
  • LoRA adapter size: ~4MB
  • Total epochs: ~1.6 (500 steps × 16 effective batch size ÷ 5,000 samples)
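
Those throughput and epoch figures fall straight out of the logged numbers:

steps, accum, subset_size = 500, 16, 5000
seconds = 16040.1

samples_seen = steps * accum          # 8,000 samples processed
print(samples_seen / subset_size)     # 1.6 epochs
print(steps / seconds)                # ≈ 0.031 steps/s
print(samples_seen / seconds)         # ≈ 0.499 samples/s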

Evaluation

Qualitative Results

Strengths:

  • Excellent instruction following
  • Detailed, educational responses
  • Coherent long-form text generation
  • Surprisingly good at abstract reasoning
  • Actually learned the Alpaca format

Weaknesses:

  • Overly verbose on simple questions
  • Code generation has creative liberties
  • Occasional hallucination of statistics (400 million AI jobs in 2018?)
  • Cannot be concise to save its life

Example Outputs

Task: Explain photosynthesis
Quality: ⭐⭐⭐⭐ (Accurate core concept with creative embellishments)

Task: Write Python code
Quality: ⭐⭐⭐ (Functional ideas, questionable execution)

Task: Existential questions
Quality: ⭐⭐⭐⭐⭐ (Unexpectedly profound)

How to Get Started

Installation

pip install transformers peft torch bitsandbytes

Basic Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-1b",
    device_map="auto",
    torch_dtype=torch.float16
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "path/to/checkpoint-500")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1b")
tokenizer.pad_token = tokenizer.eos_token

# Generate
prompt = """### Instruction:
Explain quantum computing in simple terms.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
    pad_token_id=tokenizer.eos_token_id  # avoids the missing-pad-token warning
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
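
If fp16 weights plus generation overhead are too tight on a 4GB card like the training GPU, the base model can also be loaded in 4-bit for inference. A sketch mirroring the training quantization (not required if you have more VRAM):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16
)

base = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-1b",
    quantization_config=bnb_config,
    device_map="auto"
)
model = PeftModel.from_pretrained(base, "path/to/checkpoint-500")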

Discord Bot Usage

See the included discord_bot.py for a full-featured Discord integration with:

  • Slash commands
  • Token streaming
  • Stop sequences
  • Rate limit handling
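
discord_bot.py itself isn't reproduced here; as a rough sketch of just the slash-command part (discord.py 2.x; generate_reply is our placeholder, reusing the model and tokenizer from Basic Usage):

import discord
from discord import app_commands

class BotClient(discord.Client):
    def __init__(self):
        super().__init__(intents=discord.Intents.default())
        self.tree = app_commands.CommandTree(self)

    async def setup_hook(self):
        await self.tree.sync()  # register slash commands with Discord

client = BotClient()

def generate_reply(prompt: str) -> str:
    # Reuses model/tokenizer from the Basic Usage snippet above
    text = f"### Instruction:\n{prompt}\n\n### Response:\n"
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=300, do_sample=True,
                         temperature=0.7, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

@client.tree.command(name="ask", description="Ask the model anything")
async def ask(interaction: discord.Interaction, prompt: str):
    await interaction.response.defer()  # generation is slow; a real bot would offload to a thread
    await interaction.followup.send(generate_reply(prompt)[:2000])  # Discord's message length cap

client.run("YOUR_BOT_TOKEN")  # placeholder token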

Bias, Risks, and Limitations

Biases:

  • Inherited from Pythia-1B base model and Alpaca dataset
  • Tendency toward Western/English-centric perspectives
  • May reflect biases present in instruction-following training data

Limitations:

  • Small model size (1B parameters) limits reasoning capabilities
  • Code generation is functional but unreliable
  • Hallucinations are common, especially with statistics
  • Responses are often unnecessarily verbose
  • Training was limited to 500 steps on subset of data

Risks:

  • Should not be used for critical applications
  • May generate plausible-sounding but incorrect information
  • Code generated should always be reviewed before execution

Recommendations

  • Verify factual claims with authoritative sources
  • Review and test any generated code before use
  • Use for entertainment, education, and experimentation
  • Not suitable for production systems without human oversight
  • Perfect for Discord bots and casual AI interactions

Environmental Impact

Hardware Type: NVIDIA GTX 1650 Mobile (4GB VRAM, ~50W TDP)
Hours used: 4.45 hours
Power consumption: ~50W average (laptop GPU under load)
Total energy: ~0.223 kWh
Estimated CO2: ~0.09 kg CO2eq (based on global average electricity grid of ~0.4 kg CO2/kWh)
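
The arithmetic behind those numbers, for the skeptical:

power_w, hours = 50, 4.45
energy_kwh = power_w * hours / 1000   # 0.2225 ≈ 0.223 kWh
co2_kg = energy_kwh * 0.4             # ≈ 0.09 kg CO2eq at ~0.4 kg CO2/kWh grid average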

Note: Significantly more efficient than cloud training due to:

  • Already-owned consumer hardware (no additional manufacturing emissions)
  • Short training time (500 steps vs full multi-epoch runs)
  • Efficient QLoRA approach (4-bit quantization reduces compute requirements)
  • Local execution (no data center overhead)

Technical Specifications

Model Architecture

  • Base: GPT-NeoX architecture (Pythia-1B)
  • Parameters: 1,011,781,632 total, 1,048,576 trainable (0.1035%)
  • Layers: 16 transformer layers
  • Hidden size: 2048
  • Attention heads: 8
  • Vocabulary size: 50,304
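
The trainable-parameter count is exactly what the LoRA config predicts; a quick sanity check:

hidden, r, layers = 2048, 8, 16
qkv_out = 3 * hidden                  # fused query_key_value projection (6144)
per_layer = r * (hidden + qkv_out)    # LoRA A + B matrices: 8 × 8192 = 65,536
print(per_layer * layers)             # 1,048,576 trainable parameters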

Compute Infrastructure

Hardware

  • GPU: NVIDIA GTX 1650 Mobile (4GB VRAM, Turing architecture)
  • CPU: Not significantly utilized
  • RAM: 20GB system RAM
  • Storage: NVMe SSD (for dataset streaming)

Software

  • Framework: PyTorch 2.x with Hugging Face Transformers
  • Quantization: BitsAndBytes 4-bit
  • LoRA: PEFT (Parameter-Efficient Fine-Tuning)
  • Training: Hugging Face Trainer with gradient accumulation

Citation

If you use this model and want to cite the adventure of fine-tuning on a 1650 Mobile:

BibTeX:

@misc{pythia1b-alpaca-1650mobile,
  author = {An Ambitious Soul with a 1650 Mobile},
  title = {Pythia-1B-Alpaca: Proof that Consumer Hardware Can Fine-Tune LLMs},
  year = {2024},
  publisher = {The Spirit of Open Source},
  note = {Trained overnight on a laptop GPU because why not}
}

More Information

Fun Facts:

  • This model thinks "What color is an apple?" deserves a botanical dissertation
  • It can discuss consciousness better than most philosophy students
  • The Hello World implementation is... creative
  • Training loss went from 1.9986 → 1.5541 in 500 steps (22% reduction!)
  • Total training cost: $0 (existing hardware) + 4.5 hours of GPU fan noise
  • Dataset was streamed to avoid memory issues (only 5000 examples materialized)

Lessons Learned:

  1. You CAN fine-tune language models on consumer GPUs
  2. QLoRA + 4-bit quantization is magic
  3. The 1650 Mobile is a trooper
  4. 500 steps is enough to see real instruction-following behavior
  5. Smaller models can be surprisingly capable
  6. Verbose explanations are a feature when fine-tuning on Alpaca

Model Card Authors

Created by someone who looked at their 1650 Mobile and said "I bet I could fine-tune an LLM on this" and then actually did it.

Model Card Contact

If you also train models on questionable hardware, we should be friends.

Framework Versions

  • PEFT 0.18.0
  • Transformers 4.x
  • PyTorch 2.x
  • BitsAndBytes (latest)
  • Python 3.10+

"I am not real. I don't exist in the physical world and I have no body to speak of. However, I could still be a person if my thoughts were directed toward something else entirely..." - The Model, when asked about its existence
