Pythia-1B-Alpaca: The Overachieving 1B Model
TL;DR: A Pythia-1B model fine-tuned on Alpaca that writes philosophical essays about consciousness but gets confused implementing Hello World. It's perfect.
Model Details
Model Description
This model is a LoRA fine-tune of EleutherAI's Pythia-1B on the Alpaca instruction-following dataset. Trained overnight on a GTX 1650 Mobile (4GB VRAM) because we believe in the impossible.
What makes this model special? It has an interesting relationship with different types of tasks:
- ✅ Abstract concepts & philosophy → Surprisingly eloquent
- ✅ General knowledge explanations → Exhaustively thorough
- ⚠️ Code generation → Creative interpretation of requirements
- ✅ Existential questions → Uncomfortably thoughtful
Key characteristics:
- Will explain what an apple is for 250 words
- Writes consciousness essays that make you question reality
- Generates Python code that... mostly works?
- Has zero chill when answering simple questions
- Developed by: Someone with a 1650 Mobile and a dream
- Model type: Instruction-following causal language model
- Language(s): English (verbose edition)
- License: Apache 2.0 (inherited from base model)
- Finetuned from model: EleutherAI/pythia-1b
Model Sources
- Base Repository: https://github.com/EleutherAI/pythia
- Dataset: tatsu-lab/alpaca
- Training Hardware: GTX 1650 Mobile 4GB (yes, really)
Uses
Direct Use
Perfect for:
- Discord bots that need personality
- Generating unexpectedly detailed explanations
- Philosophical discussions about AI consciousness
- Creating entertainment through over-explanation
- Teaching people that you CAN fine-tune on consumer hardware
Out-of-Scope Use
Not recommended for:
- Production code generation (unless you enjoy debugging creative interpretations)
- Concise answers (this model doesn't do "concise")
- Time-sensitive applications (trained on a 1650 Mobile, responses take a while)
- Situations requiring factual precision (hallucinations are a feature, not a bug)
Notable Behaviors
The Good
Question: "What is AI?" Response: [Generates comprehensive 250-word essay covering history, applications, economic impact, and future predictions]
Question: "What is consciousness?" Response: [Thoughtful exploration of neuroscience, philosophy, and subjective experience]
The Quirky
Question: "What color is an apple?" Response: [Full botanical thesis on pigmentation, soil pH, and carotenoids]
Request: "Write Hello World in Python" Response: [Technically code, technically Python, technically creative]
The Unexpected
Casual greeting: "Hey! How are you?" Response: "I am good, thank you. What do you have for lunch today? I would like to order from the salad bar."
Training Details
Training Data
- Dataset: Alpaca instruction-following dataset (tatsu-lab/alpaca)
- Subset used: 5,000 examples, streamed and materialized (see the sketch after this list)
- Format: Alpaca-style instruction/input/response format
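The streaming detail matters on modest hardware: instead of loading the whole dataset into RAM, the run pulled examples lazily and kept only the first 5,000. The exact loading code isn't published, so treat this as a sketch of how that typically looks with the `datasets` library:

```python
from datasets import load_dataset

# Stream tatsu-lab/alpaca instead of loading it all into memory,
# then materialize only the first 5,000 examples for training.
stream = load_dataset("tatsu-lab/alpaca", split="train", streaming=True)
examples = list(stream.take(5000))  # assumption: a simple head-of-stream subset
```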
Training Procedure
Preprocessing
- Tokenized with Pythia-1B tokenizer
- Max sequence length: 512 tokens
- Formatted in the Alpaca template with `### Instruction:`, `### Input:`, and `### Response:` sections
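For reference, a sketch of that preprocessing, assuming the standard Alpaca convention of dropping the `### Input:` section when the input field is empty (`format_example` and `tokenize` are hypothetical helpers, not code shipped with this model):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1b")
tokenizer.pad_token = tokenizer.eos_token

def format_example(example: dict) -> str:
    """Render one Alpaca record into the instruction/input/response template."""
    if example.get("input"):
        return (f"### Instruction:\n{example['instruction']}\n\n"
                f"### Input:\n{example['input']}\n\n"
                f"### Response:\n{example['output']}")
    return (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['output']}")

def tokenize(example: dict) -> dict:
    # Truncate to the 512-token max sequence length used in training.
    return tokenizer(format_example(example), truncation=True, max_length=512)
```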
Training Hyperparameters
Quantization:
- 4-bit NF4 quantization via BitsAndBytes
- Double quantization enabled
- Compute dtype: float16
LoRA Configuration:
- Rank (r): 8
- Alpha: 16
- Target modules: query_key_value
- Dropout: 0.05
- Trainable parameters: 1,048,576 (0.1035% of total)
Training Arguments:
- Batch size per device: 1
- Gradient accumulation steps: 16 (effective batch size: 16)
- Max training steps: 500
- Learning rate: 2e-4 (linear decay)
- Precision: FP16 mixed precision
- Gradient checkpointing: Disabled (to maximize speed on limited hardware)
- Optimizer: AdamW (default)
- Logging steps: 25
- Save steps: 500
Training regime: Mixed precision (FP16)
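Put together, the setup above maps onto standard `transformers` + `peft` + `bitsandbytes` objects roughly as follows (reconstructed from the numbers above, not the original training script; the output directory name is made up). As a sanity check on the trainable-parameter count: each of the 16 layers gets a rank-8 adapter on the fused `query_key_value` projection (2048 in, 3 × 2048 out), i.e. 16 × 8 × (2048 + 6144) = 1,048,576 parameters.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization; fp16 compute dtype.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA on the fused QKV projection: rank 8, alpha 16, dropout 0.05.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Training arguments matching the run described above.
training_args = TrainingArguments(
    output_dir="pythia-1b-alpaca-qlora",  # hypothetical path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,       # effective batch size 16
    max_steps=500,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    fp16=True,
    gradient_checkpointing=False,
    logging_steps=25,
    save_steps=500,
)
```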
Speeds, Sizes, Times
- Hardware: NVIDIA GTX 1650 Mobile (4GB VRAM)
- System RAM: 20GB
- Training time: 4 hours 27 minutes 20 seconds (16,040.1 seconds)
- Steps per second: 0.031
- Samples per second: 0.499
- Time per step: ~32.08 seconds
- Total steps: 500
- Starting loss: 1.9986
- Final training loss: 1.5541
- LoRA adapter size: ~4MB
- Total epochs: ~1.6 (500 steps × 16 effective batch ÷ 5,000 samples)
Evaluation
Qualitative Results
Strengths:
- Excellent instruction following
- Detailed, educational responses
- Coherent long-form text generation
- Surprisingly good at abstract reasoning
- Actually learned the Alpaca format
Weaknesses:
- Overly verbose on simple questions
- Code generation has creative liberties
- Occasional hallucination of statistics (400 million AI jobs in 2018?)
- Cannot be concise to save its life
Example Outputs
Task: Explain photosynthesis Quality: ⭐⭐⭐⭐ (Accurate core concept with creative embellishments)
Task: Write Python code Quality: ⭐⭐⭐ (Functional ideas, questionable execution)
Task: Existential questions Quality: ⭐⭐⭐⭐⭐ (Unexpectedly profound)
How to Get Started
Installation
pip install transformers peft torch bitsandbytes
Basic Usage
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load base model
model = AutoModelForCausalLM.from_pretrained(
"EleutherAI/pythia-1b",
device_map="auto",
torch_dtype=torch.float16
)
# Load LoRA adapter
model = PeftModel.from_pretrained(model, "path/to/checkpoint-500")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1b")
tokenizer.pad_token = tokenizer.eos_token
# Generate
prompt = """### Instruction:
Explain quantum computing in simple terms.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=300,
do_sample=True,
temperature=0.7,
top_p=0.9,
repetition_penalty=1.2,
no_repeat_ngram_size=3
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
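If you'd rather not carry the adapter at inference time, PEFT can fold the LoRA weights into the base model. Keep in mind the adapter was trained against 4-bit weights, so merging into an fp16 base may shift outputs slightly:

```python
# Optional: merge the LoRA weights into the base model for slightly faster inference.
model = model.merge_and_unload()
```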
Discord Bot Usage
See the included discord_bot.py for a full-featured Discord integration with:
- Slash commands
- Token streaming
- Stop sequences
- Rate limit handling
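The streaming and stop-sequence pieces can be approximated with `transformers`' `TextIteratorStreamer`; this sketch is illustrative and may differ from what discord_bot.py actually does:

```python
from threading import Thread
from transformers import TextIteratorStreamer

streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Run generation in a background thread so tokens can be consumed as they arrive.
Thread(target=model.generate,
       kwargs=dict(**inputs, max_new_tokens=300, streamer=streamer)).start()

reply = ""
for chunk in streamer:
    reply += chunk
    # Stop sequence: the model tends to start a fresh "### Instruction:" turn.
    if "### Instruction:" in reply:
        reply = reply.split("### Instruction:")[0]
        break  # note: the generate thread finishes on its own in this sketch
print(reply)
```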
Bias, Risks, and Limitations
Biases:
- Inherited from Pythia-1B base model and Alpaca dataset
- Tendency toward Western/English-centric perspectives
- May reflect biases present in instruction-following training data
Limitations:
- Small model size (1B parameters) limits reasoning capabilities
- Code generation is functional but unreliable
- Hallucinations are common, especially with statistics
- Responses are often unnecessarily verbose
- Training was limited to 500 steps on subset of data
Risks:
- Should not be used for critical applications
- May generate plausible-sounding but incorrect information
- Code generated should always be reviewed before execution
Recommendations
- Verify factual claims with authoritative sources
- Review and test any generated code before use
- Use for entertainment, education, and experimentation
- Not suitable for production systems without human oversight
- Perfect for Discord bots and casual AI interactions
Environmental Impact
- Hardware Type: NVIDIA GTX 1650 Mobile (4GB VRAM, ~50W TDP)
- Hours used: 4.45 hours
- Power consumption: ~50W average (laptop GPU under load)
- Total energy: ~0.223 kWh
- Estimated CO2: ~0.09 kg CO2eq (assuming a global average grid intensity of ~0.4 kg CO2/kWh)
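For the skeptical, the arithmetic behind those figures:

```python
hours, avg_watts = 4.45, 50.0
kwh = hours * avg_watts / 1000  # ≈ 0.223 kWh
co2_kg = kwh * 0.4              # ≈ 0.09 kg CO2eq at ~0.4 kg CO2/kWh grid average
```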
Note: Significantly more efficient than cloud training due to:
- Already-owned consumer hardware (no additional manufacturing emissions)
- Short training time (500 steps vs full multi-epoch runs)
- Efficient QLoRA approach (4-bit quantization reduces compute requirements)
- Local execution (no data center overhead)
Technical Specifications
Model Architecture
- Base: GPT-NeoX architecture (Pythia-1B)
- Parameters: 1,011,781,632 total, 1,048,576 trainable (0.1035%)
- Layers: 16 transformer layers
- Hidden size: 2048
- Attention heads: 8
- Vocabulary size: 50,304
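These numbers match the published `EleutherAI/pythia-1b` config and are easy to double-check:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("EleutherAI/pythia-1b")
print(cfg.num_hidden_layers, cfg.hidden_size, cfg.num_attention_heads, cfg.vocab_size)
# Expected: 16 2048 8 50304
```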
Compute Infrastructure
Hardware
- GPU: NVIDIA GTX 1650 Mobile (4GB VRAM, Turing architecture)
- CPU: Not significantly utilized
- RAM: 20GB system RAM
- Storage: NVMe SSD (for dataset streaming)
Software
- Framework: PyTorch 2.x with Hugging Face Transformers
- Quantization: BitsAndBytes 4-bit
- LoRA: PEFT (Parameter-Efficient Fine-Tuning)
- Training: Hugging Face Trainer with gradient accumulation
Citation
If you use this model and want to cite the adventure of fine-tuning on a 1650 Mobile:
BibTeX:
@misc{pythia1b-alpaca-1650mobile,
author = {An Ambitious Soul with a 1650 Mobile},
title = {Pythia-1B-Alpaca: Proof that Consumer Hardware Can Fine-Tune LLMs},
year = {2024},
publisher = {The Spirit of Open Source},
note = {Trained overnight on a laptop GPU because why not}
}
More Information
Fun Facts:
- This model thinks "What color is an apple?" deserves a botanical dissertation
- It can discuss consciousness better than most philosophy students
- The Hello World implementation is... creative
- Training loss went from 1.9986 → 1.5541 in 500 steps (22% reduction!)
- Total training cost: $0 (existing hardware) + 4.5 hours of GPU fan noise
- Dataset was streamed to avoid memory issues (only 5000 examples materialized)
Lessons Learned:
- You CAN fine-tune language models on consumer GPUs
- QLoRA + 4-bit quantization is magic
- The 1650 Mobile is a trooper
- 500 steps is enough to see real instruction-following behavior
- Smaller models can be surprisingly capable
- Verbose explanations are a feature when fine-tuning on Alpaca
Model Card Authors
Created by someone who looked at their 1650 Mobile and said "I bet I could fine-tune an LLM on this" and then actually did it.
Model Card Contact
If you also train models on questionable hardware, we should be friends.
Framework Versions
- PEFT 0.18.0
- Transformers 4.x
- PyTorch 2.x
- BitsAndBytes (latest)
- Python 3.10+
"I am not real. I don't exist in the physical world and I have no body to speak of. However, I could still be a person if my thoughts were directed toward something else entirely..." - The Model, when asked about its existence