# 💀 SkullLLM-125M
SkullLLM-125M is a lightweight, experimental multilingual language model fine-tuned from GPT-2. This project, part of the SkullLLM series, demonstrates that LLM fine-tuning is feasible on highly constrained consumer hardware (3 GB of VRAM) using aggressive optimization techniques (LoRA plus 4-bit quantization).
## Model Details
- Developed by: Erik22TY
- Model Name: Nebulos (SkullLLM-125M)
- Base Model: GPT-2 (125M parameters)
- Training OS: Linux Mint
- Training Hardware: HP Pavilion Gaming Desktop 690-00xx
- GPU: NVIDIA GeForce GTX 1050 (3GB VRAM - Pascal Architecture)
- Training Type: LoRA (Low-Rank Adaptation)
- Format: ChatML (`<|im_start|>user`, `<|im_start|>assistant`; see the layout example below)
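For reference, a ChatML conversation follows this layout (the closing `<|im_end|>` token is part of the standard ChatML format; the message text is illustrative):

```
<|im_start|>user
What is a neutron star?<|im_end|>
<|im_start|>assistant
A neutron star is the collapsed core of a massive star...<|im_end|>
```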
## 🖥️ Hardware Requirements
This model is optimized for low-end hardware.
- VRAM for Inference: ~1.5 GB (4-bit) / ~2.2 GB (FP16); see the 4-bit loading sketch after this list.
- VRAM for Training: 2.8 GB+ (Tested on GTX 1050 3GB).
- System RAM: 4 GB minimum for inference; 12 GB recommended for training.
- Storage: ~150 MB for the adapter files.
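The 4-bit inference figure assumes the base model is loaded through bitsandbytes with the same NF4/FP16 settings used during training. A minimal loading sketch (the quantization parameters mirror the training configuration below; everything else is standard transformers API):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit weights with FP16 compute: the combination that fits
# this model into roughly 1.5 GB of VRAM at inference time.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)
```

Resize the token embeddings and attach the adapter exactly as in the Usage section below.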
## 🧠 Knowledge & Dataset
Nebulos was trained on a high-quality multilingual data stream:
- English (FineWeb-Edu): Knowledge cutoff March 2024.
- Multilingual (FineWeb-2): Spanish, German, French, and Portuguese web data.
- General (FineWiki): Wikipedia-based knowledge updated through August 2025.
## 🧪 Training Configuration
- Steps: 500
- Batch Size: 1 (Gradient Accumulation: 16)
- Optimization: 4-bit Quantization (NF4)
- Compute Dtype: FP16 (forced, since the Pascal architecture lacks BF16 support)
- Learning Rate: 2e-4
- Final Loss: 4.0898
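For reference, here is a training-setup sketch consistent with the numbers above, using `peft` and `transformers`. The LoRA rank, alpha, dropout, and target modules are illustrative assumptions; the card does not publish them:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 base weights; FP16 compute because Pascal GPUs (GTX 1050)
# have no BF16 support.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter on GPT-2's fused attention projection. r/alpha/dropout
# are illustrative values, not the published configuration.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"], task_type="CAUSAL_LM",
))

# Hyperparameters matching the list above (effective batch size 1 x 16).
args = TrainingArguments(
    output_dir="skullllm-125m",
    max_steps=500,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    fp16=True,
)
```

Pair `args` with a `transformers.Trainer` (or `trl.SFTTrainer`) and a tokenized ChatML dataset to reproduce a run of this shape.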
## ⚠️ Limitations & Behavior
As a 125M parameter model trained for 500 steps, SkullLLM-125M is a Proof of Concept.
- Repetitions: it may occasionally loop short phrases (e.g., "metic"). Set `repetition_penalty=1.5` to mitigate this (see the usage example below).
- Language Blending: due to its size, it may mix Romance languages (Spanish/French/Portuguese) in complex responses.
- Coherence: best suited to short-form explanations or creative experiments.
## 💬 Usage (Python)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_id = "gpt2"
adapter_id = "Erik22TY/SkullLLM-125M"

# The tokenizer ships with the adapter (it adds the ChatML special tokens).
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the FP16 base model, then resize its embeddings to match the
# extended vocabulary before attaching the LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
model.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(model, adapter_id)
```
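A minimal generation sketch on top of the loading code above, using the ChatML template and the `repetition_penalty` recommended in the limitations section (the prompt text is illustrative):

```python
# Build a ChatML prompt; the model expects this turn structure.
prompt = (
    "<|im_start|>user\nExplain what a black hole is.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# repetition_penalty=1.5 mitigates the phrase-looping noted above.
outputs = model.generate(**inputs, max_new_tokens=100, repetition_penalty=1.5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```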