# 💀 SkullLLM-125M
SkullLLM-125M is a lightweight, experimental multilingual language model fine-tuned from GPT-2. This project, part of the SkullLLM series, demonstrates that LLM fine-tuning is feasible on highly constrained consumer hardware (3 GB of VRAM) using aggressive optimization techniques (LoRA plus 4-bit quantization).
## Model Details
- Developed by: Erik22TY
- Model Name: Nebulos (SkullLLM-125M)
- Base Model: GPT-2 (125M parameters)
- Training OS: Linux Mint
- Training Hardware: HP Pavilion Gaming Desktop 690-00xx
- GPU: NVIDIA GeForce GTX 1050 (3GB VRAM - Pascal Architecture)
- Training Type: LoRA (Low-Rank Adaptation)
- Format: ChatML (`<|im_start|>user`, `<|im_start|>assistant`; see the layout example below)
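For reference, a ChatML conversation follows this layout (the closing `<|im_end|>` token is part of the standard ChatML format; the message text is illustrative):

```
<|im_start|>user
What is a neutron star?<|im_end|>
<|im_start|>assistant
A neutron star is the collapsed core of a massive star...<|im_end|>
```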
## 🖥️ Hardware Requirements
This model is optimized for low-end hardware.
- VRAM for Inference: ~1.5 GB (4-bit) / ~2.2 GB (FP16); see the 4-bit loading sketch after this list.
- VRAM for Training: 2.8 GB+ (Tested on GTX 1050 3GB).
- System RAM: 4 GB minimum for inference; 12 GB recommended for training.
- Storage: ~150 MB for the adapter files.
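The 4-bit inference figure assumes the base model is loaded through bitsandbytes with the same NF4/FP16 settings used during training. A minimal loading sketch (the quantization parameters mirror the training configuration below; everything else is standard transformers API):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit weights with FP16 compute: the combination that fits
# this model into roughly 1.5 GB of VRAM at inference time.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)
```

Resize the token embeddings and attach the adapter exactly as in the Usage section below.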
## 🧠 Knowledge & Dataset
Nebulos was trained on a high-quality multilingual data stream:
- English (FineWeb-Edu): Knowledge cutoff March 2024.
- Multilingual (FineWeb-2): Spanish, German, French, and Portuguese web data.
- General (FineWiki): Wikipedia-based knowledge updated through August 2025.
## 🧪 Training Configuration
- Steps: 500
- Batch Size: 1 (Gradient Accumulation: 16)
- Optimization: 4-bit Quantization (NF4)
- Compute Dtype: FP16 (forced, since the Pascal architecture lacks BF16 support)
- Learning Rate: 2e-4
- Final Loss: 4.0898
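For reference, here is a training-setup sketch consistent with the numbers above, using `peft` and `transformers`. The LoRA rank, alpha, dropout, and target modules are illustrative assumptions; the card does not publish them:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 base weights; FP16 compute because Pascal GPUs (GTX 1050)
# have no BF16 support.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter on GPT-2's fused attention projection. r/alpha/dropout
# are illustrative values, not the published configuration.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"], task_type="CAUSAL_LM",
))

# Hyperparameters matching the list above (effective batch size 1 x 16).
args = TrainingArguments(
    output_dir="skullllm-125m",
    max_steps=500,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    fp16=True,
)
```

Pair `args` with a `transformers.Trainer` (or `trl.SFTTrainer`) and a tokenized ChatML dataset to reproduce a run of this shape.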
## ⚠️ Limitations & Behavior
As a 125M parameter model trained for 500 steps, SkullLLM-125M is a Proof of Concept.
- Repetitions: it may occasionally loop short phrases (e.g., "metic"). Set `repetition_penalty=1.5` to mitigate this (see the usage example below).
- Language Blending: due to its size, it may mix Romance languages (Spanish/French/Portuguese) in complex responses.
- Coherence: best suited to short-form explanations or creative experiments.
## 💬 Usage (Python)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_id = "gpt2"
adapter_id = "Erik22TY/SkullLLM-125M"

# The tokenizer ships with the adapter (it adds the ChatML special tokens).
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the FP16 base model, then resize its embeddings to match the
# extended vocabulary before attaching the LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
model.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(model, adapter_id)
```
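A minimal generation sketch on top of the loading code above, using the ChatML template and the `repetition_penalty` recommended in the limitations section (the prompt text is illustrative):

```python
# Build a ChatML prompt; the model expects this turn structure.
prompt = (
    "<|im_start|>user\nExplain what a black hole is.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# repetition_penalty=1.5 mitigates the phrase-looping noted above.
outputs = model.generate(**inputs, max_new_tokens=100, repetition_penalty=1.5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```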