
πŸ’€ SkullLLM-125M

SkullLLM-125M is a lightweight, experimental multilingual language model fine-tuned from GPT-2. This project, part of the SkullLLM series, demonstrates that fine-tuning a language model is possible on highly constrained consumer hardware (3 GB of VRAM) using advanced optimization techniques.

πŸš€ Model Details

  • Developed by: Erik22TY
  • Model Name: Nebulos (SkullLLM-125M)
  • Base Model: GPT-2 (125M parameters)
  • Training OS: Linux Mint
  • Training Hardware: HP Pavilion Gaming Desktop 690-00xx
  • GPU: NVIDIA GeForce GTX 1050 (3GB VRAM - Pascal Architecture)
  • Training Type: LoRA (Low-Rank Adaptation)
  • Format: ChatML (<|im_start|>user, <|im_start|>assistant); see the template sketch below
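
For reference, a single ChatML turn under this template looks like the following sketch (the exact layout is an assumption based on the standard ChatML format):

```
<|im_start|>user
What is LoRA?<|im_end|>
<|im_start|>assistant
```

The model is expected to complete the assistant turn and close it with <|im_end|>.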

πŸ–₯️ Hardware Requirements

This model is optimized for low-end hardware.

  • VRAM for Inference: ~1.5 GB (4-bit) / ~2.2 GB (FP16); see the loading sketch after this list.
  • VRAM for Training: 2.8 GB+ (Tested on GTX 1050 3GB).
  • System RAM: 4 GB minimum for inference; 12 GB recommended for training.
  • Storage: ~150 MB for the adapter files.
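
As a minimal sketch, the base model can be loaded in 4-bit NF4 to reach the ~1.5 GB inference figure above. The BitsAndBytesConfig values mirror the training configuration described later, but treat this as an illustration rather than the author's exact setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# NF4 4-bit quantization with an FP16 compute dtype (Pascal GPUs lack bfloat16)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("Erik22TY/SkullLLM-125M")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)
model.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(model, "Erik22TY/SkullLLM-125M")
```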

🧠 Knowledge & Dataset

Nebulos was fine-tuned on a high-quality stream of multilingual data:

  • English (FineWeb-Edu): Knowledge cutoff March 2024.
  • Multilingual (FineWeb-2): Spanish, German, French, and Portuguese web data.
  • General (FineWiki): Wikipedia-based knowledge updated through August 2025.

πŸ§ͺ Training Configuration

  • Steps: 500
  • Batch Size: 1 (Gradient Accumulation: 16)
  • Optimization: 4-bit Quantization (NF4)
  • Compute Dtype: Forced FP16 (Pascal GPUs do not support bfloat16)
  • Learning Rate: 2e-4
  • Final Loss: 4.0898
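
A hedged sketch of how this configuration might be expressed with peft and trl follows. The LoRA rank, alpha, and target modules are assumptions (the card does not list them); the step count, batch sizes, and learning rate come from the list above:

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter settings; r, alpha, and targets are illustrative assumptions
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)

# Values below match the card: 500 steps, batch size 1 with 16-step accumulation,
# learning rate 2e-4, FP16 compute
training_args = SFTConfig(
    output_dir="skullllm-125m",
    max_steps=500,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    fp16=True,
)
```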

⚠️ Limitations & Behavior

As a 125M-parameter model trained for only 500 steps, SkullLLM-125M is a proof of concept rather than a production model.

  • Repetitions: The model may occasionally loop on short phrases (e.g., "metic"). Set repetition_penalty=1.5 to mitigate this.
  • Language Blending: Due to its size, it may mix Romance languages (Spanish/French/Portuguese) in complex responses.
  • Coherence: Best used for short-form explanations or creative experiments.

πŸ’¬ Usage (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

model_id = "gpt2"
adapter_id = "Erik22TY/SkullLLM-125M"

# Load the tokenizer from the adapter repo: it carries the added ChatML special tokens
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the GPT-2 base model in FP16
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Grow the embedding matrix to cover the added tokens, then attach the LoRA adapter
model.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(model, adapter_id)
```
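
Continuing from the snippet above, a minimal generation sketch (the prompt wording and sampling settings are illustrative assumptions; repetition_penalty=1.5 follows the recommendation in the limitations section):

```python
# Build a ChatML-style prompt (assumed template; adjust to match the training format)
prompt = "<|im_start|>user\nExplain what a neural network is.<|im_end|>\n<|im_start|>assistant\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.5,  # recommended above to curb phrase looping
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```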