---
language:
- en
license: other
library_name: pytorch
tags:
- causal-lm
- from-scratch
- gpt
- safetensors
- small-language-model
- meet25m
---

# Meet25M Base

A small GPT-style causal language model trained from scratch.

## Model

- Architecture: GPT-style decoder-only Transformer
- Size: ~25M parameters
- Context length: 1024 tokens
- Tokenizer: custom byte-level BPE
- Positional encoding: RoPE
- Normalization: RMSNorm
- MLP: SwiGLU
- Embeddings: tied input/output embeddings

## Training Data Mix

Target pretraining mix:

- FineWeb-Edu
- FineWeb (general)
- Wikipedia
- OpenWebMath
- Project Gutenberg
- Stack Overflow / Stack Exchange-style posts
- CodeSearchNet

Total target: ~250M training tokens.

## Files

- `model.safetensors`: safetensors checkpoint
- `config.json`: model config
- `tokenizer/`: tokenizer files
- `safetensors_info.json`: checkpoint metadata

## Loading

This is not a standard Transformers `AutoModelForCausalLM` checkpoint, so `from_pretrained` will not work. Instead, instantiate the custom GPT class from the training script and load the weights from `model.safetensors`.
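A minimal loading sketch follows. The module name `train`, the `GPT` class, and its constructor signature are assumptions; substitute whatever the actual training script defines.

```python
import json

from safetensors.torch import load_file

from train import GPT  # assumption: the custom GPT class from the training script

# Rebuild the model from the saved config (assumes the constructor
# accepts the keys stored in config.json as keyword arguments).
with open("config.json") as f:
    config = json.load(f)
model = GPT(**config)

# Load the safetensors checkpoint into the module. With tied input/output
# embeddings, the checkpoint may store a single shared weight tensor;
# strict=False skips key mismatches if names differ from the state_dict.
state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict, strict=False)
model.eval()
```

For tokenization, use the files under `tokenizer/` with whatever byte-level BPE implementation the training script used; the exact file layout there is not standardized.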