---
language:
  - en
license: other
library_name: pytorch
tags:
  - causal-lm
  - from-scratch
  - gpt
  - safetensors
  - small-language-model
  - meet25m
---

# Meet25M Base

A small GPT-style causal language model trained from scratch.

## Model

- Architecture: GPT-style decoder-only Transformer
- Parameters: ~25M
- Context length: 1024 tokens
- Tokenizer: custom byte-level BPE
- Positional encoding: RoPE (rotary position embeddings)
- Normalization: RMSNorm
- MLP: SwiGLU
- Embeddings: tied input/output embeddings
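With tied embeddings and RoPE (which adds no learned parameters), the ~25M budget is split between the embedding table and the Transformer blocks. A back-of-the-envelope sketch in plain Python; every dimension below is an assumption for illustration, since the card states only the approximate total:

```python
# Hypothetical dimensions (the card states only "~25M parameters").
d_model  = 384    # hidden size (assumed)
n_layers = 12     # number of Transformer blocks (assumed)
vocab    = 8192   # byte-level BPE vocab size (assumed)
ffn      = 1024   # SwiGLU hidden size, ~8/3 * d_model (assumed)

embed  = vocab * d_model            # tied input/output embedding (counted once)
attn   = 4 * d_model * d_model      # Wq, Wk, Wv, Wo; RoPE adds no parameters
swiglu = 3 * d_model * ffn          # gate, up, and down projections
norms  = 2 * d_model                # two RMSNorm gain vectors per block
per_layer = attn + swiglu + norms

total = embed + n_layers * per_layer + d_model  # plus a final RMSNorm
print(f"{total:,}")  # prints 24,388,992
```

Other splits (larger vocab, fewer layers) hit the same budget; the point is that with tied embeddings most of the ~25M sits in the blocks rather than the embedding table.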

## Training Data Mix

Target pretraining mix:

- FineWeb-Edu
- FineWeb (general web)
- Wikipedia
- OpenWebMath
- Project Gutenberg
- Stack Overflow / Stack Exchange-style posts
- CodeSearchNet

Total target: ~250M training tokens.
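The card lists sources but not their proportions. A minimal sketch of turning mix weights into per-source token budgets against the ~250M target; the weights below are entirely hypothetical placeholders, not the actual mix:

```python
# Hypothetical mix weights (the card does not state proportions).
TOTAL_TOKENS = 250_000_000  # target from the card

weights = {
    "fineweb-edu":     0.35,
    "fineweb-general": 0.25,
    "wikipedia":       0.10,
    "openwebmath":     0.10,
    "gutenberg":       0.08,
    "stackexchange":   0.07,
    "codesearchnet":   0.05,
}
assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights must sum to 1

# Per-source token budgets out of the total target.
budget = {src: round(w * TOTAL_TOKENS) for src, w in weights.items()}
```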

## Files

- `model.safetensors` — safetensors checkpoint (weights)
- `config.json` — model configuration
- `tokenizer/` — tokenizer files
- `safetensors_info.json` — checkpoint metadata
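Per the safetensors format, the checkpoint starts with an 8-byte little-endian length prefix followed by a UTF-8 JSON header mapping tensor names to dtype, shape, and data offsets, so the tensor index can be inspected without loading any weights. A stdlib-only sketch, demonstrated on an in-memory example rather than the actual `model.safetensors`:

```python
import json
import struct

def read_safetensors_header(raw: bytes) -> dict:
    """Parse the JSON header of a .safetensors byte stream.

    Layout: 8-byte little-endian u64 header length, then that many
    bytes of UTF-8 JSON mapping tensor names to dtype/shape/offsets.
    """
    (hlen,) = struct.unpack("<Q", raw[:8])
    return json.loads(raw[8 : 8 + hlen].decode("utf-8"))

# Build a tiny in-memory example (a single hypothetical 4x2 fp32 tensor).
header = {"wte.weight": {"dtype": "F32", "shape": [4, 2],
                         "data_offsets": [0, 32]}}
hjson = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(hjson)) + hjson + b"\x00" * 32

parsed = read_safetensors_header(blob)
print(parsed["wte.weight"]["shape"])  # prints [4, 2]
```

Pointing the same reader at the real checkpoint (reading only the first few kilobytes) lists every tensor name and shape, which is a quick way to cross-check `safetensors_info.json`.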

## Loading

This is not a standard Hugging Face Transformers checkpoint, so `AutoModelForCausalLM.from_pretrained` will not load it.
Instead, instantiate the custom GPT class from the training script and load the weights from `model.safetensors` into it.
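A hedged loading sketch, assuming the training script exposes the custom class as `GPT` in a `model` module and that its constructor accepts the parsed `config.json` dict; the module name and constructor signature are assumptions, so adapt them to the actual training repo:

```python
import json

from safetensors.torch import load_file  # from the `safetensors` package
from model import GPT  # custom class from the training script (module name assumed)

# Reconstruct the model from the shipped config.
with open("config.json") as f:
    cfg = json.load(f)
model = GPT(cfg)  # constructor signature assumed; check the training script

# load_file returns a plain dict of torch tensors.
state_dict = load_file("model.safetensors")

# With tied embeddings, the output head may be absent from the checkpoint
# (it shares storage with the input embedding), hence strict=False.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
model.eval()
```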