---
language:
- en
license: other
library_name: pytorch
tags:
- causal-lm
- from-scratch
- gpt
- safetensors
- small-language-model
- meet25m
---
# Meet25M Base

A small GPT-style causal language model trained from scratch.
## Model

- Architecture: GPT-style decoder-only Transformer
- Parameters: ~25M
- Context length: 1024 tokens
- Tokenizer: custom byte-level BPE
- Positional encoding: RoPE (rotary position embeddings)
- Normalization: RMSNorm
- MLP: SwiGLU (both RMSNorm and SwiGLU are sketched below)
- Embeddings: input and output embedding weights are tied
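As a reference for the components listed above, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU MLP. It is illustrative only; module names and shapes are assumptions, not the exact code used to train this model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root-mean-square norm: rescales by 1/RMS(x); no mean subtraction, no bias."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight


class SwiGLU(nn.Module):
    """Gated MLP: down(silu(gate(x)) * up(x)), the LLaMA-style feed-forward block."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))
```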
## Training Data Mix

Target pretraining mix:

- FineWeb-Edu
- FineWeb (general web)
- Wikipedia
- OpenWebMath
- Project Gutenberg
- Stack Overflow / Stack Exchange-style posts
- CodeSearchNet

Total target: ~250M training tokens.
## Files

- `model.safetensors` — safetensors checkpoint
- `config.json` — model config
- `tokenizer/` — tokenizer files
- `safetensors_info.json` — checkpoint metadata (see the inspection snippet below)
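To check tensor names, shapes, or embedded metadata without materializing any weights, the checkpoint can be opened lazily with `safetensors.safe_open` (standard safetensors API; the file name matches the listing above):

```python
from safetensors import safe_open

# Open the checkpoint lazily: tensors are only read when requested.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    print(f.metadata())  # header metadata stored in the file, if any
    for name in f.keys():  # state-dict tensor names
        print(name, tuple(f.get_slice(name).get_shape()))
```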
## Loading

This is not a standard Transformers `AutoModelForCausalLM` checkpoint. Use the custom GPT class from the training script and load `model.safetensors`.
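A minimal loading-and-generation sketch, assuming the training script is importable as `train` and exposes `GPT` and `GPTConfig` (placeholder names; substitute whatever the repo's training script actually defines). The tokenizer file name inside `tokenizer/` and the model's forward signature are likewise assumptions.

```python
import json

import torch
from safetensors.torch import load_file
from tokenizers import Tokenizer

from train import GPT, GPTConfig  # hypothetical names; use the repo's actual class

# Build the model from the shipped config and load the safetensors weights.
with open("config.json") as f:
    config = GPTConfig(**json.load(f))
model = GPT(config)
model.load_state_dict(load_file("model.safetensors", device="cpu"))
model.eval()

# Assumed tokenizer layout: a single tokenizers-library JSON file.
tok = Tokenizer.from_file("tokenizer/tokenizer.json")

# Greedy decoding; assumes model(ids) returns (batch, seq, vocab) logits.
ids = torch.tensor([tok.encode("Once upon a time").ids])
with torch.no_grad():
    for _ in range(32):
        logits = model(ids)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)

print(tok.decode(ids[0].tolist()))
```

Greedy argmax keeps the sketch short; temperature or top-k sampling can replace the `argmax` step for more varied output.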