# cap-26m-fast-dev

Tiny decoder-only language model trained locally on macOS as part of the `cap` project.
## Summary
- Model name: `cap-26m-fast-dev`
- Base architecture: decoder-only Transformer in a LLaMA-style configuration
- Training data: TinyStories
- Training mode: fast development run for quick iteration
## Checkpoint notes
- Saved from the `cap` local training pipeline
- Includes tokenizer files alongside the model checkpoint
- Intended as an intermediate checkpoint, not a final polished release
## Known metrics
- Structured eval loss: `4.8252`
- Structured eval perplexity: `124.61`
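
The two metrics above appear to be related in the usual way, with perplexity being the exponential of the loss. The sketch below checks that relationship, assuming the loss is a mean per-token cross-entropy measured in nats; the variable names are illustrative and not taken from the `cap` pipeline.

```python
import math

# Values copied from the "Known metrics" list.
eval_loss = 4.8252
reported_perplexity = 124.61

# Assumption: the loss is mean per-token cross-entropy in nats (natural log),
# in which case perplexity = exp(loss).
perplexity = math.exp(eval_loss)

print(f"computed perplexity: {perplexity:.2f}")
print(f"reported perplexity: {reported_perplexity:.2f}")
```

Under that assumption the computed value agrees with the reported one to two decimal places, which suggests the two numbers come from the same evaluation pass.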
## Usage
Load with `transformers` using `AutoModelForCausalLM.from_pretrained(...)` and `AutoTokenizer.from_pretrained(...)`.
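
A minimal sketch of the loading step described above, followed by a short greedy generation. The helper name `generate_story`, its parameters, and the checkpoint directory are hypothetical; it assumes `transformers` and PyTorch are installed and that the directory holds both the model and tokenizer files.

```python
def generate_story(checkpoint_dir: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Load the checkpoint from a local directory and continue `prompt`.

    `checkpoint_dir` is a hypothetical path to wherever the checkpoint
    and its tokenizer files were saved.
    """
    # Imported here so that defining this helper does not itself require
    # transformers or torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
    model = AutoModelForCausalLM.from_pretrained(checkpoint_dir)
    model.eval()  # inference mode: disables dropout

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Greedy decoding (no sampling) keeps the output deterministic.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_story("./cap-26m-fast-dev", "Once upon a time")` would continue a TinyStories-style prompt, assuming the checkpoint lives in that (hypothetical) directory. Given that this is an intermediate checkpoint, the output is likely to be rough.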