cap-26m-fast-dev
Tiny decoder-only language model trained locally on macOS as part of the cap project.
Summary
- Model name: cap-26m-fast-dev
- Base architecture: decoder-only Transformer in a LLaMA-style configuration
- Training data: TinyStories
- Training mode: fast development run for quick iteration
Checkpoint notes
- Saved from the cap local training pipeline
- Includes tokenizer files alongside the model checkpoint
- Intended as an intermediate checkpoint, not a final polished release
Known metrics
- Structured eval loss: 4.8252
- Structured eval perplexity: 124.61
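
The two metrics above appear to be related in the usual way: perplexity is the exponential of the mean per-token cross-entropy loss. The following is a small sanity check under that assumption (loss measured in nats, which the card does not state explicitly); the helper name perplexity_from_loss is illustrative and not part of the cap project.

```python
import math


def perplexity_from_loss(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


# Eval loss reported in this model card.
eval_loss = 4.8252
ppl = perplexity_from_loss(eval_loss)

# Compare against the perplexity reported in this model card.
print(f"perplexity = {ppl:.2f}")
```

If the loss had been reported in bits instead, the base would be 2 rather than e and the numbers would not line up.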
Usage
Load with transformers using AutoModelForCausalLM.from_pretrained(...) and AutoTokenizer.from_pretrained(...).
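
A minimal sketch of that loading step, assuming the checkpoint and tokenizer files sit together in one local directory and that transformers and torch are installed. The function name generate_story, the directory argument, and the generation settings are illustrative choices, not something this checkpoint prescribes.

```python
def generate_story(model_dir: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Load the checkpoint from model_dir and greedily continue the prompt."""
    # Imported inside the function so the heavy dependencies are only
    # needed when the function is actually called.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # model_dir is a hypothetical path to the folder holding both the
    # tokenizer files and the model checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding keeps the output deterministic
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, generate_story("path/to/cap-26m-fast-dev", "Once upon a time") would return the prompt followed by up to 50 generated tokens, where the path is a placeholder for wherever the checkpoint was saved. Since this is a fast development checkpoint, the output is likely to be rough.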