# cap-26m-fast-dev
Tiny decoder-only language model trained locally on macOS as part of the `cap` project.
## Summary
- Model name: `cap-26m-fast-dev`
- Base architecture: decoder-only Transformer in a LLaMA-style configuration
- Training data: TinyStories
- Training mode: fast development run for quick iteration
## Checkpoint notes
- Saved from the `cap` local training pipeline
- Includes tokenizer files alongside the model checkpoint
- Intended as an intermediate checkpoint, not a final polished release
## Known metrics
- Structured eval loss: `4.8252`
- Structured eval perplexity: `124.61`
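The two figures above agree with each other if the perplexity is taken as the exponential of the mean eval loss (cross-entropy in nats). That is the usual convention, but it is an assumption here, since this card does not say how the metric was computed. A quick check:

```python
import math

# Eval loss as reported in this card; assumed to be mean cross-entropy in nats.
eval_loss = 4.8252

# Under that assumption, perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"{perplexity:.2f}")  # 124.61
```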
## Usage
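Load the model and tokenizer with `transformers`, using `AutoModelForCausalLM.from_pretrained(...)` and `AutoTokenizer.from_pretrained(...)`. Both accept either a Hub repository ID or a local directory containing the checkpoint files.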
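A minimal sketch of loading the checkpoint and generating text might look like the following. The helper name `generate_sample`, its parameters, and the greedy decoding settings are illustrative choices, not part of the `cap` project. It assumes `torch` and `transformers` are installed.

```python
def generate_sample(model_path: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Load a causal LM checkpoint and return a greedy continuation of `prompt`.

    `model_path` is a placeholder: a local checkpoint directory or a Hub repo ID.
    """
    # Imported lazily so that defining this helper does not need the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)
    model.eval()  # inference mode: disables dropout

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs.get("attention_mask"),
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding, for repeatable output
        )
    # generate() returns the prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_sample("path/to/cap-26m-fast-dev", "Once upon a time")` would return the prompt followed by the model's continuation, where the path is a stand-in for wherever the checkpoint lives. Given the reported perplexity and the intermediate status of this checkpoint, output quality is likely to be rough.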