d2v1shx/cap-26m-fast-dev

cap-26m-fast-dev

Tiny decoder-only language model trained locally on macOS as part of the cap project.

Summary

Model name: cap-26m-fast-dev
Base architecture: decoder-only Transformer in a LLaMA-style configuration
Training data: TinyStories
Training mode: fast development run for quick iteration

Checkpoint notes

Saved from the cap local training pipeline
Includes tokenizer files alongside the model checkpoint
Intended as an intermediate checkpoint, not a final polished release

Known metrics

Structured eval loss: 4.8252
Structured eval perplexity: 124.61
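
The two metrics above appear to be related by the usual identity, perplexity = exp(loss). The short check below assumes the reported loss is a mean per-token cross-entropy in nats; the card does not state this, but the numbers are consistent with it.

```python
import math

# Value copied from the "Known metrics" section.
eval_loss = 4.8252

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"{perplexity:.2f}")  # 124.61
```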

Usage

Load the model and tokenizer with the transformers library, using AutoModelForCausalLM.from_pretrained(...) and AutoTokenizer.from_pretrained(...).
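
A minimal sketch of the loading step described above. The repository id `d2v1shx/cap-26m-fast-dev` is an assumption pieced together from the owner and model names on this page, and `generate_story` is a hypothetical helper, not part of the cap project. It assumes `transformers` and PyTorch are installed and that the checkpoint loads through the standard Auto classes.

```python
# Assumed repository id (owner/model); replace with a local path if needed.
REPO_ID = "d2v1shx/cap-26m-fast-dev"


def generate_story(prompt: str, max_new_tokens: int = 64, repo_id: str = REPO_ID) -> str:
    """Hypothetical helper: load the checkpoint and greedily continue `prompt`."""
    # Imported inside the function so the heavy dependency is only
    # required when text is actually generated.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The card says tokenizer files ship alongside the model checkpoint,
    # so both are loaded from the same location.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for reproducible output
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(generate_story("Once upon a time"))
```

Since this is an intermediate checkpoint with an eval perplexity above 100, the generated text is likely to be rough; the sketch is meant for smoke-testing the files rather than judging quality.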