cap-26m-fast-dev

Tiny decoder-only language model trained locally on macOS as part of the cap project.

Summary

  • Model name: cap-26m-fast-dev
  • Base architecture: decoder-only Transformer in a LLaMA-style configuration
  • Training data: TinyStories
  • Training mode: fast development run for quick iteration

Checkpoint notes

  • Saved from the cap local training pipeline
  • Includes tokenizer files alongside the model checkpoint
  • Intended as an intermediate checkpoint, not a final polished release

Known metrics

  • Structured eval loss: 4.8252
  • Structured eval perplexity: 124.61
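
The two metrics above are consistent with perplexity being the exponential of the loss. The sketch below checks this, assuming the reported loss is a mean per-token cross-entropy in nats (the card does not state this explicitly):

```python
import math

# Eval loss as reported in this card (assumed to be mean cross-entropy in nats).
eval_loss = 4.8252

# Perplexity is the exponential of the mean per-token cross-entropy.
perplexity = math.exp(eval_loss)

print(f"{perplexity:.2f}")  # 124.61
```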

Usage

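Load the checkpoint with the transformers library: AutoModelForCausalLM.from_pretrained(...) for the model weights and AutoTokenizer.from_pretrained(...) for the tokenizer, passing the same checkpoint directory or repository ID to both.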
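
A minimal loading and generation sketch follows. It assumes the checkpoint files have been downloaded to a local directory; the path `./cap-26m-fast-dev` and the helper name `generate_text` are hypothetical and chosen only for illustration. A Hub repository ID can be passed in place of the local path.

```python
# Hypothetical local directory holding the checkpoint and tokenizer files.
MODEL_PATH = "./cap-26m-fast-dev"


def generate_text(prompt, model_path=MODEL_PATH, max_new_tokens=50):
    """Load the model and tokenizer, then greedily generate a continuation."""
    # Imported inside the function so that defining it does not require
    # torch or transformers to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for reproducible output
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `print(generate_text("Once upon a time"))` would then print the prompt followed by the model's continuation. Because this is an intermediate checkpoint trained on TinyStories, short story-like prompts are the most reasonable inputs to try.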