
# cap-26m-fast-dev

Tiny decoder-only language model trained locally on macOS as part of the `cap` project.

## Summary

- Model name: `cap-26m-fast-dev`
- Base architecture: decoder-only Transformer in a LLaMA-style configuration
- Training data: TinyStories
- Training mode: fast development run for quick iteration

## Checkpoint notes

- Saved from the `cap` local training pipeline
- Includes tokenizer files alongside the model checkpoint
- Intended as an intermediate checkpoint, not a final polished release

## Known metrics

- Structured eval loss: `4.8252`
- Structured eval perplexity: `124.61`
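
The two figures above are consistent with the usual relation perplexity = exp(loss). The sketch below checks that arithmetic; it assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not state explicitly.

```python
import math

# Value reported under "Known metrics".
eval_loss = 4.8252

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is its exponential.
perplexity = math.exp(eval_loss)

print(f"{perplexity:.2f}")  # 124.61
```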

## Usage

Load with `transformers` using `AutoModelForCausalLM.from_pretrained(...)` and `AutoTokenizer.from_pretrained(...)`.
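
A minimal sketch of that loading path follows. The repository id `d2v1shx/cap-26m-fast-dev` is an assumption inferred from the hosting page, and the `generate_text` helper, the prompt, and the generation settings are illustrative choices rather than settings recommended by the `cap` project. It also assumes PyTorch is installed alongside `transformers`.

```python
# Assumed repository id (inferred from the hosting page); swap in a local
# checkpoint directory if the files were downloaded by hand.
REPO_ID = "d2v1shx/cap-26m-fast-dev"


def generate_text(prompt: str, max_new_tokens: int = 50) -> str:
    """Hypothetical helper: load the checkpoint and greedily continue a prompt."""
    # Imported here so that defining the helper does not require the
    # libraries to be loaded until it is actually called.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID)
    model.eval()

    # Tokenize to PyTorch tensors and pass only the fields generate() needs.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding, for repeatable output
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time")` downloads the checkpoint on first use and returns the prompt followed by the model's continuation. Since this is a fast development checkpoint, the output may be rough.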