test-distillation / config.json
Commit f22c128 (verified) by algo2217: Upload final model (step 78) and all checkpoints at 2025-07-16T19:21:53.266656
{
  "architectures": [
    "HFHookedTransformer"
  ],
  "hidden_size": 128,
  "n_ctx": 2048,
  "num_attention_heads": 4,
  "num_hidden_layers": 6,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.2",
  "vocab_size": 50304
}
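
A minimal sketch of how the values in this config could be sanity-checked with the Python standard library alone. The helper `summarize_config` and the `BYTES_PER_ELEMENT` table are hypothetical names introduced for illustration; they are not part of this repository or of `transformers`. The per-head width is assumed to be `hidden_size / num_attention_heads`, which is the usual convention but is not stated in the config itself.

```python
import json

# The config from this file, embedded so the sketch is self-contained.
CONFIG_TEXT = """
{
  "architectures": ["HFHookedTransformer"],
  "hidden_size": 128,
  "n_ctx": 2048,
  "num_attention_heads": 4,
  "num_hidden_layers": 6,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.2",
  "vocab_size": 50304
}
"""

# Storage size per element for a few common dtypes (illustrative table).
BYTES_PER_ELEMENT = {"bfloat16": 2, "float16": 2, "float32": 4}


def summarize_config(cfg: dict) -> dict:
    """Derive a few quantities implied by the config (hypothetical helper)."""
    hidden = cfg["hidden_size"]
    heads = cfg["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")

    # Assumption: each head gets an equal slice of the hidden dimension.
    head_dim = hidden // heads

    # The token embedding matrix has one row per vocabulary entry.
    embed_params = cfg["vocab_size"] * hidden
    embed_bytes = embed_params * BYTES_PER_ELEMENT[cfg["torch_dtype"]]

    return {
        "head_dim": head_dim,
        "token_embedding_params": embed_params,
        "token_embedding_bytes": embed_bytes,
    }


summary = summarize_config(json.loads(CONFIG_TEXT))
print(summary)
```

Under these assumptions each of the 4 heads is 32 wide, and the token embedding alone accounts for 50304 × 128 = 6,438,912 parameters, or about 12.9 MB in bfloat16.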