llm-course-hw1 / config.json
estepr — Push model using huggingface_hub. (commit b11506b, verified)
{
  "architectures": [
    "TransformerForCausalLM"
  ],
  "dropout": 0.1,
  "hidden_dim": 384,
  "max_seq_len": 128,
  "model_type": "custom",
  "n_head": 6,
  "n_layer": 6,
  "vocab_size": 2049
}
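A minimal sketch of how this config could be consumed in code. The `TransformerConfig` dataclass below is an assumption for illustration — it is not part of this repository; only the field names and values come from the `config.json` above.

```python
import json
from dataclasses import dataclass, field

# Hypothetical container for the config.json fields shown above.
# The class itself is an assumption; the keys and values are from the file.
@dataclass
class TransformerConfig:
    architectures: list = field(default_factory=list)
    dropout: float = 0.1
    hidden_dim: int = 384
    max_seq_len: int = 128
    model_type: str = "custom"
    n_head: int = 6
    n_layer: int = 6
    vocab_size: int = 2049

config_json = """
{
  "architectures": ["TransformerForCausalLM"],
  "dropout": 0.1,
  "hidden_dim": 384,
  "max_seq_len": 128,
  "model_type": "custom",
  "n_head": 6,
  "n_layer": 6,
  "vocab_size": 2049
}
"""

cfg = TransformerConfig(**json.loads(config_json))

# With hidden_dim=384 split across n_head=6 heads,
# each attention head works in a 64-dimensional subspace.
head_dim = cfg.hidden_dim // cfg.n_head
print(cfg.n_layer, head_dim)  # → 6 64
```

Note the sanity check implied by the values: `hidden_dim` (384) must be divisible by `n_head` (6), which it is, giving a per-head dimension of 64.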