eGPT-100M-gpt2-untrained

Randomly initialized eGPT decoder-only model (197.3M parameters). The weights have not been trained, so generated text is not meaningful.

Architecture

Field        Value
Parameters   197.3M
Layers       8
Dim          1024
Heads (Q)    8
Heads (KV)   4
Head dim     128
FFN hidden   2816
Max seq len  2048
Vocab size   50257
Tokenizer    gpt2
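
As a rough cross-check of the table, the listed sizes can be combined into a parameter estimate. The sketch below rests on assumptions that this card does not state: untied input and output embeddings, bias-free linear layers, a gated FFN with three projections, and two norm weight vectors per layer plus one final norm. The variable names are illustrative only.

```python
# Rough parameter count from the architecture table.
# Assumptions (NOT stated on this card): untied input/output embeddings,
# bias-free linear layers, a gated FFN with three projections,
# two norm weight vectors per layer, and one final norm.

dim, layers = 1024, 8
q_heads, kv_heads, head_dim = 8, 4, 128
ffn_hidden, vocab = 2816, 50257

# Attention: the Q and output projections span all query heads;
# K and V span the smaller number of KV heads.
attn = 2 * dim * (q_heads * head_dim) + 2 * dim * (kv_heads * head_dim)

# Gated FFN: gate, up, and down projections (assumed).
ffn = 3 * dim * ffn_hidden

# Two norm weight vectors per layer (assumed).
norms = 2 * dim

per_layer = attn + ffn + norms
embeddings = 2 * vocab * dim  # input embedding + separate output head (assumed)
total = layers * per_layer + embeddings + dim  # + final norm weight

print(f"per layer:  {per_layer:,}")
print(f"embeddings: {embeddings:,}")
print(f"total:      {total:,} (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate rounds to the 197.3M listed above. That agreement is suggestive only; it does not confirm the actual layer layout.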

Loading

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo = "LLMsHub/eGPT-100M-gpt2-untrained"

# trust_remote_code=True allows transformers to run model code shipped in the repository.
tok   = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
cfg   = AutoConfig.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
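
Because the weights are random, a useful sanity check after loading is to compare the model's initial loss against the uniform-prediction baseline. The sketch below computes only that baseline from the vocabulary size of 50257. The actual initial loss depends on the initialization scale, which this card does not state, so treat the figure as a rough reference point and not as an expected result.

```python
import math

vocab_size = 50257  # vocabulary size listed for this model

# Cross-entropy (in nats) of a uniform distribution over the vocabulary.
uniform_loss = math.log(vocab_size)

# Perplexity is exp(loss); for uniform predictions it equals the vocab size.
uniform_ppl = math.exp(uniform_loss)

print(f"uniform loss:       {uniform_loss:.2f} nats")
print(f"uniform perplexity: {uniform_ppl:.0f}")
```

An initial loss far above this baseline would hint at an overly large initialization scale. A loss far below it would suggest the weights are not actually untrained.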

Weights: Safetensors, F32 (0.2B params)