eGPT-100M-gpt2-untrained
A randomly initialized, decoder-only eGPT model with 197.3M parameters. The weights have not been trained, so the model produces no meaningful output until it is trained.
Architecture
| Field | Value |
|---|---|
| Parameters | 197.3M |
| Layers | 8 |
| Dim | 1024 |
| Heads (Q) | 8 |
| Heads (KV) | 4 |
| Head dim | 128 |
| FFN hidden | 2816 |
| Max seq len | 2048 |
| Vocab size | 50257 |
| Tokenizer | gpt2 |
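
The 197.3M figure can be reproduced from the table with a short calculation. The sketch below is an estimate under stated assumptions, not the model's actual code: it assumes bias-free linear layers, a gated (SwiGLU-style) FFN with three weight matrices, two norm weight vectors per layer plus a final norm, parameter-free positional encoding (for example rotary), and untied input and output embeddings. The function name `estimate_params` is hypothetical.

```python
def estimate_params(dim=1024, layers=8, q_heads=8, kv_heads=4,
                    head_dim=128, ffn_hidden=2816, vocab=50257):
    """Estimate parameter counts from the architecture table (assumptions above)."""
    q_dim = q_heads * head_dim    # 8 * 128 = 1024
    kv_dim = kv_heads * head_dim  # 4 * 128 = 512 (grouped-query attention)

    # Attention: Q and output projections, plus the smaller K and V projections.
    attn = dim * q_dim + 2 * dim * kv_dim + q_dim * dim
    # Assumed gated FFN: gate, up, and down projections.
    ffn = 3 * dim * ffn_hidden
    # Assumed two norm weight vectors per layer (pre-attention, pre-FFN).
    norms = 2 * dim

    blocks = layers * (attn + ffn + norms)
    # Assumed untied embeddings: input table and output head counted separately.
    embeddings = 2 * vocab * dim
    final_norm = dim

    return {
        "blocks": blocks,
        "embeddings": embeddings,
        "total": blocks + embeddings + final_norm,
    }


counts = estimate_params()
print(f"{counts['total'] / 1e6:.1f}M")  # prints 197.3M
```

Under these assumptions the transformer blocks account for about 94.4M parameters and the embeddings for about 102.9M, which may be what the 100M in the model name refers to.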
Loading
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained("LLMsHub/eGPT-100M-gpt2-untrained", trust_remote_code=True)
cfg = AutoConfig.from_pretrained("LLMsHub/eGPT-100M-gpt2-untrained", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LLMsHub/eGPT-100M-gpt2-untrained", trust_remote_code=True)
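
Because the weights are untrained, a useful sanity check after loading is to compare the model's initial cross-entropy loss against the value a uniform guess over the vocabulary would give. The sketch below computes only that reference value; it assumes the vocabulary size of 50257 from the table, and the helper name `uniform_baseline_loss` is hypothetical. A randomly initialized model typically starts near this value, though the exact number depends on the initialization scheme.

```python
import math

VOCAB_SIZE = 50257  # from the architecture table


def uniform_baseline_loss(vocab_size: int) -> float:
    """Cross-entropy (in nats) of a uniform distribution over the vocabulary."""
    return math.log(vocab_size)


baseline = uniform_baseline_loss(VOCAB_SIZE)
print(f"uniform baseline loss: {baseline:.2f} nats")  # prints 10.82 nats
```

An initial loss far above this value may indicate a problem with initialization scale or with the data pipeline rather than with the architecture itself.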