---
language: en
tags:
- egpt
- llama-architecture
- decoder-only
- untrained
license: mit
---
# eGPT-1B-untrained

A randomly initialized eGPT decoder-only model with 1.13B parameters. The weights have not been trained, so the model will not produce meaningful text as published.

## Architecture

| Field | Value |
|---|---|
| Parameters | 1.13B |
| Layers | 24 |
| Dim | 2048 |
| Heads (Q) | 16 |
| Heads (KV) | 8 |
| Head dim | 128 |
| FFN hidden | 5632 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | google/byt5-small |
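
As a sanity check, the 1.13B figure can be roughly reproduced from the table values. The sketch below is an estimate under assumptions that this card does not state: a LLaMA-style layer with bias-free Q/K/V/O projections, a three-matrix gated FFN, two norm vectors per layer, one final norm, and tied input/output embeddings. The function name `estimate_params` is hypothetical and is not part of the model repository.

```python
def estimate_params(layers=24, dim=2048, q_heads=16, kv_heads=8,
                    head_dim=128, ffn_hidden=5632, vocab=256,
                    tied_embeddings=True):
    """Rough parameter count; the LLaMA-style layer layout is an assumption."""
    q_out = q_heads * head_dim    # width of the query projection output
    kv_out = kv_heads * head_dim  # width of each key/value projection output
    # Q and O projections plus the narrower K and V projections (no biases).
    attn = dim * q_out + 2 * dim * kv_out + q_out * dim
    # Gate, up, and down projections of a gated FFN (assumed, not confirmed).
    ffn = 3 * dim * ffn_hidden
    norms = 2 * dim               # two norm weight vectors per layer
    per_layer = attn + ffn + norms
    embed = vocab * dim
    lm_head = 0 if tied_embeddings else vocab * dim
    # The trailing `dim` accounts for a single final norm.
    return layers * per_layer + embed + lm_head + dim


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

With a vocabulary of 256, the embedding table contributes well under 1M parameters, so nearly all of the count comes from the 24 transformer layers. Passing `tied_embeddings=False` changes the estimate only in the fourth significant digit.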

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo = "LLMsHub/eGPT-1B-untrained"
# trust_remote_code=True allows transformers to run custom code from the repo.
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
cfg = AutoConfig.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
```