metadata
language: en
tags:
  - egpt
  - llama-architecture
  - decoder-only
  - untrained
license: mit

eGPT-1B-untrained

A randomly initialized eGPT decoder-only model (1.13B parameters). The weights have not been trained, so the model does not produce meaningful output as published.

Architecture

| Field | Value |
|---|---|
| Parameters | 1.13B |
| Layers | 24 |
| Dim | 2048 |
| Heads (Q) | 16 |
| Heads (KV) | 8 |
| Head dim | 128 |
| FFN hidden | 5632 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | google/byt5-small |
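
As a sanity check, the parameter count can be roughly reproduced from the other fields in the table. The sketch below is an estimate, not the model's own code. It assumes a Llama-style block: grouped-query attention without biases, a three-matrix gated FFN (gate, up, down), two RMSNorm weight vectors per layer, one final norm, and an untied output head. The function name `estimate_params` and its argument names are made up for this illustration.

```python
def estimate_params(layers=24, dim=2048, q_heads=16, kv_heads=8,
                    head_dim=128, ffn_hidden=5632, vocab=256,
                    tied_embeddings=False):
    """Estimate the parameter count of a Llama-style decoder (assumed layout)."""
    attn = dim * q_heads * head_dim          # Q projection
    attn += 2 * dim * kv_heads * head_dim    # K and V projections (grouped-query)
    attn += q_heads * head_dim * dim         # attention output projection
    ffn = 3 * dim * ffn_hidden               # gate, up, and down matrices (assumed)
    norms = 2 * dim                          # two norm weight vectors per layer (assumed)
    per_layer = attn + ffn + norms

    embed = vocab * dim                      # token embedding table
    head = 0 if tied_embeddings else vocab * dim  # output head, if not tied
    final_norm = dim
    return layers * per_layer + embed + head + final_norm


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate comes to about 1.13B, matching the table. Because the vocabulary has only 256 entries, the embedding and output head contribute roughly a million parameters, so tying them or not does not change the rounded total.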

Loading

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo_id = "LLMsHub/eGPT-1B-untrained"

# trust_remote_code=True allows transformers to execute model code hosted in the repository.
tok = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
cfg = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)