---
language: en
tags:
- egpt
- llama-architecture
- decoder-only
- untrained
license: mit
---

# eGPT-3B-untrained

Randomly initialized eGPT decoder-only model (3.22B parameters). **Not trained.**

## Architecture

| Field | Value |
|---|---|
| Parameters | 3.22B |
| Layers | 32 |
| Dim | 3072 |
| Heads (Q) | 24 |
| Heads (KV) | 8 |
| Head dim | 128 |
| FFN hidden | 8192 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | `google/byt5-small` |

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("LLMsHub/eGPT-3B-untrained", trust_remote_code=True)
cfg = AutoConfig.from_pretrained("LLMsHub/eGPT-3B-untrained", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LLMsHub/eGPT-3B-untrained", trust_remote_code=True)
```
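
The 3.22B figure in the Architecture table can be cross-checked from the other values in that table. The sketch below is only an estimate under assumptions the card does not state: Llama-style blocks with bias-free linear layers, a three-matrix gated (SwiGLU) FFN, two RMSNorm weight vectors per layer plus one final norm, and untied input and output embeddings. The variable names are illustrative and are not taken from the model's config.

```python
# Rough parameter count from the values in the Architecture table.
# Assumed (not confirmed by the model card): no biases, SwiGLU FFN with
# gate/up/down matrices, RMSNorm weights, untied embeddings.

dim = 3072
n_layers = 32
n_q_heads = 24
n_kv_heads = 8
head_dim = 128
ffn_hidden = 8192
vocab_size = 256

# Attention with grouped-query heads: K and V use fewer heads than Q.
q_proj = dim * n_q_heads * head_dim
kv_proj = 2 * dim * n_kv_heads * head_dim
o_proj = n_q_heads * head_dim * dim
attn = q_proj + kv_proj + o_proj

# Gated FFN: gate, up, and down projections.
ffn = 3 * dim * ffn_hidden

# Two norm weight vectors per layer (pre-attention and pre-FFN).
norms = 2 * dim

per_layer = attn + ffn + norms
embeddings = 2 * vocab_size * dim  # input embedding + output head
final_norm = dim

total = n_layers * per_layer + embeddings + final_norm
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
# prints "3,222,998,016 parameters (~3.22B)"
```

With a 256-entry vocabulary the embeddings contribute well under 0.1% of the total, so tying them would not change the rounded figure. Almost all parameters sit in the 32 transformer layers.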