---
language: en
tags:
- egpt
- llama-architecture
- decoder-only
- untrained
license: mit
---

# eGPT-1B-untrained

Randomly initialized eGPT decoder-only model (1.13B parameters). **Not trained.**

## Architecture

| Field | Value |
|---|---|
| Parameters | 1.13B |
| Layers | 24 |
| Dim | 2048 |
| Heads (Q) | 16 |
| Heads (KV) | 8 |
| Head dim | 128 |
| FFN hidden | 5632 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | `google/byt5-small` |

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo_id = "LLMsHub/eGPT-1B-untrained"

# trust_remote_code=True lets transformers run the custom code shipped in the repo.
tok = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
cfg = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)

# The weights are randomly initialized, so generations are noise until the model is trained.
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
```
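
As a sanity check, the 1.13B figure can be reproduced from the dimensions in the Architecture table. The sketch below assumes a Llama-style block, which the `llama-architecture` tag suggests but this card does not spell out: bias-free grouped-query attention, a SwiGLU FFN with three projections, two RMSNorm weight vectors per layer, and untied input/output embeddings. The function name `estimate_params` is hypothetical and not part of the repository.

```python
# Rough parameter count from the Architecture table.
# Assumptions (not confirmed by the model card): Llama-style layers with
# bias-free GQA attention, a three-matrix SwiGLU FFN, two RMSNorm weights
# per layer, one final RMSNorm, and untied embeddings by default.

def estimate_params(layers=24, dim=2048, q_heads=16, kv_heads=8,
                    head_dim=128, ffn_hidden=5632, vocab=256,
                    tie_embeddings=False):
    q_proj = dim * q_heads * head_dim          # query projection
    kv_proj = 2 * dim * kv_heads * head_dim    # key and value projections
    o_proj = q_heads * head_dim * dim          # attention output projection
    attn = q_proj + kv_proj + o_proj

    ffn = 3 * dim * ffn_hidden                 # gate, up, and down projections
    norms = 2 * dim                            # pre-attention and pre-FFN norms
    per_layer = attn + ffn + norms

    embed = vocab * dim                        # token embedding table
    lm_head = 0 if tie_embeddings else vocab * dim
    final_norm = dim

    return layers * per_layer + embed + lm_head + final_norm


total = estimate_params()
print(f"{total:,} parameters ({total / 1e9:.2f}B)")
# prints: 1,133,611,008 parameters (1.13B)
```

With a 256-entry vocabulary the embedding tables contribute only about 1M parameters under these assumptions, so almost all of the count comes from the 24 transformer layers.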