---
language: en
tags:
- egpt
- llama-architecture
- decoder-only
- untrained
license: mit
---

# eGPT-7B-untrained

Randomly initialized eGPT decoder-only model (7.09B parameters). **Not trained.**

## Architecture

| Field | Value |
|---|---|
| Parameters | 7.09B |
| Layers | 40 |
| Dim | 4096 |
| Heads (Q) | 32 |
| Heads (KV) | 8 |
| Head dim | 128 |
| FFN hidden | 11008 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | `google/byt5-small` |

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("LLMsHub/eGPT-7B-untrained", trust_remote_code=True)
cfg = AutoConfig.from_pretrained("LLMsHub/eGPT-7B-untrained", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LLMsHub/eGPT-7B-untrained", trust_remote_code=True)
```
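
The 7.09B figure in the Architecture table can be roughly reproduced from the other fields. The sketch below is a back-of-the-envelope estimate, not the model's actual code: it assumes a LLaMA-style block (as the `llama-architecture` tag suggests) with no bias terms, a three-matrix SwiGLU feed-forward, two norm weight vectors per layer plus a final norm, and untied input and output embeddings. None of these details are stated in this card, and `estimate_params` is a hypothetical helper name.

```python
# Rough parameter estimate from the Architecture table.
# Assumptions (not confirmed by the card): no biases, SwiGLU FFN with
# gate/up/down projections, two norm vectors per layer, one final norm,
# and untied input/output embeddings.

def estimate_params(layers, dim, q_heads, kv_heads, head_dim, ffn_hidden, vocab):
    q_proj = dim * q_heads * head_dim           # query projection
    kv_proj = 2 * dim * kv_heads * head_dim     # key + value projections (grouped-query attention)
    o_proj = q_heads * head_dim * dim           # attention output projection
    ffn = 3 * dim * ffn_hidden                  # gate, up, and down projections
    norms = 2 * dim                             # pre-attention and pre-FFN norm weights
    per_layer = q_proj + kv_proj + o_proj + ffn + norms
    embeddings = 2 * vocab * dim                # token embedding + output head
    final_norm = dim
    return layers * per_layer + embeddings + final_norm

total = estimate_params(
    layers=40, dim=4096, q_heads=32, kv_heads=8,
    head_dim=128, ffn_hidden=11008, vocab=256,
)
print(f"{total / 1e9:.2f}B")  # prints "7.09B" under the assumptions above
```

With a vocabulary of only 256 entries, the embeddings contribute about 2M parameters under these assumptions, so nearly all of the count comes from the 40 transformer layers.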