---
language: en
tags:
- egpt
- llama-architecture
- decoder-only
- untrained
license: mit
---

# eGPT-100M-bytes-untrained

Randomly initialized eGPT decoder-only model (94.9M parameters). **Not trained.**

## Architecture

| Field | Value |
|---|---|
| Parameters | 94.9M |
| Layers | 8 |
| Dim | 1024 |
| Heads (Q) | 8 |
| Heads (KV) | 4 |
| Head dim | 128 |
| FFN hidden | 2816 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | `google/byt5-small` |

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("LLMsHub/eGPT-100M-bytes-untrained", trust_remote_code=True)
cfg = AutoConfig.from_pretrained("LLMsHub/eGPT-100M-bytes-untrained", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LLMsHub/eGPT-100M-bytes-untrained", trust_remote_code=True)
```
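
As a sanity check on the architecture table, the parameter count can be reproduced by hand. The sketch below is not taken from the model's code: it assumes a typical Llama-style layout (bias-free linear layers, a three-matrix gated FFN, two RMSNorm weight vectors per layer plus a final norm, and an output head that is not tied to the input embedding). The function name `count_params` and its `tied_embeddings` flag are hypothetical, introduced only for illustration. Under those assumptions the total matches the 94.9M figure listed in the table.

```python
def count_params(layers=8, dim=1024, q_heads=8, kv_heads=4, head_dim=128,
                 ffn_hidden=2816, vocab=256, tied_embeddings=False):
    """Estimate the parameter count of a Llama-style decoder.

    Assumptions (not confirmed by the model card): no biases, gated FFN with
    three projection matrices, RMSNorm with one weight vector each.
    """
    q_dim = q_heads * head_dim    # width of the query projection
    kv_dim = kv_heads * head_dim  # width of each key/value projection (grouped-query attention)

    # Attention: Q and output projections use q_dim, K and V use kv_dim.
    attn = dim * q_dim + 2 * dim * kv_dim + q_dim * dim
    # Gated FFN: gate, up, and down projections.
    ffn = 3 * dim * ffn_hidden
    # Two norm weight vectors per layer (pre-attention and pre-FFN).
    norms = 2 * dim
    per_layer = attn + ffn + norms

    embed = vocab * dim
    head = 0 if tied_embeddings else vocab * dim
    final_norm = dim
    return layers * per_layer + embed + head + final_norm


total = count_params()
print(f"{total:,} parameters ({total / 1e6:.1f}M)")
# prints "94,913,536 parameters (94.9M)"
```

With `tied_embeddings=True` the same formula gives 94,651,392, which would round to 94.7M, so the untied assumption is the one consistent with the stated size. Because the vocabulary is only 256 entries, the embedding and output head together account for well under 1% of the total; almost all parameters sit in the 8 transformer layers.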