---
language: en
tags:
- egpt
- llama-architecture
- decoder-only
- untrained
license: mit
---
# eGPT-7B-untrained

Randomly initialized eGPT decoder-only model (7.09B parameters). **Not trained.**

## Architecture

| Field | Value |
|---|---|
| Parameters | 7.09B |
| Layers | 40 |
| Dim | 4096 |
| Heads (Q) | 32 |
| Heads (KV) | 8 |
| Head dim | 128 |
| FFN hidden | 11008 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | `google/byt5-small` |
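
As a sanity check, the parameter count can be roughly reproduced from the other rows of the table. The sketch below is not taken from the model's code: it assumes a Llama-style block with bias-free projections, a gated FFN made of three matrices, two RMSNorm weight vectors per layer, and an output head that is not tied to the input embeddings. All variable names are illustrative.

```python
# Architecture values copied from the table above.
n_layers, dim = 40, 4096
n_q_heads, n_kv_heads, head_dim = 32, 8, 128
ffn_hidden, vocab_size = 11008, 256

# Assumptions (not stated in the card): bias-free projections, gated FFN with
# three matrices, two RMSNorm weight vectors per layer, untied output head.
attn = dim * n_q_heads * head_dim          # Q projection
attn += 2 * dim * n_kv_heads * head_dim    # K and V projections (grouped-query)
attn += n_q_heads * head_dim * dim         # output projection
ffn = 3 * dim * ffn_hidden                 # gate, up, and down matrices
norms = 2 * dim                            # pre-attention and pre-FFN norms
per_layer = attn + ffn + norms

embeddings = vocab_size * dim
lm_head = vocab_size * dim
final_norm = dim
total = n_layers * per_layer + embeddings + lm_head + final_norm

print(f"per layer: {per_layer:,}")
print(f"total: {total:,} (~{total / 1e9:.2f}B)")
```

Under these assumptions the total is 7,090,802,688, which rounds to the 7.09B listed. With a byte-level vocabulary of 256, the embedding and output matrices contribute only about 2M parameters, so nearly all of the count sits in the 40 transformer blocks.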

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo_id = "LLMsHub/eGPT-7B-untrained"
# trust_remote_code=True lets transformers run model code defined in the repository.
tok = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
cfg = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
```
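
Before loading, it can help to estimate how much memory the weights alone will need. The following is a back-of-the-envelope sketch based only on the 7.09B parameter count. The dtype in which the weights are actually materialized depends on how the checkpoint is stored and on the options passed when loading, so the dtypes listed are assumptions for illustration. Activations and any framework overhead are not included.

```python
n_params = 7.09e9  # parameter count from the Architecture table

# Bytes per parameter for a few common dtypes (illustrative selection).
bytes_per_param = {"float32": 4, "bfloat16": 2, "float16": 2}

# Weight memory in GiB for each dtype.
weights_gib = {name: n_params * size / 2**30 for name, size in bytes_per_param.items()}

for name, gib in weights_gib.items():
    print(f"{name}: ~{gib:.1f} GiB")
```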