---
language: en
tags:
  - egpt
  - llama-architecture
  - decoder-only
  - untrained
license: mit
---

# eGPT-100M-bytes-untrained

A randomly initialized eGPT decoder-only model with 94.9M parameters. **Not trained:** the weights are random, so the model will not produce meaningful output until it is trained.

## Architecture

| Field | Value |
|---|---|
| Parameters | 94.9M |
| Layers | 8 |
| Dim | 1024 |
| Heads (Q) | 8 |
| Heads (KV) | 4 |
| Head dim | 128 |
| FFN hidden | 2816 |
| Max seq len | 2048 |
| Vocab size | 256 |
| Tokenizer | `google/byt5-small` |
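
As a sanity check on the table, the parameter count can be approximately reconstructed from the other fields. The sketch below is not taken from the eGPT source code. It assumes a Llama-style layout: bias-free attention projections with grouped-query attention, a gated FFN with three weight matrices, two weight-only norms per layer plus a final norm, and an output head that is not tied to the input embedding. The function name `count_parameters` is hypothetical. Under those assumptions the total rounds to the 94.9M stated above.

```python
# Architecture values copied from the table above.
LAYERS = 8
DIM = 1024
Q_HEADS = 8
KV_HEADS = 4
HEAD_DIM = 128
FFN_HIDDEN = 2816
VOCAB = 256


def count_parameters(tied_embeddings: bool = False) -> int:
    """Estimate the parameter count under ASSUMED Llama-style conventions.

    Assumptions (not confirmed by the model card): no bias terms,
    a gated FFN with three matrices, weight-only norms.
    """
    q_width = Q_HEADS * HEAD_DIM    # 1024: width of the query projection
    kv_width = KV_HEADS * HEAD_DIM  # 512: width of each key/value projection

    # Attention: Q and output projections plus the narrower K and V projections.
    attention = DIM * q_width + 2 * DIM * kv_width + q_width * DIM
    # Gated FFN: gate, up, and down projections.
    ffn = 3 * DIM * FFN_HIDDEN
    # Two norms per layer, one weight vector each.
    norms = 2 * DIM
    per_layer = attention + ffn + norms

    embedding = VOCAB * DIM
    output_head = 0 if tied_embeddings else VOCAB * DIM
    final_norm = DIM

    return LAYERS * per_layer + embedding + output_head + final_norm


if __name__ == "__main__":
    total = count_parameters()
    print(f"{total:,} parameters ({total / 1e6:.1f}M)")
    # prints: 94,913,536 parameters (94.9M)
```

With a 256-entry vocabulary the embedding and output head together contribute only about 0.5M parameters, so almost the entire budget sits in the eight transformer layers.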

## Loading

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

REPO_ID = "LLMsHub/eGPT-100M-bytes-untrained"

# trust_remote_code=True allows transformers to run the custom model code
# shipped in the repository; review that code before enabling it.
tok = AutoTokenizer.from_pretrained(REPO_ID, trust_remote_code=True)
cfg = AutoConfig.from_pretrained(REPO_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(REPO_ID, trust_remote_code=True)
```