---
language: en
license: mit
tags:
- symbiogenesis
- multi-organelle
- monarch-mixer
- philosophy
- pytorch
---

# SymbioGPT-10M
Multi-organelle GPT language model (11.6M parameters) trained on classical philosophy texts.
## Architecture

**SymbioGPT** extends the [SymbioSLM](https://huggingface.co/LisaMegaWatts/SymbioSLM) architecture by adding CausalSelfAttention as a fourth organelle; all four organelles are fused via a learned per-channel OrganelleGate with a learnable temperature.
| Organelle | Function | Complexity |
|-----------|----------|------------|
| CausalDepthwiseConv1d | Local n-gram detection | O(n) |
| MonarchMatrix | Sub-quadratic global mixing via butterfly matrices | O(n√n) |
| LongConv | Dense causal convolution with exponential decay | O(n) |
| CausalSelfAttention | Multi-head attention with RoPE | O(n²) |

Plus: RMSNorm, SwiGLU FFN, SkipGate residuals, and a weight-tied output projection.
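The per-channel gated fusion described above might look like the following minimal sketch. This is an assumption, not code from the repository: the class name, parameter names, and the softmax formulation are illustrative. Each organelle's output is weighted channel-wise by a temperature-scaled softmax over learned logits.

```python
import torch
import torch.nn as nn

class OrganelleGate(nn.Module):
    """Hypothetical sketch: fuse K organelle outputs with a learned
    per-channel softmax gate and a learnable temperature."""

    def __init__(self, d_model: int, n_organelles: int = 4):
        super().__init__()
        # One mixing logit per (organelle, channel) pair.
        self.logits = nn.Parameter(torch.zeros(n_organelles, d_model))
        # Learnable softmax temperature, kept positive via exp().
        self.log_temp = nn.Parameter(torch.zeros(1))

    def forward(self, outputs: list[torch.Tensor]) -> torch.Tensor:
        # outputs: K tensors, each of shape (batch, seq, d_model)
        stacked = torch.stack(outputs, dim=0)               # (K, B, T, D)
        temp = self.log_temp.exp()
        weights = torch.softmax(self.logits / temp, dim=0)  # (K, D)
        # Broadcast weights over batch and sequence, sum over organelles.
        return (weights[:, None, None, :] * stacked).sum(dim=0)
```

At zero-initialized logits the gate reduces to a uniform average of the four organelle outputs, so training starts from an unbiased mixture.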
## Model Details

| Parameter | Value |
|-----------|-------|
| d_model | 320 |
| n_layers | 8 |
| n_heads | 5 |
| head_dim | 64 |
| context_length | 256 |
| vocab_size | 2000 (BPE) |
| n_monarch_heads | 1 |
| Total params | 11.6M |
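For reference, the hyperparameters in the table could be collected into a config object like the sketch below (the class and field names are illustrative, not the repository's actual config):

```python
from dataclasses import dataclass

@dataclass
class SymbioConfig:
    # Values taken from the Model Details table above;
    # field names are hypothetical.
    d_model: int = 320
    n_layers: int = 8
    n_heads: int = 5
    head_dim: int = 64
    context_length: int = 256
    vocab_size: int = 2000
    n_monarch_heads: int = 1
```

Note that `n_heads * head_dim` (5 × 64 = 320) matches `d_model`, so the attention organelle needs no extra projection width.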
## Files

- `symbio_best.pt` — Best checkpoint (PyTorch state_dict, torch.compile format)
- `symbio_final.pt` — Final checkpoint
- `vocab.json` — BPE vocabulary (2000 tokens)
- `merges.txt` — BPE merge rules
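Because the checkpoints were saved in torch.compile format, the state_dict keys carry the `_orig_mod.` prefix that `torch.compile` adds to wrapped modules. A minimal loading sketch (the helper name is illustrative) strips that prefix so the weights load into a plain, uncompiled model:

```python
import torch

def load_symbio_state(path: str) -> dict:
    """Load a state_dict saved from a torch.compile-wrapped model,
    removing the '_orig_mod.' key prefix torch.compile adds."""
    state = torch.load(path, map_location="cpu")
    return {k.removeprefix("_orig_mod."): v for k, v in state.items()}

# Usage (assumes a SymbioGPT module built with the hyperparameters above):
# model.load_state_dict(load_symbio_state("symbio_best.pt"))
```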
## Usage

Try it live: [SymbioGPT-10M Space](https://huggingface.co/spaces/LisaMegaWatts/SymbioGPT-10M-space)
## Links

- **Source**: [DavinciDreams/symbiogenesis](https://github.com/DavinciDreams/symbiogenesis)
- **Training data**: Classical philosophy corpus (Aristotle, Plato, Seneca, Marcus Aurelius, etc.)