---
language: en
license: mit
tags:
- symbiogenesis
- multi-organelle
- monarch-mixer
- philosophy
- pytorch
---

# SymbioGPT-10M

Multi-organelle GPT language model (11.6M params) trained on classical philosophy texts.

## Architecture

**SymbioGPT** extends the [SymbioSLM](https://huggingface.co/LisaMegaWatts/SymbioSLM) architecture by adding CausalSelfAttention as a fourth organelle; all four organelle outputs are fused by a learned per-channel OrganelleGate with a learnable temperature.
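
The card does not ship the gate code, but the description above (per-channel weights over four organelle outputs, plus a learnable temperature) pins down the shape of the computation. A minimal sketch assuming a temperature-scaled softmax gate; the class and parameter names are illustrative, not the repo's actual API:

```python
import torch
import torch.nn as nn

class OrganelleGate(nn.Module):
    """Hypothetical fusion gate: per-channel softmax weights over K
    organelle outputs, sharpened or smoothed by a learnable temperature."""

    def __init__(self, d_model: int, n_organelles: int = 4):
        super().__init__()
        # One mixing logit per (organelle, channel) pair.
        self.logits = nn.Parameter(torch.zeros(n_organelles, d_model))
        # Learnable temperature; softplus keeps it positive.
        self.raw_temp = nn.Parameter(torch.zeros(()))

    def forward(self, outs: list[torch.Tensor]) -> torch.Tensor:
        stacked = torch.stack(outs, dim=0)                   # (K, B, T, D)
        temp = nn.functional.softplus(self.raw_temp) + 1e-4
        weights = torch.softmax(self.logits / temp, dim=0)   # (K, D)
        # Broadcast the per-channel weights over batch and time.
        return (weights[:, None, None, :] * stacked).sum(dim=0)
```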

| Organelle | Function | Complexity |
|-----------|----------|------------|
| CausalDepthwiseConv1d | Local n-gram detection | O(n) |
| MonarchMatrix | Sub-quadratic global mixing via butterfly matrices | O(n√n) |
| LongConv | Dense causal convolution with exponential decay | O(n) |
| CausalSelfAttention | Multi-head attention with RoPE | O(n²) |

Plus: RMSNorm, SwiGLU FFN, SkipGate residuals, weight-tied output projection.
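
Of the four mixers, MonarchMatrix is the least standard: a Monarch matrix factors a dense mix into two block-diagonal matrices around a fixed permutation. A minimal, non-causal sketch of the square case, which fits the 256-token context (16 × 16); the real organelle also needs causal masking, and these names are illustrative:

```python
import torch

def monarch_mix(x: torch.Tensor, blk1: torch.Tensor, blk2: torch.Tensor) -> torch.Tensor:
    """Square-case Monarch product over the sequence dimension.

    x:    (batch, n, d_model) with n = m * m (e.g. 256 = 16 * 16)
    blk1: (m, m, m) -- m blocks of shape (m, m), first block-diagonal factor
    blk2: (m, m, m) -- second block-diagonal factor
    Cost per channel is O(n * sqrt(n)) instead of O(n^2) for a dense mix.
    """
    b, n, d = x.shape
    m = blk1.shape[0]
    assert m * m == n, "sequence length must be a perfect square"
    x = x.view(b, m, m, d)
    x = torch.einsum("goi,bgid->bgod", blk1, x)  # first block-diagonal matmul
    x = x.transpose(1, 2)                        # the Monarch permutation
    x = torch.einsum("goi,bgid->bgod", blk2, x)  # second block-diagonal matmul
    return x.transpose(1, 2).reshape(b, n, d)
```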

## Model Details

| Parameter | Value |
|-----------|-------|
| d_model | 320 |
| n_layers | 8 |
| n_heads | 5 |
| head_dim | 64 |
| context_length | 256 |
| vocab_size | 2000 (BPE) |
| n_monarch_heads | 1 |
| Total params | 11.6M |
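
If you rebuild the model from the source repo, the table maps directly onto a config object. A hypothetical sketch (field names are illustrative; check the repo's actual config class):

```python
from dataclasses import dataclass

@dataclass
class SymbioConfig:
    """Hypothetical config mirroring the table above."""
    d_model: int = 320
    n_layers: int = 8
    n_heads: int = 5
    head_dim: int = 64        # n_heads * head_dim = 5 * 64 = 320 = d_model
    context_length: int = 256
    vocab_size: int = 2000    # BPE
    n_monarch_heads: int = 1
```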

## Files

- `symbio_best.pt` — Best checkpoint (PyTorch state_dict, torch.compile format)
- `symbio_final.pt` — Final checkpoint
- `vocab.json` — BPE vocabulary (2000 tokens)
- `merges.txt` — BPE merge rules
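
"torch.compile format" means the state_dict was saved from a compiled model, so every key carries an `_orig_mod.` prefix that must be stripped before loading into an uncompiled module. A sketch, assuming the repo id `LisaMegaWatts/SymbioGPT-10M` (inferred from the Space name, so verify it):

```python
import torch
from huggingface_hub import hf_hub_download

# Repo id inferred from the Space name -- verify before use.
ckpt_path = hf_hub_download("LisaMegaWatts/SymbioGPT-10M", "symbio_best.pt")
state_dict = torch.load(ckpt_path, map_location="cpu")

# Checkpoints saved from a torch.compile'd model prefix every key with
# "_orig_mod."; strip it to load into the plain (uncompiled) module.
state_dict = {k.removeprefix("_orig_mod."): v for k, v in state_dict.items()}

# model = SymbioGPT(config)          # model class lives in the source repo
# model.load_state_dict(state_dict)
```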

## Usage

Try it live: [SymbioGPT-10M Space](https://huggingface.co/spaces/LisaMegaWatts/SymbioGPT-10M-space)
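
For local inference, the `vocab.json`/`merges.txt` pair can be loaded with the `tokenizers` library, assuming they follow the standard BPE file format (the repo may instead ship its own tokenizer code):

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE

# Assumes standard vocab.json/merges.txt BPE files. Depending on how the
# vocab was built, you may also need a matching pre-tokenizer (e.g. ByteLevel).
tokenizer = Tokenizer(BPE.from_file("vocab.json", "merges.txt"))

ids = tokenizer.encode("The unexamined life is not worth living.").ids
print(ids, tokenizer.decode(ids))
```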

## Links

- **Source**: [DavinciDreams/symbiogenesis](https://github.com/DavinciDreams/symbiogenesis)
- **Training data**: Classical philosophy corpus (Aristotle, Plato, Seneca, Marcus Aurelius, etc.)