SymbioGPT-10M

Multi-organelle GPT language model (11.6M params) trained on classical philosophy texts.

Architecture

SymbioGPT extends the SymbioSLM architecture by adding CausalSelfAttention as a fourth organelle; the four organelle outputs are fused by a per-channel OrganelleGate with a learnable temperature.

| Organelle | Function | Complexity |
|---|---|---|
| CausalDepthwiseConv1d | Local n-gram detection | O(n) |
| MonarchMatrix | Sub-quadratic global mixing via butterfly matrices | O(n√n) |
| LongConv | Dense causal convolution with exponential decay | O(n) |
| CausalSelfAttention | Multi-head attention with RoPE | O(n²) |
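The per-channel gated fusion of these four organelles can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the actual OrganelleGate internals are not published here, and the specific parameterization (a logit matrix softmaxed over organelles, scaled by a learnable temperature) is an illustration of the idea, not the model's confirmed implementation.

```python
import numpy as np

def organelle_gate(outputs, logits, temperature):
    """Fuse stacked organelle outputs with a per-channel softmax gate.

    outputs:     (n_organelles, seq_len, d_model) stacked organelle outputs
    logits:      (n_organelles, d_model) learned per-channel mixing logits
    temperature: learnable scalar; lower values sharpen the mixture
    """
    scaled = logits / temperature
    # numerically stable softmax over the organelle axis, per channel
    weights = np.exp(scaled - scaled.max(axis=0, keepdims=True))
    weights /= weights.sum(axis=0, keepdims=True)
    # broadcast (n_org, 1, d_model) over (n_org, seq_len, d_model), sum organelles
    return (weights[:, None, :] * outputs).sum(axis=0)

# toy example: 4 organelles, seq_len 3, d_model 8
rng = np.random.default_rng(0)
outs = rng.normal(size=(4, 3, 8))
fused = organelle_gate(outs, rng.normal(size=(4, 8)), temperature=1.0)
print(fused.shape)  # (3, 8)
```

Because the gate weights are a convex combination per channel, every fused value lies between the minimum and maximum of the four organelle outputs for that position and channel.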

The blocks also use RMSNorm, a SwiGLU FFN, SkipGate residuals, and a weight-tied output projection.
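RMSNorm and the SwiGLU FFN mentioned above can be sketched in a few lines of NumPy. The hidden width `d_hidden` below is an assumption for illustration (it is not listed in the model details); only `d_model = 320` comes from the table.

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: scale by root-mean-square over channels; no mean-centering, no bias
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU FFN: SiLU-gated linear unit, then project back to d_model
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down

# toy shapes: d_model = 320 as in the table; d_hidden is an assumed value
d_model, d_hidden = 320, 853
rng = np.random.default_rng(0)
x = rng.normal(size=(4, d_model))
y = swiglu_ffn(rms_norm(x, np.ones(d_model)),
               rng.normal(size=(d_model, d_hidden)),
               rng.normal(size=(d_model, d_hidden)),
               rng.normal(size=(d_hidden, d_model)))
print(y.shape)  # (4, 320)
```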

Model Details

| Parameter | Value |
|---|---|
| d_model | 320 |
| n_layers | 8 |
| n_heads | 5 |
| head_dim | 64 |
| context_length | 256 |
| vocab_size | 2000 (BPE) |
| n_monarch_heads | 1 |
| Total params | 11.6M |

Files

  • symbio_best.pt — Best checkpoint (PyTorch state_dict, torch.compile format)
  • symbio_final.pt — Final checkpoint
  • vocab.json — BPE vocabulary (2000 tokens)
  • merges.txt — BPE merge rules

Usage

Try it live: SymbioGPT-10M Space
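To load the checkpoint locally: symbio_best.pt is noted above as being in torch.compile format, and checkpoints saved from a compiled model prefix every parameter name with `_orig_mod.`, which must be stripped before calling `load_state_dict` on an uncompiled model. A minimal sketch follows; the key names in the toy dict are placeholders, not the model's actual parameter names.

```python
# In real use the state dict would come from:
#   state = torch.load("symbio_best.pt", map_location="cpu")
def strip_compile_prefix(state_dict, prefix="_orig_mod."):
    # torch.compile wraps the module, so saved keys carry an "_orig_mod." prefix
    return {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in state_dict.items()}

# toy state dict with placeholder keys and values
sd = {"_orig_mod.tok_emb.weight": 0, "_orig_mod.layers.0.norm.gain": 1}
print(sorted(strip_compile_prefix(sd)))  # ['layers.0.norm.gain', 'tok_emb.weight']
```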
