SymbioGPT-10M

Multi-organelle GPT language model (11.6M params) trained on classical philosophy texts.

Architecture

SymbioGPT extends the SymbioSLM architecture by adding CausalSelfAttention as a fourth organelle; the four organelle outputs are fused by a per-channel OrganelleGate with a learnable temperature.

| Organelle | Function | Complexity |
|---|---|---|
| CausalDepthwiseConv1d | Local n-gram detection | O(n) |
| MonarchMatrix | Sub-quadratic global mixing via butterfly matrices | O(n√n) |
| LongConv | Dense causal convolution with exponential decay | O(n) |
| CausalSelfAttention | Multi-head attention with RoPE | O(n²) |
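The per-channel gated fusion of these four organelles can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the actual OrganelleGate internals are not published here, and the specific parameterization (a logit matrix softmaxed over organelles, scaled by a learnable temperature) is an illustration of the idea, not the model's confirmed implementation.

```python
import numpy as np

def organelle_gate(outputs, logits, temperature):
    """Fuse stacked organelle outputs with a per-channel softmax gate.

    outputs:     (n_organelles, seq_len, d_model) stacked organelle outputs
    logits:      (n_organelles, d_model) learned per-channel mixing logits
    temperature: learnable scalar; lower values sharpen the mixture
    """
    scaled = logits / temperature
    # numerically stable softmax over the organelle axis, per channel
    weights = np.exp(scaled - scaled.max(axis=0, keepdims=True))
    weights /= weights.sum(axis=0, keepdims=True)
    # broadcast (n_org, 1, d_model) over (n_org, seq_len, d_model), sum organelles
    return (weights[:, None, :] * outputs).sum(axis=0)

# toy example: 4 organelles, seq_len 3, d_model 8
rng = np.random.default_rng(0)
outs = rng.normal(size=(4, 3, 8))
fused = organelle_gate(outs, rng.normal(size=(4, 8)), temperature=1.0)
print(fused.shape)  # (3, 8)
```

Because the gate weights are a convex combination per channel, every fused value lies between the minimum and maximum of the four organelle outputs for that position and channel.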

The blocks also use RMSNorm, a SwiGLU FFN, SkipGate residuals, and a weight-tied output projection.
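RMSNorm and the SwiGLU FFN mentioned above can be sketched in a few lines of NumPy. The hidden width `d_hidden` below is an assumption for illustration (it is not listed in the model details); only `d_model = 320` comes from the table.

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: scale by root-mean-square over channels; no mean-centering, no bias
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU FFN: SiLU-gated linear unit, then project back to d_model
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down

# toy shapes: d_model = 320 as in the table; d_hidden is an assumed value
d_model, d_hidden = 320, 853
rng = np.random.default_rng(0)
x = rng.normal(size=(4, d_model))
y = swiglu_ffn(rms_norm(x, np.ones(d_model)),
               rng.normal(size=(d_model, d_hidden)),
               rng.normal(size=(d_model, d_hidden)),
               rng.normal(size=(d_hidden, d_model)))
print(y.shape)  # (4, 320)
```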

Model Details

| Parameter | Value |
|---|---|
| d_model | 320 |
| n_layers | 8 |
| n_heads | 5 |
| head_dim | 64 |
| context_length | 256 |
| vocab_size | 2000 (BPE) |
| n_monarch_heads | 1 |
| Total params | 11.6M |

Files

  • symbio_best.pt — Best checkpoint (PyTorch state_dict, torch.compile format)
  • symbio_final.pt — Final checkpoint
  • vocab.json — BPE vocabulary (2000 tokens)
  • merges.txt — BPE merge rules

Usage

Try it live: SymbioGPT-10M Space
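To load the checkpoint locally: symbio_best.pt is noted above as being in torch.compile format, and checkpoints saved from a compiled model prefix every parameter name with `_orig_mod.`, which must be stripped before calling `load_state_dict` on an uncompiled model. A minimal sketch follows; the key names in the toy dict are placeholders, not the model's actual parameter names.

```python
# In real use the state dict would come from:
#   state = torch.load("symbio_best.pt", map_location="cpu")
def strip_compile_prefix(state_dict, prefix="_orig_mod."):
    # torch.compile wraps the module, so saved keys carry an "_orig_mod." prefix
    return {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in state_dict.items()}

# toy state dict with placeholder keys and values
sd = {"_orig_mod.tok_emb.weight": 0, "_orig_mod.layers.0.norm.gain": 1}
print(sorted(strip_compile_prefix(sd)))  # ['layers.0.norm.gain', 'tok_emb.weight']
```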
