---
language: en
license: mit
tags:
- symbiogenesis
- multi-organelle
- monarch-mixer
- philosophy
- pytorch
---

# SymbioGPT-10M
Multi-organelle GPT language model (11.6M parameters) trained on classical philosophy texts.
## Architecture

**SymbioGPT** extends the [SymbioSLM](https://huggingface.co/LisaMegaWatts/SymbioSLM) architecture by adding CausalSelfAttention as a fourth organelle; all four organelles are fused via a learned per-channel OrganelleGate with a learnable temperature.
| Organelle | Function | Complexity |
|-----------|----------|------------|
| CausalDepthwiseConv1d | Local n-gram detection | O(n) |
| MonarchMatrix | Sub-quadratic global mixing via butterfly matrices | O(n√n) |
| LongConv | Dense causal convolution with exponential decay | O(n) |
| CausalSelfAttention | Multi-head attention with RoPE | O(n²) |

Plus: RMSNorm, SwiGLU FFN, SkipGate residuals, and a weight-tied output projection.
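The per-channel gated fusion described above might look like the following minimal sketch. This is an assumption, not code from the repository: the class name, parameter names, and the softmax formulation are illustrative. Each organelle's output is weighted channel-wise by a temperature-scaled softmax over learned logits.

```python
import torch
import torch.nn as nn

class OrganelleGate(nn.Module):
    """Hypothetical sketch: fuse K organelle outputs with a learned
    per-channel softmax gate and a learnable temperature."""

    def __init__(self, d_model: int, n_organelles: int = 4):
        super().__init__()
        # One mixing logit per (organelle, channel) pair.
        self.logits = nn.Parameter(torch.zeros(n_organelles, d_model))
        # Learnable softmax temperature, kept positive via exp().
        self.log_temp = nn.Parameter(torch.zeros(1))

    def forward(self, outputs: list[torch.Tensor]) -> torch.Tensor:
        # outputs: K tensors, each of shape (batch, seq, d_model)
        stacked = torch.stack(outputs, dim=0)               # (K, B, T, D)
        temp = self.log_temp.exp()
        weights = torch.softmax(self.logits / temp, dim=0)  # (K, D)
        # Broadcast weights over batch and sequence, sum over organelles.
        return (weights[:, None, None, :] * stacked).sum(dim=0)
```

At zero-initialized logits the gate reduces to a uniform average of the four organelle outputs, so training starts from an unbiased mixture.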
## Model Details

| Parameter | Value |
|-----------|-------|
| d_model | 320 |
| n_layers | 8 |
| n_heads | 5 |
| head_dim | 64 |
| context_length | 256 |
| vocab_size | 2000 (BPE) |
| n_monarch_heads | 1 |
| Total params | 11.6M |
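For reference, the hyperparameters in the table could be collected into a config object like the sketch below (the class and field names are illustrative, not the repository's actual config):

```python
from dataclasses import dataclass

@dataclass
class SymbioConfig:
    # Values taken from the Model Details table above;
    # field names are hypothetical.
    d_model: int = 320
    n_layers: int = 8
    n_heads: int = 5
    head_dim: int = 64
    context_length: int = 256
    vocab_size: int = 2000
    n_monarch_heads: int = 1
```

Note that `n_heads * head_dim` (5 × 64 = 320) matches `d_model`, so the attention organelle needs no extra projection width.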
## Files

- `symbio_best.pt` — Best checkpoint (PyTorch state_dict, torch.compile format)
- `symbio_final.pt` — Final checkpoint
- `vocab.json` — BPE vocabulary (2000 tokens)
- `merges.txt` — BPE merge rules
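Because the checkpoints were saved in torch.compile format, the state_dict keys carry the `_orig_mod.` prefix that `torch.compile` adds to wrapped modules. A minimal loading sketch (the helper name is illustrative) strips that prefix so the weights load into a plain, uncompiled model:

```python
import torch

def load_symbio_state(path: str) -> dict:
    """Load a state_dict saved from a torch.compile-wrapped model,
    removing the '_orig_mod.' key prefix torch.compile adds."""
    state = torch.load(path, map_location="cpu")
    return {k.removeprefix("_orig_mod."): v for k, v in state.items()}

# Usage (assumes a SymbioGPT module built with the hyperparameters above):
# model.load_state_dict(load_symbio_state("symbio_best.pt"))
```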
## Usage

Try it live: [SymbioGPT-10M Space](https://huggingface.co/spaces/LisaMegaWatts/SymbioGPT-10M-space)
## Links

- **Source**: [DavinciDreams/symbiogenesis](https://github.com/DavinciDreams/symbiogenesis)
- **Training data**: Classical philosophy corpus (Aristotle, Plato, Seneca, Marcus Aurelius, etc.)