# SymbioGPT Grammar Expert LoRA
A grammar-specialist LoRA adapter for SymbioGPT-10M, trained on CoLA (Corpus of Linguistic Acceptability) via symbiogenesis evolution: an evolutionary architecture search that discovers optimal LoRA configurations through population-based fusion, mutation, and selection.
## Key Results
| Metric | Value |
|---|---|
| CoLA train accuracy | 99.65% |
| CoLA test accuracy | 53.2% (majority baseline: 70%) |
| Base model perplexity | 2128.5 |
| With LoRA perplexity | 2135.8 (+0.3%) |
| Gelation (convergence) | Generation 8 of 19 |
| LoRA params | 385,666 (3.5% of base model) |
**Status:** Grammar-specific generation eval (errors/100w) is pending. The perplexity result shows the LoRA doesn't degrade generation, but whether the discriminative grammar knowledge transfers to improved generation quality is still being tested.
## Architecture

SymbioGPT-10M is a multi-organelle decoder-only language model, not a standard transformer. Each block contains:

- CausalDepthwiseConv1d – local n-gram pattern detection
- MonarchMatrix – sub-quadratic global sequence mixing via butterfly factorization
- LongConv – dense causal convolution for medium-range dependencies
- CausalSelfAttention – standard multi-head attention with RoPE
- OrganelleGate – learned per-channel blend across all organelles
SymbioGPT-10M: `d_model=320, n_layers=8, n_heads=5, vocab_size=2000`

Total params: 11,053,400
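As a rough illustration of how an OrganelleGate might blend organelle outputs per channel, here is a minimal PyTorch sketch. The class name, the softmax normalization, and the gate shape are assumptions for illustration, not the actual SymbioGPT implementation:

```python
import torch
import torch.nn as nn

class OrganelleGateSketch(nn.Module):
    """Illustrative per-channel gate: each organelle output gets a learned
    per-channel weight, normalized with a softmax across organelles."""
    def __init__(self, d_model: int, n_organelles: int = 4):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_organelles, d_model))

    def forward(self, organelle_outputs):
        stacked = torch.stack(organelle_outputs, dim=0)   # (O, B, T, D)
        weights = torch.softmax(self.logits, dim=0)       # (O, D), sums to 1 per channel
        return (weights[:, None, None, :] * stacked).sum(dim=0)

gate = OrganelleGateSketch(d_model=320)
outs = [torch.randn(2, 16, 320) for _ in range(4)]  # conv, monarch, longconv, attention
blended = gate(outs)  # (2, 16, 320); a uniform average at init (zero logits)
```

With zero-initialized logits the gate starts as a plain average of the four organelles and learns to specialize per channel during training.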
## LoRA Configuration

This adapter uses manual LoRA injection (not PEFT), since SymbioGPT is a custom PyTorch model. Low-rank A×B matrices are injected into all 7 linear layer types across all 8 blocks:
| Target | Layer Type | Per Block |
|---|---|---|
| wq, wk, wv, wo | Attention projections | 4 × (320→320) |
| w1, v | SwiGLU gate + value | 2 × (320→512) |
| w2 | SwiGLU output | 1 × (512→320) |
Best evolved config: `rank=8`, `alpha=8.0`, all 7 targets, for 56 LoRA injections in total (7 targets × 8 blocks).
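A minimal sketch of what manual injection into named Linear layers can look like. The `LoRALinear` wrapper and `inject_lora` helper below are illustrative, not the repo's actual code:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen nn.Linear plus a trainable low-rank A/B update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pre-trained weight
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero-init: no-op at start
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

def inject_lora(model: nn.Module, target_names, rank: int = 8, alpha: float = 8.0):
    """Replace every child Linear whose attribute name is in target_names."""
    for module in list(model.modules()):
        for name, child in list(module.named_children()):
            if name in target_names and isinstance(child, nn.Linear):
                setattr(module, name, LoRALinear(child, rank, alpha))

# Toy check on a bare block with a single 'wq' projection
block = nn.Module()
block.wq = nn.Linear(320, 320, bias=False)
inject_lora(block, {"wq"})
x = torch.randn(2, 320)
y = block.wq(x)  # equals the base output until B is trained away from zero
```

Zero-initializing B makes the adapter an exact no-op at the start of training, so fine-tuning begins from the base model's behavior.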
## Evolution Details
The LoRA configuration was discovered through symbiogenesis evolution:
- Population: 8 random LoRAUnit configs (varying rank, target modules, alpha)
- Training: 200 steps per unit, lr=2e-4, batch=16, cosine LR with warmup
- Fitness: `accuracy - 0.01 × log(n_trainable)` (parsimony-penalized)
- Selection: Tournament (k=3) with Boltzmann sampling
- Fusion: Hybrid sequential/parallel config merge + mutation
- Gelation: CUSUM change-point detection triggered at generation 8
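The fitness and selection steps above can be sketched as follows; the temperature value and the `(accuracy, n_trainable)` population encoding are illustrative assumptions:

```python
import math
import random

def fitness(accuracy: float, n_trainable: int) -> float:
    """Parsimony-penalized fitness: reward accuracy, tax parameter count."""
    return accuracy - 0.01 * math.log(n_trainable)

def tournament_boltzmann(population, k=3, temperature=0.1):
    """Draw k entrants at random, then sample one winner with probability
    proportional to exp(fitness / temperature)."""
    entrants = random.sample(population, k)
    weights = [math.exp(fitness(acc, n) / temperature) for acc, n in entrants]
    return random.choices(entrants, weights=weights, k=1)[0]

# Illustrative (accuracy, n_trainable) population
pop = [(0.99, 400_000), (0.92, 50_000), (0.88, 20_000)]
winner = tournament_boltzmann(pop, k=3)
```

The Boltzmann weighting keeps selection stochastic: fitter units win more often, but weaker units retain a nonzero chance, which preserves diversity in the population.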
The best config (r=8, all targets) was already present in the initial random population at 99.5% accuracy; evolution refined it to 99.65% over 19 generations, confirming that SymbioGPT's pre-trained representations reach near-perfect CoLA training accuracy with almost any LoRA config.
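The gelation check can be sketched as a one-sided CUSUM on per-generation fitness gains; the `drift` and `threshold` values here are illustrative, not the project's actual settings:

```python
def cusum_gelation(fitness_history, drift=0.001, threshold=0.0045):
    """One-sided CUSUM on generation-to-generation fitness gains: accumulate
    evidence that gains have fallen below `drift`, and flag gelation once
    the cumulative sum exceeds `threshold`."""
    s = 0.0
    for gen in range(1, len(fitness_history)):
        gain = fitness_history[gen] - fitness_history[gen - 1]
        s = max(0.0, s + (drift - gain))  # grows only while improvement stalls
        if s > threshold:
            return gen  # change point: the run has converged
    return None

# Fitness climbs quickly, then plateaus from generation 4 onward
history = [0.80, 0.90, 0.95, 0.97] + [0.97] * 6
gen = cusum_gelation(history)  # flags gelation at generation 8 for this history
```

Because the statistic resets to zero whenever gains exceed the drift, only a sustained plateau (not a single slow generation) triggers the convergence signal.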
## Usage
This adapter requires the SymbioGPT model architecture (not available via transformers). See the training notebook for the full model definition and LoRA injection code.
```python
import torch
from huggingface_hub import hf_hub_download

# Load base model (requires symbio_model.py)
# ... create SymbioGPT with SymbioConfig ...

# Load LoRA weights
weights_path = hf_hub_download(
    "LisaMegaWatts/SymbioGPT-GrammarExpert-20260301",
    "lora_weights.pt",
)
lora_state = torch.load(weights_path, map_location="cpu")

# Inject LoRA into base model
# inject_lora(model, target_modules=['wq','wk','wv','wo','w1','v','w2'], rank=8, alpha=8.0)
# load_lora_state(model, lora_state)
```
## Files

| File | Description |
|---|---|
| `lora_weights.pt` | LoRA A/B parameter state dict (1.57 MB) |
| `metadata.json` | Evolution config, best unit stats, gelation info |
## Part of Symbiogenesis

This model is part of the Symbiogenesis project: biologically-inspired evolutionary architecture search for language models. The framework uses concepts from symbiogenesis (endosymbiotic evolution) to evolve LoRA adapter populations through fusion, mutation, and selection, with CUSUM gelation detection as a convergence signal.
W&B run: grammar-expert-symbiogpt-10m
## Related Models

- SymbioGPT-10M – base model (pre-trained)
- SymbioSLM – smaller 4M-parameter attention-free variant
- Ouroboros-1MContext-Gemma-270m – Gemma-based comparison model