SymbioGPT Grammar Expert LoRA

A grammar-specialist LoRA adapter for SymbioGPT-10M, trained on CoLA (Corpus of Linguistic Acceptability) via symbiogenesis evolution, an evolutionary architecture search that discovers LoRA configurations through population-based fusion, mutation, and selection.

Key Results

| Metric | Value |
|---|---|
| CoLA train accuracy | 99.65% |
| CoLA test accuracy | 53.2% (majority baseline: 70%) |
| Base model perplexity | 2128.5 |
| Perplexity with LoRA | 2135.8 (+0.3%) |
| Gelation (convergence) | Generation 8 of 19 |
| LoRA params | 385,666 (3.5% of base model) |

Status: The grammar-specific generation eval (errors/100w) is pending. The perplexity result shows the LoRA doesn't degrade generation; whether the discriminative grammar knowledge transfers to improved generation quality is still being tested.

Architecture

SymbioGPT-10M is a multi-organelle decoder-only language model, not a standard transformer. Each block contains:

  • CausalDepthwiseConv1d: local n-gram pattern detection
  • MonarchMatrix: sub-quadratic global sequence mixing via butterfly factorization
  • LongConv: dense causal convolution for medium-range dependencies
  • CausalSelfAttention: standard multi-head attention with RoPE
  • OrganelleGate: learned per-channel blend across all organelles

SymbioGPT-10M: d_model=320, n_layers=8, n_heads=5, vocab_size=2000 (11,053,400 total parameters)
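The OrganelleGate step above can be illustrated with a short sketch. This is a hypothetical reconstruction, not the actual SymbioGPT code: the per-channel softmax over organelles, the zero initialization, and the tensor shapes are all assumptions.

```python
import torch
import torch.nn as nn

class OrganelleGate(nn.Module):
    """Sketch of a learned per-channel blend across organelle outputs.
    Hypothetical: the real SymbioGPT gate may differ in detail."""

    def __init__(self, d_model: int, n_organelles: int):
        super().__init__()
        # One logit per (organelle, channel); softmax over the organelle axis.
        self.logits = nn.Parameter(torch.zeros(n_organelles, d_model))

    def forward(self, organelle_outputs: list[torch.Tensor]) -> torch.Tensor:
        # Each tensor: (batch, seq, d_model)
        stacked = torch.stack(organelle_outputs, dim=0)  # (N, B, T, D)
        weights = torch.softmax(self.logits, dim=0)      # (N, D), sums to 1 per channel
        return (weights[:, None, None, :] * stacked).sum(dim=0)

# Blend four organelle outputs (random stand-ins for the modules listed above).
gate = OrganelleGate(d_model=320, n_organelles=4)
outs = [torch.randn(2, 16, 320) for _ in range(4)]
y = gate(outs)  # (2, 16, 320); at init (zero logits) this is the plain mean
```

At initialization the gate weights are uniform, so the block starts as an even mix of organelles and can learn to specialize per channel during training.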

LoRA Configuration

This adapter uses manual LoRA injection (not PEFT) since SymbioGPT is a custom PyTorch model. Low-rank A×B matrices are injected into all 7 linear layer types across all 8 blocks:

| Target | Layer type | Per block |
|---|---|---|
| wq, wk, wv, wo | Attention projections | 4 × (320→320) |
| w1, v | SwiGLU gate + value | 2 × (320→512) |
| w2 | SwiGLU output | 1 × (512→320) |

Best evolved config: rank=8, alpha=8.0, all 7 targets across all 8 blocks = 56 LoRA injections total
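A minimal sketch of what the manual injection can look like, assuming the attention and SwiGLU layers are plain nn.Linear modules. The LoRALinear wrapper, the inject_lora helper, and the initialization scheme are illustrative assumptions, not the training notebook's actual code; rank and alpha match the evolved config above.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base Linear plus a trainable low-rank A @ B path (sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weight
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))  # starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A @ self.B)

def inject_lora(model: nn.Module, target_names: set, rank: int = 8, alpha: float = 8.0):
    """Wrap every child Linear whose attribute name is in target_names."""
    for module in list(model.modules()):
        for name, child in list(module.named_children()):
            if name in target_names and isinstance(child, nn.Linear):
                setattr(module, name, LoRALinear(child, rank, alpha))

# One injection: a 320->320 attention projection at rank 8.
layer = nn.Linear(320, 320)
lora = LoRALinear(layer, rank=8, alpha=8.0)
x = torch.randn(2, 320)
out = lora(x)  # identical to layer(x) until B is trained away from zero
```

Because B is zero-initialized, the wrapped model reproduces the base model exactly at step 0, and only the A/B matrices (plus the gate's scale) receive gradients.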

Evolution Details

The LoRA configuration was discovered through symbiogenesis evolution:

  1. Population: 8 random LoRAUnit configs (varying rank, target modules, alpha)
  2. Training: 200 steps per unit, lr=2e-4, batch=16, cosine LR with warmup
  3. Fitness: accuracy - 0.01 × log(n_trainable) (parsimony-penalized)
  4. Selection: Tournament (k=3) with Boltzmann sampling
  5. Fusion: Hybrid sequential/parallel config merge + mutation
  6. Gelation: CUSUM change-point detection triggered at generation 8
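The fitness and selection steps above can be sketched as follows. The fitness formula is taken from the recipe; the Boltzmann temperature and the toy population values are invented for illustration.

```python
import math
import random

def fitness(accuracy: float, n_trainable: int) -> float:
    # Parsimony-penalized fitness from the recipe: reward accuracy,
    # lightly penalize trainable-parameter count.
    return accuracy - 0.01 * math.log(n_trainable)

def boltzmann_tournament(population, k=3, temperature=0.05, rng=random):
    """Draw k contestants, then sample the winner with probability
    proportional to exp(fitness / temperature) (details assumed)."""
    contestants = rng.sample(population, k)
    weights = [math.exp(fitness(u["acc"], u["n_trainable"]) / temperature)
               for u in contestants]
    return rng.choices(contestants, weights=weights, k=1)[0]

# Toy population of LoRA units: (accuracy, trainable-parameter count).
pop = [
    {"acc": 0.995, "n_trainable": 385_666},
    {"acc": 0.970, "n_trainable": 40_448},
    {"acc": 0.940, "n_trainable": 6_656},
]
parent = boltzmann_tournament(pop, k=3)
```

The log-scaled penalty means a unit with 10× more parameters must beat a smaller rival by about 2.3 accuracy points to win on fitness, which keeps the search from bloating adapters for marginal gains.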

The best config (r=8, all targets) was found in the initial random population at 99.5% accuracy. Evolution refined it to 99.65% over 19 generations, confirming that SymbioGPT's pre-trained representations are powerful enough for near-perfect CoLA training accuracy with almost any LoRA config.

Usage

This adapter requires the SymbioGPT model architecture (not available via transformers). See the training notebook for the full model definition and LoRA injection code.

```python
import torch
from huggingface_hub import hf_hub_download

# Load base model (requires symbio_model.py)
# ... create SymbioGPT with SymbioConfig ...

# Load LoRA weights
weights_path = hf_hub_download(
    "LisaMegaWatts/SymbioGPT-GrammarExpert-20260301",
    "lora_weights.pt",
)
lora_state = torch.load(weights_path, map_location="cpu")

# Inject LoRA into the base model, then load the adapter weights
# inject_lora(model, target_modules=['wq','wk','wv','wo','w1','v','w2'], rank=8, alpha=8.0)
# load_lora_state(model, lora_state)
```

Files

| File | Description |
|---|---|
| lora_weights.pt | LoRA A/B parameter state dict (1.57 MB) |
| metadata.json | Evolution config, best unit stats, gelation info |

Part of Symbiogenesis

This model is part of the Symbiogenesis project: biologically inspired evolutionary architecture search for language models. The framework uses concepts from symbiogenesis (endosymbiotic evolution) to evolve LoRA adapter populations through fusion, mutation, and selection, with CUSUM gelation detection as a convergence signal.
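The gelation signal can be sketched as a one-sided CUSUM change-point test on per-generation fitness gains. Everything below is an assumed illustration: the drift/threshold values and the fitness trajectory are made up, and the real detector's settings live in metadata.json.

```python
def cusum_gelation(fitness_history, drift=0.001, threshold=0.003):
    """One-sided CUSUM sketch: accumulate the shortfall of each generation's
    fitness gain below an expected drift; declare gelation (convergence)
    once the accumulated shortfall crosses the threshold."""
    s = 0.0
    for gen in range(1, len(fitness_history)):
        gain = fitness_history[gen] - fitness_history[gen - 1]
        s = max(0.0, s + (drift - gain))  # resets while large gains keep coming
        if s > threshold:
            return gen                    # gelation generation
    return None                           # still improving

# Invented best-fitness trajectory that plateaus: detection fires only
# after several consecutive near-zero gains accumulate.
history = [0.90, 0.95, 0.98, 0.993, 0.994, 0.9945, 0.9947, 0.9948, 0.9948]
gel = cusum_gelation(history)
```

Accumulating shortfall rather than thresholding a single gain makes the stop rule robust to one noisy flat generation, which is why detection can fire well before the generation budget (here, generation 8 of 19) is exhausted.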

W&B run: grammar-expert-symbiogpt-10m
