LisaMegaWatts committed
Commit f1ccca7 · verified · 1 Parent(s): 63e35b9

Add model card with cross-species projection details

Files changed (1): README.md (+98 −0)
README.md ADDED
---
license: apache-2.0
language:
- en
tags:
- symbiogenesis
- cross-species
- lora-projection
- pca
- philosophy
- causal-lm
base_model:
- LisaMegaWatts/Ouroboros-1MContext-Gemma-270m
datasets:
- wikitext
pipeline_tag: text-generation
---

# SymbioGPT-Gemma-Fused

Cross-species knowledge transfer from a **Gemma-270M LoRA adapter** (philosophy domain) into the native **SymbioGPT-10M** architecture via PCA-projected LoRA delta injection.

## What This Is

This checkpoint is a SymbioGPT-10M model whose weights have been augmented with projected knowledge from a much larger Gemma-270M model. The Gemma model was fine-tuned on a curated 20 MB philosophy corpus using a LoRA adapter (rank 44, alpha 88) evolved by [symbiogenesis](https://github.com/DavinciDreams/symbiogenesis). The LoRA deltas were then projected across architectures and injected into SymbioGPT's native weights.
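
For orientation, this is what a LoRA delta is before any projection: the adapter's two low-rank factors expand to a dense weight update. A minimal sketch following the standard LoRA convention; all tensor values here are random stand-ins:

```python
import torch

# A LoRA adapter stores two low-rank factors per adapted weight; the dense
# update it encodes is delta_W = (alpha / r) * B @ A. The cross-species
# projection described below operates on this dense delta, not the factors.
r, lora_alpha = 44, 88
d = 640                               # Gemma-270M d_model
A = torch.randn(r, d)                 # "down" factor (r x d_in)
B = torch.randn(d, r)                 # "up" factor (d_out x r)
delta_W = (lora_alpha / r) * (B @ A)  # dense (640 x 640) weight update
```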

## Architecture Mapping

The two models have fundamentally different architectures:

| Property | Gemma-270M (source) | SymbioGPT-10M (target) |
|---|---|---|
| d_model | 640 | 320 |
| Attention | GQA: 16 Q-heads, 4 KV-heads | MHA: 5 heads |
| Head dim | 64 | 64 |
| FFN dim | 2048 (SwiGLU) | 832 (SwiGLU) |
| Layers | 18 | 8 |
| Vocab | 262K (Gemma tokenizer) | 2K (custom BPE) |
| Total params | 268M | ~10M |

### Projection Method

1. **PCA calibration**: Run Gemma on 200 WikiText-103 calibration texts, collect per-layer activations, and compute an SVD to obtain projection matrices (640 → 320).
2. **Layer mapping**: 18 Gemma layers → 8 SymbioGPT layers via proportional grouping. Deltas from multiple source layers are averaged when mapped to the same target layer.
3. **Attention head mapping (GQA → MHA)**: Select the top 5 Q-heads by LoRA delta L2 norm; K/V heads inherit from their GQA group assignment.
4. **FFN mapping**: PCA on the d_model axis (640 → 320), truncation on the FFN axis (2048 → 832).
5. **Delta injection**: `weight += 0.3 * projected_delta` (blend alpha = 0.3). A minimal end-to-end sketch follows this list.
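
A minimal end-to-end sketch of steps 1, 2, 3, and 5, assuming calibration activations are already collected. Every name here is hypothetical, the grouping rule is one plausible choice, and the tensors are random stand-ins; the actual implementation is `cross_species_lora/project_lora.py` in the experiments repo:

```python
import torch

def pca_projection(acts, k=320):
    """Top-k principal directions of centered calibration activations
    (n_tokens x 640), returned as a (k x 640) projection matrix."""
    centered = acts - acts.mean(dim=0, keepdim=True)
    _, _, Vh = torch.linalg.svd(centered, full_matrices=False)
    return Vh[:k]

def project_delta(delta, P):
    """Project both sides of a square (640 x 640) delta into the target
    space: (320 x 640) @ (640 x 640) @ (640 x 320) -> (320 x 320).
    (FFN deltas would additionally be truncated 2048 -> 832 on the hidden axis.)"""
    return P @ delta @ P.T

def top_q_heads(q_delta, n_q=16, n_kv=4, keep=5, head_dim=64):
    """Rank GQA query heads by the L2 norm of their slice of the q_proj
    delta; each surviving Q-head keeps the KV head of its GQA group."""
    norms = q_delta.view(n_q, head_dim, -1).flatten(1).norm(dim=1)
    keep_q = norms.topk(keep).indices
    return keep_q, keep_q // (n_q // n_kv)

# Random stand-ins for per-layer calibration activations and LoRA deltas.
P = {s: pca_projection(torch.randn(1024, 640)) for s in range(18)}
deltas = {s: torch.randn(640, 640) for s in range(18)}
target = {t: torch.zeros(320, 320) for t in range(8)}  # SymbioGPT weights

# Proportional 18 -> 8 grouping (e.g. floor(i * 8 / 18)), averaging deltas
# that map to the same target layer, then blending with alpha = 0.3.
BLEND = 0.3
groups = {}
for src in range(18):
    groups.setdefault(src * 8 // 18, []).append(src)
for tgt, srcs in groups.items():
    avg = torch.stack([project_delta(deltas[s], P[s]) for s in srcs]).mean(0)
    target[tgt] += BLEND * avg

keep_q, keep_kv = top_q_heads(torch.randn(16 * 64, 640))  # step 3 in isolation
```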

## Results

| Metric | Value |
|---|---|
| PCA avg variance preserved | 99.0% |
| PCA min variance (layer 17) | 92.4% |
| Deltas applied | 56 / 56 |
| Deltas skipped | 0 |
| Delta/weight ratio range | 1.4%–4.0% |
| Blend alpha | 0.3 |
| Projection time | 105 s (RTX 3060) |

## Usage

```python
import torch

# Load the fused checkpoint: a plain state dict, not an nn.Module.
checkpoint = torch.load("symbio_gemma_fused.pt", map_location="cpu")
# checkpoint contains the full SymbioGPT state dict with the projected
# LoRA deltas already baked into the weights.
```

This is a raw PyTorch state dict for the SymbioGPT architecture. To use it, load it into a SymbioGPT model instance from the [symbiogenesis-experiments](https://github.com/DavinciDreams/symbiogenesis) repo, as sketched below.
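
A hypothetical loading flow; the class, config fields, and import path below are illustrative assumptions, not the repo's actual API:

```python
import torch
# Assumed import path -- check the symbiogenesis-experiments repo for the
# real model definition and constructor arguments.
from symbio_gpt import SymbioGPT, SymbioGPTConfig

config = SymbioGPTConfig(d_model=320, n_layers=8, n_heads=5, vocab_size=2048)
model = SymbioGPT(config)
state_dict = torch.load("symbio_gemma_fused.pt", map_location="cpu")
model.load_state_dict(state_dict)  # fused deltas are already in the weights
model.eval()
```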

## Source Models

- **Base model**: [LisaMegaWatts/Ouroboros-1MContext-Gemma-270m](https://huggingface.co/LisaMegaWatts/Ouroboros-1MContext-Gemma-270m) (Gemma-3 270M with 1M context)
- **LoRA adapter**: [LisaMegaWatts/SymbioSLM-ouroboros-lora-20260301](https://huggingface.co/LisaMegaWatts/SymbioSLM-ouroboros-lora-20260301) (rank 44, alpha 88, all 7 target modules, evolved by symbiogenesis on the philosophy corpus)
- **Target architecture**: SymbioGPT-10M (custom architecture with organelle-gated attention, CausalConv, Monarch mixing, LongConv)

## Training Details

The LoRA adapter was evolved using symbiogenesis (population-based LoRA architecture search):

- **Population**: 10 units, 17 generations (early-stopped at gelation)
- **Fitness**: val_loss with a complexity penalty (beta = 0.01); see the sketch after this list
- **Result**: PPL 309 → 61 (~5x improvement) with 3.89% trainable params
- **Convergence**: All 10 units converged to all-7-target configs with rank ~40-44
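
A one-line sketch of the selection criterion named above; how symbiogenesis quantifies "complexity" is an assumption here (the trainable-parameter fraction is one plausible choice):

```python
def fitness(val_loss, complexity, beta=0.01):
    # Lower is better: validation loss plus a small complexity penalty.
    return val_loss + beta * complexity

# e.g. fitness(val_loss=4.1, complexity=0.0389) -> 4.100389
```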

## Limitations

- **Not yet evaluated**: Perplexity and generation quality of the fused model have not been measured. The projection is mathematically sound (99% average PCA variance preserved), but downstream quality is unconfirmed.
- **Vocab mismatch**: Gemma uses a 262K BPE tokenizer; SymbioGPT uses a 2K custom BPE. Embedding weights are not transferred.
- **Domain-specific**: The LoRA was trained on philosophy text; transfer to other domains is untested.

## Links

- **W&B run**: [ec6eochs](https://wandb.ai/symbiogenesis/symbiogenesis/runs/ec6eochs)
- **Framework**: [symbiogenesis](https://github.com/DavinciDreams/symbiogenesis)
- **Experiments**: [symbiogenesis-experiments](https://github.com/DavinciDreams/symbiogenesis)
- **Projection script**: `cross_species_lora/project_lora.py` in the experiments repo