EnricoFermi committed on
Commit 7afe2c9 · verified · 1 Parent(s): 1484529

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -146,6 +146,10 @@ Cycle 3: train (batch=3, 22B, 14.5GB) -> prune -> defrag (2.8x
 
  40% faster total training and a 33% smaller final model.
 
+ ### Head Mitosis
+
+ Pruning frees slots. Mitosis fills them. When a head is overutilized, it gets cloned into a pruned slot, each copy at 50% gate value to maintain output continuity. After continued training, the clones **diverge and specialize**, like cell differentiation after biological mitosis. The model grows new specialized capacity exactly where it's needed.
+
  **Read the full paper**: [Experiential Plasticity: Transformers That Grow Their Own Architecture From Experience](https://github.com/CambrianTech/continuum/blob/main/docs/papers/EXPERIENTIAL-PLASTICITY.md)
 
  ## Output Samples
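
Note on the added Head Mitosis section: the cloning step it describes can be illustrated with a small toy model. The sketch below is not the repository's implementation. It assumes each head contributes `gate * f(weights, x)` to the layer output, that a pruned slot is simply a head whose gate is 0, and that "50% gate value" means half of the source head's gate. The names `Head`, `layer_output`, and `mitosis` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Head:
    """Toy attention head: a weight vector plus a scalar output gate (assumed layout)."""
    weights: List[float]
    gate: float  # assumption: a gate of 0.0 marks a pruned slot


def head_output(head: Head, x: List[float]) -> float:
    # Stand-in for real attention: a gated dot product.
    return head.gate * sum(w * xi for w, xi in zip(head.weights, x))


def layer_output(heads: List[Head], x: List[float]) -> float:
    # The layer output is the sum of all gated head outputs.
    return sum(head_output(h, x) for h in heads)


def mitosis(heads: List[Head], source: int, slot: int) -> None:
    """Clone heads[source] into the pruned heads[slot], splitting the gate in half."""
    if heads[slot].gate != 0.0:
        raise ValueError("target slot is not pruned")
    half = heads[source].gate * 0.5
    # Copy the weights so the two clones can diverge during later training.
    heads[slot] = Head(weights=list(heads[source].weights), gate=half)
    heads[source].gate = half


heads = [Head([1.0, 2.0], 1.0), Head([0.5, -1.0], 0.0)]  # head 1 is pruned
x = [3.0, 4.0]
before = layer_output(heads, x)
mitosis(heads, source=0, slot=1)
after = layer_output(heads, x)
```

Because the two clones start with identical weights and each carries half of the original gate, their summed contribution equals the original head's output, so `before` and `after` match in this toy setup. Any later divergence would come from training updating the two weight copies independently.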