EnricoFermi committed on
Commit 91c002b · verified · 1 Parent(s): f564207

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -138,6 +138,10 @@ Cycle 3: train (batch=3, 22B, 14.5GB) -> prune -> defrag (2.8x

  40% faster total training and a 33% smaller final model.

+ ### Head Mitosis
+
+ Pruning frees slots. Mitosis fills them. When a head is overutilized, it is cloned into a pruned slot, with each copy at 50% gate value to maintain output continuity. After continued training, the clones **diverge and specialize**, like cell differentiation after biological mitosis. The model grows new specialized capacity exactly where it's needed.
+
  **Read the full paper**: [Experiential Plasticity: Transformers That Grow Their Own Architecture From Experience](https://github.com/CambrianTech/continuum/blob/main/docs/papers/EXPERIENTIAL-PLASTICITY.md)

  ## Output Samples
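
The cloning step described in the added Head Mitosis section can be illustrated with a small sketch. This is not the project's implementation: the `Head` class, the `utilization` scores, the `threshold` parameter, and the reading of "50% gate value" as halving the parent's gate are all assumptions made for illustration. Each head is reduced to a gated dot product so the continuity property is easy to check.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Head:
    """Hypothetical stand-in for an attention head: a weight vector and a scalar gate."""
    weights: List[float]
    gate: float  # assumption: gate == 0.0 marks a pruned (free) slot


def layer_output(heads: List[Head], x: List[float]) -> float:
    """Sum of gated head outputs; each head is simplified to a dot product."""
    return sum(h.gate * sum(w * xi for w, xi in zip(h.weights, x)) for h in heads)


def mitosis(heads: List[Head], utilization: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Clone overutilized heads into pruned slots, splitting the gate in half.

    Returns (source_index, clone_index) pairs. The utilization metric and
    threshold are illustrative assumptions, not taken from the source.
    """
    free = [i for i, h in enumerate(heads) if h.gate == 0.0]
    sources = [i for i, h in enumerate(heads)
               if h.gate > 0.0 and utilization[i] > threshold]
    sources.sort(key=lambda i: utilization[i], reverse=True)  # busiest heads first

    cloned = []
    for src, dst in zip(sources, free):  # stops when either list runs out
        half = heads[src].gate / 2.0
        heads[src].gate = half
        # The clone starts with identical weights, so the two halves sum to the
        # parent's original contribution and the layer output is unchanged.
        heads[dst] = Head(weights=list(heads[src].weights), gate=half)
        cloned.append((src, dst))
    return cloned


heads = [Head([1.0, 2.0], 1.0), Head([0.5, -1.0], 1.0), Head([0.0, 0.0], 0.0)]
x = [3.0, 4.0]
before = layer_output(heads, x)
pairs = mitosis(heads, utilization=[0.9, 0.2, 0.0], threshold=0.5)
after = layer_output(heads, x)
print(pairs, before, after)
```

Because parent and clone begin with the same weights and half the gate each, the layer output is the same immediately before and after the split. Any divergence between the two copies would come only from later training, which this sketch does not model.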