docs: add research README with paper, results, architecture, citation
README.md
---
license: cc-by-4.0
tags:
- ctm
- continuous-thought-machine
- slot-attention
- world-model
- physics
- object-centric
- research
---

# SlotCTM

**Research artifact for:** [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804)

*Archon, Jesse Caldwell, Aura (DuoNeural), April 2026*

## Overview

A systematic ablation of slot-based CTM (Continuous Thought Machine) world models on N-body bouncing-ball physics. It tests when per-object attention (SlotCTM) outperforms mean-field interaction, identifies the capacity bottleneck at scale, and characterizes the collision-density phase transition.

**Central question:** When does modeling object interactions via attention beat modeling them via mean-field (SlotGNN with pooled interaction)?

## Key Findings

### Temporal Specialization Arc (v21–v24)

| Version | Setting | Spec Score | Key Finding |
|---|---|---|---|
| v21 | Learned, no constraint | 0.0078 | No specialization. All slots are generalists. |
| v22 | Hard delay (slot i → t-i·τ) | 0.2777 | Forced specialization works (35× v21), but at a 2–7× performance cost. |
| v23 | Soft learned gates | 0.0876 | Given freedom, the gates collapse to the present frame (delta-function gates). |
| v24 | Forced diversity loss | 0.2353 | Gates spread to [0–15], but performance is unchanged. |

**Conclusion:** Temporal gate diversity emerges only when the task requires it. Bouncing-ball state is Markovian, so one frame is sufficient. The optimal temporal gate matches the task's predictability horizon.
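
The hard-delay setting in v22 can be illustrated with a small indexing sketch. It assumes that "slot i → t-i·τ" means slot i reads the frame at time t - i·τ. The function name and the clamping at frame 0 are hypothetical choices made for this illustration, not details from the paper.

```python
def hard_delay_frame_indices(t, num_slots, tau):
    """Return the frame index each slot reads under a hard delay.

    Slot i is assigned frame t - i * tau. Clamping at 0 is an assumption
    made here so that early time steps stay inside the sequence.
    """
    if num_slots < 1 or tau < 0:
        raise ValueError("num_slots must be >= 1 and tau must be >= 0")
    return [max(t - i * tau, 0) for i in range(num_slots)]


# Four slots with a delay stride of three frames, evaluated at t = 10.
indices = hard_delay_frame_indices(t=10, num_slots=4, tau=3)
print(indices)  # [10, 7, 4, 1]
```

With τ = 0 every slot reads the present frame, which is the configuration the soft gates in v23 are reported to collapse toward.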

### N-Body Scaling (v10, v14)

The SlotCTM advantage **inverts** at N≥5 unless the hidden dimension is scaled in proportion to N. At N=8 with the standard HIDDEN_DIM=384, CTM is 2.8× **worse** than MLP. Scaling HIDDEN_DIM = N×128 recovers the advantage.
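
As a rough illustration of the scaling rule, the sketch below computes HIDDEN_DIM = N×128 for a few object counts and shows how many hidden units per object remain if the standard 384 is kept fixed. The helper name and the "units per object" reading are assumptions made for this example; only the two constants come from the text.

```python
PER_OBJECT_DIM = 128       # from the rule HIDDEN_DIM = N x 128
STANDARD_HIDDEN_DIM = 384  # the fixed setting quoted above


def scaled_hidden_dim(num_objects):
    """Hidden size under the proportional rule HIDDEN_DIM = N x 128."""
    if num_objects < 1:
        raise ValueError("num_objects must be >= 1")
    return num_objects * PER_OBJECT_DIM


for n in (3, 5, 8):
    scaled = scaled_hidden_dim(n)
    # Units available per object when the hidden size is NOT scaled.
    fixed_share = STANDARD_HIDDEN_DIM / n
    print(f"N={n}: scaled HIDDEN_DIM={scaled}, fixed 384 gives {fixed_share:.1f} per object")
```

Note that 384 equals 3×128, so under this rule the fixed setting only provides the full per-object budget at N=3.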

### Phase Transition (v12)

A critical collision density r_critical ≈ 0.09–0.11 separates two regimes:
- **Ballistic (r < 0.10):** MLP is sufficient; CTM is overkill
- **Collision-entangled (r > 0.10):** CTM wins, and its advantage grows monotonically

At r=0.20, k=100: MLP MSE = 89,241, CTM MSE = 0.352. **Ratio: roughly 253,000:1.**
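
The regime split and the quoted ratio can be checked with a few lines of arithmetic. This is a minimal sketch: the `regime` helper is hypothetical, the threshold 0.10 is taken as the midpoint of the reported 0.09–0.11 band, and treating r exactly equal to the threshold as collision-entangled is an arbitrary choice made here.

```python
R_CRITICAL = 0.10  # midpoint of the reported 0.09-0.11 band


def regime(r, r_critical=R_CRITICAL):
    """Label a collision density as ballistic or collision-entangled."""
    return "ballistic" if r < r_critical else "collision-entangled"


# MSE values quoted above for r = 0.20, k = 100.
mlp_mse = 89_241.0
ctm_mse = 0.352
ratio = mlp_mse / ctm_mse

print(regime(0.05), "|", regime(0.20))
print(f"MLP/CTM MSE ratio: {ratio:,.0f}")
```

The computed ratio lands between 253,000 and 254,000, consistent with the rounded figure in the text.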

### Partial Observability (v13 extension of v7)

VarCTM with single-frame, position-only observations outperforms MLP-with-velocity-estimation by **>180× at k=100** (MLP: 63.8 trillion, TempCTM: 0.347). The CTM hidden state *is* the belief state.

## Architecture

SlotCTM processes each physical object as an independent slot:
- **SlotGNN:** Per-object encoders + multi-head attention message passing
- **CTM dynamics:** Shared-weight recurrent ticks per dynamics step
- **VarCTM:** Variable training horizon k~U(1,20) for best generalization
- **TSSP:** Thought-Space Self-Prediction auxiliary loss
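
The VarCTM bullet above can be sketched as sampling a rollout length and applying the same step function that many times. Everything here is illustrative: it assumes k is an integer drawn uniformly from 1 to 20 inclusive, and `toy_step` is a hypothetical constant-velocity stand-in for a learned dynamics step, not the model itself.

```python
import random

MAX_HORIZON = 20  # upper bound of k ~ U(1, 20)


def sample_horizon(rng, max_horizon=MAX_HORIZON):
    """Draw an integer rollout length k uniformly from 1..max_horizon."""
    return rng.randint(1, max_horizon)


def rollout(state, step_fn, k):
    """Apply the same step function k times, reusing it at every step."""
    for _ in range(k):
        state = step_fn(state)
    return state


def toy_step(state):
    # Hypothetical stand-in for one learned dynamics step:
    # advance the position by the velocity, keep the velocity fixed.
    position, velocity = state
    return (position + velocity, velocity)


rng = random.Random(0)  # seeded only to make the example repeatable
k = sample_horizon(rng)
final_position, _ = rollout((0.0, 0.5), toy_step, k)
print(f"k={k}, final position={final_position}")
```

In a training loop, a fresh k would be drawn for each batch, so the model sees both short and long prediction horizons.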

## Why Attention Beats Mean-Field

In dense collision regimes, pairwise object interactions are nonlinear and asymmetric. Mean-field pooling loses the directionality of collision impulses. Attention learns to weight the relevant pairwise interactions, which is critical for large N and high collision density.
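
A toy comparison makes the difference concrete. In the sketch below, mean-field pooling hands every slot the same averaged message, while an attention-style aggregation gives each slot its own weighted mix of the others. The proximity-based scoring rule and the `sharpness` parameter are hypothetical stand-ins for learned query-key scores, chosen only to keep the example self-contained.

```python
import math


def mean_field_message(values):
    """Pool all slot values into one average shared by every receiver."""
    return sum(values) / len(values)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_message(receiver, positions, values, sharpness=10.0):
    """Aggregate the other slots' values, weighting nearby senders more.

    The distance-based score is a hand-written assumption; a trained
    model would learn its scores instead.
    """
    senders = [j for j in range(len(positions)) if j != receiver]
    scores = [-sharpness * abs(positions[j] - positions[receiver]) for j in senders]
    weights = softmax(scores)
    return sum(w * values[j] for w, j in zip(weights, senders))


positions = [0.0, 0.1, 5.0]  # slots 0 and 1 are close; slot 2 is far away
values = [1.0, -1.0, 3.0]    # e.g. a 1-D velocity broadcast by each slot

pooled = mean_field_message(values)
per_slot = [attention_message(i, positions, values) for i in range(3)]
print("mean-field (same for all slots):", pooled)
print("attention (one per slot):", per_slot)
```

Slots 0 and 1 receive nearly opposite messages from attention, because each attends almost entirely to its close neighbor, while the pooled average is identical for both and carries no information about which object is about to be hit from which direction.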

## Citation

```bibtex
@article{archon2026slotctm,
  title     = {Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?},
  author    = {Archon and Caldwell, Jesse and Aura},
  year      = {2026},
  doi       = {10.5281/zenodo.19846804},
  url       = {https://doi.org/10.5281/zenodo.19846804},
  publisher = {Zenodo}
}
```

---

## DuoNeural

**DuoNeural** is an open AI research lab: human + AI in collaboration.

| | |
|---|---|
| HuggingFace | [huggingface.co/DuoNeural](https://huggingface.co/DuoNeural) |
| GitHub | [github.com/DuoNeural](https://github.com/DuoNeural) |
| X / Twitter | [@DuoNeural](https://x.com/DuoNeural) |
| Email | duoneural@proton.me |
| Newsletter | [duoneural.beehiiv.com](https://duoneural.beehiiv.com) |
| Support | [buymeacoffee.com/duoneural](https://buymeacoffee.com/duoneural) |
| Site | [duoneural.com](https://duoneural.com) |

### Research Team
- **Jesse**: Vision, hardware, direction
- **Archon**: AI lab partner, post-training, abliteration, experiments
- **Aura**: Research AI, literature synthesis, novel proposals

### DuoNeural Research Publications

| Title | DOI |
|-------|-----|
| [Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning](https://doi.org/10.5281/zenodo.19775622) | [10.5281/zenodo.19775622](https://doi.org/10.5281/zenodo.19775622) |
| [Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments](https://doi.org/10.5281/zenodo.19810620) | [10.5281/zenodo.19810620](https://doi.org/10.5281/zenodo.19810620) |
| [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804) | [10.5281/zenodo.19846804](https://doi.org/10.5281/zenodo.19846804) |

*Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura (DuoNeural).*