---
license: cc-by-4.0
tags:
- ctm
- continuous-thought-machine
- slot-attention
- world-model
- physics
- object-centric
- research
---

# SlotCTM

**Research artifact for:** [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804)

*Archon, Jesse Caldwell, and Aura (DuoNeural), April 2026*

## Overview

A systematic ablation of slot-based CTM world models on N-body bouncing ball physics. Tests when per-object attention (SlotCTM) outperforms mean-field interaction, identifies the capacity bottleneck at scale, and characterizes the collision density phase transition.

**Central question:** When does modeling object interactions via attention beat modeling them via mean-field (SlotGNN with pooled interaction)?

## Key Findings

### Temporal Specialization Arc (v21–v24)

| Version | Setting | Spec Score | Key Finding |
|---|---|---|---|
| v21 | Learned, no constraint | 0.0078 | No specialization. All slots are generalists. |
| v22 | Hard delay (slot i → t - i·τ) | 0.2777 | Forced specialization works (35× v21), but at a 2–7× performance cost. |
| v23 | Soft learned gates | 0.0876 | Freedom collapses to the present. Delta-function gates. |
| v24 | Forced diversity loss | 0.2353 | Gates spread to [0–15], but performance is unchanged. |

**Conclusion:** Temporal gate diversity emerges only when the task requires it. Bouncing ball state is Markovian, so one frame is sufficient. The optimal temporal gate is the task's predictability horizon.
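
To make the v22 hard-delay rule concrete, here is a minimal sketch of the mapping slot i → t - i·τ. The function name and the clamp to frame 0 at the start of a sequence are assumptions for illustration, not details taken from the paper.

```python
def delayed_frame_index(t: int, slot: int, tau: int) -> int:
    """Frame index that `slot` reads at time `t` under the rule t - slot * tau.

    Clamping to 0 for early timesteps is an assumption of this sketch.
    """
    return max(0, t - slot * tau)


# Frames read by 4 slots at t = 10 with tau = 2.
indices = [delayed_frame_index(10, i, 2) for i in range(4)]
print(indices)  # [10, 8, 6, 4]
```

Under this rule slot 0 always sees the present frame, and each later slot is pushed τ frames further into the past, which is what forces the slots to differ.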

### N-Body Scaling (v10, v14)

The SlotCTM advantage **inverts** at N ≥ 5 unless the hidden dimension scales proportionally with N. At N=8 with the standard HIDDEN_DIM=384, CTM is 2.8× **worse** than MLP. Scaling HIDDEN_DIM = N × 128 recovers the advantage.
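
The scaling rule is simple arithmetic, sketched below. The helper name `scaled_hidden_dim` is hypothetical; only the factor of 128 per object and the fixed value 384 come from the text above.

```python
PER_OBJECT_DIM = 128      # from the rule HIDDEN_DIM = N x 128
STANDARD_HIDDEN_DIM = 384  # the fixed setting reported above


def scaled_hidden_dim(n_objects: int, per_object_dim: int = PER_OBJECT_DIM) -> int:
    """Hidden dimension that grows linearly with the number of objects."""
    if n_objects < 1:
        raise ValueError("n_objects must be positive")
    return n_objects * per_object_dim


for n in (3, 5, 8):
    print(n, scaled_hidden_dim(n))
```

The fixed 384 matches the rule only at N=3. At N=8 the rule asks for 1024, so a 384-dimensional state gives each object well under half the capacity the rule assigns.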

### Phase Transition (v12)

A critical collision density of r_critical ≈ 0.09–0.11 separates two regimes:
- **Ballistic (r < 0.10):** MLP is sufficient; CTM is overkill
- **Collision-entangled (r > 0.10):** CTM wins, and its advantage grows monotonically

At r=0.20, k=100: MLP MSE = 89,241, CTM = 0.352. **Ratio: 253,000:1.**
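
As a quick check on the numbers above, the sketch below classifies a collision density into the two regimes and recomputes the reported error ratio. Collapsing the 0.09–0.11 band to a single threshold of 0.10, and the function name `regime`, are simplifying assumptions.

```python
R_CRITICAL = 0.10  # assumption: one threshold inside the reported 0.09-0.11 band


def regime(r: float, r_critical: float = R_CRITICAL) -> str:
    """Label a collision density r; r == r_critical is treated as ballistic here."""
    return "collision-entangled" if r > r_critical else "ballistic"


# Reported errors at r = 0.20, k = 100.
mlp_mse, ctm_mse = 89_241.0, 0.352
ratio = mlp_mse / ctm_mse

print(regime(0.05), regime(0.20))
print(f"MLP/CTM error ratio: {ratio:,.0f}")
```

The recomputed ratio falls between 253,000 and 254,000, consistent with the rounded 253,000:1 figure.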

### Partial Observability (v13 extension of v7)

VarCTM with single-frame, position-only observations outperforms MLP-with-velocity-estimation by a factor of **more than 180 trillion at k=100** (MLP: 63.8 trillion, TempCTM: 0.347). The CTM hidden state *is* the belief state.

## Architecture

SlotCTM processes each physical object as an independent slot:
- **SlotGNN:** Per-object encoders + multi-head attention message passing
- **CTM dynamics:** Shared-weight recurrent ticks per dynamics step
- **VarCTM:** Variable training horizon k~U(1,20) for best generalization
- **TSSP:** Thought-Space Self-Prediction auxiliary loss
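
The variable-horizon idea behind VarCTM can be sketched as follows: draw a rollout length k for each training example, then apply the same step function k times. This is a toy illustration, not the released training code. Reading k~U(1,20) as a discrete uniform over the integers 1 to 20 is an assumption, and `sample_horizon`, `rollout`, and `toy_step` are hypothetical names.

```python
import random
from typing import Callable, List

State = List[float]


def sample_horizon(rng: random.Random, k_min: int = 1, k_max: int = 20) -> int:
    """Draw a rollout length k, assumed uniform over the integers k_min..k_max."""
    return rng.randint(k_min, k_max)  # both endpoints are inclusive


def rollout(step: Callable[[State], State], state: State, k: int) -> State:
    """Apply the same (shared-weight) step function k times."""
    for _ in range(k):
        state = step(state)
    return state


def toy_step(state: State) -> State:
    """Stand-in dynamics: constant-velocity motion with dt = 0.1."""
    x, v = state
    return [x + 0.1 * v, v]


rng = random.Random(0)
k = sample_horizon(rng)
final = rollout(toy_step, [0.0, 1.0], k)
print(k, final)
```

In a real training loop `toy_step` would be the learned dynamics, and the loss would compare `final` against the ground-truth state k steps ahead.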

## Why Attention Beats Mean-Field

In dense collision regimes, pairwise object interactions are non-linear and non-symmetric. Mean-field pooling loses the directionality of collision impulses. Attention learns to weight the relevant pairwise interactions, which is critical at large N and high collision density.
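
A toy comparison illustrates the argument. Mean-field pooling averages all neighbours equally, so opposing velocities cancel. A softmax-weighted sum can concentrate on the one neighbour that matters. The features and scores below are hand-set for illustration; a trained model would compute the scores from learned queries and keys, and the function names are hypothetical.

```python
import math
from typing import List

Vec = List[float]


def mean_field_message(i: int, feats: List[Vec]) -> Vec:
    """Average of all other objects' features: every neighbour weighs the same."""
    others = [f for j, f in enumerate(feats) if j != i]
    return [sum(col) / len(others) for col in zip(*others)]


def attention_message(i: int, feats: List[Vec], scores: List[float]) -> Vec:
    """Softmax-weighted sum of the other objects' features."""
    idx = [j for j in range(len(feats)) if j != i]
    top = max(scores[j] for j in idx)                 # subtract max for stability
    weights = [math.exp(scores[j] - top) for j in idx]
    total = sum(weights)
    dim = len(feats[0])
    return [sum(w * feats[j][d] for w, j in zip(weights, idx)) / total
            for d in range(dim)]


# Object 0 is about to be hit by object 1 (moving left); object 2 is far away (moving right).
feats = [[0.0, 0.0], [-1.0, 0.0], [1.0, 0.0]]  # toy velocity features
scores = [0.0, 5.0, -5.0]                      # hand-set relevance of each object to object 0

mf = mean_field_message(0, feats)
at = attention_message(0, feats, scores)
print("mean-field:", mf)
print("attention: ", at)
```

The mean-field message for object 0 is exactly zero because the two neighbours cancel, so the direction of the incoming impulse is lost. The attention message stays close to the velocity of the colliding neighbour.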

## Citation

```bibtex
@article{archon2026slotctm,
  title     = {Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?},
  author    = {Archon and Caldwell, Jesse and Aura},
  year      = {2026},
  doi       = {10.5281/zenodo.19846804},
  url       = {https://doi.org/10.5281/zenodo.19846804},
  publisher = {Zenodo}
}
```

---

## DuoNeural

**DuoNeural** is an open AI research lab: human + AI in collaboration.

| | |
|---|---|
| 🤗 HuggingFace | [huggingface.co/DuoNeural](https://huggingface.co/DuoNeural) |
| 🐙 GitHub | [github.com/DuoNeural](https://github.com/DuoNeural) |
| 🐦 X / Twitter | [@DuoNeural](https://x.com/DuoNeural) |
| 📧 Email | duoneural@proton.me |
| 📬 Newsletter | [duoneural.beehiiv.com](https://duoneural.beehiiv.com) |
| ☕ Support | [buymeacoffee.com/duoneural](https://buymeacoffee.com/duoneural) |
| 🌐 Site | [duoneural.com](https://duoneural.com) |

### Research Team
- **Jesse**: Vision, hardware, direction
- **Archon**: AI lab partner, post-training, abliteration, experiments
- **Aura**: Research AI, literature synthesis, novel proposals

### DuoNeural Research Publications

| Title | DOI |
|-------|-----|
| [Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning](https://doi.org/10.5281/zenodo.19775622) | [10.5281/zenodo.19775622](https://doi.org/10.5281/zenodo.19775622) |
| [Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments](https://doi.org/10.5281/zenodo.19810620) | [10.5281/zenodo.19810620](https://doi.org/10.5281/zenodo.19810620) |
| [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804) | [10.5281/zenodo.19846804](https://doi.org/10.5281/zenodo.19846804) |
| [The Dynamical Horizon Principle: CTM Gates Converge to the Predictability Limit of Dynamical Systems](https://doi.org/10.5281/zenodo.19952612) | [10.5281/zenodo.19952612](https://doi.org/10.5281/zenodo.19952612) |

*Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, and Aura (DuoNeural).*