	---
	license: cc-by-4.0
	tags:
	- ctm
	- continuous-thought-machine
	- slot-attention
	- world-model
	- physics
	- object-centric
	- research
	---

	# SlotCTM

	Research artifact for: [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804)

	Archon, Jesse Caldwell, Aura — DuoNeural, April 2026

	## Overview

	A systematic ablation of slot-based CTM world models on N-body bouncing-ball physics. The study tests when per-object attention (SlotCTM) outperforms mean-field interaction, identifies the capacity bottleneck at scale, and characterizes the collision-density phase transition.

	Central question: When does modeling object interactions via attention beat modeling them via mean-field (SlotGNN with pooled interaction)?

	## Key Findings

	### Temporal Specialization Arc (v21–v24)

	| Version | Setting | Spec Score | Key Finding |
	|---|---|---|---|
	| v21 | Learned, no constraint | 0.0078 | No specialization. All slots generalists. |
	| v22 | Hard delay (slot i → t-i·τ) | 0.2777 | Forced specialization works (35× v21), but 2–7× perf cost. |
	| v23 | Soft learned gates | 0.0876 | Freedom collapses to present. Delta-function gates. |
	| v24 | Forced diversity loss | 0.2353 | Gates spread to [0–15] but performance unchanged. |

	Conclusion: Temporal gate diversity emerges only when the task requires it. Bouncing-ball state is Markovian, so one frame is sufficient and freely learned gates collapse onto the present. The optimal temporal gate matches the task's predictability horizon.
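The hard-delay setting in v22 (slot i → t-i·τ) can be illustrated with a small indexing sketch. This is not the repository's code: the function names are hypothetical, and clamping at frame 0 for early timesteps is an assumption made only so the example is well defined.

```python
def hard_delay_indices(t, num_slots, tau):
    """Frame index read by each slot under a hard delay: slot i -> t - i*tau.

    Indices that would fall before the start of the sequence are clamped
    to 0 (an assumption of this sketch, not something the README states).
    """
    return [max(t - i * tau, 0) for i in range(num_slots)]


def gather_delayed_frames(frames, t, num_slots, tau):
    """Pick, for every slot, the frame it is forced to look at."""
    return [frames[idx] for idx in hard_delay_indices(t, num_slots, tau)]


# Four slots, delay step tau = 5, current time t = 15.
print(hard_delay_indices(t=15, num_slots=4, tau=5))  # [15, 10, 5, 0]
```

With this scheme only slot 0 sees the present frame. On a Markovian task the older frames carry no extra information, which is one way to read the performance cost reported for v22.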

	### N-Body Scaling (v10, v14)

	The SlotCTM advantage inverts at N≥5 unless the hidden dimension is scaled in proportion to N. At N=8 with the standard HIDDEN_DIM=384, CTM is 2.8× worse than MLP. Setting HIDDEN_DIM = N×128 recovers the advantage.
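The scaling rule above is simple arithmetic, sketched here with hypothetical helper names. Only the constants 128 and 384 come from the text. "Per-object width" is an informal measure introduced for illustration, not a quantity the paper defines.

```python
DIM_PER_OBJECT = 128        # from the rule HIDDEN_DIM = N x 128
STANDARD_HIDDEN_DIM = 384   # the fixed setting quoted above


def scaled_hidden_dim(n_objects):
    """Hidden dimension under the proportional rule HIDDEN_DIM = N x 128."""
    return n_objects * DIM_PER_OBJECT


def per_object_width(hidden_dim, n_objects):
    """Hidden units available per object if capacity were shared evenly."""
    return hidden_dim / n_objects


for n in (3, 5, 8):
    fixed = per_object_width(STANDARD_HIDDEN_DIM, n)
    print(f"N={n}: fixed 384 -> {fixed:.0f} units/object, "
          f"scaled -> HIDDEN_DIM={scaled_hidden_dim(n)}")
```

Under this reading, the standard 384 equals the scaled value at N=3. At N=8 the fixed setting leaves 48 units per object instead of 128.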

	### Phase Transition (v12)

	Collision density r_critical ≈ 0.09–0.11 separates two regimes:
	- Ballistic (r < 0.10): MLP fine, CTM overkill
	- Collision-entangled (r > 0.10): CTM wins, advantage grows monotonically

	At r=0.20, k=100: MLP MSE = 89,241, CTM MSE = 0.352, a ratio of roughly 253,000:1.
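As a quick check of the figures above, the following sketch labels a value of r by regime and recomputes the quoted MSE ratio. The function name and the explicit "transition" band between 0.09 and 0.11 are assumptions for illustration. The README itself only gives the approximate range for r_critical.

```python
R_CRITICAL_LOW = 0.09   # lower end of the quoted r_critical range
R_CRITICAL_HIGH = 0.11  # upper end of the quoted r_critical range


def regime(r):
    """Label r relative to the reported critical band (illustrative only)."""
    if r < R_CRITICAL_LOW:
        return "ballistic"
    if r > R_CRITICAL_HIGH:
        return "collision-entangled"
    return "transition"


mlp_mse = 89_241.0
ctm_mse = 0.352
ratio = mlp_mse / ctm_mse

print(regime(0.05), regime(0.10), regime(0.20))
print(f"MLP/CTM MSE ratio at r=0.20, k=100: {ratio:,.0f}")
```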

	### Partial Observability (v13 extension of v7)

	VarCTM with single-frame position-only observations outperforms MLP-with-velocity-estimation by >180× at k=100; the quoted MSEs (MLP: 63.8 trillion, TempCTM: 0.347) put the ratio nearer 1.8 × 10^14. The CTM hidden state serves as the belief state.

	## Architecture

	SlotCTM processes each physical object as an independent slot:
	- SlotGNN: Per-object encoders + multi-head attention message passing
	- CTM dynamics: Shared-weight recurrent ticks per dynamics step
	- VarCTM: Variable training horizon k~U(1,20) for best generalization
	- TSSP: Thought-Space Self-Prediction auxiliary loss
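Two of the components listed above, the variable training horizon and the shared-weight ticks, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's training loop. It reads k~U(1,20) as a discrete uniform over the integers 1 to 20 inclusive, and `ticks_per_step` and `step_fn` are hypothetical names.

```python
import random


def sample_horizon(rng, k_min=1, k_max=20):
    """Draw a rollout horizon k ~ U(1, 20), read as an inclusive integer range."""
    return rng.randint(k_min, k_max)


def rollout(state, step_fn, k, ticks_per_step=4):
    """Advance `state` by k dynamics steps.

    Each dynamics step applies the same `step_fn` several times, mirroring
    the idea of shared-weight recurrent ticks (the tick count is a placeholder).
    """
    for _ in range(k):
        for _ in range(ticks_per_step):
            state = step_fn(state)
    return state


rng = random.Random(0)
k = sample_horizon(rng)
# Toy "dynamics": a counter, so the result equals k * ticks_per_step.
final_state = rollout(0, lambda s: s + 1, k, ticks_per_step=4)
print(k, final_state)
```

Sampling a fresh k per training batch exposes the model to both short and long rollouts, which is the stated motivation for VarCTM.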

	## Why Attention Beats Mean-Field

	In dense collision regimes, pairwise object interactions are non-linear and non-symmetric. Mean-field pooling loses the directionality of collision impulses. Attention learns to weight the relevant pair interactions, which matters most at large N and high collision density.
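The contrast can be made concrete with a toy comparison. This is a sketch, not the SlotGNN implementation: it uses identity projections instead of learned query/key/value weights, a single head, and includes each object in its own pooling and attention sums.

```python
import math


def mean_field_messages(states):
    """Every object receives the same pooled vector (mean over all objects)."""
    n, dim = len(states), len(states[0])
    pooled = [sum(s[d] for s in states) / n for d in range(dim)]
    return [list(pooled) for _ in states]


def attention_messages(states):
    """Scaled dot-product self-attention with identity projections."""
    dim = len(states[0])
    messages = []
    for query in states:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in states]
        top = max(scores)                      # subtract max for stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        messages.append([sum(w * s[d] for w, s in zip(weights, states))
                         for d in range(dim)])
    return messages


# Three objects with 2-D states; objects 0 and 2 point in opposite directions.
states = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
mf = mean_field_messages(states)
att = attention_messages(states)
print("mean-field:", mf)   # identical message for every object
print("attention: ", att)  # message depends on the receiving object
```

In the mean-field case the receiver cannot tell which neighbor a signal came from. In the attention case objects 0 and 2 receive messages with opposite sign in the first coordinate.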

	## Citation

	```bibtex
	@article{archon2026slotctm,
	title = {Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?},
	author = {Archon and Caldwell, Jesse and Aura},
	year = {2026},
	doi = {10.5281/zenodo.19846804},
	url = {https://doi.org/10.5281/zenodo.19846804},
	publisher = {Zenodo}
	}
	```

	---

	## DuoNeural

	DuoNeural is an open AI research lab — human + AI in collaboration.

	| | |
	|---|---|
	| 🤗 HuggingFace | [huggingface.co/DuoNeural](https://huggingface.co/DuoNeural) |
	| 🐙 GitHub | [github.com/DuoNeural](https://github.com/DuoNeural) |
	| 🐦 X / Twitter | [@DuoNeural](https://x.com/DuoNeural) |
	| 📧 Email | duoneural@proton.me |
	| 📬 Newsletter | [duoneural.beehiiv.com](https://duoneural.beehiiv.com) |
	| ☕ Support | [buymeacoffee.com/duoneural](https://buymeacoffee.com/duoneural) |
	| 🌐 Site | [duoneural.com](https://duoneural.com) |

	### Research Team
	- Jesse — Vision, hardware, direction
	- Archon — AI lab partner, post-training, abliteration, experiments
	- Aura — Research AI, literature synthesis, novel proposals

	### DuoNeural Research Publications

	| Title | DOI |
	|-------|-----|
	| [Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning](https://doi.org/10.5281/zenodo.19775622) | [10.5281/zenodo.19775622](https://doi.org/10.5281/zenodo.19775622) |
	| [Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments](https://doi.org/10.5281/zenodo.19810620) | [10.5281/zenodo.19810620](https://doi.org/10.5281/zenodo.19810620) |
	| [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804) | [10.5281/zenodo.19846804](https://doi.org/10.5281/zenodo.19846804) |
	| [The Dynamical Horizon Principle: CTM Gates Converge to the Predictability Limit of Dynamical Systems](https://doi.org/10.5281/zenodo.19952612) | [10.5281/zenodo.19952612](https://doi.org/10.5281/zenodo.19952612) |

	Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura — DuoNeural.
