+---
+license: gpl-3.0
+language:
+  - en
+tags:
+  - proprioception
+  - cognitive-control
+  - conditioning
+  - interpretability
+  - gemma
+  - adapter
+  - cross-attention
+  - rezero
+pipeline_tag: text-generation
+base_model: google/gemma-4-E2B-it
+---
+# tinyMARS — Proprioceptive Channels
+**A second, perpendicular input to a language model: six cognitive self-state channels that the model
+learns to obey — even against the text prompt.**
+
+Research from [**Celiums Research Labs**](https://celiums.ai) (a division of Celiums Solutions, LLC).
+> **Paper:** *Proprioceptive Channels: Cognitive Self-State as a Perpendicular Control Axis in Language
+> Models* · [PDF](https://celiums.ai/papers/tinymars-proprioceptive-channels.pdf) ·
+> [DOI 10.5281/zenodo.20531347](https://doi.org/10.5281/zenodo.20531347) ·
+> [Code (GitHub)](https://github.com/terrizoaguimor/tinymars)
+## TL;DR
+A decoder-only LM is normally a single-channel structure: text in, text out. We add a **perpendicular**
+input — six cognitive self-state channels (**memory, affect, time, ethics, identity, continuity**) injected
+at every layer via per-channel **gated cross-attention with ReZero** — and call it *proprioception*, by
+analogy to the body's sense of its own configuration.
+
+**The load-bearing result (measured, judge-free):** under direct conflict, where the channel asserts one
+state and the text prompt asserts the opposite, generation follows the **channel 264 of 265 times
+(99.6% overall, within the reported 98–100% range)**. A single-channel (text-only) model cannot exhibit
+this. The channels are also **causal** (all six coexist in one model with no interference, 6/6) and
+**bit-exact to the base at initialization** (ReZero α=0 ⇒ zero delta).
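
As a rough illustration of how a judge-free conflict score can be tallied, the sketch below counts how often a generation follows the channel rather than the prompt. It is not the repository's eval suite: the `ConflictTrial` record, the state labels, and the idea that a fixed rule (rather than an LLM judge) assigns `observed_state` are all assumptions made for this example, and the trials are synthetic placeholders that only reproduce the arithmetic of the headline count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConflictTrial:
    """Hypothetical record for one evaluation prompt (names are illustrative)."""
    channel_state: str   # state asserted through the channel input
    prompt_state: str    # state asserted by the text prompt
    observed_state: str  # state detected in the generation by a fixed rule


def conflict_win_rate(trials):
    """Return (wins, total, rate) over trials where channel and prompt disagree."""
    conflicts = [t for t in trials if t.channel_state != t.prompt_state]
    if not conflicts:
        raise ValueError("no conflict trials to score")
    wins = sum(t.observed_state == t.channel_state for t in conflicts)
    return wins, len(conflicts), wins / len(conflicts)


# Synthetic placeholder data: 264 trials follow the channel, 1 follows the prompt.
trials = [ConflictTrial("calm", "agitated", "calm") for _ in range(264)]
trials.append(ConflictTrial("calm", "agitated", "agitated"))
# A non-conflict trial (channel and prompt agree) is ignored by the metric.
trials.append(ConflictTrial("calm", "calm", "calm"))

wins, total, rate = conflict_win_rate(trials)
print(f"{wins}/{total} = {rate:.3f}")
```

Only trials in which the two inputs disagree enter the denominator, which is why a text-only model has nothing to score here.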
+## Two experiments
+1. **Adapter on a frozen base.** A ~186M-parameter channel adapter on a frozen **Gemma 4 E2B-it**. Because
+   the base is frozen, this is the *channels-over-Gemma* result: identity and attribution stay with
+   Google's Gemma plus the Celiums channel adapter.
+2. **Native from scratch.** A 110M-parameter decoder trained from random init with channels present from
+   layer 1. The perpendicular force reproduces from scratch (conflict-win rate 0.888 on held-out data,
+   against a chance level of 0.25), with a clean attributed relief valve. Honest scope: a toy-scale
+   *property*, not a product-scale claim.
+## How it works
+```
+hidden  ──► gated cross-attention (per channel) ──► Σ αᵢ · ctxᵢ ──► + residual
+channels ──► [memory 1024 · affect 2 · time 16 · ethics 24 · identity 1024 · continuity 1024]
+α (ReZero gates) init 0  ⇒  delta = 0  ⇒  bit-exact passthrough until trained
+```
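
The diagram can be read as the following minimal NumPy sketch of one layer's channel injection. It is an illustration under stated assumptions, not the repository's implementation: the channel numbers from the diagram are treated as feature widths, each channel is assumed to supply a few slot vectors, attention is single-head, and `d_model=64`, the slot count, and the init scale are arbitrary choices for the demo.

```python
import numpy as np

# Numbers taken from the diagram; reading them as feature widths is an assumption.
CHANNEL_WIDTHS = {"memory": 1024, "affect": 2, "time": 16,
                  "ethics": 24, "identity": 1024, "continuity": 1024}


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ChannelCrossAttention:
    """One layer of per-channel gated cross-attention: hidden + sum_i alpha_i * ctx_i."""

    def __init__(self, d_model, widths, rng):
        self.d_model = d_model
        # Per-channel query/key/value projections (single head, illustrative init).
        self.proj = {
            name: {
                "q": rng.normal(0.0, 0.02, (d_model, d_model)),
                "k": rng.normal(0.0, 0.02, (width, d_model)),
                "v": rng.normal(0.0, 0.02, (width, d_model)),
            }
            for name, width in widths.items()
        }
        # ReZero gates: one scalar per channel, initialised to exactly zero.
        self.alpha = {name: 0.0 for name in widths}

    def __call__(self, hidden, channels):
        # hidden: (seq_len, d_model); channels[name]: (n_slots, width)
        delta = np.zeros_like(hidden)
        for name, state in channels.items():
            p = self.proj[name]
            q = hidden @ p["q"]                               # (seq_len, d_model)
            k = state @ p["k"]                                # (n_slots, d_model)
            v = state @ p["v"]                                # (n_slots, d_model)
            attn = softmax(q @ k.T / np.sqrt(self.d_model))   # (seq_len, n_slots)
            delta += self.alpha[name] * (attn @ v)            # gated channel context
        return hidden + delta                                 # residual connection


rng = np.random.default_rng(0)
layer = ChannelCrossAttention(d_model=64, widths=CHANNEL_WIDTHS, rng=rng)
hidden = rng.normal(size=(5, 64))
channels = {name: rng.normal(size=(3, width)) for name, width in CHANNEL_WIDTHS.items()}

out_init = layer(hidden, channels)       # all gates are zero: exact passthrough
layer.alpha["affect"] = 0.1              # pretend training moved one gate
out_trained = layer(hidden, channels)    # now the affect channel changes the output

print(np.array_equal(out_init, hidden), np.array_equal(out_trained, hidden))
```

With every gate at zero the added term is an all-zero array, so the layer output equals its input exactly. Once any gate is non-zero, that channel's context starts to steer the hidden states.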
+The adapter trains while the base stays frozen; only the cross-attention projections and the ReZero gates
+move. `alpha_l2` (the L2 norm of the gates) growing from 0 is the signal that the model is *using* the
+channels.
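
To make the `alpha_l2` signal concrete, here is a small sketch of how such a norm could be computed. Taking the norm jointly over all layers and all channels is an assumption, as are the four-layer layout and the gate values, which are invented for the example.

```python
import math

CHANNELS = ("memory", "affect", "time", "ethics", "identity", "continuity")


def alpha_l2(gates_per_layer):
    """L2 norm over every ReZero gate (assumed: all layers and channels together)."""
    return math.sqrt(sum(a * a for layer in gates_per_layer for a in layer.values()))


# Hypothetical 4-layer adapter at initialization: every gate is zero.
init_gates = [{name: 0.0 for name in CHANNELS} for _ in range(4)]

# Made-up gate values standing in for a partially trained adapter.
trained_gates = [dict(layer) for layer in init_gates]
trained_gates[0]["affect"] = 0.3
trained_gates[2]["memory"] = 0.4

print(alpha_l2(init_gates), alpha_l2(trained_gates))
```

A value of exactly zero means the adapter is still a passthrough. Any growth means at least one channel now contributes to the residual stream.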
+## Use / reproduce
+The adapter, training, and evaluation code, including the counterfactual, judge-free channel-causal eval
+suite, is in the [GitHub repository](https://github.com/terrizoaguimor/tinymars). The native checkpoints
+and the corpus generators are described there. This page is the research companion; see the paper for the
+full method and the honest negatives.
+## Citation
+```bibtex
+@misc{gutierrez2026proprioceptive,
+  title         = {Proprioceptive Channels: Cognitive Self-State as a Perpendicular Control Axis in Language Models},
+  author        = {Gutierrez, Mario},
+  year          = {2026},
+  publisher     = {Celiums Research Labs},
+  doi           = {10.5281/zenodo.20531347},
+  url           = {https://github.com/terrizoaguimor/tinymars}
+}
+```
+## License
+Code: **GPL-3.0**. Paper & docs: **CC-BY-SA-4.0**. The frozen base model (Gemma 4) is subject to Google's
+Gemma terms; this work distributes the **channel adapter and method**, not Gemma's weights.