---
license: apache-2.0
datasets:
- karpathy/fineweb-edu-100B-gpt2-token-shards
language:
- en
- ja
- es
metrics:
- accuracy
base_model:
- Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
new_version: Drjkedwards/Recursive-Transformer-Model
library_name: transformers
---
# Model Card for Recursive Transformer Model (RTM) / ERS PyTorch Implementation
<!-- Provide a quick summary of what the model is/does. -->
This is the official PyTorch implementation of the **Recursive Transformer Model (RTM)**, a novel architecture that augments standard Transformer-based systems with **recursive memory reconsideration**, **temporal decay mechanisms**, and **Persistent Memory Logic Loops (PMLL)**. It addresses "nostalgic incorrectness" (the tendency of stateless AI to retain outdated or contradictory beliefs) by maintaining coherent, self-correcting state across inference sessions. The production-grade reference implementation is the **Enhanced Reconsideration System (ERS)** library, which includes PyTorch components for embeddings, lattice-based tensor routing, multi-petal attention, and knowledge-graph integration.
The Kaggle-hosted PyTorch model provides the core RTM/ERS runtime (including `PMLLLattice`, `MemoryBlock`, temporal decay, consensus, and contradiction detection) for integration with any LLM/transformer stack. It is **not** a standalone pretrained language model but a stateful memory layer/framework.
---
## Model Details
### Model Description
The Recursive Transformer Model (RTM) extends the classic Transformer architecture with:
- **Adaptive temporal decay** on memory confidence.
- **Multi-dimensional consensus** via embedding-space geometry and knowledge graphs.
- **Vector-based contradiction detection** with integrated rewrite capabilities.
- **Persistent Memory Logic Loops (PMLL)**: a lattice-based DAG for compressed, low-rank tensor routing and recursive passes over memory slots.
Key innovations solve the stateless limitation of standard transformers by enabling iterative, multi-pass reconsideration of beliefs during inference. The Enhanced Reconsideration System (ERS) is the complete, production-ready Python/PyTorch reference implementation.
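The vector-based contradiction detection described above can be illustrated with a minimal sketch. This is not the ERS implementation; the function names are hypothetical, and the similarity threshold follows the τ_sim notation used in the configuration options. The idea: two memories whose embeddings are very close (same topic) but whose text differs are candidates for the reconsideration/rewrite stage.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def candidate_contradictions(memories, tau_sim=0.8):
    """Return index pairs of memories whose embeddings exceed the
    similarity threshold but whose text differs: candidates for the
    contradiction/rewrite stage. Illustrative only, not the ERS API."""
    pairs = []
    for i in range(len(memories)):
        for j in range(i + 1, len(memories)):
            (text_i, emb_i), (text_j, emb_j) = memories[i], memories[j]
            if text_i != text_j and cosine(emb_i, emb_j) >= tau_sim:
                pairs.append((i, j))
    return pairs
```

A real system would use sentence-transformer embeddings and add polarity/claim analysis on top of the similarity gate; this sketch only models the gate.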
- **Developed by:** Dr. Josef “Q.” Edwards (Josef Kurk Edwards / josefedwards / drQedwards), University of Colorado Boulder
- **Funded by [optional]:** U.S. Department of Defense (funder identifier 100000005)
- **Shared by [optional]:** Josef Edwards (via Kaggle and GitHub)
- **Model type:** Recursive Transformer extension / stateful memory framework (PMLL + ERS)
- **Language(s) (NLP):** Language-agnostic (works with any text/embedding-based input; primarily demonstrated on English factual/knowledge-base tasks)
- **License:** MIT (see ERS repository)
- **Finetuned from model [optional]:** Not finetuned; augments any base Transformer (integrates with sentence-transformers, LangChain, etc.)
### Model Sources [optional]
- **Repository:** [Kaggle Model](https://www.kaggle.com/models/josefedwards/recursive-transformer-model/pyTorch) • [GitHub ERS (primary implementation)](https://github.com/drqedwards/ERS) • [GitHub PMLL_archive](https://github.com/drqedwards/PMLL_archive)
- **Paper:** Edwards, J. K. (2025). *The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops*. TechRxiv. DOI: 10.36227/techrxiv.176118936.69886233/v1 (October 23, 2025)
- **Demo [optional]:** See ERS README quick-start example (async memory reconsideration loop)
## Uses
### Direct Use
Use as a drop-in memory layer for any Transformer/LLM pipeline:
- Add factual or conversational memories.
- Run recursive reconsideration loops (temporal decay → consensus → contradiction detection → optional rewrite).
- Persist state across sessions via JSON + safetensors.
Ideal for agents, chatbots, or knowledge-intensive applications that require long-term coherence.
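The reconsideration order above (temporal decay → consensus → contradiction detection → optional rewrite) can be sketched as a single pass over plain dicts. Everything here is illustrative: the field names, the topic-key grouping used as a crude consensus stand-in, and the flagging heuristic are assumptions, not the ERS API.

```python
def reconsider_pass(memories, decay_alpha=0.95):
    """One reconsideration pass, in the order described above.
    Illustrative sketch; a real system scores consensus in embedding
    space rather than by a literal topic key."""
    flagged = []
    for m in memories:
        m["confidence"] *= decay_alpha           # 1. temporal decay ages every memory
    groups = {}
    for m in memories:                           # 2. crude consensus: group by topic
        groups.setdefault(m["topic"], []).append(m)
    for group in groups.values():
        if len(group) > 1:                       # 3. same topic, competing claims
            group.sort(key=lambda m: m["confidence"], reverse=True)
            for loser in group[1:]:
                if loser["confidence"] < group[0]["confidence"]:
                    flagged.append(loser)        # 4. candidate for rewrite/retirement
    return flagged
```

Running this on the Paris example from the quick start flags the lower-confidence "largest city" memory while leaving the "capital" memory intact.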
### Downstream Use [optional]
- Integrate with LangChain agents or any LLM stack via Graphiti/Mem0 knowledge graphs.
- Extend base models (e.g., Llama, Mistral) with stateful recursive passes.
- Use in production AI systems needing self-correction and belief updating.
### Out-of-Scope Use
- Not intended as a standalone generative LLM.
- Not suitable for real-time low-latency inference without hardware acceleration (multiple recursive passes add compute).
- Avoid use in safety-critical systems without additional ethical/guardrail layers (rewrites can be LLM-guided).
## Bias, Risks, and Limitations
- **Technical limitations:** Recursive loops increase inference-time compute; performance depends on embedding quality and KG backend (Neo4j recommended for Graphiti).
- **Sociotechnical risks:** Automated memory rewrites could propagate or amplify biases present in the underlying LLM or knowledge graph. Contradiction detection relies on embedding geometry and may miss subtle nuances.
- **Nostalgic incorrectness mitigation:** The core goal is to *reduce* outdated beliefs, but incorrect source data or poor consensus thresholds can still lead to erroneous updates.
### Recommendations
Users should:
- Monitor rewrite logs and confidence deltas.
- Use high-quality, verified knowledge graphs.
- Apply domain-specific safety policies before committing rewrites.
- Test with synthetic contradictory memory scenarios to validate behavior.
## How to Get Started with the Model
```python
# Via Kaggle (PyTorch model) or direct from ERS GitHub
# Install dependencies (from ERS README):
# pip install torch sentence-transformers safetensors mem0-ai graphiti-core langchain langchain-community
import asyncio

from ERS import EnhancedReconsiderationSystem, MemoryBlock, ERSPromise  # or load from Kaggle PyTorch weights

async def main():
    ers = EnhancedReconsiderationSystem()  # loads saved state if present
    await ers.add_memory("Paris is the capital of France")
    await ers.add_memory("Paris is the largest city in France")  # contradictory example
    await ers.reconsider_deferred()
    await ers.recursive_loop_check()  # performs RTM-style multi-pass reconsideration
    await ers.close()

asyncio.run(main())
```
Full usage and configuration in the [ERS GitHub README](https://github.com/drqedwards/ERS). The Kaggle PyTorch model loads the core `PMLLLattice` and related tensors.
## Training Details
### Training Data
None (this is an architectural extension/framework, not a pretrained LLM). It operates on top of any Transformer embeddings (e.g., via `sentence-transformers`). Memory content is user-provided or agent-generated.
### Training Procedure
#### Preprocessing [optional]
Memory blocks are created with embeddings (via sentence-transformers), timestamps, confidence scores, and SHA-256 hashes. Optional KG indexing via Graphiti/Mem0.
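The memory-block structure described above can be sketched as a dataclass. This is an illustrative stand-in, not the actual ERS `MemoryBlock` class; only the fields named in the text (embedding, timestamp, confidence, SHA-256 hash) are modeled.

```python
import hashlib
import time
from dataclasses import dataclass, field

@dataclass
class MemoryBlockSketch:
    """Illustrative memory block with the fields described above."""
    text: str
    embedding: list                      # e.g. produced by sentence-transformers
    confidence: float = 1.0              # decays over time during reconsideration
    created_at: float = field(default_factory=time.time)

    @property
    def content_hash(self) -> str:
        # SHA-256 over the raw text gives a stable identity for dedup/auditing.
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()
```

Optional KG indexing (Graphiti/Mem0) would sit on top of such blocks, keyed by the content hash.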
#### Training Hyperparameters
- **Training regime:** Not applicable (no end-to-end training). Runtime inference uses PyTorch (fp32/bf16 supported via torch).
- Configuration options (RTM integration): `passes: 2`, `early_stop_cosine_delta: 0.002`, `max_rewrites_per_slot: 1`, `decay_alpha: 0.95`, adaptive λ decay rates, similarity threshold τ_sim, etc. (fully configurable in ERS).
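Gathered into one object, the documented options might look like the following. Field names mirror the keys listed above; the defaults for `passes`, `early_stop_cosine_delta`, `max_rewrites_per_slot`, and `decay_alpha` are the documented values, while the `tau_sim` default is an assumed placeholder (the document does not state one).

```python
from dataclasses import dataclass

@dataclass
class RTMConfigSketch:
    """Illustrative configuration object; not the ERS config class."""
    passes: int = 2                         # recursive reconsideration passes
    early_stop_cosine_delta: float = 0.002  # stop when embeddings shift less than this
    max_rewrites_per_slot: int = 1          # cap rewrites per memory slot per loop
    decay_alpha: float = 0.95               # per-pass confidence decay factor
    tau_sim: float = 0.8                    # similarity threshold (assumed default)
```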
#### Speeds, Sizes, Times [optional]
Real-time performance demonstrated in ERS (production-grade). Exact throughput depends on hardware, number of recursive passes, and KG backend. Lattice uses low-rank compression for scalability.
## Evaluation
### Testing Data, Factors & Metrics
No public benchmark datasets or quantitative results published in the preprint. Evaluation is qualitative/conceptual via synthetic contradictory memory scenarios (e.g., Paris facts example) and convergence metrics (confidence delta, rewrite count, cosine similarity shifts).
#### Factors
- Memory age, source quality, domain volatility, embedding similarity.
#### Metrics
- Nostalgic Incorrectness (NI) metric defined in paper.
- Consensus score, contradiction score, confidence update delta.
### Results
[More Information Needed] — Paper focuses on theoretical framework and architectural feasibility rather than large-scale empirical benchmarks. ERS demonstrates real-time recursive reconsideration.
#### Summary
The model successfully maintains coherent state and resolves contradictions in controlled memory scenarios.
## Model Examination [optional]
Interpretability is built-in: per-pass logs of embedding shifts, confidence changes, rewrite proposals, and KG updates. Visualize memory graph evolution (planned roadmap feature).
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed] (tested on standard CPU/GPU with PyTorch)
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
- Base: Transformer stack with augmented embedding layer and reconsideration head.
- Key equations: temporal decay \( \text{conf}_i(t) = \text{conf}_i(0) \cdot e^{-\lambda_i (t - t_i)} \cdot \dots \), consensus scoring, integrated confidence update, PMLL lattice (DAG with quantization and low-rank compression).
- Objective: Stateful, self-correcting memory across sessions.
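The first factor of the temporal decay equation can be worked directly. This computes only the exponential term \( \text{conf}_i(0) \cdot e^{-\lambda_i (t - t_i)} \); the paper's full expression multiplies in further factors that are elided here.

```python
import math

def decayed_confidence(conf0, lam, t, t0):
    """Exponential temporal decay of a memory's confidence:
    conf(t) = conf(0) * exp(-lam * (t - t0)).
    Sketch of the first factor of the paper's decay equation only."""
    return conf0 * math.exp(-lam * (t - t0))
```

With λ = 0.1 and an age of 10 time units, a memory created at full confidence decays to e⁻¹ ≈ 0.368; a λ of 0 leaves confidence untouched, matching the intuition that stable domains decay slowly.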
### Compute Infrastructure
#### Hardware
Standard PyTorch-compatible (CPU/GPU).
#### Software
Python 3.8+, PyTorch, sentence-transformers, safetensors, mem0-ai, graphiti-core, LangChain.
## Citation [optional]
**BibTeX:**
```bibtex
@article{edwards2025recursive,
author = {Edwards, Josef Kurk},
title = {The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops},
journal = {TechRxiv},
year = {2025},
month = {October},
doi = {10.36227/techrxiv.176118936.69886233/v1},
url = {https://www.techrxiv.org/users/856117/articles/1345789}
}
```
**APA:**
Edwards, J. K. (2025). The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops. TechRxiv. https://doi.org/10.36227/techrxiv.176118936.69886233/v1
## Glossary [optional]
- **PMLL**: Persistent Memory Logic Loop — lattice-based memory compression and routing.
- **ERS**: Enhanced Reconsideration System — production Python/PyTorch library.
- **Nostalgic Incorrectness**: Retention of outdated/conflicting beliefs in stateless models.
## More Information [optional]
- Full paper and math: TechRxiv preprint.
- Live implementation: [ERS GitHub](https://github.com/drqedwards/ERS).
- Related work: Hybrid TRM-RTM model, PMLL P=NP proof paper (separate preprint).
## Model Card Authors [optional]
Compiled by Dr. Q (Josef Edwards) from public sources.
## Model Card Contact
Josef Edwards (Kaggle: josefedwards, GitHub: drqedwards, Email: joed6834@colorado.edu) or open an issue on the ERS repository. |