Space: ROBOT-GANSTA/gansta
Elliotasdasdasfasas committed on Mar 2

Commit ed89628 (1 parent: 9bb3382)

Deploy CTM Codebase bypass FUSE 503

This view is limited to 50 files because the commit contains too many changes.

Files changed (50)

.dockerignore +5 -0
Dockerfile +34 -0
INSTRUCCIONES_DESPLIEGUE.md +34 -0
LICENSE +201 -0
README.md +22 -7
app.py +945 -0
app_v1_backup.py +464 -0
data/custom_datasets.py +324 -0
examples/01_mnist.ipynb +0 -0
examples/02_inference.ipynb +0 -0
examples/03_mazes.ipynb +0 -0
examples/04_parity.ipynb +0 -0
examples/05_huggingface.ipynb +0 -0
models/README.md +7 -0
models/constants.py +10 -0
models/ctm.py +633 -0
models/ctm_qamnist.py +208 -0
models/ctm_rl.py +192 -0
models/ctm_sort.py +126 -0
models/ff.py +75 -0
models/lstm.py +244 -0
models/lstm_qamnist.py +184 -0
models/lstm_rl.py +96 -0
models/modules.py +692 -0
models/resnet.py +374 -0
models/utils.py +122 -0
mount_azure.sh +44 -0
requirements.txt +21 -0
requirements_v1.txt +2 -0
setup_hf_space.sh +37 -0
tasks/image_classification/README.md +31 -0
tasks/image_classification/analysis/README.md +7 -0
tasks/image_classification/analysis/run_imagenet_analysis.py +972 -0
tasks/image_classification/imagenet_classes.py +1007 -0
tasks/image_classification/plotting.py +494 -0
tasks/image_classification/scripts/train_cifar10.sh +286 -0
tasks/image_classification/scripts/train_imagenet.sh +38 -0
tasks/image_classification/train.py +690 -0
tasks/image_classification/train_distributed.py +799 -0
tasks/mazes/README.md +16 -0
tasks/mazes/analysis/README.md +10 -0
tasks/mazes/analysis/run.py +407 -0
tasks/mazes/plotting.py +214 -0
tasks/mazes/scripts/train_ctm.sh +35 -0
tasks/mazes/train.py +704 -0
tasks/mazes/train_distributed.py +782 -0
tasks/parity/README.md +16 -0
tasks/parity/analysis/make_blog_gifs.py +263 -0
tasks/parity/analysis/run.py +269 -0
tasks/parity/plotting.py +897 -0

.dockerignore ADDED

	@@ -0,0 +1,5 @@

+checkpoints/
+data/
+.git/
+__pycache__/
+*.ipynb

Dockerfile ADDED

	@@ -0,0 +1,34 @@

+FROM python:3.10-slim
+WORKDIR /code
+# 1. Install System Dependencies (SSHFS + Curl)
+RUN apt-get update && apt-get install -y \
+    sshfs \
+    curl \
+    fuse \
+    && rm -rf /var/lib/apt/lists/*
+# 2. Install Cloudflared
+RUN curl -L --output cloudflared.deb https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb && \
+    dpkg -i cloudflared.deb && \
+    rm cloudflared.deb
+COPY ./requirements.txt /code/requirements.txt
+RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
+# Create mount point
+RUN mkdir -p /data/persistent && chmod 777 /data/persistent
+RUN useradd -m -u 1000 user
+USER user
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH
+WORKDIR $HOME/app
+COPY --chown=user . $HOME/app
+RUN chmod +x mount_azure.sh
+# Port 7860 is required on Hugging Face Spaces
+CMD ["python", "app.py"]
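The Dockerfile comment says the Space must serve on port 7860, but the `CMD` only runs `python app.py`, so the binding has to happen inside the app. The following is a minimal sketch of one way to resolve the bind address, assuming `app.py` launches a Gradio app. `resolve_bind` and the `HOST`/`PORT` variable names are hypothetical and not part of this commit.

```python
import os

DEFAULT_HOST = "0.0.0.0"   # listen on all interfaces inside the container
DEFAULT_PORT = 7860        # port named in the Dockerfile comment


def resolve_bind(env=None):
    """Return (host, port) for the server, allowing environment overrides.

    HOST and PORT are hypothetical variable names chosen for this sketch.
    """
    env = os.environ if env is None else env
    host = env.get("HOST", DEFAULT_HOST)
    port = int(env.get("PORT", DEFAULT_PORT))
    return host, port


if __name__ == "__main__":
    host, port = resolve_bind()
    print(f"would bind to {host}:{port}")
    # With Gradio, these values would typically be passed as:
    #   demo.launch(server_name=host, server_port=port)
```

Binding to `0.0.0.0` rather than the loopback address matters inside a container, since the platform's proxy reaches the app from outside the container's network namespace.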

INSTRUCCIONES_DESPLIEGUE.md ADDED

	@@ -0,0 +1,34 @@

+# 🚀 Deployment Instructions: Continuous Thought Machines (CTM)
+Because of permission restrictions in the current terminal, the final deployment requires you to run the "bridge" I built from your WSL environment (Ubuntu/Debian).
+## 1. Prerequisites (already configured)
+*   **Code**: Cloned into `c:\Users\elliot\Downloads\simulacion\ctm_sakana`
+*   **Docker**: `Dockerfile` and `.dockerignore` files created.
+*   **Dependencies**: `requirements.txt` patched (opencv-headless).
+*   **Connection script**: `setup_hf_space.sh` created with your token.
+## 2. Execution Steps (in your WSL)
+Open your WSL terminal and run the following block of commands:
+```bash
+# 1. Go to the project folder (WSL mounts C: at /mnt/c)
+cd /mnt/c/Users/elliot/Downloads/simulacion/ctm_sakana
+# 2. Make the script executable
+chmod +x setup_hf_space.sh
+# 3. Run the automated script
+./setup_hf_space.sh
+```
+## 3. During Execution
+The script will ask you for the **Space name**.
+*   Based on the screenshot, the user appears to be `Alex Herbert Vilca Puente` or `ROBOT-GANSTA`.
+*   Enter the Space name in `USER/SPACE_NAME` format if the script asks for it, or just the name if the script already has the user hardcoded. (The script has `Elliotasdasdasfasas`; make sure it matches your actual Space, or edit the script.)
+## 4. Verification
+Once the upload finishes, go to your Space on Hugging Face. Its status will change to **"Building"**, which means Docker is installing the dependencies we defined.
+> **Note**: The build process will take a few minutes the first time.

LICENSE ADDED

	@@ -0,0 +1,201 @@

+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+1. Definitions.
+"License" shall mean the terms and conditions for use, reproduction,
+and distribution as defined by Sections 1 through 9 of this document.
+"Licensor" shall mean the copyright owner or entity authorized by
+the copyright owner that is granting the License.
+"Legal Entity" shall mean the union of the acting entity and all
+other entities that control, are controlled by, or are under common
+control with that entity. For the purposes of this definition,
+"control" means (i) the power, direct or indirect, to cause the
+direction or management of such entity, whether by contract or
+otherwise, or (ii) ownership of fifty percent (50%) or more of the
+outstanding shares, or (iii) beneficial ownership of such entity.
+"You" (or "Your") shall mean an individual or Legal Entity
+exercising permissions granted by this License.
+"Source" form shall mean the preferred form for making modifications,
+including but not limited to software source code, documentation
+source, and configuration files.
+"Object" form shall mean any form resulting from mechanical
+transformation or translation of a Source form, including but
+not limited to compiled object code, generated documentation,
+and conversions to other media types.
+"Work" shall mean the work of authorship, whether in Source or
+Object form, made available under the License, as indicated by a
+copyright notice that is included in or attached to the work
+(an example is provided in the Appendix below).
+"Derivative Works" shall mean any work, whether in Source or Object
+form, that is based on (or derived from) the Work and for which the
+editorial revisions, annotations, elaborations, or other modifications
+represent, as a whole, an original work of authorship. For the purposes
+of this License, Derivative Works shall not include works that remain
+separable from, or merely link (or bind by name) to the interfaces of,
+the Work and Derivative Works thereof.
+"Contribution" shall mean any work of authorship, including
+the original version of the Work and any modifications or additions
+to that Work or Derivative Works thereof, that is intentionally
+submitted to Licensor for inclusion in the Work by the copyright owner
+or by an individual or Legal Entity authorized to submit on behalf of
+the copyright owner. For the purposes of this definition, "submitted"
+means any form of electronic, verbal, or written communication sent
+to the Licensor or its representatives, including but not limited to
+communication on electronic mailing lists, source code control systems,
+and issue tracking systems that are managed by, or on behalf of, the
+Licensor for the purpose of discussing and improving the Work, but
+excluding communication that is conspicuously marked or otherwise
+designated in writing by the copyright owner as "Not a Contribution."
+"Contributor" shall mean Licensor and any individual or Legal Entity
+on behalf of whom a Contribution has been received by Licensor and
+subsequently incorporated within the Work.
+2. Grant of Copyright License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+copyright license to reproduce, prepare Derivative Works of,
+publicly display, publicly perform, sublicense, and distribute the
+Work and such Derivative Works in Source or Object form.
+3. Grant of Patent License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+(except as stated in this section) patent license to make, have made,
+use, offer to sell, sell, import, and otherwise transfer the Work,
+where such license applies only to those patent claims licensable
+by such Contributor that are necessarily infringed by their
+Contribution(s) alone or by combination of their Contribution(s)
+with the Work to which such Contribution(s) was submitted. If You
+institute patent litigation against any entity (including a
+cross-claim or counterclaim in a lawsuit) alleging that the Work
+or a Contribution incorporated within the Work constitutes direct
+or contributory patent infringement, then any patent licenses
+granted to You under this License for that Work shall terminate
+as of the date such litigation is filed.
+4. Redistribution. You may reproduce and distribute copies of the
+Work or Derivative Works thereof in any medium, with or without
+modifications, and in Source or Object form, provided that You
+meet the following conditions:
+(a) You must give any other recipients of the Work or
+Derivative Works a copy of this License; and
+(b) You must cause any modified files to carry prominent notices
+stating that You changed the files; and
+(c) You must retain, in the Source form of any Derivative Works
+that You distribute, all copyright, patent, trademark, and
+attribution notices from the Source form of the Work,
+excluding those notices that do not pertain to any part of
+the Derivative Works; and
+(d) If the Work includes a "NOTICE" text file as part of its
+distribution, then any Derivative Works that You distribute must
+include a readable copy of the attribution notices contained
+within such NOTICE file, excluding those notices that do not
+pertain to any part of the Derivative Works, in at least one
+of the following places: within a NOTICE text file distributed
+as part of the Derivative Works; within the Source form or
+documentation, if provided along with the Derivative Works; or,
+within a display generated by the Derivative Works, if and
+wherever such third-party notices normally appear. The contents
+of the NOTICE file are for informational purposes only and
+do not modify the License. You may add Your own attribution
+notices within Derivative Works that You distribute, alongside
+or as an addendum to the NOTICE text from the Work, provided
+that such additional attribution notices cannot be construed
+as modifying the License.
+You may add Your own copyright statement to Your modifications and
+may provide additional or different license terms and conditions
+for use, reproduction, or distribution of Your modifications, or
+for any such Derivative Works as a whole, provided Your use,
+reproduction, and distribution of the Work otherwise complies with
+the conditions stated in this License.
+5. Submission of Contributions. Unless You explicitly state otherwise,
+any Contribution intentionally submitted for inclusion in the Work
+by You to the Licensor shall be under the terms and conditions of
+this License, without any additional terms or conditions.
+Notwithstanding the above, nothing herein shall supersede or modify
+the terms of any separate license agreement you may have executed
+with Licensor regarding such Contributions.
+6. Trademarks. This License does not grant permission to use the trade
+names, trademarks, service marks, or product names of the Licensor,
+except as required for reasonable and customary use in describing the
+origin of the Work and reproducing the content of the NOTICE file.
+7. Disclaimer of Warranty. Unless required by applicable law or
+agreed to in writing, Licensor provides the Work (and each
+Contributor provides its Contributions) on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+implied, including, without limitation, any warranties or conditions
+of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+PARTICULAR PURPOSE. You are solely responsible for determining the
+appropriateness of using or redistributing the Work and assume any
+risks associated with Your exercise of permissions under this License.
+8. Limitation of Liability. In no event and under no legal theory,
+whether in tort (including negligence), contract, or otherwise,
+unless required by applicable law (such as deliberate and grossly
+negligent acts) or agreed to in writing, shall any Contributor be
+liable to You for damages, including any direct, indirect, special,
+incidental, or consequential damages of any character arising as a
+result of this License or out of the use or inability to use the
+Work (including but not limited to damages for loss of goodwill,
+work stoppage, computer failure or malfunction, or any and all
+other commercial damages or losses), even if such Contributor
+has been advised of the possibility of such damages.
+9. Accepting Warranty or Additional Liability. While redistributing
+the Work or Derivative Works thereof, You may choose to offer,
+and charge a fee for, acceptance of support, warranty, indemnity,
+or other liability obligations and/or rights consistent with this
+License. However, in accepting such obligations, You may act only
+on Your own behalf and on Your sole responsibility, not on behalf
+of any other Contributor, and only if You agree to indemnify,
+defend, and hold each Contributor harmless for any liability
+incurred by, or claims asserted against, such Contributor by reason
+of your accepting any such warranty or additional liability.
+END OF TERMS AND CONDITIONS
+APPENDIX: How to apply the Apache License to your work.
+To apply the Apache License to your work, attach the following
+boilerplate notice, with the fields enclosed by brackets "[]"
+replaced with your own identifying information. (Don't include
+the brackets!)  The text should be enclosed in the appropriate
+comment syntax for the file format. We also recommend that a
+file or class name and description of purpose be included on the
+same "printed page" as the copyright notice for easier
+identification within third-party archives.
+Copyright 2020 Rémi Louf
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.

README.md CHANGED

@@ -1,11 +1,26 @@
 ---
-title: Gansta
-emoji: 🦀
-colorFrom: green
-colorTo: indigo
-sdk: docker
 pinned: false
-license: gemma
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: CTM Nervous System
+emoji: 🧬
+colorFrom: purple
+colorTo: blue
+sdk: gradio
+sdk_version: 5.9.1
+app_file: app.py
 pinned: false
 ---
+# 🧬 CTM Nervous System
+**Continuous Thought Machine for Hypergraph Maintenance**
+Based on [arXiv:2505.05522](https://arxiv.org/abs/2505.05522) - Sakana AI
+## Endpoints
+| Endpoint | Function |
+|----------|----------|
+| `/sense_snn` | Process 72D SNN input |
+| `/reason_hypergraph` | Reason about context, propose edges |
+| `/validate_physics` | Validate against 5 physics losses |
+| `/dream` | Offline consolidation (T=500+) |
+| `/calibrate_stdp` | Suggest STDP weight adjustments |

app.py ADDED

	@@ -0,0 +1,945 @@

+"""
+CTM Nervous System Server v2.0 - Full PyTorch Implementation
+=============================================================
+Continuous Thought Machine for ART-17 Hypergraph Coherence Generation
+PURPOSE (from skills):
+1. REGULATION: Calibrate the STDP weights of the 16 dendrites
+2. COHERENCE: Generate deterministic hypergraphs
+3. REASONING: Active inference engine (internal ticks)
+4. SYNCHRONIZATION: Representation via Neural Synchronization
+TRAINING STRATEGY:
+- Progressive online learning with use
+- Integrates with Brain server (Qwen + VL-JEPA) for semantic grounding
+- Automatic checkpoint saving
+Based on: arXiv:2505.05522 (Continuous Thought Machines - Sakana AI)
+Adapted for: ART-17 Dendrite Regulation & Hypergraph Generation
+"""
+import gradio as gr
+import numpy as np
+import json
+import os
+from typing import List, Dict, Any, Optional
+from datetime import datetime
+from utils.bunker_client import BunkerClient
+# ============================================================================
+# PYTORCH IMPORTS WITH FALLBACK
+# ============================================================================
+try:
+    import torch
+    import torch.nn as nn
+    import torch.nn.functional as F
+    TORCH_AVAILABLE = True
+    DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+    print(f"🔧 PyTorch available. Device: {DEVICE}")
+except ImportError:
+    TORCH_AVAILABLE = False
+    DEVICE = "cpu"
+    print("⚠️ PyTorch not available. Using simplified NumPy fallback.")
+# ============================================================================
+# FULL CTM IMPORT (with fallback to simplified)
+# ============================================================================
+if TORCH_AVAILABLE:
+    try:
+        from models.ctm import ContinuousThoughtMachine
+        from models.modules import SynapseUNET, SuperLinear
+        from utils.losses import image_classification_loss
+        CTM_FULL = True
+        print("✅ Full CTM model loaded from models/ctm.py")
+    except ImportError as e:
+        CTM_FULL = False
+        print(f"⚠️ Could not import full CTM: {e}. Using simplified.")
+else:
+    CTM_FULL = False
+# ============================================================================
+# CONFIGURATION FOR ART-17 INTEGRATION (v3.0)
+# ============================================================================
+CONFIG = {
+    # CTM Architecture (matching ART-17)
+    "iterations": 50,           # T internal ticks (max)
+    "d_model": 256,             # Latent dimension
+    "d_input": 72,              # Input from SNN (72D)
+    "memory_length": 16,        # History length (16 dendrites)
+    "n_synch_out": 32,          # Output sync neurons
+    "n_synch_action": 16,       # Action sync neurons
+    "out_dims": 16,             # Output: 16 dendrite adjustments
+    # v3.0 Improvements
+    "adaptive_halting": True,   # Enable early stopping
+    "certainty_threshold": 0.85, # Halt if certainty > threshold
+    "sync_decay_alpha": 0.9,    # S_new = α*S_old + (1-α)*S_current
+    "use_backbone": True,       # Use Backbone72D transformation
+    # Training
+    "learning_rate": 1e-4,
+    "weight_decay": 1e-5,
+    "checkpoint_dir": "checkpoints",
+    "auto_save_every": 100,     # Save every N forward passes
+    # Integration
+    "brain_server_url": "https://elliotasdasdasfasas-brain.hf.space",
+    # Physics validation
+    "physics_thresholds": {
+        "P_max": 1000.0,
+        "v_max": 100.0,
+        "T_dew": 15.0,
+        "T_amb": 25.0
+    }
+}
+# ============================================================================
+# BACKBONE 72D (v3.0 - Transform input before CTM)
+# ============================================================================
+class Backbone72D(nn.Module if TORCH_AVAILABLE else object):
+    """
+    Transform 72D SNN input to d_model dimensions.
+    Paper insight: Raw input needs proper embedding for CTM to work well.
+    """
+    def __init__(self, d_input=72, d_model=256):
+        if not TORCH_AVAILABLE:
+            return
+        super().__init__()
+        self.net = nn.Sequential(
+            nn.Linear(d_input, 128),
+            nn.LayerNorm(128),
+            nn.GELU(),
+            nn.Linear(128, d_model),
+            nn.LayerNorm(d_model)
+        )
+    def forward(self, x):
+        # x: [B, 72]
+        return self.net(x)  # [B, 256]
+# ============================================================================
+# FULL CTM WRAPPER FOR ART-17
+# ============================================================================
+class CTM_ART17:
+    """
+    Full Continuous Thought Machine adapted for ART-17.
+    Key mechanisms from paper:
+    1. NLMs (Neuron-Level Models) - Each neuron processes its own history
+    2. Neural Synchronization - Representation is S = Z·Z^T
+    3. Adaptive Compute - Can halt early when confident
+    Purpose in ART-17:
+    - Regulate 16 dendrite STDP weights
+    - Generate coherent hypergraph edges
+    - Serve as "nervous system" for the whole system
+    """
+    def __init__(self, config: dict):
+        self.config = config
+        self.forward_count = 0
+        self.training_samples = []
+        self.bunker = BunkerClient(buffer_dir=config.get("buffer_dir", "_ctm_buffer"))
+        if CTM_FULL and TORCH_AVAILABLE:
+            self._init_full_ctm()
+        else:
+            self._init_simplified_ctm()
+    def _init_full_ctm(self):
+        """Initialize full PyTorch CTM model."""
+        self.model = ContinuousThoughtMachine(
+            iterations=self.config["iterations"],
+            d_model=self.config["d_model"],
+            d_input=self.config["d_input"],
+            heads=4,
+            n_synch_out=self.config["n_synch_out"],
+            n_synch_action=self.config["n_synch_action"],
+            synapse_depth=2,
+            memory_length=self.config["memory_length"],
+            deep_nlms=True,
+            memory_hidden_dims=32,
+            do_layernorm_nlm=False,
+            backbone_type='none',
+            positional_embedding_type='none',
+            out_dims=self.config["out_dims"],
+            prediction_reshaper=[self.config["out_dims"]],
+            dropout=0.1,
+            neuron_select_type='random-pairing'
+        ).to(DEVICE)
+        # Dummy forward to initialize lazy modules
+        with torch.no_grad():
+            dummy = torch.randn(1, self.config["d_input"], device=DEVICE)
+            dummy = dummy.unsqueeze(-1).unsqueeze(-1)  # [1, 72, 1, 1]
+            try:
+                _ = self.model(dummy)
+            except Exception as e:
+                print(f"⚠️ Lazy init failed: {e}")
+        self.model.eval()
+        self.optimizer = torch.optim.AdamW(
+            self.model.parameters(),
+            lr=self.config["learning_rate"],
+            weight_decay=self.config["weight_decay"]
+        )
+        self.is_full = True
+        param_count = sum(p.numel() for p in self.model.parameters())
+        print(f"✅ Full CTM initialized: {param_count:,} parameters")
+        # Try to load existing checkpoint
+        self._load_checkpoint()
+    def _init_simplified_ctm(self):
+        """Initialize simplified NumPy CTM (fallback)."""
+        self.d_model = self.config["d_model"]
+        self.memory_length = self.config["memory_length"]
+        self.n_ticks = self.config["iterations"]
+        # State traces
+        self.state_trace = np.zeros((self.d_model, self.memory_length))
+        self.activated_state = np.random.randn(self.d_model) * 0.1
+        # NLM weights (simplified: 16 groups for 16 dendrites)
+        self.nlm_weights = np.random.randn(16, self.memory_length) * 0.1
+        self.is_full = False
+        print("✅ Simplified CTM initialized (NumPy fallback)")
+    def forward(self, input_72d: np.ndarray, n_ticks: Optional[int] = None) -> Dict:
+        """
+        Process input through CTM.
+        Args:
+            input_72d: 72D input from SNN
+            n_ticks: Override number of internal ticks
+        Returns:
+            Dict with predictions, certainty, sync matrix
+        """
+        n_ticks = n_ticks or self.config["iterations"]
+        self.forward_count += 1
+        if self.is_full:
+            return self._forward_full(input_72d, n_ticks)
+        else:
+            return self._forward_simplified(input_72d, n_ticks)
+    def _forward_full(self, input_72d: np.ndarray, n_ticks: int) -> Dict:
+        """Forward pass with full PyTorch CTM."""
+        # Prepare tensor
+        x = torch.tensor(input_72d, dtype=torch.float32, device=DEVICE)
+        if len(x.shape) == 1:
+            x = x.unsqueeze(0)  # Add batch dim
+        x = x.unsqueeze(-1).unsqueeze(-1)  # [B, 72, 1, 1]
+        with torch.no_grad():
+            predictions, certainties, sync_out = self.model(x)
+        # Extract results
+        final_pred = predictions[:, :, -1].cpu().numpy()[0]  # Last tick [16]
+        final_cert = certainties[:, 1, -1].cpu().numpy()[0]  # 1-entropy
+        # Find tick with highest certainty
+        best_tick_idx = certainties[:, 1, :].argmax(dim=-1)[0].item()
+        best_pred = predictions[:, :, best_tick_idx].cpu().numpy()[0]
+        # Sync matrix for hypergraph edge proposals
+        sync_matrix = sync_out.cpu().numpy()[0] if sync_out is not None else None
+        return {
+            "predictions": final_pred.tolist(),
+            "best_predictions": best_pred.tolist(),
+            "certainty": float(final_cert),
+            "best_tick": int(best_tick_idx),
+            "ticks_used": n_ticks,
+            "sync_matrix": sync_matrix.tolist() if sync_matrix is not None else None,
+            "model": "ContinuousThoughtMachine (Full PyTorch)"
+        }
+    def _forward_simplified(self, input_72d: np.ndarray, n_ticks: int) -> Dict:
+        """
+        Forward pass with simplified NumPy CTM (v3.0).
+        v3.0 Features:
+        1. Backbone transformation (72D -> 256D)
+        2. Sync Decay (S = α*S_prev + (1-α)*S_current)
+        3. Adaptive Halting (stop if certainty > threshold)
+        """
+        # v3.0: Backbone stand-in (the NumPy path has no learned Backbone72D)
+        if self.config.get("use_backbone", True):
+            input_256 = np.zeros(self.d_model)
+            # Element-wise random scaling + tanh on the first 72 inputs.
+            # The scaling weights are re-sampled on every call, so nothing is
+            # learned and repeated calls on the same input give different results.
+            projected = np.tanh(input_72d[:72] * np.random.randn(72) * 0.1) if len(input_72d) >= 72 else input_72d
+            input_256[:min(len(projected), self.d_model)] = projected[:min(len(projected), self.d_model)]
+        else:
+            input_256 = np.zeros(self.d_model)
+            input_256[:min(len(input_72d), self.d_model)] = input_72d[:self.d_model]
+        # v3.0: Sync Decay initialization
+        alpha = self.config.get("sync_decay_alpha", 0.9)
+        sync_matrix_prev = np.zeros((self.d_model, self.d_model))
+        # v3.0: Adaptive halting config
+        adaptive_halting = self.config.get("adaptive_halting", True)
+        certainty_threshold = self.config.get("certainty_threshold", 0.85)
+        certainties = []
+        all_predictions = []
+        ticks_actually_used = 0
+        for t in range(n_ticks):
+            ticks_actually_used = t + 1
+            # Synapse update (simplified): mix the previous state with the embedded input.
+            # Both vectors have length d_model, so the input actually reaches the state.
+            combined = self.activated_state + input_256
+            pre_activation = np.tanh(combined * 0.1 + np.random.randn(self.d_model) * 0.01)
+            # Update trace (memory)
+            self.state_trace = np.roll(self.state_trace, -1, axis=1)
+            self.state_trace[:, -1] = pre_activation
+            # NLM processing (simplified: 16 groups for 16 dendrites)
+            post_activation = np.zeros(self.d_model)
+            group_size = self.d_model // 16
+            for g in range(16):
+                start = g * group_size
+                end = start + group_size
+                group_trace = self.state_trace[start:end, :]
+                group_output = np.mean(group_trace @ self.nlm_weights[g])
+                post_activation[start:end] = np.tanh(group_output)
+            self.activated_state = post_activation
+            # v3.0: Sync Decay - S = α*S_prev + (1-α)*Z·Z^T
+            z_norm = self.activated_state / (np.linalg.norm(self.activated_state) + 1e-8)
+            sync_current = np.outer(z_norm, z_norm)
+            sync_matrix = alpha * sync_matrix_prev + (1 - alpha) * sync_current
+            sync_matrix_prev = sync_matrix
+            # Store predictions at this tick
+            all_predictions.append(self.activated_state[:16].copy())
+            # Compute certainty
+            probs = np.abs(self.activated_state) / (np.sum(np.abs(self.activated_state)) + 1e-8)
+            probs = np.clip(probs, 1e-10, 1.0)
+            entropy = -np.sum(probs * np.log(probs))
+            max_entropy = np.log(len(probs))
+            certainty = float(1.0 - entropy / (max_entropy + 1e-8))
+            certainties.append(certainty)
+            # v3.0: Adaptive Halting - stop early if confident enough
+            if adaptive_halting and certainty > certainty_threshold:
+                break
+        # Best tick selection
+        best_tick_idx = int(np.argmax(certainties))
+        best_predictions = all_predictions[best_tick_idx].tolist()
+        return {
+            "predictions": self.activated_state[:16].tolist(),
+            "best_predictions": best_predictions,
+            "certainty": certainties[-1],
+            "best_tick": best_tick_idx,
+            "ticks_used": ticks_actually_used,  # v3.0: Actual ticks, may be < n_ticks
+            "max_ticks": n_ticks,
+            "halted_early": ticks_actually_used < n_ticks,  # v3.0: Flag
+            "sync_matrix": sync_matrix[:16, :16].tolist(),
+            "model": "SimplifiedCTM v3.0 (NumPy + AdaptiveHalt + SyncDecay)"
+        }
+    def train_step(self, input_72d: np.ndarray, target_16d: np.ndarray,
+                   physics_loss: float = 0.0) -> Dict:
+        """
+        Online training step.
+        Args:
+            input_72d: Input from SNN
+            target_16d: Target dendrite adjustments (ground truth)
+            physics_loss: Current physics loss for weighting
+        Returns:
+            Dict with loss and gradient info
+        """
+        if not self.is_full or not TORCH_AVAILABLE:
+            return {"status": "skip", "reason": "Training requires full PyTorch CTM"}
+        self.model.train()
+        # Prepare tensors
+        x = torch.tensor(input_72d, dtype=torch.float32, device=DEVICE)
+        x = x.unsqueeze(0).unsqueeze(-1).unsqueeze(-1)  # [1, 72, 1, 1]
+        y = torch.tensor(target_16d, dtype=torch.float32, device=DEVICE).unsqueeze(0)
+        # Forward
+        predictions, certainties, _ = self.model(x)
+        # Loss: dendrite_regulation_loss
+        # predictions: [B, 16, T], y: [B, 16]
+        y_exp = y.unsqueeze(-1).expand(-1, -1, predictions.size(-1))  # [B, 16, T]
+        mse_per_tick = F.mse_loss(predictions, y_exp, reduction='none').mean(dim=1)  # [B, T]
+        # Select best tick (min loss) and most certain tick
+        loss_min_idx = mse_per_tick.argmin(dim=1)  # [B]
+        loss_cert_idx = certainties[:, 1, :].argmax(dim=1)  # [B]
+        batch_idx = torch.arange(predictions.size(0), device=DEVICE)
+        loss_min = mse_per_tick[batch_idx, loss_min_idx].mean()
+        loss_cert = mse_per_tick[batch_idx, loss_cert_idx].mean()
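+        # Combined loss with physics penalty.
+        # physics_loss is a plain float, so the penalty shifts the reported
+        # loss value but contributes no gradient to the update.
+        mse_loss = (loss_min + loss_cert) / 2
+        physics_penalty = physics_loss * 0.1
+        total_loss = mse_loss + physics_penalty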
+        # Backward
+        self.optimizer.zero_grad()
+        total_loss.backward()
+        torch.nn.utils.clip_grad_norm_(self.model.parameters(), 1.0)
+        self.optimizer.step()
+        self.model.eval()
+        # Auto-save checkpoint
+        if self.forward_count % self.config["auto_save_every"] == 0:
+            self._save_checkpoint()
+        return {
+            "status": "trained",
+            "loss": float(total_loss.item()),
+            "mse_loss": float(mse_loss.item()),
+            "physics_penalty": float(physics_penalty),
+            "best_tick": int(loss_cert_idx[0].item())
+        }
+    def _save_checkpoint(self):
+        """Save model checkpoint."""
+        if not self.is_full:
+            return
+        os.makedirs(self.config["checkpoint_dir"], exist_ok=True)
+        path = os.path.join(self.config["checkpoint_dir"], "ctm_art17_latest.pt")
+        torch.save({
+            "model_state_dict": self.model.state_dict(),
+            "optimizer_state_dict": self.optimizer.state_dict(),
+            "forward_count": self.forward_count,
+            "timestamp": datetime.now().isoformat()
+        }, path)
+        print(f"💾 Checkpoint saved: {path}")
+        # Upload to Bunker (Async/Fail-Safe)
+        self.bunker.save_file(path, remote_folder="ctm_backups")
+    def _load_checkpoint(self):
+        """Load model checkpoint if exists."""
+        path = os.path.join(self.config["checkpoint_dir"], "ctm_art17_latest.pt")
+        if os.path.exists(path):
+            try:
+                checkpoint = torch.load(path, map_location=DEVICE)
+                self.model.load_state_dict(checkpoint["model_state_dict"])
+                self.optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
+                self.forward_count = checkpoint.get("forward_count", 0)
+                print(f"✅ Checkpoint loaded: {path}")
+            except Exception as e:
+                print(f"⚠️ Could not load checkpoint: {e}")
+# ============================================================================
+# GLOBAL CTM INSTANCE
+# ============================================================================
+ctm = CTM_ART17(CONFIG)
+# ============================================================================
+# PHYSICS VALIDATION (from SNN Omega-21)
+# ============================================================================
+def validate_physics(trajectory: List[float], params: Dict) -> Dict:
+    """Validate against 5 physics losses from SNN Omega-21."""
+    trajectory = np.array(trajectory)
+    # L_energy: Energy budget (penalize total energy above P_max)
+    energy = np.sum(trajectory ** 2)
+    P_max = params.get("P_max", CONFIG["physics_thresholds"]["P_max"])
+    L_energy = float(max(0, energy - P_max) ** 2)
+    # L_thermo: Thermodynamics (dew point check)
+    T_dew = params.get("T_dew", CONFIG["physics_thresholds"]["T_dew"])
+    T_amb = params.get("T_amb", CONFIG["physics_thresholds"]["T_amb"])
+    L_thermo = float(max(0, T_dew - T_amb) ** 2)
+    # L_causal: Causality (velocity limit)
+    velocity = np.diff(trajectory) if len(trajectory) > 1 else np.array([0])
+    v_max = params.get("v_max", CONFIG["physics_thresholds"]["v_max"])
+    L_causal = float(np.sum(np.maximum(0, np.abs(velocity) - v_max) ** 2))
+    # L_conserv: Flux conservation
+    flux_in = params.get("flux_in", 1.0)
+    flux_out = params.get("flux_out", 1.0)
+    L_conserv = float((flux_in - flux_out) ** 2)
+    # L_entropy: 2nd Law (penalize any decrease in entropy)
+    entropy_change = params.get("entropy_change", 0.1)
+    L_entropy = float(max(0, -entropy_change) ** 2)
+    # Total physics loss
+    L_total = L_energy + L_thermo + L_causal + L_conserv + L_entropy
+    return {
+        "valid": L_total < 0.01,
+        "L_energy": L_energy,
+        "L_thermo": L_thermo,
+        "L_causal": L_causal,
+        "L_conserv": L_conserv,
+        "L_entropy": L_entropy,
+        "L_total": L_total
+    }
+# ============================================================================
+# ENDPOINT FUNCTIONS
+# ============================================================================
+def sense_snn(snn_json: str) -> str:
+    """
+    /sense_snn - Process 72D SNN input through CTM
+    Input: JSON with dendrite values or 72D vector
+    Output: Coherent features, certainty, sync matrix
+    """
+    try:
+        data = json.loads(snn_json)
+        # Extract 72D vector
+        if "vector_72d" in data:
+            input_vec = np.array(data["vector_72d"])
+        elif "dendrites" in data:
+            input_vec = np.array(list(data["dendrites"].values()))
+        else:
+            input_vec = np.random.randn(72)
+        # Pad to 72D if needed
+        if len(input_vec) < 72:
+            input_vec = np.pad(input_vec, (0, 72 - len(input_vec)))
+        # Process through CTM
+        n_ticks = data.get("ticks", 25)
+        result = ctm.forward(input_vec[:72], n_ticks)
+        # Detect anomalies (low certainty)
+        anomalies = []
+        if result["certainty"] < 0.5:
+            anomalies.append("Low overall certainty - consider retraining")
+        return json.dumps({
+            "status": "success",
+            "coherent_features": result["predictions"],
+            "certainty": result["certainty"],
+            "best_tick": result["best_tick"],
+            "anomalies": anomalies,
+            "ticks_used": result["ticks_used"],
+            "model": result["model"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def reason_hypergraph(context_json: str) -> str:
+    """
+    /reason_hypergraph - Reason about hypergraph context, propose edges
+    Uses CTM synchronization matrix to find strongly correlated node pairs.
+    These become proposed hyperedges.
+    """
+    try:
+        data = json.loads(context_json)
+        node_features = np.array(data.get("node_features", [[0]*16]*8))
+        existing_edges = data.get("existing_edges", [])
+        n_ticks = data.get("ticks", 50)
+        # Flatten node features for CTM input and pad to 72D
+        flattened = node_features.flatten()
+        input_vec = np.zeros(72)
+        input_vec[:min(len(flattened), 72)] = flattened[:min(len(flattened), 72)]
+        # Process through CTM with more ticks for reasoning
+        result = ctm.forward(input_vec, n_ticks)
+        # Extract proposed edges from sync matrix (S_ij > 0.7)
+        proposed_edges = []
+        if result["sync_matrix"] is not None:
+            sync = np.array(result["sync_matrix"])
+            # Ensure sync is 2D
+            if len(sync.shape) == 1:
+                # 1D array - skip edge extraction
+                pass
+            elif len(sync.shape) >= 2:
+                n_nodes = min(len(node_features), sync.shape[0])
+                for i in range(n_nodes):
+                    for j in range(i+1, n_nodes):
+                        if j < sync.shape[1]:  # Check bounds
+                            sync_ij = sync[i, j]
+                            if sync_ij > 0.7:  # Threshold for edge proposal
+                                edge_exists = any(
+                                    (e[0] == i and e[1] == j) or (e[0] == j and e[1] == i)
+                                    for e in existing_edges
+                                )
+                                if not edge_exists:
+                                    proposed_edges.append([i, j, float(sync_ij)])
+        return json.dumps({
+            "status": "success",
+            "proposed_edges": proposed_edges,
+            "certainty": result["certainty"],
+            "best_tick": result["best_tick"],
+            "ticks_used": result["ticks_used"],
+            "model": result["model"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def validate_physics_endpoint(physics_json: str) -> str:
+    """
+    /validate_physics - Validate trajectory against 5 physics losses
+    """
+    try:
+        data = json.loads(physics_json)
+        trajectory = data.get("trajectory", [0.0])
+        params = data.get("physics_params", {})
+        result = validate_physics(trajectory, params)
+        result["status"] = "success"
+        return json.dumps(result, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def dream_endpoint(dream_json: str) -> str:
+    """
+    /dream - Offline consolidation with many ticks
+    Discovers patterns, proposes new edges, identifies edges to prune.
+    """
+    try:
+        data = json.loads(dream_json)
+        snapshot = data.get("hypergraph_snapshot", {})
+        n_ticks = min(data.get("ticks", 100), 100)  # Cap at 100 for CPU
+        # Extract features from snapshot
+        nodes = snapshot.get("nodes", [])
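+        if nodes:
+            flattened = np.array([n.get("features", [0]*16) for n in nodes]).flatten()[:72]
+            # Zero-pad to 72D so short snapshots match the input size used by /sense_snn
+            input_vec = np.pad(flattened, (0, 72 - len(flattened)))
+        else:
+            input_vec = np.random.randn(72)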
+        # Dream: run CTM with many ticks
+        result = ctm.forward(input_vec, n_ticks)
+        # Analyze sync for patterns
+        new_edges = []
+        pruned_edges = []
+        if result["sync_matrix"] is not None:
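+            sync = np.array(result["sync_matrix"])
+            # Index sync[i, j] only when the matrix is 2D, and stay within both dimensions
+            if sync.ndim == 2:
+                n = min(len(nodes), sync.shape[0]) if nodes else 16
+                n = min(n, sync.shape[0], sync.shape[1])
+                for i in range(n):
+                    for j in range(i+1, n):
+                        if sync[i, j] > 0.85:
+                            new_edges.append([i, j, float(sync[i, j])])
+                        elif sync[i, j] < 0.1:
+                            pruned_edges.append([i, j])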
+        return json.dumps({
+            "status": "success",
+            "discovered_patterns": len(new_edges),
+            "new_edges": new_edges[:10],
+            "pruned_edges": pruned_edges[:10],
+            "consolidation_certainty": result["certainty"],
+            "ticks_used": result["ticks_used"],
+            "model": result["model"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def calibrate_stdp_endpoint(stdp_json: str) -> str:
+    """
+    /calibrate_stdp - Suggest STDP weight adjustments
+    This is the CORE regulatory function:
+    - Receives current 16 dendrite weights
+    - Processes through CTM to get sync patterns
+    - Returns suggested weight adjustments
+    """
+    try:
+        data = json.loads(stdp_json)
+        current_weights = np.array(data.get("current_weights", [1.0]*16))
+        node_features = np.array(data.get("node_features", [[0]*16]*4))
+        # Flatten features for CTM input and zero-pad to 72D
+        flattened = node_features.flatten()[:72]
+        input_vec = np.pad(flattened, (0, 72 - len(flattened)))
+        # Process through CTM
+        result = ctm.forward(input_vec, n_ticks=25)
+        # Use predictions as weight adjustments
+        predictions = np.array(result["best_predictions"])
+        # Scale based on certainty
+        confidence = result["certainty"]
+        weight_changes = (predictions - 0.5) * confidence * 0.1
+        new_weights = current_weights + weight_changes
+        return json.dumps({
+            "status": "success",
+            "suggested_weights": new_weights.tolist(),
+            "weight_changes": weight_changes.tolist(),
+            "confidence": confidence,
+            "best_tick": result["best_tick"],
+            "model": result["model"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def regulate_endpoint(regulate_json: str) -> str:
+    """
+    /regulate - Full feedback loop for ART-17 regulation (NEW)
+    Combines all signals to provide comprehensive regulation:
+    - Dendrite state
+    - Latent representation
+    - Physics loss
+    - Anomaly score
+    Returns action recommendation with confidence.
+    """
+    try:
+        data = json.loads(regulate_json)
+        # Inputs from local system
+        dendrites = np.array(data.get("dendrites", [0.0]*16))
+        latent_256 = np.array(data.get("latent_256", [0.0]*256))
+        physics_loss = data.get("physics_loss", 0.0)
+        anomaly_score = data.get("anomaly_score", 0.0)
+        # Combine into 72D input
+        input_72 = np.concatenate([
+            dendrites,           # 16D
+            latent_256[:56]      # 56D from latent
+        ])
+        # Process through CTM
+        result = ctm.forward(input_72, n_ticks=50)
+        # Compute regulation signals
+        predictions = np.array(result["best_predictions"])
+        certainty = result["certainty"]
+        # Urgency based on physics and anomaly
+        urgency = min(1.0, physics_loss + anomaly_score)
+        regulation_strength = urgency * certainty
+        # Weight adjustments
+        dendrite_deltas = predictions * regulation_strength * 0.05
+        # Determine if intervention needed
+        needs_intervention = urgency > 0.5 or certainty < 0.3
+        return json.dumps({
+            "status": "success",
+            "dendrite_deltas": dendrite_deltas.tolist(),
+            "regulation_strength": float(regulation_strength),
+            "confidence": certainty,
+            "urgency": float(urgency),
+            "needs_intervention": needs_intervention,
+            "recommended_action": "ADJUST" if needs_intervention else "MAINTAIN",
+            "best_tick": result["best_tick"],
+            "model": result["model"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def train_online_endpoint(train_json: str) -> str:
+    """
+    /train_online - Progressive online training (NEW)
+    Allows the local system to train the CTM with experience.
+    Sends input-output pairs and receives training feedback.
+    """
+    try:
+        data = json.loads(train_json)
+        input_72d = np.array(data.get("input_72d", [0.0]*72))
+        target_16d = np.array(data.get("target_16d", [0.0]*16))
+        physics_loss = data.get("physics_loss", 0.0)
+        # Perform training step
+        result = ctm.train_step(input_72d, target_16d, physics_loss)
+        return json.dumps({
+            "status": result["status"],
+            "loss": result.get("loss"),
+            "mse_loss": result.get("mse_loss"),
+            "physics_penalty": result.get("physics_penalty"),
+            "best_tick": result.get("best_tick"),
+            "forward_count": ctm.forward_count,
+            "message": "Training step completed" if result["status"] == "trained" else result.get("reason")
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def health_check() -> str:
+    """Health check with model info."""
+    return json.dumps({
+        "status": "healthy",
+        "model": f"CTM Nervous System v2.0 ({'Full PyTorch' if ctm.is_full else 'NumPy Fallback'})",
+        "device": DEVICE,
+        "d_model": CONFIG["d_model"],
+        "iterations": CONFIG["iterations"],
+        "memory_length": CONFIG["memory_length"],
+        "forward_count": ctm.forward_count,
+        "endpoints": [
+            "/sense_snn",
+            "/reason_hypergraph",
+            "/validate_physics",
+            "/dream",
+            "/calibrate_stdp",
+            "/regulate",        # NEW
+            "/train_online"     # NEW
+        ]
+    }, indent=2)
+# ============================================================================
+# GRADIO INTERFACE
+# ============================================================================
+with gr.Blocks(title="CTM Nervous System v2.0", theme=gr.themes.Soft()) as demo:
+    gr.Markdown("""
+    # 🧬 CTM Nervous System v2.0
+    **Continuous Thought Machine for ART-17 Hypergraph Coherence**
+    Based on [arXiv:2505.05522](https://arxiv.org/abs/2505.05522) - Sakana AI
+    ---
+    ## Key Innovations
+    - **NLMs (Neuron-Level Models)**: Each neuron processes its own history
+    - **Neural Synchronization**: Representation via S = Z·Z^T
+    - **Adaptive Compute**: Halts when confident
+    - **Online Training**: Progressive learning with use
+    ---
+    """)
+    with gr.Tabs():
+        with gr.Tab("🔌 /sense_snn"):
+            gr.Markdown("Process 72D SNN input through CTM")
+            snn_input = gr.Textbox(
+                label="SNN JSON Input",
+                value='{"dendrites": {"d1": 0.1, "d2": 0.2, "d3": 0.3}, "ticks": 25}',
+                lines=5
+            )
+            snn_output = gr.Textbox(label="Output", lines=10)
+            snn_btn = gr.Button("Process", variant="primary")
+            snn_btn.click(sense_snn, inputs=snn_input, outputs=snn_output, api_name="sense_snn")
+        with gr.Tab("🧠 /reason_hypergraph"):
+            gr.Markdown("Reason about hypergraph context, propose edges")
+            reason_input = gr.Textbox(
+                label="Context JSON",
+                value='{"node_features": [[0.1, 0.2], [0.3, 0.4]], "existing_edges": [], "ticks": 50}',
+                lines=5
+            )
+            reason_output = gr.Textbox(label="Output", lines=10)
+            reason_btn = gr.Button("Reason", variant="primary")
+            reason_btn.click(reason_hypergraph, inputs=reason_input, outputs=reason_output, api_name="reason_hypergraph")
+        with gr.Tab("⚡ /validate_physics"):
+            gr.Markdown("Validate trajectory against 5 physics losses")
+            physics_input = gr.Textbox(
+                label="Physics JSON",
+                value='{"trajectory": [0.1, 0.2, 0.3], "physics_params": {"P_max": 1000}}',
+                lines=5
+            )
+            physics_output = gr.Textbox(label="Output", lines=10)
+            physics_btn = gr.Button("Validate", variant="primary")
+            physics_btn.click(validate_physics_endpoint, inputs=physics_input, outputs=physics_output, api_name="validate_physics")
+        with gr.Tab("💤 /dream"):
+            gr.Markdown("Offline consolidation - discover patterns")
+            dream_input = gr.Textbox(
+                label="Dream JSON",
+                value='{"hypergraph_snapshot": {"nodes": []}, "ticks": 100}',
+                lines=5
+            )
+            dream_output = gr.Textbox(label="Output", lines=10)
+            dream_btn = gr.Button("Dream", variant="primary")
+            dream_btn.click(dream_endpoint, inputs=dream_input, outputs=dream_output, api_name="dream")
+        with gr.Tab("🔧 /calibrate_stdp"):
+            gr.Markdown("Calibrate STDP weights (Core regulatory function)")
+            stdp_input = gr.Textbox(
+                label="STDP JSON",
+                value='{"current_weights": [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], "node_features": [[0.1, 0.2]]}',
+                lines=5
+            )
+            stdp_output = gr.Textbox(label="Output", lines=10)
+            stdp_btn = gr.Button("Calibrate", variant="primary")
+            stdp_btn.click(calibrate_stdp_endpoint, inputs=stdp_input, outputs=stdp_output, api_name="calibrate_stdp")
+        with gr.Tab("🎯 /regulate [NEW]"):
+            gr.Markdown("Full feedback loop for ART-17 regulation")
+            regulate_input = gr.Textbox(
+                label="Regulate JSON",
+                value=json.dumps({"dendrites": [0.5]*16, "latent_256": [0.1]*256, "physics_loss": 0.01, "anomaly_score": 0.05}),
+                lines=5
+            )
+            regulate_output = gr.Textbox(label="Output", lines=10)
+            regulate_btn = gr.Button("Regulate", variant="primary")
+            regulate_btn.click(regulate_endpoint, inputs=regulate_input, outputs=regulate_output, api_name="regulate")
+        with gr.Tab("📚 /train_online [NEW]"):
+            gr.Markdown("Progressive online training with experience")
+            train_input = gr.Textbox(
+                label="Training JSON",
+                value=json.dumps({"input_72d": [0.1]*72, "target_16d": [0.5]*16, "physics_loss": 0.01}),
+                lines=5
+            )
+            train_output = gr.Textbox(label="Output", lines=10)
+            train_btn = gr.Button("Train Step", variant="primary")
+            train_btn.click(train_online_endpoint, inputs=train_input, outputs=train_output, api_name="train_online")
+        with gr.Tab("❤️ Health"):
+            health_output = gr.Textbox(label="Health Status", lines=15)
+            health_btn = gr.Button("Check Health", variant="secondary")
+            health_btn.click(health_check, inputs=None, outputs=health_output, api_name="health_check")
+    gr.Markdown("""
+    ---
+    **Architecture**: CTM as Nervous System → Hypergraph as Coherent Thought
+    **Integration**: Local ART-17 ↔ CTM (regulation) ↔ Brain Server (semantics)
+    **Training**: Progressive online learning + Physics-Informed Loss
+    """)
+if __name__ == "__main__":
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        show_error=True
+    )
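
The loss in `train_step` averages two per-tick errors: the error at the tick with the lowest MSE and the error at the tick the model is most certain about. The sketch below reproduces that selection in plain Python for a single sample so the arithmetic can be checked by hand. It is an illustration only: `per_tick_mse` and `combined_tick_loss` are hypothetical names, the `[T][K]` list layout is an assumption made for readability (the model tensors are `[B, 16, T]`), and no gradients are involved.

```python
from typing import List, Sequence, Tuple


def per_tick_mse(predictions: Sequence[Sequence[float]],
                 target: Sequence[float]) -> List[float]:
    """Mean squared error between the target and the prediction at each tick.

    predictions[t][k] is output k at internal tick t (assumed layout).
    """
    return [
        sum((p - y) ** 2 for p, y in zip(tick, target)) / len(target)
        for tick in predictions
    ]


def combined_tick_loss(predictions: Sequence[Sequence[float]],
                       target: Sequence[float],
                       certainty: Sequence[float]) -> Tuple[float, int, int]:
    """Average the loss at the min-loss tick and at the most certain tick."""
    mse = per_tick_mse(predictions, target)
    min_idx = min(range(len(mse)), key=mse.__getitem__)               # lowest error
    cert_idx = max(range(len(certainty)), key=certainty.__getitem__)  # highest certainty
    loss = (mse[min_idx] + mse[cert_idx]) / 2
    return loss, min_idx, cert_idx


if __name__ == "__main__":
    # Three ticks, two outputs; per-tick MSE is 1.0, 0.25 and 0.0.
    preds = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
    target = [1.0, 1.0]
    cert = [0.2, 0.9, 0.6]
    print(combined_tick_loss(preds, target, cert))
```

With these toy values, tick 2 has the lowest error and tick 1 the highest certainty, so the combined loss is the mean of those two per-tick errors.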

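The `/regulate` and `/train_online` endpoints parse their input with `json.loads`, so a client has to send fully expanded arrays; a Python expression such as `[0.1]*72` is not valid JSON. A minimal sketch of building such a payload on the client side follows. The helpers `pad_or_truncate` and `build_train_payload` are hypothetical and not part of `app.py`; only the field names `input_72d`, `target_16d`, and `physics_loss` come from the endpoint code.

```python
import json
from typing import Iterable, List


def pad_or_truncate(values: Iterable[float], size: int) -> List[float]:
    """Return exactly `size` floats: truncate long inputs, zero-pad short ones."""
    out = [float(v) for v in values][:size]
    return out + [0.0] * (size - len(out))


def build_train_payload(input_vec: Iterable[float],
                        target: Iterable[float],
                        physics_loss: float = 0.0) -> str:
    """Serialize one training sample as the JSON string /train_online expects."""
    return json.dumps({
        "input_72d": pad_or_truncate(input_vec, 72),
        "target_16d": pad_or_truncate(target, 16),
        "physics_loss": float(physics_loss),
    })


if __name__ == "__main__":
    payload = build_train_payload([0.1] * 3, [0.5] * 20, physics_loss=0.01)
    decoded = json.loads(payload)  # round-trips, so the server can parse it
    print(len(decoded["input_72d"]), len(decoded["target_16d"]))
```

Padding on the client keeps the vector sizes fixed regardless of how many values the local system actually produced.
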
app_v1_backup.py ADDED

	@@ -0,0 +1,464 @@

+"""
+CTM Nervous System Server - Continuous Thought Machine for Hypergraph Maintenance
+===================================================================================
+Implementation of the definitive proposal: CTM as Nervous System for ART-17 Hypergraph.
+Endpoints:
+- /sense_snn: Process 72D SNN input with NLM-style processing
+- /reason_hypergraph: Reason about hypergraph context, propose edges
+- /validate_physics: Validate proposals against 5 physics losses
+- /dream: Offline consolidation (requests T=500 ticks by default; capped at 100 on CPU)
+- /calibrate_stdp: Suggest STDP weight adjustments from sync matrix
+- /health: Health check endpoint
+Based on: arXiv:2505.05522 (Continuous Thought Machines - Sakana AI)
+"""
+import gradio as gr
+import numpy as np
+import json
+from typing import List, Dict, Any, Optional
+import os
+# ============================================================================
+# SIMPLIFIED CTM SIMULATION (CPU-only for Hugging Face free tier)
+# ============================================================================
+class SimplifiedCTM:
+    """
+    Simplified CTM for CPU-only environment.
+    Simulates the key mechanisms without full PyTorch model.
+    """
+    def __init__(self, d_model: int = 256, memory_length: int = 16, n_ticks: int = 50):
+        self.d_model = d_model
+        self.memory_length = memory_length
+        self.n_ticks = n_ticks
+        # Initialize state
+        self.state_trace = np.zeros((d_model, memory_length))
+        self.activated_state = np.random.randn(d_model) * 0.1
+        # NLM weights (simplified: one weight matrix per "neuron group")
+        self.nlm_weights = np.random.randn(16, memory_length) * 0.1  # 16 groups for 16 dendrites
+    def compute_sync_matrix(self, z: np.ndarray) -> np.ndarray:
+        """S^t = Z · Z^T (normalized)"""
+        z_norm = z / (np.linalg.norm(z) + 1e-8)
+        S = np.outer(z_norm, z_norm)
+        return S
+    def compute_certainty(self, predictions: np.ndarray) -> float:
+        """Certainty = 1 - normalized entropy"""
+        probs = np.abs(predictions) / (np.sum(np.abs(predictions)) + 1e-8)
+        probs = np.clip(probs, 1e-10, 1.0)
+        entropy = -np.sum(probs * np.log(probs))
+        max_entropy = np.log(len(probs))
+        normalized_entropy = entropy / (max_entropy + 1e-8)
+        return float(1.0 - normalized_entropy)
+    def process_ticks(self, input_features: np.ndarray, n_ticks: Optional[int] = None) -> Dict:
+        """Run T internal ticks and return sync matrix + certainty"""
+        n_ticks = n_ticks or self.n_ticks
+        # Ensure input is right size
+        if len(input_features) < self.d_model:
+            input_features = np.pad(input_features, (0, self.d_model - len(input_features)))
+        else:
+            input_features = input_features[:self.d_model]
+        certainties = []
+        sync_matrices = []
+        for t in range(n_ticks):
+            # Simulate synapse update: mix the recurrent state with the external input
+            # (input_features is already padded or truncated to d_model above)
+            combined = self.activated_state + input_features
+            pre_activation = np.tanh(combined * 0.1 + np.random.randn(self.d_model) * 0.01)
+            # Update trace
+            self.state_trace = np.roll(self.state_trace, -1, axis=1)
+            self.state_trace[:, -1] = pre_activation
+            # Simulate NLM (simplified)
+            post_activation = np.zeros(self.d_model)
+            group_size = self.d_model // 16
+            for g in range(16):
+                start = g * group_size
+                end = start + group_size
+                group_trace = self.state_trace[start:end, :]
+                group_output = np.mean(group_trace @ self.nlm_weights[g])
+                post_activation[start:end] = np.tanh(group_output)
+            self.activated_state = post_activation
+            # Compute sync and certainty
+            sync = self.compute_sync_matrix(self.activated_state)
+            cert = self.compute_certainty(self.activated_state)
+            sync_matrices.append(sync)
+            certainties.append(cert)
+        # Find best ticks (min-loss proxy: max certainty)
+        best_tick = int(np.argmax(certainties))
+        return {
+            "final_sync_matrix": sync_matrices[-1].tolist(),
+            "best_sync_matrix": sync_matrices[best_tick].tolist(),
+            "certainties": certainties,
+            "final_certainty": float(certainties[-1]),
+            "max_certainty": float(max(certainties)),
+            "best_tick": best_tick,
+            "ticks_used": n_ticks
+        }
+# Global CTM instance
+ctm = SimplifiedCTM(d_model=256, memory_length=16, n_ticks=50)
+# ============================================================================
+# PHYSICS VALIDATION (from SNN Omega-21)
+# ============================================================================
+def validate_physics(trajectory: List[float], params: Dict) -> Dict:
+    """Validate against 5 physics losses from SNN Omega-21"""
+    trajectory = np.array(trajectory)
+    # L_energy: Energy budget (penalize total energy above P_max)
+    energy = np.sum(trajectory ** 2)
+    P_max = params.get("P_max", 1000.0)
+    L_energy = float(max(0, energy - P_max) ** 2)
+    # L_thermo: Thermodynamics (dew point check)
+    T_dew = params.get("T_dew", 15.0)
+    T_amb = params.get("T_amb", 25.0)
+    L_thermo = float(max(0, T_dew - T_amb) ** 2)
+    # L_causal: Causality (velocity limit)
+    velocity = np.diff(trajectory) if len(trajectory) > 1 else np.array([0])
+    v_max = params.get("v_max", 100.0)
+    L_causal = float(np.sum(np.maximum(0, np.abs(velocity) - v_max) ** 2))
+    # L_conserv: Flux conservation
+    flux_in = params.get("flux_in", 1.0)
+    flux_out = params.get("flux_out", 1.0)
+    L_conserv = float((flux_in - flux_out) ** 2)
+    # L_entropy: 2nd Law (penalize any decrease in entropy)
+    entropy_change = params.get("entropy_change", 0.1)
+    L_entropy = float(max(0, -entropy_change) ** 2)
+    # Total physics loss
+    L_total = L_energy + L_thermo + L_causal + L_conserv + L_entropy
+    return {
+        "valid": L_total < 0.01,
+        "L_energy": L_energy,
+        "L_thermo": L_thermo,
+        "L_causal": L_causal,
+        "L_conserv": L_conserv,
+        "L_entropy": L_entropy,
+        "L_total": L_total
+    }
+# ============================================================================
+# ENDPOINT FUNCTIONS
+# ============================================================================
+def sense_snn(snn_json: str) -> str:
+    """
+    /sense_snn - Process 72D SNN input
+    Input: JSON with dendrite values
+    Output: Coherent features + anomalies
+    """
+    try:
+        data = json.loads(snn_json)
+        # Extract 72D vector (or create from dendrites)
+        if "vector_72d" in data:
+            input_vec = np.array(data["vector_72d"])
+        elif "dendrites" in data:
+            dendrite_values = list(data["dendrites"].values())
+            input_vec = np.array(dendrite_values)
+        else:
+            input_vec = np.random.randn(72)  # Fallback
+        # Pad to 256D
+        input_256 = np.zeros(256)
+        input_256[:min(len(input_vec), 256)] = input_vec[:min(len(input_vec), 256)]
+        # Process through CTM
+        result = ctm.process_ticks(input_256, n_ticks=25)
+        # Detect anomalies (low certainty regions)
+        anomalies = []
+        if result["final_certainty"] < 0.5:
+            anomalies.append("Low overall certainty")
+        return json.dumps({
+            "status": "success",
+            "coherent_features": [row[:16] for row in result["final_sync_matrix"][:16]],  # 16x16 subset
+            "certainty": result["final_certainty"],
+            "anomalies": anomalies,
+            "ticks_used": result["ticks_used"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def reason_hypergraph(context_json: str) -> str:
+    """
+    /reason_hypergraph - Reason about hypergraph, propose edges
+    Input: Node features + existing edges
+    Output: Proposed new edges + certainty
+    """
+    try:
+        data = json.loads(context_json)
+        node_features = np.array(data.get("node_features", [[0]*16]*8))
+        existing_edges = data.get("existing_edges", [])
+        n_ticks = data.get("ticks", 50)
+        # Flatten node features for CTM input
+        input_vec = node_features.flatten()
+        # Process through CTM with more ticks for reasoning
+        result = ctm.process_ticks(input_vec, n_ticks=n_ticks)
+        # Extract proposed edges from sync matrix (S_ij > 0.8)
+        sync = np.array(result["best_sync_matrix"])
+        proposed_edges = []
+        n_nodes = min(len(node_features), sync.shape[0])
+        for i in range(n_nodes):
+            for j in range(i+1, n_nodes):
+                sync_ij = sync[i, j]
+                if sync_ij > 0.8:
+                    # Check if edge already exists
+                    edge_exists = any(
+                        (e[0] == i and e[1] == j) or (e[0] == j and e[1] == i)
+                        for e in existing_edges
+                    )
+                    if not edge_exists:
+                        proposed_edges.append([i, j, float(sync_ij)])
+        return json.dumps({
+            "status": "success",
+            "proposed_edges": proposed_edges,
+            "certainty": result["max_certainty"],
+            "best_tick": result["best_tick"],
+            "ticks_used": result["ticks_used"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def validate_physics_endpoint(physics_json: str) -> str:
+    """
+    /validate_physics - Validate trajectory against 5 physics losses
+    """
+    try:
+        data = json.loads(physics_json)
+        trajectory = data.get("trajectory", [0.0])
+        params = data.get("physics_params", {})
+        result = validate_physics(trajectory, params)
+        result["status"] = "success"
+        return json.dumps(result, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def dream_endpoint(dream_json: str) -> str:
+    """
+    /dream - Offline consolidation (requests T=500 ticks by default; capped at 100 on CPU)
+    Input: Hypergraph snapshot
+    Output: Discovered patterns + new edges
+    """
+    try:
+        data = json.loads(dream_json)
+        snapshot = data.get("hypergraph_snapshot", {})
+        n_ticks = data.get("ticks", 500)
+        # Extract features from snapshot
+        nodes = snapshot.get("nodes", [])
+        if nodes:
+            input_vec = np.array([n.get("features", [0]*16) for n in nodes]).flatten()
+        else:
+            input_vec = np.random.randn(256)  # Random dream if no nodes
+        # Dream: run CTM with many ticks and no external input after initial
+        result = ctm.process_ticks(input_vec, n_ticks=min(n_ticks, 100))  # Cap at 100 for CPU
+        # Analyze sync evolution to find patterns
+        sync = np.array(result["final_sync_matrix"])
+        # Find strong sync pairs (new edges)
+        new_edges = []
+        n = min(len(nodes), sync.shape[0]) if nodes else 16
+        for i in range(n):
+            for j in range(i+1, n):
+                if sync[i, j] > 0.85:
+                    new_edges.append([i, j, float(sync[i, j])])
+        # Find weak sync pairs (edges to prune)
+        pruned_edges = []
+        for i in range(n):
+            for j in range(i+1, n):
+                if sync[i, j] < 0.1:
+                    pruned_edges.append([i, j])
+        return json.dumps({
+            "status": "success",
+            "discovered_patterns": len(new_edges),
+            "new_edges": new_edges[:10],  # First 10 found (not ranked by sync strength)
+            "pruned_edges": pruned_edges[:10],  # First 10 found
+            "consolidation_certainty": result["max_certainty"],
+            "ticks_used": result["ticks_used"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def calibrate_stdp_endpoint(stdp_json: str) -> str:
+    """
+    /calibrate_stdp - Suggest STDP weight adjustments from sync
+    """
+    try:
+        data = json.loads(stdp_json)
+        current_weights = np.array(data.get("current_weights", [1.0]*16))
+        node_features = np.array(data.get("node_features", [[0]*16]*8))
+        # Process to get sync matrix
+        input_vec = node_features.flatten()
+        result = ctm.process_ticks(input_vec, n_ticks=25)
+        sync = np.array(result["final_sync_matrix"])
+        # Suggest weight adjustments based on sync patterns
+        # Uses the mean of row i of the sync matrix (neuron i against all neurons, itself included) to scale weights
+        suggested = np.zeros(16)
+        for i in range(16):
+            # Average sync of neuron i with others
+            avg_sync = np.mean(sync[i, :])
+            # Scale current weight by sync
+            suggested[i] = current_weights[i] * (0.5 + avg_sync)
+        return json.dumps({
+            "status": "success",
+            "suggested_weights": suggested.tolist(),
+            "weight_changes": (suggested - current_weights).tolist(),
+            "confidence": result["final_certainty"]
+        }, indent=2)
+    except Exception as e:
+        return json.dumps({"status": "error", "message": str(e)})
+def health_check() -> str:
+    """Health check for the CTM server"""
+    return json.dumps({
+        "status": "healthy",
+        "model": "CTM Nervous System v1.0",
+        "d_model": ctm.d_model,
+        "memory_length": ctm.memory_length,
+        "default_ticks": ctm.n_ticks,
+        "endpoints": [
+            "/sense_snn",
+            "/reason_hypergraph",
+            "/validate_physics",
+            "/dream",
+            "/calibrate_stdp"
+        ]
+    }, indent=2)
+# ============================================================================
+# GRADIO INTERFACE
+# ============================================================================
+with gr.Blocks(title="CTM Nervous System") as demo:
+    gr.Markdown("""
+    # 🧬 CTM Nervous System
+    **Continuous Thought Machine for Hypergraph Maintenance**
+    Based on [arXiv:2505.05522](https://arxiv.org/abs/2505.05522) - Sakana AI
+    ---
+    ## Endpoints
+    - **/sense_snn**: Process 72D SNN input
+    - **/reason_hypergraph**: Reason about context, propose edges
+    - **/validate_physics**: Validate against 5 physics losses
+    - **/dream**: Offline consolidation (T=500+)
+    - **/calibrate_stdp**: Suggest STDP weight adjustments
+    """)
+    with gr.Tabs():
+        with gr.Tab("🔌 /sense_snn"):
+            gr.Markdown("Process 72D SNN input vector")
+            snn_input = gr.Textbox(
+                label="SNN JSON Input",
+                value='{"dendrites": {"d1": 0.1, "d2": 0.2, "d3": 0.3}}',
+                lines=5
+            )
+            snn_output = gr.Textbox(label="Output", lines=10)
+            snn_btn = gr.Button("Process", variant="primary")
+            snn_btn.click(sense_snn, inputs=snn_input, outputs=snn_output, api_name="sense_snn")
+        with gr.Tab("🧠 /reason_hypergraph"):
+            gr.Markdown("Reason about hypergraph context")
+            reason_input = gr.Textbox(
+                label="Context JSON",
+                value='{"node_features": [[0.1, 0.2], [0.3, 0.4]], "existing_edges": [], "ticks": 50}',
+                lines=5
+            )
+            reason_output = gr.Textbox(label="Output", lines=10)
+            reason_btn = gr.Button("Reason", variant="primary")
+            reason_btn.click(reason_hypergraph, inputs=reason_input, outputs=reason_output, api_name="reason_hypergraph")
+        with gr.Tab("⚡ /validate_physics"):
+            gr.Markdown("Validate against 5 physics losses")
+            physics_input = gr.Textbox(
+                label="Physics JSON",
+                value='{"trajectory": [0.1, 0.2, 0.3], "physics_params": {"P_max": 1000}}',
+                lines=5
+            )
+            physics_output = gr.Textbox(label="Output", lines=10)
+            physics_btn = gr.Button("Validate", variant="primary")
+            physics_btn.click(validate_physics_endpoint, inputs=physics_input, outputs=physics_output, api_name="validate_physics")
+        with gr.Tab("💤 /dream"):
+            gr.Markdown("Offline consolidation")
+            dream_input = gr.Textbox(
+                label="Dream JSON",
+                value='{"hypergraph_snapshot": {"nodes": []}, "ticks": 100}',
+                lines=5
+            )
+            dream_output = gr.Textbox(label="Output", lines=10)
+            dream_btn = gr.Button("Dream", variant="primary")
+            dream_btn.click(dream_endpoint, inputs=dream_input, outputs=dream_output, api_name="dream")
+        with gr.Tab("🔧 /calibrate_stdp"):
+            gr.Markdown("Calibrate STDP weights")
+            stdp_input = gr.Textbox(
+                label="STDP JSON",
+                value='{"current_weights": [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], "node_features": [[0.1, 0.2]]}',
+                lines=5
+            )
+            stdp_output = gr.Textbox(label="Output", lines=10)
+            stdp_btn = gr.Button("Calibrate", variant="primary")
+            stdp_btn.click(calibrate_stdp_endpoint, inputs=stdp_input, outputs=stdp_output, api_name="calibrate_stdp")
+        with gr.Tab("❤️ Health"):
+            health_output = gr.Textbox(label="Health Status", lines=10)
+            health_btn = gr.Button("Check Health", variant="secondary")
+            health_btn.click(health_check, inputs=None, outputs=health_output, api_name="health_check")
+    gr.Markdown("""
+    ---
+    **Architecture**: CTM as Nervous System → Hypergraph as Thought
+    **Training**: Min-Loss + Max-Certainty + Physics Regularization
+    """)
+if __name__ == "__main__":
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        show_error=True
+    )
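
Review note: the thresholding step in `dream_endpoint` can be tried in isolation. The following is a minimal sketch and not the Space's code. `split_edges` and its parameters are hypothetical names, the 0.85 and 0.1 defaults mirror the thresholds in the endpoint, and, unlike the endpoint, it ranks proposed edges by sync strength before truncating (an assumption about the intent behind "Top 10").

```python
from typing import List, Sequence, Tuple


def split_edges(
    sync: Sequence[Sequence[float]],
    strong: float = 0.85,
    weak: float = 0.1,
    top_k: int = 10,
) -> Tuple[List[list], List[list]]:
    """Return (new_edges, pruned_edges) from the upper triangle of a sync matrix."""
    n = len(sync)
    new_edges, pruned_edges = [], []
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, so each pair appears once
            s = float(sync[i][j])
            if s > strong:
                new_edges.append([i, j, s])
            elif s < weak:
                pruned_edges.append([i, j])
    # Assumption: rank by strength so that truncation keeps the strongest pairs.
    new_edges.sort(key=lambda e: e[2], reverse=True)
    return new_edges[:top_k], pruned_edges[:top_k]


sync = [
    [1.0, 0.9, 0.05],
    [0.9, 1.0, 0.95],
    [0.05, 0.95, 1.0],
]
new_edges, pruned_edges = split_edges(sync)
print(new_edges)     # [[1, 2, 0.95], [0, 1, 0.9]]
print(pruned_edges)  # [[0, 2]]
```

Plain nested lists are used here so the sketch runs without NumPy; a NumPy array indexed as `sync[i][j]` would work the same way.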

data/custom_datasets.py ADDED Viewed

	@@ -0,0 +1,324 @@

+import torch
+from torchvision.datasets import ImageFolder
+from torch.utils.data import Dataset
+import random
+import numpy as np
+from tqdm.auto import tqdm
+from PIL import Image
+from datasets import load_dataset
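+class SortDataset(Dataset):
+    """Each item is a freshly sampled random vector together with its argsort."""
+    def __init__(self, N):
+        self.N = N  # Length of each vector to sort
+    def __len__(self):
+        return 10000000  # Nominal size; samples are generated on the fly
+    def __getitem__(self, idx):
+        # idx is ignored: every call draws a new standard-normal vector
+        data = torch.zeros(self.N).normal_()
+        ordering = torch.argsort(data)
+        inputs = data
+        return (inputs), (ordering)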
+class QAMNISTDataset(Dataset):
+    """A QAMNIST dataset that includes plus and minus operations on MNIST digits."""
+    def __init__(self, base_dataset, num_images, num_images_delta, num_repeats_per_input, num_operations, num_operations_delta):
+        self.base_dataset = base_dataset
+        self.num_images = num_images
+        self.num_images_delta = num_images_delta
+        self.num_images_range = self._calculate_num_images_range()
+        self.operators = ["+", "-"]
+        self.num_operations = num_operations
+        self.num_operations_delta = num_operations_delta
+        self.num_operations_range = self._calculate_num_operations_range()
+        self.num_repeats_per_input = num_repeats_per_input
+        self.current_num_digits = num_images
+        self.current_num_operations = num_operations
+        self.modulo_base = 10
+        self.output_range = [0, 9]
+    def _calculate_num_images_range(self):
+        min_val = self.num_images - self.num_images_delta
+        max_val = self.num_images + self.num_images_delta
+        assert min_val >= 1, f"Minimum number of images must be at least 1, got {min_val}"
+        return [min_val, max_val]
+    def _calculate_num_operations_range(self):
+        min_val = self.num_operations - self.num_operations_delta
+        max_val = self.num_operations + self.num_operations_delta
+        assert min_val >= 1, f"Minimum number of operations must be at least 1, got {min_val}"
+        return [min_val, max_val]
+    def set_num_digits(self, num_digits):
+        self.current_num_digits = num_digits
+    def set_num_operations(self, num_operations):
+        self.current_num_operations = num_operations
+    def _get_target_and_question(self, targets):
+        question = []
+        equations = []
+        num_digits = self.current_num_digits
+        num_operations = self.current_num_operations
+        # Select the initial digit
+        selection_idx = np.random.randint(num_digits)
+        first_digit = targets[selection_idx]
+        question.extend([selection_idx] * self.num_repeats_per_input)
+        # Set current_value to the initial digit (mod is applied in each operation)
+        current_value = first_digit % self.modulo_base
+        # For each operation, build an equation line
+        for _ in range(num_operations):
+            # Choose the operator ('+' or '-')
+            operator_idx = np.random.randint(len(self.operators))
+            operator = self.operators[operator_idx]
+            encoded_operator = -(operator_idx + 1)  # -1 for '+', -2 for '-'
+            question.extend([encoded_operator] * self.num_repeats_per_input)
+            # Choose the next digit
+            selection_idx = np.random.randint(num_digits)
+            digit = targets[selection_idx]
+            question.extend([selection_idx] * self.num_repeats_per_input)
+            # Compute the new value with immediate modulo reduction
+            if operator == '+':
+                new_value = (current_value + digit) % self.modulo_base
+            else:  # operator is '-'
+                new_value = (current_value - digit) % self.modulo_base
+            # Build the equation string for this step
+            equations.append(f"({current_value} {operator} {digit}) mod {self.modulo_base} = {new_value}")
+            # Update current value for the next operation
+            current_value = new_value
+        target = current_value
+        question_readable = "\n".join(equations)
+        return target, question, question_readable
+    def __len__(self):
+        return len(self.base_dataset)
+    def __getitem__(self, idx):
+        images, targets = [],[]
+        for _ in range(self.current_num_digits):
+            # idx is ignored: each digit image is drawn uniformly at random from the base dataset
+            image, target = self.base_dataset[np.random.randint(self.__len__())]
+            images.append(image)
+            targets.append(target)
+        observations = torch.repeat_interleave(torch.stack(images, 0), repeats=self.num_repeats_per_input, dim=0)
+        target, question, question_readable = self._get_target_and_question(targets)
+        return observations, question, question_readable, target
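
Review note: the running-modulo target built in `_get_target_and_question` can be checked by hand. This is a minimal sketch under stated assumptions: `running_mod_target` is a hypothetical helper, and the digits and operators are fixed rather than sampled so that the result is reproducible.

```python
from typing import List, Tuple


def running_mod_target(
    first: int, steps: List[Tuple[str, int]], base: int = 10
) -> Tuple[int, List[str]]:
    """Apply '+'/'-' steps, reducing modulo `base` after every operation."""
    value = first % base
    equations = []
    for op, digit in steps:
        new_value = (value + digit) % base if op == "+" else (value - digit) % base
        equations.append(f"({value} {op} {digit}) mod {base} = {new_value}")
        value = new_value
    return value, equations


target, equations = running_mod_target(7, [("+", 8), ("-", 9)])
print(target)  # 6
print("\n".join(equations))
```

Python's `%` returns a non-negative result for a positive modulus, which is why `(5 - 9) % 10` gives 6 and the target always stays within 0 to 9.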
+class ImageNet(Dataset):
+    def __init__(self, which_split, transform):
+        """
+        Thin wrapper around the Hugging Face `imagenet-1k` dataset.
+        Args:
+            which_split (str): Split name passed to `load_dataset` (e.g. 'train' or 'validation').
+            transform (callable): Transform applied to each image after conversion to RGB.
+        """
+        dataset = load_dataset('imagenet-1k', split=which_split, trust_remote_code=True)
+        self.transform = transform
+        self.base_dataset = dataset
+    def __len__(self):
+        return len(self.base_dataset)
+    def __getitem__(self, idx):
+        data_item = self.base_dataset[idx]
+        image = self.transform(data_item['image'].convert('RGB'))
+        target = data_item['label']
+        return image, target
+class MazeImageFolder(ImageFolder):
+    """
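+    A custom dataset class that extends the ImageFolder class.
+    All mazes are preloaded into memory and solved once at construction time.
+    Args:
+        root (string): Root directory path.
+        transform (callable, optional): A function/transform that takes in
+            a sample and returns a transformed version.
+            E.g., ``transforms.RandomCrop`` for images.
+        target_transform (callable, optional): A function/transform that takes
+            in the target and transforms it.
+        loader (callable, optional): A function to load an image given its path.
+        is_valid_file (callable, optional): A function that takes the path of an image file
+            and checks whether it is a valid file (used to check for corrupt files).
+        which_set (str): Augmentations are applied only when this is 'train'.
+        augment_p (float): Probability of applying each augmentation (rotation, horizontal flip, vertical flip).
+        maze_route_length (int): Length of the route array; unused steps keep the value 4.
+        trunc (bool): If True, preload only the first 1000 mazes.
+        expand_range (bool): If True, rescale pixel values from [0, 1] to [-1, 1].
+    Attributes:
+        classes (list): List of the class names.
+        class_to_idx (dict): Dict with items (class_name, class_index).
+        imgs (list): List of (image path, class_index) tuples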
+    """
+    def __init__(self, root, transform=None, target_transform=None,
+                 loader=Image.open,
+                 is_valid_file=None,
+                 which_set='train',
+                 augment_p=0.5,
+                 maze_route_length=10,
+                 trunc=False,
+                 expand_range=True):
+        super(MazeImageFolder, self).__init__(root, transform, target_transform, loader, is_valid_file)
+        self.which_set = which_set
+        self.augment_p = augment_p
+        self.maze_route_length = maze_route_length
+        self.all_paths = {}
+        self.trunc = trunc
+        self.expand_range = expand_range
+        self._preload()
+        print('Solving all mazes...')
+        for index in range(len(self.preloaded_samples)):
+            path = self.get_solution(self.preloaded_samples[index])
+            self.all_paths[index] = path
+    def _preload(self):
+        preloaded_samples = []
+        with tqdm(total=self.__len__(), initial=0, leave=True, position=0, dynamic_ncols=True) as pbar:
+            for index in range(self.__len__()):
+                pbar.set_description('Loading mazes')
+                path, target = self.samples[index]
+                sample = self.loader(path)
+                sample = np.array(sample).astype(np.float32)/255
+                preloaded_samples.append(sample)
+                pbar.update(1)
+                if self.trunc and index == 999: break
+        self.preloaded_samples = preloaded_samples
+    def __len__(self):
+        if hasattr(self, 'preloaded_samples') and self.preloaded_samples is not None:
+            return len(self.preloaded_samples)
+        else:
+            return super().__len__()
+    def get_solution(self, x):
+        x = np.copy(x)
+        # Find start (red) and end (green) pixel coordinates
+        start_coords = np.argwhere((x == [1, 0, 0]).all(axis=2))
+        end_coords = np.argwhere((x == [0, 1, 0]).all(axis=2))
+        if len(start_coords) == 0 or len(end_coords) == 0:
+            print("Start or end point not found.")
+            return None
+        start_y, start_x = start_coords[0]
+        end_y, end_x = end_coords[0]
+        current_y, current_x = start_y, start_x
+        path = [4] * self.maze_route_length  # Moves: 0=up, 1=down, 2=left, 3=right; 4 fills unused steps
+        pi = 0
+        while (current_y, current_x) != (end_y, end_x):
+            next_y, next_x = -1, -1  # Initialize to invalid coordinates
+            direction = -1  # Initialize to an invalid direction
+            # Check Up
+            if current_y > 0 and ((x[current_y - 1, current_x] == [0, 0, 1]).all() or (x[current_y - 1, current_x] == [0, 1, 0]).all()):
+                next_y, next_x = current_y - 1, current_x
+                direction = 0
+            # Check Down
+            elif current_y < x.shape[0] - 1 and ((x[current_y + 1, current_x] == [0, 0, 1]).all() or (x[current_y + 1, current_x] == [0, 1, 0]).all()):
+                next_y, next_x = current_y + 1, current_x
+                direction = 1
+            # Check Left
+            elif current_x > 0 and ((x[current_y, current_x - 1] == [0, 0, 1]).all() or (x[current_y, current_x - 1] == [0, 1, 0]).all()):
+                next_y, next_x = current_y, current_x - 1
+                direction = 2
+            # Check Right
+            elif current_x < x.shape[1] - 1 and ((x[current_y, current_x + 1] == [0, 0, 1]).all() or (x[current_y, current_x + 1] == [0, 1, 0]).all()):
+                next_y, next_x = current_y, current_x + 1
+                direction = 3
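+            if direction == -1:
+                break  # Dead end: no adjacent route or goal pixel, so stop instead of stepping to (-1, -1)
+            path[pi] = direction
+            pi += 1
+            x[current_y, current_x] = [255,255,255] # overwrite the current pixel so it no longer matches and the walk cannot go in circles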
+            current_y, current_x = next_y, next_x
+            if pi == len(path):
+                break
+        return np.array(path)
+    def __getitem__(self, index):
+        """
+        Args:
+            index (int): Index
+        Returns:
+            tuple: (sample, target) where sample is a CxHxW tensor and target is the
+            route as an array of moves (0=up, 1=down, 2=left, 3=right, 4=unused step).
+        """
+        sample = np.copy(self.preloaded_samples[index])
+        path = np.copy(self.all_paths[index])
+        if self.which_set == 'train':
+            # Randomly rotate -90 or +90 degrees
+            if random.random() < self.augment_p:
+                which_rot = random.choice([-1, 1])
+                sample = np.rot90(sample, k=which_rot, axes=(0, 1))
+                for pi in range(len(path)):
+                    if path[pi] == 0: path[pi] = 3 if which_rot == -1 else 2
+                    elif path[pi] == 1: path[pi] = 2 if which_rot == -1 else 3
+                    elif path[pi] == 2: path[pi] = 0 if which_rot == -1 else 1
+                    elif path[pi] == 3: path[pi] = 1 if which_rot == -1 else 0
+            # Random horizontal flip
+            if random.random() < self.augment_p:
+                sample = np.fliplr(sample)
+                for pi in range(len(path)):
+                    if path[pi] == 2: path[pi] = 3
+                    elif path[pi] == 3: path[pi] = 2
+            # Random vertical flip
+            if random.random() < self.augment_p:
+                sample = np.flipud(sample)
+                for pi in range(len(path)):
+                    if path[pi] == 0: path[pi] = 1
+                    elif path[pi] == 1: path[pi] = 0
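+        sample = torch.from_numpy(np.copy(sample)).permute(2,0,1)  # HxWxC -> CxHxW
+        # Paint the blue solution route white so the input image does not reveal the answer
+        blue_mask = (sample[0] == 0) & (sample[1] == 0) & (sample[2] == 1)
+        sample[:, blue_mask] = 1
+        target = path
+        if not self.expand_range:
+            return sample, target
+        return (sample*2)-1, (target)  # Rescale pixel values from [0, 1] to [-1, 1]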
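
Review note: the move relabelling in `__getitem__` amounts to four lookup tables. The sketch below is illustrative only: the table names and `remap` are hypothetical, and the mappings are copied from the branches above (`which_rot == 1` corresponds to `np.rot90` with `k=1`, a counter-clockwise rotation).

```python
from typing import Dict, List

# Move encoding used by the dataset: 0=up, 1=down, 2=left, 3=right, 4=unused step.
ROT_CCW: Dict[int, int] = {0: 2, 1: 3, 2: 1, 3: 0}  # np.rot90(..., k=1)
ROT_CW: Dict[int, int] = {0: 3, 1: 2, 2: 0, 3: 1}   # np.rot90(..., k=-1)
FLIP_LR: Dict[int, int] = {2: 3, 3: 2}              # horizontal flip swaps left/right
FLIP_UD: Dict[int, int] = {0: 1, 1: 0}              # vertical flip swaps up/down


def remap(path: List[int], table: Dict[int, int]) -> List[int]:
    """Relabel moves; values missing from the table (e.g. 4) are left unchanged."""
    return [table.get(move, move) for move in path]


path = [3, 3, 1, 4, 4]       # right, right, down, then two unused steps
print(remap(path, ROT_CCW))  # [0, 0, 3, 4, 4]
print(remap(path, FLIP_LR))  # [2, 2, 1, 4, 4]
```

A quick consistency check is that a clockwise remap followed by a counter-clockwise remap returns the original path.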
+class ParityDataset(Dataset):
+    def __init__(self, sequence_length=64, length=100000):
+        self.sequence_length = sequence_length
+        self.length = length
+    def __len__(self):
+        return self.length
+    def __getitem__(self, idx):
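+        # Random sequence of -1/+1 values
+        vector = 2 * torch.randint(0, 2, (self.sequence_length,)) - 1
+        vector = vector.float()
+        # target[t] is 1 when vector[:t+1] contains an odd number of -1 entries
+        negatives = (vector == -1).to(torch.long)
+        cumsum = torch.cumsum(negatives, dim=0)
+        target = (cumsum % 2 != 0).to(torch.long)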
+        return vector, target
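
Review note: the cumulative parity target produced by `ParityDataset.__getitem__` can be reproduced without PyTorch. This is a minimal sketch; `cumulative_parity` is a hypothetical helper, and the input is fixed rather than sampled.

```python
from itertools import accumulate
from typing import List


def cumulative_parity(vector: List[int]) -> List[int]:
    """target[t] is 1 when vector[: t + 1] holds an odd number of -1 entries."""
    negatives = [1 if v == -1 else 0 for v in vector]
    return [count % 2 for count in accumulate(negatives)]


print(cumulative_parity([1, -1, -1, 1, -1]))  # [0, 1, 0, 0, 1]
```

The target therefore has one label per position, not a single label for the whole sequence.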

examples/01_mnist.ipynb ADDED Viewed