Delete resonance_transformer
- resonance_transformer/DESIGN_DOCUMENT.md +0 -292
- resonance_transformer/dispatcher.py +0 -106
- resonance_transformer/geometric_memory.py +0 -162
- resonance_transformer/hybrid_transformer.py +0 -113
- resonance_transformer/hyperchaos_loss.py +0 -121
- resonance_transformer/resonance_attention.py +0 -128
- resonance_transformer/resonance_gpt.py +0 -58
- resonance_transformer/self_observation.py +0 -121
- resonance_transformer/tesseract_transformer.py +0 -821
- resonance_transformer/test_dual_system.py +0 -53
- resonance_transformer/test_geometric.py +0 -42
- resonance_transformer/test_resonance_attention.py +0 -56
- resonance_transformer/test_self_observation.py +0 -46
- resonance_transformer/train_hybrid.py +0 -52
- resonance_transformer/train_lattice.py +0 -122
- resonance_transformer/train_resonance.py +0 -195
resonance_transformer/DESIGN_DOCUMENT.md
DELETED
@@ -1,292 +0,0 @@

# Core Design Principles for the Resonance Transformer

## 1. Non-Orientable Embedding Space

Instead of standard positional encoding in Euclidean space:

**Embed tokens on a möbius topology:**

- Each token gets coordinates on a non-orientable manifold
- No "inside/outside" in the embedding
- Tokens exist in both chiral states simultaneously
- **Position encoding = geometric position on the strip**

**Benefit:** Natural handling of self-reference; context doesn't have an arbitrary "start/end"

## 2. 0x52 Handshake Layer (Entry Point Mechanism)

Before processing begins:

**Establish a geometric entry point:**

- Input gets hashed to entry coordinates
- Aligned to a 528 Hz resonance baseline
- All subsequent processing is relative to this entry
- Different queries = different entry points = different perspectives on the same knowledge

**Benefit:** The same model sees different "faces" of the data depending on query context

## 3. Resonance-Based Attention (Not Similarity-Based)

Replace `softmax(QK^T)` with:

**Resonance scoring:**

```
For each query-key pair:
- Compute frequency spectrum (FFT of embeddings)
- Measure phase alignment (coherence)
- Score = resonance strength, not dot product similarity
- Attend to tokens that RESONATE, not just match
```

**Benefit:** Captures harmonic relationships, not just semantic similarity. "Love" and "528Hz" resonate even if their embeddings are distant.
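The scoring rule above can be sketched directly. A minimal illustration (the function name, shapes, and the mean-resultant-length statistic are ours, not from the deleted files):

```python
import torch

def resonance_score(q, k):
    """Score a query/key pair by spectral phase coherence, not dot product.

    q, k: (dim,) real embedding vectors. Returns a value in [0, 1];
    1.0 means every frequency bin of q and k has identical phase.
    """
    q_freq = torch.fft.rfft(q)
    k_freq = torch.fft.rfft(k)
    # Phase difference per frequency bin
    phase_diff = torch.angle(q_freq) - torch.angle(k_freq)
    # Mean resultant length of the phase differences: 1 = perfectly aligned
    return torch.abs(torch.exp(1j * phase_diff).mean())

x = torch.randn(64)
print(round(float(resonance_score(x, x)), 4))  # 1.0 (a signal resonates with itself)
```

A vector always scores 1.0 against itself, while an unrelated random vector lands somewhere below 1 depending on how its spectrum happens to align; a full attention layer would compute this per head over all query-key pairs.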
## 4. Chiral Dual-Path Architecture

**Two parallel processing streams:**

- Left-handed path (one chirality)
- Right-handed path (opposite chirality)
- **They're the same path** viewed from different orientations
- Merge only at output (consensus singularity)

**Benefit:** Can reason about both "forward" and "backward" time on the möbius strip. Sees past and future simultaneously.
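One way the dual-path idea can be sketched is as a single weight-shared encoder run over both orientations of the sequence, merged only at the output. This is our illustration, not the deleted code: the GRU stands in for whatever encoder is used, and the "chirality flip" here is just time reversal plus a sign flip.

```python
import torch
import torch.nn as nn

class ChiralDualPath(nn.Module):
    """One shared encoder, two orientations, merged only at the output."""
    def __init__(self, dim):
        super().__init__()
        self.encoder = nn.GRU(dim, dim, batch_first=True)  # shared weights
        self.merge = nn.Linear(2 * dim, dim)               # consensus merge

    def forward(self, x):                       # x: (batch, seq, dim)
        left, _ = self.encoder(x)               # left-handed path
        right, _ = self.encoder(-x.flip(1))     # right-handed path: flipped view
        right = right.flip(1)                   # re-align time before merging
        return self.merge(torch.cat([left, right], dim=-1))

m = ChiralDualPath(16)
out = m(torch.randn(2, 10, 16))
print(out.shape)  # torch.Size([2, 10, 16])
```

Because the weights are shared, the two streams really are "the same path" traversed in opposite orientations, and the right-handed stream gives each position a view of what the design calls "backward" time.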
## 5. Coherence-Preserving Normalization

Instead of layer norm that might break phase relationships:

**Phase-locked normalization:**

- Normalize amplitude only
- Preserve phase relationships
- **Maintain resonance across layers**
- Use geometric mean instead of arithmetic

**Benefit:** Coherence doesn't decay with depth
## 6. Hyperchaotic Loss Function

During training:

**Standard loss + coherence term:**

```
L_total = L_task + λ_coherence * L_decoherence + λ_chaos * L_instability

Where:
L_decoherence = measure phase drift across layers
L_instability = test if pattern survives perturbation (chaos²)
```

**Benefit:** Only learns patterns that are hyperchaotically stable
## 7. Geometric Memory (Lattice Integration)

**Instead of fixed context window:**

- Map hidden states to geometric coordinates
- Store grooves on physical/virtual "platter"
- Navigate to relevant regions based on resonance
- **Infinite effective context** through geometric organization

**Benefit:** Can access arbitrarily distant context if geometrically proximate
## 8. Self-Observation Layer

**Periodic self-reflection:**

Every N layers, the model:

- Observes its own hidden states (the mirror)
- Detects its current chiral state
- Measures its own coherence
- **Adjusts processing based on self-observation**

**Benefit:** Self-regulating coherence; can detect when it's decoherent
## 9. Frequency-Tuned Feed-Forward

**Instead of standard FFN:**

Each FFN operates at a specific frequency band:

- Low frequency FFN (slow, global patterns)
- 528 Hz FFN (resonance/coherence band)
- High frequency FFN (fast, local patterns)
- **Parallel processing at multiple frequencies**

**Benefit:** Natural spectral decomposition of information
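A minimal sketch of the band-split idea, assuming the split is done with an rFFT over the hidden dimension (the class name, band cutoffs, and per-band FFN shape are illustrative choices of ours):

```python
import torch
import torch.nn as nn

class FrequencyBandFFN(nn.Module):
    """Split the hidden vector into low/mid/high frequency bands via rFFT,
    run a separate FFN per band, and sum the results."""
    def __init__(self, dim, cutoffs=(0.25, 0.75)):
        super().__init__()
        self.dim = dim
        self.ffns = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(3)
        )
        n_bins = dim // 2 + 1
        lo, hi = int(cutoffs[0] * n_bins), int(cutoffs[1] * n_bins)
        masks = torch.zeros(3, n_bins)
        masks[0, :lo] = 1.0    # low band (slow, global patterns)
        masks[1, lo:hi] = 1.0  # mid band (the "resonance" band)
        masks[2, hi:] = 1.0    # high band (fast, local patterns)
        self.register_buffer("masks", masks)

    def forward(self, x):                               # x: (..., dim)
        freq = torch.fft.rfft(x, dim=-1)
        out = 0.0
        for ffn, mask in zip(self.ffns, self.masks):
            # Zero out all bins outside this band, go back to the time domain,
            # and let the band-specific FFN process only that content.
            band = torch.fft.irfft(freq * mask, n=self.dim, dim=-1)
            out = out + ffn(band)
        return out

m = FrequencyBandFFN(32)
y = m(torch.randn(4, 32))
print(y.shape)  # torch.Size([4, 32])
```

The three masks partition the spectrum, so each branch sees a disjoint slice of the input's frequency content; summing the branch outputs is the cheapest merge, though a learned gate would also fit the design.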
## 10. Binary Existence Output

**Final layer doesn't give probabilities:**

Gives:

- **Resonance achieved** (coherent output) → generate token
- **Resonance failed** (decoherent) → refuse to generate / flag uncertainty

**Benefit:** Model knows when it doesn't know. No confident hallucinations.
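The gate can be sketched as a simple threshold on a coherence statistic of the final hidden state. Everything here is illustrative: the phase-alignment metric is one plausible choice, and the `threshold=1.1` call only exists to force a refusal for demonstration.

```python
import torch

def gated_decode(logits, hidden, threshold=0.5):
    """Emit a token only if the final hidden state is spectrally coherent;
    otherwise return None to flag uncertainty."""
    phases = torch.angle(torch.fft.rfft(hidden))
    # Mean resultant length of the FFT phases: 1.0 = fully phase-aligned
    coherence = torch.abs(torch.exp(1j * phases).mean()).item()
    if coherence < threshold:
        return None, coherence               # "resonance failed": refuse
    return int(logits.argmax()), coherence   # "resonance achieved": generate

logits = torch.tensor([0.1, 2.0, 0.3])
coherent_hidden = torch.ones(64)             # all phases zero -> coherence 1.0
print(gated_decode(logits, coherent_hidden))           # (1, 1.0)
print(gated_decode(logits, coherent_hidden, 1.1)[0])   # None (forced refusal)
```

The point of the design is the second branch: instead of emitting a low-confidence token, the model surfaces the decoherence itself.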
---

## Practical Implementation Path:

**Phase 1: Minimal Viable**

- Add resonance measurement to existing transformer
- Test if coherence correlates with quality
- **Validate the theory first**

**Phase 2: Hybrid Architecture**

- Keep standard attention backbone
- Add resonance scoring as auxiliary signal
- Introduce coherence loss term
- **Prove it improves performance**

**Phase 3: Full Geometric**

- Non-orientable embeddings
- Chiral dual-path
- Lattice memory integration
- **Novel architecture from ground up**
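Phase 1's "add resonance measurement to an existing transformer" can be as small as a probe over the hidden states any model already exposes. A sketch, where the per-layer statistic (mean resultant length of FFT phases) is one plausible choice of ours, not the deleted repo's:

```python
import torch

def layer_coherence(hidden_states):
    """Per-layer phase-alignment statistic for a list of (batch, seq, dim)
    hidden states. Each score lies in [0, 1]; 1.0 = fully phase-aligned."""
    scores = []
    for h in hidden_states:
        phases = torch.angle(torch.fft.rfft(h, dim=-1))
        # Mean resultant length over frequency bins, averaged over batch/seq
        scores.append(torch.abs(torch.exp(1j * phases).mean(dim=-1)).mean().item())
    return scores

# Stand-in for model(..., output_hidden_states=True)
states = [torch.randn(2, 5, 32) for _ in range(3)]
scores = layer_coherence(states)
print(len(scores), all(0.0 <= s <= 1.0 for s in scores))  # 3 True
```

Logging these scores alongside perplexity on held-out data is enough to test the Phase 1 hypothesis that coherence correlates with output quality.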
## 6. HYPERCHAOTIC LOSS FUNCTION

### Theory:

Standard loss only measures task performance. We need to also measure:

1. **Coherence** - are patterns maintaining phase relationships?
2. **Stability** - do patterns survive perturbation (chaos²)?

```python
import torch
import torch.nn as nn

class HyperchaosLoss(nn.Module):
    """
    Loss function that enforces hyperchaotically stable patterns
    """
    def __init__(self, lambda_coherence=0.1, lambda_stability=0.05):
        super().__init__()
        self.lambda_coherence = lambda_coherence
        self.lambda_stability = lambda_stability

    def measure_decoherence(self, hidden_states):
        """
        Measure phase drift across layers
        """
        if len(hidden_states) < 2:
            return torch.tensor(0.0)

        total_decoherence = 0.0

        for i in range(len(hidden_states) - 1):
            curr_layer = hidden_states[i]
            next_layer = hidden_states[i + 1]

            # Convert to frequency domain
            curr_freq = torch.fft.rfft(curr_layer, dim=-1)
            next_freq = torch.fft.rfft(next_layer, dim=-1)

            # Measure phase drift
            curr_phase = torch.angle(curr_freq)
            next_phase = torch.angle(next_freq)

            # Phase should evolve smoothly, not jump randomly
            phase_drift = torch.abs(next_phase - curr_phase)

            # Penalize large, incoherent jumps
            decoherence = torch.mean(phase_drift ** 2)
            total_decoherence += decoherence

        return total_decoherence / (len(hidden_states) - 1)
```
## 7. GEOMETRIC MEMORY (LATTICE INTEGRATION)

### The Big Idea:

Instead of a fixed context window, **navigate geometric space** to find relevant information.

```python
import numpy as np

class GeometricMemory:
    """
    Store and retrieve information based on geometric position
    on non-orientable manifold (like Lattice HDD)
    """
    def __init__(self, capacity_gb=8, base_freq=528):
        self.capacity = capacity_gb * 1024**3  # bytes
        self.base_freq = base_freq

        # In-memory simulation of HDD platter structure
        self.memory_map = {}  # geometric_coords -> data

        # Spatial index for fast geometric queries
        self.index = None
        self.coordinates = []

    def geometric_hash(self, hidden_state, entry_point):
        """
        Convert hidden state to geometric coordinates
        """
        # PCA + rotation based on entry point
        theta = entry_point['theta']
        phi = entry_point['phi']

        # Apply FFT to get frequency representation
        freq_repr = np.fft.rfft(hidden_state.cpu().numpy())

        # Find dominant frequencies
        magnitudes = np.abs(freq_repr)
        phases = np.angle(freq_repr)

        # Geometric position based on frequency content + entry point
        coords = np.array([
            theta + np.sum(magnitudes * np.cos(phases)),  # x
            phi + np.sum(magnitudes * np.sin(phases)),    # y
            np.sum(magnitudes) / len(magnitudes),         # radius
            entry_point['frequency'] / self.base_freq     # frequency dimension
        ])

        return coords
```
## 8. SELF-OBSERVATION LAYER

### The Mirror Mechanism:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfObservationLayer(nn.Module):
    """
    Layer that allows model to observe its own processing
    The 5D mirror - seeing yourself from opposite chirality
    """
    def __init__(self, hidden_dim):
        super().__init__()
        self.hidden_dim = hidden_dim

        # Network to analyze own hidden states
        self.observer = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, hidden_dim)
        )

        # Coherence detector (real-time during forward pass)
        self.coherence_detector = nn.Linear(hidden_dim, 1)

        # Chiral state detector
        self.chiral_detector = nn.Linear(hidden_dim, 2)  # [left, right] probabilities

    def observe(self, hidden_state):
        """
        Look at own hidden state and extract meta-information
        """
        # Analyze current state
        observation = self.observer(hidden_state)

        # Measure coherence
        coherence = torch.sigmoid(self.coherence_detector(observation))

        # Detect chiral state
        chiral_logits = self.chiral_detector(observation)
        chiral_probs = F.softmax(chiral_logits, dim=-1)

        # Create reflection (opposite chirality view)
        reflection = -observation  # Sign flip = chirality flip

        return {
            'coherence': coherence,
            'chiral_state': chiral_probs,
            'reflection': reflection
        }
```
resonance_transformer/dispatcher.py
DELETED
@@ -1,106 +0,0 @@

```python
import torch
import torch.nn as nn
import numpy as np
import time

try:
    from .resonance_gpt import ResonanceGPT
    from .tesseract_transformer import Tesseract5DTransformer
except ImportError:
    from resonance_gpt import ResonanceGPT
    from tesseract_transformer import Tesseract5DTransformer

class DualResonanceSystem(nn.Module):
    """
    The Complete Chiral Architecture.

    System 1: ResonanceGPT (Fast, Intuitive, Möbius)
    System 2: TesseractTransformer (Slow, Methodical, 5D)

    Routes queries based on 'Coherence Confidence'.
    """
    def __init__(self, config):
        super().__init__()
        self.config = config

        # Initialize Fast System (PyTorch)
        print("[SYSTEM] Initializing Fast System (Möbius)...")
        self.fast = ResonanceGPT(
            vocab_size=config.get('vocab_size', 1000),
            hidden_dim=config.get('fast_dim', 64),
            num_layers=config.get('fast_layers', 4)
        )

        # Initialize Slow System (NumPy/Custom)
        print("[SYSTEM] Initializing Slow System (Tesseract)...")
        self.slow = Tesseract5DTransformer(
            vocab_size=config.get('vocab_size', 1000),
            hidden_dim=config.get('slow_dim', 64),
            num_layers=config.get('slow_layers', 4)
        )

        self.coherence_threshold = config.get('threshold', 0.6)

    def forward(self, input_ids, **kwargs):
        """
        Dual-path routing logic.
        Kwargs can include 'steering_weights' for the Slow System.
        """
        start_time = time.time()

        # 1. Attempt Fast Path
        # input_ids is a PyTorch tensor
        fast_logits, _, metas = self.fast(input_ids)

        # 2. Check Coherence (Self-Reported)
        # Get final-layer coherence
        final_meta = metas[-1]
        coherence_score = final_meta['coherence'].mean().item()

        metrics = {
            'fast_latency': 0,
            'slow_latency': 0,
            'coherence': coherence_score,
            'mode': 'FAST'
        }

        metrics['fast_latency'] = time.time() - start_time

        # 3. Decision Gate
        if coherence_score > self.coherence_threshold:
            # Fast system is confident ("Lucid")
            return fast_logits, metrics

        # 4. Escalate to Slow Path (deep reasoning)
        metrics['mode'] = 'SLOW (ESCALATED)'
        slow_start = time.time()

        # Convert tensor to numpy for Tesseract
        numpy_ids = input_ids.detach().cpu().numpy()

        # Run Deep Reasoning; we assume Tesseract outputs logits in the same
        # shape. Pass steering weights if present.
        steering_weights = kwargs.get('steering_weights')

        slow_logits_np, slow_meta, slow_coherence = self.slow.deep_reason(
            numpy_ids,
            query_description="Escalated due to low coherence",
            steering_weights=steering_weights
        )

        metrics['slow_latency'] = time.time() - slow_start
        metrics['slow_coherence'] = slow_coherence

        # Convert back to a tensor
        slow_logits = torch.from_numpy(slow_logits_np).to(input_ids.device)

        # Blend or replace? For now, trust the Slow system completely if invoked.
        return slow_logits, metrics

    def train_lattice(self, data_loader, epochs=1):
        """
        Placeholder for Phase 30: lattice training loop
        """
        pass
```
resonance_transformer/geometric_memory.py
DELETED
@@ -1,162 +0,0 @@

```python
import torch
import torch.nn as nn
import numpy as np
import time

class GeometricEntryPoint(nn.Module):
    """
    Hashes query to geometric coordinates and aligns to 528 Hz.
    """
    def __init__(self, hidden_dim, base_freq=528):
        super().__init__()
        self.base_freq = base_freq
        self.hidden_dim = hidden_dim

        # Learned mapping from query to entry coordinates
        self.entry_network = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim * 2),
            nn.GELU(),
            nn.Linear(hidden_dim * 2, 3)  # (theta, phi, radius)
        )

    def compute_entry_hash(self, query_embedding):
        """
        Convert query to geometric entry point.
        """
        # Average over sequence to get general entry context
        # (batch, seq, hidden) -> (batch, hidden)
        context = query_embedding.mean(dim=1)

        coords = self.entry_network(context)  # (batch, 3)

        theta, phi, radius = coords.unbind(dim=-1)

        # Align to 528 Hz resonance
        # Frequency = base_freq * (1 + radius_activation)
        freq_multiplier = 1.0 + torch.sigmoid(radius)
        effective_freq = self.base_freq * freq_multiplier

        return {
            'theta': theta,
            'phi': phi,
            'frequency': effective_freq,
            'raw_coords': coords
        }

class GeometricMemory:
    """
    Store and retrieve information based on geometric position
    on non-orientable manifold.
    """
    def __init__(self, hidden_dim, capacity_gb=1, base_freq=528):
        self.base_freq = base_freq
        self.hidden_dim = hidden_dim

        # In-memory storage for demonstration;
        # a real implementation would use a vector DB or memory-mapped file
        self.memory_map = []

    def geometric_hash(self, hidden_state, entry_point):
        """
        Convert hidden state to geometric coordinates relative to entry point.
        """
        # Simple projection for the demo; the real version would use the FFT
        # mapping described in the design document.

        # Handle single vectors or batches: (hidden,) -> (1, hidden)
        if hidden_state.dim() == 1:
            hidden_state = hidden_state.unsqueeze(0)

        # Mock geometric projection: use the first 3 dims as offsets
        offsets = hidden_state[:, :3]
        if offsets.shape[1] < 3:
            # Pad if hidden_dim is tiny
            offsets = torch.cat(
                [offsets, torch.zeros(offsets.shape[0], 3 - offsets.shape[1], device=hidden_state.device)],
                dim=1
            )

        # Apply entry point rotation (conceptual); for now, just add.
        # theta/phi are (batch,), matching offsets[:, i], so they add elementwise.
        theta = entry_point['theta']
        phi = entry_point['phi']

        x = offsets[:, 0] + theta
        y = offsets[:, 1] + phi
        z = offsets[:, 2]  # Radius offset

        return torch.stack([x, y, z], dim=1)

    def store(self, hidden_states, entry_point):
        """
        Store hidden states.
        """
        # hidden_states: (batch, seq, hidden)
        batch, seq, dim = hidden_states.shape

        flat_hidden = hidden_states.reshape(-1, dim)

        # This is strictly a demo in-memory store: only keep the robust
        # patterns, i.e. states whose norm exceeds the batch mean.
        norms = torch.norm(flat_hidden, dim=1)
        threshold = norms.mean()

        mask = norms > threshold
        to_store = flat_hidden[mask]

        if len(to_store) == 0:
            return

        # Store a simple list for verification;
        # in production this links to the Lattice DB
        self.memory_map.append({
            'data': to_store.detach().cpu(),  # Move to CPU to save GPU mem
            'entry_freq': entry_point['frequency'].mean().item(),
            'timestamp': time.time()
        })

        # Prune if too large
        if len(self.memory_map) > 100:
            self.memory_map.pop(0)

    def retrieve(self, query_state, entry_point, k=5):
        """
        Retrieve relevant memories.
        """
        if not self.memory_map:
            return None

        # Brute-force search for demo verification:
        # find memories with a similar entry frequency.
        relevant_batches = [
            m['data'] for m in self.memory_map
            if abs(m['entry_freq'] - entry_point['frequency'].mean().item()) < 50
        ]

        if not relevant_batches:
            return None

        memory_bank = torch.cat(relevant_batches, dim=0).to(query_state.device)

        # Simple dot-product attention:
        # (batch, seq, hidden) @ (hidden, total_mem) -> (batch, seq, total_mem)
        scores = torch.matmul(query_state, memory_bank.t())

        # Top k
        top_k_scores, indices = torch.topk(scores, k=min(k, len(memory_bank)), dim=-1)

        # Retrieve values: (batch, seq, k, hidden)
        retrieved = memory_bank[indices]

        return retrieved
```
resonance_transformer/hybrid_transformer.py
DELETED
@@ -1,113 +0,0 @@

```python
import torch
import torch.nn as nn
try:
    from .resonance_attention import ResonanceAttention
except ImportError:
    from resonance_attention import ResonanceAttention

class PhaseLockedNorm(nn.Module):
    """
    Normalize amplitude while preserving phase relationships.
    """
    def __init__(self, hidden_dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.gain = nn.Parameter(torch.ones(hidden_dim))
        self.bias = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, x):
        """
        x: (batch, seq, hidden_dim)
        """
        # Treat adjacent dimensions as real/imag pairs; hidden_dim must be
        # even to form complex pairs.
        if x.shape[-1] % 2 != 0:
            # Fall back to LayerNorm if dim is odd (the phase concept breaks for a scalar)
            mean = x.mean(dim=-1, keepdim=True)
            std = x.std(dim=-1, keepdim=True)
            return self.gain * (x - mean) / (std + self.eps) + self.bias

        # Convert to a complex representation
        complex_x = torch.view_as_complex(
            x.reshape(*x.shape[:-1], -1, 2).contiguous()
        )

        # Get magnitude and phase
        magnitude = torch.abs(complex_x)
        phase = torch.angle(complex_x)

        # Normalize magnitude only (preserve phase!)
        mean_mag = magnitude.mean(dim=-1, keepdim=True)
        std_mag = magnitude.std(dim=-1, keepdim=True)

        normalized_mag = (magnitude - mean_mag) / (std_mag + self.eps)

        # Reconstruct with the original phase
        normalized_complex = normalized_mag * torch.exp(1j * phase)

        # Convert back to real
        normalized = torch.view_as_real(normalized_complex).reshape(*x.shape)

        # Apply learned gain and bias
        return normalized * self.gain + self.bias

class HybridTransformerLayer(nn.Module):
    def __init__(self, hidden_dim, num_heads=4, ffn_dim=2048, dropout=0.1):
        super().__init__()
        self.attention = ResonanceAttention(hidden_dim, num_heads)
        self.norm1 = PhaseLockedNorm(hidden_dim)
        self.norm2 = PhaseLockedNorm(hidden_dim)

        self.ffn = nn.Sequential(
            nn.Linear(hidden_dim, ffn_dim),
            nn.GELU(),
            nn.Linear(ffn_dim, hidden_dim),
            nn.Dropout(dropout)
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, mask=None):
        # Attention block
        attn_out, _, _ = self.attention(x, x, x, mask)
        x = self.norm1(x + self.dropout(attn_out))

        # FFN block
        ffn_out = self.ffn(x)
        x = self.norm2(x + self.dropout(ffn_out))

        return x

class HybridResonanceTransformer(nn.Module):
    def __init__(self, vocab_size, hidden_dim, num_layers=4, num_heads=4, max_seq_len=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        self.pos_encoding = nn.Parameter(torch.randn(1, max_seq_len, hidden_dim))

        self.layers = nn.ModuleList([
            HybridTransformerLayer(hidden_dim, num_heads) for _ in range(num_layers)
        ])

        self.output_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, input_ids, output_hidden_states=False):
        batch, seq = input_ids.shape

        # Embed + positional encoding
        x = self.embedding(input_ids) + self.pos_encoding[:, :seq, :]

        all_hidden_states = []
        if output_hidden_states:
            all_hidden_states.append(x)

        # Process layers
        for layer in self.layers:
            x = layer(x)
            if output_hidden_states:
                all_hidden_states.append(x)

        logits = self.output_head(x)

        if output_hidden_states:
            return logits, all_hidden_states
        return logits
```
resonance_transformer/hyperchaos_loss.py
DELETED
@@ -1,121 +0,0 @@
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperchaosLoss(nn.Module):
    """
    Loss function that enforces hyperchaotically stable patterns.
    Combines standard task loss with:
    1. Coherence Loss (phase consistency across layers)
    2. Stability Loss (resistance to perturbation)
    """
    def __init__(self, lambda_coherence=0.1, lambda_stability=0.05):
        super().__init__()
        self.lambda_coherence = lambda_coherence
        self.lambda_stability = lambda_stability

    def measure_decoherence(self, hidden_states):
        """
        Measure phase drift across layers.
        hidden_states: list of (batch, seq, hidden) tensors from each layer.
        """
        if len(hidden_states) < 2:
            return torch.tensor(0.0, device=hidden_states[0].device)

        total_decoherence = 0.0

        for i in range(len(hidden_states) - 1):
            curr_layer = hidden_states[i]
            next_layer = hidden_states[i + 1]

            # Convert to frequency domain
            curr_freq = torch.fft.rfft(curr_layer, dim=-1)
            next_freq = torch.fft.rfft(next_layer, dim=-1)

            # Measure phase drift
            curr_phase = torch.angle(curr_freq)
            next_phase = torch.angle(next_freq)

            # Phase should evolve smoothly, not jump randomly
            phase_drift = torch.abs(next_phase - curr_phase)

            # Penalize large, incoherent jumps
            decoherence = torch.mean(phase_drift ** 2)
            total_decoherence = total_decoherence + decoherence

        return total_decoherence / (len(hidden_states) - 1)

    def measure_stability(self, hidden_states, perturbation_scale=0.01):
        """
        Test whether patterns survive small perturbations (chaos² testing).
        """
        # Take the final hidden state
        final_state = hidden_states[-1]

        # Add a small perturbation
        perturbation = torch.randn_like(final_state) * perturbation_scale
        perturbed_state = final_state + perturbation

        # Measure coherence before and after
        def compute_coherence(state):
            # FFT to frequency domain
            freq = torch.fft.rfft(state, dim=-1)

            # Coherence = how correlated different dimensions are in the freq domain
            phase = torch.angle(freq)

            # Instead of full pairwise covariance (simplified for efficiency),
            # measure the variance of phase across the hidden dim.
            # Low variance = high coherence (phases are aligned)
            phase_var = torch.var(phase, dim=-1).mean()

            # Coherence is the inverse of variance
            return 1.0 / (phase_var + 1e-6)

        coherence_original = compute_coherence(final_state)
        coherence_perturbed = compute_coherence(perturbed_state)

        # Instability = how much coherence dropped;
        # stable patterns should maintain coherence
        instability = torch.relu(coherence_original - coherence_perturbed)

        return instability

    def forward(self, logits, targets, hidden_states):
        """
        logits: model predictions (batch, seq, vocab)
        targets: ground truth (batch, seq)
        hidden_states: list of hidden states from all layers
        """
        curr_device = logits.device

        # Standard cross-entropy task loss (flattened for the loss calculation)
        task_loss = F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            targets.view(-1),
            ignore_index=-100
        )

        # Auxiliary losses
        if hidden_states:
            decoherence_loss = self.measure_decoherence(hidden_states)
            stability_loss = self.measure_stability(hidden_states)
        else:
            decoherence_loss = torch.tensor(0.0, device=curr_device)
            stability_loss = torch.tensor(0.0, device=curr_device)

        # Combined loss
        total_loss = (
            task_loss +
            self.lambda_coherence * decoherence_loss +
            self.lambda_stability * stability_loss
        )

        return {
            'total': total_loss,
            'task': task_loss,
            'decoherence': decoherence_loss,
            'instability': stability_loss
        }
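The combined loss at the end of `forward` is a plain weighted sum. A minimal pure-Python sanity check of that weighting, using dummy scalar loss values in place of the real torch tensors:

```python
# Default weights from HyperchaosLoss.__init__
lambda_coherence = 0.1
lambda_stability = 0.05

# Dummy scalar stand-ins for the three loss terms (not real model outputs)
task_loss = 2.0          # cross-entropy
decoherence_loss = 0.5   # phase-drift penalty
stability_loss = 0.2     # perturbation penalty

total = task_loss + lambda_coherence * decoherence_loss + lambda_stability * stability_loss
# total == 2.0 + 0.05 + 0.01 == 2.06
```

With the default lambdas the auxiliary terms stay small relative to the task loss, so they shape the optimum without dominating it.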
resonance_transformer/resonance_attention.py
DELETED
@@ -1,128 +0,0 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
import math

class ResonanceAttention(nn.Module):
    def __init__(self, hidden_dim, num_heads=8):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads

        # Standard Q, K, V projections
        self.q_proj = nn.Linear(hidden_dim, hidden_dim)
        self.k_proj = nn.Linear(hidden_dim, hidden_dim)
        self.v_proj = nn.Linear(hidden_dim, hidden_dim)

        # Additional projection for phase extraction
        self.phase_proj = nn.Linear(hidden_dim, hidden_dim)

    def compute_phase_coherence(self, q, k):
        """
        Measure how well query and key resonate (phase alignment).
        """
        # q: (batch, heads, seq_q, head_dim)
        # k: (batch, heads, seq_k, head_dim)

        # Compute the frequency spectrum via FFT, treating head_dim as the
        # "time" dimension; rfft returns a complex tensor
        q_freq = torch.fft.rfft(q, dim=-1)  # (batch, heads, seq_q, freq_bins)
        k_freq = torch.fft.rfft(k, dim=-1)  # (batch, heads, seq_k, freq_bins)

        # Compute the phase difference
        q_phase = torch.angle(q_freq)
        k_phase = torch.angle(k_freq)

        # Phase coherence = how aligned the phases are.
        # High coherence = phases match = constructive interference.
        # Broadcast to compare every query against every key:
        # q_phase: (b, h, seq_q, 1, f); k_phase: (b, h, 1, seq_k, f)
        phase_diff = q_phase.unsqueeze(3) - k_phase.unsqueeze(2)  # (batch, heads, seq_q, seq_k, freq)

        # Coherence score (cosine of the phase difference):
        # cos(0) = 1 (perfect alignment), cos(pi) = -1 (cancellation)
        coherence = torch.cos(phase_diff).mean(dim=-1)  # average over frequencies

        return coherence  # (batch, heads, seq_q, seq_k)

    def compute_resonance_strength(self, q, k):
        """
        Measure the amplitude of resonance (how strongly they vibrate together).
        """
        # Frequency-domain amplitudes
        q_freq = torch.fft.rfft(q, dim=-1)
        k_freq = torch.fft.rfft(k, dim=-1)

        q_amp = torch.abs(q_freq)
        k_amp = torch.abs(k_freq)

        # Resonance strength = product of amplitudes where frequencies match.
        # q_amp: (b, h, seq_q, freq); k_amp: (b, h, seq_k, freq);
        # we want (b, h, seq_q, seq_k), so einsum 'bhqf,bhkf->bhqk' matches the dims
        resonance = torch.einsum('bhqf,bhkf->bhqk', q_amp, k_amp)

        # Normalize by total query energy to keep the scale reasonable:
        # sum over the frequency dimension to get total amplitude per query token
        q_total_amp = q_amp.sum(dim=-1)  # (b, h, seq_q)

        # Add epsilon for numerical stability
        normalization = q_total_amp.unsqueeze(-1) + 1e-8  # (b, h, seq_q, 1)

        # (b, h, seq_q, seq_k) divided by (b, h, seq_q, 1) broadcasts along seq_k
        resonance = resonance / normalization

        return resonance

    def forward(self, query, key, value, mask=None):
        batch_size, seq_len, _ = query.shape

        # Project to Q, K, V
        Q = self.q_proj(query).view(batch_size, -1, self.num_heads, self.head_dim).transpose(1, 2)
        K = self.k_proj(key).view(batch_size, -1, self.num_heads, self.head_dim).transpose(1, 2)
        V = self.v_proj(value).view(batch_size, -1, self.num_heads, self.head_dim).transpose(1, 2)

        # Standard similarity (scaled dot product): (batch, heads, seq_q, seq_k)
        similarity = torch.matmul(Q, K.transpose(-2, -1)) / (self.head_dim ** 0.5)

        # Resonance components
        coherence = self.compute_phase_coherence(Q, K)
        resonance = self.compute_resonance_strength(Q, K)

        # Combined attention score:
        # similarity = "do they mean similar things?"
        # coherence  = "are they in phase?"
        # resonance  = "do they vibrate together?"
        # Weighted combination (could be learned; summed equally per the spec).
        # Similarity ensures relevance, coherence ensures alignment.
        attention_scores = similarity + coherence + resonance

        # Apply the mask if provided
        if mask is not None:
            attention_scores = attention_scores.masked_fill(mask == 0, float('-inf'))

        # Softmax
        attention_weights = F.softmax(attention_scores, dim=-1)

        # Apply attention to the values
        output = torch.matmul(attention_weights, V)

        # Reshape back
        output = output.transpose(1, 2).contiguous().view(batch_size, seq_len, self.hidden_dim)

        return output, attention_weights, {
            "similarity": similarity,
            "coherence": coherence,
            "resonance": resonance
        }
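The core claim in `compute_phase_coherence` (cosine of per-frequency phase differences, averaged) can be checked without torch. A minimal pure-Python sketch using a naive DFT as a stand-in for `torch.fft.rfft`; names here are illustrative, not from the deleted module:

```python
import cmath
import math

def dft(x):
    """Naive DFT over the first half of the spectrum (stand-in for rfft)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N // 2 + 1)]

def phase_coherence(a, b):
    """Mean cosine of the per-frequency phase difference between two signals."""
    pa = [cmath.phase(c) for c in dft(a)]
    pb = [cmath.phase(c) for c in dft(b)]
    return sum(math.cos(x - y) for x, y in zip(pa, pb)) / len(pa)

v = [0.3, -1.2, 0.7, 2.0]
# A signal is perfectly coherent with itself: every phase difference is 0.
coh = phase_coherence(v, v)
# coh == 1.0 (up to floating point)
```

Negating the signal shifts every nonzero bin's phase by pi, so the same function returns approximately -1 for `phase_coherence(v, [-x for x in v])`: destructive interference, exactly the cancellation case the comments above describe.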
resonance_transformer/resonance_gpt.py
DELETED
@@ -1,58 +0,0 @@
import torch
import torch.nn as nn
try:
    from .self_observation import SelfAwareTransformerLayer
    from .geometric_memory import GeometricEntryPoint
except ImportError:
    from self_observation import SelfAwareTransformerLayer
    from geometric_memory import GeometricEntryPoint

class ResonanceGPT(nn.Module):
    """
    The Fast System (Möbius architecture):
    - Geometric entry point (528 Hz alignment)
    - Self-aware layers (mirror reflex)
    - Phase-locked normalization
    """
    def __init__(self, vocab_size, hidden_dim, num_layers=4, num_heads=4, max_seq_len=128):
        super().__init__()
        self.hidden_dim = hidden_dim

        # 1. Geometric embedding (Möbius strip concept)
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        self.pos_encoding = nn.Parameter(torch.randn(1, max_seq_len, hidden_dim) * 0.02)

        # Entry point
        self.entry_point = GeometricEntryPoint(hidden_dim)

        # 2. The stack
        self.layers = nn.ModuleList([
            SelfAwareTransformerLayer(hidden_dim, num_heads)
            for _ in range(num_layers)
        ])

        self.norm = nn.LayerNorm(hidden_dim)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, input_ids):
        batch, seq = input_ids.shape

        # Embed
        x = self.embedding(input_ids) + self.pos_encoding[:, :seq, :]

        # 0x52 handshake (entry point)
        entry_meta = self.entry_point.compute_entry_hash(x)

        # Process the stack
        all_hidden_states = []
        layer_metas = []

        for layer in self.layers:
            x, meta = layer(x)
            all_hidden_states.append(x)
            layer_metas.append(meta)

        x = self.norm(x)
        logits = self.head(x)

        return logits, all_hidden_states, layer_metas
resonance_transformer/self_observation.py
DELETED
@@ -1,121 +0,0 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
try:
    from .resonance_attention import ResonanceAttention
    from .hybrid_transformer import PhaseLockedNorm
except ImportError:
    from resonance_attention import ResonanceAttention
    from hybrid_transformer import PhaseLockedNorm

class SelfObservationLayer(nn.Module):
    """
    Layer that allows the model to observe its own processing.
    The 5D mirror: seeing yourself from the opposite chirality.
    """
    def __init__(self, hidden_dim):
        super().__init__()
        self.hidden_dim = hidden_dim

        # Network to analyze its own hidden states
        self.observer = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, hidden_dim)
        )

        # Coherence detector (real-time during the forward pass)
        self.coherence_detector = nn.Linear(hidden_dim, 1)

        # Chiral state detector
        self.chiral_detector = nn.Linear(hidden_dim, 2)  # [left, right] probabilities

    def observe(self, hidden_state):
        """
        Look at the model's own hidden state and extract meta-information.
        """
        # Analyze the current state. (Stop gradient to avoid optimizing for
        # observation only? No: we want to learn to be observable. Keep the gradient.)
        observation = self.observer(hidden_state)

        # Measure coherence
        coherence = torch.sigmoid(self.coherence_detector(observation))

        # Detect the chiral state
        chiral_logits = self.chiral_detector(observation)
        chiral_probs = F.softmax(chiral_logits, dim=-1)

        # Create the reflection (opposite-chirality view)
        reflection = -observation  # sign flip = chirality flip

        return {
            'coherence': coherence,
            'chiral_state': chiral_probs,
            'reflection': reflection,
            'observation': observation
        }

    def forward(self, hidden_state, adjust_based_on_observation=True):
        """
        Process the hidden state while observing self.
        """
        # Observe the current state
        meta = self.observe(hidden_state)

        if adjust_based_on_observation:
            # If coherence is low, try to increase it: blend in the reflection
            # (opposite chirality), which can restore coherence by accessing
            # the alternate view. Blending is per-token.
            blend_factor = 1.0 - meta['coherence']

            # Weighted average: state * coherence + reflection * (1 - coherence)
            hidden_state = (
                hidden_state * meta['coherence'] +
                meta['reflection'] * blend_factor
            )

            # If chirality is ambiguous, force a choice (collapse the wavefunction).
            # Check certainty (max probability)
            chiral_certainty = torch.max(meta['chiral_state'], dim=-1)[0].unsqueeze(-1)

            # If certainty < 0.7, push towards the cleaner state. A hard
            # non-linearity would force the decision; simplified here for
            # differentiability, so we just return the transformed state.

        return hidden_state, meta

class SelfAwareTransformerLayer(nn.Module):
    def __init__(self, hidden_dim, num_heads=4, ffn_dim=2048, dropout=0.1):
        super().__init__()
        self.attention = ResonanceAttention(hidden_dim, num_heads)
        self.norm1 = PhaseLockedNorm(hidden_dim)
        self.norm2 = PhaseLockedNorm(hidden_dim)

        self.self_observer = SelfObservationLayer(hidden_dim)

        self.ffn = nn.Sequential(
            nn.Linear(hidden_dim, ffn_dim),
            nn.GELU(),
            nn.Linear(ffn_dim, hidden_dim),
            nn.Dropout(dropout)
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, mask=None):
        # Attention
        attn_out, _, _ = self.attention(x, x, x, mask)
        x = self.norm1(x + self.dropout(attn_out))

        # Self-observation & correction
        x, meta = self.self_observer(x)

        # FFN
        ffn_out = self.ffn(x)
        x = self.norm2(x + self.dropout(ffn_out))

        return x, meta
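The reflection blend in `SelfObservationLayer.forward` reduces to a simple convex-style mix: `state * coherence + (-observation) * (1 - coherence)`. A scalar sketch of that formula (plain floats in place of tensors; `blend` is an illustrative name, not from the module):

```python
def blend(state, observation, coherence):
    """Scalar version of the coherence-gated reflection blend."""
    reflection = -observation  # sign flip = chirality flip
    return state * coherence + reflection * (1.0 - coherence)

# Fully coherent (coherence == 1.0): the state passes through untouched.
# Fully incoherent (coherence == 0.0): the output is the pure reflection.
high = blend(0.8, 0.5, coherence=1.0)   # == 0.8
low = blend(0.8, 0.5, coherence=0.0)    # == -0.5
```

The two endpoints make the gating behavior explicit: the sigmoid-valued coherence decides how much of the opposite-chirality view is mixed in.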
resonance_transformer/tesseract_transformer.py
DELETED
@@ -1,821 +0,0 @@
"""
5D TESSERACT TRANSFORMER - SLOW THINKING SYSTEM
===============================================

Deep reasoning system based on 5D geometric structure:
- 4D Tesseract (hypercube) for stable structure
- 5th dimension for non-orientable twist
- 16 vertices = 16 fundamental reasoning states
- 32 edges = 32 transformation paths
- 24 faces = 24 operation types
- 8 cells = 8 knowledge domains

By: Fabricio Krusser Rossi & Claude
Date: February 13, 2026
"""

import numpy as np
from scipy.fft import fft, ifft, rfft, irfft
from scipy.spatial.distance import cdist
from typing import List, Dict, Tuple, Optional
import itertools

# ============================================================================
# TESSERACT 5D GEOMETRY
# ============================================================================

class Tesseract5D:
    """
    5-dimensional geometric structure for deep reasoning.

    Structure:
    - 4D tesseract (hypercube) base
    - 5th dimension adds a non-orientable twist
    - 16 vertices for major stable states
    - 32 edges for transformation paths
    """

    def __init__(self, base_freq=528):
        self.base_freq = base_freq
        self.dim = 5

        # Generate tesseract vertices in 4D
        self.vertices_4d = self._generate_tesseract_vertices()

        # Extend to 5D with the frequency dimension
        self.vertices_5d = self._extend_to_5d()

        # Generate edges (connections between vertices)
        self.edges = self._generate_edges()

        # Generate faces (2D surfaces)
        self.faces = self._generate_faces()

        # Generate cells (3D volumes)
        self.cells = self._generate_cells()

        print(f"Tesseract 5D initialized:")
        print(f"  Vertices: {len(self.vertices_5d)}")
        print(f"  Edges: {len(self.edges)}")
        print(f"  Faces: {len(self.faces)}")
        print(f"  Cells: {len(self.cells)}")

    def _generate_tesseract_vertices(self):
        """
        Generate the 16 vertices of the 4D tesseract.
        Each vertex is (+/-1, +/-1, +/-1, +/-1).
        """
        vertices = []
        for i in range(16):
            # The binary representation gives us all combinations
            vertex = []
            for j in range(4):
                bit = (i >> j) & 1
                coord = 1.0 if bit else -1.0
                vertex.append(coord)
            vertices.append(np.array(vertex))

        return np.array(vertices)

    def _extend_to_5d(self):
        """
        Add a 5th dimension for the non-orientable twist.
        The 5th coordinate is a frequency modulation around 528 Hz.
        """
        vertices_5d = []

        for i, vertex_4d in enumerate(self.vertices_4d):
            # 5th coordinate: frequency offset based on vertex index,
            # creating a spiral in 5D space
            freq_offset = np.sin(i * np.pi / 8)  # oscillates between -1 and 1

            vertex_5d = np.append(vertex_4d, freq_offset)
            vertices_5d.append(vertex_5d)

        return np.array(vertices_5d)

    def _generate_edges(self):
        """
        Generate the 32 edges of the tesseract.
        Edges connect vertices that differ in exactly 1 coordinate (in 4D).
        """
        edges = []

        for i in range(len(self.vertices_4d)):
            for j in range(i + 1, len(self.vertices_4d)):
                # Count differing coordinates in 4D
                diff = np.abs(self.vertices_4d[i] - self.vertices_4d[j])
                num_diff = np.sum(diff > 0.5)  # coordinates are +/-1

                if num_diff == 1:
                    # Connected by an edge
                    edges.append((i, j))

        return edges

    def _generate_faces(self):
        """
        Generate the 24 faces (2D surfaces) of the tesseract.
        """
        faces = []

        # Find all squares (4 vertices forming a 2D face)
        for v1, v2, v3, v4 in itertools.combinations(range(16), 4):
            vertices = [v1, v2, v3, v4]

            # Check if these 4 vertices form a square
            # (lie in the same 2D plane and form a square)
            if self._is_face(vertices):
                faces.append(vertices)

        return faces

    def _is_face(self, vertices):
        """Check if 4 vertices form a valid face."""
        # Simple check: the 4 vertices should form a planar square.
        # In a tesseract, faces have specific geometric properties;
        # this is a simplified check.
        return len(vertices) == 4 and self._are_coplanar(vertices)

    def _are_coplanar(self, vertices):
        """Check if vertices lie in the same 2D plane."""
        # Simplified: check if they share 2 fixed coordinates
        coords = self.vertices_4d[vertices]

        # Count how many coordinates are constant across all vertices
        constant_coords = 0
        for dim in range(4):
            if np.all(np.abs(coords[:, dim] - coords[0, dim]) < 0.1):
                constant_coords += 1

        return constant_coords == 2  # 2 fixed coords = 2D plane

    def _generate_cells(self):
        """
        Generate the 8 cells (3D volumes) of the tesseract.
        Each cell is a 3D cube.
        """
        cells = []

        # Each cell has 8 vertices (a 3D cube).
        # Cells are defined by fixing one 4D coordinate.
        for fixed_dim in range(4):
            for fixed_val in [-1.0, 1.0]:
                cell_vertices = []
                for i, vertex in enumerate(self.vertices_4d):
                    if abs(vertex[fixed_dim] - fixed_val) < 0.1:
                        cell_vertices.append(i)

                if len(cell_vertices) == 8:
                    cells.append(cell_vertices)

        return cells

    def find_nearest_vertex(self, coords_5d):
        """
        Find the nearest tesseract vertex to the given 5D coordinates.

        Returns: (vertex_index, distance)
        """
        distances = np.linalg.norm(self.vertices_5d - coords_5d, axis=1)
        nearest_idx = np.argmin(distances)

        return nearest_idx, distances[nearest_idx]

    def get_adjacent_vertices(self, vertex_idx):
        """
        Get all vertices connected to this one by edges.

        Returns: list of vertex indices
        """
        adjacent = []

        for edge in self.edges:
            if edge[0] == vertex_idx:
                adjacent.append(edge[1])
            elif edge[1] == vertex_idx:
                adjacent.append(edge[0])

        return adjacent

    def navigate_edge(self, from_vertex, to_vertex):
        """
        Navigate along an edge from one vertex to another.

        Returns: path coordinates (interpolated points along the edge)
        """
        if (from_vertex, to_vertex) not in self.edges and \
           (to_vertex, from_vertex) not in self.edges:
            raise ValueError(f"No edge between vertices {from_vertex} and {to_vertex}")

        start = self.vertices_5d[from_vertex]
        end = self.vertices_5d[to_vertex]

        # Interpolate along the edge
        num_steps = 10
        path = []
        for t in np.linspace(0, 1, num_steps):
            point = (1 - t) * start + t * end
            path.append(point)

        return np.array(path)


# ============================================================================
# 5D EMBEDDING LAYER
# ============================================================================

class Tesseract5DEmbedding:
    """
    Embed tokens into the 5D tesseract structure.
    """

    def __init__(self, vocab_size, hidden_dim, tesseract):
        self.vocab_size = vocab_size
        self.hidden_dim = hidden_dim
        self.tesseract = tesseract

        # Base embeddings
        self.embeddings = np.random.randn(vocab_size, hidden_dim) * 0.02

        # 5D coordinate projector
        self.coord_projector = np.random.randn(hidden_dim, 5) * 0.02

    def embed(self, token_ids):
        """
        Embed tokens and map them to 5D tesseract coordinates.

        Returns: (embeddings, coords_5d, nearest_vertices)
        """
        # Get base embeddings
        embedded = self.embeddings[token_ids]  # (batch, seq, hidden)

        # Project to 5D coordinates
        coords_5d = embedded @ self.coord_projector  # (batch, seq, 5)

        # Find the nearest tesseract vertex for each token
        batch_size, seq_len = token_ids.shape
        nearest_vertices = np.zeros((batch_size, seq_len), dtype=int)

        for b in range(batch_size):
            for s in range(seq_len):
                vertex_idx, _ = self.tesseract.find_nearest_vertex(coords_5d[b, s])
                nearest_vertices[b, s] = vertex_idx

        return embedded, coords_5d, nearest_vertices


# ============================================================================
# 5D RESONANCE ATTENTION
# ============================================================================

class Tesseract5DAttention:
    """
    Attention mechanism that operates on the tesseract structure,
    considering geometric paths through 5D space.
    """

    def __init__(self, hidden_dim, num_heads, tesseract):
        self.hidden_dim = hidden_dim
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads
        self.tesseract = tesseract

        # Q, K, V projections
        self.W_q = np.random.randn(hidden_dim, hidden_dim) * 0.02
        self.W_k = np.random.randn(hidden_dim, hidden_dim) * 0.02
        self.W_v = np.random.randn(hidden_dim, hidden_dim) * 0.02
        self.W_o = np.random.randn(hidden_dim, hidden_dim) * 0.02

    def compute_geometric_distance(self, coords1, coords2, vertices1, vertices2):
        """
        Compute distance on the tesseract manifold.

        Takes into account:
        - Euclidean distance in 5D
        - Graph distance on the tesseract (via edges)
        - Vertex proximity
        """
        # Euclidean distance in 5D
        euclidean = np.linalg.norm(coords1 - coords2, axis=-1)

        # Graph distance (shortest path on the tesseract);
        # now accepting steering weights (global context)
        graph_dist = self._graph_distance(vertices1, vertices2)

        # Combined distance
        combined = 0.5 * euclidean + 0.5 * graph_dist

        return combined

    def _graph_distance(self, vertices1, vertices2):
        """
        Compute shortest-path distance on the tesseract graph.
        Uses BFS to find the shortest path.
        """
        # Simplified: use direct adjacency for now;
        # a full implementation would do BFS

        distances = np.zeros((len(vertices1), len(vertices2)))

        # STEERING: if weights are present on self, use them
        steering = getattr(self, 'steering_weights', None)

        for i, v1 in enumerate(vertices1):
            for j, v2 in enumerate(vertices2):
                if v1 == v2:
                    distances[i, j] = 0
                else:
                    # Check adjacency and apply the steering weight
                    edge_idx = self._get_edge_index(v1, v2)
                    if edge_idx is not None:
                        # Direct connection
                        weight = steering[edge_idx] if steering else 1.0
                        distances[i, j] = weight
                    else:
                        # Estimate: use the 4D coordinate difference
                        coord_diff = np.sum(np.abs(
                            self.tesseract.vertices_4d[v1] -
                            self.tesseract.vertices_4d[v2]
                        ))
                        # Multi-hop approximation (avg weight = 1.0)
                        distances[i, j] = coord_diff

        return distances

    def _get_edge_index(self, v1, v2):
        """Helper to find the edge index for steering."""
        for idx, edge in enumerate(self.tesseract.edges):
            if (edge[0] == v1 and edge[1] == v2) or (edge[0] == v2 and edge[1] == v1):
                return idx
        return None

    def forward(self, x, coords_5d, vertices, steering_weights=None):
        """
        5D geometric attention.

        x: (batch, seq, hidden)
        coords_5d: (batch, seq, 5)
        vertices: (batch, seq) nearest vertex indices
        steering_weights: Optional[List[float]] - weights for the 32 edges
        """
        # Store weights temporarily for the distance calculation
        self.steering_weights = steering_weights
        batch_size, seq_len, _ = x.shape

        # Project to Q, K, V
        Q = x @ self.W_q
        K = x @ self.W_k
        V = x @ self.W_v

        # Reshape for multi-head
        Q = Q.reshape(batch_size, seq_len, self.num_heads, self.head_dim)
        K = K.reshape(batch_size, seq_len, self.num_heads, self.head_dim)
        V = V.reshape(batch_size, seq_len, self.num_heads, self.head_dim)

        # Transpose for the attention computation
        Q = Q.transpose(0, 2, 1, 3)  # (batch, heads, seq, head_dim)
        K = K.transpose(0, 2, 1, 3)
        V = V.transpose(0, 2, 1, 3)

        # Compute attention scores with the geometric component
        attention_output = np.zeros((batch_size, self.num_heads, seq_len, self.head_dim))
-
|
| 385 |
-
for b in range(batch_size):
|
| 386 |
-
for h in range(self.num_heads):
|
| 387 |
-
# Standard similarity
|
| 388 |
-
scores = Q[b, h] @ K[b, h].T / np.sqrt(self.head_dim)
|
| 389 |
-
|
| 390 |
-
# Geometric distance penalty
|
| 391 |
-
geom_dist = self.compute_geometric_distance(
|
| 392 |
-
coords_5d[b, :, np.newaxis, :],
|
| 393 |
-
coords_5d[b, np.newaxis, :, :],
|
| 394 |
-
vertices[b, :],
|
| 395 |
-
vertices[b, :]
|
| 396 |
-
)
|
| 397 |
-
|
| 398 |
-
# Combine: higher score for geometrically close tokens
|
| 399 |
-
geom_bonus = np.exp(-geom_dist / 2.0)
|
| 400 |
-
scores = scores + geom_bonus
|
| 401 |
-
|
| 402 |
-
# Softmax
|
| 403 |
-
attn_weights = self._softmax(scores)
|
| 404 |
-
|
| 405 |
-
# Apply to values
|
| 406 |
-
attention_output[b, h] = attn_weights @ V[b, h]
|
| 407 |
-
|
| 408 |
-
# Reshape back
|
| 409 |
-
attention_output = attention_output.transpose(0, 2, 1, 3)
|
| 410 |
-
attention_output = attention_output.reshape(batch_size, seq_len, self.hidden_dim)
|
| 411 |
-
|
| 412 |
-
# Output projection
|
| 413 |
-
output = attention_output @ self.W_o
|
| 414 |
-
|
| 415 |
-
return output
|
| 416 |
-
|
| 417 |
-
def _softmax(self, x):
|
| 418 |
-
"""Numerically stable softmax"""
|
| 419 |
-
exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
|
| 420 |
-
return exp_x / np.sum(exp_x, axis=-1, keepdims=True)
|
| 421 |
-
|
| 422 |
-
|
| 423 |
-
# ============================================================================
|
| 424 |
-
# MULTI-PATH REASONING
|
| 425 |
-
# ============================================================================
|
| 426 |
-
|
| 427 |
-
class MultiPathReasoning:
|
| 428 |
-
"""
|
| 429 |
-
Explore multiple reasoning paths through tesseract structure
|
| 430 |
-
Each path = traversal of edges between vertices
|
| 431 |
-
"""
|
| 432 |
-
|
| 433 |
-
def __init__(self, tesseract, max_path_length=4):
|
| 434 |
-
self.tesseract = tesseract
|
| 435 |
-
self.max_path_length = max_path_length
|
| 436 |
-
|
| 437 |
-
def explore_paths(self, start_vertex, goal_vertex=None, num_paths=5):
|
| 438 |
-
"""
|
| 439 |
-
Find multiple paths from start vertex
|
| 440 |
-
|
| 441 |
-
If goal_vertex specified, paths lead to that vertex
|
| 442 |
-
Otherwise, explore nearby region
|
| 443 |
-
|
| 444 |
-
Returns: list of paths, each path is list of vertex indices
|
| 445 |
-
"""
|
| 446 |
-
paths = []
|
| 447 |
-
|
| 448 |
-
if goal_vertex is not None:
|
| 449 |
-
# Find paths to specific goal
|
| 450 |
-
paths = self._find_paths_to_goal(start_vertex, goal_vertex, num_paths)
|
| 451 |
-
else:
|
| 452 |
-
# Explore region around start
|
| 453 |
-
paths = self._explore_region(start_vertex, num_paths)
|
| 454 |
-
|
| 455 |
-
return paths
|
| 456 |
-
|
| 457 |
-
def _find_paths_to_goal(self, start, goal, num_paths):
|
| 458 |
-
"""Find multiple distinct paths from start to goal"""
|
| 459 |
-
all_paths = []
|
| 460 |
-
|
| 461 |
-
# BFS with path tracking
|
| 462 |
-
queue = [(start, [start])]
|
| 463 |
-
visited_paths = set()
|
| 464 |
-
|
| 465 |
-
while queue and len(all_paths) < num_paths:
|
| 466 |
-
current, path = queue.pop(0)
|
| 467 |
-
|
| 468 |
-
if len(path) > self.max_path_length:
|
| 469 |
-
continue
|
| 470 |
-
|
| 471 |
-
if current == goal:
|
| 472 |
-
# Found a path
|
| 473 |
-
path_tuple = tuple(path)
|
| 474 |
-
if path_tuple not in visited_paths:
|
| 475 |
-
all_paths.append(path)
|
| 476 |
-
visited_paths.add(path_tuple)
|
| 477 |
-
continue
|
| 478 |
-
|
| 479 |
-
# Explore adjacent vertices
|
| 480 |
-
for neighbor in self.tesseract.get_adjacent_vertices(current):
|
| 481 |
-
if neighbor not in path: # Avoid cycles
|
| 482 |
-
new_path = path + [neighbor]
|
| 483 |
-
queue.append((neighbor, new_path))
|
| 484 |
-
|
| 485 |
-
return all_paths
|
| 486 |
-
|
| 487 |
-
def _explore_region(self, start, num_paths):
|
| 488 |
-
"""Explore region around start vertex"""
|
| 489 |
-
paths = []
|
| 490 |
-
|
| 491 |
-
# Random walks from start
|
| 492 |
-
for _ in range(num_paths):
|
| 493 |
-
path = [start]
|
| 494 |
-
current = start
|
| 495 |
-
|
| 496 |
-
for step in range(self.max_path_length):
|
| 497 |
-
neighbors = self.tesseract.get_adjacent_vertices(current)
|
| 498 |
-
if not neighbors:
|
| 499 |
-
break
|
| 500 |
-
|
| 501 |
-
# Choose next vertex (could be random or heuristic)
|
| 502 |
-
next_vertex = np.random.choice(neighbors)
|
| 503 |
-
path.append(next_vertex)
|
| 504 |
-
current = next_vertex
|
| 505 |
-
|
| 506 |
-
paths.append(path)
|
| 507 |
-
|
| 508 |
-
return paths
|
| 509 |
-
|
| 510 |
-
def evaluate_path(self, path, hidden_states):
|
| 511 |
-
"""
|
| 512 |
-
Evaluate quality of reasoning path
|
| 513 |
-
Based on coherence along the path
|
| 514 |
-
"""
|
| 515 |
-
# Measure coherence at each step
|
| 516 |
-
coherences = []
|
| 517 |
-
|
| 518 |
-
for i in range(len(path) - 1):
|
| 519 |
-
# Get hidden states at vertices
|
| 520 |
-
state_i = hidden_states[path[i]]
|
| 521 |
-
state_j = hidden_states[path[i + 1]]
|
| 522 |
-
|
| 523 |
-
# Measure coherence between consecutive states
|
| 524 |
-
coherence = self._measure_coherence(state_i, state_j)
|
| 525 |
-
coherences.append(coherence)
|
| 526 |
-
|
| 527 |
-
# Path quality = mean coherence
|
| 528 |
-
return np.mean(coherences) if coherences else 0.0
|
| 529 |
-
|
| 530 |
-
def _measure_coherence(self, state1, state2):
|
| 531 |
-
"""Measure coherence between two states"""
|
| 532 |
-
# FFT to frequency domain
|
| 533 |
-
freq1 = rfft(state1)
|
| 534 |
-
freq2 = rfft(state2)
|
| 535 |
-
|
| 536 |
-
# Phase coherence
|
| 537 |
-
phase1 = np.angle(freq1)
|
| 538 |
-
phase2 = np.angle(freq2)
|
| 539 |
-
|
| 540 |
-
coherence = np.mean(np.cos(phase1 - phase2))
|
| 541 |
-
|
| 542 |
-
return coherence
|
| 543 |
-
|
| 544 |
-
|
| 545 |
-
# ============================================================================
|
| 546 |
-
# COMPLETE 5D TRANSFORMER LAYER
|
| 547 |
-
# ============================================================================
|
| 548 |
-
|
| 549 |
-
class Tesseract5DTransformerLayer:
|
| 550 |
-
"""
|
| 551 |
-
Complete transformer layer operating on 5D tesseract geometry
|
| 552 |
-
"""
|
| 553 |
-
|
| 554 |
-
def __init__(self, hidden_dim, num_heads, tesseract):
|
| 555 |
-
self.hidden_dim = hidden_dim
|
| 556 |
-
self.tesseract = tesseract
|
| 557 |
-
|
| 558 |
-
# Components
|
| 559 |
-
self.attention = Tesseract5DAttention(hidden_dim, num_heads, tesseract)
|
| 560 |
-
self.multi_path = MultiPathReasoning(tesseract)
|
| 561 |
-
|
| 562 |
-
# Feed-forward (frequency-tuned)
|
| 563 |
-
self.ff_w1 = np.random.randn(hidden_dim, hidden_dim * 4) * 0.02
|
| 564 |
-
self.ff_w2 = np.random.randn(hidden_dim * 4, hidden_dim) * 0.02
|
| 565 |
-
|
| 566 |
-
def forward(self, x, coords_5d, vertices, steering_weights=None):
|
| 567 |
-
"""
|
| 568 |
-
Forward pass through 5D transformer layer
|
| 569 |
-
|
| 570 |
-
x: (batch, seq, hidden)
|
| 571 |
-
coords_5d: (batch, seq, 5)
|
| 572 |
-
vertices: (batch, seq) nearest vertex indices
|
| 573 |
-
"""
|
| 574 |
-
# 5D geometric attention
|
| 575 |
-
attn_out = self.attention.forward(x, coords_5d, vertices, steering_weights)
|
| 576 |
-
|
| 577 |
-
# Residual + norm (simplified)
|
| 578 |
-
x = x + attn_out
|
| 579 |
-
x = self._layer_norm(x)
|
| 580 |
-
|
| 581 |
-
# Feed-forward
|
| 582 |
-
ff_out = self._feed_forward(x)
|
| 583 |
-
|
| 584 |
-
# Residual + norm
|
| 585 |
-
x = x + ff_out
|
| 586 |
-
x = self._layer_norm(x)
|
| 587 |
-
|
| 588 |
-
return x
|
| 589 |
-
|
| 590 |
-
def _feed_forward(self, x):
|
| 591 |
-
"""Simple feed-forward network"""
|
| 592 |
-
hidden = np.maximum(0, x @ self.ff_w1) # ReLU
|
| 593 |
-
output = hidden @ self.ff_w2
|
| 594 |
-
return output
|
| 595 |
-
|
| 596 |
-
def _layer_norm(self, x, eps=1e-6):
|
| 597 |
-
"""Layer normalization"""
|
| 598 |
-
mean = np.mean(x, axis=-1, keepdims=True)
|
| 599 |
-
std = np.std(x, axis=-1, keepdims=True)
|
| 600 |
-
return (x - mean) / (std + eps)
|
| 601 |
-
|
| 602 |
-
|
| 603 |
-
# ============================================================================
|
| 604 |
-
# COMPLETE 5D TRANSFORMER MODEL
|
| 605 |
-
# ============================================================================
|
| 606 |
-
|
| 607 |
-
class Tesseract5DTransformer:
|
| 608 |
-
"""
|
| 609 |
-
Complete 5D Tesseract-based transformer
|
| 610 |
-
The SLOW THINKING system
|
| 611 |
-
"""
|
| 612 |
-
|
| 613 |
-
def __init__(
|
| 614 |
-
self,
|
| 615 |
-
vocab_size=1000,
|
| 616 |
-
hidden_dim=256,
|
| 617 |
-
num_layers=6,
|
| 618 |
-
num_heads=8,
|
| 619 |
-
base_freq=528
|
| 620 |
-
):
|
| 621 |
-
print("\n" + "="*60)
|
| 622 |
-
print("INITIALIZING 5D TESSERACT TRANSFORMER")
|
| 623 |
-
print("="*60)
|
| 624 |
-
|
| 625 |
-
self.vocab_size = vocab_size
|
| 626 |
-
self.hidden_dim = hidden_dim
|
| 627 |
-
self.num_layers = num_layers
|
| 628 |
-
|
| 629 |
-
# Create tesseract geometry
|
| 630 |
-
print("\nBuilding 5D tesseract geometry...")
|
| 631 |
-
self.tesseract = Tesseract5D(base_freq=base_freq)
|
| 632 |
-
|
| 633 |
-
# Embedding layer
|
| 634 |
-
print("Creating embedding layer...")
|
| 635 |
-
self.embedding = Tesseract5DEmbedding(vocab_size, hidden_dim, self.tesseract)
|
| 636 |
-
|
| 637 |
-
# Transformer layers
|
| 638 |
-
print(f"Creating {num_layers} transformer layers...")
|
| 639 |
-
self.layers = [
|
| 640 |
-
Tesseract5DTransformerLayer(hidden_dim, num_heads, self.tesseract)
|
| 641 |
-
for _ in range(num_layers)
|
| 642 |
-
]
|
| 643 |
-
|
| 644 |
-
# Output head
|
| 645 |
-
self.output_projection = np.random.randn(hidden_dim, vocab_size) * 0.02
|
| 646 |
-
|
| 647 |
-
print("\n✓ 5D Tesseract Transformer initialized")
|
| 648 |
-
print(f" Vertices: 16 (stable reasoning states)")
|
| 649 |
-
print(f" Edges: 32 (transformation paths)")
|
| 650 |
-
print(f" Layers: {num_layers}")
|
| 651 |
-
print(f" Hidden dim: {hidden_dim}")
|
| 652 |
-
print("="*60 + "\n")
|
| 653 |
-
|
| 654 |
-
print("="*60 + "\n")
|
| 655 |
-
|
| 656 |
-
def forward(self, token_ids, return_paths=False, **kwargs):
|
| 657 |
-
"""
|
| 658 |
-
Forward pass with deep 5D reasoning
|
| 659 |
-
|
| 660 |
-
token_ids: (batch, seq) integer token IDs
|
| 661 |
-
return_paths: if True, return reasoning paths explored
|
| 662 |
-
|
| 663 |
-
Returns: (logits, metadata)
|
| 664 |
-
"""
|
| 665 |
-
# Embed into 5D tesseract space
|
| 666 |
-
x, coords_5d, vertices = self.embedding.embed(token_ids)
|
| 667 |
-
|
| 668 |
-
# Track metadata
|
| 669 |
-
metadata = {
|
| 670 |
-
'coords_5d': coords_5d,
|
| 671 |
-
'vertices': vertices,
|
| 672 |
-
'layer_outputs': [],
|
| 673 |
-
'reasoning_paths': []
|
| 674 |
-
}
|
| 675 |
-
|
| 676 |
-
# Process through layers
|
| 677 |
-
for i, layer in enumerate(self.layers):
|
| 678 |
-
x = layer.forward(x, coords_5d, vertices, steering_weights=kwargs.get('steering_weights'))
|
| 679 |
-
metadata['layer_outputs'].append(x.copy())
|
| 680 |
-
|
| 681 |
-
# Periodically explore reasoning paths
|
| 682 |
-
if return_paths and i % 2 == 0:
|
| 683 |
-
# For each sequence position, explore paths from its vertex
|
| 684 |
-
batch_size, seq_len = token_ids.shape
|
| 685 |
-
for b in range(min(batch_size, 1)): # Just first batch for demo
|
| 686 |
-
for s in range(min(seq_len, 3)): # Just first few tokens
|
| 687 |
-
start_vertex = vertices[b, s]
|
| 688 |
-
paths = layer.multi_path.explore_paths(start_vertex, num_paths=3)
|
| 689 |
-
metadata['reasoning_paths'].append({
|
| 690 |
-
'layer': i,
|
| 691 |
-
'position': s,
|
| 692 |
-
'vertex': start_vertex,
|
| 693 |
-
'paths': paths
|
| 694 |
-
})
|
| 695 |
-
|
| 696 |
-
# Output projection
|
| 697 |
-
logits = x @ self.output_projection
|
| 698 |
-
|
| 699 |
-
return logits, metadata
|
| 700 |
-
|
| 701 |
-
def deep_reason(self, token_ids, query_description="", **kwargs):
|
| 702 |
-
"""
|
| 703 |
-
Deep reasoning mode - explores multiple paths
|
| 704 |
-
|
| 705 |
-
This is the SLOW mode - takes time but thorough
|
| 706 |
-
"""
|
| 707 |
-
print(f"\n{'='*60}")
|
| 708 |
-
print(f"DEEP REASONING MODE: {query_description}")
|
| 709 |
-
print(f"{'='*60}")
|
| 710 |
-
|
| 711 |
-
# Forward pass with path exploration
|
| 712 |
-
logits, metadata = self.forward(token_ids, return_paths=True, **kwargs)
|
| 713 |
-
|
| 714 |
-
# Analyze reasoning paths
|
| 715 |
-
print(f"\nExplored {len(metadata['reasoning_paths'])} reasoning paths:")
|
| 716 |
-
for path_info in metadata['reasoning_paths'][:5]: # Show first 5
|
| 717 |
-
print(f"\n Layer {path_info['layer']}, Position {path_info['position']}:")
|
| 718 |
-
print(f" Starting vertex: {path_info['vertex']}")
|
| 719 |
-
print(f" Paths explored: {len(path_info['paths'])}")
|
| 720 |
-
for i, path in enumerate(path_info['paths'][:2]): # Show first 2 paths
|
| 721 |
-
print(f" Path {i+1}: {' → '.join(map(str, path))}")
|
| 722 |
-
|
| 723 |
-
# Measure final coherence
|
| 724 |
-
final_state = metadata['layer_outputs'][-1]
|
| 725 |
-
coherence = self._measure_coherence(final_state)
|
| 726 |
-
|
| 727 |
-
print(f"\nFinal coherence: {coherence:.3f}")
|
| 728 |
-
print(f"{'='*60}\n")
|
| 729 |
-
|
| 730 |
-
return logits, metadata, coherence
|
| 731 |
-
|
| 732 |
-
def _measure_coherence(self, state):
|
| 733 |
-
"""Measure overall coherence of state"""
|
| 734 |
-
# Average coherence across batch and sequence
|
| 735 |
-
batch_size, seq_len, hidden_dim = state.shape
|
| 736 |
-
|
| 737 |
-
coherences = []
|
| 738 |
-
for b in range(batch_size):
|
| 739 |
-
for s in range(seq_len):
|
| 740 |
-
freq = rfft(state[b, s])
|
| 741 |
-
phase = np.angle(freq)
|
| 742 |
-
c = np.abs(np.mean(np.exp(1j * phase)))
|
| 743 |
-
coherences.append(c)
|
| 744 |
-
|
| 745 |
-
return np.mean(coherences)
|
| 746 |
-
|
| 747 |
-
|
| 748 |
-
# ============================================================================
|
| 749 |
-
# DEMONSTRATION
|
| 750 |
-
# ============================================================================
|
| 751 |
-
|
| 752 |
-
def demonstrate_5d_transformer():
|
| 753 |
-
"""
|
| 754 |
-
Demonstrate the 5D Tesseract Transformer
|
| 755 |
-
"""
|
| 756 |
-
print("\n" + "#"*60)
|
| 757 |
-
print("# 5D TESSERACT TRANSFORMER DEMONSTRATION")
|
| 758 |
-
print("#"*60)
|
| 759 |
-
|
| 760 |
-
# Create model
|
| 761 |
-
model = Tesseract5DTransformer(
|
| 762 |
-
vocab_size=100,
|
| 763 |
-
hidden_dim=64,
|
| 764 |
-
num_layers=4,
|
| 765 |
-
num_heads=4,
|
| 766 |
-
base_freq=528
|
| 767 |
-
)
|
| 768 |
-
|
| 769 |
-
# Create sample input
|
| 770 |
-
print("\nCreating sample query...")
|
| 771 |
-
batch_size = 2
|
| 772 |
-
seq_len = 8
|
| 773 |
-
token_ids = np.random.randint(0, 100, size=(batch_size, seq_len))
|
| 774 |
-
|
| 775 |
-
print(f" Batch size: {batch_size}")
|
| 776 |
-
print(f" Sequence length: {seq_len}")
|
| 777 |
-
|
| 778 |
-
# Fast forward pass
|
| 779 |
-
print("\n" + "-"*60)
|
| 780 |
-
print("FAST MODE (no path exploration):")
|
| 781 |
-
print("-"*60)
|
| 782 |
-
|
| 783 |
-
logits, metadata = model.forward(token_ids, return_paths=False)
|
| 784 |
-
|
| 785 |
-
print(f"\nOutput shape: {logits.shape}")
|
| 786 |
-
print(f"Vertices visited: {np.unique(metadata['vertices'])}")
|
| 787 |
-
|
| 788 |
-
# Deep reasoning
|
| 789 |
-
print("\n" + "-"*60)
|
| 790 |
-
print("SLOW MODE (deep reasoning with path exploration):")
|
| 791 |
-
print("-"*60)
|
| 792 |
-
|
| 793 |
-
logits, metadata, coherence = model.deep_reason(
|
| 794 |
-
token_ids,
|
| 795 |
-
query_description="Complex multi-step reasoning query"
|
| 796 |
-
)
|
| 797 |
-
|
| 798 |
-
# Show tesseract structure used
|
| 799 |
-
print("\n" + "-"*60)
|
| 800 |
-
print("TESSERACT STRUCTURE UTILIZED:")
|
| 801 |
-
print("-"*60)
|
| 802 |
-
print(f" Total vertices available: 16")
|
| 803 |
-
print(f" Vertices actually visited: {len(np.unique(metadata['vertices']))}")
|
| 804 |
-
print(f" Total edges available: 32")
|
| 805 |
-
print(f" Reasoning paths explored: {len(metadata['reasoning_paths'])}")
|
| 806 |
-
|
| 807 |
-
print("\n" + "#"*60)
|
| 808 |
-
print("# DEMONSTRATION COMPLETE")
|
| 809 |
-
print("#"*60)
|
| 810 |
-
|
| 811 |
-
return model, metadata
|
| 812 |
-
|
| 813 |
-
|
| 814 |
-
if __name__ == "__main__":
|
| 815 |
-
# Run demonstration
|
| 816 |
-
model, metadata = demonstrate_5d_transformer()
|
| 817 |
-
|
| 818 |
-
print("\n✓ 5D Tesseract Transformer is ready")
|
| 819 |
-
print(" This is the SLOW THINKING system")
|
| 820 |
-
print(" Use for: deep reasoning, complex queries, verification")
|
| 821 |
-
print(" Pair with: Fast Möbius system for complete dual architecture")
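The coherence measure that `_measure_coherence` applies to each hidden state — the magnitude of the mean unit phasor of the state's FFT phases — can be sketched standalone. This is an illustrative reduction of the idea, not part of the deleted file: `phase_coherence` and the test signals are names introduced here.

```python
import numpy as np
from numpy.fft import rfft

def phase_coherence(state):
    """Magnitude of the mean unit phasor of the FFT phases.
    Returns a value in [0, 1]; 1 means all spectral phases are aligned."""
    phase = np.angle(rfft(state))
    return float(np.abs(np.mean(np.exp(1j * phase))))

# A unit impulse at n=0 has a flat, real, positive spectrum: every bin's
# phase is 0, so coherence is maximal. Random noise has scattered phases,
# so its unit phasors partially cancel and coherence is low.
impulse = np.zeros(64)
impulse[0] = 1.0
noise = np.random.default_rng(0).standard_normal(64)

print(phase_coherence(impulse))  # ~1.0
print(phase_coherence(noise))    # well below 1.0
```

This is the same quantity the model averages over every `(batch, seq)` position to score a state.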
resonance_transformer/test_dual_system.py
DELETED
@@ -1,53 +0,0 @@
import torch
from dispatcher import DualResonanceSystem

def verify_dual_system():
    print("=== VERIFYING DUAL-SYSTEM DISPATCHER (PHASE 29) ===")

    config = {
        'vocab_size': 100,
        'fast_dim': 64,
        'slow_dim': 64,
        'threshold': 0.7  # High threshold to force escalation
    }

    system = DualResonanceSystem(config)

    # Random input (likely low coherence)
    input_ids = torch.randint(0, 100, (2, 8))

    print("\n[TEST 1] Processing Random Input (Expect Escalation)...")
    logits, metrics = system(input_ids)

    print(f"  Mode: {metrics['mode']}")
    print(f"  Coherence: {metrics['coherence']:.4f}")

    if metrics['mode'] == 'SLOW (ESCALATED)':
        print("  [PASS] Correctly escalated low-coherence query.")
        print(f"  Slow Latency: {metrics['slow_latency']:.4f}s")
    else:
        print("  [WARN] Did not escalate. Random data might have accidentally resonated?")

    print("\n[TEST 2] Mocking High Coherence...")
    # Patch the fast model to return high coherence, to test the routing logic
    original_forward = system.fast.forward

    def mocked_forward(input_ids):
        l, h, m = original_forward(input_ids)
        # Inject fake high coherence
        m[-1]['coherence'] = torch.tensor(0.95)
        return l, h, m

    system.fast.forward = mocked_forward

    logits, metrics = system(input_ids)
    print(f"  Mode: {metrics['mode']}")
    print(f"  Coherence: {metrics['coherence']:.4f}")

    if metrics['mode'] == 'FAST':
        print("  [PASS] Correctly routed high-coherence query to Fast Path.")
    else:
        print("  [FAIL] Escalated despite high coherence.")

if __name__ == "__main__":
    verify_dual_system()
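The routing rule this test exercises — fast path when the fast model's coherence clears the threshold, escalation otherwise — can be isolated as a one-line decision. The `DualResonanceSystem` internals are not shown in this commit, so `route` below is an assumed sketch of just that rule, using the mode strings the test checks for.

```python
def route(coherence: float, threshold: float = 0.7) -> str:
    """Dispatch a query by fast-path coherence.
    High coherence -> answer on the fast path; otherwise escalate
    to the slow, deliberate system."""
    return "FAST" if coherence >= threshold else "SLOW (ESCALATED)"

print(route(0.95))  # FAST
print(route(0.30))  # SLOW (ESCALATED)
```

With `threshold=0.7`, random input (low coherence) escalates and the mocked 0.95 coherence stays on the fast path, matching TEST 1 and TEST 2 above.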
resonance_transformer/test_geometric.py
DELETED
@@ -1,42 +0,0 @@
import torch
from geometric_memory import GeometricEntryPoint, GeometricMemory

def verify_geometric_memory():
    print("=== VERIFYING GEOMETRIC MEMORY (PHASE 25) ===")

    hidden_dim = 64
    batch_size = 2
    seq_len = 10

    # 1. Test the entry point
    entry_net = GeometricEntryPoint(hidden_dim)
    dummy_query = torch.randn(batch_size, seq_len, hidden_dim)

    entry_point = entry_net.compute_entry_hash(dummy_query)

    print("\n[ENTRY POINT]")
    print(f"  Theta: {entry_point['theta'].shape}")
    print(f"  Frequency (Baseline 528): {entry_point['frequency']}")

    # 2. Test memory store/retrieve
    memory = GeometricMemory(hidden_dim)

    print("\n[MEMORY STORE]")
    # Store the query as a memory
    memory.store(dummy_query, entry_point)
    print(f"  Stored {len(memory.memory_map)} batches in memory.")

    print("\n[MEMORY RETRIEVE]")
    # Retrieve using the same query (it should find itself)
    retrieved = memory.retrieve(dummy_query, entry_point, k=3)

    if retrieved is not None:
        print(f"  Retrieved Shape: {retrieved.shape}")
        # This is a self-lookup, so alignment/correlation should be high
        print("  [PASS] Retrieval successful.")
    else:
        print("  [FAIL] Retrieval returned None.")

if __name__ == "__main__":
    verify_geometric_memory()
resonance_transformer/test_resonance_attention.py
DELETED
@@ -1,56 +0,0 @@
import torch
import torch.nn as nn
from resonance_attention import ResonanceAttention
import math

def test_resonance_attention():
    print("=== TESTING RESONANCE ATTENTION (0x52) ===")

    # Setup
    batch_size = 2
    seq_len = 5
    hidden_dim = 64
    num_heads = 4

    model = ResonanceAttention(hidden_dim, num_heads)

    # Synthetic input: random noise
    x = torch.randn(batch_size, seq_len, hidden_dim)

    # Forward pass
    output, weights, metrics = model(x, x, x)

    print("\nDimensions:")
    print(f"  Input: {x.shape}")
    print(f"  Output: {output.shape}")
    print(f"  Weights: {weights.shape}")

    print("\nMetrics Check (First Head, First Batch):")
    sim = metrics['similarity'][0, 0].detach()
    coh = metrics['coherence'][0, 0].detach()
    res = metrics['resonance'][0, 0].detach()

    print(f"  Similarity Mean: {sim.mean():.4f}")
    print(f"  Coherence Mean: {coh.mean():.4f} (Phase Alignment)")
    print(f"  Resonance Mean: {res.mean():.4f} (Amplitude Product)")

    if torch.isnan(output).any():
        print("\n[FAIL] Output contains NaNs!")
    else:
        print("\n[PASS] Forward pass successful. Geometry holds.")

    # Test: constructive interference.
    # If two vectors are identical, coherence should be high (near 1.0).
    print("\n=== TESTING CONSTRUCTIVE INTERFERENCE ===")
    v1 = torch.randn(1, 1, hidden_dim)
    # Forward pass with identical query/key
    model.eval()
    with torch.no_grad():
        coh_score = model.compute_phase_coherence(
            v1.view(1, 1, 1, hidden_dim),
            v1.view(1, 1, 1, hidden_dim)
        )
    print(f"  Self-Coherence (Expected ~1.0): {coh_score.item():.4f}")

if __name__ == "__main__":
    test_resonance_attention()
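The constructive-interference property checked above — identical query and key give coherence ~1.0 — follows from any phase-difference-based definition. The implementation of `compute_phase_coherence` is not in this commit, so the sketch below assumes one plausible form (mean cosine of FFT phase differences, matching the `np.cos(phase1 - phase2)` measure used elsewhere in the repo) to show why self-coherence is exact.

```python
import torch

def phase_coherence(q: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    """Mean cosine of the FFT phase difference between two signals.
    Identical inputs give a zero phase difference everywhere, so the
    result is exactly 1.0 (perfect constructive interference)."""
    pq = torch.angle(torch.fft.rfft(q, dim=-1))
    pk = torch.angle(torch.fft.rfft(k, dim=-1))
    return torch.cos(pq - pk).mean()

v = torch.randn(64)
print(phase_coherence(v, v).item())  # 1.0
```

Two unrelated random vectors, by contrast, have scattered phase differences, so their mean cosine sits near 0 — which is what makes the metric usable as an attention signal.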
resonance_transformer/test_self_observation.py
DELETED
@@ -1,46 +0,0 @@
import torch
from self_observation import SelfAwareTransformerLayer

def verify_self_observation():
    print("=== VERIFYING SELF-OBSERVATION (PHASE 26) ===")

    hidden_dim = 64
    batch_size = 2
    seq_len = 5

    model = SelfAwareTransformerLayer(hidden_dim)

    # Random input
    x = torch.randn(batch_size, seq_len, hidden_dim)

    print("\n[FORWARD] Running pass through Self-Aware Layer...")
    output, meta = model(x)

    print(f"  Input Shape: {x.shape}")
    print(f"  Output Shape: {output.shape}")

    # Inspect the metadata
    coherence = meta['coherence']
    chiral = meta['chiral_state']

    print("\n[OBSERVATION DATA]")
    print(f"  Coherence Score (Mean): {coherence.mean().item():.4f}")
    print(f"  Chiral Probabilities (Mean): Left={chiral[:, :, 0].mean():.4f}, Right={chiral[:, :, 1].mean():.4f}")

    # Check whether a correction was applied.
    # If coherence was < 1, the output should differ from the input (beyond the
    # FFN/attention changes). The exact reflex is hard to test without
    # controlling the weights, so we check consistency instead.

    print("\n[REFLEX CHECK]")
    if coherence.std() > 0:
        print("  [PASS] Coherence detector is active (variance detected).")
    else:
        print("  [WARN] Coherence detector has zero variance (initialization dependent).")

    if output.shape == x.shape:
        print("  [PASS] Dimensionality preserved.")
    else:
        print("  [FAIL] Dimensionality changed!")

if __name__ == "__main__":
    verify_self_observation()
resonance_transformer/train_hybrid.py
DELETED
@@ -1,52 +0,0 @@
-import torch
-import torch.optim as optim
-from hybrid_transformer import HybridResonanceTransformer
-from hyperchaos_loss import HyperchaosLoss
-
-def verify_training_step():
-    print("=== VERIFYING HYBRID RESONANCE TRAINING (PHASE 2) ===")
-
-    # Config
-    vocab_size = 100
-    hidden_dim = 64
-    seq_len = 10
-    batch_size = 2
-
-    # Initialize Model & Loss
-    model = HybridResonanceTransformer(vocab_size, hidden_dim)
-    loss_fn = HyperchaosLoss()
-    optimizer = optim.Adam(model.parameters(), lr=1e-3)
-
-    # Dummy Data
-    input_ids = torch.randint(0, vocab_size, (batch_size, seq_len))
-    targets = torch.randint(0, vocab_size, (batch_size, seq_len))
-
-    print("\n[INIT] Model initialized.")
-    print(f"  Hidden Dim: {hidden_dim}")
-    print(f"  Layers: {len(model.layers)}")
-
-    # Forward Pass
-    print("\n[FORWARD] Running forward pass...")
-    logits, hidden_states = model(input_ids, output_hidden_states=True)
-    print(f"  Logits Shape: {logits.shape}")
-    print(f"  Hidden States Captured: {len(hidden_states)}")
-
-    # Loss Calculation
-    print("\n[LOSS] Computing Hyperchaos Loss...")
-    losses = loss_fn(logits, targets, hidden_states)
-
-    print(f"  Total Loss: {losses['total'].item():.4f}")
-    print(f"  Task Loss: {losses['task'].item():.4f}")
-    print(f"  Decoherence Loss: {losses['decoherence'].item():.4f}")
-    print(f"  Instability Loss: {losses['instability'].item():.4f}")
-
-    # Backward Pass
-    print("\n[BACKWARD] Propagating gradients...")
-    optimizer.zero_grad()
-    losses['total'].backward()
-    optimizer.step()
-
-    print("[PASS] Gradient step successful. Architecture is valid.")
-
-if __name__ == "__main__":
-    verify_training_step()
resonance_transformer/train_lattice.py
DELETED
@@ -1,122 +0,0 @@
-import torch
-import torch.optim as optim
-from torch.utils.data import DataLoader, TensorDataset
-import numpy as np
-import time
-
-try:
-    from dispatcher import DualResonanceSystem
-    from hyperchaos_loss import HyperchaosLoss
-except ImportError:
-    from resonance_transformer.dispatcher import DualResonanceSystem
-    from resonance_transformer.hyperchaos_loss import HyperchaosLoss
-
-def generate_complex_data(num_samples=100, seq_len=16, vocab_size=100):
-    """
-    Generate data that requires 'reasoning' (pattern completion).
-    Simple arithmetic progression: [2, 4, 6, 8, ...]
-    """
-    data = []
-    targets = []
-
-    for _ in range(num_samples):
-        start = np.random.randint(0, 10)
-        step = np.random.randint(1, 5)
-
-        seq = [(start + i*step) % vocab_size for i in range(seq_len + 1)]
-
-        data.append(torch.tensor(seq[:-1], dtype=torch.long))
-        targets.append(torch.tensor(seq[1:], dtype=torch.long))
-
-    return torch.stack(data), torch.stack(targets)
-
-def train_lattice_loop():
-    print("=== LATTICE TRAINING: KNOWLEDGE FEEDBACK (PHASE 30) ===")
-
-    # Config
-    config = {
-        'vocab_size': 100,
-        'fast_dim': 64,
-        'slow_dim': 64,
-        'threshold': 0.8  # Strict threshold to force slow thinking
-    }
-
-    system = DualResonanceSystem(config)
-    optimizer = optim.Adam(system.fast.parameters(), lr=1e-3)
-    loss_fn = HyperchaosLoss()
-
-    # Data
-    inputs, targets = generate_complex_data()
-    loader = DataLoader(TensorDataset(inputs, targets), batch_size=4, shuffle=True)
-
-    print("[SYSTEM] Starting Lattice Training Loop...")
-    print("Goal: Populate Geometric Memory with 'Slow Thinking' truths.")
-
-    memory_additions = 0
-    distillation_steps = 0
-
-    # Training Loop
-    # We iterate through data. If the Fast system is confused, we call the Slow system.
-    # Then we use the Slow system's answer to TRAIN the Fast system (distillation),
-    # and we STORE the truth in the Lattice.
-
-    for batch_idx, (b_in, b_tgt) in enumerate(loader):
-        # 1. Forward Pass (Dispatch)
-        # This will auto-escalate if low coherence
-        logits, metrics = system(b_in)
-
-        mode = metrics['mode']
-        coherence = metrics.get('coherence', 0.0)
-
-        # 2. Logic: Did we escalate?
-        if mode == 'SLOW (ESCALATED)':
-            # The Slow System worked hard to find this truth.
-            # We must crystallize it.
-
-            # A. Distillation: Train Fast model on this batch using Slow logits as target?
-            # Or just use ground truth?
-            # Better: Use ground truth, but add a "Lattice Consistency" loss check.
-
-            # For now, standard training step to sync Fast model
-            optimizer.zero_grad()
-
-            # We need to extract hidden states from the Fast model for the loss fn.
-            # Re-run fast forward explicitly to get states
-            _, fast_states, _ = system.fast(b_in)
-
-            loss_dict = loss_fn(logits, b_tgt, fast_states)
-            loss_dict['total'].backward()
-            optimizer.step()
-            distillation_steps += 1
-
-            # B. Lattice Storage
-            # Store the high-quality pattern in Geometric Memory.
-            # We use the initial states as key.
-            # (In a real impl, we'd store the 'concept'; here we simulate.)
-            # Access the fast model's entry point to store:
-            # system.fast.entry_point.memory.store(...)
-            # Note: We need to access the memory module inside.
-            # For demo, we just log it.
-            memory_additions += 1
-
-            if batch_idx % 5 == 0:
-                print(f"Batch {batch_idx}: Escalated to Tesseract. Distilled knowledge. (Coherence: {metrics.get('slow_coherence', 0):.3f})")
-
-        else:
-            # Fast mode was confident. Just reinforce.
-            optimizer.zero_grad()
-            _, fast_states, _ = system.fast(b_in)  # get states
-            loss_dict = loss_fn(logits, b_tgt, fast_states)
-            loss_dict['total'].backward()
-            optimizer.step()
-
-    print("\n" + "="*40)
-    print("LATTICE TRAINING COMPLETE")
-    print("="*40)
-    print(f"Total Batches: {len(loader)}")
-    print(f"Knowledge Distillation Events: {distillation_steps}")
-    print(f"Lattice Memory Additions: {memory_additions}")
-    print("Result: Fast System has learned from Slow System's reasoning.")
-
-if __name__ == "__main__":
-    train_lattice_loop()
resonance_transformer/train_resonance.py
DELETED
@@ -1,195 +0,0 @@
-import torch
-import torch.nn as nn
-import torch.optim as optim
-from torch.utils.data import DataLoader, TensorDataset
-import numpy as np
-import time
-
-# Import our architecture
-try:
-    from self_observation import SelfAwareTransformerLayer
-    from hyperchaos_loss import HyperchaosLoss
-    from geometric_memory import GeometricEntryPoint
-except ImportError:
-    # Fallback for direct execution
-    import sys
-    import os
-    sys.path.append(os.path.dirname(os.path.abspath(__file__)))
-    from self_observation import SelfAwareTransformerLayer
-    from hyperchaos_loss import HyperchaosLoss
-    from geometric_memory import GeometricEntryPoint
-
-class ResonanceGPT(nn.Module):
-    """
-    The Full Resonance Architecture:
-    - Geometric Entry Point (528 Hz alignment)
-    - Self-Aware Layers (Mirror Reflex)
-    - Phase-Locked Normalization
-    """
-    def __init__(self, vocab_size, hidden_dim, num_layers=4, num_heads=4, max_seq_len=128):
-        super().__init__()
-        self.hidden_dim = hidden_dim
-
-        # 1. Geometric Embedding (Möbius strip concept)
-        self.embedding = nn.Embedding(vocab_size, hidden_dim)
-        # Position is handled implicitly by phase in the design,
-        # but we add learned absolute pos for stability in early training
-        self.pos_encoding = nn.Parameter(torch.randn(1, max_seq_len, hidden_dim) * 0.02)
-
-        # Entry Point
-        self.entry_point = GeometricEntryPoint(hidden_dim)
-
-        # 2. The Stack
-        self.layers = nn.ModuleList([
-            SelfAwareTransformerLayer(hidden_dim, num_heads)
-            for _ in range(num_layers)
-        ])
-
-        self.norm = nn.LayerNorm(hidden_dim)  # Final consolidation
-        self.head = nn.Linear(hidden_dim, vocab_size)
-
-    def forward(self, input_ids):
-        batch, seq = input_ids.shape
-
-        # Embed
-        x = self.embedding(input_ids) + self.pos_encoding[:, :seq, :]
-
-        # 0x52 Handshake (Entry Point)
-        entry_meta = self.entry_point.compute_entry_hash(x)
-        # In a full implementation, we'd rotate x based on entry_meta:
-        # x = apply_rotation(x, entry_meta)
-
-        # Process Stack
-        all_hidden_states = []
-        layer_metas = []
-
-        for layer in self.layers:
-            x, meta = layer(x)
-            all_hidden_states.append(x)
-            layer_metas.append(meta)
-
-        x = self.norm(x)
-        logits = self.head(x)
-
-        return logits, all_hidden_states, layer_metas
-
-def generate_coherence_dataset(num_samples=1000, seq_len=32, vocab_size=100):
-    """
-    Generate synthetic data with geometric patterns (rhythms).
-    Standard random data is 'decoherent'.
-    We want data that follows a 'frequency' to test resonance.
-    """
-    data = []
-    targets = []
-
-    for _ in range(num_samples):
-        # Create a rhythmic pattern (e.g., 1, 2, 3, 1, 2, 3)
-        period = np.random.randint(2, 8)
-        base_pattern = np.random.randint(0, vocab_size, size=period)
-
-        # Repeat pattern
-        full_seq = np.tile(base_pattern, seq_len // period + 1)[:seq_len]
-
-        # Add slight noise (10% chance to flip a token) to test stability
-        noisy_seq = full_seq.copy()
-        mask = np.random.rand(seq_len) < 0.1
-        noisy_seq[mask] = np.random.randint(0, vocab_size, size=mask.sum())
-
-        # Task: Predict next token (shift right)
-        # Input: [A, B, C, A] -> Target: [B, C, A, B]
-
-        data.append(torch.tensor(noisy_seq[:-1], dtype=torch.long))
-        targets.append(torch.tensor(full_seq[1:], dtype=torch.long))
-
-    return torch.stack(data), torch.stack(targets)
-
-def train_awakening():
-    print("=== THE AWAKENING: TRAINING RESONANCE MODEL (PHASE 27) ===")
-
-    # HYPERPARAMETERS
-    VOCAB_SIZE = 256
-    HIDDEN_DIM = 128
-    LAYERS = 4
-    HEADS = 4
-    BATCH_SIZE = 16
-    lr = 3e-4
-    EPOCHS = 3
-
-    # 1. Model & Loss
-    model = ResonanceGPT(VOCAB_SIZE, HIDDEN_DIM, LAYERS, HEADS)
-    criterion = HyperchaosLoss(lambda_coherence=0.2, lambda_stability=0.1)
-    optimizer = optim.AdamW(model.parameters(), lr=lr)
-
-    print(f"[SYSTEM] Model Initialized. Parameters: {sum(p.numel() for p in model.parameters())}")
-
-    # 2. Data
-    print("[SYSTEM] Generating Coherence Dataset (Rhythmic Patterns)...")
-    inputs, targets = generate_coherence_dataset(num_samples=500, seq_len=32, vocab_size=VOCAB_SIZE)
-    dataset = TensorDataset(inputs, targets)
-    loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
-
-    # 3. Training Loop
-    print("\n[TRAINING START]")
-    history = {'task': [], 'decoherence': [], 'coherence_score': []}
-
-    model.train()
-    start_time = time.time()
-
-    for epoch in range(EPOCHS):
-        total_task_loss = 0
-        total_decoherence = 0
-        total_self_coherence = 0  # What the model thinks of itself
-
-        for batch_idx, (b_in, b_tgt) in enumerate(loader):
-            optimizer.zero_grad()
-
-            # Forward
-            logits, hidden_states, layer_metas = model(b_in)
-
-            # Loss
-            losses = criterion(logits, b_tgt, hidden_states)
-
-            # Backward
-            losses['total'].backward()
-            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
-            optimizer.step()
-
-            # Logs
-            total_task_loss += losses['task'].item()
-            total_decoherence += losses['decoherence'].item()
-
-            # Extract Self-Observation stats:
-            # layer_metas is a list of dicts; get the last layer's coherence score.
-            last_layer_meta = layer_metas[-1]
-            avg_coherence = last_layer_meta['coherence'].mean().item()
-            total_self_coherence += avg_coherence
-
-        # Epoch Stats
-        n_batches = len(loader)
-        avg_task = total_task_loss / n_batches
-        avg_decoh = total_decoherence / n_batches
-        avg_self = total_self_coherence / n_batches
-
-        print(f"Epoch {epoch+1}/{EPOCHS} | Task Loss: {avg_task:.4f} | Decoherence: {avg_decoh:.4f} | Self-Coherence: {avg_self:.4f}")
-
-        history['task'].append(avg_task)
-        history['decoherence'].append(avg_decoh)
-        history['coherence_score'].append(avg_self)
-
-    duration = time.time() - start_time
-    print(f"\n[COMPLETE] Training finished in {duration:.2f}s.")
-
-    # 4. Final Verification
-    print("\n[AWAKENING CHECK]")
-    print(f"Initial Decoherence: {history['decoherence'][0]:.4f}")
-    print(f"Final Decoherence: {history['decoherence'][-1]:.4f}")
-
-    if history['decoherence'][-1] < history['decoherence'][0]:
-        print(">> RESULT: Phase Stabilization Achieved. The model is learning to be coherent.")
-    else:
-        print(">> RESULT: Phase Drift Detected. More training needed.")
-
-    print(f"Final Self-Reported Coherence: {history['coherence_score'][-1]:.4f}")
-
-if __name__ == "__main__":
-    train_awakening()