# CASCADE-LATTICE: An Accessible Guide
## From Math Theory to Working AI System
### What Is This?
CASCADE-LATTICE is a system that makes AI transparent and controllable. Think of it like a "flight recorder" for AI decisions—every choice an AI makes is recorded in a way that can't be faked, and humans can pause the AI at any time to override its decisions.
---
## The Core Idea (For Everyone)
Imagine you're teaching a student to solve math problems step-by-step. Each step builds on the last:
```
Step 1: 2 + 3 = 5
Step 2: 5 × 4 = 20
Step 3: 20 - 7 = 13
```
CASCADE-LATTICE watches AI "thinking" the same way:
```
Input: "What's in this image?"
Layer 1: Detect edges
Layer 2: Recognize shapes
Layer 3: Identify objects
Output: "It's a cat"
```
**Two key innovations:**
1. **Provenance**: Every step is cryptographically hashed (think: fingerprinted) and linked to the previous step. This creates an unbreakable chain of evidence.
2. **HOLD**: At critical decision points, the AI pauses and shows you what it's about to do. You can accept it or override with your own choice.
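The hash-linking idea behind provenance can be sketched in a few lines of Python. This is illustrative only: `hashlib` and the `fingerprint` helper stand in for the system's real record format.

```python
import hashlib

def fingerprint(step_data, parent_hash=""):
    # Each step's hash covers its own data AND the previous step's hash
    return hashlib.sha256((parent_hash + step_data).encode()).hexdigest()

h1 = fingerprint("Layer 1: detect edges")
h2 = fingerprint("Layer 2: recognize shapes", parent_hash=h1)
h3 = fingerprint("Layer 3: identify objects", parent_hash=h2)
# Editing any earlier step changes every later fingerprint, so the
# final hash h3 commits to the entire history of steps
```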
---
## The Core Idea (For Data Scientists)
CASCADE-LATTICE maps neural network computation to **Kleene fixed-point iteration**. Here's the mathematical elegance:
### Neural Networks ARE Fixed-Point Computations
A forward pass through a neural network:
```
output = layer_n(layer_{n-1}(...(layer_1(input))))
```
Is equivalent to iterating a function `f` from bottom element `⊥`:
```
⊥ → f(⊥) → f²(⊥) → f³(⊥) → ... → fix(f)
```
Where:
- **Domain**: Activation space (ℝⁿ with pointwise ordering)
- **Function f**: Layer transformation
- **Fixed point**: Final prediction
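The iteration `⊥ → f(⊥) → f²(⊥) → …` can be sketched generically. Here is a toy monotone function on finite sets (one-step graph reachability), not the activation-space construction itself:

```python
def lfp(f, bottom):
    # Kleene iteration: bottom, f(bottom), f(f(bottom)), ... until stable
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

# Monotone step function on sets: nodes reachable in one more hop from 0
edges = {0: {1}, 1: {2}, 2: {2}}
step = lambda s: {0} | {w for v in s for w in edges.get(v, set())}
reachable = lfp(step, set())  # the least fixed point: {0, 1, 2}
```

Because `step` is monotone on a finite lattice of sets, the iteration is guaranteed to stop at the least fixed point.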
### Why This Matters
1. **Monotonicity**: ReLU layers are monotonic functions → guaranteed convergence
2. **Least Fixed Point**: Kleene theorem guarantees we reach the "smallest" valid solution
3. **Provenance = Iteration Trace**: Each step in the chain is a provenance record
### The Provenance Chain
```python
# Each layer creates a record
record = ProvenanceRecord(
    layer_name="transformer.layer.5",
    state_hash=hash(activation),    # H(fⁱ(⊥))
    parent_hashes=[previous_hash],  # H(fⁱ⁻¹(⊥))
    execution_order=i               # Iteration index
)
```
These records form a **Merkle tree**—the root uniquely identifies the entire computation:
```
Merkle Root = M(fix(f))
```
**Cryptographic guarantee**: Different computation → Different root (with probability 1 - 2⁻²⁵⁶)
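A toy illustration of that guarantee, hashing strings with `hashlib` over a hypothetical two-leaf pairing:

```python
import hashlib

def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Two per-layer state hashes and their pairwise parent (a two-leaf "tree")
leaves = [sha(b"layer-1 activation"), sha(b"layer-2 activation")]
root = sha((leaves[0] + leaves[1]).encode())

# Tamper with one recorded state: the root necessarily changes
forged_leaf = sha(b"layer-1 activation (forged)")
forged_root = sha((forged_leaf + leaves[1]).encode())
changed = root != forged_root  # different computation, different root
```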
---
## The Architecture (Everyone)
Think of CASCADE-LATTICE as having three layers:
### Layer 1: OBSERVE
**What it does**: Records everything an AI does
**Analogy**: Like a security camera for AI decisions
**Example**:
```python
# AI makes a decision
result = ai_model.predict(data)
# CASCADE automatically records it
observe("my_ai", {"input": data, "output": result})
```
### Layer 2: HOLD
**What it does**: Pauses AI at decision points
**Analogy**: Like having a "pause button" during a video game where you can see the AI's plan and change it
**Example**:
```python
# AI is about to choose an action
action_probabilities = [0.1, 0.7, 0.2] # 70% sure about action #1
# Pause and show human
resolution = hold.yield_point(
    action_probs=action_probabilities,
    observation=current_state
)
# Human sees: "AI wants action #1 (70% confidence)"
# Human can: Accept, or override with action #0 or #2
```
### Layer 3: LATTICE
**What it does**: Connects multiple AIs into a knowledge network
**Analogy**: Like Wikipedia but for AI experiences—one AI's learnings become available to all others
**Example**:
```python
# Robot A explores a maze
observe("robot_a", {"location": (5, 10), "obstacle": True})
# Robot B later queries and learns from A's experience
past_experiences = query("robot_a")
```
---
## The Architecture (Data Scientists)
### Component Breakdown
```
┌─────────────────────────────────────────────────┐
│              CASCADE-LATTICE Stack              │
├─────────────────────────────────────────────────┤
│                                                 │
│  Application Layer                              │
│  ├─ OBSERVE: Provenance tracking API            │
│  ├─ HOLD: Intervention protocol                 │
│  └─ QUERY: Lattice data retrieval               │
│                                                 │
├─────────────────────────────────────────────────┤
│                                                 │
│  Core Engine                                    │
│  ├─ ProvenanceTracker: Hooks into forward pass  │
│  ├─ ProvenanceChain: Stores iteration sequence  │
│  ├─ MerkleTree: Computes cryptographic root     │
│  └─ HoldSession: Manages decision checkpoints   │
│                                                 │
├─────────────────────────────────────────────────┤
│                                                 │
│  Lattice Network                                │
│  ├─ Storage: JSONL + CBOR persistence           │
│  ├─ Genesis: Network bootstrap (root hash)      │
│  ├─ Identity: Model registry                    │
│  └─ IPLD/IPFS: Content-addressed distribution   │
│                                                 │
└─────────────────────────────────────────────────┘
```
### Data Flow
1. **Capture Phase**:
```python
tracker = ProvenanceTracker(model, model_id="gpt2")
tracker.start_session(input_text)
inputs = tokenizer(input_text, return_tensors="pt")  # usual HF tokenization
output = model(**inputs)  # Hooks fire on each layer
chain = tracker.finalize_session()
```
2. **Hash Computation** (per layer):
```python
# Sample tensor for efficiency
state_hash = SHA256(tensor[:1000].tobytes())
# Link to parent
record = ProvenanceRecord(
    state_hash=state_hash,
    parent_hashes=[previous_hash]
)
```
3. **Merkle Tree Construction**:
```python
def compute_merkle_root(hashes):
    if len(hashes) == 1:
        return hashes[0]
    # Duplicate the last hash on odd-length levels so no leaf is dropped
    if len(hashes) % 2 == 1:
        hashes = hashes + [hashes[-1]]
    # Pairwise hashing
    next_level = [
        SHA256(h1 + h2)
        for h1, h2 in zip(hashes[::2], hashes[1::2])
    ]
    return compute_merkle_root(next_level)
```
4. **Lattice Integration**:
```python
# Link to external systems
chain.link_external(other_system.merkle_root)
# Recompute root (includes external dependencies)
chain.finalize()
```
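The per-layer hash in step 2 can be made concrete with `hashlib` and NumPy. This `hash_tensor` is a plausible sketch of the sampling idea, not necessarily the library's actual implementation:

```python
import hashlib
import numpy as np

def hash_tensor(t: np.ndarray, sample_size: int = 1000) -> str:
    # Hash a fixed-size prefix of the flattened tensor (step 2's sampling)
    flat = np.ascontiguousarray(t).ravel()[:sample_size]
    return hashlib.sha256(flat.tobytes()).hexdigest()

a = np.arange(5000, dtype=np.float32).reshape(50, 100)
b = a.copy()
b[0, 0] += 1.0  # perturb inside the sampled prefix

same = hash_tensor(a) == hash_tensor(a.copy())  # deterministic
diff = hash_tensor(a) != hash_tensor(b)         # change is detected
```

Note the trade-off: a change outside the sampled prefix would go undetected, which is the cost of the efficiency gain.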
### Key Algorithms
**Algorithm: Forward Pass Provenance Tracking**
```
INPUT: Neural network N, input x
OUTPUT: Provenance chain C with Merkle root M
1. Initialize chain C with input_hash = H(x)
2. Set last_hash ← input_hash
3. For each layer fᵢ in N:
   a. Compute activation: aᵢ ← fᵢ(aᵢ₋₁)
   b. Hash activation: hᵢ ← H(aᵢ)
   c. Create record: rᵢ ← (layer=i, hash=hᵢ, parent=last_hash)
   d. Add to chain: C.add(rᵢ)
   e. Update: last_hash ← hᵢ
4. Compute Merkle root: M ← MerkleRoot([h₁, h₂, ..., hₙ])
5. Finalize: C.merkle_root ← M
6. Return C
```
**Complexity**: O(n) for n layers
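The pseudocode translates directly to plain Python. This sketch hashes `repr` strings instead of tensors, and the stand-in "root" simply hashes the concatenated record hashes rather than building a full Merkle tree:

```python
import hashlib

def H(x) -> str:
    return hashlib.sha256(repr(x).encode()).hexdigest()

def track(layers, x):
    # Steps 1-3: hash each activation, link it to the previous hash
    records, last_hash = [], H(x)
    a = x
    for i, f in enumerate(layers):
        a = f(a)
        h = H(a)
        records.append({"layer": i, "hash": h, "parent": last_hash})
        last_hash = h
    # Steps 4-5: stand-in root over all record hashes
    root = H("".join(r["hash"] for r in records))
    return records, root

layers = [lambda v: v + 1, lambda v: v * 2]  # toy 2-layer "network"
records, root = track(layers, 3)
# records[1]["parent"] == records[0]["hash"] — the chain is linked
```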
**Algorithm: Lattice Convergence**
```
INPUT: Set of agents A = {a₁, a₂, ..., aₙ}
OUTPUT: Global fixed point (no new merkle roots)
1. For each agent aᵢ: initialize chain Cᵢ
2. Repeat until convergence:
   a. For each agent aᵢ:
      i. Get neighbor chains: N = {Cⱼ | j ∈ neighbors(i)}
      ii. Extract roots: R = {C.merkle_root | C ∈ N}
      iii. Link external: Cᵢ.external_roots.extend(R)
      iv. Recompute: Cᵢ.finalize()
   b. Check: if no new roots added, break
3. Return lattice state L = {C₁, C₂, ..., Cₙ}
```
**Complexity**: O(n²) worst case (full graph)
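A toy version of the convergence loop, with sets of strings standing in for Merkle roots and a hypothetical three-agent topology:

```python
# Agents start knowing only their own root; each pass absorbs neighbors'
# roots until a full pass adds nothing new — a global fixed point
agents = {"a": {"root_a"}, "b": {"root_b"}, "c": {"root_c"}}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

changed = True
while changed:
    changed = False
    for name, known in agents.items():
        for other in neighbors[name]:
            new = agents[other] - known
            if new:
                known |= new  # mutates the agent's set in place
                changed = True
# On a connected graph, every agent ends up knowing every root
```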
---
## Real-World Examples
### Example 1: Medical AI Oversight
**Scenario**: AI diagnoses medical images
**Everyone version**:
```
1. Doctor uploads X-ray
2. AI analyzes → "90% sure it's pneumonia"
3. HOLD pauses: shows doctor the AI's reasoning
4. Doctor reviews: "Actually, I think it's normal"
5. Doctor overrides → "No pneumonia"
6. Both choices are recorded with proof
```
**Data scientist version**:
```python
# AI processes medical image; start provenance before inference
image_tensor = preprocess(xray_image)
tracker = ProvenanceTracker(medical_ai, model_id="medical_imaging_v1")
tracker.start_session(image_tensor)
diagnosis_probs = medical_ai(image_tensor)
# Provenance captured the internal reasoning
chain = tracker.finalize_session()
print(f"Diagnosis chain: {chain.merkle_root}")
# HOLD for doctor review
resolution = hold.yield_point(
    action_probs=diagnosis_probs,
    observation={"image_id": xray_id},
    action_labels=["Normal", "Pneumonia", "Other"],
    # Pass AI's "reasoning"
    attention=medical_ai.attention_weights[-1].tolist(),
    features={"lung_opacity": 0.8, "consolidation": 0.6}
)
# Doctor overrides
final_diagnosis = resolution.action # May differ from AI
# Both paths recorded; the chain remains verifiable end-to-end
valid, error = verify_chain(chain)
assert valid
```
### Example 2: Autonomous Drone Fleet
**Everyone version**:
```
1. Drone A explores area, finds obstacle
2. Records: "obstacle at (100, 200)"
3. Drone B needs to navigate same area
4. Queries lattice: "Any obstacles near (100, 200)?"
5. Gets Drone A's discovery
6. Avoids obstacle without re-exploring
```
**Data scientist version**:
```python
# Drone A observes
obstacle_detection = drone_a.camera.detect_obstacles()
observe("drone_a", {
    "position": (100, 200),
    "obstacles": obstacle_detection,
    "timestamp": time.time()
})
# Provenance chain created
chain_a = get_latest_chain("drone_a")
print(f"Drone A chain: {chain_a.merkle_root}")
# Drone B queries
past_observations = query("drone_a", filters={
    "position": nearby((100, 200), radius=50)
})
# Drone B integrates knowledge
for obs in past_observations:
    drone_b.add_to_map(obs.data["obstacles"])
# Link chains (creates lattice)
chain_b = drone_b.current_chain
chain_b.link_external(chain_a.merkle_root)
# Now chain_b provably depends on chain_a's data
chain_b.finalize()
```
### Example 3: Financial Trading Algorithm
**Everyone version**:
```
1. Trading AI: "Buy 1000 shares (85% confidence)"
2. Compliance officer sees HOLD notification
3. Reviews: AI reasoning + market context
4. Decision: "No, market too volatile today"
5. Override: Block the trade
6. Audit trail: Both AI suggestion and human override recorded
```
**Data scientist version**:
```python
# Capture provenance around the prediction (start before inference)
market_state = get_market_snapshot()
tracker = ProvenanceTracker(trading_model, model_id="quant_v2.3")
tracker.start_session(market_state)
# Trading model predicts (hooks record each layer)
action_probs = trading_model.predict(market_state)
# [0.05, 0.85, 0.10] → BUY has 85%
chain = tracker.finalize_session()
# HOLD for compliance
resolution = hold.yield_point(
    action_probs=action_probs,
    value=expected_profit,
    observation=market_state,
    action_labels=["SELL", "BUY", "HOLD"],
    # Rich context for human
    features={
        "volatility": market_state.volatility,
        "liquidity": market_state.liquidity,
        "risk_score": 0.7
    },
    reasoning=[
        "Strong momentum signal",
        "Historical pattern match",
        "But: elevated VIX"
    ]
)
# Compliance overrides
final_action = resolution.action # May be HOLD instead of BUY
# Regulatory export
export_chain_for_audit(chain, f"trade_{timestamp}.json")
# Regulator can verify:
valid, error = verify_chain(chain)
assert valid, "Provenance integrity violated!"
```
---
## Why Kleene Fixed Points Matter
### For Everyone
**The Problem**: How do you know an AI is telling the truth about what it did?
**The Solution**: Math guarantees.
When you compute `2 + 2`, the answer is always `4`. It's not a matter of opinion—it's mathematically guaranteed.
CASCADE-LATTICE uses the same kind of mathematical guarantee (called a "fixed point") for AI computations. The AI's decision process must converge to a stable, reproducible result, and that result is cryptographically fingerprinted.
**Translation**: You can verify an AI's work the way you'd verify a math proof.
### For Data Scientists
**The Deep Connection**:
Kleene's fixed-point theorem from 1952 states:
```
For continuous f: D → D over CPO D with bottom ⊥:
fix(f) = ⊔ᵢ₌₀^∞ fⁱ(⊥)
```
Neural networks implement this:
```
# Bottom element: zero initialization
x₀ = zeros(input_shape)
# Kleene iteration: apply layers
x₁ = layer_1(x₀)
x₂ = layer_2(x₁)
...
xₙ = layer_n(xₙ₋₁)
# Fixed point: final output
output = xₙ = fix(compose(layer_n, ..., layer_1))
```
**Why This Is Profound**:
1. **Provenance = Iteration Trace**: Each provenance record is one step in the Kleene chain
2. **Merkle Root = Fixed Point Hash**: The final hash uniquely identifies `fix(f)`
3. **Convergence Guaranteed**: Monotonic layers → guaranteed convergence (no infinite loops)
**Practical Benefit**:
```python
# Two runs with same input
chain_1 = track_provenance(model, input_data)
chain_2 = track_provenance(model, input_data)
# Must produce same Merkle root
assert chain_1.merkle_root == chain_2.merkle_root
# This is NOT just reproducibility—it's mathematical necessity
# Different root → Different computation (provably)
```
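A self-contained stand-in for this determinism check, using string hashing instead of real provenance chains:

```python
import hashlib

def run_and_hash(layers, x):
    # Chain-hash every intermediate value; the final hash plays the
    # role of the Merkle root for this toy computation
    h = hashlib.sha256(repr(x).encode()).hexdigest()
    for f in layers:
        x = f(x)
        h = hashlib.sha256((h + repr(x)).encode()).hexdigest()
    return h

layers = [lambda v: v * 2, lambda v: v - 1]
same = run_and_hash(layers, 10) == run_and_hash(layers, 10)  # same input
diff = run_and_hash(layers, 10) != run_and_hash(layers, 11)  # different input
```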
**Lattice Network = Distributed Fixed Point**:
Each agent computes local fixed point, then exchanges Merkle roots. The lattice itself converges to a global fixed point:
```
Global_State(t+1) = Merge(Global_State(t), New_Observations)
```
This is Kleene iteration on the **space of knowledge graphs**.
---
## Installation & Quick Start
### Everyone Version
1. **Install**:
```bash
pip install cascade-lattice
```
2. **Try the demo**:
```bash
cascade-demo
```
Fly a lunar lander! Press `H` to pause the AI and take control.
3. **Use in your code**:
```python
import cascade
cascade.init()
# Now all AI calls are automatically tracked
```
### Data Scientist Version
1. **Install**:
```bash
pip install cascade-lattice
# With optional dependencies
pip install cascade-lattice[all] # Includes IPFS, demos
```
2. **Manual Provenance Tracking**:
```python
from cascade.core.provenance import ProvenanceTracker
import torch
model = YourPyTorchModel()
tracker = ProvenanceTracker(model, model_id="my_model")
# Start session
session_id = tracker.start_session(input_data)
# Run inference (hooks capture everything)
output = model(input_data)
# Finalize and get chain
chain = tracker.finalize_session(output)
print(f"Merkle Root: {chain.merkle_root}")
print(f"Records: {len(chain.records)}")
print(f"Verified: {chain.verify()[0]}")
```
3. **HOLD Integration**:
```python
from cascade.hold import Hold
import numpy as np
hold = Hold.get()
# In your RL loop
for episode in range(1000):
    state = env.reset()
    done = False
    while not done:
        # Get action probabilities
        action_probs = agent.predict(state)
        # Yield to HOLD
        resolution = hold.yield_point(
            action_probs=action_probs,
            value=agent.value_estimate(state),
            observation={"state": state.tolist()},
            brain_id="rl_agent",
            action_labels=env.action_names
        )
        # Execute (AI or human choice)
        state, reward, done, info = env.step(resolution.action)
```
4. **Query Lattice**:
```python
from cascade.store import observe, query
# Write observations
observe("my_agent", {
    "state": [1, 2, 3],
    "action": 0,
    "reward": 1.5
})
# Query later
history = query("my_agent", limit=100)
for receipt in history:
    print(f"CID: {receipt.cid}")
    print(f"Data: {receipt.data}")
    print(f"Merkle: {receipt.merkle_root}")
```
---
## Performance Considerations
### Everyone Version
**Q: Does CASCADE slow down my AI?**
A: Slightly (5-10% overhead), like how a dashcam uses a tiny bit of your car's power.
**Q: How much storage does it use?**
A: Depends on how much your AI runs. Each decision is a few kilobytes.
### Data Scientist Version
**Overhead Analysis**:
| Operation | Complexity | Typical Latency |
|-----------|-----------|-----------------|
| Hash tensor | O(k) | ~0.1-1ms (k=1000) |
| Merkle tree | O(n) | ~1-5ms (n=50 layers) |
| HOLD pause | O(1) | User-dependent (1-30s) |
| Lattice merge | O(N) | ~10-100ms (N=neighbors) |
**Total Inference Overhead**: ~5-10% latency increase
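The hash-latency row can be sanity-checked with a rough micro-benchmark (numbers vary by hardware; this is a sketch, not a guarantee):

```python
import hashlib
import time
import numpy as np

# A large activation tensor; only the first 1000 elements are hashed
t = np.random.rand(1_000_000).astype(np.float32)

start = time.perf_counter()
digest = hashlib.sha256(t.ravel()[:1000].tobytes()).hexdigest()
elapsed_ms = (time.perf_counter() - start) * 1000
# Sampling keeps the hashed payload at 4 KB regardless of tensor size,
# which is where the sub-millisecond per-layer figure comes from
```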
**Optimization Strategies**:
1. **Tensor Sampling**:
```python
# Don't hash entire tensor
hash_tensor(tensor, sample_size=1000) # First 1000 elements
```
2. **Async Merkle Computation**:
```python
# Finalize chain in background thread
chain.finalize_async()
```
3. **Batch Observations**:
```python
# Group writes to lattice
with observation_batch():
    for step in episode:
        observe("agent", step)
```
4. **Sparse HOLD**:
```python
# Only pause on uncertainty
if max(action_probs) < confidence_threshold:
    resolution = hold.yield_point(...)
```
**Storage Scaling**:
```
# Per-record size
record_size = (
    32 bytes (hash) +
    8 bytes (timestamp) +
    N bytes (metadata)
) ≈ 100-500 bytes
# For 1M inference steps
total_storage = 1M * 500 bytes ≈ 500 MB
```
**Pruning Strategy**:
```python
from datetime import datetime, timedelta

# Archive old chains
if chain.created_at < datetime.now() - timedelta(days=30):
    archive_to_ipfs(chain)
    remove_from_local_lattice(chain)
```
---
## FAQ
### Everyone
**Q: Can CASCADE work with any AI?**
A: Yes! It works with ChatGPT, autonomous robots, game AIs, anything.
**Q: Is my data private?**
A: Yes. Everything stays on your computer unless you explicitly choose to share it.
**Q: What happens if I override the AI?**
A: Both choices (AI's and yours) are recorded. You can later see why you disagreed.
### Data Scientists
**Q: Does CASCADE require modifying model code?**
A: No. It uses PyTorch hooks / framework interceptors. Zero code changes required.
**Q: What about non-PyTorch frameworks?**
A: Supported:
- PyTorch: ✅ (native hooks)
- TensorFlow: ✅ (via tf.Module hooks)
- JAX: ✅ (via jax.jit wrapping)
- HuggingFace: ✅ (transformers integration)
- OpenAI/Anthropic: ✅ (API wrappers)
**Q: How does HOLD integrate with existing RL frameworks?**
A: Drop-in replacement for action sampling:
```python
# Before
action = np.argmax(action_probs)
# After
resolution = hold.yield_point(action_probs=action_probs, ...)
action = resolution.action
```
**Q: Can I use CASCADE with distributed training?**
A: Yes. Each rank tracks its own provenance:
```python
tracker = ProvenanceTracker(
    model,
    model_id=f"ddp_rank_{dist.get_rank()}"
)
```
**Q: What about privacy in the lattice?**
A: Three modes:
1. **Local**: Lattice stays on disk (default)
2. **Private Network**: Share only with trusted nodes
3. **Public**: Publish to IPFS (opt-in)
---
## The Big Picture
### Everyone
CASCADE-LATTICE makes AI systems:
- **Transparent**: See what AI sees
- **Controllable**: Override AI decisions
- **Collaborative**: AIs share knowledge
- **Trustworthy**: Cryptographic proof of actions
**The Vision**: AI systems that humans can audit, control, and trust.
### Data Scientists
CASCADE-LATTICE provides:
- **Formal Semantics**: Kleene fixed points give rigorous meaning to "AI computation"
- **Cryptographic Proofs**: Merkle roots create tamper-evident audit trails
- **Human Agency**: HOLD protocol enables intervention without breaking provenance
- **Collective Intelligence**: Lattice network creates decentralized AI knowledge base
**The Vision**: A future where:
1. Every AI decision is mathematically verifiable
2. Humans can intervene at any decision boundary
3. AI systems form a global knowledge lattice (the "neural internetwork")
4. Governance emerges from cryptographic consensus, not centralized control
---
## Next Steps
### Everyone
1. Try the demo: `cascade-demo`
2. Read the README: `cascade-lattice/README.md`
3. Join the community: [GitHub Issues](https://github.com/Yufok1/cascade-lattice)
### Data Scientists
1. Read the research paper: `docs/RESEARCH_PAPER.md`
2. Explore the codebase:
- `cascade/core/provenance.py` — Kleene iteration engine
- `cascade/hold/session.py` — Intervention protocol
- `cascade/store.py` — Lattice storage
3. Integrate with your models:
```python
from cascade import init
init() # That's it!
```
4. Contribute:
- Optimize Merkle tree construction
- Add new framework integrations
- Build visualization tools
- Extend HOLD protocol
---
## Conclusion
Whether you're a concerned citizen wondering about AI transparency, or a researcher building the next generation of AI systems, CASCADE-LATTICE offers a path forward:
**From Kleene's fixed points in 1952...**
**To cryptographic AI provenance in 2026...**
**To a future where AI and humanity converge on shared truth.**
*"The fixed point is not just computation—it is consensus."*
---
*Guide Version: 1.0*
*Date: 2026-01-12*
*For: CASCADE-LATTICE System*