sal-learning / docs /architecture.md

Whiteroom

Initial SAL core for HF (no plots/pdf)

2c914eb 2 months ago

SAL Architecture

Technical Deep-Dive

Overview

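SAL consists of four interconnected components: the Communication Layer, the Stability Analyzer, the Emergence Field, and Pulse-Split-Cascade (PSC). The diagram shows how gradients pass through the Communication Layer before the optimizer step: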

┌─────────────────────────────────────────────────────────────┐
│                      Training Loop                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Input → Model → Loss → Gradients                         │
│                              ↓                              │
│                 ┌────────────────────────┐                  │
│                 │  Communication Layer   │                  │
│                 │  ┌──────────────────┐  │                  │
│                 │  │ Stability        │  │                  │
│                 │  │ Analyzer         │  │                  │
│                 │  └────────┬─────────┘  │                  │
│                 │           ↓            │                  │
│                 │  ┌──────────────────┐  │                  │
│                 │  │ Emergence        │  │                  │
│                 │  │ Field            │  │                  │
│                 │  └────────┬─────────┘  │                  │
│                 │           ↓            │                  │
│                 │  ┌──────────────────┐  │                  │
│                 │  │ Protection       │  │                  │
│                 │  │ Masks            │  │                  │
│                 │  └──────────────────┘  │                  │
│                 └────────────┬───────────┘                  │
│                              ↓                              │
│                    Protected Gradients                      │
│                              ↓                              │
│                       Optimizer.step()                      │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Component 1: Communication Layer

The Communication Layer is the core of SAL. It sits between gradient computation and optimizer application.

Class: `CommunicationLayer`

from sal import CommunicationLayer

comm = CommunicationLayer(
    model=model,
    threshold=0.5,           # Base stability threshold
    threshold_adaptation=0.1, # How much threshold adapts
    soft_protection=True,     # Soft vs hard protection
    history_length=100,       # Steps to track
)

Methods

`analyze() -> Dict[str, float]`

Analyzes all parameters and computes stability scores.

stability_scores = comm.analyze()
# {'layer1.weight': 0.73, 'layer1.bias': 0.45, ...}

Stability Score Formula:

s(p) = 1 / (1 + Δw × g_norm)

Where:

Δw = magnitude of the weight change since the last step
g_norm = gradient magnitude

High stability = low change × low gradient = parameter has settled.
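To make the formula concrete, here is a minimal sketch in plain Python. It assumes that both Δw and g_norm are L2 norms over a parameter's elements, which the text above does not specify. `l2_norm` and `stability_score` are hypothetical helpers for illustration, not part of the `sal` API.

```python
import math

def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))

def stability_score(prev_weights, weights, grads):
    """s(p) = 1 / (1 + Δw × g_norm), with both terms taken as L2 norms (an assumption)."""
    delta_w = l2_norm([w - p for w, p in zip(weights, prev_weights)])
    g_norm = l2_norm(grads)
    return 1.0 / (1.0 + delta_w * g_norm)

# A parameter that barely moved and has a small gradient scores close to 1.
settled = stability_score([0.50, -0.20], [0.501, -0.20], [0.01, 0.0])

# A parameter that moved by 1.0 under a gradient of norm 5.0 scores 1 / (1 + 5).
moving = stability_score([0.50, -0.20], [1.50, -0.20], [3.0, 4.0])
```

Because the product Δw × g_norm is never negative, the score always falls in the interval (0, 1].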

`protect() -> Dict[str, float]`

Applies protection to gradients based on stability analysis.

protection_rates = comm.protect()
# {'layer1.weight': 0.42, 'layer1.bias': 0.0, ...}

Protection Formula (Soft):

protected_gradient = gradient × (1 - stability_score)

Stable parameters get reduced gradients. Volatile parameters get full gradients.
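The soft protection formula can be sketched in a few lines of plain Python. This applies the formula exactly as written and leaves out any interaction with the threshold, which this section does not spell out. `soft_protect` is a hypothetical name, not the library's implementation.

```python
def soft_protect(grads, stability):
    """Scale every gradient element by (1 - stability_score)."""
    scale = 1.0 - stability
    return [g * scale for g in grads]

# A parameter with stability 0.75 keeps a quarter of its gradient.
stable_grads = soft_protect([0.2, -0.4], stability=0.75)

# A parameter with stability 0.0 keeps its full gradient.
volatile_grads = soft_protect([0.2, -0.4], stability=0.0)
```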

Adaptive Threshold

The threshold adapts to training dynamics:

τ = τ₀ + α × (σ_grad / μ_grad)

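Here τ₀ is `threshold`, α is `threshold_adaptation`, and σ_grad and μ_grad are the standard deviation and mean of the tracked gradient magnitudes.

When gradients are noisy (high standard deviation relative to their mean), the threshold rises and protection increases. When gradients are steady, the threshold falls back toward τ₀ and protection decreases.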
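As a rough illustration of the threshold formula, the sketch below computes τ from a window of recent gradient norms. The use of the population standard deviation, the window contents, and the zero-mean fallback are assumptions. `adaptive_threshold` is a hypothetical helper, not a `sal` function.

```python
import statistics

def adaptive_threshold(grad_norms, base=0.5, adaptation=0.1):
    """τ = τ₀ + α × (σ_grad / μ_grad) over a window of gradient norms."""
    mean = statistics.fmean(grad_norms)
    if mean == 0.0:
        # No gradient signal at all: fall back to the base threshold (assumed behavior).
        return base
    std = statistics.pstdev(grad_norms)  # population standard deviation (an assumption)
    return base + adaptation * (std / mean)

# Identical gradient norms give σ = 0, so τ stays at the base value.
steady = adaptive_threshold([1.0, 1.0, 1.0, 1.0])

# Norms alternating between 0.5 and 1.5 give σ/μ = 0.5, so τ rises by α × 0.5.
noisy = adaptive_threshold([0.5, 1.5, 0.5, 1.5])
```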

Component 2: Stability Analyzer

The Stability Analyzer classifies each parameter into one of three states on the Stability Spectrum: protected, neutral, or volatile.

Class: `StabilityAnalyzer`

from sal import StabilityAnalyzer

analyzer = StabilityAnalyzer(
    model=model,
    protected_threshold=0.7,  # Score above this → protected
    volatile_threshold=0.3,   # Score below this → volatile
    history_length=50,        # Steps to track
)

Methods

`analyze() -> Dict[str, float]`

Computes stability scores using multiple signals:

Weight variance — Low variance over time = stable
Gradient consistency — Consistent direction = stable
Change magnitude — Small changes = stable

scores = analyzer.analyze()

`classify() -> StabilitySpectrum`

Returns the distribution of parameters across the three stability states, as percentages:

spectrum = analyzer.classify()
# StabilitySpectrum(protected=12.3, neutral=70.5, volatile=17.2)

Stability States

| State | Score Range | Behavior |
| --- | --- | --- |
| Protected | > 0.7 | Minimal updates |
| Neutral | 0.3 - 0.7 | Careful updates |
| Volatile | < 0.3 | Full updates |
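The score ranges above map directly to a small classification routine. The sketch below assumes that the spectrum counts one entry per named parameter and reports percentages. `classify_score` and `spectrum` are hypothetical names, and the `layer2` entries are made-up values for illustration.

```python
from collections import Counter

STATES = ("protected", "neutral", "volatile")

def classify_score(score, protected_threshold=0.7, volatile_threshold=0.3):
    """Map a stability score to a state using the ranges in the table."""
    if score > protected_threshold:
        return "protected"
    if score < volatile_threshold:
        return "volatile"
    return "neutral"

def spectrum(scores):
    """Percentage of named parameters in each state (one count per entry, an assumption)."""
    if not scores:
        return {state: 0.0 for state in STATES}
    counts = Counter(classify_score(s) for s in scores.values())
    return {state: 100.0 * counts[state] / len(scores) for state in STATES}

scores = {
    "layer1.weight": 0.73,  # above 0.7: protected
    "layer1.bias": 0.45,    # between the thresholds: neutral
    "layer2.weight": 0.12,  # below 0.3: volatile
    "layer2.bias": 0.50,    # between the thresholds: neutral
}
result = spectrum(scores)
```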

Component 3: Emergence Field

The Emergence Field measures the coherence, novelty, and resonance of patterns in semantic space.

Class: `EmergenceField`

from sal import EmergenceField

field = EmergenceField(
    dimensions=768,           # Semantic space dimensions
    history_length=100,       # Patterns to remember
    coherence_threshold=0.6,  # Minimum coherence for emergence
    novelty_threshold=0.4,    # Minimum novelty for emergence
)

Methods

`observe(pattern) -> EmergenceState`

Observes a pattern and measures its emergence characteristics:

state = field.observe(embedding)
# EmergenceState(coherence=0.72, novelty=0.45, resonance=0.63, intensity=0.41)

`detect_emergence(coherence, novelty) -> bool`

Simple check for emergence:

is_emergent = field.detect_emergence(0.72, 0.45)
# True

Emergence Metrics

Coherence: How internally consistent is the pattern?

Measures variance between chunks
Measures local smoothness
High coherence = structured, meaningful

Novelty: How different from known patterns?

Compares to historical patterns via cosine similarity
High novelty = genuinely new

Resonance: How well does it fit the field?

Distance from field centroid
High resonance = harmonious with existing patterns

Emergence = Coherent Novelty that Resonates
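One possible reading of the novelty metric and the emergence check is sketched below. It assumes that novelty is one minus the highest cosine similarity to any remembered pattern, and that emergence requires both scores to meet their thresholds. The source states neither explicitly. All function names here are hypothetical, not the `EmergenceField` implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def novelty(pattern, history):
    """1 minus the best cosine match in the history (assumed definition)."""
    if not history:
        return 1.0  # nothing remembered yet, so everything is new
    best = max(cosine_similarity(pattern, past) for past in history)
    return 1.0 - max(best, 0.0)  # clamp so novelty stays in [0, 1]

def emergence_check(coherence, novelty_score,
                    coherence_threshold=0.6, novelty_threshold=0.4):
    """Both metrics must reach their thresholds (assumed to be an AND)."""
    return coherence >= coherence_threshold and novelty_score >= novelty_threshold

history = [[1.0, 0.0], [0.0, 1.0]]
seen = novelty([2.0, 0.0], history)     # same direction as a stored pattern
fresh = novelty([-1.0, -1.0], history)  # points away from everything stored
```

With the default thresholds, `emergence_check(0.72, 0.45)` returns `True`, which agrees with the `detect_emergence(0.72, 0.45)` example above.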

Component 4: Pulse-Split-Cascade (PSC)

PSC is a "Semantic Game of Life" for pattern evolution: pulses evolve, split, merge, and expire over successive generations.

Class: `PulseCascade`

from sal import PulseCascade

cascade = PulseCascade(
    max_pulses=32,          # Maximum concurrent pulses
    max_generations=10,     # Maximum depth
    split_threshold=0.6,    # Coherence needed to split
    merge_threshold=0.8,    # Similarity needed to merge
    expire_threshold=0.3,   # Minimum coherence to survive
)

Flow

1. INITIATE
   Prompt embedding creates root pulse
   
2. EVOLVE
   Each pulse evolves via evolve_fn
   Coherence, novelty, resonance are measured
   
3. SPLIT
   High-coherence pulses split into children
   Children have slight variations
   
4. MERGE
   Similar pulses merge (high cosine similarity)
   Merging combines embeddings and preserves best traits
   
5. EXPIRE
   Low-coherence pulses expire
   Their patterns are lost
   
6. EMERGE
   Best viable pulse is the emergent result
   No scoring — just natural selection
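The split, merge, and expire rules can be sketched as a single step over a list of pulses. This is a simplified illustration under several assumptions. Coherence is precomputed, with no `evolve_fn` or `measure_fn`. Merging averages the embeddings and keeps the higher coherence. Merge runs before split so that near-identical siblings are not merged straight back. The emergent result is the surviving pulse with the highest coherence. `Pulse`, `cascade_step`, and `emerge` here are stand-ins, not the `PulseCascade` implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Pulse:
    embedding: list
    coherence: float
    generation: int = 0

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cascade_step(pulses, split_threshold=0.6, merge_threshold=0.8,
                 expire_threshold=0.3, max_pulses=32, max_generations=10,
                 jitter=0.05):
    # MERGE: fold each pulse into the first kept pulse it closely resembles.
    merged = []
    for p in pulses:
        for i, kept in enumerate(merged):
            if cosine(p.embedding, kept.embedding) >= merge_threshold:
                avg = [(a + b) / 2 for a, b in zip(p.embedding, kept.embedding)]
                merged[i] = Pulse(avg, max(p.coherence, kept.coherence),
                                  max(p.generation, kept.generation))
                break
        else:
            merged.append(p)

    # SPLIT: a high-coherence pulse is replaced by two slightly varied children.
    grown = []
    for p in merged:
        if p.coherence >= split_threshold and p.generation < max_generations:
            for sign in (1.0, -1.0):
                child = [x + sign * jitter for x in p.embedding]
                grown.append(Pulse(child, p.coherence, p.generation + 1))
        else:
            grown.append(p)

    # EXPIRE: drop pulses below the survival threshold, then cap the population.
    alive = [p for p in grown if p.coherence >= expire_threshold]
    return alive[:max_pulses]

def emerge(pulses):
    """Pick the surviving pulse with the highest coherence (assumed selection rule)."""
    return max(pulses, key=lambda p: p.coherence) if pulses else None

pulses = [
    Pulse([1.0, 0.0], 0.9),   # high coherence: splits into two children
    Pulse([0.0, 1.0], 0.5),   # mid coherence: survives unchanged or merged
    Pulse([0.1, 1.0], 0.4),   # nearly parallel to the previous pulse: merges into it
    Pulse([-1.0, 0.0], 0.1),  # low coherence: expires
]
survivors = cascade_step(pulses)
best = emerge(survivors)
```

In this run the four input pulses become three survivors: two generation-1 children of the first pulse and one merged generation-0 pulse.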

Methods

`initiate(embedding) -> Pulse`

Start cascade from prompt:

root = cascade.initiate(prompt_embedding)

`step(evolve_fn, measure_fn) -> List[Pulse]`

Advance cascade by one step:

active = cascade.step(
    evolve_fn=lambda x: model(x),
    measure_fn=lambda x: (coherence(x), novelty(x), resonance(x)),
)

`emerge() -> Pulse`

Get the emergent result:

result = cascade.emerge()

Integration

Minimal Integration (2 lines)

# Standard training loop
output = model(input)
loss = criterion(output, target)
loss.backward()

# SAL integration
comm.analyze()   # ← Line 1
comm.protect()   # ← Line 2

optimizer.step()
optimizer.zero_grad()

Full Integration

import torch

from sal import CommunicationLayer, StabilityAnalyzer, EmergenceField

# Initialize
comm = CommunicationLayer(model)
stability = StabilityAnalyzer(model)
field = EmergenceField()

# Training loop
for epoch in range(epochs):
    for inputs, target in dataloader:  # assumes the dataloader yields (inputs, target) pairs
        # Forward
        output = model(inputs)
        loss = criterion(output, target)
        
        # Backward
        loss.backward()
        
        # SAL: Analyze
        comm.analyze()
        stability.update()
        
        # SAL: Observe emergence
        with torch.no_grad():
            state = field.observe(model.get_embedding())
        
        # SAL: Protect
        comm.protect()
        
        # Update
        optimizer.step()
        optimizer.zero_grad()
        
    # Log spectrum
    spectrum = stability.classify()
    print(f"Epoch {epoch}: {spectrum}")

Configuration

Recommended Defaults

| Parameter | Default | Description |
| --- | --- | --- |
| `threshold` | 0.5 | Base stability threshold |
| `threshold_adaptation` | 0.1 | Adaptation rate |
| `soft_protection` | True | Soft vs hard protection |
| `protected_threshold` | 0.7 | Score for protected state |
| `volatile_threshold` | 0.3 | Score for volatile state |
| `history_length` | 100 | Steps to track |

Tuning Guidelines

More Protection: Increase `threshold`, decrease `threshold_adaptation`
Less Protection: Decrease `threshold`, increase `threshold_adaptation`
Faster Adaptation: Decrease `history_length`
More Stability: Increase `protected_threshold`

Performance

SAL adds approximately 10% computational overhead:

Stability analysis: O(n) where n = number of parameters
Protection application: O(n)
Memory: O(n × history_length) for tracking

This overhead is small relative to the benefits: reduced catastrophic forgetting and improved continual learning.
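For a rough sense of the O(n × history_length) memory term, the sketch below estimates the tracking buffer size. It assumes one stored float32 value per parameter per tracked step. The actual storage layout is not described here, so treat the result as an upper-bound illustration. `history_memory_bytes` is a hypothetical helper.

```python
def history_memory_bytes(num_parameters, history_length=100, bytes_per_value=4):
    """One stored value per parameter per tracked step (an assumption)."""
    return num_parameters * history_length * bytes_per_value

# 1M parameters tracked for 100 steps as float32.
one_million = history_memory_bytes(1_000_000)

# Halving the history halves the tracking memory.
shorter = history_memory_bytes(1_000_000, history_length=50)
```

Under these assumptions, one million parameters with the default history of 100 steps need 400,000,000 bytes, or about 400 MB.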

For the philosophy behind these technical choices, see Principles.
