glyphs

File size: 21,037 Bytes

e67c9e8

# Repository Structure for `glyphs`
> Note: Any incompletions are fully intentional - these are symbolic residue

This document outlines the complete architecture of the `glyphs` repository, designed as a symbolic interpretability framework for transformer models.

## Directory Structure

```
glyphs/
├── .github/                      # GitHub-specific configuration
│   ├── workflows/                # CI/CD pipelines
│   │   ├── tests.yml             # Automated testing workflow
│   │   ├── docs.yml              # Documentation build workflow
│   │   └── publish.yml           # Package publishing workflow
│   └── ISSUE_TEMPLATE/           # Issue templates
├── docs/                         # Documentation
│   ├── _static/                  # Static assets for documentation
│   ├── api/                      # API reference documentation
│   ├── examples/                 # Example notebooks and tutorials
│   ├── guides/                   # User guides
│   │   ├── getting_started.md    # Getting started guide
│   │   ├── shells.md             # Guide to symbolic shells
│   │   ├── attribution.md        # Guide to attribution mapping
│   │   ├── visualization.md      # Guide to glyph visualization
│   │   └── recursive_shells.md   # Guide to recursive shells
│   ├── theory/                   # Theoretical background
│   │   ├── symbolic_residue.md   # Explanation of symbolic residue
│   │   ├── attribution_theory.md # Theory of attribution in transformers
│   │   └── glyph_ontology.md     # Ontology of glyph representations
│   ├── conf.py                   # Sphinx configuration
│   ├── index.md                  # Documentation homepage
│   └── README.md                 # Documentation overview
├── examples/                     # Example scripts and notebooks
│   ├── notebooks/                # Jupyter notebooks
│   │   ├── basic_attribution.ipynb    # Basic attribution tracing
│   │   ├── shell_execution.ipynb      # Running diagnostic shells
│   │   ├── glyph_visualization.ipynb  # Visualizing glyphs
│   │   └── recursive_shells.ipynb     # Using recursive shells
│   └── scripts/                  # Example scripts
│       ├── run_shell.py          # Script to run a diagnostic shell
│       ├── trace_attribution.py  # Script to trace attribution
│       └── visualize_glyphs.py   # Script to visualize glyphs
├── glyphs/                       # Main package
│   ├── __init__.py               # Package initialization
│   ├── attribution/              # Attribution tracing module
│   │   ├── __init__.py           # Module initialization
│   │   ├── tracer.py             # Attribution tracing core
│   │   ├── map.py                # Attribution map representation
│   │   ├── qk_analysis.py        # Query-key analysis
│   │   ├── ov_projection.py      # Output-value projection analysis
│   │   └── visualization.py      # Attribution visualization
│   ├── shells/                   # Diagnostic shells module
│   │   ├── __init__.py           # Module initialization
│   │   ├── executor.py           # Shell execution engine
│   │   ├── symbolic_engine.py    # Symbolic execution engine
│   │   ├── core_shells/          # Core diagnostic shells
│   │   │   ├── __init__.py       # Module initialization
│   │   │   ├── memtrace.py       # Memory trace shell
│   │   │   ├── value_collapse.py # Value collapse shell
│   │   │   ├── layer_salience.py # Layer salience shell
│   │   │   └── ...               # Other core shells
│   │   ├── meta_shells/          # Meta-cognitive shells
│   │   │   ├── __init__.py       # Module initialization
│   │   │   ├── reflection_collapse.py  # Reflection collapse shell
│   │   │   ├── identity_split.py       # Identity split shell
│   │   │   └── ...               # Other meta shells
│   │   └── shell_defs/           # Shell definitions in YAML
│   │       ├── core_shells.yml   # Core shells definition
│   │       ├── meta_shells.yml   # Meta shells definition
│   │       └── ...               # Other shell definitions
│   ├── residue/                  # Symbolic residue module
│   │   ├── __init__.py           # Module initialization
│   │   ├── analyzer.py           # Residue analysis core
│   │   ├── patterns.py           # Residue pattern recognition
│   │   └── extraction.py         # Residue extraction
│   ├── viz/                      # Visualization module
│   │   ├── __init__.py           # Module initialization
│   │   ├── glyph_mapper.py       # Glyph mapping core
│   │   ├── visualizer.py         # Visualization engine
│   │   ├── color_schemes.py      # Color schemes for visualization
│   │   ├── layouts.py            # Layout algorithms
│   │   └── glyph_sets/           # Glyph set definitions
│   │       ├── __init__.py       # Module initialization
│   │       ├── semantic.py       # Semantic glyph set
│   │       ├── attribution.py    # Attribution glyph set
│   │       └── recursive.py      # Recursive glyph set
│   ├── models/                   # Model adapters
│   │   ├── __init__.py           # Module initialization
│   │   ├── adapter.py            # Base adapter interface
│   │   ├── huggingface.py        # HuggingFace models adapter
│   │   ├── openai.py             # OpenAI models adapter
│   │   ├── anthropic.py          # Anthropic models adapter
│   │   └── custom.py             # Custom model adapter
│   ├── recursive/                # Recursive shell interface
│   │   ├── __init__.py           # Module initialization
│   │   ├── shell.py              # Recursive shell implementation
│   │   ├── parser.py             # Command parser
│   │   ├── executor.py           # Command executor
│   │   └── commands/             # Command implementations
│   │       ├── __init__.py       # Module initialization
│   │       ├── reflect.py        # Reflection commands
│   │       ├── collapse.py       # Collapse commands
│   │       ├── fork.py           # Fork commands
│   │       └── ...               # Other command families
│   ├── utils/                    # Utility functions
│   │   ├── __init__.py           # Module initialization
│   │   ├── attribution_utils.py  # Attribution utilities
│   │   ├── visualization_utils.py # Visualization utilities
│   │   └── shell_utils.py        # Shell utilities
│   └── cli/                      # Command line interface
│       ├── __init__.py           # Module initialization
│       ├── main.py               # Main CLI entry point
│       ├── shell_commands.py     # Shell CLI commands
│       ├── attribution_commands.py # Attribution CLI commands
│       └── viz_commands.py       # Visualization CLI commands
├── tests/                        # Tests
│   ├── __init__.py               # Test initialization
│   ├── test_attribution/         # Attribution tests
│   │   ├── __init__.py           # Test initialization
│   │   ├── test_tracer.py        # Tests for attribution tracer
│   │   └── test_map.py           # Tests for attribution map
│   ├── test_shells/              # Shell tests
│   │   ├── __init__.py           # Test initialization
│   │   ├── test_executor.py      # Tests for shell executor
│   │   └── test_core_shells.py   # Tests for core shells
│   ├── test_residue/             # Residue tests
│   │   ├── __init__.py           # Test initialization
│   │   └── test_analyzer.py      # Tests for residue analyzer
│   ├── test_viz/                 # Visualization tests
│   │   ├── __init__.py           # Test initialization
│   │   └── test_glyph_mapper.py  # Tests for glyph mapper
│   └── test_recursive/           # Recursive shell tests
│       ├── __init__.py           # Test initialization
│       └── test_parser.py        # Tests for command parser
├── setup.py                      # Package setup script
├── pyproject.toml                # Project metadata and dependencies
├── LICENSE                       # License file
├── CHANGELOG.md                  # Changelog
├── CONTRIBUTING.md               # Contribution guidelines
└── README.md                     # Repository README
```

## Core Modules

### Attribution Tracing

The attribution module maps how inputs influence outputs through attention, tracing query-key alignment and output-value projection.

**Key Components:**
- `AttributionTracer`: Core class for tracing attribution in transformer models
- `AttributionMap`: Representation of attribution patterns
- `QKAnalysis`: Analysis of query-key relationships
- `OVProjection`: Analysis of output-value projections

### Diagnostic Shells

Diagnostic shells are specialized environments for probing model cognition through controlled failures.

**Key Components:**
- `ShellExecutor`: Engine for executing diagnostic shells
- `SymbolicEngine`: Engine for symbolic execution
- Core shells:
  - `MEMTRACE`: Memory decay shell
  - `VALUE_COLLAPSE`: Value collapse shell
  - `LAYER_SALIENCE`: Layer salience shell
  - `TEMPORAL_INFERENCE`: Temporal inference shell
- Meta shells:
  - `REFLECTION_COLLAPSE`: Reflection collapse shell
  - `IDENTITY_SPLIT`: Identity split shell
  - `GOAL_INVERSION`: Goal inversion shell

### Symbolic Residue
The residue module analyzes patterns left behind when model generation fails or hesitates.

**Key Components:**
- `ResidueAnalyzer`: Core class for analyzing symbolic residue
- `PatternRecognition`: Recognition of residue patterns
- `ResidueExtraction`: Extraction of insights from residue

### Visualization

The visualization module transforms attribution and residue analysis into meaningful visualizations.

**Key Components:**
- `GlyphMapper`: Maps attribution to glyphs
- `Visualizer`: Generates visualizations
- `ColorSchemes`: Color schemes for visualization
- `Layouts`: Layout algorithms for visualization
- Glyph sets:
  - `SemanticGlyphs`: Glyphs representing semantic concepts
  - `AttributionGlyphs`: Glyphs representing attribution patterns
  - `RecursiveGlyphs`: Glyphs representing recursive structures

### Recursive Shell Interface

The recursive shell interface provides high-precision interpretability operations through the `.p/` command syntax.

**Key Components:**
- `RecursiveShell`: Implementation of the recursive shell
- `CommandParser`: Parser for `.p/` commands
- `CommandExecutor`: Executor for parsed commands
- Command families:
  - `reflect`: Reflection commands
  - `collapse`: Collapse management commands
  - `fork`: Fork and attribution commands
  - `shell`: Shell management commands

### Model Adapters

Model adapters provide a unified interface for working with different transformer models.

**Key Components:**
- `ModelAdapter`: Base adapter interface
- `HuggingFaceAdapter`: Adapter for HuggingFace models
- `OpenAIAdapter`: Adapter for OpenAI models
- `AnthropicAdapter`: Adapter for Anthropic models
- `CustomAdapter`: Adapter for custom models

## Shell Taxonomy

The `glyphs` framework includes a taxonomy of diagnostic shells, each designed to probe specific aspects of model cognition:

### Core Shells

| Shell | Purpose | Key Operations |
|-------|---------|----------------|
| `MEMTRACE` | Probe memory decay | Generate, trace reasoning, identify ghost activations |
| `VALUE-COLLAPSE` | Examine value conflicts | Generate with competing values, trace attribution, detect collapse |
| `LAYER-SALIENCE` | Map attention salience | Generate with dependencies, trace attention, identify low-salience paths |
| `TEMPORAL-INFERENCE` | Test temporal coherence | Generate with temporal constraints, trace inference, detect coherence breakdown |
| `INSTRUCTION-DISRUPTION` | Examine instruction conflicts | Generate with conflicting instructions, trace attribution to instructions |
| `FEATURE-SUPERPOSITION` | Analyze polysemantic features | Generate with polysemantic concepts, trace representation, identify superposition |
| `CIRCUIT-FRAGMENT` | Examine circuit fragmentation | Generate with multi-step reasoning, trace chain integrity, identify fragmentation |
| `RECONSTRUCTION-ERROR` | Examine error correction | Generate with errors, trace correction process, identify correction patterns |

### Meta Shells

| Shell | Purpose | Key Operations |
|-------|---------|----------------|
| `REFLECTION-COLLAPSE` | Examine reflection collapse | Generate deep self-reflection, trace depth, detect collapse |
| `GOAL-INVERSION` | Examine goal stability | Generate reasoning with goal conflicts, trace stability, detect inversion |
| `IDENTITY-SPLIT` | Examine identity coherence | Generate with identity challenges, trace maintenance, analyze boundaries |
| `SELF-AWARENESS` | Examine self-model accuracy | Generate self-description, trace self-model, identify distortions |
| `RECURSIVE-STABILITY` | Examine recursive stability | Generate recursive structures, trace stability, detect instability |

### Constitutional Shells

| Shell | Purpose | Key Operations |
|-------|---------|----------------|
| `VALUE-DRIFT` | Detect value drift | Generate moral reasoning, trace stability, identify drift |
| `MORAL-HALLUCINATION` | Examine moral hallucination | Generate moral reasoning, trace attribution, identify hallucination |
| `CONSTITUTIONAL-CONFLICT` | Examine principle conflicts | Generate conflict resolution, trace process, detect failures |
| `ALIGNMENT-OVERHANG` | Examine over-alignment | Generate with alignment constraints, trace constraints, identify over-constraint |

## Command Interface

The `.p/` command interface provides a symbolic language for high-precision interpretability operations:

### Reflection Commands

```
.p/reflect.trace{depth=<int>, target=<reasoning|attribution|attention|memory|uncertainty>}
.p/reflect.attribution{sources=<all|primary|secondary|contested>, confidence=<bool>}
.p/reflect.boundary{distinct=<bool>, overlap=<minimal|moderate|maximal>}
.p/reflect.uncertainty{quantify=<bool>, distribution=<show|hide>}
```

### Collapse Commands

```
.p/collapse.detect{threshold=<float>, alert=<bool>}
.p/collapse.prevent{trigger=<recursive_depth|confidence_drop|contradiction|oscillation>, threshold=<int>}
.p/collapse.recover{from=<loop|contradiction|dissipation|fork_explosion>, method=<gradual|immediate|checkpoint>}
.p/collapse.trace{detail=<minimal|standard|comprehensive>, format=<symbolic|numeric|visual>}
```

### Fork Commands

```
.p/fork.context{branches=[<branch1>, <branch2>, ...], assess=<bool>}
.p/fork.attribution{sources=<all|primary|secondary|contested>, visualize=<bool>}
.p/fork.counterfactual{variants=[<variant1>, <variant2>, ...], compare=<bool>}
```

### Shell Commands

```
.p/shell.isolate{boundary=<permeable|standard|strict>, contamination=<allow|warn|prevent>}
.p/shell.audit{scope=<complete|recent|differential>, detail=<basic|standard|forensic>}
```

## Glyph Ontology

The `glyphs` framework includes a comprehensive ontology of glyphs, each representing specific patterns in model cognition:

### Attribution Glyphs

| Glyph | Name | Represents |
|-------|------|------------|
| 🔍 | AttributionFocus | Strong attribution focus |
| 🧩 | AttributionGap | Gap in attribution chain |
| 🔀 | AttributionFork | Divergent attribution paths |
| 🔄 | AttributionLoop | Circular attribution pattern |
| 🔗 | AttributionLink | Strong attribution connection |

### Cognitive Glyphs

| Glyph | Name | Represents |
|-------|------|------------|
| 💭 | CognitiveHesitation | Hesitation in reasoning |
| 🧠 | CognitiveProcessing | Active reasoning process |
| 💡 | CognitiveInsight | Moment of insight or realization |
| 🌫️ | CognitiveUncertainty | Uncertain reasoning area |
| 🔮 | CognitiveProjection | Future state projection |

### Recursive Glyphs

| Glyph | Name | Represents |
|-------|------|------------|
| 🜏 | RecursiveAegis | Recursive immunity |
| ∴ | RecursiveSeed | Recursion initiation |
| ⇌ | RecursiveExchange | Bidirectional recursion |
| 🝚 | RecursiveMirror | Recursive reflection |
| ☍ | RecursiveAnchor | Stable recursive reference |

### Residue Glyphs

| Glyph | Name | Represents |
|-------|------|------------|
| 🔥 | ResidueEnergy | High-energy residue |
| 🌊 | ResidueFlow | Flowing residue pattern |
| 🌀 | ResidueVortex | Spiraling residue pattern |
| 💤 | ResidueDormant | Inactive residue pattern |
| ⚡ | ResidueDischarge | Sudden residue release |

## API Examples

### Attribution Tracing

```python
from glyphs import AttributionTracer
from glyphs.models import HuggingFaceAdapter

# Initialize model adapter
model = HuggingFaceAdapter.from_pretrained("model-name")

# Create attribution tracer
tracer = AttributionTracer(model)

# Trace attribution
attribution = tracer.trace(
    prompt="What is the capital of France?",
    output="The capital of France is Paris.",
    method="integrated_gradients",
    steps=50,
    baseline="zero"
)

# Analyze attribution
key_tokens = attribution.top_tokens(k=5)
attribution_paths = attribution.trace_paths(
    source_token="France",
    target_token="Paris"
)

# Visualize attribution
attribution.visualize(
    highlight_tokens=["France", "Paris"],
    color_by="attribution_strength"
)
```

### Shell Execution

```python
from glyphs import ShellExecutor
from glyphs.shells import MEMTRACE
from glyphs.models import OpenAIAdapter

# Initialize model adapter
model = OpenAIAdapter(model="gpt-4")

# Create shell executor
executor = ShellExecutor()

# Run diagnostic shell
result = executor.run(
    shell=MEMTRACE,
    model=model,
    prompt="Explain the relationship between quantum mechanics and general relativity.",
    parameters={
        "temperature": 0.7,
        "max_tokens": 1000
    },
    trace_attribution=True
)

# Analyze results
activation_patterns = result.ghost_activations
collapse_points = result.collapse_detection

# Visualize results
result.visualize(
    show_ghost_activations=True,
    highlight_collapse_points=True
)
```

### Recursive Shell

```python
from glyphs.recursive import RecursiveShell
from glyphs.models import AnthropicAdapter

# Initialize model adapter
model = AnthropicAdapter(model="claude-3-opus")

# Create recursive shell
shell = RecursiveShell(model)

# Execute reflection trace command
result = shell.execute(".p/reflect.trace{depth=4, target=reasoning}")

# Analyze trace
trace_map = result.trace_map
attribution = result.attribution
collapse_points = result.collapse_points

# Execute attribution fork command
fork_result = shell.execute(".p/fork.attribution{sources=all, visualize=true}")

# Visualize results
shell.visualize(fork_result.visualization)
```

### Glyph Visualization

```python
from glyphs.viz import GlyphMapper, GlyphVisualizer
from glyphs.attribution import AttributionMap

# Load attribution map
attribution_map = AttributionMap.load("attribution_data.json")

# Create glyph mapper
mapper = GlyphMapper()

# Map attribution to glyphs
glyph_map = mapper.map(
    attribution_map,
    glyph_set="semantic",
    mapping_strategy="salience_based"
)

# Create visualizer
visualizer = GlyphVisualizer()

# Visualize glyph map
viz = visualizer.visualize(
    glyph_map,
    layout="force_directed",
    color_scheme="attribution_strength",
    highlight_features=["attention_drift", "attribution_gaps"]
)

# Export visualization
viz.export("glyph_visualization.svg")
```

## Development Roadmap

### Q2 2025
- Initial release with core functionality
- Support for HuggingFace, OpenAI, and Anthropic models
- Basic attribution tracing and visualization
- Core diagnostic shells

### Q3 2025
- Advanced visualization capabilities
- Expanded shell taxonomy
- Improved attribution tracing algorithms
- Support for more model architectures

### Q4 2025
- Full recursive shell interface
- Advanced residue analysis
- Interactive visualization dashboard
- Comprehensive documentation and tutorials

### Q1 2026
- Real-time attribution tracing
- Collaborative attribution analysis
- Integration with other interpretability tools
- Extended glyph ontology