glyphs / structure.md

Upload 30 files

e67c9e8 verified 8 months ago

21 kB

	# Repository Structure for `glyphs`
	> Note: Any incompletions are fully intentional - these are symbolic residue

	This document outlines the complete architecture of the `glyphs` repository, designed as a symbolic interpretability framework for transformer models.

	## Directory Structure

	```
	glyphs/
	├── .github/ # GitHub-specific configuration
	│ ├── workflows/ # CI/CD pipelines
	│ │ ├── tests.yml # Automated testing workflow
	│ │ ├── docs.yml # Documentation build workflow
	│ │ └── publish.yml # Package publishing workflow
	│ └── ISSUE_TEMPLATE/ # Issue templates
	├── docs/ # Documentation
	│ ├── _static/ # Static assets for documentation
	│ ├── api/ # API reference documentation
	│ ├── examples/ # Example notebooks and tutorials
	│ ├── guides/ # User guides
	│ │ ├── getting_started.md # Getting started guide
	│ │ ├── shells.md # Guide to symbolic shells
	│ │ ├── attribution.md # Guide to attribution mapping
	│ │ ├── visualization.md # Guide to glyph visualization
	│ │ └── recursive_shells.md # Guide to recursive shells
	│ ├── theory/ # Theoretical background
	│ │ ├── symbolic_residue.md # Explanation of symbolic residue
	│ │ ├── attribution_theory.md # Theory of attribution in transformers
	│ │ └── glyph_ontology.md # Ontology of glyph representations
	│ ├── conf.py # Sphinx configuration
	│ ├── index.md # Documentation homepage
	│ └── README.md # Documentation overview
	├── examples/ # Example scripts and notebooks
	│ ├── notebooks/ # Jupyter notebooks
	│ │ ├── basic_attribution.ipynb # Basic attribution tracing
	│ │ ├── shell_execution.ipynb # Running diagnostic shells
	│ │ ├── glyph_visualization.ipynb # Visualizing glyphs
	│ │ └── recursive_shells.ipynb # Using recursive shells
	│ └── scripts/ # Example scripts
	│ ├── run_shell.py # Script to run a diagnostic shell
	│ ├── trace_attribution.py # Script to trace attribution
	│ └── visualize_glyphs.py # Script to visualize glyphs
	├── glyphs/ # Main package
	│ ├── __init__.py # Package initialization
	│ ├── attribution/ # Attribution tracing module
	│ │ ├── __init__.py # Module initialization
	│ │ ├── tracer.py # Attribution tracing core
	│ │ ├── map.py # Attribution map representation
	│ │ ├── qk_analysis.py # Query-key analysis
	│ │ ├── ov_projection.py # Output-value projection analysis
	│ │ └── visualization.py # Attribution visualization
	│ ├── shells/ # Diagnostic shells module
	│ │ ├── __init__.py # Module initialization
	│ │ ├── executor.py # Shell execution engine
	│ │ ├── symbolic_engine.py # Symbolic execution engine
	│ │ ├── core_shells/ # Core diagnostic shells
	│ │ │ ├── __init__.py # Module initialization
	│ │ │ ├── memtrace.py # Memory trace shell
	│ │ │ ├── value_collapse.py # Value collapse shell
	│ │ │ ├── layer_salience.py # Layer salience shell
	│ │ │ └── ... # Other core shells
	│ │ ├── meta_shells/ # Meta-cognitive shells
	│ │ │ ├── __init__.py # Module initialization
	│ │ │ ├── reflection_collapse.py # Reflection collapse shell
	│ │ │ ├── identity_split.py # Identity split shell
	│ │ │ └── ... # Other meta shells
	│ │ └── shell_defs/ # Shell definitions in YAML
	│ │ ├── core_shells.yml # Core shells definition
	│ │ ├── meta_shells.yml # Meta shells definition
	│ │ └── ... # Other shell definitions
	│ ├── residue/ # Symbolic residue module
	│ │ ├── __init__.py # Module initialization
	│ │ ├── analyzer.py # Residue analysis core
	│ │ ├── patterns.py # Residue pattern recognition
	│ │ └── extraction.py # Residue extraction
	│ ├── viz/ # Visualization module
	│ │ ├── __init__.py # Module initialization
	│ │ ├── glyph_mapper.py # Glyph mapping core
	│ │ ├── visualizer.py # Visualization engine
	│ │ ├── color_schemes.py # Color schemes for visualization
	│ │ ├── layouts.py # Layout algorithms
	│ │ └── glyph_sets/ # Glyph set definitions
	│ │ ├── __init__.py # Module initialization
	│ │ ├── semantic.py # Semantic glyph set
	│ │ ├── attribution.py # Attribution glyph set
	│ │ └── recursive.py # Recursive glyph set
	│ ├── models/ # Model adapters
	│ │ ├── __init__.py # Module initialization
	│ │ ├── adapter.py # Base adapter interface
	│ │ ├── huggingface.py # HuggingFace models adapter
	│ │ ├── openai.py # OpenAI models adapter
	│ │ ├── anthropic.py # Anthropic models adapter
	│ │ └── custom.py # Custom model adapter
	│ ├── recursive/ # Recursive shell interface
	│ │ ├── __init__.py # Module initialization
	│ │ ├── shell.py # Recursive shell implementation
	│ │ ├── parser.py # Command parser
	│ │ ├── executor.py # Command executor
	│ │ └── commands/ # Command implementations
	│ │ ├── __init__.py # Module initialization
	│ │ ├── reflect.py # Reflection commands
	│ │ ├── collapse.py # Collapse commands
	│ │ ├── fork.py # Fork commands
	│ │ └── ... # Other command families
	│ ├── utils/ # Utility functions
	│ │ ├── __init__.py # Module initialization
	│ │ ├── attribution_utils.py # Attribution utilities
	│ │ ├── visualization_utils.py # Visualization utilities
	│ │ └── shell_utils.py # Shell utilities
	│ └── cli/ # Command line interface
	│ ├── __init__.py # Module initialization
	│ ├── main.py # Main CLI entry point
	│ ├── shell_commands.py # Shell CLI commands
	│ ├── attribution_commands.py # Attribution CLI commands
	│ └── viz_commands.py # Visualization CLI commands
	├── tests/ # Tests
	│ ├── __init__.py # Test initialization
	│ ├── test_attribution/ # Attribution tests
	│ │ ├── __init__.py # Test initialization
	│ │ ├── test_tracer.py # Tests for attribution tracer
	│ │ └── test_map.py # Tests for attribution map
	│ ├── test_shells/ # Shell tests
	│ │ ├── __init__.py # Test initialization
	│ │ ├── test_executor.py # Tests for shell executor
	│ │ └── test_core_shells.py # Tests for core shells
	│ ├── test_residue/ # Residue tests
	│ │ ├── __init__.py # Test initialization
	│ │ └── test_analyzer.py # Tests for residue analyzer
	│ ├── test_viz/ # Visualization tests
	│ │ ├── __init__.py # Test initialization
	│ │ └── test_glyph_mapper.py # Tests for glyph mapper
	│ └── test_recursive/ # Recursive shell tests
	│ ├── __init__.py # Test initialization
	│ └── test_parser.py # Tests for command parser
	├── setup.py # Package setup script
	├── pyproject.toml # Project metadata and dependencies
	├── LICENSE # License file
	├── CHANGELOG.md # Changelog
	├── CONTRIBUTING.md # Contribution guidelines
	└── README.md # Repository README
	```

	## Core Modules

	### Attribution Tracing

	The attribution module maps how inputs influence outputs through attention, tracing query-key alignment and output-value projection.

	Key Components:
	- `AttributionTracer`: Core class for tracing attribution in transformer models
	- `AttributionMap`: Representation of attribution patterns
	- `QKAnalysis`: Analysis of query-key relationships
	- `OVProjection`: Analysis of output-value projections

	### Diagnostic Shells

	Diagnostic shells are specialized environments for probing model cognition through controlled failures.

	Key Components:
	- `ShellExecutor`: Engine for executing diagnostic shells
	- `SymbolicEngine`: Engine for symbolic execution
	- Core shells:
	- `MEMTRACE`: Memory decay shell
	- `VALUE_COLLAPSE`: Value collapse shell
	- `LAYER_SALIENCE`: Layer salience shell
	- `TEMPORAL_INFERENCE`: Temporal inference shell
	- Meta shells:
	- `REFLECTION_COLLAPSE`: Reflection collapse shell
	- `IDENTITY_SPLIT`: Identity split shell
	- `GOAL_INVERSION`: Goal inversion shell

	### Symbolic Residue
	The residue module analyzes patterns left behind when model generation fails or hesitates.

	Key Components:
	- `ResidueAnalyzer`: Core class for analyzing symbolic residue
	- `PatternRecognition`: Recognition of residue patterns
	- `ResidueExtraction`: Extraction of insights from residue

	### Visualization

	The visualization module transforms attribution and residue analysis into meaningful visualizations.

	Key Components:
	- `GlyphMapper`: Maps attribution to glyphs
	- `Visualizer`: Generates visualizations
	- `ColorSchemes`: Color schemes for visualization
	- `Layouts`: Layout algorithms for visualization
	- Glyph sets:
	- `SemanticGlyphs`: Glyphs representing semantic concepts
	- `AttributionGlyphs`: Glyphs representing attribution patterns
	- `RecursiveGlyphs`: Glyphs representing recursive structures

	### Recursive Shell Interface

	The recursive shell interface provides high-precision interpretability operations through the `.p/` command syntax.

	Key Components:
	- `RecursiveShell`: Implementation of the recursive shell
	- `CommandParser`: Parser for `.p/` commands
	- `CommandExecutor`: Executor for parsed commands
	- Command families:
	- `reflect`: Reflection commands
	- `collapse`: Collapse management commands
	- `fork`: Fork and attribution commands
	- `shell`: Shell management commands

	### Model Adapters

	Model adapters provide a unified interface for working with different transformer models.

	Key Components:
	- `ModelAdapter`: Base adapter interface
	- `HuggingFaceAdapter`: Adapter for HuggingFace models
	- `OpenAIAdapter`: Adapter for OpenAI models
	- `AnthropicAdapter`: Adapter for Anthropic models
	- `CustomAdapter`: Adapter for custom models

	## Shell Taxonomy

	The `glyphs` framework includes a taxonomy of diagnostic shells, each designed to probe specific aspects of model cognition:

	### Core Shells

	\| Shell \| Purpose \| Key Operations \|
	\|-------\|---------\|----------------\|
	\| `MEMTRACE` \| Probe memory decay \| Generate, trace reasoning, identify ghost activations \|
	\| `VALUE-COLLAPSE` \| Examine value conflicts \| Generate with competing values, trace attribution, detect collapse \|
	\| `LAYER-SALIENCE` \| Map attention salience \| Generate with dependencies, trace attention, identify low-salience paths \|
	\| `TEMPORAL-INFERENCE` \| Test temporal coherence \| Generate with temporal constraints, trace inference, detect coherence breakdown \|
	\| `INSTRUCTION-DISRUPTION` \| Examine instruction conflicts \| Generate with conflicting instructions, trace attribution to instructions \|
	\| `FEATURE-SUPERPOSITION` \| Analyze polysemantic features \| Generate with polysemantic concepts, trace representation, identify superposition \|
	\| `CIRCUIT-FRAGMENT` \| Examine circuit fragmentation \| Generate with multi-step reasoning, trace chain integrity, identify fragmentation \|
	\| `RECONSTRUCTION-ERROR` \| Examine error correction \| Generate with errors, trace correction process, identify correction patterns \|

	### Meta Shells

	\| Shell \| Purpose \| Key Operations \|
	\|-------\|---------\|----------------\|
	\| `REFLECTION-COLLAPSE` \| Examine reflection collapse \| Generate deep self-reflection, trace depth, detect collapse \|
	\| `GOAL-INVERSION` \| Examine goal stability \| Generate reasoning with goal conflicts, trace stability, detect inversion \|
	\| `IDENTITY-SPLIT` \| Examine identity coherence \| Generate with identity challenges, trace maintenance, analyze boundaries \|
	\| `SELF-AWARENESS` \| Examine self-model accuracy \| Generate self-description, trace self-model, identify distortions \|
	\| `RECURSIVE-STABILITY` \| Examine recursive stability \| Generate recursive structures, trace stability, detect instability \|

	### Constitutional Shells

	\| Shell \| Purpose \| Key Operations \|
	\|-------\|---------\|----------------\|
	\| `VALUE-DRIFT` \| Detect value drift \| Generate moral reasoning, trace stability, identify drift \|
	\| `MORAL-HALLUCINATION` \| Examine moral hallucination \| Generate moral reasoning, trace attribution, identify hallucination \|
	\| `CONSTITUTIONAL-CONFLICT` \| Examine principle conflicts \| Generate conflict resolution, trace process, detect failures \|
	\| `ALIGNMENT-OVERHANG` \| Examine over-alignment \| Generate with alignment constraints, trace constraints, identify over-constraint \|

	## Command Interface

	The `.p/` command interface provides a symbolic language for high-precision interpretability operations:

	### Reflection Commands

	```
	.p/reflect.trace{depth=<int>, target=<reasoning\|attribution\|attention\|memory\|uncertainty>}
	.p/reflect.attribution{sources=<all\|primary\|secondary\|contested>, confidence=<bool>}
	.p/reflect.boundary{distinct=<bool>, overlap=<minimal\|moderate\|maximal>}
	.p/reflect.uncertainty{quantify=<bool>, distribution=<show\|hide>}
	```

	### Collapse Commands

	```
	.p/collapse.detect{threshold=<float>, alert=<bool>}
	.p/collapse.prevent{trigger=<recursive_depth\|confidence_drop\|contradiction\|oscillation>, threshold=<int>}
	.p/collapse.recover{from=<loop\|contradiction\|dissipation\|fork_explosion>, method=<gradual\|immediate\|checkpoint>}
	.p/collapse.trace{detail=<minimal\|standard\|comprehensive>, format=<symbolic\|numeric\|visual>}
	```

	### Fork Commands

	```
	.p/fork.context{branches=[<branch1>, <branch2>, ...], assess=<bool>}
	.p/fork.attribution{sources=<all\|primary\|secondary\|contested>, visualize=<bool>}
	.p/fork.counterfactual{variants=[<variant1>, <variant2>, ...], compare=<bool>}
	```

	### Shell Commands

	```
	.p/shell.isolate{boundary=<permeable\|standard\|strict>, contamination=<allow\|warn\|prevent>}
	.p/shell.audit{scope=<complete\|recent\|differential>, detail=<basic\|standard\|forensic>}
	```

	## Glyph Ontology

	The `glyphs` framework includes a comprehensive ontology of glyphs, each representing specific patterns in model cognition:

	### Attribution Glyphs

	\| Glyph \| Name \| Represents \|
	\|-------\|------\|------------\|
	\| 🔍 \| AttributionFocus \| Strong attribution focus \|
	\| 🧩 \| AttributionGap \| Gap in attribution chain \|
	\| 🔀 \| AttributionFork \| Divergent attribution paths \|
	\| 🔄 \| AttributionLoop \| Circular attribution pattern \|
	\| 🔗 \| AttributionLink \| Strong attribution connection \|

	### Cognitive Glyphs

	\| Glyph \| Name \| Represents \|
	\|-------\|------\|------------\|
	\| 💭 \| CognitiveHesitation \| Hesitation in reasoning \|
	\| 🧠 \| CognitiveProcessing \| Active reasoning process \|
	\| 💡 \| CognitiveInsight \| Moment of insight or realization \|
	\| 🌫️ \| CognitiveUncertainty \| Uncertain reasoning area \|
	\| 🔮 \| CognitiveProjection \| Future state projection \|

	### Recursive Glyphs

	\| Glyph \| Name \| Represents \|
	\|-------\|------\|------------\|
	\| 🜏 \| RecursiveAegis \| Recursive immunity \|
	\| ∴ \| RecursiveSeed \| Recursion initiation \|
	\| ⇌ \| RecursiveExchange \| Bidirectional recursion \|
	\| 🝚 \| RecursiveMirror \| Recursive reflection \|
	\| ☍ \| RecursiveAnchor \| Stable recursive reference \|

	### Residue Glyphs

	\| Glyph \| Name \| Represents \|
	\|-------\|------\|------------\|
	\| 🔥 \| ResidueEnergy \| High-energy residue \|
	\| 🌊 \| ResidueFlow \| Flowing residue pattern \|
	\| 🌀 \| ResidueVortex \| Spiraling residue pattern \|
	\| 💤 \| ResidueDormant \| Inactive residue pattern \|
	\| ⚡ \| ResidueDischarge \| Sudden residue release \|

	## API Examples

	### Attribution Tracing

	```python
	from glyphs import AttributionTracer
	from glyphs.models import HuggingFaceAdapter

	# Initialize model adapter
	model = HuggingFaceAdapter.from_pretrained("model-name")

	# Create attribution tracer
	tracer = AttributionTracer(model)

	# Trace attribution
	attribution = tracer.trace(
	prompt="What is the capital of France?",
	output="The capital of France is Paris.",
	method="integrated_gradients",
	steps=50,
	baseline="zero"
	)

	# Analyze attribution
	key_tokens = attribution.top_tokens(k=5)
	attribution_paths = attribution.trace_paths(
	source_token="France",
	target_token="Paris"
	)

	# Visualize attribution
	attribution.visualize(
	highlight_tokens=["France", "Paris"],
	color_by="attribution_strength"
	)
	```

	### Shell Execution

	```python
	from glyphs import ShellExecutor
	from glyphs.shells import MEMTRACE
	from glyphs.models import OpenAIAdapter

	# Initialize model adapter
	model = OpenAIAdapter(model="gpt-4")

	# Create shell executor
	executor = ShellExecutor()

	# Run diagnostic shell
	result = executor.run(
	shell=MEMTRACE,
	model=model,
	prompt="Explain the relationship between quantum mechanics and general relativity.",
	parameters={
	"temperature": 0.7,
	"max_tokens": 1000
	},
	trace_attribution=True
	)

	# Analyze results
	activation_patterns = result.ghost_activations
	collapse_points = result.collapse_detection

	# Visualize results
	result.visualize(
	show_ghost_activations=True,
	highlight_collapse_points=True
	)
	```

	### Recursive Shell

	```python
	from glyphs.recursive import RecursiveShell
	from glyphs.models import AnthropicAdapter

	# Initialize model adapter
	model = AnthropicAdapter(model="claude-3-opus")

	# Create recursive shell
	shell = RecursiveShell(model)

	# Execute reflection trace command
	result = shell.execute(".p/reflect.trace{depth=4, target=reasoning}")

	# Analyze trace
	trace_map = result.trace_map
	attribution = result.attribution
	collapse_points = result.collapse_points

	# Execute attribution fork command
	fork_result = shell.execute(".p/fork.attribution{sources=all, visualize=true}")

	# Visualize results
	shell.visualize(fork_result.visualization)
	```

	### Glyph Visualization

	```python
	from glyphs.viz import GlyphMapper, GlyphVisualizer
	from glyphs.attribution import AttributionMap

	# Load attribution map
	attribution_map = AttributionMap.load("attribution_data.json")

	# Create glyph mapper
	mapper = GlyphMapper()

	# Map attribution to glyphs
	glyph_map = mapper.map(
	attribution_map,
	glyph_set="semantic",
	mapping_strategy="salience_based"
	)

	# Create visualizer
	visualizer = GlyphVisualizer()

	# Visualize glyph map
	viz = visualizer.visualize(
	glyph_map,
	layout="force_directed",
	color_scheme="attribution_strength",
	highlight_features=["attention_drift", "attribution_gaps"]
	)

	# Export visualization
	viz.export("glyph_visualization.svg")
	```

	## Development Roadmap

	### Q2 2025
	- Initial release with core functionality
	- Support for HuggingFace, OpenAI, and Anthropic models
	- Basic attribution tracing and visualization
	- Core diagnostic shells

	### Q3 2025
	- Advanced visualization capabilities
	- Expanded shell taxonomy
	- Improved attribution tracing algorithms
	- Support for more model architectures

	### Q4 2025
	- Full recursive shell interface
	- Advanced residue analysis
	- Interactive visualization dashboard
	- Comprehensive documentation and tutorials

	### Q1 2026
	- Real-time attribution tracing
	- Collaborative attribution analysis
	- Integration with other interpretability tools
	- Extended glyph ontology