cdpearlman committed on
Commit
e13b1ba
·
1 Parent(s): 3850656

Add comprehensive test suite for utility modules (73 tests)

.cursor/debug.log DELETED
@@ -1,4 +0,0 @@
- {"sessionId": "debug-session", "runId": "pre-fix", "hypothesisId": "H1", "location": "app.py:render_heatmap:entry", "message": "render_heatmap entry", "data": {"has_activation_data": true, "has_activation_data2": false, "has_original_activation_data": false, "mode_data": {"comparison": "prompt1", "ablation": "original"}, "model_name": "Qwen/Qwen2.5-0.5B", "plotly_version": "6.3.0"}, "timestamp": 1768427085403}
- {"sessionId": "debug-session", "runId": "pre-fix", "hypothesisId": "H2", "location": "app.py:render_heatmap:active_data", "message": "selected active data source", "data": {"comparison_mode": "prompt1", "ablation_mode": "original", "show_comparison_toggle": false, "show_ablation_toggle": false, "active_data_source": "activation_data"}, "timestamp": 1768427090436}
- {"sessionId": "debug-session", "runId": "pre-fix", "hypothesisId": "H3", "location": "app.py:render_heatmap:matrix_data", "message": "matrix data summary", "data": {"rows": 24, "cols": 11, "tokens_len": 11, "layers_len": 24, "top_tokens_rows": 24}, "timestamp": 1768427136380}
- {"sessionId": "debug-session", "runId": "pre-fix", "hypothesisId": "H1", "location": "app.py:render_heatmap:colorbar_config", "message": "heatmap colorbar config", "data": {"colorbar_config": {"title": {"text": "Delta", "side": "right"}}}, "timestamp": 1768427136385}
.cursor/rules/minimal_changes.mdc CHANGED
@@ -11,6 +11,12 @@ alwaysApply: true

# Minimal Change Rules

+ - Testing & verification:
+   - For substantial code changes (new files, new functionality), write tests first in `tests/` that describe expected behavior.
+   - Skip tests for UI/frontend changes, trivial additions, and documentation.
+   - After implementing changes, run `pytest` to verify all tests pass.
+   - If tests fail, iterate on debugging until fixed.
+
- Plan first:
  - Update `todo.md` with the smallest next actions tied to `plans.md`.
  - Keep tasks atomic and check them off as you go.
.cursor/rules/planning_mode.mdc ADDED
@@ -0,0 +1,62 @@
+ ---
+ description: Rules for dividing work into parallel agent worktrees
+ globs:
+   - "plans.md"
+   - "todo.md"
+   - "**/PLAN*.md"
+ alwaysApply: false
+ ---
+
+ # Parallel Agent Planning Rules
+
+ When creating or updating plans (plan mode active), structure work for independent parallel execution.
+
+ ## Worktree Division Principles
+
+ - **Isolate by file/module**: Each agent task must target a distinct set of files that no other agent will touch.
+ - **No shared edits**: If two tasks could modify the same file, merge them into one task.
+ - **Define boundaries explicitly**: In your plan file, list the exact files/directories each agent owns.
+
+ ## Plan Structure for Parallel Agents
+
+ When dividing work, create a section like:
+
+ ```
+ ## Parallel Worktrees
+
+ ### Agent A: [Task Name]
+ - **Owns**: `utils/feature_a.py`, `tests/test_feature_a.py`
+ - **Does not touch**: everything else
+ - **Deliverable**: [specific outcome]
+
+ ### Agent B: [Task Name]
+ - **Owns**: `components/widget.py`, `tests/test_widget.py`
+ - **Does not touch**: everything else
+ - **Deliverable**: [specific outcome]
+
+ ### Sequential (after parallel completes)
+ - Integration task that touches shared files (e.g., `app.py` imports)
+ ```
+
+ ## Rules for Each Agent Task
+
+ - Each task must be **self-contained**: write, test, verify independently.
+ - Each task must specify:
+   1. Files it will create or modify (exclusive ownership)
+   2. Tests it will write/run
+   3. Success criteria
+ - Shared dependencies (imports, configs) should be locked before parallel work begins.
+
+ ## Conflict Prevention Checklist
+
+ Before finalizing a parallel plan, verify:
+ - [ ] No two agents modify the same file
+ - [ ] No two agents add imports to the same `__init__.py`
+ - [ ] Shared interfaces are defined and frozen before parallel work
+ - [ ] Each agent's tests can run independently
+
+ ## Git Strategy for Parallel Agents
+
+ - Each agent works on its own feature branch: `feature/<agent-task-name>`
+ - Branches are merged sequentially after all pass tests
+ - Order: merge least-dependent branches first
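The git strategy above pairs naturally with `git worktree`, which gives each agent branch its own checkout. A minimal sketch in a throwaway repository (the branch name `feature/agent-a-task` is hypothetical, following the rule's naming convention):

```shell
# Create a throwaway repo with one agent branch, then attach a worktree to it
repo=$(mktemp -d)/demo
git init -q "$repo"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"
git -C "$repo" branch feature/agent-a-task
# Each agent gets an isolated checkout of its own branch
git -C "$repo" worktree add "$repo-agent-a" feature/agent-a-task
git -C "$repo" worktree list
```

Because each worktree is a separate directory on a separate branch, two agents can never produce a dirty shared checkout; conflicts only surface at the sequential merge step.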
components/__pycache__/investigation_panel.cpython-311.pyc ADDED
Binary file (17 kB).
 
components/__pycache__/model_selector.cpython-311.pyc CHANGED
Binary files a/components/__pycache__/model_selector.cpython-311.pyc and b/components/__pycache__/model_selector.cpython-311.pyc differ
 
components/__pycache__/pipeline.cpython-311.pyc ADDED
Binary file (23.4 kB).
 
components/model_selector.py CHANGED
@@ -12,9 +12,20 @@ from dash import html, dcc
 AVAILABLE_MODELS = [
     # LLaMA-like models (Qwen)
     {"label": "Qwen2.5-0.5B", "value": "Qwen/Qwen2.5-0.5B"},
+    # {"label": "Qwen2.5-1.5B", "value": "Qwen/Qwen2.5-1.5B"},

     # GPT-2 family
     {"label": "GPT-2 (124M)", "value": "gpt2"}
+    # {"label": "GPT-2 Medium (355M)", "value": "gpt2-medium"},
+    # {"label": "GPT-2 Large (774M)", "value": "gpt2-large"},
+
+    # # OPT family
+    # {"label": "OPT-125M", "value": "facebook/opt-125m"},
+    # {"label": "OPT-350M", "value": "facebook/opt-350m"},
+
+    # # GPT-NeoX family (Pythia)
+    # {"label": "Pythia-70M", "value": "EleutherAI/pythia-70m"},
+    # {"label": "Pythia-160M", "value": "EleutherAI/pythia-160m"},
 ]

 def create_model_selector():
plans.md CHANGED
@@ -1,19 +1,30 @@
- ## Current Plan
-
- ### Pipeline Explanation Refactor - COMPLETED
-
- The dashboard has been refactored from a testing/analysis tool into an explanation-first interface:
-
- 1. **New Pipeline Visualization**: Linear flow (Input → Tokens → Embed → Attention → MLP → Output) with click-to-expand stages
- 2. **Investigation Panel**: Consolidated ablation and token attribution tools
- 3. **Simplified Codebase**: Removed heatmap, comparison mode, and ~900 lines of code
- 4. **Token Attribution**: New gradient-based feature importance analysis
-
- #### File Changes
- - `app.py`: Reduced from 1781 to ~750 lines
- - `components/pipeline.py`: NEW - Main explanation flow
- - `components/investigation_panel.py`: NEW - Ablation + Attribution
- - `utils/token_attribution.py`: NEW - Integrated Gradients
- - `model_selector.py`: Simplified (removed comparison UI)
- - `main_panel.py`: DELETED
- - `prompt_comparison.py`: DELETED
+ Generation Settings:
+ - change the "Max New Tokens" label to "Number of New Tokens"
+ - the concept of beams is non-trivial; change "Beam" references to something like "Number of Generation Choices"
+
+ Generated Sequences:
+ - don't show the score value, it is virtually meaningless to a user
+
+ 1. Tokenization
+ - stack tokens vertically instead of listing horizontally (more visually appealing)
+
+ 2. Embedding
+ - add some explanation that the embedding is taken from a pre-learned table (improves understanding of how a token ID becomes an embedding)
+
+ 3. Attention
+ - use the BertViz head view instead of the model view; it is less overwhelming and lets the user scroll through attention heads
+ - add an explanation of what the user is looking at and how to navigate the BertViz visualization (answers "What am I looking at?", "What should this visualization show me?", "What should I look for?")
+ - categorize attention heads so that the different components of the model's learning are visible
+ - the "Most attended tokens" section doesn't provide much value; remove it and focus on categorized attention heads and BertViz
+
+ 4. MLP
+ - add more explanation of how the feed-forward networks are learned during training, letting the model interpret the current words based on its training set
+
+ 5. Output
+ - include the full prompt in "predicted next token", with the predicted token appended at the end (still highlighted)
+ - in the top-5 tokens graph, the hover-over data should show just the percent and token, not the long decimal value
+
+ Overall Conceptual Changes:
+ - the analysis should be done on the initial user-given prompt, not the prompt that includes max tokens
+ - the selected beam is used for comparison after experiments, not for analysis. For example, the user input is "The capital of the US is" and the chosen beam is "The capital of the US is Washington D.C.", but the analysis is done only with the first prompt. After either experiment, the chosen beam is compared to the new results. If ablation made the output "The capital of the US is New York City", that can be compared to the original chosen beam to show the differences.
+ - add testing and verification to the entire project so that each round of changes can be double-checked and verified for correctness (try to avoid running the app; just test functions)
requirements.txt CHANGED
@@ -13,3 +13,6 @@ bertviz>=1.4.0

 # Utility dependencies
 numpy>=1.24.0
+
+ # Testing dependencies
+ pytest>=7.0.0
tests/__init__.py ADDED
@@ -0,0 +1 @@
+ # Test suite for Transformer Activation Capture and Visualization
tests/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (172 Bytes).
 
tests/__pycache__/conftest.cpython-311-pytest-8.4.2.pyc ADDED
Binary file (7.43 kB).
 
tests/__pycache__/test_ablation_metrics.cpython-311-pytest-8.4.2.pyc ADDED
Binary file (18.2 kB).
 
tests/__pycache__/test_head_detection.cpython-311-pytest-8.4.2.pyc ADDED
Binary file (44.9 kB).
 
tests/__pycache__/test_model_config.cpython-311-pytest-8.4.2.pyc ADDED
Binary file (37 kB).
 
tests/__pycache__/test_model_patterns.cpython-311-pytest-8.4.2.pyc ADDED
Binary file (34.1 kB).
 
tests/__pycache__/test_token_attribution.cpython-311-pytest-8.4.2.pyc ADDED
Binary file (27.1 kB).
 
tests/conftest.py ADDED
@@ -0,0 +1,206 @@
+ """
+ Shared pytest fixtures for the test suite.
+
+ Provides reusable mock data structures and synthetic tensors
+ to test utility functions without loading actual ML models.
+ """
+
+ import pytest
+ import torch
+ import numpy as np
+
+
+ # =============================================================================
+ # Synthetic Attention Matrices
+ # =============================================================================
+
+ @pytest.fixture
+ def uniform_attention_matrix():
+     """4x4 uniform attention matrix (each position attends equally to all)."""
+     size = 4
+     return torch.ones(size, size) / size
+
+
+ @pytest.fixture
+ def previous_token_attention_matrix():
+     """
+     4x4 attention matrix where each position attends primarily to the previous token.
+     Position 0 attends to itself (no previous token).
+     """
+     size = 4
+     matrix = torch.zeros(size, size)
+     # Position 0 attends to itself
+     matrix[0, 0] = 1.0
+     # Other positions attend strongly to previous token
+     for i in range(1, size):
+         matrix[i, i-1] = 0.8
+         matrix[i, i] = 0.2  # Some self-attention
+     return matrix
+
+
+ @pytest.fixture
+ def first_token_attention_matrix():
+     """4x4 attention matrix where all positions attend strongly to first token."""
+     size = 4
+     matrix = torch.zeros(size, size)
+     for i in range(size):
+         matrix[i, 0] = 0.7  # Strong attention to first token
+         matrix[i, i] = 0.3  # Some self-attention
+     return matrix
+
+
+ @pytest.fixture
+ def peaked_attention_matrix():
+     """4x4 attention matrix with peaked (low entropy) attention at one position."""
+     size = 4
+     matrix = torch.zeros(size, size)
+     # Each position attends almost entirely to position 2
+     for i in range(size):
+         matrix[i, 2] = 0.95
+         # Distribute remaining across others
+         for j in range(size):
+             if j != 2:
+                 matrix[i, j] = 0.05 / (size - 1)
+     return matrix
+
+
+ # =============================================================================
+ # Mock Activation Data Structures
+ # =============================================================================
+
+ @pytest.fixture
+ def mock_activation_data():
+     """
+     Mock activation data structure similar to execute_forward_pass output.
+     Used for testing functions that process activation data.
+     """
+     return {
+         'model': 'mock-model',
+         'prompt': 'Hello world',
+         'input_ids': [[1, 2, 3, 4]],
+         'attention_modules': ['model.layers.0.self_attn', 'model.layers.1.self_attn'],
+         'attention_outputs': {
+             'model.layers.0.self_attn': {
+                 'output': [
+                     [[0.1, 0.2, 0.3]],  # Hidden states (simplified)
+                     [[[[0.25, 0.25, 0.25, 0.25],  # Attention weights [batch, heads, seq, seq]
+                        [0.25, 0.25, 0.25, 0.25],
+                        [0.25, 0.25, 0.25, 0.25],
+                        [0.25, 0.25, 0.25, 0.25]]]]
+                 ]
+             },
+             'model.layers.1.self_attn': {
+                 'output': [
+                     [[0.1, 0.2, 0.3]],
+                     [[[[0.1, 0.2, 0.3, 0.4],
+                        [0.1, 0.2, 0.3, 0.4],
+                        [0.1, 0.2, 0.3, 0.4],
+                        [0.1, 0.2, 0.3, 0.4]]]]
+                 ]
+             }
+         },
+         'block_modules': ['model.layers.0', 'model.layers.1'],
+         'block_outputs': {
+             'model.layers.0': {'output': [[[0.1, 0.2, 0.3, 0.4]]]},
+             'model.layers.1': {'output': [[[0.2, 0.3, 0.4, 0.5]]]}
+         },
+         'norm_parameters': ['model.norm.weight'],
+         'norm_data': [[1.0, 1.0, 1.0, 1.0]],
+         'actual_output': {'token': ' world', 'probability': 0.85},
+         'global_top5_tokens': [
+             {'token': 'world', 'probability': 0.85},
+             {'token': 'there', 'probability': 0.05},
+             {'token': 'friend', 'probability': 0.03},
+             {'token': 'everyone', 'probability': 0.02},
+             {'token': 'all', 'probability': 0.01}
+         ]
+     }
+
+
+ # =============================================================================
+ # Mock Module/Parameter Patterns
+ # =============================================================================
+
+ @pytest.fixture
+ def mock_module_patterns():
+     """Mock module patterns as returned by extract_patterns."""
+     return {
+         'model.layers.{N}.self_attn': ['model.layers.0.self_attn', 'model.layers.1.self_attn'],
+         'model.layers.{N}.mlp': ['model.layers.0.mlp', 'model.layers.1.mlp'],
+         'model.layers.{N}': ['model.layers.0', 'model.layers.1'],
+         'model.embed_tokens': ['model.embed_tokens'],
+         'model.norm': ['model.norm']
+     }
+
+
+ @pytest.fixture
+ def mock_param_patterns():
+     """Mock parameter patterns as returned by extract_patterns."""
+     return {
+         'model.layers.{N}.self_attn.q_proj.weight': ['model.layers.0.self_attn.q_proj.weight'],
+         'model.layers.{N}.self_attn.k_proj.weight': ['model.layers.0.self_attn.k_proj.weight'],
+         'model.norm.weight': ['model.norm.weight'],
+         'lm_head.weight': ['lm_head.weight']
+     }
+
+
+ # =============================================================================
+ # Synthetic Logits for Ablation Metrics
+ # =============================================================================
+
+ @pytest.fixture
+ def identical_logits():
+     """Two identical logit tensors for testing KL divergence = 0."""
+     logits = torch.tensor([[[1.0, 2.0, 3.0, 4.0],
+                             [2.0, 3.0, 4.0, 5.0]]])  # [1, 2, 4] = [batch, seq, vocab]
+     return logits, logits.clone()
+
+
+ @pytest.fixture
+ def different_logits():
+     """Two different logit tensors for testing KL divergence > 0."""
+     logits_p = torch.tensor([[[1.0, 2.0, 3.0, 4.0],
+                               [2.0, 3.0, 4.0, 5.0]]])
+     logits_q = torch.tensor([[[4.0, 3.0, 2.0, 1.0],
+                               [5.0, 4.0, 3.0, 2.0]]])
+     return logits_p, logits_q
+
+
+ @pytest.fixture
+ def prob_delta_data():
+     """Data for testing probability delta computation."""
+     # Reference favors token 3, ablated favors token 0
+     logits_ref = torch.tensor([[[1.0, 2.0, 3.0, 10.0],    # pos 0: predicts token 3
+                                 [1.0, 2.0, 10.0, 3.0]]])  # pos 1: predicts token 2
+     logits_abl = torch.tensor([[[10.0, 2.0, 3.0, 1.0],    # pos 0: predicts token 0
+                                 [10.0, 2.0, 1.0, 3.0]]])  # pos 1: predicts token 0
+     input_ids = torch.tensor([[0, 3, 2]])  # Actual tokens: start, 3, 2
+     return logits_ref, logits_abl, input_ids
+
+
+ # =============================================================================
+ # Attribution Data for Visualization Tests
+ # =============================================================================
+
+ @pytest.fixture
+ def mock_attribution_result():
+     """Mock output from compute_integrated_gradients or compute_simple_gradient_attribution."""
+     return {
+         'tokens': ['Hello', ' world', '!'],
+         'token_ids': [1, 2, 3],
+         'attributions': [0.5, 1.0, 0.2],  # Raw attribution scores
+         'normalized_attributions': [0.5, 1.0, 0.2],  # Already normalized for simplicity
+         'target_token': 'next',
+         'target_token_id': 100
+     }
+
+
+ # =============================================================================
+ # Head Categorization Config
+ # =============================================================================
+
+ @pytest.fixture
+ def default_head_config():
+     """Default head categorization configuration for testing."""
+     from utils.head_detection import HeadCategorizationConfig
+     return HeadCategorizationConfig()
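For orientation, the previous-token fixture above can be reproduced outside torch. A small numpy sketch (illustrative only, mirroring the fixture rather than the committed code) confirms that each row forms a valid attention distribution:

```python
import numpy as np

def previous_token_matrix(size=4):
    """Mirror of the previous_token_attention_matrix fixture: position 0
    self-attends; every later position puts 0.8 on the previous token
    and 0.2 on itself."""
    m = np.zeros((size, size))
    m[0, 0] = 1.0
    for i in range(1, size):
        m[i, i - 1] = 0.8
        m[i, i] = 0.2
    return m

# Every row is a valid attention distribution (each row sums to 1.0)
print(previous_token_matrix().sum(axis=1))
```

Row sums of 1 matter here: the entropy and head-detection heuristics under test assume the matrices behave like real softmax attention weights.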
tests/test_ablation_metrics.py ADDED
@@ -0,0 +1,120 @@
+ """
+ Tests for utils/ablation_metrics.py
+
+ Tests KL divergence computation and probability delta calculations.
+ Uses synthetic tensors to avoid model loading.
+ """
+
+ import pytest
+ import torch
+ import torch.nn.functional as F
+ from utils.ablation_metrics import compute_kl_divergence, get_token_probability_deltas
+
+
+ class TestComputeKLDivergence:
+     """Tests for compute_kl_divergence function."""
+
+     def test_identical_distributions_zero_kl(self, identical_logits):
+         """KL divergence of identical distributions should be approximately 0."""
+         logits_p, logits_q = identical_logits
+         kl_divs = compute_kl_divergence(logits_p, logits_q)
+
+         assert isinstance(kl_divs, list)
+         assert len(kl_divs) == 2  # seq_len = 2
+         for kl in kl_divs:
+             assert abs(kl) < 1e-5, f"Expected ~0, got {kl}"
+
+     def test_different_distributions_positive_kl(self, different_logits):
+         """KL divergence of different distributions should be positive."""
+         logits_p, logits_q = different_logits
+         kl_divs = compute_kl_divergence(logits_p, logits_q)
+
+         assert isinstance(kl_divs, list)
+         for kl in kl_divs:
+             assert kl > 0, f"Expected positive KL, got {kl}"
+
+     def test_kl_divergence_asymmetry(self, different_logits):
+         """KL(P||Q) should not equal KL(Q||P) for different distributions."""
+         logits_p, logits_q = different_logits
+         kl_pq = compute_kl_divergence(logits_p, logits_q)
+         kl_qp = compute_kl_divergence(logits_q, logits_p)
+
+         # They should generally be different (asymmetry of KL divergence)
+         assert kl_pq != kl_qp, "KL divergence should be asymmetric"
+
+     def test_handles_3d_input(self):
+         """Should handle [batch, seq_len, vocab_size] input correctly."""
+         logits = torch.randn(1, 5, 100)  # batch=1, seq=5, vocab=100
+         kl_divs = compute_kl_divergence(logits, logits)
+
+         assert len(kl_divs) == 5
+         for kl in kl_divs:
+             assert abs(kl) < 1e-5
+
+
+ class TestGetTokenProbabilityDeltas:
+     """Tests for get_token_probability_deltas function."""
+
+     def test_deltas_with_synthetic_data(self):
+         """Test probability delta computation with known inputs."""
+         # Logits shape: [1, seq_len, vocab_size] where seq_len matches input_ids
+         # input_ids has 3 tokens, so logits needs 3 positions
+         logits_ref = torch.tensor([[[1.0, 2.0, 3.0, 10.0],   # pos 0
+                                     [1.0, 2.0, 10.0, 3.0],   # pos 1
+                                     [1.0, 2.0, 3.0, 4.0]]])  # pos 2
+         logits_abl = torch.tensor([[[10.0, 2.0, 3.0, 1.0],
+                                     [10.0, 2.0, 1.0, 3.0],
+                                     [1.0, 2.0, 3.0, 4.0]]])
+         input_ids = torch.tensor([[0, 3, 2]])
+
+         deltas = get_token_probability_deltas(logits_ref, logits_abl, input_ids)
+
+         # Should return list of length seq_len - 1 (shifted prediction)
+         assert isinstance(deltas, list)
+         assert len(deltas) == 2  # seq_len=3, so 2 predictions (pos 0 predicts token 1, pos 1 predicts token 2)
+
+     def test_identical_logits_zero_delta(self):
+         """Identical logits should produce zero deltas."""
+         # Logits need seq_len=3 to match input_ids
+         logits = torch.tensor([[[1.0, 2.0, 3.0, 4.0],
+                                 [2.0, 3.0, 4.0, 5.0],
+                                 [3.0, 4.0, 5.0, 6.0]]])
+         input_ids = torch.tensor([[0, 3, 2]])
+
+         deltas = get_token_probability_deltas(logits, logits.clone(), input_ids)
+
+         for delta in deltas:
+             assert abs(delta) < 1e-5, f"Expected ~0 delta, got {delta}"
+
+     def test_delta_direction(self):
+         """When ablation increases a token's probability, delta should be positive."""
+         # 3 positions to match 3 input_ids
+         logits_ref = torch.tensor([[[1.0, 0.0, 0.0, 0.0],    # favors token 0
+                                     [1.0, 0.0, 0.0, 0.0],    # favors token 0
+                                     [1.0, 0.0, 0.0, 0.0]]])
+         logits_abl = torch.tensor([[[0.0, 10.0, 0.0, 0.0],   # favors token 1
+                                     [0.0, 10.0, 0.0, 0.0],   # favors token 1
+                                     [0.0, 10.0, 0.0, 0.0]]])
+         input_ids = torch.tensor([[0, 1, 1]])  # Target tokens: 1, 1
+
+         deltas = get_token_probability_deltas(logits_ref, logits_abl, input_ids)
+
+         # Both deltas should be positive (ablation increased target prob)
+         for delta in deltas:
+             assert delta > 0, f"Expected positive delta, got {delta}"
+
+     def test_delta_range(self):
+         """Deltas should be bounded by [-1, 1] since they're probability differences."""
+         # 3 positions to match input_ids
+         logits_ref = torch.tensor([[[100.0, -100.0, -100.0, -100.0],
+                                     [-100.0, 100.0, -100.0, -100.0],
+                                     [-100.0, -100.0, 100.0, -100.0]]])
+         logits_abl = torch.tensor([[[-100.0, 100.0, -100.0, -100.0],
+                                     [-100.0, -100.0, 100.0, -100.0],
+                                     [-100.0, -100.0, -100.0, 100.0]]])
+         input_ids = torch.tensor([[0, 0, 1]])  # Targets: 0, 1
+
+         deltas = get_token_probability_deltas(logits_ref, logits_abl, input_ids)
+
+         for delta in deltas:
+             assert -1.0 <= delta <= 1.0, f"Delta {delta} out of bounds"
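For readers unfamiliar with the metric under test: per-position KL divergence over the vocabulary can be computed as below. This is an illustrative numpy sketch consistent with the tests' expectations (identical logits give ~0, different logits give positive values), not the committed `utils/ablation_metrics.py` implementation:

```python
import numpy as np

def kl_divergence_per_position(logits_ref, logits_abl):
    """KL(P || Q) at each sequence position for two [seq, vocab] logit arrays."""
    def log_softmax(x):
        # Numerically stable log-softmax over the vocab axis
        x = x - x.max(axis=-1, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

    log_p = log_softmax(np.asarray(logits_ref, dtype=float))
    log_q = log_softmax(np.asarray(logits_abl, dtype=float))
    p = np.exp(log_p)
    # sum_v p(v) * (log p(v) - log q(v)) at each position
    return (p * (log_p - log_q)).sum(axis=-1).tolist()
```

Working in log space (log-softmax rather than softmax followed by log) avoids underflow for large vocabularies, which is why the tests can safely use logits as extreme as ±100.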
tests/test_head_detection.py ADDED
@@ -0,0 +1,313 @@
+ """
+ Tests for utils/head_detection.py
+
+ Tests attention head categorization heuristics using synthetic attention matrices.
+ """
+
+ import pytest
+ import torch
+ import numpy as np
+ from utils.head_detection import (
+     compute_attention_entropy,
+     detect_previous_token_head,
+     detect_first_token_head,
+     detect_bow_head,
+     detect_syntactic_head,
+     categorize_attention_head,
+     categorize_all_heads,
+     format_categorization_summary,
+     HeadCategorizationConfig
+ )
+
+
+ class TestComputeAttentionEntropy:
+     """Tests for compute_attention_entropy function."""
+
+     def test_uniform_distribution_high_entropy(self):
+         """Uniform attention should have high (near 1.0) normalized entropy."""
+         # 4 positions with equal attention
+         uniform = torch.tensor([0.25, 0.25, 0.25, 0.25])
+         entropy = compute_attention_entropy(uniform)
+
+         # Normalized entropy should be close to 1.0 for uniform
+         assert 0.95 <= entropy <= 1.0, f"Expected ~1.0, got {entropy}"
+
+     def test_peaked_distribution_low_entropy(self):
+         """Peaked attention should have low normalized entropy."""
+         # One position dominates
+         peaked = torch.tensor([0.97, 0.01, 0.01, 0.01])
+         entropy = compute_attention_entropy(peaked)
+
+         # Should be low entropy
+         assert entropy < 0.3, f"Expected low entropy, got {entropy}"
+
+     def test_entropy_bounds(self):
+         """Entropy should always be between 0 and 1 (normalized)."""
+         test_cases = [
+             torch.tensor([1.0, 0.0, 0.0, 0.0]),      # Extreme peaked
+             torch.tensor([0.5, 0.5, 0.0, 0.0]),      # Two positions
+             torch.tensor([0.25, 0.25, 0.25, 0.25]),  # Uniform
+         ]
+
+         for weights in test_cases:
+             entropy = compute_attention_entropy(weights)
+             assert 0.0 <= entropy <= 1.0, f"Entropy {entropy} out of bounds"
+
+
+ class TestDetectPreviousTokenHead:
+     """Tests for detect_previous_token_head function."""
+
+     def test_detects_previous_token_pattern(self, previous_token_attention_matrix, default_head_config):
+         """Should detect matrix with strong previous-token attention."""
+         is_prev, score = detect_previous_token_head(
+             previous_token_attention_matrix,
+             default_head_config
+         )
+
+         assert is_prev == True
+         assert score > 0.5, f"Expected high score, got {score}"
+
+     def test_rejects_uniform_attention(self, uniform_attention_matrix, default_head_config):
+         """Should reject matrix with uniform attention."""
+         is_prev, score = detect_previous_token_head(
+             uniform_attention_matrix,
+             default_head_config
+         )
+
+         assert is_prev == False
+         assert score < 0.4, f"Expected low score, got {score}"
+
+     def test_short_sequence_returns_false(self, default_head_config):
+         """Sequence shorter than min_seq_len should return False."""
+         short_matrix = torch.ones(2, 2) / 2
+         is_prev, score = detect_previous_token_head(short_matrix, default_head_config)
+
+         assert is_prev == False
+         assert score == 0.0
+
+
+ class TestDetectFirstTokenHead:
+     """Tests for detect_first_token_head function."""
+
+     def test_detects_first_token_pattern(self, first_token_attention_matrix, default_head_config):
+         """Should detect matrix with strong first-token attention."""
+         is_first, score = detect_first_token_head(
+             first_token_attention_matrix,
+             default_head_config
+         )
+
+         assert is_first == True
+         assert score > 0.5, f"Expected high score, got {score}"
+
+     def test_low_first_token_attention(self, default_head_config):
+         """Matrix with low attention to first token should not be detected."""
+         # Create matrix where first token gets very little attention
+         # Use size 5 to be above min_seq_len and avoid overlap at [0,0]
+         size = 5
+         matrix = torch.zeros(size, size)
+         for i in range(size):
+             # Distribute attention: 5% to first token, 95% to last token
+             matrix[i, 0] = 0.05
+             matrix[i, -1] = 0.95
+
+         is_first, score = detect_first_token_head(matrix, default_head_config)
+
+         assert is_first == False
+         assert score < 0.25, f"Expected low score, got {score}"
+
+
+ class TestDetectBowHead:
+     """Tests for detect_bow_head (bag-of-words / diffuse attention)."""
+
+     def test_detects_uniform_as_bow(self, uniform_attention_matrix, default_head_config):
+         """Uniform attention should be detected as BoW head."""
+         is_bow, score = detect_bow_head(uniform_attention_matrix, default_head_config)
+
+         # Uniform has high entropy and low max attention - should be BoW
+         assert is_bow == True
+         assert score > 0.9, f"Expected high entropy score, got {score}"
+
+     def test_rejects_peaked_attention(self, peaked_attention_matrix, default_head_config):
+         """Peaked attention should not be detected as BoW."""
+         is_bow, score = detect_bow_head(peaked_attention_matrix, default_head_config)
+
+         # Peaked attention has low entropy - should not be BoW
+         assert is_bow == False
+
+
+ class TestDetectSyntacticHead:
+     """Tests for detect_syntactic_head function."""
+
+     def test_consistent_distance_pattern(self, default_head_config):
+         """Matrix with consistent distance pattern should be detected as syntactic."""
+         # Create matrix where each position attends to position 2 tokens back
+         size = 6
+         matrix = torch.zeros(size, size)
+         for i in range(size):
+             target = max(0, i - 2)  # 2 tokens back
+             matrix[i, target] = 1.0
+
+         is_syn, score = detect_syntactic_head(matrix, default_head_config)
+
+         # Should have consistent distance pattern
+         assert score > 0.0, "Expected positive score for consistent pattern"
+
+     def test_random_attention_returns_valid_values(self, default_head_config):
+         """Random attention should return valid boolean and score."""
+         torch.manual_seed(42)
+         random_matrix = torch.softmax(torch.randn(6, 6), dim=-1)
+
+         is_syn, score = detect_syntactic_head(random_matrix, default_head_config)
+
+         # Check it returns valid types (bool or numpy bool, and numeric score)
+         assert is_syn in [True, False] or bool(is_syn) in [True, False]
+         assert 0 <= float(score) <= 1
+
+
+ class TestCategorizeAttentionHead:
+     """Tests for categorize_attention_head function."""
+
+     def test_categorizes_previous_token_head(self, previous_token_attention_matrix, default_head_config):
+         """Should categorize previous-token pattern correctly."""
+         result = categorize_attention_head(
+             previous_token_attention_matrix,
+             layer_idx=0,
+             head_idx=3,
+             config=default_head_config
+         )
+
+         assert result['category'] == 'previous_token'
+         assert result['layer'] == 0
+         assert result['head'] == 3
+         assert result['label'] == 'L0-H3'
+         assert 'scores' in result
+
+     def test_categorizes_first_token_head(self, first_token_attention_matrix, default_head_config):
+         """Should categorize first-token pattern correctly."""
+         result = categorize_attention_head(
+             first_token_attention_matrix,
+             layer_idx=2,
+             head_idx=5,
+             config=default_head_config
+         )
+
+         assert result['category'] == 'first_token'
+         assert result['label'] == 'L2-H5'
+
+     def test_categorizes_bow_head(self, default_head_config):
+         """Should categorize diffuse attention as BoW when it doesn't match other patterns."""
+         # Create BoW-like matrix: diffuse attention but first token gets LESS than threshold
+         # This avoids triggering first_token detection (threshold 0.25)
+         size = 5
+         matrix = torch.zeros(size, size)
+         for i in range(size):
+             # First token gets only 0.1, rest get roughly equal share
+             matrix[i, 0] = 0.1
+             remaining = 0.9 / (size - 1)
+             for j in range(1, size):
+                 matrix[i, j] = remaining
+
+         result = categorize_attention_head(
+             matrix,
+             layer_idx=1,
+             head_idx=0,
+             config=default_head_config
+         )
+
+         assert result['category'] == 'bow'
+
+     def test_result_structure(self, uniform_attention_matrix):
+         """Result should have all required keys."""
+         result = categorize_attention_head(
+             uniform_attention_matrix,
+             layer_idx=0,
+             head_idx=0
+         )
+
+         required_keys = ['layer', 'head', 'category', 'scores', 'label']
+         for key in required_keys:
+             assert key in result, f"Missing key: {key}"
+
+
+ class TestCategorizeAllHeads:
+     """Tests for categorize_all_heads function."""
+
+     def test_returns_all_categories(self, mock_activation_data, default_head_config):
+         """Should return dict with all category keys."""
+         result = categorize_all_heads(mock_activation_data, default_head_config)
+
+         expected_categories = ['previous_token', 'first_token', 'bow', 'syntactic', 'other']
+         for cat in expected_categories:
+             assert cat in result, f"Missing category: {cat}"
+             assert isinstance(result[cat], list)
+
+     def test_handles_empty_attention_data(self, default_head_config):
+         """Should handle activation data with no attention outputs."""
+         empty_data = {'attention_outputs': {}}
247
+ result = categorize_all_heads(empty_data, default_head_config)
248
+
249
+ # Should return empty lists for all categories
250
+ for cat, heads in result.items():
251
+ assert heads == []
252
+
253
+
254
+ class TestFormatCategorizationSummary:
255
+ """Tests for format_categorization_summary function."""
256
+
257
+ def test_formats_empty_categorization(self):
258
+ """Should format empty categorization without error."""
259
+ empty = {
260
+ 'previous_token': [],
261
+ 'first_token': [],
262
+ 'bow': [],
263
+ 'syntactic': [],
264
+ 'other': []
265
+ }
266
+ result = format_categorization_summary(empty)
267
+
268
+ assert isinstance(result, str)
269
+ assert "Total Heads: 0" in result
270
+
271
+ def test_formats_with_heads(self):
272
+ """Should format categorization with heads correctly."""
273
+ categorized = {
274
+ 'previous_token': [
275
+ {'layer': 0, 'head': 1, 'label': 'L0-H1'},
276
+ {'layer': 0, 'head': 2, 'label': 'L0-H2'},
277
+ ],
278
+ 'first_token': [
279
+ {'layer': 1, 'head': 0, 'label': 'L1-H0'},
280
+ ],
281
+ 'bow': [],
282
+ 'syntactic': [],
283
+ 'other': []
284
+ }
285
+ result = format_categorization_summary(categorized)
286
+
287
+ assert "Total Heads: 3" in result
288
+ assert "Previous-Token Heads: 2" in result
289
+ assert "First/Positional-Token Heads: 1" in result
290
+ assert "Layer 0" in result
291
+ assert "Layer 1" in result
292
+
293
+
294
+ class TestHeadCategorizationConfig:
295
+ """Tests for HeadCategorizationConfig defaults."""
296
+
297
+ def test_default_values(self):
298
+ """Default config should have reasonable values."""
299
+ config = HeadCategorizationConfig()
300
+
301
+ assert 0 < config.prev_token_threshold < 1
302
+ assert 0 < config.first_token_threshold < 1
303
+ assert 0 < config.bow_entropy_threshold < 1
304
+ assert config.min_seq_len > 0
305
+
306
+ def test_config_is_mutable(self):
307
+ """Config values should be mutable for customization."""
308
+ config = HeadCategorizationConfig()
309
+ original = config.prev_token_threshold
310
+
311
+ config.prev_token_threshold = 0.8
312
+ assert config.prev_token_threshold == 0.8
313
+ assert config.prev_token_threshold != original
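The "consistent distance" behavior the syntactic-head tests above rely on can be sketched roughly as follows. This is an assumed illustration of the kind of check `detect_syntactic_head` might perform, not the actual `utils/head_detection` code; plain lists stand in for tensors.

```python
def distance_consistency_score(matrix):
    """Fraction of query rows whose argmax key sits at the most common offset.

    Hypothetical sketch: a high score means most positions attend at the
    same relative distance, the signature the tests treat as syntactic.
    """
    offsets = []
    for i, row in enumerate(matrix):
        j = row.index(max(row))  # key position this query attends to most
        offsets.append(i - j)
    most_common = max(set(offsets), key=offsets.count)
    return offsets.count(most_common) / len(offsets)
```

On the "2 tokens back" matrix built in the first test, four of six rows share offset 2 (the first two rows are clamped to position 0), so the score is 4/6 and comfortably positive.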
tests/test_model_config.py ADDED
@@ -0,0 +1,161 @@
+"""
+Tests for utils/model_config.py
+
+Tests model family lookups, configuration retrieval, and auto-selection logic.
+"""
+
+import pytest
+from utils.model_config import (
+    get_model_family,
+    get_family_config,
+    get_auto_selections,
+    _pattern_matches_template,
+    MODEL_TO_FAMILY,
+    MODEL_FAMILIES
+)
+
+
+class TestGetModelFamily:
+    """Tests for get_model_family function."""
+
+    def test_known_gpt2_model(self):
+        """Known GPT-2 models should return the 'gpt2' family."""
+        assert get_model_family("gpt2") == "gpt2"
+        assert get_model_family("gpt2-medium") == "gpt2"
+        assert get_model_family("openai-community/gpt2") == "gpt2"
+
+    def test_known_llama_model(self):
+        """Known LLaMA-like models should return the 'llama_like' family."""
+        assert get_model_family("Qwen/Qwen2.5-0.5B") == "llama_like"
+        assert get_model_family("meta-llama/Llama-2-7b-hf") == "llama_like"
+        assert get_model_family("mistralai/Mistral-7B-v0.1") == "llama_like"
+
+    def test_known_opt_model(self):
+        """Known OPT models should return the 'opt' family."""
+        assert get_model_family("facebook/opt-125m") == "opt"
+        assert get_model_family("facebook/opt-1.3b") == "opt"
+
+    def test_unknown_model_returns_none(self):
+        """Unknown models should return None."""
+        assert get_model_family("unknown/model-name") is None
+        assert get_model_family("random-string") is None
+        assert get_model_family("") is None
+
+
+class TestGetFamilyConfig:
+    """Tests for get_family_config function."""
+
+    def test_valid_gpt2_config(self):
+        """GPT-2 family config should have the correct structure."""
+        config = get_family_config("gpt2")
+        assert config is not None
+        assert "templates" in config
+        assert "attention_pattern" in config["templates"]
+        assert config["templates"]["attention_pattern"] == "transformer.h.{N}.attn"
+        assert config["norm_type"] == "layernorm"
+
+    def test_valid_llama_config(self):
+        """LLaMA-like family config should have the correct structure."""
+        config = get_family_config("llama_like")
+        assert config is not None
+        assert config["templates"]["attention_pattern"] == "model.layers.{N}.self_attn"
+        assert config["norm_type"] == "rmsnorm"
+        assert config["norm_parameter"] == "model.norm.weight"
+
+    def test_invalid_family_returns_none(self):
+        """Invalid family name should return None."""
+        assert get_family_config("invalid_family") is None
+        assert get_family_config("") is None
+        assert get_family_config("GPT2") is None  # Case-sensitive
+
+
+class TestPatternMatchesTemplate:
+    """Tests for _pattern_matches_template function."""
+
+    def test_exact_match(self):
+        """Pattern that exactly matches the template should return True."""
+        assert _pattern_matches_template(
+            "model.layers.{N}.self_attn",
+            "model.layers.{N}.self_attn"
+        ) is True
+
+    def test_matching_with_n_placeholder(self):
+        """Patterns with the {N} placeholder should match correctly."""
+        assert _pattern_matches_template(
+            "transformer.h.{N}.attn",
+            "transformer.h.{N}.attn"
+        ) is True
+
+    def test_non_matching_pattern(self):
+        """Different patterns should not match."""
+        assert _pattern_matches_template(
+            "model.layers.{N}.self_attn",
+            "transformer.h.{N}.attn"
+        ) is False
+
+    def test_empty_template_returns_false(self):
+        """Empty template should return False."""
+        assert _pattern_matches_template("model.layers.{N}.self_attn", "") is False
+        assert _pattern_matches_template("", "") is False
+
+
+class TestGetAutoSelections:
+    """Tests for get_auto_selections function."""
+
+    def test_unknown_model_returns_empty_selections(self):
+        """Unknown model should return empty selections."""
+        result = get_auto_selections(
+            "unknown/model",
+            {"model.layers.{N}.self_attn": ["model.layers.0.self_attn"]},
+            {"model.norm.weight": ["model.norm.weight"]}
+        )
+        assert result["attention_selection"] == []
+        assert result["block_selection"] == []
+        assert result["norm_selection"] == []
+        assert result["family_name"] is None
+
+    def test_known_model_matches_patterns(self, mock_module_patterns, mock_param_patterns):
+        """Known model should match the appropriate patterns."""
+        result = get_auto_selections(
+            "Qwen/Qwen2.5-0.5B",  # llama_like family
+            mock_module_patterns,
+            mock_param_patterns
+        )
+        assert result["family_name"] == "llama_like"
+        # Should find the self_attn pattern
+        assert "model.layers.{N}.self_attn" in result["attention_selection"]
+        # Should find the block pattern
+        assert "model.layers.{N}" in result["block_selection"]
+        # Should find the norm pattern
+        assert result["norm_selection"] == ["model.norm.weight"]
+
+    def test_result_structure(self):
+        """Result should have all required keys."""
+        result = get_auto_selections(
+            "gpt2",
+            {},  # Empty patterns - no matches expected
+            {}
+        )
+        assert "attention_selection" in result
+        assert "block_selection" in result
+        assert "norm_selection" in result
+        assert "family_name" in result
+        assert isinstance(result["attention_selection"], list)
+        assert isinstance(result["norm_selection"], list)
+
+
+class TestModelRegistryIntegrity:
+    """Tests to verify the model registry data is consistent."""
+
+    def test_all_families_have_required_fields(self):
+        """All model families should have the required configuration fields."""
+        required_fields = ["description", "templates", "norm_type"]
+        for family_name, config in MODEL_FAMILIES.items():
+            for field in required_fields:
+                assert field in config, f"Family {family_name} missing {field}"
+
+    def test_all_mapped_families_exist(self):
+        """All families referenced in MODEL_TO_FAMILY should exist in MODEL_FAMILIES."""
+        for model_name, family_name in MODEL_TO_FAMILY.items():
+            assert family_name in MODEL_FAMILIES, \
+                f"Model {model_name} references unknown family {family_name}"
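The lookup behavior these tests exercise amounts to a plain dictionary mapping with a `None` fallback. A minimal sketch, using an illustrative subset of the mapping (the real `MODEL_TO_FAMILY` lives in `utils/model_config.py` and covers more models):

```python
# Illustrative subset of the registry; the real mapping is larger.
MODEL_TO_FAMILY = {
    "gpt2": "gpt2",
    "gpt2-medium": "gpt2",
    "Qwen/Qwen2.5-0.5B": "llama_like",
    "facebook/opt-125m": "opt",
}

def get_model_family(model_name):
    # Unknown models fall through to None rather than raising,
    # which is the behavior test_unknown_model_returns_none pins down.
    return MODEL_TO_FAMILY.get(model_name)
```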
tests/test_model_patterns.py ADDED
@@ -0,0 +1,180 @@
+"""
+Tests for utils/model_patterns.py
+
+Tests pure logic functions that don't require model loading:
+- merge_token_probabilities
+- safe_to_serializable
+"""
+
+import pytest
+import torch
+from utils.model_patterns import merge_token_probabilities, safe_to_serializable
+
+
+class TestMergeTokenProbabilities:
+    """Tests for merge_token_probabilities function."""
+
+    def test_merges_tokens_with_leading_space(self):
+        """Tokens with and without a leading space should be merged."""
+        token_probs = [
+            (" cat", 0.15),
+            ("cat", 0.05),
+            (" dog", 0.10),
+        ]
+        result = merge_token_probabilities(token_probs)
+
+        # Convert to dict for easier checking
+        result_dict = dict(result)
+
+        assert "cat" in result_dict
+        assert abs(result_dict["cat"] - 0.20) < 1e-6  # 0.15 + 0.05
+        assert "dog" in result_dict
+        assert abs(result_dict["dog"] - 0.10) < 1e-6
+
+    def test_sorts_by_probability_descending(self):
+        """Results should be sorted by probability (highest first)."""
+        token_probs = [
+            ("low", 0.01),
+            ("high", 0.50),
+            ("medium", 0.20),
+        ]
+        result = merge_token_probabilities(token_probs)
+
+        # Check order: high, medium, low
+        assert result[0][0] == "high"
+        assert result[1][0] == "medium"
+        assert result[2][0] == "low"
+
+    def test_handles_empty_input(self):
+        """Empty input should return an empty list."""
+        result = merge_token_probabilities([])
+        assert result == []
+
+    def test_handles_single_token(self):
+        """Single token should be returned as-is (stripped)."""
+        result = merge_token_probabilities([(" hello", 0.5)])
+
+        assert len(result) == 1
+        assert result[0][0] == "hello"
+        assert result[0][1] == 0.5
+
+    def test_strips_multiple_spaces(self):
+        """Multiple leading spaces should all be stripped."""
+        token_probs = [
+            ("  word", 0.3),  # Two spaces
+            (" word", 0.2),   # One space
+            ("word", 0.1),    # No space
+        ]
+        result = merge_token_probabilities(token_probs)
+
+        result_dict = dict(result)
+        assert "word" in result_dict
+        assert abs(result_dict["word"] - 0.6) < 1e-6  # All merged
+
+
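The behavior these tests pin down — strip leading spaces, sum probabilities for colliding tokens, sort descending — can be sketched in a few lines. This is an assumed reference implementation, not a copy of the actual `utils/model_patterns.py` code:

```python
def merge_token_probabilities(token_probs):
    """Merge tokens that differ only by leading whitespace, summing their
    probabilities, and return (token, prob) pairs sorted highest-first."""
    merged = {}
    for token, prob in token_probs:
        key = token.lstrip()  # " cat", "  cat", and "cat" all collapse to "cat"
        merged[key] = merged.get(key, 0.0) + prob
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```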
+class TestSafeToSerializable:
+    """Tests for safe_to_serializable function."""
+
+    def test_converts_tensor_to_list(self):
+        """PyTorch tensor should be converted to Python list."""
+        tensor = torch.tensor([1.0, 2.0, 3.0])
+        result = safe_to_serializable(tensor)
+
+        assert isinstance(result, list)
+        assert result == [1.0, 2.0, 3.0]
+
+    def test_converts_nested_tensor(self):
+        """2D tensor should become nested list."""
+        tensor = torch.tensor([[1, 2], [3, 4]])
+        result = safe_to_serializable(tensor)
+
+        assert isinstance(result, list)
+        assert result == [[1, 2], [3, 4]]
+
+    def test_converts_list_of_tensors(self):
+        """List containing tensors should have tensors converted."""
+        data = [torch.tensor([1, 2]), torch.tensor([3, 4])]
+        result = safe_to_serializable(data)
+
+        assert result == [[1, 2], [3, 4]]
+
+    def test_converts_dict_with_tensor_values(self):
+        """Dict with tensor values should have values converted."""
+        data = {
+            "a": torch.tensor([1.0, 2.0]),
+            "b": "string_value",
+            "c": 42
+        }
+        result = safe_to_serializable(data)
+
+        assert result["a"] == [1.0, 2.0]
+        assert result["b"] == "string_value"
+        assert result["c"] == 42
+
+    def test_handles_tuple_input(self):
+        """Tuple with tensors should be converted to list."""
+        data = (torch.tensor([1]), torch.tensor([2]))
+        result = safe_to_serializable(data)
+
+        assert isinstance(result, list)
+        assert result == [[1], [2]]
+
+    def test_passes_through_primitives(self):
+        """Primitive types should pass through unchanged."""
+        assert safe_to_serializable(42) == 42
+        assert safe_to_serializable(3.14) == 3.14
+        assert safe_to_serializable("hello") == "hello"
+        assert safe_to_serializable(None) is None
+        assert safe_to_serializable(True) is True
+
+    def test_handles_deeply_nested_structure(self):
+        """Should handle deeply nested structures with tensors."""
+        data = {
+            "level1": {
+                "level2": {
+                    "tensor": torch.tensor([1, 2, 3])
+                }
+            },
+            "list": [torch.tensor([4, 5])]
+        }
+        result = safe_to_serializable(data)
+
+        assert result["level1"]["level2"]["tensor"] == [1, 2, 3]
+        assert result["list"] == [[4, 5]]
+
+    def test_handles_empty_containers(self):
+        """Empty lists, dicts, tuples should remain empty."""
+        assert safe_to_serializable([]) == []
+        assert safe_to_serializable({}) == {}
+        assert safe_to_serializable(()) == []  # Tuple becomes list
+
+
+class TestSafeToSerializableEdgeCases:
+    """Edge case tests for safe_to_serializable."""
+
+    def test_handles_scalar_tensor(self):
+        """Scalar tensor should become a Python scalar."""
+        scalar = torch.tensor(42.0)
+        result = safe_to_serializable(scalar)
+
+        # Scalar tensor.tolist() returns a Python number
+        assert result == 42.0
+
+    def test_handles_integer_tensor(self):
+        """Integer tensor should be converted correctly."""
+        tensor = torch.tensor([1, 2, 3], dtype=torch.int64)
+        result = safe_to_serializable(tensor)
+
+        assert result == [1, 2, 3]
+        assert all(isinstance(x, int) for x in result)
+
+    def test_handles_mixed_list(self):
+        """List with mixed tensor and non-tensor items should work."""
+        data = [torch.tensor([1]), "string", 42, {"key": torch.tensor([2])}]
+        result = safe_to_serializable(data)
+
+        assert result[0] == [1]
+        assert result[1] == "string"
+        assert result[2] == 42
+        assert result[3] == {"key": [2]}
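Taken together, these tests describe a simple recursive conversion. A sketch of that shape follows; the real `safe_to_serializable` dispatches on `torch.Tensor`, while this version duck-types on `.tolist()` (which tensors and numpy arrays both expose) so it can be read without a torch install:

```python
def safe_to_serializable(obj):
    """Recursively convert tensors and containers to JSON-friendly types
    (assumed sketch, not the actual utils/model_patterns implementation)."""
    if hasattr(obj, "tolist"):                # torch.Tensor / np.ndarray
        return obj.tolist()                   # scalar tensors become numbers
    if isinstance(obj, dict):
        return {k: safe_to_serializable(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):        # tuples become lists
        return [safe_to_serializable(v) for v in obj]
    return obj                                # primitives pass through
```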
tests/test_token_attribution.py ADDED
@@ -0,0 +1,182 @@
+"""
+Tests for utils/token_attribution.py
+
+Tests the visualization data formatting function (pure logic).
+The gradient computation functions require models and are not tested here.
+"""
+
+import pytest
+from utils.token_attribution import create_attribution_visualization_data
+
+
+class TestCreateAttributionVisualizationData:
+    """Tests for create_attribution_visualization_data function."""
+
+    def test_returns_correct_structure(self, mock_attribution_result):
+        """Should return list of dicts with required keys."""
+        result = create_attribution_visualization_data(mock_attribution_result)
+
+        assert isinstance(result, list)
+        assert len(result) == 3  # 3 tokens in mock data
+
+        required_keys = ['token', 'index', 'attribution', 'normalized', 'color', 'text_color']
+        for item in result:
+            for key in required_keys:
+                assert key in item, f"Missing key: {key}"
+
+    def test_preserves_token_order(self, mock_attribution_result):
+        """Tokens should be in same order as input."""
+        result = create_attribution_visualization_data(mock_attribution_result)
+
+        assert result[0]['token'] == 'Hello'
+        assert result[1]['token'] == ' world'
+        assert result[2]['token'] == '!'
+
+        assert result[0]['index'] == 0
+        assert result[1]['index'] == 1
+        assert result[2]['index'] == 2
+
+    def test_preserves_attribution_values(self, mock_attribution_result):
+        """Raw attribution values should be preserved."""
+        result = create_attribution_visualization_data(mock_attribution_result)
+
+        assert result[0]['attribution'] == 0.5
+        assert result[1]['attribution'] == 1.0
+        assert result[2]['attribution'] == 0.2
+
+    def test_color_format(self, mock_attribution_result):
+        """Colors should be valid RGB format."""
+        result = create_attribution_visualization_data(mock_attribution_result)
+
+        for item in result:
+            color = item['color']
+            assert color.startswith('rgb(')
+            assert color.endswith(')')
+            # Extract RGB values
+            rgb_str = color[4:-1]
+            r, g, b = [int(x) for x in rgb_str.split(',')]
+            assert 0 <= r <= 255
+            assert 0 <= g <= 255
+            assert 0 <= b <= 255
+
+    def test_text_color_contrast(self, mock_attribution_result):
+        """Text color should be black or white for contrast."""
+        result = create_attribution_visualization_data(mock_attribution_result)
+
+        for item in result:
+            assert item['text_color'] in ['#000000', '#ffffff']
+
+    def test_high_attribution_gets_color(self):
+        """High attribution should result in colored background."""
+        data = {
+            'tokens': ['high'],
+            'token_ids': [1],
+            'attributions': [1.0],  # Maximum positive attribution
+            'normalized_attributions': [1.0],
+            'target_token': 'x',
+            'target_token_id': 100
+        }
+        result = create_attribution_visualization_data(data)
+
+        # High positive attribution should have red-ish color (r=255)
+        color = result[0]['color']
+        rgb_str = color[4:-1]
+        r, g, b = [int(x) for x in rgb_str.split(',')]
+
+        # Red should be at max, green/blue should be reduced
+        assert r == 255
+        assert g < 255  # Reduced for visibility
+        assert b < 255
+
+    def test_handles_zero_attributions(self):
+        """Zero attributions should produce neutral colors."""
+        data = {
+            'tokens': ['zero'],
+            'token_ids': [1],
+            'attributions': [0.0],
+            'normalized_attributions': [0.0],
+            'target_token': 'x',
+            'target_token_id': 100
+        }
+        result = create_attribution_visualization_data(data)
+
+        # Zero normalized attribution should give white-ish color
+        color = result[0]['color']
+        rgb_str = color[4:-1]
+        r, g, b = [int(x) for x in rgb_str.split(',')]
+
+        # All components should be high (near white)
+        assert r == 255
+        assert g == 255
+        assert b == 255
+
+    def test_handles_negative_attributions(self):
+        """Negative attributions should get blue-ish color."""
+        data = {
+            'tokens': ['negative'],
+            'token_ids': [1],
+            'attributions': [-1.0],  # Negative attribution
+            'normalized_attributions': [1.0],  # Abs normalized
+            'target_token': 'x',
+            'target_token_id': 100
+        }
+        result = create_attribution_visualization_data(data)
+
+        # Negative attribution should have blue-ish color
+        color = result[0]['color']
+        rgb_str = color[4:-1]
+        r, g, b = [int(x) for x in rgb_str.split(',')]
+
+        # Blue should be at max, red/green should be reduced
+        assert b == 255
+        assert r < 255
+        assert g < 255
+
+
+class TestAttributionVisualizationEdgeCases:
+    """Edge case tests for create_attribution_visualization_data."""
+
+    def test_handles_single_token(self):
+        """Should handle single token input."""
+        data = {
+            'tokens': ['only'],
+            'token_ids': [1],
+            'attributions': [0.5],
+            'normalized_attributions': [1.0],  # Normalized to max
+            'target_token': 'x',
+            'target_token_id': 100
+        }
+        result = create_attribution_visualization_data(data)
+
+        assert len(result) == 1
+        assert result[0]['token'] == 'only'
+
+    def test_handles_empty_input(self):
+        """Should handle empty token list."""
+        data = {
+            'tokens': [],
+            'token_ids': [],
+            'attributions': [],
+            'normalized_attributions': [],
+            'target_token': 'x',
+            'target_token_id': 100
+        }
+        result = create_attribution_visualization_data(data)
+
+        assert result == []
+
+    def test_handles_special_characters_in_tokens(self):
+        """Should handle tokens with special characters."""
+        data = {
+            'tokens': ['<s>', '</s>', '\n', ' '],
+            'token_ids': [1, 2, 3, 4],
+            'attributions': [0.1, 0.2, 0.3, 0.4],
+            'normalized_attributions': [0.25, 0.5, 0.75, 1.0],
+            'target_token': 'x',
+            'target_token_id': 100
+        }
+        result = create_attribution_visualization_data(data)
+
+        assert len(result) == 4
+        assert result[0]['token'] == '<s>'
+        assert result[2]['token'] == '\n'
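The color assertions above imply a signed mapping: positive attributions tint toward red, negative toward blue, and zero stays white, with the normalized magnitude controlling how deep the tint is. A hypothetical sketch of that rule (not the actual `utils/token_attribution.py` code):

```python
def attribution_color(attribution, normalized):
    """Map a signed attribution plus its normalized magnitude to an rgb() string.

    Assumed rule: the dominant channel (red for positive, blue for negative)
    stays at 255 while the other channels fade as magnitude grows.
    """
    fade = int(255 * (1 - normalized))  # stronger attribution -> deeper tint
    if attribution >= 0:
        r, g, b = 255, fade, fade       # positive -> red-ish
    else:
        r, g, b = fade, fade, 255       # negative -> blue-ish
    return f"rgb({r},{g},{b})"
```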
todo.md CHANGED
@@ -1,5 +1,14 @@
 # Todo
 
+## Completed: Test Suite Setup (Done)
+- [x] Create `tests/` folder with `__init__.py` and `conftest.py` (shared fixtures)
+- [x] Create `test_model_config.py` - 15 tests for model family lookups
+- [x] Create `test_ablation_metrics.py` - 8 tests for KL divergence and probability deltas
+- [x] Create `test_head_detection.py` - 20 tests for attention head categorization
+- [x] Create `test_model_patterns.py` - 16 tests for merge_token_probabilities, safe_to_serializable
+- [x] Create `test_token_attribution.py` - 11 tests for visualization data formatting
+- [x] Verify all 73 tests pass with `pytest tests/ -v`
+
 ## Completed: Pipeline Explanation Refactor
 
 ### Phase 1: New Components (Done)
@@ -16,19 +25,3 @@
 - [x] Delete `prompt_comparison.py`
 - [x] Update `utils/__init__.py` exports
 - [x] Add pipeline CSS styles to `assets/style.css`
-
----
-
-## Next Steps
-
-### Testing
-- [ ] Run the dashboard and verify all pipeline stages render correctly
-- [ ] Test ablation experiment workflow
-- [ ] Test token attribution (both methods)
-- [ ] Verify beam search still works with multi-token generation
-
-### Enhancements (Optional)
-- [ ] Add loading spinners to investigation tools
-- [ ] Improve attention visualization formatting
-- [ ] Add more detailed MLP stage visualization
-- [ ] Consider adding "copy to clipboard" for token data
utils/__pycache__/__init__.cpython-311.pyc CHANGED
Binary files a/utils/__pycache__/__init__.cpython-311.pyc and b/utils/__pycache__/__init__.cpython-311.pyc differ
 
utils/__pycache__/token_attribution.cpython-311.pyc ADDED
Binary file (11.8 kB). View file