cdpearlman committed

Commit 11aaea3 · 1 Parent(s): c629c1f

ContextKit LLM memory update
.context/data/decisions.md ADDED
@@ -0,0 +1,28 @@
+ # Decision Record
+
+ <!-- Append-only. Record significant decisions with reasoning. -->
+ <!-- Format:
+ ## [Decision title]
+ **Date**: YYYY-MM-DD
+ **Context**: What prompted this decision
+ **Options considered**: What alternatives were evaluated
+ **Decision**: What was chosen
+ **Reasoning**: Why
+ **Revisit if**: Conditions that would warrant reconsidering
+ -->
+
+ ## Educational depth: conceptual over mathematical
+ **Date**: 2026-03-02
+ **Context**: Target audience includes people without full college-level CS/math education
+ **Options considered**: (1) Full mathematical rigor, (2) Conceptual understanding with simplified math, (3) No math at all
+ **Decision**: Conceptual understanding with simplified math — skip complex derivations, focus on motivation and intuition
+ **Reasoning**: The goal is building correct mental models, not producing textbook-ready proofs. Accurate simplification serves the audience better than intimidating formalism.
+ **Revisit if**: Audience shifts to researchers or grad students who need full rigor
+
+ ## Chatbot backend: OpenRouter
+ **Date**: 2026-03-02
+ **Context**: Chatbot needed an LLM backend; previously used Gemini
+ **Options considered**: Gemini API, OpenRouter (multi-model), direct OpenAI
+ **Decision**: OpenRouter — provides access to multiple models through a single API
+ **Reasoning**: Flexibility to switch underlying models without code changes; single API key
+ **Revisit if**: OpenRouter pricing becomes prohibitive or a specific model provider offers significantly better educational responses
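A note on what the single API buys in practice: OpenRouter's chat endpoint is OpenAI-compatible, so switching the underlying model is a one-string change. A minimal request-building sketch — hedged: the helper and model IDs are illustrative, not the project's `openrouter_client.py` API, and no network call is made:

```python
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_chat_request(model: str, user_message: str, api_key: str) -> tuple[dict, dict]:
    """Build (headers, payload) for an OpenRouter chat completion call.

    The payload shape is identical regardless of `model`, which is the
    flexibility the decision above is buying: swapping models needs no
    code changes, only a different model string.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        # Illustrative model IDs: "openai/gpt-4o-mini", "google/gemini-flash-1.5"
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return headers, payload
```

Posting `payload` with `headers` to `OPENROUTER_URL` (e.g. via `requests.post`) returns an OpenAI-style completion object.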
.context/data/lessons.md ADDED
@@ -0,0 +1,22 @@
+ # Lessons Learned
+
+ <!-- Append-only. Record what the team learned the hard way. -->
+ <!-- Format:
+ ## YYYY-MM-DD — [Brief title]
+ **What happened**: What went wrong or what was discovered
+ **Root cause**: Why it happened
+ **Fix**: What was done about it
+ **Rule going forward**: What to do (or avoid) in the future
+ -->
+
+ ## 2026-03-02 — Dead code accumulation during refactors
+ **What happened**: Large component changes left hundreds of lines of orphaned code from deprecated or deleted components
+ **Root cause**: Refactors focused only on building the new thing without cleaning up what the old thing left behind
+ **Fix**: Manual cleanup after discovering the bloat
+ **Rule going forward**: Every refactor must include a dead code sweep. This is a first-class concern, not an afterthought.
+
+ ## 2026-03-02 — Tunnel vision on implementation details
+ **What happened**: Going deep on implementation rabbit holes produced outputs that weren't actually useful for the educational goal
+ **Root cause**: Losing sight of the "is this useful for teaching?" question while focused on technical correctness
+ **Fix**: Stepped back and re-evaluated against the educational mission
+ **Rule going forward**: Sanity check every significant change: (1) Does this help someone understand transformers? (2) Is this accurate enough for correct intuition? (3) Am I in a rabbit hole?
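The dead-code sweep above can be partially mechanized. A rough, stdlib-only sketch (a hypothetical helper, not a tool this project ships) that flags top-level functions defined but never referenced in the same source, as candidates for review:

```python
import ast


def unreferenced_functions(source: str) -> list[str]:
    """Return names of top-level functions never referenced in `source`.

    Crude by design: it misses dynamic uses (getattr, re-exports), so
    the result is a review list for a dead-code sweep, not a delete list.
    """
    tree = ast.parse(source)
    defined = {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return sorted(defined - used)
```

Run against a module's text, anything it reports is worth a manual look before deletion.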
.context/data/sessions.md ADDED
@@ -0,0 +1,10 @@
+ # Session Log
+
+ <!-- Append-only. Add a new entry after each substantive work session. -->
+
+ ## 2026-03-02 — Bootstrap
+ **Area**: Project setup
+ **Work done**: Ran ContextKit bootstrap interview, generated memory system
+ **Decisions made**: Established educational philosophy (conceptual understanding over math rigor), dead code cleanup as mandatory during refactors, agent behavior (push back on bad ideas, no yes-man behavior)
+ **Memory created**: architecture.md, conventions.md, education.md, testing.md, sessions.md, decisions.md, lessons.md
+ **Open threads**: None — ready for feature work
.context/modules/architecture.md ADDED
@@ -0,0 +1,62 @@
+ # Architecture
+
+ ## System Overview
+
+ A Plotly Dash single-page application that visualizes transformer LLM internals and enables interactive experimentation. Users select a model, enter a prompt, and explore a five-stage pipeline: Tokenization → Embedding → Attention → MLP → Output.
+
+ ## Component Map
+
+ ```
+ app.py                          # Entry point. Dash layout + all callbacks (~1450 lines)
+ ├── components/
+ │   ├── sidebar.py              # Collapsible left panel: glossary, attention/block/norm dropdowns
+ │   ├── model_selector.py       # Model dropdown + prompt textarea + generation settings
+ │   ├── pipeline.py             # 5-stage expandable pipeline with flow indicators
+ │   ├── investigation_panel.py  # Tabs: Ablation and Token Attribution
+ │   ├── ablation_panel.py       # Head selection, run ablation, original vs ablated comparison
+ │   ├── chatbot.py              # Floating chat icon + window + RAG-aware conversation
+ │   └── glossary.py             # Modal with transformer terms and video links
+ ├── utils/
+ │   ├── model_patterns.py       # Model loading, forward pass, head ablation, bertviz, logit lens
+ │   ├── model_config.py         # Model family definitions, module templates, auto-selections
+ │   ├── head_detection.py       # Categorize heads (Previous Token, Induction, etc.)
+ │   ├── beam_search.py          # Beam search with optional ablation hooks
+ │   ├── token_attribution.py    # Integrated Gradients and simple gradient attribution
+ │   ├── ablation_metrics.py     # KL divergence, sequence scoring, token probability deltas
+ │   ├── openrouter_client.py    # OpenRouter API client for chat + embeddings
+ │   ├── rag_utils.py            # RAG: load/chunk rag_docs/, embed, retrieve
+ │   └── head_categories.json    # Static head category definitions
+ ├── assets/
+ │   ├── style.css               # Custom styling (Bootstrap-compatible)
+ │   └── chat_resize.js          # Client-side chat window resize
+ ├── rag_docs/                   # ~30 markdown files: chatbot knowledge base
+ ├── tests/                      # pytest suite (~12 test files)
+ └── scripts/
+     └── analyze_heads.py        # One-off analysis script
+ ```
+
+ ## Data Flow
+
+ 1. **User selects model** → `model_patterns.load_model()` downloads/caches HF model
+ 2. **User enters prompt** → Forward pass captures activations at each pipeline stage
+ 3. **Pipeline renders** → Each stage shows its visualization (tokens, embeddings, attention maps, MLP, logits)
+ 4. **Beam search** → `beam_search.perform_beam_search()` generates continuations with top-k display
+ 5. **Experiments** → Ablation disables selected heads and re-runs; Attribution computes token importance via gradients
+ 6. **Chatbot** → User question → RAG retrieval from `rag_docs/` → OpenRouter API → streamed response
+
+ ## State Management
48
+
49
+ Dash `dcc.Store` components hold session state: activations, patterns, beam results, ablation state. No server-side session persistence — everything is per-page-load.
50
+
51
+ ## Deployment
52
+
53
+ - Dockerfile targeting Hugging Face Spaces (port 7860)
54
+ - `.env` for `OPENROUTER_API_KEY` (not committed)
55
+ - Models cached locally on first load
56
+
57
+ ## Key Boundaries
58
+
59
+ - **components/** only builds Dash layout — no ML logic
60
+ - **utils/** handles all computation — no Dash imports
61
+ - **app.py** is the glue: callbacks wire components to utils
62
+ - **rag_docs/** is the chatbot's knowledge base — edit these to change what the chatbot knows
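Step 4 of the data flow rests on beam search. A toy, framework-free version of the core loop — a sketch of the algorithm only; the real `beam_search.perform_beam_search` operates on model logits and supports ablation hooks:

```python
import math
from typing import Callable


def beam_search(
    step_fn: Callable[[list[str]], dict[str, float]],
    start: list[str],
    beam_width: int = 2,
    steps: int = 3,
) -> list[tuple[list[str], float]]:
    """Keep the `beam_width` best continuations by cumulative log-prob.

    `step_fn` maps a partial token sequence to next-token probabilities;
    here it stands in for a forward pass over the model.
    """
    beams = [(start, 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0.0:
                    candidates.append((seq + [token], score + math.log(prob)))
        # Prune: keep only the top `beam_width` scoring sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams
```

With a toy `step_fn` that always returns `{"a": 0.6, "b": 0.4}`, the top beam repeats `"a"`; the returned scores are what a top-k display would show.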
.context/modules/conventions.md ADDED
@@ -0,0 +1,70 @@
+ # Conventions
+
+ ## Python Style
+
+ Google Python Style Guide conventions:
+ - `snake_case` for functions, methods, variables, modules
+ - `PascalCase` for classes
+ - `ALL_CAPS` for constants
+ - 4-space indentation, 80-char line length
+ - Type hints on public APIs
+ - Docstrings on public functions/classes (`Args:`, `Returns:`, `Raises:`)
+ - f-strings for formatting
+ - Group imports: stdlib → third-party → local
+ - Run `pylint` to catch bugs and style issues
+ - No mutable default arguments (`[]`, `{}`)
+ - Use implicit false (`if not my_list:`) and `if foo is None:` for None checks
+
+ ## Code Hygiene
+
+ - **Dead code cleanup is mandatory during refactors.** Every refactor must include a sweep for orphaned code from deprecated or deleted components. This is a recurring problem — treat it as a first-class concern.
+ - Prefer small, surgical edits over broad rewrites.
+ - Reuse existing files before creating new modules.
+ - Remove or simplify unnecessary code only when it reduces complexity.
+ - Add concise comments explaining intent only where the change is non-obvious.
+ - Do not reformat unrelated code or alter indentation styles.
+
+ ## Naming & Organization
+
+ - Components go in `components/` — UI layout only, no ML logic
+ - Utilities go in `utils/` — computation only, no Dash imports
+ - Tests go in `tests/` with `test_` prefix matching the module they test
+ - RAG knowledge goes in `rag_docs/` as markdown files
+
+ ## Commit Messages
+
+ Format: `<type>(<scope>): <description>`
+
+ Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`
+
+ Examples:
+ - `feat(ablation): Add multi-layer head selection`
+ - `fix(pipeline): Correct embedding stage token count`
+ - `refactor(utils): Remove unused activation caching code`
+
+ ## Error Handling
+
+ - Use built-in exception classes
+ - No bare `except:` clauses
+ - User-facing errors should be clear and non-technical
+
+ ## Dependencies
+
+ - Avoid adding new dependencies unless strictly needed
+ - Document any new dependency in requirements.txt with minimum version
+
+ ## Dash-Specific
+
+ - Callbacks must remain responsive — avoid heavy synchronous work without feedback indicators
+ - Use `dcc.Store` for session state; no server-side persistence
+
+ ## Quality Gates
+
+ Before marking any task complete:
+ - All tests pass
+ - Code coverage >80% for new code
+ - Follows style guide
+ - Public functions/methods have docstrings
+ - Type hints on public APIs
+ - No linting errors
+ - No security vulnerabilities (no hardcoded secrets, input validation present)
.context/modules/education.md ADDED
@@ -0,0 +1,40 @@
+ # Educational Philosophy
+
+ ## Audience
+
+ Primary: ML students and AI enthusiasts — people who are curious but may not have a full college-level CS/math background.
+
+ Secondary: Educators looking for interactive tools to teach transformer concepts.
+
+ ## Core Principles
+
+ ### Conceptual Understanding Over Mathematical Rigor
+ It is acceptable to skip complex math (e.g., full derivations of scaled dot-product attention) as long as the motivation and intuition are clearly communicated. The goal is "I understand what this does and why" not "I can derive this from scratch."
+
+ ### Action-Oriented Learning
+ Every architectural explanation should be paired with an interactive element. Don't just tell — let users poke at things and see what happens.
+
+ ### Progressive Disclosure
+ - **Surface level**: Clean interface, minimal jargon, tooltips for technical terms
+ - **Mid level**: In-situ descriptions followed by interactive examples
+ - **Deep level**: Glossary entries, video links, chatbot for open-ended questions
+
+ ### Framing
+ - Speak to curiosity: "What happens if...?" and "How does this work?"
+ - Tone is enthusiastic and accessible but concise — no walls of text
+ - Frame experiments as hypothesis testing: "What if I disable this head?"
+
+ ## Sanity Check Rule
+
+ Before shipping any educational content or visualization change, ask:
+ 1. Does this actually help someone understand transformers better?
+ 2. Is this accurate enough to build correct intuition (even if simplified)?
+ 3. Am I going down a rabbit hole, or is this genuinely useful?
+
+ Tunnel vision leads to bad outputs. Step back regularly.
+
+ ## Visual Consistency
+
+ - Attention head indices, layer numbers, and token highlights must be consistent across all panels
+ - Use consistent color language for different components (attention vs MLP)
+ - Support both light and dark modes with high contrast for data visualizations
.context/modules/product.md ADDED
@@ -0,0 +1,48 @@
+ # Product Definition
+
+ ## Vision
+
+ Demystify the inner workings of Transformer-based LLMs for students and curious individuals. Combine interactive visualizations with hands-on experimentation to transform abstract architectural concepts into tangible, observable phenomena.
+
+ ## Core Value Proposition
+
+ - **Visual Learning**: Translate complex matrix operations and data flows into clear, interactive representations (attention maps, logit lens)
+ - **Interactive Experimentation**: Go beyond observation — let users manipulate the model (ablation, activation patching) and immediately see consequences
+ - **Educational Scaffolding**: Support varying expertise levels with layered content, from tooltips to deep-dive glossaries to AI-guided chat
+
+ ## Key Features
+
+ - **Sequential Data Flow Visualization**: Step-by-step data transformation through model layers
+ - **Component Breakdown**: Detailed inspection views for self-attention (heads, weights) and MLPs
+ - **Interactive Experiments**:
+   - Ablation studies: selectively disable heads/layers to observe output impact
+   - Activation steering: modify activation values in real-time
+   - Prompt comparison: compare internal activations from different inputs side-by-side
+ - **Integrated Education**:
+   - Contextual tooltips for immediate clarity
+   - Glossary panel with in-depth definitions and video links
+   - AI chatbot with RAG-powered knowledge base (30 docs covering transformer concepts, usage, experiments, troubleshooting, interpretability)
+   - Step-by-step guided experiments for beginners
+
+ ## Brand & Voice
+
+ - **Tone**: Enthusiastic and accessible yet concise. Encouraging to learners while remaining direct and functional.
+ - **Framing**: Speak to curiosity — "How does this work?" and "What happens if...?"
+ - Avoid excessive jargon or long analogies. Prioritize clarity.
+
+ ## Visual Identity
+
+ - **Aesthetic**: Clean & modern. High whitespace, legible typography, clear visual hierarchy.
+ - **Modes**: Light and dark, both with high contrast for data visualizations.
+ - **Color Palette**: Consistent color language for different model components (e.g., specific colors for attention vs MLP) to aid mental mapping.
+
+ ## UI Patterns
+
+ - **Progressive Disclosure**: Tooltips for brief context, in-situ descriptions paired with interactive examples, glossary/chatbot for depth.
+ - **Sandbox Explorer**: Comprehensive control panel for free-form exploration (toggles, sliders, ablation switches).
+ - **Comparison View**: Integrated into the sandbox so users see modification impact relative to original state.
+ - **On-Demand Depth**: Keep the primary interface simple with clear paths to dive deeper.
+
+ ## User Experience
+
+ The interface centers on exploration and clarity. Users start by selecting a model and inputting text. The dashboard unfolds the model's processing pipeline, letting users zoom into specific components. Experimentation modes are clearly distinguished: hypothesize ("What if I turn off this head?") and test. Educational resources are omnipresent but non-intrusive — available on-demand.
.context/modules/testing.md ADDED
@@ -0,0 +1,59 @@
+ # Testing
+
+ ## Approach
+
+ Test-Driven Development (TDD) for all backend/utils logic:
+ 1. Write failing tests that define expected behavior
+ 2. Implement minimum code to pass
+ 3. Refactor with confidence
+
+ ## What to Test
+
+ - All `utils/` modules — these contain the core logic
+ - Model configuration and pattern matching
+ - Ablation metrics and scoring
+ - Head detection and categorization
+ - Beam search behavior
+ - Token attribution computations
+ - OpenRouter client (mock external calls)
+
+ ## What NOT to Test
+
+ - UI/frontend layout changes (components/ files)
+ - Trivial additions and documentation
+ - CSS and JavaScript assets
+
+ ## Framework & Conventions
+
+ - **Framework**: pytest
+ - **Location**: `tests/` directory
+ - **Naming**: `test_<module_name>.py` matching the module in `utils/`
+ - **Fixtures**: Defined in `conftest.py` for shared test state
+ - **Mocking**: Mock external dependencies (API calls, model loading when appropriate)
+ - **Coverage target**: >80% for new code
+
+ ## Running Tests
+
+ ```bash
+ pytest                                # Run all tests
+ pytest tests/test_<name>.py           # Run specific test file
+ pytest --cov=utils --cov-report=html  # With coverage
+ ```
+
+ ## When Tests Are Required
+
+ - New utility functions or modules
+ - Bug fixes (write a test that reproduces the bug first)
+ - Changes to computation logic
+ - Refactors that touch testable behavior
+
+ ## When Tests Are Optional
+
+ - Pure UI/layout changes
+ - Documentation updates
+ - Configuration changes
+
+ ## After Every Change
+
+ - Run `pytest` to verify all tests still pass
+ - If tests fail, iterate on debugging until fixed — don't move on with broken tests
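The "mock external calls" convention might look like this in practice. Module and function names here are illustrative, not the real `openrouter_client` API; stdlib `unittest.mock` keeps the sketch runnable with or without pytest:

```python
from unittest.mock import Mock


def answer_question(client, question: str) -> str:
    """Toy stand-in for chatbot logic that delegates to an API client."""
    if not question.strip():
        return "Please ask a question."
    return client.chat(question)


def test_answer_question_delegates_to_client():
    # Mock the external call instead of hitting the OpenRouter API.
    client = Mock()
    client.chat.return_value = "Attention weighs token interactions."
    assert answer_question(client, "What is attention?").startswith("Attention")
    client.chat.assert_called_once_with("What is attention?")


def test_empty_question_short_circuits():
    client = Mock()
    assert answer_question(client, "   ") == "Please ask a question."
    client.chat.assert_not_called()


test_answer_question_delegates_to_client()
test_empty_question_short_circuits()
```

Under pytest the two `test_` functions would be collected automatically; the direct calls at the bottom just make the sketch self-verifying.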
.cursor/rules/AGENTS.md ADDED
@@ -0,0 +1,68 @@
+ ---
+ description:
+ globs:
+   - "**/*.py"
+   - "**/*.md"
+   - "app.py"
+   - "components/**/*.py"
+   - "utils/**/*.py"
+ alwaysApply: true
+ ---
+
+ # Transformer Explanation Dashboard
+ Interactive Dash app for exploring LLM internals through visualization and experimentation. Built with Python, Dash, PyTorch, HF Transformers, Bertviz, pyvene.
+
+ ## Critical Rules
+ - **Dead code cleanup is mandatory during refactors.** Sweep for orphaned code from deprecated/deleted components every time.
+ - **Conceptual understanding over math rigor.** Skip complex derivations; focus on motivation and intuition that builds correct mental models.
+ - **Sanity check every change:** (1) Does this help someone learn about transformers? (2) Is the simplification accurate enough? (3) Am I in a rabbit hole?
+ - **TDD for backend logic.** Write failing tests first for anything in `utils/`. Skip tests for UI-only changes. Run `pytest` after every change; iterate until green.
+ - **Don't run the app.** Describe manual verification steps; the user will test themselves.
+ - **Surgical edits over rewrites.** Reuse existing files. Only create new modules when existing ones can't be extended.
+ - **No new dependencies** unless strictly necessary.
+ - **Push back on bad ideas.** Think through problems fully. Don't be a yes-man — challenge flawed approaches.
+ - **Components = layout only** (no ML logic). **Utils = computation only** (no Dash imports). **app.py = glue.**
+ - **Dash callbacks must stay responsive.** No heavy sync work in callbacks without feedback indicators.
+ - Don't reformat unrelated code or alter indentation styles.
+ - Check for zombie processes before debugging server errors.
+
+ ## Module Map
+ | Module | Path | Load when |
+ |--------|------|-----------|
+ | Product | `.context/modules/product.md` | Understanding vision, features, brand voice, visual identity, UX patterns |
+ | Architecture | `.context/modules/architecture.md` | Understanding system structure, data flow, or component boundaries |
+ | Conventions | `.context/modules/conventions.md` | Writing or reviewing code style, naming, commits, dead code cleanup |
+ | Education | `.context/modules/education.md` | Creating or editing educational content, visualizations, explanations |
+ | Testing | `.context/modules/testing.md` | Writing tests, running pytest, TDD workflow |
+
+ ## Data Files
+ | File | Path | Purpose |
+ |------|------|---------|
+ | Sessions | `.context/data/sessions.md` | Running work log (append-only) |
+ | Decisions | `.context/data/decisions.md` | Decision records with reasoning (append-only) |
+ | Lessons | `.context/data/lessons.md` | Hard-won knowledge and past mistakes (append-only) |
+
+ ## Memory Maintenance
+
+ Always look for opportunities to update the memory system:
+ - **New patterns**: "We've been doing X consistently — should I add it to conventions?"
+ - **Decisions made**: "We decided Y — should I record this in decisions.md?"
+ - **Mistakes caught**: "This went wrong because Z — should I add it to lessons.md?"
+ - **Scope changes**: "The project now includes W — should I create a new module?"
+
+ **Before any memory update**:
+ 1. State which file(s) would change and what the change would be
+ 2. Wait for approval
+ 3. Never update memory mid-task without mentioning it
+
+ **Rules**:
+ - Data files are append-only — add entries, never remove or overwrite past entries
+ - Modules can be edited but changes should be targeted, not full rewrites
+ - After substantive work sessions, append a summary to `.context/data/sessions.md`
+
+ ## Preferences
+ - Don't ask permission for changes that fall within an approved plan — just execute
+ - Commit normal changes to main; feature branches for major components/refactors. Never merge branches.
+ - Keep `todo.md` and `plans.md` current before/after changes. Tasks should be atomic.
+ - When in doubt, research options and make a minimal reasonable choice, noting it in `todo.md`
+ - Explain manual tests clearly — what to look for, expected behavior, where to check
.cursor/rules/minimal_changes.mdc DELETED
@@ -1,56 +0,0 @@
- ---
- description:
- globs:
-   - "**/*.py"
-   - "**/*.md"
-   - "app.py"
-   - "components/**/*.py"
-   - "utils/**/*.py"
- alwaysApply: true
- ---
-
- # Minimal Change Rules
-
- - Testing & verification:
-   - For substantial code changes (new files, new functionality), write tests first in `tests/` that describe expected behavior.
-   - Skip tests for UI/frontend changes, trivial additions, and documentation.
-   - After implementing changes, run `pytest` to verify all tests pass.
-   - If tests fail, iterate on debugging until fixed.
-
- - Plan first:
-   - Update `todo.md` with the smallest next actions tied to `plans.md`.
-   - Keep tasks atomic and check them off as you go.
-   - Use the `conductor` folder to learn about the project. Maintain this folder after every change to the code in order to keep running memory (only make changes if necessary).
-
- - Keep edits minimal:
-   - Prefer small, surgical changes over refactors.
-   - Reuse existing files: `app.py`, `components/sidebar.py`, `components/main_panel.py`, `utils/*`.
-   - Remove or simplify clearly unnecessary code only when it reduces complexity.
-
- - Comments & style:
-   - Add concise comments explaining intent where the change is non-obvious.
-   - Do not reformat unrelated code or alter indentation styles.
-
- - Git workflow:
-   - After each coherent set of changes:
-     - git commit -am "[short, concise, and helpful message about what was done]"
-   - Between features:
-     - git push
-     - git checkout -b feature/<short-name>
-   - Never merge branches.
-
- - Ongoing planning:
-   - Keep `todo.md` current before/after each change.
-   - Update `plans.md` if scope/ideas evolve.
-
- - Research as needed:
-   - If details are unclear (e.g., detection thresholds), research your options and make a minimal reasonable choice and note it in `todo.md`.
-
- # Guardrails
-
- - Avoid adding new dependencies unless strictly needed.
- - Avoid creating new modules/components unless existing ones cannot be cleanly extended.
- - Ensure Dash callbacks remain responsive; avoid heavy sync work in callbacks without feedback indicators.
-
- # Debugging
- - Sometimes a zombie process can cause errors. Check for zombie processes and kill them if necessary.
README.md CHANGED
@@ -80,11 +80,4 @@ The project is structured around a central Dash application with modular compone
  * `head_detection.py`: Attention head categorization logic.
  * `beam_search.py`: Beam search implementation.
  * `tests/`: Comprehensive test suite ensuring stability.
- * `conductor/`: Detailed project documentation and product guidelines.
-
- ## Documentation
-
- Additional project documentation is available in the `conductor/` directory:
- * [Product Definition](conductor/product.md)
- * [Tech Stack](conductor/tech-stack.md)
- * [Workflow](conductor/workflow.md)
+ * `.context/`: Project memory modules (architecture, conventions, education, product, testing) and data files (sessions, decisions, lessons).
conductor/code_styleguides/python.md DELETED
@@ -1,37 +0,0 @@
- # Google Python Style Guide Summary
-
- This document summarizes key rules and best practices from the Google Python Style Guide.
-
- ## 1. Python Language Rules
- - **Linting:** Run `pylint` on your code to catch bugs and style issues.
- - **Imports:** Use `import x` for packages/modules. Use `from x import y` only when `y` is a submodule.
- - **Exceptions:** Use built-in exception classes. Do not use bare `except:` clauses.
- - **Global State:** Avoid mutable global state. Module-level constants are okay and should be `ALL_CAPS_WITH_UNDERSCORES`.
- - **Comprehensions:** Use for simple cases. Avoid for complex logic where a full loop is more readable.
- - **Default Argument Values:** Do not use mutable objects (like `[]` or `{}`) as default values.
- - **True/False Evaluations:** Use implicit false (e.g., `if not my_list:`). Use `if foo is None:` to check for `None`.
- - **Type Annotations:** Strongly encouraged for all public APIs.
-
- ## 2. Python Style Rules
- - **Line Length:** Maximum 80 characters.
- - **Indentation:** 4 spaces per indentation level. Never use tabs.
- - **Blank Lines:** Two blank lines between top-level definitions (classes, functions). One blank line between method definitions.
- - **Whitespace:** Avoid extraneous whitespace. Surround binary operators with single spaces.
- - **Docstrings:** Use `"""triple double quotes"""`. Every public module, function, class, and method must have a docstring.
-   - **Format:** Start with a one-line summary. Include `Args:`, `Returns:`, and `Raises:` sections.
- - **Strings:** Use f-strings for formatting. Be consistent with single (`'`) or double (`"`) quotes.
- - **`TODO` Comments:** Use `TODO(username): Fix this.` format.
- - **Imports Formatting:** Imports should be on separate lines and grouped: standard library, third-party, and your own application's imports.
-
- ## 3. Naming
- - **General:** `snake_case` for modules, functions, methods, and variables.
- - **Classes:** `PascalCase`.
- - **Constants:** `ALL_CAPS_WITH_UNDERSCORES`.
- - **Internal Use:** Use a single leading underscore (`_internal_variable`) for internal module/class members.
-
- ## 4. Main
- - All executable files should have a `main()` function that contains the main logic, called from a `if __name__ == '__main__':` block.
-
- **BE CONSISTENT.** When editing code, match the existing style.
-
- *Source: [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html)*
conductor/index.md DELETED
@@ -1,14 +0,0 @@
- # Project Context
-
- ## Definition
- - [Product Definition](./product.md)
- - [Product Guidelines](./product-guidelines.md)
- - [Tech Stack](./tech-stack.md)
-
- ## Workflow
- - [Workflow](./workflow.md)
- - [Code Style Guides](./code_styleguides/)
-
- ## Management
- - [Tracks Registry](./tracks.md)
- - [Tracks Directory](./tracks/)
conductor/product-guidelines.md DELETED
@@ -1,23 +0,0 @@
- # Product Guidelines
-
- ## Brand & Voice
- - **Tone:** Enthusiastic and Accessible yet Concise. The voice should be encouraging to learners while remaining direct and functional. Avoid excessive jargon or overly long analogies; prioritize clarity and "get-to-the-point" descriptions.
- - **Audience Engagement:** Speak directly to the user's curiosity. Frame technical explanations as answers to "How does this work?" or "What happens if...?"
-
- ## Visual Identity
- - **Aesthetic:** Clean & Modern. Prioritize high whitespace, legible typography, and a clear visual hierarchy (inspired by Material Design or Notion).
- - **Mode:** Support both Light and Dark modes, ensuring high contrast for data visualizations.
- - **Color Palette:** Use a consistent color language for different model components (e.g., specific colors for Attention vs. MLP layers) to aid mental mapping.
-
- ## User Interface & Experience
- - **Terminology & Disclosure:** Use a combination of Progressive Disclosure and In-Situ Definitions.
-   - **Tooltips:** Use tooltips for most technical terms to provide immediate, brief context without cluttering the UI.
-   - **In-Situ Descriptions:** Provide short, clear descriptions immediately followed by the relevant interactive example to solidify the concept through action.
- - **Experimentation Layout:** Sandbox Explorer.
-   - Provide a comprehensive control panel for free-form exploration (toggles, sliders, ablation switches).
-   - **Comparison View:** Integrate comparison elements into the sandbox so users can see the impact of their modifications relative to the original state.
-
- ## Design Principles
- - **Action-Oriented Learning:** Every architectural explanation should be paired with an interactive element.
- - **Visual Consistency:** Ensure that attention head indices, layer numbers, and token highlights are consistent across all panels and visualization types.
- - **On-Demand Depth:** Keep the primary interface simple, but provide clear paths (like the Glossary or tooltips) for users who want to dive deeper into the technicalities.
conductor/product.md DELETED
@@ -1,32 +0,0 @@
- # Product Definition
-
- ## Initial Concept
- A tool for capturing activations from transformer models and visualizing attention patterns using bertviz and an interactive Dash web application.
-
- ## Vision
- To demystify the inner workings of Transformer-based Large Language Models (LLMs) for students and curious individuals. By combining interactive visualizations with hands-on experimentation capabilities, the tool transforms abstract architectural concepts into tangible, observable phenomena, fostering a deep, intuitive understanding of how these powerful models process information.
-
- ## Target Audience
- - **Primary:** Machine Learning Students and AI enthusiasts.
- - **Secondary:** Any individual seeking a practical, interactive way to learn about Transformer architectures and mechanistic interpretability.
-
- ## Core Value Proposition
- - **Visual Learning:** Translates complex matrix operations and data flows into clear, interactive visual representations (Attention Maps, Logit Lens).
- - **Interactive Experimentation:** Goes beyond static observation by allowing users to manipulate the model (Ablation, Activation Patching) and immediately see the consequences.
- - **Educational Scaffolding:** Supports users of varying expertise levels with layered educational content, from simple tooltips to deep-dive glossaries and future AI-guided tutorials.
-
- ## Key Features
- - **Sequential Data Flow Visualization:** Illustrates how data transforms step-by-step through the model's layers.
- - **Component Breakdown:** Detailed inspection views for key components like Self-Attention (heads, weights) and MLPs.
- - **Interactive Experiments:**
- - **Ablation Studies:** selectively disable heads or layers to observe their impact on output.
- - **Activation Steering:** modify activation values in real-time.
- - **Prompt Comparison:** compare internal activations resulting from two different input prompts side-by-side.
- - **Integrated Education:**
- - Contextual tooltips for immediate clarity.
- - Dedicated "Glossary" panel for in-depth definitions.
- - AI chatbot with RAG-powered knowledge base (30 documents covering transformer concepts, dashboard usage, guided experiments, result interpretation, troubleshooting, and mechanistic interpretability research).
- - Step-by-step guided experiments that walk beginners through the dashboard's features.
-
- ## User Experience
- The interface centers on exploration and clarity. Users start by selecting a model and inputting text. The dashboard then unfolds the model's processing pipeline, allowing users to "zoom in" on specific components. Experimentation modes are clearly distinguished, enabling users to hypothesize ("What if I turn off this head?") and test. Educational resources are omnipresent but non-intrusive, available on-demand to explain the *what* and *why* of what is being visualized.
conductor/setup_state.json DELETED
@@ -1 +0,0 @@
- {"last_successful_step": "3.3_initial_track_generated"}
 
 
conductor/tech-stack.md DELETED
@@ -1,15 +0,0 @@
- # Tech Stack
-
- ## Core Technologies
- - **Programming Language:** Python
- - **Deep Learning Framework:** PyTorch & Hugging Face Transformers
-
- ## Frontend & Visualization
- - **Web Framework:** Plotly Dash
- - **Data Visualization:** Plotly
- - **Attention Visualization:** Bertviz
- - **Styling:** Custom CSS (Bootstrap-compatible)
-
- ## Interpretability & Research Tools
- - **Activation Capture:** PyVene
- - **Model Analysis:** Custom utilities for ablation and logit lens analysis
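[Editor's note] The two analysis utilities named above can be sketched in miniature. The toy NumPy example below is purely illustrative — the shapes, seed, and function names are invented here, and the project's real utilities operate on live PyTorch activations captured via PyVene:

```python
import numpy as np

# Toy dimensions, invented for illustration only.
rng = np.random.default_rng(0)
d_model, vocab, n_heads, d_head = 8, 12, 2, 4

W_U = rng.normal(size=(d_model, vocab))   # stand-in for the unembedding matrix
resid = rng.normal(size=(3, d_model))     # residual stream for 3 tokens

def logit_lens(hidden, W_U):
    """Project an intermediate hidden state directly onto vocabulary logits."""
    return hidden @ W_U

def ablate_head(head_out, head_idx):
    """Zero out one head's slice of a concatenated attention output."""
    out = head_out.copy()
    out[:, head_idx * d_head:(head_idx + 1) * d_head] = 0.0
    return out

logits = logit_lens(resid, W_U)                              # one row of logits per token
ablated = ablate_head(rng.normal(size=(3, n_heads * d_head)), 1)
print(logits.shape, ablated.shape)  # (3, 12) (3, 8)
```

The same two operations — reading intermediate states through the unembedding, and zeroing a component's contribution — are what the dashboard's logit-lens and ablation views expose interactively.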
conductor/tracks.md DELETED
@@ -1,3 +0,0 @@
- # Project Tracks
-
- This file tracks all major tracks for the project. Each track has its own detailed plan in its respective folder.
conductor/workflow.md DELETED
@@ -1,333 +0,0 @@
- # Project Workflow
-
- ## Guiding Principles
-
- 1. **The Plan is the Source of Truth:** All work must be tracked in `plan.md`
- 2. **The Tech Stack is Deliberate:** Changes to the tech stack must be documented in `tech-stack.md` *before* implementation
- 3. **Test-Driven Development:** Write unit tests before implementing functionality
- 4. **High Code Coverage:** Aim for >80% code coverage for all modules
- 5. **User Experience First:** Every decision should prioritize user experience
- 6. **Non-Interactive & CI-Aware:** Prefer non-interactive commands. Use `CI=true` for watch-mode tools (tests, linters) to ensure single execution.
-
- ## Task Workflow
-
- All tasks follow a strict lifecycle:
-
- ### Standard Task Workflow
-
- 1. **Select Task:** Choose the next available task from `plan.md` in sequential order
-
- 2. **Mark In Progress:** Before beginning work, edit `plan.md` and change the task from `[ ]` to `[~]`
-
- 3. **Write Failing Tests (Red Phase):**
- - Create a new test file for the feature or bug fix.
- - Write one or more unit tests that clearly define the expected behavior and acceptance criteria for the task.
- - **CRITICAL:** Run the tests and confirm that they fail as expected. This is the "Red" phase of TDD. Do not proceed until you have failing tests.
-
- 4. **Implement to Pass Tests (Green Phase):**
- - Write the minimum amount of application code necessary to make the failing tests pass.
- - Run the test suite again and confirm that all tests now pass. This is the "Green" phase.
-
- 5. **Refactor (Optional but Recommended):**
- - With the safety of passing tests, refactor the implementation code and the test code to improve clarity, remove duplication, and enhance performance without changing the external behavior.
- - Rerun tests to ensure they still pass after refactoring.
-
- 6. **Verify Coverage:** Run coverage reports using the project's chosen tools. For example, in a Python project, this might look like:
- ```bash
- pytest --cov=app --cov-report=html
- ```
- Target: >80% coverage for new code. The specific tools and commands will vary by language and framework.
-
- 7. **Document Deviations:** If implementation differs from tech stack:
- - **STOP** implementation
- - Update `tech-stack.md` with new design
- - Add dated note explaining the change
- - Resume implementation
-
- 8. **Commit Code Changes:**
- - Stage all code changes related to the task.
- - Propose a clear, concise commit message, e.g., `feat(ui): Create basic HTML structure for calculator`.
- - Perform the commit.
-
- 9. **Attach Task Summary with Git Notes:**
- - **Step 9.1: Get Commit Hash:** Obtain the hash of the *just-completed commit* (`git log -1 --format="%H"`).
- - **Step 9.2: Draft Note Content:** Create a detailed summary for the completed task. This should include the task name, a summary of changes, a list of all created/modified files, and the core "why" for the change.
- - **Step 9.3: Attach Note:** Use the `git notes` command to attach the summary to the commit.
- ```bash
- # The note content from the previous step is passed via the -m flag.
- git notes add -m "<note content>" <commit_hash>
- ```
-
- 10. **Get and Record Task Commit SHA:**
- - **Step 10.1: Update Plan:** Read `plan.md`, find the line for the completed task, update its status from `[~]` to `[x]`, and append the first 7 characters of the *just-completed* commit's hash.
- - **Step 10.2: Write Plan:** Write the updated content back to `plan.md`.
-
- 11. **Commit Plan Update:**
- - **Action:** Stage the modified `plan.md` file.
- - **Action:** Commit this change with a descriptive message (e.g., `conductor(plan): Mark task 'Create user model' as complete`).
-
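[Editor's note] The plan edit in step 10 is mechanical enough to sketch. The helper below is a hypothetical illustration — the function name and checkbox format are assumed from the conventions above, and the SHA would come from `git log -1 --format="%h"`:

```python
def mark_done(plan_text: str, task_name: str, sha: str) -> str:
    """Flip a task from in-progress ([~]) to done ([x]) and append the
    first 7 characters of its commit SHA."""
    old = f"[~] {task_name}"
    new = f"[x] {task_name} ({sha[:7]})"
    return plan_text.replace(old, new)

plan_line = "- [~] Create user model"
print(mark_done(plan_line, "Create user model", "a1b2c3d4e5f6"))
# - [x] Create user model (a1b2c3d)
```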
- ### Phase Completion Verification and Checkpointing Protocol
-
- **Trigger:** This protocol is executed immediately after a task is completed that also concludes a phase in `plan.md`.
-
- 1. **Announce Protocol Start:** Inform the user that the phase is complete and the verification and checkpointing protocol has begun.
-
- 2. **Ensure Test Coverage for Phase Changes:**
- - **Step 2.1: Determine Phase Scope:** To identify the files changed in this phase, you must first find the starting point. Read `plan.md` to find the Git commit SHA of the *previous* phase's checkpoint. If no previous checkpoint exists, the scope is all changes since the first commit.
- - **Step 2.2: List Changed Files:** Execute `git diff --name-only <previous_checkpoint_sha> HEAD` to get a precise list of all files modified during this phase.
- - **Step 2.3: Verify and Create Tests:** For each file in the list:
- - **CRITICAL:** First, check its extension. Exclude non-code files (e.g., `.json`, `.md`, `.yaml`).
- - For each remaining code file, verify a corresponding test file exists.
- - If a test file is missing, you **must** create one. Before writing the test, **first, analyze other test files in the repository to determine the correct naming convention and testing style.** The new tests **must** validate the functionality described in this phase's tasks (`plan.md`).
-
- 3. **Execute Automated Tests with Proactive Debugging:**
- - Before execution, you **must** announce the exact shell command you will use to run the tests.
- - **Example Announcement:** "I will now run the automated test suite to verify the phase. **Command:** `CI=true npm test`"
- - Execute the announced command.
- - If tests fail, you **must** inform the user and begin debugging. You may attempt to propose a fix a **maximum of two times**. If the tests still fail after your second proposed fix, you **must stop**, report the persistent failure, and ask the user for guidance.
-
- 4. **Propose a Detailed, Actionable Manual Verification Plan:**
- - **CRITICAL:** To generate the plan, first analyze `product.md`, `product-guidelines.md`, and `plan.md` to determine the user-facing goals of the completed phase.
- - You **must** generate a step-by-step plan that walks the user through the verification process, including any necessary commands and specific, expected outcomes.
- - The plan you present to the user **must** follow this format:
-
- **For a Frontend Change:**
- ```
- The automated tests have passed. For manual verification, please follow these steps:
-
- **Manual Verification Steps:**
- 1. **Start the development server with the command:** `npm run dev`
- 2. **Open your browser to:** `http://localhost:3000`
- 3. **Confirm that you see:** The new user profile page, with the user's name and email displayed correctly.
- ```
-
- **For a Backend Change:**
- ```
- The automated tests have passed. For manual verification, please follow these steps:
-
- **Manual Verification Steps:**
- 1. **Ensure the server is running.**
- 2. **Execute the following command in your terminal:** `curl -X POST http://localhost:8080/api/v1/users -d '{"name": "test"}'`
- 3. **Confirm that you receive:** A JSON response with a status of `201 Created`.
- ```
-
- 5. **Await Explicit User Feedback:**
- - After presenting the detailed plan, ask the user for confirmation: "**Does this meet your expectations? Please confirm with yes or provide feedback on what needs to be changed.**"
- - **PAUSE** and await the user's response. Do not proceed without an explicit yes or confirmation.
-
- 6. **Create Checkpoint Commit:**
- - Stage all changes. If no changes occurred in this step, proceed with an empty commit.
- - Perform the commit with a clear and concise message (e.g., `conductor(checkpoint): Checkpoint end of Phase X`).
-
- 7. **Attach Auditable Verification Report using Git Notes:**
- - **Step 7.1: Draft Note Content:** Create a detailed verification report including the automated test command, the manual verification steps, and the user's confirmation.
- - **Step 7.2: Attach Note:** Use the `git notes` command and the full commit hash from the previous step to attach the full report to the checkpoint commit.
-
- 8. **Get and Record Phase Checkpoint SHA:**
- - **Step 8.1: Get Commit Hash:** Obtain the hash of the *just-created checkpoint commit* (`git log -1 --format="%H"`).
- - **Step 8.2: Update Plan:** Read `plan.md`, find the heading for the completed phase, and append the first 7 characters of the commit hash in the format `[checkpoint: <sha>]`.
- - **Step 8.3: Write Plan:** Write the updated content back to `plan.md`.
-
- 9. **Commit Plan Update:**
- - **Action:** Stage the modified `plan.md` file.
- - **Action:** Commit this change with a descriptive message following the format `conductor(plan): Mark phase '<PHASE NAME>' as complete`.
-
- 10. **Announce Completion:** Inform the user that the phase is complete and the checkpoint has been created, with the detailed verification report attached as a git note.
-
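[Editor's note] Step 2 of the protocol (filter changed files, then check for missing tests) can be sketched as follows. The conventions here are hypothetical — Python code files with tests named `tests/test_<module>.py` — and would be adapted to the repository's actual layout:

```python
from pathlib import Path

def files_missing_tests(changed_files, existing_tests):
    """Return changed code files that lack a corresponding test file."""
    missing = []
    for name in changed_files:
        path = Path(name)
        if path.suffix != ".py":
            continue  # exclude non-code files (.json, .md, .yaml, ...)
        if f"tests/test_{path.stem}.py" not in existing_tests:
            missing.append(name)
    return missing

changed = ["app/model.py", "README.md", "app/views.py", "config.yaml"]
print(files_missing_tests(changed, ["tests/test_model.py"]))
# ['app/views.py']
```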
- ### Quality Gates
-
- Before marking any task complete, verify:
-
- - [ ] All tests pass
- - [ ] Code coverage meets requirements (>80%)
- - [ ] Code follows project's code style guidelines (as defined in `code_styleguides/`)
- - [ ] All public functions/methods are documented (e.g., docstrings, JSDoc, GoDoc)
- - [ ] Type safety is enforced (e.g., type hints, TypeScript types, Go types)
- - [ ] No linting or static analysis errors (using the project's configured tools)
- - [ ] Works correctly on mobile (if applicable)
- - [ ] Documentation updated if needed
- - [ ] No security vulnerabilities introduced
-
- ## Development Commands
-
- **AI AGENT INSTRUCTION: This section should be adapted to the project's specific language, framework, and build tools.**
-
- ### Setup
- ```bash
- # Example: Commands to set up the development environment (e.g., install dependencies, configure database)
- # e.g., for a Node.js project: npm install
- # e.g., for a Go project: go mod tidy
- ```
-
- ### Daily Development
- ```bash
- # Example: Commands for common daily tasks (e.g., start dev server, run tests, lint, format)
- # e.g., for a Node.js project: npm run dev, npm test, npm run lint
- # e.g., for a Go project: go run main.go, go test ./..., go fmt ./...
- ```
-
- ### Before Committing
- ```bash
- # Example: Commands to run all pre-commit checks (e.g., format, lint, type check, run tests)
- # e.g., for a Node.js project: npm run check
- # e.g., for a Go project: make check (if a Makefile exists)
- ```
-
- ## Testing Requirements
-
- ### Unit Testing
- - Every module must have corresponding tests.
- - Use appropriate test setup/teardown mechanisms (e.g., fixtures, beforeEach/afterEach).
- - Mock external dependencies.
- - Test both success and failure cases.
-
- ### Integration Testing
- - Test complete user flows
- - Verify database transactions
- - Test authentication and authorization
- - Check form submissions
-
- ### Mobile Testing
- - Test on actual iPhone when possible
- - Use Safari developer tools
- - Test touch interactions
- - Verify responsive layouts
- - Check performance on 3G/4G
-
- ## Code Review Process
-
- ### Self-Review Checklist
- Before requesting review:
-
- 1. **Functionality**
- - Feature works as specified
- - Edge cases handled
- - Error messages are user-friendly
-
- 2. **Code Quality**
- - Follows style guide
- - DRY principle applied
- - Clear variable/function names
- - Appropriate comments
-
- 3. **Testing**
- - Unit tests comprehensive
- - Integration tests pass
- - Coverage adequate (>80%)
-
- 4. **Security**
- - No hardcoded secrets
- - Input validation present
- - SQL injection prevented
- - XSS protection in place
-
- 5. **Performance**
- - Database queries optimized
- - Images optimized
- - Caching implemented where needed
-
- 6. **Mobile Experience**
- - Touch targets adequate (44x44px)
- - Text readable without zooming
- - Performance acceptable on mobile
- - Interactions feel native
-
- ## Commit Guidelines
-
- ### Message Format
- ```
- <type>(<scope>): <description>
-
- [optional body]
-
- [optional footer]
- ```
-
- ### Types
- - `feat`: New feature
- - `fix`: Bug fix
- - `docs`: Documentation only
- - `style`: Formatting, missing semicolons, etc.
- - `refactor`: Code change that neither fixes a bug nor adds a feature
- - `test`: Adding missing tests
- - `chore`: Maintenance tasks
-
- ### Examples
- ```bash
- git commit -m "feat(auth): Add remember me functionality"
- git commit -m "fix(posts): Correct excerpt generation for short posts"
- git commit -m "test(comments): Add tests for emoji reaction limits"
- git commit -m "style(mobile): Improve button touch targets"
- ```
-
- ## Definition of Done
-
- A task is complete when:
-
- 1. All code implemented to specification
- 2. Unit tests written and passing
- 3. Code coverage meets project requirements
- 4. Documentation complete (if applicable)
- 5. Code passes all configured linting and static analysis checks
- 6. Works beautifully on mobile (if applicable)
- 7. Implementation notes added to `plan.md`
- 8. Changes committed with proper message
- 9. Git note with task summary attached to the commit
-
- ## Emergency Procedures
-
- ### Critical Bug in Production
- 1. Create hotfix branch from main
- 2. Write failing test for bug
- 3. Implement minimal fix
- 4. Test thoroughly including mobile
- 5. Deploy immediately
- 6. Document in `plan.md`
-
- ### Data Loss
- 1. Stop all write operations
- 2. Restore from latest backup
- 3. Verify data integrity
- 4. Document incident
- 5. Update backup procedures
-
- ### Security Breach
- 1. Rotate all secrets immediately
- 2. Review access logs
- 3. Patch vulnerability
- 4. Notify affected users (if any)
- 5. Document and update security procedures
-
- ## Deployment Workflow
-
- ### Pre-Deployment Checklist
- - [ ] All tests passing
- - [ ] Coverage >80%
- - [ ] No linting errors
- - [ ] Mobile testing complete
- - [ ] Environment variables configured
- - [ ] Database migrations ready
- - [ ] Backup created
-
- ### Deployment Steps
- 1. Merge feature branch to main
- 2. Tag release with version
- 3. Push to deployment service
- 4. Run database migrations
- 5. Verify deployment
- 6. Test critical paths
- 7. Monitor for errors
-
- ### Post-Deployment
- 1. Monitor analytics
- 2. Check error logs
- 3. Gather user feedback
- 4. Plan next iteration
-
- ## Continuous Improvement
-
- - Review workflow weekly
- - Update based on pain points
- - Document lessons learned
- - Optimize for user happiness
- - Keep things simple and maintainable