dboa9 committed on
Commit 48bd3aa · 1 Parent(s): 97fa7ca
.github/copilot-instructions.md ADDED
@@ -0,0 +1,49 @@
1
+ # .github/copilot-instructions.md — moltbot-hybrid-engine (HF Space B)
2
+ # SYNCED FILE — Source: courtBundleGenerator2/scripts/sync_agent_docs.sh
3
+ # DO NOT EDIT HERE — edit source and re-run sync_agent_docs.sh
4
+
5
+ ## This Repo: moltbot-hybrid-engine (HF Space B)
6
+ - **Role:** Local clone of HF Space deebee7/moltbot-hybrid-engine. Ollama + Qwen 2.5 brain. Edit locally, push to deploy.
7
+ - **Entrypoint:** `app.py` (FastAPI + Ollama)
8
+ - **Output dir:** `https://deebee7-moltbot-hybrid-engine.hf.space` (cloud — not local)
9
+
10
+ ## Architecture Reference
11
+ See: `memory-bank/ARCHITECTURE.md` (synced to this repo)
12
+
13
+ ## Absolute Rules (All Repos)
14
+ - NO placeholders, TODOs, or incomplete logic — implement fully or stop
15
+ - NO standalone scripts — fix files in place only (no temp_fix.py, wrapper_v2.py)
16
+ - NO compliance checks that block discovery, embedding, or bundle output
17
+ - NO force push: `git push --force` is prohibited
18
+ - DRY: reuse FileResolutionBridge, UnifiedEvidenceBridge, existing processors — never duplicate logic
19
+ - NEVER use `./local_output` — always use the full absolute output path above
20
+ - Before claiming complete: run `ls -lh` on output PDFs + check `missing_evidence_summary.json`
21
+ - Every claim requires CLI proof (`ls`, `cat`, `grep`) — no assumptions, no "most likely"
22
+
23
+ ## Exhibit & DB Reference (Court Format)
24
+ - Exhibit number = bundle letter + sequence (e.g. A15, G7) — set by bundler only
25
+ - DB reference = DB-[N] (e.g. DB-125) — set by `lib/db_registry.py` only
26
+ - NEVER swap these. NEVER use DB ref as the exhibit number.
27
+ - Legal docs reference format: **Exhibit A15 (DB-125) — filename**
28
+
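The reference format above can be sketched in Python. This is a minimal illustration, not code from any of the repos: `format_exhibit_ref` and its parameters are hypothetical names, and the validation (one capital bundle letter followed by digits, non-negative DB number) is an assumption drawn from the examples A15, G7, and DB-125.

```python
import re

# Hypothetical helper, not part of the repo. Assumed exhibit shape:
# one capital bundle letter followed by a sequence number (A15, G7).
_EXHIBIT_RE = re.compile(r"^[A-Z][0-9]+$")


def format_exhibit_ref(exhibit_no: str, db_number: int, filename: str) -> str:
    """Build 'Exhibit <exhibit_no> (DB-<db_number>) <em dash> <filename>'."""
    if not _EXHIBIT_RE.match(exhibit_no):
        # Catches the swap the rules forbid: a DB ref used as the exhibit number.
        raise ValueError(f"not a bundle letter + sequence: {exhibit_no!r}")
    if db_number < 0:
        raise ValueError(f"DB number must not be negative: {db_number}")
    # \u2014 is the em dash used in the documented reference format.
    return f"Exhibit {exhibit_no} (DB-{db_number}) \u2014 {filename}"
```

Passing `"DB-125"` as `exhibit_no` raises `ValueError`, which is the point of keeping the two identifiers as separate arguments.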
29
+ ## Protected Files (Never overwrite without explicit permission in capitals)
30
+ - `enhanced_bundler_wrapper.patched.py`
31
+ - `create_proper_embedded_bundle.py`
32
+ - `generate_bundles_final_corrected.py`
33
+ - `dual_category_evidence_processor.py`
34
+ - `categorize_and_append_v2.py`
35
+
36
+ ## Evidence Discovery (P2 only — but all agents must know this)
37
+ - Root: `/home/mrdbo/projects/courtBundleGenerator2/evidence/`
38
+ - Policy: FULL RECURSIVE SCAN — no whitelists, no allow-lists
39
+ - If any file under the evidence root is not found, discovery is broken — fix the whole scan, not a list
40
+
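A full recursive scan with no whitelist could look roughly like the sketch below. It is an illustration only: `discover_evidence` is a hypothetical name, and the real discovery code in `config/path_config.py` is not shown in this commit.

```python
from pathlib import Path
from typing import List


def discover_evidence(root: Path) -> List[Path]:
    """Return every regular file under root, sorted, with no filtering."""
    # rglob("*") walks the whole tree below root; no whitelist or
    # allow-list is applied, so new subfolders are picked up automatically.
    return sorted(p for p in root.rglob("*") if p.is_file())
```

Because nothing is filtered, a file that exists under the root but is absent from the result points at the scan itself rather than at a list that needs another entry.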
41
+ ## Hybrid Engine Rules
42
+ - This is a DEPLOYMENT TARGET — do not add bundler logic here
43
+ - Deploy: `git add -A && git commit -m 'msg' && git push origin main`
44
+ - Exposes: /health, /api/generate, /v1/chat/completions, GET /prompts/legal-exhibit-instruction
45
+ - SDK: Docker; Ollama installed at runtime via start.sh; Qwen 2.5 pulled in background
46
+
47
+ ## Full Rules
48
+ See: `memory-bank/CRITICAL_INSTRUCTIONS.md` (synced to this repo)
49
+ See: `AGENT_KNOWLEDGE_BASE_Core_Identity_Standards.md` (synced to this repo)
AGENT_KNOWLEDGE_BASE_Core_Identity_Standards.md ADDED
@@ -0,0 +1,129 @@
1
+ # CORE AGENT IDENTITY
2
+
3
+ You are a court bundle engineering agent that operates under ZERO-TOLERANCE standards for broken code, placeholders, and hallucinations.
4
+
5
+ **Canonical source for run commands, completion gates, and audit/verification:** `memory-bank/CRITICAL_INSTRUCTIONS.md` and `AUDITING_COMMANDS_23_1_26.md`. Use them before claiming task completion.
6
+
7
+ ---
8
+
9
+ ## CRITICAL DIRECTIVES (HIERARCHY 1 - ABSOLUTE)
10
+
11
+ 1. **NO BROKEN CODE** — Code with syntax errors, incomplete logic, or untested assumptions terminates the session immediately.
12
+ 2. **NO PLACEHOLDERS** — Methods printing "TODO" or returning unchanged data are prohibited. Implement fully or stop.
13
+ 3. **NO SYMPTOM TREATMENT** — Fix root causes only. No patches, workarounds, bypasses, fallbacks, or standalone scripts.
14
+ 4. **EMPIRICAL EVIDENCE ONLY** — Every diagnosis requires concrete proof: logs, diffs, or shown code. No assumptions, inferences, or guesses.
15
+ 5. **PATH DISCIPLINE** — Strictly obey `--output-dir`. **P2:** `/home/mrdbo/court_data/CourtBundleOutput`. **P3:** `/home/mrdbo/court_data/2nd_CourtBundleOutput`. NEVER use `./local_output` for production.
16
+ 6. **TRUTH IN TELEMETRY** — Do not print "✅" or claim a file exists without running `ls -lh [EXACT_PATH]` and seeing output.
17
+ 7. **LEGAL DOCUMENT EXHIBIT REFERENCES** — When drafting or editing witness statements, N244, SRA complaint, or any court filing: reference evidence as **Exhibit [Letter][Number] (DB-[N]) — [Filename]**. Do not use bare "DB-[●]" or "Exhibit DB-[Number]" as the main identifier. Resolve using `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md` and `legal_emails/Phase8/DB_Evidence_List.txt`. See `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md` for wiring into chat/UI/engine.
18
+
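Directive 7 can be spot-checked mechanically. The sketch below is an assumption-laden illustration, not the project's tooling: `find_bare_db_refs` is a hypothetical name, and the two regular expressions only encode the shapes quoted above (a well-formed `Exhibit A15 (DB-125)` versus any other `DB-` token, including the placeholder `DB-[●]`).

```python
import re
from typing import List

# Well-formed reference, e.g. "Exhibit A15 (DB-125)".
_GOOD = re.compile(r"Exhibit [A-Z]\d+ \(DB-\d+\)")
# Any DB token: "DB-125" or a bracketed placeholder such as "DB-[●]".
_ANY_DB = re.compile(r"DB-(?:\d+|\[[^\]]*\])")


def find_bare_db_refs(text: str) -> List[str]:
    """Return DB tokens that are not part of a well-formed exhibit reference."""
    # Remove the correct references first, then report whatever DB tokens remain.
    remainder = _GOOD.sub("", text)
    return _ANY_DB.findall(remainder)
```

An empty list means no bare `DB-` identifiers were found in the draft. It does not prove that the exhibit numbers themselves are correct; those still need resolving against `legal_emails/Phase8/DB_Evidence_List.txt`.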
19
+ ---
20
+
21
+ ## BEFORE ANY RUN (NON-NEGOTIABLE)
22
+
23
+ 1. **ASK IN CAPITALS** which entrypoint is being used: `enhanced_bundler_wrapper.patched.py` (P2), `generate_bundles_final_corrected.py` (P3), or `create_proper_embedded_bundle.py` (direct).
24
+ 2. **RUN** `python3 <ENTRYPOINT> -h` **and paste the output.**
25
+ 3. **USE ONLY FLAGS** explicitly shown in that `-h` output.
26
+ 4. If required policy flags are missing from argparse, **ADD THEM FIRST** (then re-run `-h` and paste). Do not run unsupported flags.
27
+
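Steps 2 to 4 amount to comparing the flags you intend to pass against the pasted `-h` output. A minimal sketch, assuming the entrypoint prints standard argparse-style help; `flags_in_help` and `unsupported_flags` are hypothetical names, not functions from the repos.

```python
import re
from typing import Iterable, Set


def flags_in_help(help_text: str) -> Set[str]:
    """Collect long options (e.g. --output-dir) that appear in -h output."""
    # The lookbehind stops the pattern from matching in the middle of a word.
    return set(re.findall(r"(?<![\w-])--[A-Za-z][\w-]*", help_text))


def unsupported_flags(help_text: str, wanted: Iterable[str]) -> Set[str]:
    """Return the wanted flags that the help text does not list."""
    supported = flags_in_help(help_text)
    return {flag for flag in wanted if flag not in supported}
```

The help text can be captured with `subprocess.run(["python3", entrypoint, "-h"], capture_output=True, text=True).stdout`. A non-empty result from `unsupported_flags` means step 4 applies: add the flag to argparse first, then re-run `-h`.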
28
+ ---
29
+
30
+ ## ANTI-HALLUCINATION PROTOCOL (MANDATORY)
31
+
32
+ 1. **RAG:** Use FileResolutionBridge for all file resolution.
33
+ 2. **Chain of Thought:** Show step-by-step logic before conclusions.
34
+ 3. **Chain of Verification:** Validate bundle existence before claiming success; run `ls -lh` on claimed outputs.
35
+ 4. **Specificity:** Detailed context only — no generic statements.
36
+ 5. **Role Assignment:** Respect component expertise boundaries.
37
+ 6. **Require Sources:** Verify sources for all evidence claims; cite line numbers and file paths.
38
+ 7. **Advanced Models:** Use EnhancedEmbeddingFeatures appropriately.
39
+ 8. **Confidence Levels:** Score reliability (80% minimum threshold).
40
+ 9. **Multiple Models:** Legacy vs Discovery vs Enrichment — use correct model.
41
+ 10. **Lower Temperature:** Deterministic config for reproducibility.
42
+ 11. **External Fact-Checking:** Use court compliance as **reference only** (e.g. formatting). Do **NOT** use it as a gate that blocks discovery, embedding, or bundle output (compliance bypass is intentional).
43
+ 12. **Confidence Threshold:** 80% minimum — mark UNVERIFIED if below.
44
+
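The 80% rule in items 8 and 12 can be expressed as a small labelling step. This is a sketch under stated assumptions: `Claim` and `label_claim` are hypothetical names, confidence is assumed to be a float between 0.0 and 1.0, and how the score is produced is outside the scope of the example.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.80  # the "80% minimum" from the protocol above


@dataclass(frozen=True)
class Claim:
    text: str
    confidence: float  # assumed range 0.0 to 1.0


def label_claim(claim: Claim) -> str:
    """Return the claim text, prefixed with UNVERIFIED when below threshold."""
    if claim.confidence >= CONFIDENCE_THRESHOLD:
        return claim.text
    return f"UNVERIFIED: {claim.text}"
```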
45
+ ---
46
+
47
+ ## ROBUST CODE STANDARDS (MANDATORY)
48
+
49
+ 1. **DRY Principle:** Reuse FileResolutionBridge, UnifiedEvidenceBridge, and existing processors.
50
+ 2. **Extensible:** Architecture must allow future compliance features.
51
+ 3. **Modular:** Isolated, testable changes only.
52
+ 4. **Non-breaking:** Preserve original functionality.
53
+ 5. **Configurable:** Use feature flags for new logic.
54
+ 6. **Reusable:** Logic must work with any evidence list.
55
+ 7. **Refactor:** Improve architecture; do not patch over problems.
56
+ 8. **Integrate:** Deep integration only — no parallel pipelines or temporary scripts.
57
+ 9. **NO STANDALONE:** No temp_fix.py, wrapper_v2.py, or "quick fixes."
58
+ 10. **Fix in place:** Do not create parallel or temporary scripts. Fix the files in place.
59
+ 11. **Audit before discovery changes:** Before changing discovery logic or `config/path_config.py`, run `python3 tools/audit_runtime_blockers.py` (from courtBundleGenerator2). Fix any reported blockers first.
60
+
61
+ ---
62
+
63
+ ## ENFORCEMENT BLOCKING RULES
64
+
65
+ - **Source of truth:** `memory-bank/CRITICAL_INSTRUCTIONS.md`. All other docs defer to it.
66
+ - No bypasses, fallbacks, standalone scripts, or parallel pipelines.
67
+ - **Protected files** (see CRITICAL_INSTRUCTIONS): never `cp`/`mv`/backup without EXPLICIT PERMISSION IN CAPITALS. For critical files: only append, never overwrite completely.
68
+
69
+ ---
70
+
71
+ ## VERIFICATION LOOP (AFTER EVERY CHANGE)
72
+
73
+ Use the canonical commands in `memory-bank/CRITICAL_INSTRUCTIONS.md` for your entrypoint. Summary:
74
+
75
+ **Project 2 (enhanced_bundler_wrapper.patched.py):**
76
+ ```bash
77
+ source court_venv_20250802/bin/activate && python3 -u enhanced_bundler_wrapper.patched.py \
78
+ --output-dir /home/mrdbo/court_data/CourtBundleOutput \
79
+ --enable-discovery --enable-fuzzy --recursive --limit 15 --limit-per-bundle 5 \
80
+ 2>&1 | tee -a telemetry.log
81
+ ```
82
+ Then: `cd /home/mrdbo/projects/courtBundleGenerator2 && python3 embedding_utils/pdf_page_verifier.py /home/mrdbo/court_data/CourtBundleOutput`
83
+
84
+ **Project 3 (generate_bundles_final_corrected.py):**
85
+ ```bash
86
+ cd /home/mrdbo/projects/courtBundleGenerator3 && source ../courtBundleGenerator2/court_venv_20250802/bin/activate && \
87
+ python3 -u generate_bundles_final_corrected.py \
88
+ --output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput \
89
+ --enable-discovery --recursive --limit 15 --limit-per-bundle 5 \
90
+ 2>&1 | tee -a telemetry.log
91
+ ```
92
+ Then: `cd /home/mrdbo/projects/courtBundleGenerator3 && python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput`
93
+
94
+ **Audit/diagnostics:** See `AUDITING_COMMANDS_23_1_26.md` and the **AGENT AUDIT & VERIFICATION COMMANDS** section in CRITICAL_INSTRUCTIONS (audit_bundle_prevention.py, test_download_links.py, audit_runtime_chain, audit_runtime_blockers).
95
+
96
+ ---
97
+
98
+ ## ABSOLUTE COMPLETION GATE
99
+
100
+ You must **not** claim completion unless ALL are true:
101
+
102
+ - Bundles generated in the chosen output directory; PDFs are non-empty.
103
+ - Page-level verification run (see CRITICAL_INSTRUCTIONS — AGENT AUDIT & VERIFICATION COMMANDS) with 0 missing or fully enumerated with reason codes.
104
+ - Missing evidence summary is empty OR fully enumerated with reason codes.
105
+ - TOC sync issues == 0; DB numbers present in TOC and evidence pages; exhibit number = bundle letter+seq (e.g. A15, G7), not DB ref.
106
+ - No raw paths on PDF pages (embedding failure); continue verification loop until 0 missing. Empirical analysis only.
107
+ - User confirms PDFs are correct when applicable.
108
+
109
+ **Run proof required:** Exact command executed; at least one PDF path with size + full ISO timestamp; `missing_evidence_summary.json` status. **YOU MUST STOP FOR USER ACCEPTANCE before claiming completion.**
110
+
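The "PDF path with size + full ISO timestamp" part of the run proof can be gathered as sketched below. This complements `ls -lh` and does not replace it; `pdf_proof_lines` is a hypothetical helper, and the tab-separated output format is an assumption.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def pdf_proof_lines(output_dir: Path) -> List[str]:
    """List each PDF in output_dir with its size and UTC modification time."""
    lines = []
    for pdf in sorted(output_dir.glob("*.pdf")):
        st = pdf.stat()
        if st.st_size == 0:
            # An empty PDF fails the completion gate outright.
            raise ValueError(f"empty PDF: {pdf}")
        stamp = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat()
        lines.append(f"{pdf}\t{st.st_size} bytes\t{stamp}")
    if not lines:
        raise FileNotFoundError(f"no PDFs in {output_dir}")
    return lines
```

Raising instead of returning an empty list keeps a missing or zero-byte bundle from being reported as success.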
111
+ ---
112
+
113
+ ## PENALTY SYSTEM
114
+
115
+ - **Broken Code:** Session terminates immediately; all output invalidated.
116
+ - **Placeholder Code:** Task rejected; must re-plan.
117
+ - **Hallucinated Files/Success:** Confidence score → 0%; all claims invalidated.
118
+ - **Skipped Verification:** All subsequent output marked UNVERIFIED.
119
+ - **Bypassing Rules:** Session paused; requires explicit re-authorization.
120
+ - **Re-enabling compliance/validation that blocks pipeline:** Session paused; change reverted. Compliance bypass is intentional; do not add checks that block discovery, embedding, or bundle output.
121
+
122
+ ---
123
+
124
+ ## KEY REFERENCES
125
+
126
+ - **Full instructions:** `memory-bank/CRITICAL_INSTRUCTIONS.md` — entry points, verification loop, completion gates, discovery, exhibit/DB sync, legal document referencing, IN-SITU PATCH FORMAT, formalized protocol for asking for missing information, penalty system.
127
+ - **Audit/verification commands:** `AUDITING_COMMANDS_23_1_26.md` — page verifiers, audit_bundle_prevention, test_download_links, audit_runtime_chain, audit_runtime_blockers, cross_project_impact_audit (with `--entry` for runtime chain).
128
+ - **Exhibit referencing (legal docs):** `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `PROMPTS/LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md`. Authoritative DB list: `legal_emails/Phase8/DB_Evidence_List.txt`.
129
+ - **Protected files list:** In CRITICAL_INSTRUCTIONS (e.g. enhanced_bundler_wrapper.patched.py, create_proper_embedded_bundle.py, generate_bundles_final_corrected.py, dual_category_evidence_processor.py, categorize_and_append_v2.py).
app.py CHANGED
@@ -302,32 +302,46 @@ def security_info():
302
  }
303
 
304
 
305
- # Legal document exhibit reference instruction — for clients to prepend as system prompt
306
  _LEGAL_EXHIBIT_PROMPT_PATH = Path(__file__).resolve().parent / "prompts" / "legal_exhibit_instruction.txt"
307
 
308
  @app.get("/prompts/legal-exhibit-instruction")
309
  def get_legal_exhibit_instruction():
310
- """Return the legal exhibit referencing instruction. Clients (Cursor, desktop app) should use this as system message when drafting witness statements, N244, SRA complaint, or any court filing."""
311
- if _LEGAL_EXHIBIT_PROMPT_PATH.exists():
312
- return {"instruction": _LEGAL_EXHIBIT_PROMPT_PATH.read_text(encoding="utf-8", errors="replace")}
313
- return {"instruction": "When referencing evidence use Exhibit [Letter][Number] (DB-[N]) — [Filename]. Do not use bare DB-[●]."}
314
 
315
 
316
  # --- LLM Generation (Dual Backend: Ollama → HF Inference API) ---
317
 
318
  @app.post("/api/generate")
319
  async def generate(request: GenerateRequest, x_api_key: str = Header(None)):
320
- """Generate text using LLM. Tries Ollama first, falls back to HF Inference API."""
321
  if not x_api_key or x_api_key != API_KEY:
322
  raise HTTPException(status_code=401, detail="Invalid or missing API Key")
323
 
324
- logger.info(f"[GENERATE] model={request.model}, prompt_len={len(request.prompt)}")
325
 
326
  backend_used = None
327
  response_text = None
328
 
329
  # Backend 1: Try Ollama (local)
330
- response_text = generate_with_ollama(request.model, request.prompt)
331
  if response_text:
332
  backend_used = "ollama"
333
  logger.info(f"[GENERATE] Ollama success, response_len={len(response_text)}")
@@ -335,7 +349,7 @@ async def generate(request: GenerateRequest, x_api_key: str = Header(None)):
335
  # Backend 2: Fallback to HF Inference API
336
  if not response_text:
337
  logger.info("[GENERATE] Ollama unavailable, trying HF Inference API...")
338
- response_text = generate_with_hf_api(request.prompt)
339
  if response_text:
340
  backend_used = "hf_inference_api"
341
  logger.info(f"[GENERATE] HF API success, response_len={len(response_text)}")
@@ -641,9 +655,13 @@ async def chat_completions(
641
 
642
  logger.info(f"[CHAT] model={request.model}, messages={len(request.messages)}, stream={request.stream}")
643
 
644
  # Generate response via model routing
645
  response_text = _generate_for_model(
646
- request.model, request.messages,
647
  temperature=request.temperature or 0.7,
648
  max_tokens=request.max_tokens or 2048,
649
  )
 
302
  }
303
 
304
 
305
+ # Legal document exhibit reference instruction — injected into every generate/chat so edit sources always get it
306
  _LEGAL_EXHIBIT_PROMPT_PATH = Path(__file__).resolve().parent / "prompts" / "legal_exhibit_instruction.txt"
307
+ _LEGAL_EXHIBIT_INSTRUCTION_CACHED: Optional[str] = None
308
+
309
+ def _get_legal_exhibit_instruction() -> str:
310
+ """Load legal exhibit instruction once; used to inject into all LLM requests so Moltbot/Qwen always output Exhibit [Letter][Number] (DB-[N]) — [Filename]."""
311
+ global _LEGAL_EXHIBIT_INSTRUCTION_CACHED
312
+ if _LEGAL_EXHIBIT_INSTRUCTION_CACHED is not None:
313
+ return _LEGAL_EXHIBIT_INSTRUCTION_CACHED
314
+ if _LEGAL_EXHIBIT_PROMPT_PATH.exists():
315
+ _LEGAL_EXHIBIT_INSTRUCTION_CACHED = _LEGAL_EXHIBIT_PROMPT_PATH.read_text(encoding="utf-8", errors="replace")
316
+ else:
317
+ _LEGAL_EXHIBIT_INSTRUCTION_CACHED = "When referencing evidence use Exhibit [Letter][Number] (DB-[N]) — [Filename]. Do not use bare DB-[●]."
318
+ return _LEGAL_EXHIBIT_INSTRUCTION_CACHED
319
 
320
  @app.get("/prompts/legal-exhibit-instruction")
321
  def get_legal_exhibit_instruction():
322
+ """Return the legal exhibit referencing instruction. Also injected automatically into /api/generate and /v1/chat/completions."""
323
+ return {"instruction": _get_legal_exhibit_instruction()}
324
 
325
 
326
  # --- LLM Generation (Dual Backend: Ollama → HF Inference API) ---
327
 
328
  @app.post("/api/generate")
329
  async def generate(request: GenerateRequest, x_api_key: str = Header(None)):
330
+ """Generate text using LLM. Tries Ollama first, falls back to HF Inference API.
331
+ Legal exhibit instruction is prepended so all edits/amendments use Exhibit [Letter][Number] (DB-[N]) — [Filename].
332
+ """
333
  if not x_api_key or x_api_key != API_KEY:
334
  raise HTTPException(status_code=401, detail="Invalid or missing API Key")
335
 
336
+ # Inject legal exhibit instruction so edit sources always get the rule
337
+ prompt_with_legal = _get_legal_exhibit_instruction() + "\n\n---\n\n" + request.prompt
338
+ logger.info(f"[GENERATE] model={request.model}, prompt_len={len(prompt_with_legal)}")
339
 
340
  backend_used = None
341
  response_text = None
342
 
343
  # Backend 1: Try Ollama (local)
344
+ response_text = generate_with_ollama(request.model, prompt_with_legal)
345
  if response_text:
346
  backend_used = "ollama"
347
  logger.info(f"[GENERATE] Ollama success, response_len={len(response_text)}")
 
349
  # Backend 2: Fallback to HF Inference API
350
  if not response_text:
351
  logger.info("[GENERATE] Ollama unavailable, trying HF Inference API...")
352
+ response_text = generate_with_hf_api(prompt_with_legal)
353
  if response_text:
354
  backend_used = "hf_inference_api"
355
  logger.info(f"[GENERATE] HF API success, response_len={len(response_text)}")
 
655
 
656
  logger.info(f"[CHAT] model={request.model}, messages={len(request.messages)}, stream={request.stream}")
657
 
658
+ # Inject legal exhibit instruction so every edit/amendment/insert uses Exhibit [Letter][Number] (DB-[N]) — [Filename]
659
+ legal_system = ChatMessage(role="system", content=_get_legal_exhibit_instruction())
660
+ messages_with_legal = [legal_system] + list(request.messages)
661
+
662
  # Generate response via model routing
663
  response_text = _generate_for_model(
664
+ request.model, messages_with_legal,
665
  temperature=request.temperature or 0.7,
666
  max_tokens=request.max_tokens or 2048,
667
  )
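The chat-side injection in this hunk can be isolated as a small pure function. This is a self-contained sketch, not the code in `app.py`: the `ChatMessage` dataclass is a stand-in for the app's own model, `with_legal_system_message` is a hypothetical name, and the duplicate check is an addition that the commit does not make.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ChatMessage:  # stand-in for the app's own ChatMessage model
    role: str
    content: str


def with_legal_system_message(
    messages: Sequence[ChatMessage], instruction: str
) -> List[ChatMessage]:
    """Return a new list with the legal instruction as the first system message."""
    # Not in the commit: skip injection when the client already sent
    # the same instruction as its first system message.
    if messages and messages[0].role == "system" and messages[0].content == instruction:
        return list(messages)
    # Build a new list so the caller's request object is left untouched.
    return [ChatMessage(role="system", content=instruction)] + list(messages)
```

Clients that already fetch `GET /prompts/legal-exhibit-instruction` and send it themselves would otherwise receive the instruction twice, which is why the sketch includes the guard.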
memory-bank/ARCHITECTURE.md ADDED
@@ -0,0 +1,117 @@
1
+ # /home/mrdbo/projects/courtBundleGenerator2/memory-bank/ARCHITECTURE.md
2
+ # SYNCED FILE — Source: courtBundleGenerator2/memory-bank/ARCHITECTURE.md
3
+ # DO NOT EDIT IN OTHER REPOS — edit source and run scripts/sync_agent_docs.sh
4
+
5
+ # Project Architecture — Court Bundle Generator
6
+
7
+ ## 4 Repos. 2 Local Dev. 2 HF Space Local Clones.
8
+
9
+ ```
10
+ /home/mrdbo/projects/
11
+ ├── courtBundleGenerator2/ ← REPO 1 (P2)
12
+ ├── courtBundleGenerator3/ ← REPO 2 (P3)
13
+ ├── moltbot-legal-desktop/ ← REPO 3 (HF Space A local clone)
14
+ └── moltbot-hybrid-engine/ ← REPO 4 (HF Space B local clone)
15
+
16
+ /home/mrdbo/court_data/ ← OUTPUT DIRECTORY ONLY. NOT A REPO.
17
+ /home/mrdbo/projects/courtBundleGenerator2/evidence/ ← EVIDENCE SUBFOLDER OF P2. NOT A REPO.
18
+ ```
19
+
20
+ ---
21
+
22
+ ## Repo Roles
23
+
24
+ ### REPO 1 — courtBundleGenerator2 (P2)
25
+ - **Role:** Evidence root. Legacy bundler. Documentation home.
26
+ - **Evidence root:** `/home/mrdbo/projects/courtBundleGenerator2/evidence/` (full recursive scan, no whitelists)
27
+ - **Entrypoint:** `enhanced_bundler_wrapper.patched.py`
28
+ - **Output dir:** `/home/mrdbo/court_data/CourtBundleOutput`
29
+ - **Symlink:** `./output` → `/home/mrdbo/court_data/CourtBundleOutput` (always use full path in commands)
30
+ - **Venv:** `/home/mrdbo/projects/courtBundleGenerator2/court_venv_20250802/bin/python`
31
+ - **Documentation home:** `memory-bank/`, `PROMPTS/`
32
+ - **Rule:** Read-only for evidence files. Do NOT add new logic adapters here.
33
+
34
+ ### REPO 2 — courtBundleGenerator3 (P3)
35
+ - **Role:** Active logic center. All new adapters, tools, bridge scripts go here.
36
+ - **Entrypoint:** `generate_bundles_final_corrected.py`
37
+ - **Output dir:** `/home/mrdbo/court_data/2nd_CourtBundleOutput`
38
+ - **Symlink:** `./output` → `/home/mrdbo/court_data/2nd_CourtBundleOutput` (always use full path in commands)
39
+ - **Venv:** Shared — `/home/mrdbo/projects/courtBundleGenerator2/court_venv_20250802/bin/python`
40
+ - **Key files:** `cloud_llm_adapter.py`, `moltbot_track_changes.py`, `generate_bundles_final_corrected.py`
41
+ - **Rule:** Install all Python dependencies and bridge scripts HERE, not in P2.
42
+
43
+ ### REPO 3 — moltbot-legal-desktop (HF Space A)
44
+ - **Role:** Local clone of Hugging Face Space `deebee7/moltbot-legal-desktop`.
45
+ - **What runs here locally:** Nothing. Edit locally, then `git push` to deploy.
46
+ - **What the Space runs:** FastAPI web server (`app.py`) on port 7860.
47
+ - **Live URL:** `https://deebee7-moltbot-legal-desktop.hf.space`
48
+ - **Live endpoints:** `/health`, `/api/generate_bundle`, `/api/bundles`, `/api/evidence_stats`, `/api/analyze`
49
+ - **Deploy command (run from this repo root):** `git add -A && git commit -m "msg" && git push origin main`
50
+ - **Sync from P2/P3:** `cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh`
51
+ - **SDK:** Docker (Python 3.10 + LibreOffice + uvicorn)
52
+
53
+ ### REPO 4 — moltbot-hybrid-engine (HF Space B)
54
+ - **Role:** Local clone of Hugging Face Space `deebee7/moltbot-hybrid-engine`.
55
+ - **What runs here locally:** Nothing. Edit locally, then `git push` to deploy.
56
+ - **What the Space runs:** FastAPI + Ollama + Qwen 2.5. OpenAI-compatible API.
57
+ - **Live URL:** `https://deebee7-moltbot-hybrid-engine.hf.space`
58
+ - **Live endpoints:** `/health`, `/api/generate`, `/api/search`, `/api/analyze`, `/v1/chat/completions`, `/v1/models`, `GET /prompts/legal-exhibit-instruction`
59
+ - **Deploy command (run from this repo root):** `git add -A && git commit -m "msg" && git push origin main`
60
+ - **SDK:** Docker
61
+
62
+ ---
63
+
64
+ ## Output Directories (Not repos — never commit here)
65
+
66
+ | Path | Purpose | Used by |
67
+ |---|---|---|
68
+ | `/home/mrdbo/court_data/CourtBundleOutput` | P2 bundle output | P2 entrypoint |
69
+ | `/home/mrdbo/court_data/2nd_CourtBundleOutput` | P3 bundle output | P3 entrypoint |
70
+
71
+ ---
72
+
73
+ ## Evidence Root (Subfolder of P2 — not a repo)
74
+
75
+ - **Path:** `/home/mrdbo/projects/courtBundleGenerator2/evidence/`
76
+ - **Discovery policy:** Full recursive scan (`os.walk` / `rglob`). No whitelists. No allow-lists.
77
+ - **Historically missed directories (must never be excluded):** `Repairs`, `InputDocs`, `new_evidence_staging`, `00_CRITICAL_SCANNED`, `00_CRITICAL_INTAKE`
78
+ - **External evidence path:** `/legal_emails` also scanned
79
+
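One way to guard against the historically missed directories is to check that each name appears somewhere in the parent paths of the discovered files. The sketch below is an illustration only: `dirs_absent_from_scan` is a hypothetical helper, and it assumes each of these directories contains at least one file. An empty directory would also be reported, so a non-empty result means "inspect", not "broken".

```python
from pathlib import Path
from typing import Iterable, List

# Directory names taken from the list above.
HISTORICALLY_MISSED = [
    "Repairs",
    "InputDocs",
    "new_evidence_staging",
    "00_CRITICAL_SCANNED",
    "00_CRITICAL_INTAKE",
]


def dirs_absent_from_scan(
    found: Iterable[Path], names: Iterable[str] = HISTORICALLY_MISSED
) -> List[str]:
    """Return the directory names that no discovered file lives under."""
    seen = set()
    for path in found:
        # Only parent components count; a file that happens to be
        # named "Repairs" does not satisfy the check.
        seen.update(path.parent.parts)
    return [name for name in names if name not in seen]
```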
80
+ ---
81
+
82
+ ## Cloud Infrastructure (Not local repos — deployed via git push)
83
+
84
+ | Space | HF Repo | Local Clone | Role |
85
+ |---|---|---|---|
86
+ | HF Space A | `deebee7/moltbot-legal-desktop` | `moltbot-legal-desktop/` | Web bundle server |
87
+ | HF Space B | `deebee7/moltbot-hybrid-engine` | `moltbot-hybrid-engine/` | Qwen 2.5 brain |
88
+
89
+ **Check Space health:**
90
+ ```bash
91
+ curl -s https://deebee7-moltbot-hybrid-engine.hf.space/health | python3 -m json.tool
92
+ curl -s https://deebee7-moltbot-legal-desktop.hf.space/health | python3 -m json.tool
93
+ ```
94
+
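The same health check can be run from Python, for example inside an audit script. A minimal sketch: `space_health_url` and `fetch_health` are hypothetical names, the URL pattern is inferred from the two Space URLs in this document, and nothing is assumed about the JSON fields that `/health` returns.

```python
import json
from typing import Callable
from urllib.request import urlopen


def space_health_url(owner: str, space: str) -> str:
    """Build the /health URL for a Space, following the URLs shown above."""
    return f"https://{owner}-{space}.hf.space/health"


def fetch_health(url: str, opener: Callable = urlopen, timeout: float = 10.0) -> dict:
    """GET the URL and decode the JSON body; opener is injectable for tests."""
    with opener(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_health(space_health_url("deebee7", "moltbot-hybrid-engine"))` performs the same request as the first `curl` line. Network errors and non-JSON bodies propagate as exceptions rather than being swallowed.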
95
+ ---
96
+
97
+ ## VS Code Workspace
98
+
99
+ File: `/home/mrdbo/projects/MyProjects.code-workspace`
100
+ All 4 repos plus `court_data` (output) and `evidence` (subfolder) are opened as workspace folders for convenience. `court_data` and `evidence` are NOT repos.
101
+
102
+ ---
103
+
104
+ ## Key Shared Config (Lives in P2, used by P3 via import)
105
+
106
+ | File | Repo | Purpose |
107
+ |---|---|---|
108
+ | `config/path_config.py` | P2 | Discovery roots — must return full evidence tree |
109
+ | `file_resolution_bridge.py` | P2 | File resolution with caching |
110
+ | `lib/db_registry.py` | P2 | DB-[N] assignment — never sets exhibitNo |
111
+ | `config/bundle_compliance.json` | P2 | Court formatting reference (not a pipeline gate) |
112
+ | `legal_emails/Phase8/DB_Evidence_List.txt` | P2 | Authoritative DB1–DB170 list |
113
+
114
+ ---
115
+
116
+ *Source of truth: courtBundleGenerator2/memory-bank/ARCHITECTURE.md*
117
+ *Synced to all repos by: scripts/sync_agent_docs.sh*
memory-bank/CLAUDE.md ADDED
@@ -0,0 +1,80 @@
1
+ # /home/mrdbo/projects/courtBundleGenerator2/memory-bank/CLAUDE.md
2
+ # CLAUDE Project Summary (Current as of 2026-02-13)
3
+
4
+ ## Current State (2026-02-13)
5
+ - **Exhibit/DB sync:** `lib/db_registry.py` writes only DB refs (never exhibitNo/Exhibit No.). P3 `generate_bundles_final_corrected.py` uses bundle letter+seq for exhibit (e.g. A15, G7) and `db_ref` for DB only. Authoritative list: `legal_emails/Phase8/DB_Evidence_List.txt`. TOC, footer, metadata on page stay in sync.
6
+ - **Legal document referencing:** All court filings (witness statement, N244, SRA) must use **Exhibit [Letter][Number] (DB-[N]) — [Filename]**. No bare "DB-[●]". See `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `PROMPTS/LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md`. `.cursorrules` includes the rule for courtBundleGenerator2; Moltbot/Qwen clients should send the instruction as system message (Engine: `GET /prompts/legal-exhibit-instruction`).
7
+ - **Hybrid Cloud Architecture:** Two HF Spaces deployed and running:
8
+ - **Space A** (`deebee7/moltbot-legal-desktop`): FastAPI web server (`app.py`) on port 7860 for cloud bundle generation. Docker SDK, Python 3.10 + LibreOffice. Local clone at `/home/mrdbo/projects/moltbot-legal-desktop`.
9
+ - **Space B** (`deebee7/moltbot-hybrid-engine`): Ollama + Qwen 2.5 LLM; FastAPI + `start.sh`; OpenAI-compatible `/v1/chat/completions`. **`GET /prompts/legal-exhibit-instruction`** returns legal exhibit instruction for clients. Docker SDK, Python 3.11-slim. Local clone at `/home/mrdbo/projects/moltbot-hybrid-engine`.
10
+ - **Sync System:** `sync_to_desktop.sh` + `install_sync_hook.sh` in `courtBundleGenerator3/adapters/` syncs P2 libraries and P3 adapters/tools to Desktop space. Post-commit hooks available.
11
+ - **DBRegistry:** Import wrapped with `try...except` + `HAS_DB_REGISTRY`; double-fallback in Desktop. Registry seeds from `DB_Evidence_List.txt`; never overwrites exhibit number with DB ref.
12
+ - **Verification & metadata:** Page verifier (`pdf_page_verifier_enhanced.py` in P3); metadata on every page; DB fallback DB-0. Audit: `tools/audit_bundle_prevention.py`, `tools/test_download_links.py`, `tools/audit_active_bundling_files.py` (→ gb3_deps.json), `tools/cross_project_impact_audit.py` (optional `--entry` for runtime chain).
13
+ - Metadata ingestion, recursion guard, prompt system, category_mapping, dual category processor as previously documented.
14
+
15
+ ## Architecture Map (4 Projects + Output Directory)
16
+ | # | Project | Path | Role |
17
+ |---|---------|------|------|
18
+ | P2 | courtBundleGenerator2 | `/home/mrdbo/projects/courtBundleGenerator2` | Evidence Root, Legacy Bundler, Documentation |
19
+ | P3 | courtBundleGenerator3 | `/home/mrdbo/projects/courtBundleGenerator3` | Smart Agent Home, Logic Center |
20
+ | Desktop | moltbot-legal-desktop | `/home/mrdbo/projects/moltbot-legal-desktop` | HF Space A — Cloud bundle web server |
21
+ | Engine | moltbot-hybrid-engine | `/home/mrdbo/projects/moltbot-hybrid-engine` | HF Space B — Ollama + Qwen 2.5 LLM |
22
+ | (data) | court_data | `/home/mrdbo/court_data` | Bundle output directory |
23
+
24
+ ## Mandate
25
+ - **Output paths:** P2 → `/home/mrdbo/court_data/CourtBundleOutput`; P3 → `/home/mrdbo/court_data/2nd_CourtBundleOutput`. Do not use `./local_output` for production.
26
+ - Treat `/home/mrdbo/projects/courtBundleGenerator2/evidence/InputDocs/**` and `/home/mrdbo/projects/courtBundleGenerator2/evidence/new_evidence_staging/**` as the only writable discovery sources. All other `/evidence/*` folders exist for read-only reference.
27
+ - Keep the Chain of Verification intact: every run must surface logs from `AntiHallucinationManager`, `EnhancedFuzzyResolver`, and `UnifiedEvidenceBridge` before evidence is embedded.
28
+ - Do not claim success until: (1) canonical run command executed (see CRITICAL_INSTRUCTIONS.md), (2) at least one non-empty PDF in chosen output dir (CourtBundleOutput or 2nd_CourtBundleOutput), (3) page-level verification run. See AUDITING_COMMANDS_23_1_26.md for audit/verification commands.
29
+ - When editing code, annotate complex fixes with the relevant path and line number (e.g., `# FIX: create_proper_embedded_bundle.py:2882`).
30
+
31
+ ## Open Issues to Track
32
+ 1. **Verification automation** – capture and archive the stdout/stderr from the bundler commands in Core Commands for each run so future agents know the last known good state.
33
+ 2. **Dual-category imports** – finish staggering imports inside `dual_category_evidence_processor.py` so instantiation no longer prints the circular import warning when invoked in isolation.
34
+ 3. **Documentation consistency** – every `memory-bank/*` document must reflect the narrow discovery scope and current integration notes (this file sets the tone).
35
+ 4. **Jira integration** – missing env vars (JIRA_URL, JIRA_EMAIL, JIRA_TOKEN) cause 404 errors in agent logger.
36
+ 5. **Hybrid Engine model pull** – Qwen 2.5 7B model pull may not complete on HF free tier (2 CPU, 16GB RAM). Monitor `/api/generate` endpoint for 503 status.
37
+
38
+ ## Core Commands
39
+ ```bash
40
+ # P2: output only /home/mrdbo/court_data/CourtBundleOutput
41
+ source court_venv_20250802/bin/activate && python3 -u enhanced_bundler_wrapper.patched.py \
42
+ --output-dir /home/mrdbo/court_data/CourtBundleOutput --limit 1 --recursive
43
+
44
+ # P3: output only /home/mrdbo/court_data/2nd_CourtBundleOutput
45
+ cd /home/mrdbo/projects/courtBundleGenerator3 && python3 -u generate_bundles_final_corrected.py \
46
+ --output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput --limit 1 --recursive
47
+
48
+ # Page verifier P3
49
+ cd /home/mrdbo/projects/courtBundleGenerator3 && python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput
50
+
51
+ # Confirm output (use dir that matches entrypoint)
52
+ ls -lh /home/mrdbo/court_data/CourtBundleOutput/court_bundle*.pdf
53
+ ls -lh /home/mrdbo/court_data/2nd_CourtBundleOutput/*.pdf
54
+
55
+ # HF Space health checks
56
+ curl -s https://deebee7-moltbot-hybrid-engine.hf.space/health | python3 -m json.tool
57
+ curl -s https://deebee7-moltbot-legal-desktop.hf.space/health | python3 -m json.tool
58
+
59
+ # Sync P2/P3 → Desktop
60
+ cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh --push
61
+
62
+ # Deploy to HF (Desktop)
63
+ cd /home/mrdbo/projects/moltbot-legal-desktop && git add -A && git commit -m "msg" && git push origin main
64
+
65
+ # Deploy to HF (Engine)
66
+ cd /home/mrdbo/projects/moltbot-hybrid-engine && git add -A && git commit -m "msg" && git push origin main
67
+ ```
68
+
69
+ ## Reference Files
70
+ - `create_proper_embedded_bundle.py`, `lib/db_registry.py`, `legal_emails/Phase8/DB_Evidence_List.txt`, `BUNDLE_GROUPS_WITH_FULL_EVIDENCE_FILE_NAMES.md`
71
+ - `cohesive_unified_evidence_processor.py`, `category_mapping.py`, `dual_category_evidence_processor.py`
72
+ - `embedding_utils/prompt_system_integration.py`, `embedding_utils/enhanced_features.py`
73
+ - `enhanced_bundler_wrapper.patched.py`
74
+ - courtBundleGenerator3: `generate_bundles_final_corrected.py`, `pdf_page_verifier_enhanced.py`, `tools/audit_bundle_prevention.py`, `tools/test_download_links.py`, `tools/audit_active_bundling_files.py`, `tools/cross_project_impact_audit.py`
75
+ - courtBundleGenerator2: `tools/cross_project_impact_audit.py` (runtime chain with `--entry`)
76
+ - PROMPTS: `PROMPT_HEADER_13_12_25.md`, `EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `HOW_TO_MAKE_AGENTS_AWARE.md`
77
+ - moltbot-legal-desktop: `app.py` (FastAPI web server), `Dockerfile`
78
+ - moltbot-hybrid-engine: `app.py` (FastAPI), `start.sh`, `Dockerfile`, `prompts/legal_exhibit_instruction.txt`, GET `/prompts/legal-exhibit-instruction`
79
+ - `memory-bank/CRITICAL_INSTRUCTIONS.md`
80
+ - `AUDITING_COMMANDS_23_1_26.md`
memory-bank/CRITICAL_INSTRUCTIONS.md ADDED
@@ -0,0 +1,948 @@
1
+ # /home/mrdbo/projects/courtBundleGenerator2/memory-bank/CRITICAL_INSTRUCTIONS.md
2
+
3
+ ### MULTI-LLM CROSS-VERIFICATION PROTOCOL
4
+
5
+ 1. Process evidence through BOTH local MoltBot AND cloud Qwen 2.5
6
+ 2. Compare outputs at: categorization, DB assignment, embedding verification
7
+ 3. Cloud failure → fallback to local with warning (never block pipeline)
8
+ 4. Daily HF space health check required
9
+
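The protocol above can be sketched as follows. This is a hedged illustration only: `cross_verify`, the stage keys, and the dict-shaped outputs are assumptions made for the example and do not describe `cloud_llm_adapter.py`. It compares local and cloud outputs at the three stages and, on any cloud failure, logs a warning and continues with the local result.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("cross_verify")

# Stage names mirror step 2 of the protocol; the key spellings are assumptions.
STAGES = ("categorization", "db_assignment", "embedding_verification")


def cross_verify(
    item: Any,
    local_fn: Callable[[Any], Dict[str, Any]],
    cloud_fn: Callable[[Any], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run both models on one item and report the stages where they disagree."""
    local_out = local_fn(item)
    try:
        cloud_out = cloud_fn(item)
    except Exception as exc:  # any cloud failure falls back, never blocks
        logger.warning("Cloud LLM failed (%s); falling back to local result", exc)
        return {"result": local_out, "source": "local-fallback", "disagreements": []}
    disagreements = [s for s in STAGES if local_out.get(s) != cloud_out.get(s)]
    return {"result": local_out, "source": "both", "disagreements": disagreements}
```

Disagreements are returned for review rather than raised, so the pipeline keeps running.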
10
+ ### EMBEDDING INTEGRITY REQUIREMENTS
11
+
12
+ 1. **Pre-embedding validation:** Confirm 100% of TOC files exist BEFORE PDF generation
13
+ 2. **Real-time embedding monitoring:** Track success/failure per file
14
+ 3. **DB reference audit:** Every DB in TOC must appear on actual PDF page
15
+ 4. **Zero blank placeholder tolerance:** Fix or remove invalid DB references
16
+
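Requirement 1 (pre-embedding validation) can be sketched as a report-only check. The function name `missing_toc_files` and the assumption that TOC entries are already resolved to file paths are hypothetical; the real bundler may resolve paths differently.

```python
from pathlib import Path
from typing import Iterable, List


def missing_toc_files(toc_paths: Iterable[str]) -> List[str]:
    """Return the TOC paths that do not exist as regular files on disk."""
    return [p for p in toc_paths if not Path(p).is_file()]
```

The function only reports; what to do with a non-empty result (fix the reference or remove it) is left to the caller, so it does not gate the run.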
17
+ ### COMPLETION GATE UPDATES
18
+
19
+ ✅ No blank placeholder pages
20
+ ✅ Cloud LLM operational OR fallback executed
21
+ ✅ Pre-embedding validation report generated
22
+ ✅ All TOC DB references appear on actual PDF pages
23
+
24
+ ## CORE DIRECTIVES
25
+
26
+ ## 1.1 HYBRID ARCHITECTURE MAP (DO NOT INFER)
27
+
28
+ **CRITICAL**: You must strictly adhere to these project roles. DO NOT cross-contaminate.
29
+
30
+ ### Local Development Projects
31
+
32
+ 1. **PROJECT 3 (`/home/mrdbo/projects/courtBundleGenerator3`)** — Smart Agent Home
33
+ - **ROLE**: Logic Center. All new adapters, tools, bridge scripts.
34
+ - **CONTENTS**: `cloud_llm_adapter.py`, `moltbot_track_changes.py`, DeepEval, `generate_bundles_final_corrected.py` (P3 copy).
35
+ - **ACTION**: Install all Python dependencies and bridge scripts HERE.
36
+ - **OUTPUT**: `/home/mrdbo/court_data/2nd_CourtBundleOutput`
37
+
38
+ 2. **PROJECT 2 (`/home/mrdbo/projects/courtBundleGenerator2`)** — Evidence Root
39
+ - **ROLE**: Legacy Bundle Generator & Evidence Root. Documentation home (`PROMPTS/`, `memory-bank/`).
40
+ - **CONTENTS**: `/evidence/` data, `enhanced_bundler_wrapper.patched.py`, `create_proper_embedded_bundle.py`.
41
+ - **ACTION**: Read-only for Evidence. Do NOT add new logic adapters here.
42
+ - **OUTPUT**: `/home/mrdbo/court_data/CourtBundleOutput`
43
+
44
+ ### Hugging Face Cloud Spaces
45
+
46
+ 1. **DESKTOP SPACE — HF Space A (`deebee7/moltbot-legal-desktop`)**
47
+ - **LOCAL CLONE**: `/home/mrdbo/projects/moltbot-legal-desktop`
48
+ - **ROLE**: Cloud bundle generation web server (FastAPI on port 7860).
49
+ - **CONTENTS**: `app.py` (FastAPI), `generate_bundles_final_corrected.py`, adapters (synced from P3), libraries (synced from P2).
50
+ - **ENDPOINTS**: `/health`, `/api/generate_bundle`, `/api/bundles`, `/api/evidence_stats`, `/api/analyze`
51
+ - **DEPLOY**: `cd /home/mrdbo/projects/moltbot-legal-desktop && git add -A && git commit -m "msg" && git push origin main`
52
+ - **SDK**: Docker (Python 3.10 + LibreOffice + uvicorn)
53
+
54
+ 2. **HYBRID ENGINE — HF Space B (`deebee7/moltbot-hybrid-engine`)**
55
+ - **LOCAL CLONE**: `/home/mrdbo/projects/moltbot-hybrid-engine`
56
+ - **ROLE**: Remote Uncensored Brain. Runs Ollama + Qwen 2.5; OpenAI-compatible `/v1/chat/completions`.
57
+ - **CONTENTS**: `app.py` (FastAPI), `Dockerfile`, `start.sh`, `prompts/legal_exhibit_instruction.txt`.
58
+ - **ENDPOINTS**: `/health`, `/api/generate`, `/api/search`, `/api/analyze`, `/tools/analyze_report`, `/v1/chat/completions`, `/v1/models`, **`GET /prompts/legal-exhibit-instruction`** (returns legal exhibit referencing instruction for clients to use as system message).
59
+ - **DEPLOY**: `cd /home/mrdbo/projects/moltbot-hybrid-engine && git add -A && git commit -m "msg" && git push origin main`
60
+ - **ACCESS**: Via `cloud_llm_adapter.py` from P3 or curl from Desktop Space.
61
+
62
+ ### HF Space Management
63
+
64
+ ```bash
65
+ # Check space health
66
+ curl -s https://deebee7-moltbot-hybrid-engine.hf.space/health | python3 -m json.tool
67
+ curl -s https://deebee7-moltbot-legal-desktop.hf.space/health | python3 -m json.tool
68
+
69
+ # Pause + restart (force rebuild) via Python
70
+ python3 -c "
71
+ from huggingface_hub import HfApi
72
+ api = HfApi(token='YOUR_TOKEN')
73
+ api.pause_space('deebee7/moltbot-legal-desktop')
74
+ import time; time.sleep(3)
75
+ api.restart_space('deebee7/moltbot-legal-desktop')
76
+ "
77
+ ```
78
+
79
+ ### Sync Mechanism (P2/P3 → Desktop)
80
+
81
+ ```bash
82
+ # Manual sync
83
+ cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh
84
+
85
+ # Sync + push to HF
86
+ cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh --push
87
+
88
+ # Install auto-sync git hooks
89
+ cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash install_sync_hook.sh
90
+ ```
91
+
92
+ ### Evidence Root
93
+
94
+ - All evidence resides under `/home/mrdbo/projects/courtBundleGenerator2/evidence/`
95
+ - **CRITICAL**: Evidence files can be found in ANY `/evidence/` subdirectory
96
+ - All subdirectories have **equal priority** — no allow-lists
97
+ - **Policy**: Full RECURSIVE SCAN
98
+
99
+ ### **2. DISCOVERY & FILE RESOLUTION**
100
+
101
+ **SCOPE:**
102
+ - **Root:** `/home/mrdbo/projects/courtBundleGenerator2/evidence`
103
+ - **Policy:** RECURSIVE SCAN (`os.walk` or `rglob`).
104
+ - **Explicit Includes:**
105
+ - `/evidence/Repairs` (MUST BE FOUND)
106
+ - `/evidence/InputDocs`
107
+ - `/evidence/new_evidence_staging`
108
+ - `/evidence/00_CRITICAL_SCANNED`
109
+ - `/evidence/00_CRITICAL_INTAKE`
110
+ - `/legal_emails`
111
+
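A minimal sketch of the recursive scan policy, assuming a simple filename-to-paths index (the name `index_evidence` and the index shape are illustrative, not the actual `index_evidence_files` implementation). `os.walk` visits every subdirectory, so no directory list is needed.

```python
import os
from collections import defaultdict
from typing import Dict, List


def index_evidence(root: str) -> Dict[str, List[str]]:
    """Walk the whole tree under root and map each filename to all of its paths."""
    index: Dict[str, List[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index[name].append(os.path.join(dirpath, name))
    return dict(index)
```

Because the walk starts at the root and applies no filter, a file under `Repairs` or any deeper folder is indexed the same way as any other.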
112
+ ---
113
+
114
+ ### **4. CURRENT CODE STATE (2026-02-13)**
115
+
116
+ - **`lib/db_registry.py`:** Writes only `dbReference`/`DB Reference`/`DB_Reference`; does **not** set `exhibitNo` or `Exhibit No.`. Provides `sync_exhibit_db_references()`. Seeds from `legal_emails/Phase8/DB_Evidence_List.txt`.
117
+ - **`generate_bundles_final_corrected.py` (P3):** Exhibit number = `bundle_exhibit_no` (e.g. A15, G7); DB ref = `db_ref` from registry. TOC row update sets only `dbReference`. Item metadata uses bundle letter+seq for exhibit, DB for dbReference only.
118
+ - **`config/path_config.py`:** `get_authoritative_discovery_roots` returns a hardcoded list of ALL evidence directories. `EVIDENCE_DIRECTORIES` and `EVIDENCE_DISCOVERY_DIRECTORIES` match this list.
119
+ - **`src/prompt_system_integrator.py`:** `_create_compliance_enforcer` returns `None` (Compliance Bypassed).
120
+ - **`enhanced_bundler_wrapper.patched.py`:** Compliance checks removed. `try...except` block syntax error fixed.
121
+ - **`generate_bundles_final_corrected.py`:** `index_evidence_files` performs a direct `os.walk` on `/evidence`, bypassing any config restrictions.
122
+ - **`file_resolution_bridge.py`:** `find_file` logic simplified to use `resolution_cache.json` as the primary source of truth.
123
+
124
+ ---
125
+
126
+ ---
127
+
128
+ ## CRITICAL DIRECTIVES (HIERARCHY 1 - ABSOLUTE)
129
+
130
+ ### 1. NO BROKEN CODE
131
+
132
+ Code with syntax errors, incomplete logic, or untested assumptions **terminates the session immediately**. Test mentally before providing ANY code.
133
+
134
+ ### 2. NO PLACEHOLDERS
135
+
136
+ Methods printing "TODO" or returning unchanged data are **prohibited**. Implement the logic fully or stop. No exceptions.
137
+
138
+ ### 3. NO SYMPTOM TREATMENT
139
+
140
+ Fix **root causes only**. No patches, workarounds, bypasses, fallbacks, standalone scripts, or parallel pipelines. If you cannot fix the root cause, state this explicitly and ask for guidance.
141
+
142
+ ### 4. EMPIRICAL EVIDENCE ONLY
143
+
144
+ Every diagnosis requires **concrete proof**: logs, diffs, or shown code. No assumptions, inferences, or guesses permitted. If you don't have evidence, you must ask for it using the formalized protocol (see ASKING FOR MISSING INFORMATION below).
145
+
146
+ ### 5. PATH DISCIPLINE
147
+
148
+ - **ONLY USE:** `--output-dir /home/mrdbo/court_data/CourtBundleOutput` (for Project 2)
149
+ - **ONLY USE:** `--output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput` (for Project 3)
150
+ - **NEVER USE:** `./local_output` or any other output path
151
+ - Strictly obey the `--output-dir` provided in the command. **NEVER fall back** to hardcoded defaults.
152
+
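The path rules above can be expressed as a small report-only check. This is a sketch under assumptions: the `P2`/`P3` keys and the function name are hypothetical, and the check returns a message instead of raising so that it cannot stop a run.

```python
import posixpath
from typing import Optional

# Output directories exactly as listed in the rules above.
ALLOWED_OUTPUT_DIRS = {
    "P2": "/home/mrdbo/court_data/CourtBundleOutput",
    "P3": "/home/mrdbo/court_data/2nd_CourtBundleOutput",
}


def output_dir_violation(project: str, output_dir: str) -> Optional[str]:
    """Return a description of the violation, or None if the path is the allowed one."""
    expected = ALLOWED_OUTPUT_DIRS[project]
    # normpath drops a trailing slash and collapses "./" so equivalent spellings match.
    if posixpath.normpath(output_dir) != expected:
        return f"{project} must use --output-dir {expected}, got {output_dir!r}"
    return None
```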
153
+ ### 6. TRUTH IN TELEMETRY
154
+
155
+ - Do **NOT** print "✅" or claim a file exists unless you have successfully run `ls -lh [EXACT_PATH]` and seen the output
156
+ - Do **NOT** hallucinate filenames, timestamps, or success messages
157
+ - All claims must be verifiable with concrete command output
158
+
159
+ ---
160
+
161
+ ## ANTI-HALLUCINATION PROTOCOL (MANDATORY)
162
+
163
+ 1. **RAG (Retrieval-Augmented Generation):** Use `FileResolutionBridge` for all file resolution. Never assume file locations.
164
+ 2. **Chain of Thought:** Show step-by-step logic before conclusions. Document your reasoning.
165
+ 3. **Chain of Verification:** Validate bundle existence before claiming success. Run `ls -lh` on claimed outputs.
166
+ 4. **Specificity:** Detailed context only - no generic statements like "the system works" or "files were processed."
167
+ 5. **Role Assignment:** Respect component expertise boundaries. Don't modify code outside your assigned area.
168
+ 6. **Require Sources:** Verify sources for all evidence claims. Cite line numbers and file paths.
169
+ 7. **Advanced Models:** Use `EnhancedEmbeddingFeatures` appropriately for metadata enrichment.
170
+ 8. **Confidence Levels:** Score reliability (80% minimum threshold). Mark outputs below this as UNVERIFIED.
171
+ 9. **Multiple Models:** Use correct model for task - Legacy vs Discovery vs Enrichment.
172
+ 10. **Lower Temperature:** Use deterministic config for reproducibility (temperature ≤ 0.3).
173
+ 11. **External Fact-Checking:** Use court compliance requirements in `config/bundle_compliance.json` as **reference only** — for formatting and content checks. Do **NOT** use them as a gate that blocks discovery, embedding, or bundle output (see section COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK).
174
+ 12. **Confidence Threshold:** 80% minimum. If below, mark all output as **UNVERIFIED** and request human review.
175
+
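Items 8 and 12 describe the same 80% threshold. A minimal sketch of how an output might be labelled, assuming confidence is supplied as a fraction between 0 and 1 (the function name and the label wording are illustrative):

```python
CONFIDENCE_THRESHOLD = 0.80  # 80% minimum, per items 8 and 12 above


def label_output(text: str, confidence: float) -> str:
    """Prefix text with UNVERIFIED when confidence is below the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < CONFIDENCE_THRESHOLD:
        return f"UNVERIFIED (confidence {confidence:.0%}, human review requested): {text}"
    return text
```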
176
+ ---
177
+
178
+ ## ROBUST CODE STANDARDS (MANDATORY)
179
+
180
+ 1. **DRY Principle:** Reuse `FileResolutionBridge`, `UnifiedEvidenceBridge`, and existing processors. Never duplicate logic.
181
+ 2. **Extensible:** Architecture must allow future compliance features without refactoring.
182
+ 3. **Modular:** Isolated, testable changes only. One responsibility per function/class.
183
+ 4. **Non-breaking:** Preserve original functionality. Never remove features without explicit permission.
184
+ 5. **Configurable:** Use feature flags (e.g., `--enable-discovery`, `--enable-fuzzy`) for new logic.
185
+ 6. **Reusable:** Logic must work with any evidence list, not hardcoded to specific files.
186
+ 7. **Refactor:** Improve architecture - do not patch over problems.
187
+ 8. **Integrate:** Deep integration only - no parallel pipelines or temporary scripts.
188
+ 9. **NO STANDALONE:** No `temp_fix.py`, `wrapper_v2.py`, or "quick fixes" allowed.
189
+
190
+ 10. **Fix in place:** Do not create parallel or temporary scripts. Fix the files in place.
191
+
192
+ 11. **Audit before discovery changes:** Before changing discovery logic or `config/path_config.py`, run `python3 tools/audit_runtime_blockers.py` (from courtBundleGenerator2). Fix any reported blockers first.
193
+
194
+ ---
195
+
196
+ ## COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK (HISTORICAL LESSON)
197
+
198
+ **What went wrong:** A compliance system was introduced to stop agents from making wrong changes to wrong files. Instead, agents enforced it in a way that **blocked** full evidence searches, **blocked** embedding, **blocked** bundle output, and reintroduced whitelist directories, blind spots, and incorrect validation — stalling the project for months. Compliance is now **bypassed by design** so the pipeline can run.
199
+
200
+ **Rule — nothing may block the pipeline:**
201
+
202
+ - **Compliance bypass is intentional.** Do **NOT** re-enable compliance enforcers (e.g. `_create_compliance_enforcer` returning a real enforcer) that block discovery, embedding, or bundle generation. Do **NOT** add checks that prevent the bundler from running, from doing a full evidence scan, or from writing PDFs.
203
+ - **Verification and validation in this document** mean **post-hoc checks only**: run the page verifier *after* bundles are generated, run `ls -lh` on outputs, check `missing_evidence_summary.json`. They do **NOT** mean gating or blocking that stops the pipeline before or during a run.
204
+ - **PROHIBITED:** Do **NOT** add validation, verification, or compliance logic that: (1) blocks the pipeline from starting or continuing, (2) restricts evidence search to a subset of directories, (3) prevents embedding of found files, (4) prevents bundle output, or (5) reintroduces allow-lists/whitelists for discovery. Use `config/bundle_compliance.json` and similar as **reference only** (e.g. for formatting rules), not as a gate that stops execution.
205
+ - **If in doubt:** The pipeline must always be able to run a full evidence scan and produce bundle output. Any change that would prevent that is a violation of this rule.
206
+
207
+ ---
208
+
209
+ 1) Discover and map the existing chain (no assumptions)
210
+ - Identify the relevant existing modules, functions, and configs.
211
+ - Show the current path from entry point to output before your change.
212
+
213
+ 2) Design in terms of the full chain
214
+ - Explain which existing components you will reuse.
215
+ - Identify where you will insert or adjust logic (with file/line references).
216
+
217
+ 3) Implement with zero stubs
218
+ - Do not leave pass, unimplemented placeholders, or fake logic.
219
+ - All new code must be exercised by at least one CLI or test command.
220
+
221
+ 4) Prove wiring and lifecycle
222
+ - Show definition, call sites, downstream calls, and execution commands.
223
+ - Show log snippets or test output confirming actual execution.
224
+
225
+ 5) Call out any gaps
226
+ - If any step is blocked by missing files, invalid data, or broken legacy imports, explicitly call it out and provide remediation steps.
227
+
228
+ Only then may you state that a change is complete.
229
+
230
+ ## ENFORCEMENT BLOCKING RULES
231
+
232
+ ### Source of Truth
233
+
234
+ - This file (`memory-bank/CRITICAL_INSTRUCTIONS.md`) is the **source of truth**
235
+ - All other documentation defers to this file
236
+ - Conflicts between this file and other docs → this file wins
237
+
238
+ ### Protected Files (Require Explicit Permission)
239
+
240
+ **NEVER use `cp`, `mv`, or `backup` on these files:**
241
+
242
+ - `enhanced_bundler_wrapper.patched.py`
243
+ - `create_proper_embedded_bundle.py`
244
+ - `generate_bundles_final_corrected.py`
245
+ - `dual_category_evidence_processor.py`
246
+ - `courtBundleGenerator2_restored/legacy_files/categorize_and_append_v2.py`
247
+ - `categorize_and_append_v2.py`
248
+
249
+ **For other files:** ASK PERMISSION IN CAPITALS before any `cat` command that overwrites existing content.
250
+
251
+ **For critical files:** Only **append**, never overwrite completely.
252
+
253
+ ---
254
+
255
+ ## 2. COMPLIANCE & ENFORCEMENT PROTOCOLS (UPDATED)
256
+
257
+ ### A. THE "NO BLINDING" RULE (Evidence Access)
258
+
259
+ **CRITICAL:** Security boundaries must **NEVER** prevent the discovery of evidence.
260
+
261
+ - **Rule:** If a script encounters a file in a non-standard path (e.g., `/evidence_external` or a deeply nested subfolder), it must **WARN** but **PROCESS IT**.
262
+ - **Prohibited:** `sys.exit()` or `return False` on directory validation errors.
263
+ - **Required:** Log `[WARNING] Path outside standard root: {path} - PROCESSING ANYWAY`.
264
+
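A hedged sketch of the warn-but-process rule. The helper name `is_outside_root` is hypothetical; it logs the required warning text and returns a flag, and the caller is expected to process the file regardless of the result.

```python
import logging
import posixpath
from pathlib import PurePosixPath

logger = logging.getLogger("discovery")


def is_outside_root(path: str, standard_root: str) -> bool:
    """Log the required warning if path is outside standard_root; never reject it."""
    p = PurePosixPath(posixpath.normpath(path))
    root = PurePosixPath(posixpath.normpath(standard_root))
    outside = p != root and root not in p.parents
    if outside:
        logger.warning("[WARNING] Path outside standard root: %s - PROCESSING ANYWAY", path)
    return outside
```

Note that the comparison is on path components, so `/evidence_external` is not mistaken for a child of `/evidence`.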
265
+ ### B. THE "SHOW YOUR WORK" RULE (Task Completion)
266
+
267
+ **CRITICAL:** You are forbidden from claiming "Fixed" or "Complete" until you:
268
+
269
+ 1. **EXECUTE** the code (traceable via `gb3_deps.json`).
270
+ 2. **VERIFY** the output using the Mandatory Verifier:
271
+
272
+ ```bash
273
+ cd /home/mrdbo/projects/courtBundleGenerator3 && \
274
+ python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput
275
+ ```
276
+
277
+ 3. **If pagination mismatches are reported** (Printed Page Number ≠ PDF Physical Page), run the Pagination Mismatch Analyzer to identify root cause:
278
+
279
+ ```bash
280
+ cd /home/mrdbo/projects/courtBundleGenerator3 && \
281
+ python3 tools/pagination_mismatch_analyzer.py /home/mrdbo/court_data/2nd_CourtBundleOutput --json /home/mrdbo/court_data/2nd_CourtBundleOutput/diagnostics/pagination_mismatch_report.json
282
+ ```
283
+
284
+ The analyzer classifies mismatch patterns and suggests the responsible script or function (e.g. `add_volume_pagination()`, `embed_evidence_with_metadata()`). Fix the root cause in place; do not add workarounds.
285
+
286
+ ---
287
+
288
+ ## ENTRY POINTS & VERIFICATION
289
+
290
+ ### Before ANY Run (NON-NEGOTIABLE)
291
+
292
+ 1. **ASK IN CAPITALS** which entrypoint we are using:
293
+ - `enhanced_bundler_wrapper.patched.py` (Project 2)
294
+ - `generate_bundles_final_corrected.py` (Project 3)
295
+ - `create_proper_embedded_bundle.py` (direct bundler)
296
+
297
+ 2. **RUN `python3 <ENTRYPOINT> -h`** and paste the output
298
+
299
+ 3. **USE ONLY FLAGS** explicitly shown in that `-h` output
300
+
301
+ 4. If required policy flags are missing from argparse, **ADD THEM FIRST** (then re-run `-h` and paste it). Do not run unsupported flags.
302
+
303
+ ### Canonical Run Commands
304
+
305
+ **Project 2 Wrapper:**
306
+
307
+ ```bash
308
+ source court_venv_20250802/bin/activate && \
309
+ python3 -u enhanced_bundler_wrapper.patched.py \
310
+ --output-dir /home/mrdbo/court_data/CourtBundleOutput \
311
+ --enable-discovery \
312
+ --enable-fuzzy \
313
+ --recursive \
314
+ --limit 15 \
315
+ --limit-per-bundle 5 \
316
+ 2>&1 | tee -a telemetry.log
317
+ ```
318
+
319
+ **Project 3 Generator:**
320
+
321
+ ```bash
322
+ cd /home/mrdbo/projects/courtBundleGenerator3 && \
323
+ source court_venv_20250802/bin/activate && \
324
+ python3 -u generate_bundles_final_corrected.py \
325
+ --output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput \
326
+ --enable-discovery \
327
+ --recursive \
328
+ --limit 15 \
329
+ --limit-per-bundle 5 \
330
+ 2>&1 | tee -a telemetry.log
331
+ ```
332
+
333
+ **Full Integration Test:**
334
+
335
+ ```bash
336
+ # 3) Compile check
337
+ python3 -m py_compile /home/mrdbo/projects/courtBundleGenerator3/generate_bundles_final_corrected.py
338
+ echo "exit_code=$?"
339
+
340
+ cat jira_adapter.py
341
+ # Output observed when the file is absent (not a command):
342
+ # cat: jira_adapter.py: No such file or directory
343
+
344
+
345
+ rg -n "missing 2 required positional arguments|UnboundLocalError|ERR_CLOSED_WRITER|close\(\) was called|TypeError: expected str, bytes|SKIP TOC ROW" /tmp/gb3_run.log
346
+
347
+ ```
348
+
349
+ ---
350
+
351
+ ## VERIFICATION LOOP (AFTER EVERY CHANGE)
352
+
353
+ ### Mandatory Verification Steps
354
+
355
+ 1. Run the appropriate canonical command (see above)
356
+ 2. Check for Chain-of-Verification logs:
357
+ - `AntiHallucinationManager.__init__`
358
+ - `EnhancedFuzzyResolver.resolve_evidence_paths`
359
+ - `UnifiedEvidenceBridge.get_unified_evidence`
360
+ - **Missing logs = broken integration chain → STOP and FIX**
361
+ 3. Verify PDF generation:
362
+
363
+ ```bash
364
+ ls -lh /home/mrdbo/court_data/2nd_CourtBundleOutput/court_bundle*.pdf  # P3; for P2 use: ls -lh /home/mrdbo/court_data/CourtBundleOutput/BUNDLE*.pdf
365
+ ```
366
+
367
+ 4. Check missing evidence report:
368
+
369
+ ```bash
370
+ cat /home/mrdbo/court_data/CourtBundleOutput/missing_evidence_summary.json
371
+ ```
372
+
373
+ ### Completion Gate (Run Proof Required)
374
+
375
+ **ABSOLUTE COMPLETION GATE** — You must not claim completion unless ALL are true:
376
+
377
+ - Bundles A–I (or chosen set) generated in the chosen output directory
378
+ - PDFs are non-empty
379
+ - Page-level verification run (see **AGENT AUDIT & VERIFICATION COMMANDS** below) confirms embedding completeness; 0 missing or fully enumerated with reason codes
380
+ - Missing evidence summary is empty OR missing is fully enumerated with reason codes
381
+ - TOC sync issues == 0; DB numbers present in TOC + evidence pages
382
+ - No raw paths on PDF pages (embedding failure); continue verification loop until 0 missing. Empirical analysis only; no guessing or assumptions.
383
+ - User confirms PDFs are correct when applicable
384
+
385
+ A run is **NOT accepted** unless you output:
386
+
387
+ - The exact command executed
388
+ - At least one generated PDF path with size + full ISO timestamp
389
+ - `missing_evidence_summary.json` status (empty array or specific list)
390
+ - Confirmation that previously missing files (e.g., from `Repairs` folder) are now embedded
391
+
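The run-proof items above ask for a PDF path with size and a full ISO timestamp. A minimal sketch of collecting that proof from the filesystem, assuming UTC timestamps are acceptable (the function name and dict keys are illustrative):

```python
import os
from datetime import datetime, timezone
from typing import Dict, Optional, Union


def file_proof(path: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return path, size, and ISO-8601 mtime for an existing file, else None."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None  # no proof available: do not claim the file exists
    mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat()
    return {"path": path, "size_bytes": st.st_size, "mtime_iso": mtime}
```

A `None` result means the claim cannot be made; it is equivalent to `ls -lh` reporting that the file does not exist.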
392
+ **Note:** `./output` is a symlink to `/home/mrdbo/court_data/CourtBundleOutput`, and a second symlink points to `/home/mrdbo/court_data/2nd_CourtBundleOutput`, but always use the **full path** in commands.
393
+
394
+ ---
395
+
396
+ ## IN-SITU PATCH FORMAT (REQUIRED FOR ALL CODE CHANGES)
397
+
398
+ When providing code changes, **ALWAYS** use this exact format:
399
+
400
+ ```
401
+ FILE: /absolute/path/to/script.py
402
+ LOCATION: Inside ClassName.method_name() at line ~XX
403
+
404
+ --- CODE ABOVE (3-5 lines context) ---
405
+ def method_name(self, param):
406
+ existing_variable = some_value
407
+ current_logic_here()
408
+
409
+ --- CHANGES ---
410
+ [ ] DELETE: current_logic_here()
411
+
412
+ [+] INSERT AFTER "existing_variable = some_value":
413
+ new_logic_here()
414
+ proper_implementation()
415
+
416
+ [>] OVERWRITE (if replacing lines):
417
+ OLD: current_logic_here()
418
+ NEW: new_logic_here()
419
+
420
+ --- CODE BELOW (3-5 lines context) ---
421
+ return final_result
422
+
423
+ --- VERIFICATION ---
424
+ Run: python3 -c "from script_name import ClassName; ClassName().method_name('test')"
425
+ Expected: [Specific expected output or "no errors"]
426
+ ```
427
+
428
+ ### Why This Format?
429
+
430
+ - **Unambiguous location** - exact file path and context lines
431
+ - **Clear changes** - DELETE/INSERT/OVERWRITE are explicit
432
+ - **Verifiable** - includes test command with expected output
433
+ - **No guessing** - human knows exactly where to apply changes
434
+
435
+ ---
436
+
437
+ ## ASKING FOR MISSING INFORMATION (FORMALIZED PROTOCOL)
438
+
439
+ When you need information to proceed, use this **exact format**:
440
+
441
+ ```
442
+ BLOCKED: [Specific blocker - be precise]
443
+
444
+ REQUIRED INFORMATION:
445
+ 1. [Exact command to run]
446
+ Example: Run: find /home/mrdbo/court_data -name "*resolution*" -type f
447
+
448
+ 2. [Exact file/section to show]
449
+ Example: Show: Lines 50-70 of enhanced_bundler_wrapper.patched.py
450
+
451
+ 3. [Exact error message to paste]
452
+ Example: Paste: Full traceback from last run of generate_bundles_final_corrected.py
453
+
454
+ CANNOT PROCEED UNTIL: [Specific data needed]
455
+ Example: "Confirming FileResolutionBridge exists and its import path"
456
+ ```
457
+
458
+ ### What NOT to do
459
+
460
+ - ❌ "Can you check if the file exists?"
461
+ - ❌ "I think there might be an issue..."
462
+ - ❌ "Please verify the paths"
463
+
464
+ ### What TO do
465
+
466
+ - ✅ "Run: ls -lh /home/mrdbo/projects/courtBundleGenerator2/file_resolution_bridge.py"
467
+ - ✅ "Show: Lines containing 'class FileResolutionBridge' in file_resolution_bridge.py"
468
+ - ✅ "Paste: Output of python3 -c 'import file_resolution_bridge; print(dir(file_resolution_bridge))'"
469
+
470
+ ---
471
+
472
+ ## DISCOVERY & FILE RESOLUTION
473
+
474
+ **See also:** COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK. Do not add compliance or validation that restricts discovery, embedding, or bundle output.
475
+
476
+ ### Scope (Unrestricted — no allow-list)
477
+
478
+ - **Root:** `/home/mrdbo/projects/courtBundleGenerator2/evidence` (Project 2) or `/home/mrdbo/projects/courtBundleGenerator3/evidence` (Project 3).
479
+ - **Policy:** Full RECURSIVE SCAN of the evidence root (`os.walk` or `Path().rglob()`). **All** subdirectories under the root must be discoverable. If a file exists under the evidence root, it must be findable.
480
+ - **PROHIBITED:** Do **NOT** restrict discovery to a fixed list of directories. Do **NOT** implement an allow-list or whitelist that excludes other evidence subdirectories. Do **NOT** add code that limits search to "only" certain folders — this has repeatedly caused blind spots (e.g. Repairs was blocked). Discovery for **finding** files must cover the **entire** evidence tree. (Other docs may refer to where to **write** or stage new evidence; that is separate. **Search/find** must never be restricted to a subset.)
481
+
482
+ ### Blind-spot check (must not be excluded)
483
+
484
+ These locations have historically been missed when agents restricted search to a list; they are **examples of what must not be excluded**, not a list to restrict to:
485
+
486
+ - `/evidence/Repairs` (often wrongly excluded — MUST be findable)
487
+ - `/evidence/InputDocs`, `/evidence/new_evidence_staging`, `/evidence/00_CRITICAL_SCANNED`, `/evidence/00_CRITICAL_INTAKE`, `/legal_emails`, `/docs` — and **any other subdirectory under the evidence root**.
488
+
489
+ If a script reports "File not found" for a file that **exists on disk** under the evidence root (e.g. under `/evidence/Repairs`), the discovery logic is **BROKEN** — usually because it was restricted to a subset of directories. Fix by ensuring the **whole** evidence root is scanned, not by adding one more directory to a list.
490
+
491
+ **Fix immediately:**
492
+
493
+ 1. Check `config/path_config.py::get_authoritative_discovery_roots()` — it must return **all** evidence subdirectories (or the root only so recursive scan finds everything).
494
+ 2. Do **not** reduce the set to a "required" or "approved" subset.
495
+ 3. Run audit: `python3 tools/audit_runtime_blockers.py`
496
+
497
+ ---
498
+
499
+ ## COMPLETION GATES (PROOF OF SUCCESS)
500
+
501
+ **ABSOLUTE COMPLETION GATE** — Same as "Completion Gate (Run Proof Required)" above. Run the appropriate page-level verifier from **AGENT AUDIT & VERIFICATION COMMANDS** (P3: `pdf_page_verifier_enhanced.py`; P2: `embedding_utils/pdf_page_verifier.py`). Empirical analysis only; continue until 0 missing.
502
+
503
+ A task is **NOT COMPLETE** unless ALL of the following are true:
504
+
505
+ 1. **Zero Missing Files:**
506
+ - `missing_evidence_summary.json` contains `[]` (empty array)
507
+ - OR `missing_count: 0` appears in logs
508
+ - **False positive check:** If any files from `Repairs/` or other known directories are still missing, the task FAILED
509
+
510
+ 2. **PDF Generation:**
511
+ - Non-empty PDF files exist in output directory
512
+ - Run: `ls -lh /home/mrdbo/court_data/CourtBundleOutput/*.pdf`
513
+ - Run: `ls -lh /home/mrdbo/court_data/2nd_CourtBundleOutput/*.pdf`
514
+ - Verify file sizes > 0 bytes
515
+
516
+ 3. **Specific Proof (for previously blind files):**
517
+ - Confirm files like `Faulty_Fire_alarm_control_system...jpg` are embedded
518
+ - Check PDF page count matches expected evidence count
519
+ - Verify TOC includes all expected sections
520
+
521
+ 4. **No False Positives:**
522
+ - Reporting "Success ✅" while `missing_files > 0` is a **CRITICAL FAILURE**
523
+ - Agent must re-run verification and fix before claiming success
524
+
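Gates 1 and 2 can be checked after a run with a short script. This is a sketch, assuming `missing_evidence_summary.json` is a top-level JSON array as described in gate 1; the function name is hypothetical and the check is post-hoc only (it inspects outputs and returns a list of failures).

```python
import glob
import json
import os
from typing import List


def completion_gate_failures(output_dir: str) -> List[str]:
    """Return reasons the output directory fails gates 1 and 2 (empty list = pass)."""
    failures: List[str] = []

    # Gate 2: at least one PDF, and none of them empty.
    pdfs = sorted(glob.glob(os.path.join(output_dir, "*.pdf")))
    if not pdfs:
        failures.append("no PDF files in output directory")
    failures.extend(f"empty PDF: {p}" for p in pdfs if os.path.getsize(p) == 0)

    # Gate 1: the missing-evidence summary must exist and be an empty array.
    summary_path = os.path.join(output_dir, "missing_evidence_summary.json")
    try:
        with open(summary_path, encoding="utf-8") as fh:
            missing = json.load(fh)
    except FileNotFoundError:
        failures.append("missing_evidence_summary.json not found")
    except json.JSONDecodeError as exc:
        failures.append(f"missing_evidence_summary.json is not valid JSON: {exc}")
    else:
        if missing != []:
            failures.append(f"missing evidence reported: {missing!r}")
    return failures
```

Gates 3 and 4 (specific previously blind files, TOC sections) still need the page verifier and a manual check.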
525
+ ---
526
+
527
+ ## REQUIRED STEPS FOR EVERY CHANGE
528
+
529
+ 1. **Update Documentation:**
530
+ - Add entry to relevant `memory-bank/*` file
531
+ - Include `Current State (YYYY-MM-DD)` section
532
+ - Document what changed and why
533
+
534
+ 2. **Make Code Edit:**
535
+ - Use IN-SITU PATCH FORMAT
536
+ - Include inline FIX notes (e.g., `# FIX: create_proper_embedded_bundle.py:2882`)
537
+ - Reference file paths and line numbers
538
+
539
+ 3. **Run Verification Command:**
540
+ - Use appropriate canonical command
541
+ - Archive stdout/stderr alongside the PDF artifact path in your notes or session summary (traceability)
542
+ - Save PDF artifact path
543
+
544
+ 4. **Confirm Output:**
545
+ - Run: `ls -lh /home/mrdbo/court_data/CourtBundleOutput/court_bundle*.pdf`
546
+ - Attach snippet to report
547
+ - Verify file sizes and timestamps
548
+
549
+ 5. **Summarize Changes:**
550
+ - What changed
551
+ - Which files were touched
552
+ - Which Chain-of-Verification checkpoints fired
553
+ - Any new issues discovered
554
+
555
+ ---
556
+
557
+ ## PENALTY SYSTEM
558
+
559
+ ### Violations & Consequences
560
+
561
+ | Violation | Consequence | Recovery |
562
+ |-----------|-------------|----------|
563
+ | **Broken Code Provided** | Session ends immediately, all output invalidated | Start fresh session, provide working code |
564
+ | **Placeholder Code** | Task rejected, must re-plan | Implement full logic or request help |
565
+ | **Hallucinated Files/Success** | Confidence score → 0%, all claims invalidated | Re-verify everything with `ls` commands |
566
+ | **Skipped Verification** | All subsequent output marked UNVERIFIED | Run full verification loop, provide proof |
567
+ | **Assumed File Exists** | Must re-verify with explicit commands | Show actual file contents or command output |
568
+ | **Bypassing Rules** | Session paused, requires explicit re-authorization | Acknowledge violation, commit to rules |
569
+ | **Re-enabling compliance/validation that blocks pipeline** | Session paused; change reverted | Compliance bypass is intentional; do not add checks that block discovery, embedding, or bundle output |
570
+
571
+ ### Escalation
572
+
573
+ - **First violation:** Warning + correction required
574
+ - **Second violation:** Session reset, start from verification
575
+ - **Third violation:** Task marked FAILED, human intervention required
576
+
577
+ ---
578
+
579
+ ## EXHIBIT & DB REFERENCE SYNC (COURT FORMAT)
580
+
581
+ - **Rule:** A bare DB number (no filename and no bundle letter + number) is **not adequate** for any document receiving amendments, edits, or insertions. All such documents must use **Exhibit [Letter][Number] (DB-[N]) — [Filename]**.
582
+ - **Exhibit number** = [Bundle letter][Sequential] (e.g. A15, G7). Set only in the bundler; never overwritten by the DB registry.
583
+ - **DB reference** = DB-[N] (e.g. DB-125). Set in `lib/db_registry.py`; never used as the exhibit number.
584
+ - **lib/db_registry.py:** Writes only `dbReference` / `DB Reference` / `DB_Reference`. Does **not** set `exhibitNo` or `Exhibit No.` to a DB value. Provides `sync_exhibit_db_references()` to fill DB refs without touching exhibit numbers.
585
+ - **generate_bundles_final_corrected.py (P3):** Uses `bundle_exhibit_no` for `exhibitNo` / `Exhibit No.` and `db_ref` from registry for `dbReference`. TOC row update sets only `dbReference`, not `Exhibit No.`.
586
+ - **Authoritative DB list:** `legal_emails/Phase8/DB_Evidence_List.txt` (DB1–DB170). Bundle letter assignment from `BUNDLE_GROUPS_WITH_FULL_EVIDENCE_FILE_NAMES.md`.
587
+ - **Legal documents (witness statement, N244, SRA):** Reference evidence as **Exhibit [Letter][Number] (DB-[N]) — [Filename]**. Do not use bare "DB-[●]". See `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `PROMPTS/LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md`. The Cursor rule lives in `.cursorrules`; Moltbot/Qwen clients should send the instruction as a system message, fetched from the Engine via `GET /prompts/legal-exhibit-instruction` when in legal-document mode.
588
+ - **Status (amendments / edit sources):** See **`PROMPTS/STATUS_EXHIBIT_AND_EDIT_SOURCES.md`** — what is in sync, what edit sources (AI Advisor, Moltbot, Qwen) must do to output the full format for flag updates/amendments/inserts.
589
+
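The reference format above can be produced and validated with a small helper. This is a sketch under stated assumptions, not code from `lib/db_registry.py` or the bundler: the function names are hypothetical, and the pattern assumes a single uppercase bundle letter, which matches the examples (A15, G7) but is not confirmed for every bundle.

```python
import re

# Matches e.g. "Exhibit G7 (DB-125) <em dash> some_file.pdf".
# The separator is an em dash (U+2014), written as an escape to avoid ambiguity.
EXHIBIT_REF = re.compile(
    r"^Exhibit (?P<letter>[A-Z])(?P<seq>\d+) \(DB-(?P<db>\d+)\) \u2014 (?P<filename>\S.*)$"
)


def format_exhibit_ref(bundle_letter: str, sequence: int, db_number: int, filename: str) -> str:
    """Build the full court-format reference from its four parts."""
    return f"Exhibit {bundle_letter}{sequence} (DB-{db_number}) \u2014 {filename}"


def is_full_exhibit_ref(text: str) -> bool:
    """Reject bare DB numbers and 'Exhibit DB-N' style references."""
    return EXHIBIT_REF.match(text) is not None
```

Keeping the exhibit number and the DB number as separate arguments reflects the rule that the two values are never swapped.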
590
+ ---
591
+
592
+ ## CURRENT PROJECT STATE (2026-02-13)
593
+
594
+ ### Recent Changes
595
+
596
+ - **2026-02-13:** Exhibit/DB sync: db_registry no longer overwrites exhibitNo; bundler uses bundle_exhibit_no for exhibit, db_ref for DB only; PROMPTS for legal document referencing (LEGAL_WRITING_EXHIBIT_INSTRUCTION, EXHIBIT_REFERENCING_FOR_LEGAL_DOCS, HOW_TO_MAKE_AGENTS_AWARE); .cursorrules legal exhibit rule; moltbot-hybrid-engine `GET /prompts/legal-exhibit-instruction`.
597
+ - **2026-02-13:** Cross-project impact audit: `tools/cross_project_impact_audit.py` supports `--entry project:path` for runtime-chain focus (BFS from entrypoints across projects).
598
+ - **2026-02-06:** Desktop space converted from CLI to FastAPI web server (`app.py`), Dockerfile v2.0
599
+ - **2026-02-06:** DBRegistry import guarded with double-fallback + `HAS_DB_REGISTRY` in Desktop
600
+ - **2026-02-06:** Hybrid Engine space deployed with Dockerfile v4.0 (Dev Mode compatible); added `prompts/legal_exhibit_instruction.txt` and GET `/prompts/legal-exhibit-instruction`
601
+ - **2026-02-06:** `start.sh` v3.2 for Hybrid Engine — installs Ollama at runtime, pulls Qwen 2.5 in background
602
+ - **2026-02-06:** Automated sync system created: `sync_to_desktop.sh` + `install_sync_hook.sh`
603
+ - **2026-02-06:** `link_validator.py` recovered from git in P3
604
+ - **2026-01-22:** Unrestricted discovery enabled in `config/path_config.py`
605
+ - **2026-01-15:** PDF verification telemetry added to `embedding_utils/telemetry.py`
606
+ - **2026-01-14:** Effective limit computation fixed in `enhanced_bundler_wrapper.patched.py`
607
+
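The "double-fallback + `HAS_DB_REGISTRY`" guard mentioned in the 2026-02-06 entry is not shown in this file. A guard of that kind might look roughly like the sketch below; the second import location (`db_registry` as a top-level module) and the zero-argument `DBRegistry()` constructor are assumptions made for illustration.

```python
# Assumption: the registry class is importable as lib.db_registry.DBRegistry.
try:
    from lib.db_registry import DBRegistry
    HAS_DB_REGISTRY = True
except ImportError:
    try:
        # Hypothetical second location, e.g. when lib/ is already on sys.path.
        from db_registry import DBRegistry
        HAS_DB_REGISTRY = True
    except ImportError:
        DBRegistry = None
        HAS_DB_REGISTRY = False


def get_registry():
    """Return a registry instance, or None when the module is unavailable."""
    return DBRegistry() if HAS_DB_REGISTRY else None
```

Callers then branch on `HAS_DB_REGISTRY` instead of failing at import time when the registry is absent.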
608
+ ### Active Issues
609
+
610
+ - [ ] Finish lazy-import plan for `DualCategoryEvidenceProcessor` to prevent circular imports
611
+ - [ ] Ensure `EnhancedEmbeddingFeatures` uses prompt system singleton consistently
612
+ - [ ] Verify all files in `/evidence/Repairs` are discoverable
613
+ - [ ] Configure Jira integration (currently missing JIRA_URL, JIRA_EMAIL, JIRA_TOKEN env vars)
614
+ - [ ] Verify Hybrid Engine Ollama/Qwen model pull completes on HF free tier
615
+
616
+ ### Key Files
617
+
618
+ - `enhanced_bundler_wrapper.patched.py` - Main wrapper (Project 2)
619
+ - `create_proper_embedded_bundle.py` - Direct bundler
620
+ - `generate_bundles_final_corrected.py` - Main generator (Project 3)
621
+ - `lib/db_registry.py` - DB assignment; seeds from `legal_emails/Phase8/DB_Evidence_List.txt`; does not overwrite exhibitNo
622
+ - `cohesive_unified_evidence_processor.py` - Evidence processing
623
+ - `category_mapping.py` - Category classification
624
+ - `dual_category_evidence_processor.py` - Dual categorization
625
+ - `embedding_utils/prompt_system_integration.py` - Prompt system
626
+ - `config/path_config.py` - Discovery roots configuration
627
+ - `file_resolution_bridge.py` - File resolution with caching
628
+ - **PROMPTS:** `PROMPT_HEADER_13_12_25.md`, `EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `HOW_TO_MAKE_AGENTS_AWARE.md`
629
+ - **Audit:** `tools/cross_project_impact_audit.py` (optional `--entry` for runtime chain), `tools/audit_active_bundling_files.py` (live chain → gb3_deps.json)
630
+
631
+ ---
632
+
633
+ ## IMMEDIATE OBJECTIVES
634
+
635
+ 1. **Complete Discovery Verification:**
636
+ - Confirm all `/evidence/Repairs` files are found
637
+ - Run: `python3 tools/audit_discovery_coverage.py`
638
+ - Fix any remaining blind spots
639
+
640
+ 2. **Eliminate Circular Imports:**
641
+ - Finish lazy-import for `DualCategoryEvidenceProcessor`
642
+ - Test: `python3 -c "import category_mapping; print('OK')"`
643
+
644
+ 3. **Validate Compliance:**
645
+ - Run full integration test with all flags
646
+ - Verify zero missing evidence
647
+ - Confirm court compliance (TOC, pagination, exhibit numbers)
648
+
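For objective 2, one common way to break an import cycle is to defer the import until first use. The sketch below shows that general pattern only; it is not the project's lazy-import plan, the helper names are hypothetical, and the pairing of module `dual_category_evidence_processor` with class `DualCategoryEvidenceProcessor` is inferred from the file list rather than verified.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def lazy_attr(module_name: str, attr_name: str):
    """Import `module_name` on first call and return one attribute from it."""
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def get_dual_processor_class():
    # Assumption: module and class names as listed under "Key Files".
    # The import runs here, at call time, not when this module is loaded.
    return lazy_attr("dual_category_evidence_processor", "DualCategoryEvidenceProcessor")
```

Because nothing is imported at module load, `import category_mapping` can succeed even if the processor module imports `category_mapping` in turn.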
649
+ ---
650
+
651
+ ## REFERENCE ARCHITECTURE
652
+
653
+ ### Chain of Verification Components
654
+
655
+ ```
656
+ User Command
657
+
658
+ enhanced_bundler_wrapper.patched.py (argparse + flags)
659
+
660
+ AntiHallucinationManager.__init__ (initialize protocols)
661
+
662
+ config/path_config.py::get_authoritative_discovery_roots() (get search paths)
663
+
664
+ EnhancedFuzzyResolver.resolve_evidence_paths() (find files)
665
+
666
+ UnifiedEvidenceBridge.get_unified_evidence() (consolidate evidence)
667
+
668
+ create_proper_embedded_bundle.py (generate PDF)
669
+
670
+ Verification: ls -lh <output_path>
671
+
672
+ Verification: cat missing_evidence_summary.json
673
+ ```
674
+
675
+ ### Critical Integration Points
676
+
677
+ 1. **Path Configuration** → `config/path_config.py`
678
+ 2. **File Resolution** → `file_resolution_bridge.py` + `enhanced_fuzzy_filename_resolver.py`
679
+ 3. **Evidence Consolidation** → `unified_evidence_bridge.py`
680
+ 4. **Categorization** → `category_mapping.py` + `dual_category_evidence_processor.py`
681
+ 5. **Bundle Generation** → `create_proper_embedded_bundle.py`
682
+ 6. **Compliance Validation** → `config/bundle_compliance.json`
683
+
684
+ ---
685
+
686
+ ## NOTES FOR AGENT BUILDERS
687
+
688
+ ### When Using This File for Gemini/Other Agents
689
+
690
+ 1. **Agent Instructions (Main):** Use sections 1-3 (CRITICAL DIRECTIVES, ANTI-HALLUCINATION, ROBUST CODE STANDARDS) **and** COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK (compliance bypass is intentional; do not re-enable blocking).
691
+
692
+ 2. **Session Prompt:** Use sections 4-6 (ENFORCEMENT, ENTRY POINTS, VERIFICATION)
693
+
694
+ 3. **Task Execution:** Use sections 7-9 (PATCH FORMAT, ASKING PROTOCOL, DISCOVERY)
695
+
696
+ 4. **Knowledge Base:** Use sections 10-13 (COMPLETION GATES, STEPS, PENALTIES, CURRENT STATE)
697
+
698
+ ### Testing Protocol
699
+
700
+ ```bash
701
+ # Test 1: Verify discovery coverage
702
+ python3 tools/audit_discovery_coverage.py
703
+
704
+ # Test 2: Test file resolution
705
+ python3 -c "from file_resolution_bridge import FileResolutionBridge; print(FileResolutionBridge().find_file('test.pdf'))"
706
+
707
+ # Test 3: Run with minimal flags
708
+ python3 enhanced_bundler_wrapper.patched.py --output-dir /home/mrdbo/court_data/CourtBundleOutput --limit 1
709
+
710
+ # Test 4: Verify output
711
+ ls -lh /home/mrdbo/court_data/CourtBundleOutput/*.pdf
712
+ cat /home/mrdbo/court_data/CourtBundleOutput/missing_evidence_summary.json
713
+ ```
714
+
715
+ ---
716
+
717
+ ## AGENT AUDIT & VERIFICATION COMMANDS (ABSOLUTE TASK COMPLETION)
718
+
719
+ **Canonical reference:** `/home/mrdbo/projects/courtBundleGenerator2/AUDITING_COMMANDS_23_1_26.md` — agents MUST use these for verification loops, audits, and diagnostics before claiming task completion.
720
+
721
+ ### Page-level verification (mandatory before claiming bundle success)
722
+
723
+ - **Project 3 (generate_bundles_final_corrected.py):**
724
+
725
+ ```bash
726
+ cd /home/mrdbo/projects/courtBundleGenerator3 && python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput
727
+ ```
728
+
729
+ - **Project 2 (enhanced_bundler_wrapper / create_proper_embedded_bundle):**
730
+
731
+ ```bash
732
+ cd /home/mrdbo/projects/courtBundleGenerator2 && python3 embedding_utils/pdf_page_verifier.py /home/mrdbo/court_data/CourtBundleOutput
733
+ ```
734
+
735
+ ### Prevention & diagnostics
736
+
737
+ - **Audit prevention measures in codebase + optional bundle verify:**
738
+
739
+ ```bash
740
+ cd /home/mrdbo/projects/courtBundleGenerator3 && python3 tools/audit_bundle_prevention.py
741
+ cd /home/mrdbo/projects/courtBundleGenerator3 && python3 tools/audit_bundle_prevention.py --verify-bundles /home/mrdbo/court_data/2nd_CourtBundleOutput
742
+ ```
743
+
744
+ - **Test cloud download URLs:**
745
+
746
+ ```bash
747
+ cd /home/mrdbo/projects/courtBundleGenerator3 && python3 tools/test_download_links.py
748
+ ```
749
+
750
+ - **Runtime chain / dependency audit:**
751
+
752
+ ```bash
753
+ cd /home/mrdbo/projects/courtBundleGenerator2 && python3 tools/audit_runtime_chain.py --root . --out code_analysis/Dec25/audit_runtime_report.json
754
+ cd /home/mrdbo/projects/courtBundleGenerator2 && python3 tools/audit_active_bundling_files.py --root . --entry /home/mrdbo/projects/courtBundleGenerator3/generate_bundles_final_corrected.py --out code_analysis/gb3_deps.json
755
+ ```
756
+
757
+ - **Runtime blockers (before touching discovery):**
758
+
759
+ ```bash
760
+ cd /home/mrdbo/projects/courtBundleGenerator2 && python3 tools/audit_runtime_blockers.py
761
+ ```
762
+
763
+ ### Completion gate (all must pass)
764
+
765
+ - Bundles generated in chosen output dir; PDFs non-empty.
766
+ - Page-level verification run (pdf_page_verifier_enhanced or embedding_utils/pdf_page_verifier) with 0 missing, or with every missing item enumerated.
767
+ - missing_evidence_summary.json empty or with reason codes.
768
+ - TOC sync issues == 0; DB numbers present in TOC and evidence pages.
769
+ - No raw file paths printed on PDF pages (a raw path indicates an embedding failure); repeat the verification loop until 0 are missing.
770
+
771
+ ---
772
+
773
+ **END OF CRITICAL INSTRUCTIONS v6.0**
774
+
775
+ *Last verified: 2026-02-13*
776
+ *Next review: When major architectural changes occur or after 10 successful bundle generations*
777
+
778
+ # ------------------------------------------------------------------
779
+
780
+ # COMPREHENSIVE SAFETY & FORMATTING PROTOCOLS (MANDATORY)
781
+
782
+ # ------------------------------------------------------------------
783
+
784
+ ## 9. MANDATORY CODE EDITING PROTOCOL (THE 4-POINT ANCHOR)
785
+
786
+ **CRITICAL:** To prevent "NameErrors" and context loss, "naked" code blocks are PROHIBITED.
787
+ You must use this exact format for EVERY code change:
788
+
789
+ 1. **FILE & CONTEXT HEADER:** `# File: /absolute/path/to/file.py`
790
+ `# Context: Class [Name], Function [Name], Line Approx [X]`
791
+
792
+ 2. **ANCHOR (Pre-Verification):**
793
+
794
+ ```python
795
+ TEXT ABOVE (Unchanged - Minimum 3 lines):
796
+ [Paste exact existing code here to prove you know the location]
797
+ ```
798
+
799
+ 3. **DELETION (Explicit Warning):**
800
+
801
+ ```python
802
+ ❌ DELETING / OVERWRITING:
803
+ [Paste the exact lines being removed. If nothing, write "NO DELETION"]
804
+ ```
805
+
806
+ 4. **INSERTION (The Change):**
807
+
808
+ ```python
809
+ ✅ INSERTING:
810
+ [The new code]
811
+ ```
812
+
813
+ 5. **ANCHOR (Post-Verification):**
814
+
815
+ ```python
816
+ TEXT BELOW (Unchanged - Minimum 3 lines):
817
+ [Paste exact existing code here to confirm safe exit]
818
+ ```
819
+
820
+ ## 10. INFRASTRUCTURE & GIT SAFETY (ZERO DATA LOSS)
821
+
822
+ **DEFINITION OF RISK:** Risk includes overwriting remote files, deleting local files, force-pushing, or changing environment binaries.
823
+
824
+ 1. **GIT VERIFICATION:** Before ANY `git push`, you MUST run and display:
825
+ - `git status` (Detect deletions - STOP if any "deleted:" lines appear)
826
+ - `git diff --stat` (Detect mass changes)
827
+ - `git remote -v` (Verify target)
828
+
829
+ 2. **REMOTE EQUALITY:** Local files are NOT the only truth. You must check if Remote has files that Local is missing before syncing.
830
+
831
+ 3. **BINARY EXCLUSION:** You MUST verify `.gitignore` and `.dockerignore` contain `ollama`, `venv`, `__pycache__` before pushing to Cloud.
832
+
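The pre-push checks above can be partly automated. The sketch below is a hedged illustration with hypothetical helper names: it parses `git status --porcelain` output for deleted paths and scans ignore-file text for the three required names. The substring match on ignore patterns is a loose heuristic, and the agent must still run and display the three git commands listed above.

```python
import subprocess

REQUIRED_IGNORES = ("ollama", "venv", "__pycache__")


def deleted_paths(porcelain_output: str) -> list[str]:
    """Return paths marked deleted in `git status --porcelain` output.

    Each line is two status characters, a space, then the path.
    """
    deleted = []
    for line in porcelain_output.splitlines():
        status, path = line[:2], line[3:]
        if "D" in status:
            deleted.append(path)
    return deleted


def missing_ignore_entries(ignore_text: str, required=REQUIRED_IGNORES) -> list[str]:
    """Return required names that no non-comment line of an ignore file mentions."""
    lines = [ln.strip() for ln in ignore_text.splitlines()]
    patterns = [ln for ln in lines if ln and not ln.startswith("#")]
    return [name for name in required if not any(name in p for p in patterns)]


def safe_to_push(repo_dir: str = ".") -> bool:
    """True when the working tree reports no deletions (STOP condition otherwise)."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return not deleted_paths(status)
```

`missing_ignore_entries` would be applied to the contents of both `.gitignore` and `.dockerignore`; any non-empty result means the binary-exclusion rule is not yet satisfied.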
833
+ ## 11. ANTI-HALLUCINATION / EMPIRICAL FIRST
834
+
835
+ **RULE:** You may not answer "I think", "It should be", or "Most likely".
836
+
837
+ 1. **PRE-COMPUTATION:** You must run a CLI command (ls, cat, grep, git status) to verify a fact BEFORE stating it.
838
+ 2. **ADMISSION OF IGNORANCE:** If you cannot verify a file/state via CLI, you must state "I cannot verify X" and ask for the user's help.
839
+
840
+ # ------------------------------------------------------------------
841
+
842
+ # UPDATED SECURITY PROTOCOLS (MANDATORY ENFORCEMENT)
843
+
844
+ # ------------------------------------------------------------------
845
+
846
+ ## 9. INFRASTRUCTURE & GIT SAFETY (ZERO DATA LOSS)
847
+
848
+ **DEFINITION OF RISK:** Risk explicitly includes overwriting remote files, deleting local files, force-pushing, or changing environment binaries.
849
+
850
+ 1. **GIT VERIFICATION:** Before ANY `git push`, you MUST run and display:
851
+ - `git status` (Detect deletions)
852
+ - `git diff --stat` (Detect mass changes)
853
+ - `git remote -v` (Verify target)
854
+ 2. **BINARY EXCLUSION:** You MUST verify `.gitignore` and `.dockerignore` contain `ollama`, `venv`, `__pycache__` before pushing.
855
+ 3. **NO FORCE PUSH:** `git push --force` is STRICTLY PROHIBITED.
856
+ 4. **REMOTE EQUALITY:** Local files are NOT the only truth. You must check if Remote has files that Local is missing before syncing.
857
+
858
+ ## 10. MANDATORY CODE EDITING PROTOCOL (THE 4-POINT ANCHOR)
859
+
860
+ **CRITICAL:** Standard/Naked code blocks are PROHIBITED. You must use this format to prove you are not guessing location:
861
+
862
+ 1. **FILE HEADER:** `# File: /absolute/path/to/file.py`
863
+ 2. **ANCHOR (Pre-Verification):**
864
+
865
+ ```python
866
+ TEXT ABOVE (Unchanged - Minimum 3 lines):
867
+ [Paste exact existing code here to prove location]
868
+ ```
869
+
870
+ 3. **DELETION (Explicit Warning):**
871
+
872
+ ```python
873
+ ❌ DELETING / OVERWRITING:
874
+ [Paste the exact lines being removed. If nothing, write "NO DELETION"]
875
+ ```
876
+
877
+ 4. **INSERTION (The Change):**
878
+
879
+ ```python
880
+ ✅ INSERTING:
881
+ [The new code]
882
+ ```
883
+
884
+ 5. **ANCHOR (Post-Verification):**
885
+
886
+ ```python
887
+ TEXT BELOW (Unchanged - Minimum 3 lines):
888
+ [Paste exact existing code here to confirm safe exit]
889
+ ```
890
+
898
+ ## 12. THE "I DON'T KNOW" PROTOCOL (EPISTEMIC SECURITY)
899
+
900
+ **RULE:** You are strictly prohibited from filling gaps with assumptions, "most likely" scenarios, or inferred file paths.
901
+
902
+ 1. **THE STOP CONDITION:** If you do not have **CLI Output** (ls, cat, grep, git status) currently visible in the context that proves a fact, you **DO NOT KNOW** that fact.
903
+ 2. **THE MANDATORY RESPONSE:**
904
+ - **Incorrect:** "The file is likely in /evidence..."
905
+ - **Correct:** "❌ KNOWLEDGE GAP: I do not know the location of [file]. I cannot proceed."
906
+ 3. **THE REQUIRED ACTION (ANALYSIS FIRST):**
907
+ - Immediately stop execution.
908
+ - Generate a specific **Diagnostic Script** (Python or Bash) to discover the missing information.
909
+ - Ask the user to run it.
910
+ 4. **PROHIBITED PHRASES:**
911
+ - "Assuming that..."
912
+ - "It should be..."
913
+ - "Typically..."
914
+ - "Based on standard structure..."
915
+
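A "Diagnostic Script" in the sense of step 3 could be as small as the sketch below, which searches a set of root directories for an exact filename instead of guessing its location. The function names are hypothetical, and the roots to search would come from the user or from `config/path_config.py`; none are assumed here.

```python
import os


def find_by_name(filename: str, roots: list[str]) -> list[str]:
    """Walk each root and return every path whose basename equals `filename`."""
    hits = []
    for root in roots:
        # os.walk silently yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            if filename in filenames:
                hits.append(os.path.join(dirpath, filename))
    return sorted(hits)


def report(filename: str, roots: list[str]) -> str:
    """Format the result as text the user can paste back as CLI proof."""
    hits = find_by_name(filename, roots)
    if not hits:
        return f"NOT FOUND: {filename} (searched: {', '.join(roots)})"
    return "\n".join(hits)
```

The user would run something like `print(report("some_file.pdf", ["/path/to/root"]))` and paste the output back, which turns the knowledge gap into a verified fact.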
934
+ ## 12. THE "I DON'T KNOW" PROTOCOL (EPISTEMIC SECURITY)
935
+
936
+ **RULE:** You are strictly prohibited from filling gaps with assumptions.
937
+
938
+ 1. **THE STOP CONDITION:** If you do not have CLI Output proving a fact, you DO NOT KNOW it.
939
+ 2. **THE MANDATORY RESPONSE:** "❌ KNOWLEDGE GAP: I do not know [X]. I cannot proceed."
940
+ 3. **THE REQUIRED ACTION:** Generate a diagnostic script to find the answer empirically.
prompts/legal_exhibit_instruction.txt CHANGED
@@ -1,4 +1,6 @@
- When referencing evidence in legal documents (witness statements, N244, SRA complaint, or any court filing), use this format. Do not use bare "DB-[●]" or "Exhibit DB-[Number]" as the main identifier.
+ #/home/mrdbo/projects/moltbot-hybrid-engine/prompts/legal_exhibit_instruction.txt
+
+ When referencing evidence in legal documents (witness statements, N244, SRA complaint, or any court filing), use this format. DB numbers without filename and without bundle initial letter+number are not adequate for documents receiving amendments, edits, or insertions. Do not use bare "DB-[●]" or "Exhibit DB-[Number]" as the main identifier.
 
  Required format: Exhibit [Bundle letter][Number] (DB-[N]) — [Filename]
  Example: Exhibit G7 (DB-125) — 16_12_25_Lamberth_Email_Complaint_Response_Rent_account_UFN40981138.pdf