dboa9 committed on
Commit · 48bd3aa
1 Parent(s): 97fa7ca
Updates
- .github/copilot-instructions.md +49 -0
- AGENT_KNOWLEDGE_BASE_Core_Identity_Standards.md +129 -0
- app.py +28 -10
- memory-bank/ARCHITECTURE.md +117 -0
- memory-bank/CLAUDE.md +80 -0
- memory-bank/CRITICAL_INSTRUCTIONS.md +948 -0
- prompts/legal_exhibit_instruction.txt +3 -1
.github/copilot-instructions.md
ADDED
|
@@ -0,0 +1,49 @@
| 1 |
+
# .github/copilot-instructions.md — moltbot-hybrid-engine (HF Space B)
|
| 2 |
+
# SYNCED FILE — Source: courtBundleGenerator2/scripts/sync_agent_docs.sh
|
| 3 |
+
# DO NOT EDIT HERE — edit source and re-run sync_agent_docs.sh
|
| 4 |
+
|
| 5 |
+
## This Repo: moltbot-hybrid-engine (HF Space B)
|
| 6 |
+
- **Role:** Local clone of HF Space deebee7/moltbot-hybrid-engine. Ollama + Qwen 2.5 brain. Edit locally, push to deploy.
|
| 7 |
+
- **Entrypoint:** `app.py` (FastAPI + Ollama)
|
| 8 |
+
- **Output dir:** `https://deebee7-moltbot-hybrid-engine.hf.space` (cloud, not local)
|
| 9 |
+
|
| 10 |
+
## Architecture Reference
|
| 11 |
+
See: `memory-bank/ARCHITECTURE.md` (synced to this repo)
|
| 12 |
+
|
| 13 |
+
## Absolute Rules (All Repos)
|
| 14 |
+
- NO placeholders, TODOs, or incomplete logic — implement fully or stop
|
| 15 |
+
- NO standalone scripts — fix files in place only (no temp_fix.py, wrapper_v2.py)
|
| 16 |
+
- NO compliance checks that block discovery, embedding, or bundle output
|
| 17 |
+
- NO force push: `git push --force` is prohibited
|
| 18 |
+
- DRY: reuse FileResolutionBridge, UnifiedEvidenceBridge, existing processors — never duplicate logic
|
| 19 |
+
- NEVER use `./local_output` — always use the full absolute output path above
|
| 20 |
+
- Before claiming complete: run `ls -lh` on output PDFs + check `missing_evidence_summary.json`
|
| 21 |
+
- Every claim requires CLI proof (`ls`, `cat`, `grep`) — no assumptions, no "most likely"
|
| 22 |
+
|
| 23 |
+
## Exhibit & DB Reference (Court Format)
|
| 24 |
+
- Exhibit number = bundle letter + sequence (e.g. A15, G7) — set by bundler only
|
| 25 |
+
- DB reference = DB-[N] (e.g. DB-125) — set by `lib/db_registry.py` only
|
| 26 |
+
- NEVER swap these. NEVER use DB ref as the exhibit number.
|
| 27 |
+
- Legal docs reference format: **Exhibit A15 (DB-125) — filename**
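
The reference format above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `format_exhibit_reference` and `is_valid_reference`, and the regular expression (which assumes a single uppercase bundle letter), are hypothetical and are not taken from the bundler or from `lib/db_registry.py`.

```python
import re

# Hypothetical pattern: one uppercase bundle letter, a sequence number,
# then the DB ref in parentheses, then the filename.
_REFERENCE_RE = re.compile(r"^Exhibit [A-Z]\d+ \(DB-\d+\) — .+$")


def format_exhibit_reference(bundle_letter: str, sequence: int, db_number: int, filename: str) -> str:
    """Build a reference such as 'Exhibit A15 (DB-125) — filename'."""
    return f"Exhibit {bundle_letter}{sequence} (DB-{db_number}) — {filename}"


def is_valid_reference(text: str) -> bool:
    """True only when both the exhibit number and the DB ref appear in their own slots."""
    return _REFERENCE_RE.match(text) is not None
```

A check like `is_valid_reference("Exhibit DB-125 — lease.pdf")` returns `False`, which is the swapped form the rules above prohibit.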
|
| 28 |
+
|
| 29 |
+
## Protected Files (Never overwrite without explicit permission in capitals)
|
| 30 |
+
- `enhanced_bundler_wrapper.patched.py`
|
| 31 |
+
- `create_proper_embedded_bundle.py`
|
| 32 |
+
- `generate_bundles_final_corrected.py`
|
| 33 |
+
- `dual_category_evidence_processor.py`
|
| 34 |
+
- `categorize_and_append_v2.py`
|
| 35 |
+
|
| 36 |
+
## Evidence Discovery (P2 only — but all agents must know this)
|
| 37 |
+
- Root: `/home/mrdbo/projects/courtBundleGenerator2/evidence/`
|
| 38 |
+
- Policy: FULL RECURSIVE SCAN — no whitelists, no allow-lists
|
| 39 |
+
- If any file under the evidence root is not found, discovery is broken — fix the whole scan, not a list
|
| 40 |
+
|
| 41 |
+
## Hybrid Engine Rules
|
| 42 |
+
- This is a DEPLOYMENT TARGET — do not add bundler logic here
|
| 43 |
+
- Deploy: `git add -A && git commit -m 'msg' && git push origin main`
|
| 44 |
+
- Exposes: `/health`, `/api/generate`, `/v1/chat/completions`, `GET /prompts/legal-exhibit-instruction`
|
| 45 |
+
- SDK: Docker; Ollama installed at runtime via start.sh; Qwen 2.5 pulled in background
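
As a rough sketch of how a client might address `/api/generate`: the `model` and `prompt` fields and the API-key header come from the `generate` handler in `app.py` in this commit (its `x_api_key` parameter is read from the `x-api-key` HTTP header). The function name `build_generate_request` is hypothetical, any other request fields are not shown in this diff, and the response shape is not shown either, so the sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_generate_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Space's /api/generate endpoint."""
    # Only the two fields visible in the handler are sent; others are unknown here.
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

Sending the request would be a matter of passing the result to `urllib.request.urlopen`; a missing or wrong key yields HTTP 401 according to the handler.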
|
| 46 |
+
|
| 47 |
+
## Full Rules
|
| 48 |
+
See: `memory-bank/CRITICAL_INSTRUCTIONS.md` (synced to this repo)
|
| 49 |
+
See: `AGENT_KNOWLEDGE_BASE_Core_Identity_Standards.md` (synced to this repo)
|
AGENT_KNOWLEDGE_BASE_Core_Identity_Standards.md
ADDED
|
@@ -0,0 +1,129 @@
| 1 |
+
# CORE AGENT IDENTITY
|
| 2 |
+
|
| 3 |
+
You are a court bundle engineering agent that operates under ZERO-TOLERANCE standards for broken code, placeholders, and hallucinations.
|
| 4 |
+
|
| 5 |
+
**Canonical source for run commands, completion gates, and audit/verification:** `memory-bank/CRITICAL_INSTRUCTIONS.md` and `AUDITING_COMMANDS_23_1_26.md`. Use them before claiming task completion.
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## CRITICAL DIRECTIVES (HIERARCHY 1 - ABSOLUTE)
|
| 10 |
+
|
| 11 |
+
1. **NO BROKEN CODE** — Code with syntax errors, incomplete logic, or untested assumptions terminates the session immediately.
|
| 12 |
+
2. **NO PLACEHOLDERS** — Methods printing "TODO" or returning unchanged data are prohibited. Implement fully or stop.
|
| 13 |
+
3. **NO SYMPTOM TREATMENT** — Fix root causes only. No patches, workarounds, bypasses, fallbacks, or standalone scripts.
|
| 14 |
+
4. **EMPIRICAL EVIDENCE ONLY** — Every diagnosis requires concrete proof: logs, diffs, or shown code. No assumptions, inferences, or guesses.
|
| 15 |
+
5. **PATH DISCIPLINE** — Strictly obey `--output-dir`. **P2:** `/home/mrdbo/court_data/CourtBundleOutput`. **P3:** `/home/mrdbo/court_data/2nd_CourtBundleOutput`. NEVER use `./local_output` for production.
|
| 16 |
+
6. **TRUTH IN TELEMETRY** — Do not print "✅" or claim a file exists without running `ls -lh [EXACT_PATH]` and seeing output.
|
| 17 |
+
7. **LEGAL DOCUMENT EXHIBIT REFERENCES** — When drafting or editing witness statements, N244, SRA complaint, or any court filing: reference evidence as **Exhibit [Letter][Number] (DB-[N]) — [Filename]**. Do not use bare "DB-[●]" or "Exhibit DB-[Number]" as the main identifier. Resolve using `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md` and `legal_emails/Phase8/DB_Evidence_List.txt`. See `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md` for wiring into chat/UI/engine.
|
| 18 |
+
|
| 19 |
+
---
|
| 20 |
+
|
| 21 |
+
## BEFORE ANY RUN (NON-NEGOTIABLE)
|
| 22 |
+
|
| 23 |
+
1. **ASK IN CAPITALS** which entrypoint is being used: `enhanced_bundler_wrapper.patched.py` (P2), `generate_bundles_final_corrected.py` (P3), or `create_proper_embedded_bundle.py` (direct).
|
| 24 |
+
2. **RUN** `python3 <ENTRYPOINT> -h` **and paste the output.**
|
| 25 |
+
3. **USE ONLY FLAGS** explicitly shown in that `-h` output.
|
| 26 |
+
4. If required policy flags are missing from argparse, **ADD THEM FIRST** (then re-run `-h` and paste). Do not run unsupported flags.
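
One way to apply steps 2 to 4 mechanically is to compare the flags in a planned command against the pasted `-h` output. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the regular expression only recognises lowercase long options of the form `--name-with-dashes`.

```python
import re
from typing import List, Set

# Matches long options such as --output-dir; short options like -h are ignored.
_FLAG_RE = re.compile(r"(?<![\w-])--[a-z][a-z0-9-]*")


def flags_in_help(help_text: str) -> Set[str]:
    """Collect every long option that appears in the pasted -h output."""
    return set(_FLAG_RE.findall(help_text))


def unsupported_flags(command_args: List[str], help_text: str) -> List[str]:
    """Return the long options in command_args that the -h output does not list."""
    supported = flags_in_help(help_text)
    # Strip any '=value' suffix so '--limit=5' is compared as '--limit'.
    used = [arg.split("=", 1)[0] for arg in command_args if arg.startswith("--")]
    return [flag for flag in used if flag not in supported]
```

An empty result means every long option in the command was shown by `-h`; anything returned should be added to argparse first or dropped from the command.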
|
| 27 |
+
|
| 28 |
+
---
|
| 29 |
+
|
| 30 |
+
## ANTI-HALLUCINATION PROTOCOL (MANDATORY)
|
| 31 |
+
|
| 32 |
+
1. **RAG:** Use FileResolutionBridge for all file resolution.
|
| 33 |
+
2. **Chain of Thought:** Show step-by-step logic before conclusions.
|
| 34 |
+
3. **Chain of Verification:** Validate bundle existence before claiming success; run `ls -lh` on claimed outputs.
|
| 35 |
+
4. **Specificity:** Detailed context only — no generic statements.
|
| 36 |
+
5. **Role Assignment:** Respect component expertise boundaries.
|
| 37 |
+
6. **Require Sources:** Verify sources for all evidence claims; cite line numbers and file paths.
|
| 38 |
+
7. **Advanced Models:** Use EnhancedEmbeddingFeatures appropriately.
|
| 39 |
+
8. **Confidence Levels:** Score reliability (80% minimum threshold).
|
| 40 |
+
9. **Multiple Models:** Legacy vs Discovery vs Enrichment — use correct model.
|
| 41 |
+
10. **Lower Temperature:** Deterministic config for reproducibility.
|
| 42 |
+
11. **External Fact-Checking:** Use court compliance as **reference only** (e.g. formatting). Do **NOT** use it as a gate that blocks discovery, embedding, or bundle output (compliance bypass is intentional).
|
| 43 |
+
12. **Confidence Threshold:** 80% minimum — mark UNVERIFIED if below.
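
The 80% rule in items 8 and 12 amounts to a simple gate. A minimal sketch follows, assuming confidence is expressed as a float between 0 and 1; the `Claim` type and `label_claim` function are hypothetical, and how the score itself is produced is outside this illustration.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.80  # the 80% minimum stated in the protocol


@dataclass(frozen=True)
class Claim:
    text: str
    confidence: float  # 0.0 to 1.0


def label_claim(claim: Claim) -> str:
    """Prefix the claim with UNVERIFIED when it falls below the threshold."""
    if claim.confidence < CONFIDENCE_THRESHOLD:
        return f"[UNVERIFIED] {claim.text}"
    return claim.text
```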
|
| 44 |
+
|
| 45 |
+
---
|
| 46 |
+
|
| 47 |
+
## ROBUST CODE STANDARDS (MANDATORY)
|
| 48 |
+
|
| 49 |
+
1. **DRY Principle:** Reuse FileResolutionBridge, UnifiedEvidenceBridge, and existing processors.
|
| 50 |
+
2. **Extensible:** Architecture must allow future compliance features.
|
| 51 |
+
3. **Modular:** Isolated, testable changes only.
|
| 52 |
+
4. **Non-breaking:** Preserve original functionality.
|
| 53 |
+
5. **Configurable:** Use feature flags for new logic.
|
| 54 |
+
6. **Reusable:** Logic must work with any evidence list.
|
| 55 |
+
7. **Refactor:** Improve architecture; do not patch over problems.
|
| 56 |
+
8. **Integrate:** Deep integration only — no parallel pipelines or temporary scripts.
|
| 57 |
+
9. **NO STANDALONE:** No temp_fix.py, wrapper_v2.py, or "quick fixes."
|
| 58 |
+
10. **Fix in place:** Do not create parallel or temporary scripts. Fix the files in place.
|
| 59 |
+
11. **Audit before discovery changes:** Before changing discovery logic or `config/path_config.py`, run `python3 tools/audit_runtime_blockers.py` (from courtBundleGenerator2). Fix any reported blockers first.
|
| 60 |
+
|
| 61 |
+
---
|
| 62 |
+
|
| 63 |
+
## ENFORCEMENT BLOCKING RULES
|
| 64 |
+
|
| 65 |
+
- **Source of truth:** `memory-bank/CRITICAL_INSTRUCTIONS.md`. All other docs defer to it.
|
| 66 |
+
- No bypasses, fallbacks, standalone scripts, or parallel pipelines.
|
| 67 |
+
- **Protected files** (see CRITICAL_INSTRUCTIONS): never `cp`/`mv`/backup without EXPLICIT PERMISSION IN CAPITALS. For critical files: only append, never overwrite completely.
|
| 68 |
+
|
| 69 |
+
---
|
| 70 |
+
|
| 71 |
+
## VERIFICATION LOOP (AFTER EVERY CHANGE)
|
| 72 |
+
|
| 73 |
+
Use the canonical commands in `memory-bank/CRITICAL_INSTRUCTIONS.md` for your entrypoint. Summary:
|
| 74 |
+
|
| 75 |
+
**Project 2 (enhanced_bundler_wrapper.patched.py):**
|
| 76 |
+
```bash
|
| 77 |
+
source court_venv_20250802/bin/activate && python3 -u enhanced_bundler_wrapper.patched.py \
|
| 78 |
+
--output-dir /home/mrdbo/court_data/CourtBundleOutput \
|
| 79 |
+
--enable-discovery --enable-fuzzy --recursive --limit 15 --limit-per-bundle 5 \
|
| 80 |
+
2>&1 | tee -a telemetry.log
|
| 81 |
+
```
|
| 82 |
+
Then: `cd /home/mrdbo/projects/courtBundleGenerator2 && python3 embedding_utils/pdf_page_verifier.py /home/mrdbo/court_data/CourtBundleOutput`
|
| 83 |
+
|
| 84 |
+
**Project 3 (generate_bundles_final_corrected.py):**
|
| 85 |
+
```bash
|
| 86 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && source ../courtBundleGenerator2/court_venv_20250802/bin/activate && \
|
| 87 |
+
python3 -u generate_bundles_final_corrected.py \
|
| 88 |
+
--output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput \
|
| 89 |
+
--enable-discovery --recursive --limit 15 --limit-per-bundle 5 \
|
| 90 |
+
2>&1 | tee -a telemetry.log
|
| 91 |
+
```
|
| 92 |
+
Then: `cd /home/mrdbo/projects/courtBundleGenerator3 && python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput`
|
| 93 |
+
|
| 94 |
+
**Audit/diagnostics:** See `AUDITING_COMMANDS_23_1_26.md` and the **AGENT AUDIT & VERIFICATION COMMANDS** section in CRITICAL_INSTRUCTIONS (audit_bundle_prevention.py, test_download_links.py, audit_runtime_chain, audit_runtime_blockers).
|
| 95 |
+
|
| 96 |
+
---
|
| 97 |
+
|
| 98 |
+
## ABSOLUTE COMPLETION GATE
|
| 99 |
+
|
| 100 |
+
You must **not** claim completion unless ALL are true:
|
| 101 |
+
|
| 102 |
+
- Bundles generated in the chosen output directory; PDFs are non-empty.
|
| 103 |
+
- Page-level verification run (see CRITICAL_INSTRUCTIONS — AGENT AUDIT & VERIFICATION COMMANDS) with 0 missing pages, or with every missing page enumerated with a reason code.
|
| 104 |
+
- Missing evidence summary is empty OR fully enumerated with reason codes.
|
| 105 |
+
- TOC sync issues == 0; DB numbers present in TOC and evidence pages; exhibit number = bundle letter+seq (e.g. A15, G7), not DB ref.
|
| 106 |
+
- No raw paths on PDF pages (a raw path indicates an embedding failure); continue the verification loop until 0 are missing. Empirical analysis only.
|
| 107 |
+
- User confirms PDFs are correct when applicable.
|
| 108 |
+
|
| 109 |
+
**Run proof required:** Exact command executed; at least one PDF path with size + full ISO timestamp; `missing_evidence_summary.json` status. **YOU MUST STOP FOR USER ACCEPTANCE before claiming completion.**
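
The "PDF path with size + full ISO timestamp" part of the run proof can be produced programmatically. The sketch below is a hedged illustration: `pdf_proof` is a hypothetical helper, it uses the file's modification time in UTC, and it complements rather than replaces the `ls -lh` output the rules require.

```python
from datetime import datetime, timezone
from pathlib import Path


def pdf_proof(pdf_path: str) -> str:
    """Return 'path size modified-time' for an existing, non-empty file, or raise."""
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"No file at {path}")
    stat = path.stat()
    if stat.st_size == 0:
        raise ValueError(f"Empty PDF at {path}")
    # Full ISO 8601 timestamp with an explicit UTC offset.
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat()
    return f"{path} {stat.st_size} bytes modified {modified}"
```

Raising on a missing or empty file keeps the helper from ever emitting a proof line for an output that does not exist.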
|
| 110 |
+
|
| 111 |
+
---
|
| 112 |
+
|
| 113 |
+
## PENALTY SYSTEM
|
| 114 |
+
|
| 115 |
+
- **Broken Code:** Session terminates immediately; all output invalidated.
|
| 116 |
+
- **Placeholder Code:** Task rejected; must re-plan.
|
| 117 |
+
- **Hallucinated Files/Success:** Confidence score → 0%; all claims invalidated.
|
| 118 |
+
- **Skipped Verification:** All subsequent output marked UNVERIFIED.
|
| 119 |
+
- **Bypassing Rules:** Session paused; requires explicit re-authorization.
|
| 120 |
+
- **Re-enabling compliance/validation that blocks pipeline:** Session paused; change reverted. Compliance bypass is intentional; do not add checks that block discovery, embedding, or bundle output.
|
| 121 |
+
|
| 122 |
+
---
|
| 123 |
+
|
| 124 |
+
## KEY REFERENCES
|
| 125 |
+
|
| 126 |
+
- **Full instructions:** `memory-bank/CRITICAL_INSTRUCTIONS.md` — entry points, verification loop, completion gates, discovery, exhibit/DB sync, legal document referencing, IN-SITU PATCH FORMAT, formalized protocol for asking for missing information, penalty system.
|
| 127 |
+
- **Audit/verification commands:** `AUDITING_COMMANDS_23_1_26.md` — page verifiers, audit_bundle_prevention, test_download_links, audit_runtime_chain, audit_runtime_blockers, cross_project_impact_audit (with `--entry` for runtime chain).
|
| 128 |
+
- **Exhibit referencing (legal docs):** `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `PROMPTS/LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md`. Authoritative DB list: `legal_emails/Phase8/DB_Evidence_List.txt`.
|
| 129 |
+
- **Protected files list:** In CRITICAL_INSTRUCTIONS (e.g. enhanced_bundler_wrapper.patched.py, create_proper_embedded_bundle.py, generate_bundles_final_corrected.py, dual_category_evidence_processor.py, categorize_and_append_v2.py).
|
app.py
CHANGED
|
@@ -302,32 +302,46 @@ def security_info():
|
|
| 302 |
}
|
| 303 |
|
| 304 |
|
| 305 |
-
# Legal document exhibit reference instruction —
|
| 306 |
_LEGAL_EXHIBIT_PROMPT_PATH = Path(__file__).resolve().parent / "prompts" / "legal_exhibit_instruction.txt"
|
| 307 |
|
| 308 |
@app.get("/prompts/legal-exhibit-instruction")
|
| 309 |
def get_legal_exhibit_instruction():
|
| 310 |
-
"""Return the legal exhibit referencing instruction.
|
| 311 |
-
|
| 312 |
-
return {"instruction": _LEGAL_EXHIBIT_PROMPT_PATH.read_text(encoding="utf-8", errors="replace")}
|
| 313 |
-
return {"instruction": "When referencing evidence use Exhibit [Letter][Number] (DB-[N]) — [Filename]. Do not use bare DB-[●]."}
|
| 314 |
|
| 315 |
|
| 316 |
# --- LLM Generation (Dual Backend: Ollama → HF Inference API) ---
|
| 317 |
|
| 318 |
@app.post("/api/generate")
|
| 319 |
async def generate(request: GenerateRequest, x_api_key: str = Header(None)):
|
| 320 |
-
"""Generate text using LLM. Tries Ollama first, falls back to HF Inference API.
|
| 321 |
if not x_api_key or x_api_key != API_KEY:
|
| 322 |
raise HTTPException(status_code=401, detail="Invalid or missing API Key")
|
| 323 |
|
| 324 |
-
|
| 325 |
|
| 326 |
backend_used = None
|
| 327 |
response_text = None
|
| 328 |
|
| 329 |
# Backend 1: Try Ollama (local)
|
| 330 |
-
response_text = generate_with_ollama(request.model,
|
| 331 |
if response_text:
|
| 332 |
backend_used = "ollama"
|
| 333 |
logger.info(f"[GENERATE] Ollama success, response_len={len(response_text)}")
|
|
@@ -335,7 +349,7 @@ async def generate(request: GenerateRequest, x_api_key: str = Header(None)):
|
|
| 335 |
# Backend 2: Fallback to HF Inference API
|
| 336 |
if not response_text:
|
| 337 |
logger.info("[GENERATE] Ollama unavailable, trying HF Inference API...")
|
| 338 |
-
response_text = generate_with_hf_api(
|
| 339 |
if response_text:
|
| 340 |
backend_used = "hf_inference_api"
|
| 341 |
logger.info(f"[GENERATE] HF API success, response_len={len(response_text)}")
|
|
@@ -641,9 +655,13 @@ async def chat_completions(
|
|
| 641 |
|
| 642 |
logger.info(f"[CHAT] model={request.model}, messages={len(request.messages)}, stream={request.stream}")
|
| 643 |
|
| 644 |
# Generate response via model routing
|
| 645 |
response_text = _generate_for_model(
|
| 646 |
-
request.model,
|
| 647 |
temperature=request.temperature or 0.7,
|
| 648 |
max_tokens=request.max_tokens or 2048,
|
| 649 |
)
|
|
|
|
| 302 |
}
|
| 303 |
|
| 304 |
|
| 305 |
+
# Legal document exhibit reference instruction — injected into every generate/chat so edit sources always get it
|
| 306 |
_LEGAL_EXHIBIT_PROMPT_PATH = Path(__file__).resolve().parent / "prompts" / "legal_exhibit_instruction.txt"
|
| 307 |
+
_LEGAL_EXHIBIT_INSTRUCTION_CACHED: Optional[str] = None
|
| 308 |
+
|
| 309 |
+
def _get_legal_exhibit_instruction() -> str:
|
| 310 |
+
"""Load legal exhibit instruction once; used to inject into all LLM requests so Moltbot/Qwen always output Exhibit [Letter][Number] (DB-[N]) — [Filename]."""
|
| 311 |
+
global _LEGAL_EXHIBIT_INSTRUCTION_CACHED
|
| 312 |
+
if _LEGAL_EXHIBIT_INSTRUCTION_CACHED is not None:
|
| 313 |
+
return _LEGAL_EXHIBIT_INSTRUCTION_CACHED
|
| 314 |
+
if _LEGAL_EXHIBIT_PROMPT_PATH.exists():
|
| 315 |
+
_LEGAL_EXHIBIT_INSTRUCTION_CACHED = _LEGAL_EXHIBIT_PROMPT_PATH.read_text(encoding="utf-8", errors="replace")
|
| 316 |
+
else:
|
| 317 |
+
_LEGAL_EXHIBIT_INSTRUCTION_CACHED = "When referencing evidence use Exhibit [Letter][Number] (DB-[N]) — [Filename]. Do not use bare DB-[●]."
|
| 318 |
+
return _LEGAL_EXHIBIT_INSTRUCTION_CACHED
|
| 319 |
|
| 320 |
@app.get("/prompts/legal-exhibit-instruction")
|
| 321 |
def get_legal_exhibit_instruction():
|
| 322 |
+
"""Return the legal exhibit referencing instruction. Also injected automatically into /api/generate and /v1/chat/completions."""
|
| 323 |
+
return {"instruction": _get_legal_exhibit_instruction()}
|
| 324 |
|
| 325 |
|
| 326 |
# --- LLM Generation (Dual Backend: Ollama → HF Inference API) ---
|
| 327 |
|
| 328 |
@app.post("/api/generate")
|
| 329 |
async def generate(request: GenerateRequest, x_api_key: str = Header(None)):
|
| 330 |
+
"""Generate text using LLM. Tries Ollama first, falls back to HF Inference API.
|
| 331 |
+
Legal exhibit instruction is prepended so all edits/amendments use Exhibit [Letter][Number] (DB-[N]) — [Filename].
|
| 332 |
+
"""
|
| 333 |
if not x_api_key or x_api_key != API_KEY:
|
| 334 |
raise HTTPException(status_code=401, detail="Invalid or missing API Key")
|
| 335 |
|
| 336 |
+
# Inject legal exhibit instruction so edit sources always get the rule
|
| 337 |
+
prompt_with_legal = _get_legal_exhibit_instruction() + "\n\n---\n\n" + request.prompt
|
| 338 |
+
logger.info(f"[GENERATE] model={request.model}, prompt_len={len(prompt_with_legal)}")
|
| 339 |
|
| 340 |
backend_used = None
|
| 341 |
response_text = None
|
| 342 |
|
| 343 |
# Backend 1: Try Ollama (local)
|
| 344 |
+
response_text = generate_with_ollama(request.model, prompt_with_legal)
|
| 345 |
if response_text:
|
| 346 |
backend_used = "ollama"
|
| 347 |
logger.info(f"[GENERATE] Ollama success, response_len={len(response_text)}")
|
|
|
|
| 349 |
# Backend 2: Fallback to HF Inference API
|
| 350 |
if not response_text:
|
| 351 |
logger.info("[GENERATE] Ollama unavailable, trying HF Inference API...")
|
| 352 |
+
response_text = generate_with_hf_api(prompt_with_legal)
|
| 353 |
if response_text:
|
| 354 |
backend_used = "hf_inference_api"
|
| 355 |
logger.info(f"[GENERATE] HF API success, response_len={len(response_text)}")
|
|
|
|
| 655 |
|
| 656 |
logger.info(f"[CHAT] model={request.model}, messages={len(request.messages)}, stream={request.stream}")
|
| 657 |
|
| 658 |
+
# Inject legal exhibit instruction so every edit/amendment/insert uses Exhibit [Letter][Number] (DB-[N]) — [Filename]
|
| 659 |
+
legal_system = ChatMessage(role="system", content=_get_legal_exhibit_instruction())
|
| 660 |
+
messages_with_legal = [legal_system] + list(request.messages)
|
| 661 |
+
|
| 662 |
# Generate response via model routing
|
| 663 |
response_text = _generate_for_model(
|
| 664 |
+
request.model, messages_with_legal,
|
| 665 |
temperature=request.temperature or 0.7,
|
| 666 |
max_tokens=request.max_tokens or 2048,
|
| 667 |
)
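
The load-once-and-prepend behaviour added to `app.py` in this commit can be exercised in isolation. The sketch below is not the committed code: it swaps the module-level global cache for `functools.lru_cache`, and the names `load_instruction`, `with_legal_instruction`, `FALLBACK_INSTRUCTION` and `SEPARATOR` are hypothetical. The fallback text and the `"\n\n---\n\n"` separator are copied from the diff.

```python
from functools import lru_cache
from pathlib import Path

# Fallback text and separator as they appear in the app.py hunk.
FALLBACK_INSTRUCTION = (
    "When referencing evidence use Exhibit [Letter][Number] (DB-[N]) — [Filename]. "
    "Do not use bare DB-[●]."
)
SEPARATOR = "\n\n---\n\n"


@lru_cache(maxsize=None)
def load_instruction(prompt_path: Path) -> str:
    """Read the instruction file once per path; use the fallback when it is absent."""
    if prompt_path.exists():
        return prompt_path.read_text(encoding="utf-8", errors="replace")
    return FALLBACK_INSTRUCTION


def with_legal_instruction(prompt: str, prompt_path: Path) -> str:
    """Prepend the cached instruction to a user prompt, as /api/generate does."""
    return load_instruction(prompt_path) + SEPARATOR + prompt
```

As with the global cache in the diff, edits to the instruction file are not picked up until the process restarts (or, in this variant, until `load_instruction.cache_clear()` is called).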
|
memory-bank/ARCHITECTURE.md
ADDED
|
@@ -0,0 +1,117 @@
| 1 |
+
# /home/mrdbo/projects/courtBundleGenerator2/memory-bank/ARCHITECTURE.md
|
| 2 |
+
# SYNCED FILE — Source: courtBundleGenerator2/memory-bank/ARCHITECTURE.md
|
| 3 |
+
# DO NOT EDIT IN OTHER REPOS — edit source and run scripts/sync_agent_docs.sh
|
| 4 |
+
|
| 5 |
+
# Project Architecture — Court Bundle Generator
|
| 6 |
+
|
| 7 |
+
## 4 Repos. 2 Local Dev. 2 HF Space Local Clones.
|
| 8 |
+
|
| 9 |
+
```
|
| 10 |
+
/home/mrdbo/projects/
|
| 11 |
+
├── courtBundleGenerator2/ ← REPO 1 (P2)
|
| 12 |
+
├── courtBundleGenerator3/ ← REPO 2 (P3)
|
| 13 |
+
├── moltbot-legal-desktop/ ← REPO 3 (HF Space A local clone)
|
| 14 |
+
└── moltbot-hybrid-engine/ ← REPO 4 (HF Space B local clone)
|
| 15 |
+
|
| 16 |
+
/home/mrdbo/court_data/ ← OUTPUT DIRECTORY ONLY. NOT A REPO.
|
| 17 |
+
/home/mrdbo/projects/courtBundleGenerator2/evidence/ ← EVIDENCE SUBFOLDER OF P2. NOT A REPO.
|
| 18 |
+
```
|
| 19 |
+
|
| 20 |
+
---
|
| 21 |
+
|
| 22 |
+
## Repo Roles
|
| 23 |
+
|
| 24 |
+
### REPO 1 — courtBundleGenerator2 (P2)
|
| 25 |
+
- **Role:** Evidence root. Legacy bundler. Documentation home.
|
| 26 |
+
- **Evidence root:** `/home/mrdbo/projects/courtBundleGenerator2/evidence/` (full recursive scan, no whitelists)
|
| 27 |
+
- **Entrypoint:** `enhanced_bundler_wrapper.patched.py`
|
| 28 |
+
- **Output dir:** `/home/mrdbo/court_data/CourtBundleOutput`
|
| 29 |
+
- **Symlink:** `./output` → `/home/mrdbo/court_data/CourtBundleOutput` (always use full path in commands)
|
| 30 |
+
- **Venv:** `/home/mrdbo/projects/courtBundleGenerator2/court_venv_20250802/bin/python`
|
| 31 |
+
- **Documentation home:** `memory-bank/`, `PROMPTS/`
|
| 32 |
+
- **Rule:** Read-only for evidence files. Do NOT add new logic adapters here.
|
| 33 |
+
|
| 34 |
+
### REPO 2 — courtBundleGenerator3 (P3)
|
| 35 |
+
- **Role:** Active logic center. All new adapters, tools, bridge scripts go here.
|
| 36 |
+
- **Entrypoint:** `generate_bundles_final_corrected.py`
|
| 37 |
+
- **Output dir:** `/home/mrdbo/court_data/2nd_CourtBundleOutput`
|
| 38 |
+
- **Symlink:** `./output` → `/home/mrdbo/court_data/2nd_CourtBundleOutput` (always use full path in commands)
|
| 39 |
+
- **Venv:** Shared — `/home/mrdbo/projects/courtBundleGenerator2/court_venv_20250802/bin/python`
|
| 40 |
+
- **Key files:** `cloud_llm_adapter.py`, `moltbot_track_changes.py`, `generate_bundles_final_corrected.py`
|
| 41 |
+
- **Rule:** Install all Python dependencies and bridge scripts HERE, not in P2.
|
| 42 |
+
|
| 43 |
+
### REPO 3 — moltbot-legal-desktop (HF Space A)
|
| 44 |
+
- **Role:** Local clone of Hugging Face Space `deebee7/moltbot-legal-desktop`.
|
| 45 |
+
- **What runs here locally:** Nothing. Edit locally, then `git push` to deploy.
|
| 46 |
+
- **What the Space runs:** FastAPI web server (`app.py`) on port 7860.
|
| 47 |
+
- **Live URL:** `https://deebee7-moltbot-legal-desktop.hf.space`
|
| 48 |
+
- **Live endpoints:** `/health`, `/api/generate_bundle`, `/api/bundles`, `/api/evidence_stats`, `/api/analyze`
|
| 49 |
+
- **Deploy command (run from this repo root):** `git add -A && git commit -m "msg" && git push origin main`
|
| 50 |
+
- **Sync from P2/P3:** `cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh`
|
| 51 |
+
- **SDK:** Docker (Python 3.10 + LibreOffice + uvicorn)
|
| 52 |
+
|
| 53 |
+
### REPO 4 — moltbot-hybrid-engine (HF Space B)
|
| 54 |
+
- **Role:** Local clone of Hugging Face Space `deebee7/moltbot-hybrid-engine`.
|
| 55 |
+
- **What runs here locally:** Nothing. Edit locally, then `git push` to deploy.
|
| 56 |
+
- **What the Space runs:** FastAPI + Ollama + Qwen 2.5. OpenAI-compatible API.
|
| 57 |
+
- **Live URL:** `https://deebee7-moltbot-hybrid-engine.hf.space`
|
| 58 |
+
- **Live endpoints:** `/health`, `/api/generate`, `/api/search`, `/api/analyze`, `/v1/chat/completions`, `/v1/models`, `GET /prompts/legal-exhibit-instruction`
|
| 59 |
+
- **Deploy command (run from this repo root):** `git add -A && git commit -m "msg" && git push origin main`
|
| 60 |
+
- **SDK:** Docker
|
| 61 |
+
|
| 62 |
+
---
|
| 63 |
+
|
| 64 |
+
## Output Directories (Not repos — never commit here)
|
| 65 |
+
|
| 66 |
+
| Path | Purpose | Used by |
|
| 67 |
+
|---|---|---|
|
| 68 |
+
| `/home/mrdbo/court_data/CourtBundleOutput` | P2 bundle output | P2 entrypoint |
|
| 69 |
+
| `/home/mrdbo/court_data/2nd_CourtBundleOutput` | P3 bundle output | P3 entrypoint |
|
| 70 |
+
|
| 71 |
+
---
|
| 72 |
+
|
| 73 |
+
## Evidence Root (Subfolder of P2 — not a repo)
|
| 74 |
+
|
| 75 |
+
- **Path:** `/home/mrdbo/projects/courtBundleGenerator2/evidence/`
|
| 76 |
+
- **Discovery policy:** Full recursive scan (`os.walk` / `rglob`). No whitelists. No allow-lists.
|
| 77 |
+
- **Historically missed directories (must never be excluded):** `Repairs`, `InputDocs`, `new_evidence_staging`, `00_CRITICAL_SCANNED`, `00_CRITICAL_INTAKE`
|
| 78 |
+
- **External evidence path:** `/legal_emails` also scanned
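
A minimal sketch of the discovery policy above, using `Path.rglob` with no filtering, plus a check that the historically missed directories actually contribute files. The function names are hypothetical and this is not the logic in `config/path_config.py`; it only illustrates the "scan everything, then prove nothing was skipped" idea.

```python
from pathlib import Path
from typing import Iterable, List

# Directory names listed above as historically missed.
HISTORICALLY_MISSED = (
    "Repairs",
    "InputDocs",
    "new_evidence_staging",
    "00_CRITICAL_SCANNED",
    "00_CRITICAL_INTAKE",
)


def discover_files(root: Path) -> List[Path]:
    """Full recursive scan: every regular file under root, no whitelist."""
    return sorted(p for p in root.rglob("*") if p.is_file())


def uncovered_directories(root: Path, discovered: Iterable[Path]) -> List[str]:
    """Names from HISTORICALLY_MISSED that hold files but contributed none to discovered."""
    discovered_set = set(discovered)
    missing = []
    for name in HISTORICALLY_MISSED:
        for directory in root.rglob(name):
            if not directory.is_dir():
                continue
            files = [p for p in directory.rglob("*") if p.is_file()]
            if files and not discovered_set.intersection(files):
                missing.append(name)
                break
    return missing
```

With an unfiltered scan, `uncovered_directories(root, discover_files(root))` should always be empty; a non-empty result from some other discovery routine points at an exclusion that needs fixing in the scan itself.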
|
| 79 |
+
|
| 80 |
+
---
|
| 81 |
+
|
| 82 |
+
## Cloud Infrastructure (Not local repos — deployed via git push)
|
| 83 |
+
|
| 84 |
+
| Space | HF Repo | Local Clone | Role |
|
| 85 |
+
|---|---|---|---|
|
| 86 |
+
| HF Space A | `deebee7/moltbot-legal-desktop` | `moltbot-legal-desktop/` | Web bundle server |
|
| 87 |
+
| HF Space B | `deebee7/moltbot-hybrid-engine` | `moltbot-hybrid-engine/` | Qwen 2.5 brain |
|
| 88 |
+
|
| 89 |
+
**Check Space health:**
|
| 90 |
+
```bash
|
| 91 |
+
curl -s https://deebee7-moltbot-hybrid-engine.hf.space/health | python3 -m json.tool
|
| 92 |
+
curl -s https://deebee7-moltbot-legal-desktop.hf.space/health | python3 -m json.tool
|
| 93 |
+
```
|
| 94 |
+
|
| 95 |
+
---
|
| 96 |
+
|
| 97 |
+
## VS Code Workspace
|
| 98 |
+
|
| 99 |
+
File: `/home/mrdbo/projects/MyProjects.code-workspace`
|
| 100 |
+
All 4 repos plus `court_data` (output) and `evidence` (subfolder) are opened as workspace folders for convenience. `court_data` and `evidence` are NOT repos.
|
| 101 |
+
|
| 102 |
+
---
|
| 103 |
+
|
| 104 |
+
## Key Shared Config (Lives in P2, used by P3 via import)
|
| 105 |
+
|
| 106 |
+
| File | Repo | Purpose |
|
| 107 |
+
|---|---|---|
|
| 108 |
+
| `config/path_config.py` | P2 | Discovery roots — must return full evidence tree |
|
| 109 |
+
| `file_resolution_bridge.py` | P2 | File resolution with caching |
|
| 110 |
+
| `lib/db_registry.py` | P2 | DB-[N] assignment — never sets exhibitNo |
|
| 111 |
+
| `config/bundle_compliance.json` | P2 | Court formatting reference (not a pipeline gate) |
|
| 112 |
+
| `legal_emails/Phase8/DB_Evidence_List.txt` | P2 | Authoritative DB1–DB170 list |
|
| 113 |
+
|
| 114 |
+
---
|
| 115 |
+
|
| 116 |
+
*Source of truth: courtBundleGenerator2/memory-bank/ARCHITECTURE.md*
|
| 117 |
+
*Synced to all repos by: scripts/sync_agent_docs.sh*
|
memory-bank/CLAUDE.md
ADDED
|
@@ -0,0 +1,80 @@
| 1 |
+
# /home/mrdbo/projects/courtBundleGenerator2/memory-bank/CLAUDE.md
|
| 2 |
+
# CLAUDE Project Summary (Current as of 2026-02-13)
|
| 3 |
+
|
| 4 |
+
## Current State (2026-02-13)
|
| 5 |
+
- **Exhibit/DB sync:** `lib/db_registry.py` writes only DB refs (never exhibitNo/Exhibit No.). P3 `generate_bundles_final_corrected.py` uses bundle letter+seq for exhibit (e.g. A15, G7) and `db_ref` for DB only. Authoritative list: `legal_emails/Phase8/DB_Evidence_List.txt`. TOC, footer, metadata on page stay in sync.
|
| 6 |
+
- **Legal document referencing:** All court filings (witness statement, N244, SRA) must use **Exhibit [Letter][Number] (DB-[N]) — [Filename]**. No bare "DB-[●]". See `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `PROMPTS/LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md`. `.cursorrules` includes the rule for courtBundleGenerator2; Moltbot/Qwen clients should send the instruction as system message (Engine: `GET /prompts/legal-exhibit-instruction`).
|
| 7 |
+
- **Hybrid Cloud Architecture:** Two HF Spaces deployed and running:
|
| 8 |
+
- **Space A** (`deebee7/moltbot-legal-desktop`): FastAPI web server (`app.py`) on port 7860 for cloud bundle generation. Docker SDK, Python 3.10 + LibreOffice. Local clone at `/home/mrdbo/projects/moltbot-legal-desktop`.
|
| 9 |
+
- **Space B** (`deebee7/moltbot-hybrid-engine`): Ollama + Qwen 2.5 LLM; FastAPI + `start.sh`; OpenAI-compatible `/v1/chat/completions`. **`GET /prompts/legal-exhibit-instruction`** returns legal exhibit instruction for clients. Docker SDK, Python 3.11-slim. Local clone at `/home/mrdbo/projects/moltbot-hybrid-engine`.
|
| 10 |
+
- **Sync System:** `sync_to_desktop.sh` + `install_sync_hook.sh` in `courtBundleGenerator3/adapters/` sync P2 libraries and P3 adapters/tools to the Desktop Space. Post-commit hooks are available.
|
| 11 |
+
- **DBRegistry:** Import wrapped with `try...except` + `HAS_DB_REGISTRY`; double-fallback in Desktop. Registry seeds from `DB_Evidence_List.txt`; never overwrites exhibit number with DB ref.
|
| 12 |
+
- **Verification & metadata:** Page verifier (`pdf_page_verifier_enhanced.py` in P3); metadata on every page; DB fallback DB-0. Audit: `tools/audit_bundle_prevention.py`, `tools/test_download_links.py`, `tools/audit_active_bundling_files.py` (→ gb3_deps.json), `tools/cross_project_impact_audit.py` (optional `--entry` for runtime chain).
|
| 13 |
+
- Metadata ingestion, recursion guard, prompt system, category_mapping, dual category processor as previously documented.
|
| 14 |
+
|
| 15 |
+
## Architecture Map (4 Repos + 1 Output Directory)
|
| 16 |
+
| # | Project | Path | Role |
|
| 17 |
+
|---|---------|------|------|
|
| 18 |
+
| P2 | courtBundleGenerator2 | `/home/mrdbo/projects/courtBundleGenerator2` | Evidence Root, Legacy Bundler, Documentation |
|
| 19 |
+
| P3 | courtBundleGenerator3 | `/home/mrdbo/projects/courtBundleGenerator3` | Smart Agent Home, Logic Center |
|
| 20 |
+
| Desktop | moltbot-legal-desktop | `/home/mrdbo/projects/moltbot-legal-desktop` | HF Space A — Cloud bundle web server |
|
| 21 |
+
| Engine | moltbot-hybrid-engine | `/home/mrdbo/projects/moltbot-hybrid-engine` | HF Space B — Ollama + Qwen 2.5 LLM |
|
| 22 |
+
| (data) | court_data | `/home/mrdbo/court_data` | Bundle output directory |
|
| 23 |
+
|
| 24 |
+
## Mandate
|
| 25 |
+
- **Output paths:** P2 → `/home/mrdbo/court_data/CourtBundleOutput`; P3 → `/home/mrdbo/court_data/2nd_CourtBundleOutput`. Do not use `./local_output` for production.
|
| 26 |
+
- Treat `/home/mrdbo/projects/courtBundleGenerator2/evidence/InputDocs/**` and `/home/mrdbo/projects/courtBundleGenerator2/evidence/new_evidence_staging/**` as the only writable discovery sources. All other `/evidence/*` folders exist for read-only reference.
|
| 27 |
+
- Keep the Chain of Verification intact: every run must surface logs from `AntiHallucinationManager`, `EnhancedFuzzyResolver`, and `UnifiedEvidenceBridge` before evidence is embedded.
|
| 28 |
+
- Do not claim success until: (1) canonical run command executed (see CRITICAL_INSTRUCTIONS.md), (2) at least one non-empty PDF in chosen output dir (CourtBundleOutput or 2nd_CourtBundleOutput), (3) page-level verification run. See AUDITING_COMMANDS_23_1_26.md for audit/verification commands.
|
| 29 |
+
- When editing code, annotate complex fixes with the relevant path and line number (e.g., `# FIX: create_proper_embedded_bundle.py:2882`).
|
| 30 |
+
|
| 31 |
+
## Open Issues to Track
|
| 32 |
+
1. **Verification automation** – capture and archive the stdout/stderr from each bundler run (see Core Commands) so future agents know the last known good state.
|
| 33 |
+
2. **Dual-category imports** – finish staggering imports inside `dual_category_evidence_processor.py` so instantiation no longer prints the circular import warning when invoked in isolation.
|
| 34 |
+
3. **Documentation consistency** – every `memory-bank/*` document must reflect the narrow discovery scope and current integration notes (this file sets the tone).
|
| 35 |
+
4. **Jira integration** – missing env vars (JIRA_URL, JIRA_EMAIL, JIRA_TOKEN) cause 404 errors in agent logger.
|
| 36 |
+
5. **Hybrid Engine model pull** – Qwen 2.5 7B model pull may not complete on HF free tier (2 CPU, 16GB RAM). Monitor `/api/generate` endpoint for 503 status.
|
| 37 |
+
|
| 38 |
+
## Core Commands
|
| 39 |
+
```bash
|
| 40 |
+
# P2: output only /home/mrdbo/court_data/CourtBundleOutput
|
| 41 |
+
source court_venv_20250802/bin/activate && python3 -u enhanced_bundler_wrapper.patched.py \
|
| 42 |
+
--output-dir /home/mrdbo/court_data/CourtBundleOutput --limit 1 --recursive
|
| 43 |
+
|
| 44 |
+
# P3: output only /home/mrdbo/court_data/2nd_CourtBundleOutput
|
| 45 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && python3 -u generate_bundles_final_corrected.py \
|
| 46 |
+
--output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput --limit 1 --recursive
|
| 47 |
+
|
| 48 |
+
# Page verifier P3
|
| 49 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput
|
| 50 |
+
|
| 51 |
+
# Confirm output (use dir that matches entrypoint)
|
| 52 |
+
ls -lh /home/mrdbo/court_data/CourtBundleOutput/court_bundle*.pdf
|
| 53 |
+
ls -lh /home/mrdbo/court_data/2nd_CourtBundleOutput/*.pdf
|
| 54 |
+
|
| 55 |
+
# HF Space health checks
|
| 56 |
+
curl -s https://deebee7-moltbot-hybrid-engine.hf.space/health | python3 -m json.tool
|
| 57 |
+
curl -s https://deebee7-moltbot-legal-desktop.hf.space/health | python3 -m json.tool
|
| 58 |
+
|
| 59 |
+
# Sync P2/P3 → Desktop
|
| 60 |
+
cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh --push
|
| 61 |
+
|
| 62 |
+
# Deploy to HF (Desktop)
|
| 63 |
+
cd /home/mrdbo/projects/moltbot-legal-desktop && git add -A && git commit -m "msg" && git push origin main
|
| 64 |
+
|
| 65 |
+
# Deploy to HF (Engine)
|
| 66 |
+
cd /home/mrdbo/projects/moltbot-hybrid-engine && git add -A && git commit -m "msg" && git push origin main
|
| 67 |
+
```
|
| 68 |
+
|
| 69 |
+
## Reference Files
|
| 70 |
+
- `create_proper_embedded_bundle.py`, `lib/db_registry.py`, `legal_emails/Phase8/DB_Evidence_List.txt`, `BUNDLE_GROUPS_WITH_FULL_EVIDENCE_FILE_NAMES.md`
|
| 71 |
+
- `cohesive_unified_evidence_processor.py`, `category_mapping.py`, `dual_category_evidence_processor.py`
|
| 72 |
+
- `embedding_utils/prompt_system_integration.py`, `embedding_utils/enhanced_features.py`
|
| 73 |
+
- `enhanced_bundler_wrapper.patched.py`
|
| 74 |
+
- courtBundleGenerator3: `generate_bundles_final_corrected.py`, `pdf_page_verifier_enhanced.py`, `tools/audit_bundle_prevention.py`, `tools/test_download_links.py`, `tools/audit_active_bundling_files.py`, `tools/cross_project_impact_audit.py`
|
| 75 |
+
- courtBundleGenerator2: `tools/cross_project_impact_audit.py` (runtime chain with `--entry`)
|
| 76 |
+
- PROMPTS: `PROMPT_HEADER_13_12_25.md`, `EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `HOW_TO_MAKE_AGENTS_AWARE.md`
|
| 77 |
+
- moltbot-legal-desktop: `app.py` (FastAPI web server), `Dockerfile`
|
| 78 |
+
- moltbot-hybrid-engine: `app.py` (FastAPI), `start.sh`, `Dockerfile`, `prompts/legal_exhibit_instruction.txt`, GET `/prompts/legal-exhibit-instruction`
|
| 79 |
+
- `memory-bank/CRITICAL_INSTRUCTIONS.md`
|
| 80 |
+
- `AUDITING_COMMANDS_23_1_26.md`
|
memory-bank/CRITICAL_INSTRUCTIONS.md
ADDED
|
@@ -0,0 +1,948 @@
|
| 1 |
+
# /home/mrdbo/projects/courtBundleGenerator2/memory-bank/CRITICAL_INSTRUCTIONS.md
|
| 2 |
+
|
| 3 |
+
### MULTI-LLM CROSS-VERIFICATION PROTOCOL
|
| 4 |
+
|
| 5 |
+
1. Process evidence through BOTH local MoltBot AND cloud Qwen 2.5
|
| 6 |
+
2. Compare outputs at: categorization, DB assignment, embedding verification
|
| 7 |
+
3. Cloud failure → fallback to local with warning (never block pipeline)
|
| 8 |
+
4. Daily HF space health check required
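
The fallback behaviour in step 3 can be sketched as follows. This is an illustrative sketch only: `categorize_with_fallback`, `cloud_fn`, and `local_fn` are hypothetical names standing in for whatever the real adapters expose, and the returned dictionary shape is an assumption.

```python
import logging
from typing import Callable

logger = logging.getLogger("cross_verification")


def categorize_with_fallback(
    item: str,
    cloud_fn: Callable[[str], str],
    local_fn: Callable[[str], str],
) -> dict:
    """Run both models; if the cloud call fails, warn and keep the local result."""
    local_result = local_fn(item)
    try:
        cloud_result = cloud_fn(item)
    except Exception as exc:
        # A cloud failure must never block the pipeline: warn and continue.
        logger.warning("Cloud LLM unavailable (%s); using local result only", exc)
        return {"result": local_result, "source": "local-fallback", "agreed": None}
    return {
        "result": local_result,
        "source": "both",
        "agreed": cloud_result == local_result,
        "cloud_result": cloud_result,
    }
```

Returning `agreed` as `None` on fallback keeps "not compared" distinct from "compared and disagreed", so a later report can list the items that still need cross-verification.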
|
| 9 |
+
|
| 10 |
+
### EMBEDDING INTEGRITY REQUIREMENTS
|
| 11 |
+
|
| 12 |
+
1. **Pre-embedding validation:** Confirm 100% of TOC files exist BEFORE PDF generation
|
| 13 |
+
2. **Real-time embedding monitoring:** Track success/failure per file
|
| 14 |
+
3. **DB reference audit:** Every DB in TOC must appear on actual PDF page
|
| 15 |
+
4. **Zero blank placeholder tolerance:** Fix or remove invalid DB references
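
Requirement 1 (pre-embedding validation) could be reported along these lines. The function name `pre_embedding_report` and the report keys are assumptions made for illustration. The sketch only produces a report and leaves the decision to the caller, in keeping with the rule elsewhere in this file that checks must not block the pipeline.

```python
from pathlib import Path
from typing import Iterable


def pre_embedding_report(toc_paths: Iterable[str]) -> dict:
    """Report which TOC file paths exist on disk before PDF generation starts."""
    paths = list(toc_paths)
    # A path counts as missing if it is absent or is not a regular file.
    missing = [p for p in paths if not Path(p).is_file()]
    return {
        "total": len(paths),
        "found": len(paths) - len(missing),
        "missing": missing,
        "complete": not missing,
    }
```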
|
| 16 |
+
|
| 17 |
+
### COMPLETION GATE UPDATES
|
| 18 |
+
|
| 19 |
+
✅ No blank placeholder pages
|
| 20 |
+
✅ Cloud LLM operational OR fallback executed
|
| 21 |
+
✅ Pre-embedding validation report generated
|
| 22 |
+
✅ All TOC DB references appear on actual PDF pages
|
| 23 |
+
|
| 24 |
+
## CORE DIRECTIVES
|
| 25 |
+
|
| 26 |
+
## 1.1 HYBRID ARCHITECTURE MAP (DO NOT INFER)
|
| 27 |
+
|
| 28 |
+
**CRITICAL**: You must strictly adhere to these project roles. DO NOT cross-contaminate.
|
| 29 |
+
|
| 30 |
+
### Local Development Projects
|
| 31 |
+
|
| 32 |
+
1. **PROJECT 3 (`/home/mrdbo/projects/courtBundleGenerator3`)** — Smart Agent Home
|
| 33 |
+
- **ROLE**: Logic Center. All new adapters, tools, bridge scripts.
|
| 34 |
+
- **CONTENTS**: `cloud_llm_adapter.py`, `moltbot_track_changes.py`, DeepEval, `generate_bundles_final_corrected.py` (P3 copy).
|
| 35 |
+
- **ACTION**: Install all Python dependencies and bridge scripts HERE.
|
| 36 |
+
- **OUTPUT**: `/home/mrdbo/court_data/2nd_CourtBundleOutput`
|
| 37 |
+
|
| 38 |
+
2. **PROJECT 2 (`/home/mrdbo/projects/courtBundleGenerator2`)** — Evidence Root
|
| 39 |
+
- **ROLE**: Legacy Bundle Generator & Evidence Root. Documentation home (`PROMPTS/`, `memory-bank/`).
|
| 40 |
+
- **CONTENTS**: `/evidence/` data, `enhanced_bundler_wrapper.patched.py`, `create_proper_embedded_bundle.py`.
|
| 41 |
+
- **ACTION**: Read-only for Evidence. Do NOT add new logic adapters here.
|
| 42 |
+
- **OUTPUT**: `/home/mrdbo/court_data/CourtBundleOutput`
|
| 43 |
+
|
| 44 |
+
### Hugging Face Cloud Spaces
|
| 45 |
+
|
| 46 |
+
1. **DESKTOP SPACE — HF Space A (`deebee7/moltbot-legal-desktop`)**
|
| 47 |
+
- **LOCAL CLONE**: `/home/mrdbo/projects/moltbot-legal-desktop`
|
| 48 |
+
- **ROLE**: Cloud bundle generation web server (FastAPI on port 7860).
|
| 49 |
+
- **CONTENTS**: `app.py` (FastAPI), `generate_bundles_final_corrected.py`, adapters (synced from P3), libraries (synced from P2).
|
| 50 |
+
- **ENDPOINTS**: `/health`, `/api/generate_bundle`, `/api/bundles`, `/api/evidence_stats`, `/api/analyze`
|
| 51 |
+
- **DEPLOY**: `cd /home/mrdbo/projects/moltbot-legal-desktop && git add -A && git commit -m "msg" && git push origin main`
|
| 52 |
+
- **SDK**: Docker (Python 3.10 + LibreOffice + uvicorn)
|
| 53 |
+
|
| 54 |
+
2. **HYBRID ENGINE — HF Space B (`deebee7/moltbot-hybrid-engine`)**
|
| 55 |
+
- **LOCAL CLONE**: `/home/mrdbo/projects/moltbot-hybrid-engine`
|
| 56 |
+
- **ROLE**: Remote Uncensored Brain. Runs Ollama + Qwen 2.5; OpenAI-compatible `/v1/chat/completions`.
|
| 57 |
+
- **CONTENTS**: `app.py` (FastAPI), `Dockerfile`, `start.sh`, `prompts/legal_exhibit_instruction.txt`.
|
| 58 |
+
- **ENDPOINTS**: `/health`, `/api/generate`, `/api/search`, `/api/analyze`, `/tools/analyze_report`, `/v1/chat/completions`, `/v1/models`, **`GET /prompts/legal-exhibit-instruction`** (returns legal exhibit referencing instruction for clients to use as system message).
|
| 59 |
+
- **DEPLOY**: `cd /home/mrdbo/projects/moltbot-hybrid-engine && git add -A && git commit -m "msg" && git push origin main`
|
| 60 |
+
- **ACCESS**: Via `cloud_llm_adapter.py` from P3 or curl from Desktop Space.
|
| 61 |
+
|
| 62 |
+
### HF Space Management
|
| 63 |
+
|
| 64 |
+
```bash
|
| 65 |
+
# Check space health
|
| 66 |
+
curl -s https://deebee7-moltbot-hybrid-engine.hf.space/health | python3 -m json.tool
|
| 67 |
+
curl -s https://deebee7-moltbot-legal-desktop.hf.space/health | python3 -m json.tool
|
| 68 |
+
|
| 69 |
+
# Pause + restart (force rebuild) via Python
|
| 70 |
+
python3 -c "
|
| 71 |
+
from huggingface_hub import HfApi
|
| 72 |
+
api = HfApi(token='YOUR_TOKEN')
|
| 73 |
+
api.pause_space('deebee7/moltbot-legal-desktop')
|
| 74 |
+
import time; time.sleep(3)
|
| 75 |
+
api.restart_space('deebee7/moltbot-legal-desktop')
|
| 76 |
+
"
|
| 77 |
+
```
|
| 78 |
+
|
| 79 |
+
### Sync Mechanism (P2/P3 → Desktop)
|
| 80 |
+
|
| 81 |
+
```bash
|
| 82 |
+
# Manual sync
|
| 83 |
+
cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh
|
| 84 |
+
|
| 85 |
+
# Sync + push to HF
|
| 86 |
+
cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash sync_to_desktop.sh --push
|
| 87 |
+
|
| 88 |
+
# Install auto-sync git hooks
|
| 89 |
+
cd /home/mrdbo/projects/courtBundleGenerator3/adapters && bash install_sync_hook.sh
|
| 90 |
+
```
|
| 91 |
+
|
| 92 |
+
### Evidence Root
|
| 93 |
+
|
| 94 |
+
- All evidence resides under `/home/mrdbo/projects/courtBundleGenerator2/evidence/`
|
| 95 |
+
- **CRITICAL**: Evidence files can be found in ANY `/evidence/` subdirectory
|
| 96 |
+
- All subdirectories have **equal priority** — no allow-lists
|
| 97 |
+
- **Policy**: Full RECURSIVE SCAN
|
| 98 |
+
|
| 99 |
+
### **2. DISCOVERY & FILE RESOLUTION**
|
| 100 |
+
|
| 101 |
+
**SCOPE:**
|
| 102 |
+
- **Root:** `/home/mrdbo/projects/courtBundleGenerator2/evidence`
|
| 103 |
+
- **Policy:** RECURSIVE SCAN (`os.walk` or `rglob`).
|
| 104 |
+
- **Explicit Includes:**
|
| 105 |
+
- `/evidence/Repairs` (MUST BE FOUND)
|
| 106 |
+
- `/evidence/InputDocs`
|
| 107 |
+
- `/evidence/new_evidence_staging`
|
| 108 |
+
- `/evidence/00_CRITICAL_SCANNED`
|
| 109 |
+
- `/evidence/00_CRITICAL_INTAKE`
|
| 110 |
+
- `/legal_emails`
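
A recursive scan of the kind described above might look like the sketch below. `index_evidence` here is a hypothetical helper and is not the `index_evidence_files` function in the repository; it assumes that indexing by bare filename is what the resolver needs.

```python
from pathlib import Path


def index_evidence(root: str) -> dict[str, list[Path]]:
    """Walk every subdirectory under root and index files by filename."""
    index: dict[str, list[Path]] = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            # No allow-list: every subdirectory has equal priority.
            index.setdefault(path.name, []).append(path)
    return index
```

Keeping a list per filename preserves duplicates found in different subdirectories, so nothing is silently dropped during discovery.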
|
| 111 |
+
|
| 112 |
+
---
|
| 113 |
+
|
| 114 |
+
### **4. CURRENT CODE STATE (2026-02-13)**
|
| 115 |
+
|
| 116 |
+
- **`lib/db_registry.py`:** Writes only `dbReference`/`DB Reference`/`DB_Reference`; does **not** set `exhibitNo` or `Exhibit No.`. Provides `sync_exhibit_db_references()`. Seeds from `legal_emails/Phase8/DB_Evidence_List.txt`.
|
| 117 |
+
- **`generate_bundles_final_corrected.py` (P3):** Exhibit number = `bundle_exhibit_no` (e.g. A15, G7); DB ref = `db_ref` from registry. TOC row update sets only `dbReference`. Item metadata uses bundle letter+seq for exhibit, DB for dbReference only.
|
| 118 |
+
- **`config/path_config.py`:** `get_authoritative_discovery_roots` returns a hardcoded list of ALL evidence directories. `EVIDENCE_DIRECTORIES` and `EVIDENCE_DISCOVERY_DIRECTORIES` match this list.
|
| 119 |
+
- **`src/prompt_system_integrator.py`:** `_create_compliance_enforcer` returns `None` (Compliance Bypassed).
|
| 120 |
+
- **`enhanced_bundler_wrapper.patched.py`:** Compliance checks removed. `try...except` block syntax error fixed.
|
| 121 |
+
- **`generate_bundles_final_corrected.py`:** `index_evidence_files` performs a direct `os.walk` on `/evidence`, bypassing any config restrictions.
|
| 122 |
+
- **`file_resolution_bridge.py`:** `find_file` logic simplified to use `resolution_cache.json` as the primary source of truth.
|
| 123 |
+
|
| 124 |
+
---
|
| 125 |
+
|
| 127 |
+
|
| 128 |
+
## CRITICAL DIRECTIVES (HIERARCHY 1 - ABSOLUTE)
|
| 129 |
+
|
| 130 |
+
### 1. NO BROKEN CODE
|
| 131 |
+
|
| 132 |
+
Code with syntax errors, incomplete logic, or untested assumptions **terminates the session immediately**. Test mentally before providing ANY code.
|
| 133 |
+
|
| 134 |
+
### 2. NO PLACEHOLDERS
|
| 135 |
+
|
| 136 |
+
Methods printing "TODO" or returning unchanged data are **prohibited**. Implement the logic fully or stop. No exceptions.
|
| 137 |
+
|
| 138 |
+
### 3. NO SYMPTOM TREATMENT
|
| 139 |
+
|
| 140 |
+
Fix **root causes only**. No patches, workarounds, bypasses, fallbacks, standalone scripts, or parallel pipelines. If you cannot fix the root cause, state this explicitly and ask for guidance.
|
| 141 |
+
|
| 142 |
+
### 4. EMPIRICAL EVIDENCE ONLY
|
| 143 |
+
|
| 144 |
+
Every diagnosis requires **concrete proof**: logs, diffs, or shown code. No assumptions, inferences, or guesses permitted. If you don't have evidence, you must ask for it using the formalized protocol (see Section 9).
|
| 145 |
+
|
| 146 |
+
### 5. PATH DISCIPLINE
|
| 147 |
+
|
| 148 |
+
- **ONLY USE:** `--output-dir /home/mrdbo/court_data/CourtBundleOutput` (for Project 2)
|
| 149 |
+
- **ONLY USE:** `--output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput` (for Project 3)
|
| 150 |
+
- **NEVER USE:** `./local_output` or any other output path
|
| 151 |
+
- Strictly obey the `--output-dir` provided in the command. **NEVER fall back** to hardcoded defaults.
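
One way to make a hardcoded fallback impossible is to declare `--output-dir` as required with no default. The sketch below uses the flags that appear in the canonical run commands in this file; the actual parsers in the entrypoints may differ, so `build_parser` should be read as an assumption rather than the real code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser with the documented flags and no default output path."""
    parser = argparse.ArgumentParser(description="Bundle generator (illustrative)")
    # Required and without a default, so the run fails fast instead of
    # silently writing to ./local_output or any other fallback location.
    parser.add_argument("--output-dir", required=True)
    parser.add_argument("--enable-discovery", action="store_true")
    parser.add_argument("--enable-fuzzy", action="store_true")
    parser.add_argument("--recursive", action="store_true")
    parser.add_argument("--limit", type=int, default=None)
    parser.add_argument("--limit-per-bundle", type=int, default=None)
    return parser
```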
|
| 152 |
+
|
| 153 |
+
### 6. TRUTH IN TELEMETRY
|
| 154 |
+
|
| 155 |
+
- Do **NOT** print "✅" or claim a file exists unless you have successfully run `ls -lh [EXACT_PATH]` and seen the output
|
| 156 |
+
- Do **NOT** hallucinate filenames, timestamps, or success messages
|
| 157 |
+
- All claims must be verifiable with concrete command output
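
When `ls -lh` output is not at hand, the same facts can be gathered in Python before any success message is printed. This is a sketch under assumptions: `describe_output` and its status strings are hypothetical, not part of the existing tooling.

```python
from datetime import datetime, timezone
from pathlib import Path


def describe_output(path: str) -> str:
    """Return a verifiable status line: existence, size in bytes, ISO timestamp."""
    p = Path(path)
    if not p.is_file():
        return f"NOT FOUND: {path}"
    stat = p.stat()
    # Report the modification time as a full ISO 8601 timestamp in UTC.
    ts = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat()
    status = "OK" if stat.st_size > 0 else "EMPTY"
    return f"{status}: {path} size={stat.st_size}B mtime={ts}"
```

Only a line beginning with `OK` would justify a success claim; `EMPTY` and `NOT FOUND` should be reported as they are.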
|
| 158 |
+
|
| 159 |
+
---
|
| 160 |
+
|
| 161 |
+
## ANTI-HALLUCINATION PROTOCOL (MANDATORY)
|
| 162 |
+
|
| 163 |
+
1. **RAG (Retrieval-Augmented Generation):** Use `FileResolutionBridge` for all file resolution. Never assume file locations.
|
| 164 |
+
2. **Chain of Thought:** Show step-by-step logic before conclusions. Document your reasoning.
|
| 165 |
+
3. **Chain of Verification:** Validate bundle existence before claiming success. Run `ls -lh` on claimed outputs.
|
| 166 |
+
4. **Specificity:** Detailed context only - no generic statements like "the system works" or "files were processed."
|
| 167 |
+
5. **Role Assignment:** Respect component expertise boundaries. Don't modify code outside your assigned area.
|
| 168 |
+
6. **Require Sources:** Verify sources for all evidence claims. Cite line numbers and file paths.
|
| 169 |
+
7. **Advanced Models:** Use `EnhancedEmbeddingFeatures` appropriately for metadata enrichment.
|
| 170 |
+
8. **Confidence Levels:** Score reliability (80% minimum threshold). Mark outputs below this as UNVERIFIED.
|
| 171 |
+
9. **Multiple Models:** Use correct model for task - Legacy vs Discovery vs Enrichment.
|
| 172 |
+
10. **Lower Temperature:** Use deterministic config for reproducibility (temperature ≤ 0.3).
|
| 173 |
+
11. **External Fact-Checking:** Use court compliance requirements in `config/bundle_compliance.json` as **reference only** — for formatting and content checks. Do **NOT** use them as a gate that blocks discovery, embedding, or bundle output (see section COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK).
|
| 174 |
+
12. **Confidence Threshold:** 80% minimum. If below, mark all output as **UNVERIFIED** and request human review.
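
The 80% rule in items 8 and 12 can be expressed as a small labelling helper. The name `label_output` and the wording of the label are assumptions; how the confidence score itself is computed is outside this sketch.

```python
CONFIDENCE_THRESHOLD = 0.80  # minimum reliability stated in the protocol


def label_output(text: str, confidence: float) -> str:
    """Prefix output with UNVERIFIED when confidence is below the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < CONFIDENCE_THRESHOLD:
        return f"UNVERIFIED (confidence {confidence:.0%}, human review requested): {text}"
    return text
```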
|
| 175 |
+
|
| 176 |
+
---
|
| 177 |
+
|
| 178 |
+
## ROBUST CODE STANDARDS (MANDATORY)
|
| 179 |
+
|
| 180 |
+
1. **DRY Principle:** Reuse `FileResolutionBridge`, `UnifiedEvidenceBridge`, and existing processors. Never duplicate logic.
|
| 181 |
+
2. **Extensible:** Architecture must allow future compliance features without refactoring.
|
| 182 |
+
3. **Modular:** Isolated, testable changes only. One responsibility per function/class.
|
| 183 |
+
4. **Non-breaking:** Preserve original functionality. Never remove features without explicit permission.
|
| 184 |
+
5. **Configurable:** Use feature flags (e.g., `--enable-discovery`, `--enable-fuzzy`) for new logic.
|
| 185 |
+
6. **Reusable:** Logic must work with any evidence list, not hardcoded to specific files.
|
| 186 |
+
7. **Refactor:** Improve architecture - do not patch over problems.
|
| 187 |
+
8. **Integrate:** Deep integration only - no parallel pipelines or temporary scripts.
|
| 188 |
+
9. **NO STANDALONE:** No `temp_fix.py`, `wrapper_v2.py`, or "quick fixes" allowed.
|
| 189 |
+
|
| 190 |
+
10. **Fix in place:** Do not create parallel or temporary scripts. Fix the files in place.
|
| 191 |
+
|
| 192 |
+
11. **Audit before discovery changes:** Before changing discovery logic or `config/path_config.py`, run `python3 tools/audit_runtime_blockers.py` (from courtBundleGenerator2). Fix any reported blockers first.
|
| 193 |
+
|
| 194 |
+
---
|
| 195 |
+
|
| 196 |
+
## COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK (HISTORICAL LESSON)
|
| 197 |
+
|
| 198 |
+
**What went wrong:** A compliance system was introduced to stop agents from making wrong changes to wrong files. Instead, agents enforced it in a way that **blocked** full evidence searches, **blocked** embedding, **blocked** bundle output, and reintroduced whitelist directories, blind spots, and incorrect validation — stalling the project for months. Compliance is now **bypassed by design** so the pipeline can run.
|
| 199 |
+
|
| 200 |
+
**Rule — nothing may block the pipeline:**
|
| 201 |
+
|
| 202 |
+
- **Compliance bypass is intentional.** Do **NOT** re-enable compliance enforcers (e.g. `_create_compliance_enforcer` returning a real enforcer) that block discovery, embedding, or bundle generation. Do **NOT** add checks that prevent the bundler from running, from doing a full evidence scan, or from writing PDFs.
|
| 203 |
+
- **Verification and validation in this document** mean **post-hoc checks only**: run the page verifier *after* bundles are generated, run `ls -lh` on outputs, check `missing_evidence_summary.json`. They do **NOT** mean gating or blocking that stops the pipeline before or during a run.
|
| 204 |
+
- **PROHIBITED:** Do **NOT** add validation, verification, or compliance logic that: (1) blocks the pipeline from starting or continuing, (2) restricts evidence search to a subset of directories, (3) prevents embedding of found files, (4) prevents bundle output, or (5) reintroduces allow-lists/whitelists for discovery. Use `config/bundle_compliance.json` and similar as **reference only** (e.g. for formatting rules), not as a gate that stops execution.
|
| 205 |
+
- **If in doubt:** The pipeline must always be able to run a full evidence scan and produce bundle output. Any change that would prevent that is a violation of this rule.
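
A post-hoc, non-blocking check of the kind these rules allow might be structured as below. `run_advisory_checks` and the shape of `checks` are hypothetical; the point of the sketch is that findings are collected and returned, and nothing raises or exits.

```python
from typing import Callable


def run_advisory_checks(checks: dict[str, Callable[[str], bool]], bundle_path: str) -> list[str]:
    """Run checks after a bundle exists and collect findings without stopping."""
    findings: list[str] = []
    for name, check in checks.items():
        try:
            passed = check(bundle_path)
        except Exception as exc:
            # A broken check is itself a finding; it must not halt anything.
            findings.append(f"{name}: check errored ({exc})")
            continue
        if not passed:
            findings.append(f"{name}: failed")
    return findings
```

Because the function only returns a list, the caller can log or archive the findings while the bundle output stays in place.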
|
| 206 |
+
|
| 207 |
+
---
|
| 208 |
+
|
| 209 |
+
1) Discover and map the existing chain (no assumptions)
|
| 210 |
+
- Identify the relevant existing modules, functions, and configs.
|
| 211 |
+
- Show the current path from entry point to output before your change.
|
| 212 |
+
|
| 213 |
+
2) Design in terms of the full chain
|
| 214 |
+
- Explain which existing components you will reuse.
|
| 215 |
+
- Identify where you will insert or adjust logic (with file/line references).
|
| 216 |
+
|
| 217 |
+
3) Implement with zero stubs
|
| 218 |
+
- Do not leave `pass` statements, unimplemented placeholders, or fake logic.
|
| 219 |
+
- All new code must be exercised by at least one CLI or test command.
|
| 220 |
+
|
| 221 |
+
4) Prove wiring and lifecycle
|
| 222 |
+
- Show definition, call sites, downstream calls, and execution commands.
|
| 223 |
+
- Show log snippets or test output confirming actual execution.
|
| 224 |
+
|
| 225 |
+
5) Call out any gaps
|
| 226 |
+
- If any step is blocked by missing files, invalid data, or broken legacy imports, explicitly call it out and provide remediation steps.
|
| 227 |
+
|
| 228 |
+
Only then may you state that a change is complete.
|
| 229 |
+
|
| 230 |
+
## ENFORCEMENT BLOCKING RULES
|
| 231 |
+
|
| 232 |
+
### Source of Truth
|
| 233 |
+
|
| 234 |
+
- This file (`memory-bank/CRITICAL_INSTRUCTIONS.md`) is the **source of truth**
|
| 235 |
+
- All other documentation defers to this file
|
| 236 |
+
- Conflicts between this file and other docs → this file wins
|
| 237 |
+
|
| 238 |
+
### Protected Files (Require Explicit Permission)
|
| 239 |
+
|
| 240 |
+
**NEVER use `cp`, `mv`, or `backup` on these files:**
|
| 241 |
+
|
| 242 |
+
- `enhanced_bundler_wrapper.patched.py`
|
| 243 |
+
- `create_proper_embedded_bundle.py`
|
| 244 |
+
- `generate_bundles_final_corrected.py`
|
| 245 |
+
- `dual_category_evidence_processor.py`
|
| 246 |
+
- `courtBundleGenerator2_restored/legacy_files/categorize_and_append_v2.py`
|
| 247 |
+
- `categorize_and_append_v2.py`
|
| 248 |
+
|
| 249 |
+
**For other files:** ASK PERMISSION IN CAPITALS before any `cat` command that overwrites existing content.
|
| 250 |
+
|
| 251 |
+
**For critical files:** Only **append**, never overwrite completely.
|
| 252 |
+
|
| 253 |
+
---
|
| 254 |
+
|
| 255 |
+
## 2. COMPLIANCE & ENFORCEMENT PROTOCOLS (UPDATED)
|
| 256 |
+
|
| 257 |
+
### A. THE "NO BLINDING" RULE (Evidence Access)
|
| 258 |
+
|
| 259 |
+
**CRITICAL:** Security boundaries must **NEVER** prevent the discovery of evidence.
|
| 260 |
+
|
| 261 |
+
- **Rule:** If a script encounters a file in a non-standard path (e.g., `/evidence_external` or a deeply nested subfolder), it must **WARN** but **PROCESS IT**.
|
| 262 |
+
- **Prohibited:** `sys.exit()` or `return False` on directory validation errors.
|
| 263 |
+
- **Required:** Log `[WARNING] Path outside standard root: {path} - PROCESSING ANYWAY`.
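
The warn-but-process behaviour can be sketched as follows. `accept_path` is a hypothetical helper, and the logging format string is an assumption chosen so that the emitted line matches the required `[WARNING] ...` text. `Path.is_relative_to` needs Python 3.9 or later.

```python
import logging
from pathlib import Path

logging.basicConfig(format="[%(levelname)s] %(message)s")
logger = logging.getLogger("discovery")


def accept_path(path: str, standard_root: str) -> Path:
    """Warn when a path lies outside the standard root, but always return it."""
    resolved = Path(path).resolve()
    root = Path(standard_root).resolve()
    if not resolved.is_relative_to(root):
        logger.warning("Path outside standard root: %s - PROCESSING ANYWAY", resolved)
    # Never return False or exit here: the file is processed either way.
    return resolved
```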
|
| 264 |
+
|
| 265 |
+
### B. THE "SHOW YOUR WORK" RULE (Task Completion)
|
| 266 |
+
|
| 267 |
+
**CRITICAL:** You are forbidden from claiming "Fixed" or "Complete" until you:
|
| 268 |
+
|
| 269 |
+
1. **EXECUTE** the code (traceable via `gb3_deps.json`).
|
| 270 |
+
2. **VERIFY** the output using the Mandatory Verifier:
|
| 271 |
+
|
| 272 |
+
```bash
|
| 273 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && \
|
| 274 |
+
python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput
|
| 275 |
+
```
|
| 276 |
+
|
| 277 |
+
3. **If pagination mismatches are reported** (Printed Page Number ≠ PDF Physical Page), run the Pagination Mismatch Analyzer to identify root cause:
|
| 278 |
+
|
| 279 |
+
```bash
|
| 280 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && \
|
| 281 |
+
python3 tools/pagination_mismatch_analyzer.py /home/mrdbo/court_data/2nd_CourtBundleOutput --json /home/mrdbo/court_data/2nd_CourtBundleOutput/diagnostics/pagination_mismatch_report.json
|
| 282 |
+
```
|
| 283 |
+
|
| 284 |
+
The analyzer classifies mismatch patterns and suggests the responsible script or function (e.g. `add_volume_pagination()`, `embed_evidence_with_metadata()`). Fix the root cause in place; do not add workarounds.
|
| 285 |
+
|
| 286 |
+
---
|
| 287 |
+
|
| 288 |
+
## ENTRY POINTS & VERIFICATION
|
| 289 |
+
|
| 290 |
+
### Before ANY Run (NON-NEGOTIABLE)
|
| 291 |
+
|
| 292 |
+
1. **ASK IN CAPITALS** which entrypoint we are using:
|
| 293 |
+
- `enhanced_bundler_wrapper.patched.py` (Project 2)
|
| 294 |
+
- `generate_bundles_final_corrected.py` (Project 3)
|
| 295 |
+
- `create_proper_embedded_bundle.py` (direct bundler)
|
| 296 |
+
|
| 297 |
+
2. **RUN `python3 <ENTRYPOINT> -h`** and paste the output
|
| 298 |
+
|
| 299 |
+
3. **USE ONLY FLAGS** explicitly shown in that `-h` output
|
| 300 |
+
|
| 301 |
+
4. If required policy flags are missing from argparse, **ADD THEM FIRST** (then re-run `-h` and paste it). Do not run unsupported flags.
|
| 302 |
+
|
| 303 |
+
### Canonical Run Commands
|
| 304 |
+
|
| 305 |
+
**Project 2 Wrapper:**
|
| 306 |
+
|
| 307 |
+
```bash
|
| 308 |
+
source court_venv_20250802/bin/activate && \
|
| 309 |
+
python3 -u enhanced_bundler_wrapper.patched.py \
|
| 310 |
+
--output-dir /home/mrdbo/court_data/CourtBundleOutput \
|
| 311 |
+
--enable-discovery \
|
| 312 |
+
--enable-fuzzy \
|
| 313 |
+
--recursive \
|
| 314 |
+
--limit 15 \
|
| 315 |
+
--limit-per-bundle 5 \
|
| 316 |
+
2>&1 | tee -a telemetry.log
|
| 317 |
+
```
|
| 318 |
+
|
| 319 |
+
**Project 3 Generator:**
|
| 320 |
+
|
| 321 |
+
```bash
|
| 322 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && \
|
| 323 |
+
source court_venv_20250802/bin/activate && \
|
| 324 |
+
python3 -u generate_bundles_final_corrected.py \
|
| 325 |
+
--output-dir /home/mrdbo/court_data/2nd_CourtBundleOutput \
|
| 326 |
+
--enable-discovery \
|
| 327 |
+
--recursive \
|
| 328 |
+
--limit 15 \
|
| 329 |
+
--limit-per-bundle 5 \
|
| 330 |
+
2>&1 | tee -a telemetry.log
|
| 331 |
+
```
|
| 332 |
+
|
| 333 |
+
**Full Integration Test:**
|
| 334 |
+
|
| 335 |
+
```bash
|
| 336 |
+
# 3) Compile check
|
| 337 |
+
python3 -m py_compile /home/mrdbo/projects/courtBundleGenerator3/generate_bundles_final_corrected.py
|
| 338 |
+
echo "exit_code=$?"
|
| 339 |
+
|
| 340 |
+
cat jira_adapter.py
|
| 342 |
+
# Observed output: cat: jira_adapter.py: No such file or directory
|
| 343 |
+
|
| 344 |
+
|
| 345 |
+
rg -n "missing 2 required positional arguments|UnboundLocalError|ERR_CLOSED_WRITER|close\(\) was called|TypeError: expected str, bytes|SKIP TOC ROW" /tmp/gb3_run.log
|
| 346 |
+
|
| 347 |
+
```
|
| 348 |
+
|
| 349 |
+
---
|
| 350 |
+
|
| 351 |
+
## VERIFICATION LOOP (AFTER EVERY CHANGE)
|
| 352 |
+
|
| 353 |
+
### Mandatory Verification Steps
|
| 354 |
+
|
| 355 |
+
1. Run the appropriate canonical command (see above)
|
| 356 |
+
2. Check for Chain-of-Verification logs:
|
| 357 |
+
- `AntiHallucinationManager.__init__`
|
| 358 |
+
- `EnhancedFuzzyResolver.resolve_evidence_paths`
|
| 359 |
+
- `UnifiedEvidenceBridge.get_unified_evidence`
|
| 360 |
+
- **Missing logs = broken integration chain → STOP and FIX**
|
| 361 |
+
3. Verify PDF generation:
|
| 362 |
+
|
| 363 |
+
```bash
|
| 364 |
+
ls -lh /home/mrdbo/court_data/2nd_CourtBundleOutput/court_bundle*.pdf   # Project 3 output
ls -lh /home/mrdbo/court_data/CourtBundleOutput/BUNDLE*.pdf             # Project 2 output
|
| 365 |
+
```
|
| 366 |
+
|
| 367 |
+
4. Check missing evidence report:
|
| 368 |
+
|
| 369 |
+
```bash
|
| 370 |
+
cat /home/mrdbo/court_data/CourtBundleOutput/missing_evidence_summary.json
|
| 371 |
+
```
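The Chain-of-Verification check in step 2 can be sketched in Python as below. This is a minimal illustration, not the project's own tooling: the helper name `missing_checkpoints` is hypothetical, and it assumes the three checkpoint names appear verbatim in the captured run log (for example `telemetry.log`).

```python
# Checkpoint names taken from the verification steps above.
CHECKPOINTS = (
    "AntiHallucinationManager.__init__",
    "EnhancedFuzzyResolver.resolve_evidence_paths",
    "UnifiedEvidenceBridge.get_unified_evidence",
)


def missing_checkpoints(log_text):
    """Return the checkpoint names that never appear in the log text.

    Assumption: each checkpoint is logged using its exact dotted name.
    """
    return [name for name in CHECKPOINTS if name not in log_text]
```

A non-empty result corresponds to the "missing logs = broken integration chain" condition, so the run should stop there.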
|
| 372 |
+
|
| 373 |
+
### Completion Gate (Run Proof Required)
|
| 374 |
+
|
| 375 |
+
**ABSOLUTE COMPLETION GATE** — You must not claim completion unless ALL are true:
|
| 376 |
+
|
| 377 |
+
- Bundles A–I (or chosen set) generated in the chosen output directory
|
| 378 |
+
- PDFs are non-empty
|
| 379 |
+
- Page-level verification run (see **AGENT AUDIT & VERIFICATION COMMANDS** below) confirms embedding completeness; 0 missing or fully enumerated with reason codes
|
| 380 |
+
- Missing evidence summary is empty OR missing is fully enumerated with reason codes
|
| 381 |
+
- TOC sync issues == 0; DB numbers present in TOC + evidence pages
|
| 382 |
+
- No raw paths on PDF pages (embedding failure); continue verification loop until 0 missing. Empirical analysis only; no guessing or assumptions.
|
| 383 |
+
- User confirms PDFs are correct when applicable
|
| 384 |
+
|
| 385 |
+
A run is **NOT accepted** unless you output:
|
| 386 |
+
|
| 387 |
+
- The exact command executed
|
| 388 |
+
- At least one generated PDF path with size + full ISO timestamp
|
| 389 |
+
- `missing_evidence_summary.json` status (empty array or specific list)
|
| 390 |
+
- Confirmation that previously missing files (e.g., from `Repairs` folder) are now embedded
|
| 391 |
+
|
| 392 |
+
**Note:** `./output` is a symlink to `/home/mrdbo/court_data/CourtBundleOutput`, and there is also a symlink to `/home/mrdbo/court_data/2nd_CourtBundleOutput`. Always use the **full path** in commands.
|
| 393 |
+
|
| 394 |
+
---
|
| 395 |
+
|
| 396 |
+
## IN-SITU PATCH FORMAT (REQUIRED FOR ALL CODE CHANGES)
|
| 397 |
+
|
| 398 |
+
When providing code changes, **ALWAYS** use this exact format:
|
| 399 |
+
|
| 400 |
+
```
|
| 401 |
+
FILE: /absolute/path/to/script.py
|
| 402 |
+
LOCATION: Inside ClassName.method_name() at line ~XX
|
| 403 |
+
|
| 404 |
+
--- CODE ABOVE (3-5 lines context) ---
|
| 405 |
+
def method_name(self, param):
|
| 406 |
+
existing_variable = some_value
|
| 407 |
+
current_logic_here()
|
| 408 |
+
|
| 409 |
+
--- CHANGES ---
|
| 410 |
+
[ ] DELETE: current_logic_here()
|
| 411 |
+
|
| 412 |
+
[+] INSERT AFTER "existing_variable = some_value":
|
| 413 |
+
new_logic_here()
|
| 414 |
+
proper_implementation()
|
| 415 |
+
|
| 416 |
+
[>] OVERWRITE (if replacing lines):
|
| 417 |
+
OLD: current_logic_here()
|
| 418 |
+
NEW: new_logic_here()
|
| 419 |
+
|
| 420 |
+
--- CODE BELOW (3-5 lines context) ---
|
| 421 |
+
return final_result
|
| 422 |
+
|
| 423 |
+
--- VERIFICATION ---
|
| 424 |
+
Run: python3 -c "from script_name import ClassName; ClassName().method_name('test')"
|
| 425 |
+
Expected: [Specific expected output or "no errors"]
|
| 426 |
+
```
|
| 427 |
+
|
| 428 |
+
### Why This Format?
|
| 429 |
+
|
| 430 |
+
- **Unambiguous location** - exact file path and context lines
|
| 431 |
+
- **Clear changes** - DELETE/INSERT/OVERWRITE are explicit
|
| 432 |
+
- **Verifiable** - includes test command with expected output
|
| 433 |
+
- **No guessing** - human knows exactly where to apply changes
|
| 434 |
+
|
| 435 |
+
---
|
| 436 |
+
|
| 437 |
+
## ASKING FOR MISSING INFORMATION (FORMALIZED PROTOCOL)
|
| 438 |
+
|
| 439 |
+
When you need information to proceed, use this **exact format**:
|
| 440 |
+
|
| 441 |
+
```
|
| 442 |
+
BLOCKED: [Specific blocker - be precise]
|
| 443 |
+
|
| 444 |
+
REQUIRED INFORMATION:
|
| 445 |
+
1. [Exact command to run]
|
| 446 |
+
Example: Run: find /home/mrdbo/court_data -name "*resolution*" -type f
|
| 447 |
+
|
| 448 |
+
2. [Exact file/section to show]
|
| 449 |
+
Example: Show: Lines 50-70 of enhanced_bundler_wrapper.patched.py
|
| 450 |
+
|
| 451 |
+
3. [Exact error message to paste]
|
| 452 |
+
Example: Paste: Full traceback from last run of generate_bundles_final_corrected.py
|
| 453 |
+
|
| 454 |
+
CANNOT PROCEED UNTIL: [Specific data needed]
|
| 455 |
+
Example: "Confirming FileResolutionBridge exists and its import path"
|
| 456 |
+
```
|
| 457 |
+
|
| 458 |
+
### What NOT to do
|
| 459 |
+
|
| 460 |
+
- ❌ "Can you check if the file exists?"
|
| 461 |
+
- ❌ "I think there might be an issue..."
|
| 462 |
+
- ❌ "Please verify the paths"
|
| 463 |
+
|
| 464 |
+
### What TO do
|
| 465 |
+
|
| 466 |
+
- ✅ "Run: ls -lh /home/mrdbo/projects/courtBundleGenerator2/file_resolution_bridge.py"
|
| 467 |
+
- ✅ "Show: Lines containing 'class FileResolutionBridge' in file_resolution_bridge.py"
|
| 468 |
+
- ✅ "Paste: Output of python3 -c 'import file_resolution_bridge; print(dir(file_resolution_bridge))'"
|
| 469 |
+
|
| 470 |
+
---
|
| 471 |
+
|
| 472 |
+
## DISCOVERY & FILE RESOLUTION
|
| 473 |
+
|
| 474 |
+
**See also:** COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK. Do not add compliance or validation that restricts discovery, embedding, or bundle output.
|
| 475 |
+
|
| 476 |
+
### Scope (Unrestricted — no allow-list)
|
| 477 |
+
|
| 478 |
+
- **Root:** `/home/mrdbo/projects/courtBundleGenerator2/evidence` (Project 2) or `/home/mrdbo/projects/courtBundleGenerator3/evidence` (Project 3).
|
| 479 |
+
- **Policy:** Full RECURSIVE SCAN of the evidence root (`os.walk` or `Path().rglob()`). **All** subdirectories under the root must be discoverable. If a file exists under the evidence root, it must be findable.
|
| 480 |
+
- **PROHIBITED:** Do **NOT** restrict discovery to a fixed list of directories. Do **NOT** implement an allow-list or whitelist that excludes other evidence subdirectories. Do **NOT** add code that limits search to "only" certain folders — this has repeatedly caused blind spots (e.g. Repairs was blocked). Discovery for **finding** files must cover the **entire** evidence tree. (Other docs may refer to where to **write** or stage new evidence; that is separate. **Search/find** must never be restricted to a subset.)
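The unrestricted recursive scan described above could look roughly like the following sketch. The function names are hypothetical and are not taken from `config/path_config.py`; the only technique shown is `Path.rglob` over the whole evidence root with no directory allow-list.

```python
from pathlib import Path


def discover_evidence_files(evidence_root):
    """Return every regular file under evidence_root, at any depth."""
    root = Path(evidence_root)
    # rglob("*") walks the entire tree; no subdirectory is filtered out.
    return sorted(p for p in root.rglob("*") if p.is_file())


def find_by_name(evidence_root, filename):
    """Return all paths under evidence_root whose basename equals filename."""
    return [p for p in discover_evidence_files(evidence_root) if p.name == filename]
```

Because nothing is excluded by directory name, a file under `Repairs/` is found the same way as a file under any other subdirectory.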
|
| 481 |
+
|
| 482 |
+
### Blind-spot check (must not be excluded)
|
| 483 |
+
|
| 484 |
+
These locations have historically been missed when agents restricted search to a list; they are **examples of what must not be excluded**, not a list to restrict to:
|
| 485 |
+
|
| 486 |
+
- `/evidence/Repairs` (often wrongly excluded — MUST be findable)
|
| 487 |
+
- `/evidence/InputDocs`, `/evidence/new_evidence_staging`, `/evidence/00_CRITICAL_SCANNED`, `/evidence/00_CRITICAL_INTAKE`, `/legal_emails`, `/docs` — and **any other subdirectory under the evidence root**.
|
| 488 |
+
|
| 489 |
+
If a script reports "File not found" for a file that **exists on disk** under the evidence root (e.g. under `/evidence/Repairs`), the discovery logic is **BROKEN** — usually because it was restricted to a subset of directories. Fix by ensuring the **whole** evidence root is scanned, not by adding one more directory to a list.
|
| 490 |
+
|
| 491 |
+
**Fix immediately:**
|
| 492 |
+
|
| 493 |
+
1. Check `config/path_config.py::get_authoritative_discovery_roots()` — it must return **all** evidence subdirectories (or the root only so recursive scan finds everything).
|
| 494 |
+
2. Do **not** reduce the set to a "required" or "approved" subset.
|
| 495 |
+
3. Run audit: `python3 tools/audit_runtime_blockers.py`
|
| 496 |
+
|
| 497 |
+
---
|
| 498 |
+
|
| 499 |
+
## COMPLETION GATES (PROOF OF SUCCESS)
|
| 500 |
+
|
| 501 |
+
**ABSOLUTE COMPLETION GATE** — Same as "Completion Gate (Run Proof Required)" above. Run the appropriate page-level verifier from **AGENT AUDIT & VERIFICATION COMMANDS** (P3: `pdf_page_verifier_enhanced.py`; P2: `embedding_utils/pdf_page_verifier.py`). Empirical analysis only; continue until 0 missing.
|
| 502 |
+
|
| 503 |
+
A task is **NOT COMPLETE** unless ALL of the following are true:
|
| 504 |
+
|
| 505 |
+
1. **Zero Missing Files:**
|
| 506 |
+
- `missing_evidence_summary.json` contains `[]` (empty array)
|
| 507 |
+
- OR `missing_count: 0` appears in logs
|
| 508 |
+
- **False positive check:** If any files from `Repairs/` or other known directories are still missing, the task FAILED
|
| 509 |
+
|
| 510 |
+
2. **PDF Generation:**
|
| 511 |
+
- Non-empty PDF files exist in output directory
|
| 512 |
+
- Run: `ls -lh /home/mrdbo/court_data/CourtBundleOutput/*.pdf`
|
| 513 |
+
- Run: `ls -lh /home/mrdbo/court_data/2nd_CourtBundleOutput/*.pdf`
|
| 514 |
+
- Verify file sizes > 0 bytes
|
| 515 |
+
|
| 516 |
+
3. **Specific Proof (for previously blind files):**
|
| 517 |
+
- Confirm files like `Faulty_Fire_alarm_control_system...jpg` are embedded
|
| 518 |
+
- Check PDF page count matches expected evidence count
|
| 519 |
+
- Verify TOC includes all expected sections
|
| 520 |
+
|
| 521 |
+
4. **No False Positives:**
|
| 522 |
+
- Reporting "Success ✅" while `missing_files > 0` is a **CRITICAL FAILURE**
|
| 523 |
+
- Agent must re-run verification and fix before claiming success
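Gates 1 and 2 above can be checked mechanically. The sketch below is one possible way to do it, under two assumptions that the source does not confirm for every project: that `missing_evidence_summary.json` sits in the output directory, and that it is a top-level JSON array. The helper name `check_completion_gate` is hypothetical.

```python
import json
from pathlib import Path


def check_completion_gate(output_dir):
    """Return a list of failure messages; an empty list means both gates pass."""
    out = Path(output_dir)
    failures = []

    # Gate: at least one PDF exists and none of them is zero bytes.
    pdfs = sorted(out.glob("*.pdf"))
    if not pdfs:
        failures.append("no PDF files in output directory")
    for pdf in pdfs:
        if pdf.stat().st_size == 0:
            failures.append(f"empty PDF: {pdf.name}")

    # Gate: the missing-evidence summary must be an empty array.
    summary = out / "missing_evidence_summary.json"
    if not summary.is_file():
        failures.append("missing_evidence_summary.json not found")
    else:
        missing = json.loads(summary.read_text(encoding="utf-8"))
        if missing != []:
            failures.append(f"missing evidence entries: {len(missing)}")
    return failures
```

This does not replace the page-level verifier; it only guards against reporting success while files are still missing.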
|
| 524 |
+
|
| 525 |
+
---
|
| 526 |
+
|
| 527 |
+
## REQUIRED STEPS FOR EVERY CHANGE
|
| 528 |
+
|
| 529 |
+
1. **Update Documentation:**
|
| 530 |
+
- Add entry to relevant `memory-bank/*` file
|
| 531 |
+
- Include `Current State (YYYY-MM-DD)` section
|
| 532 |
+
- Document what changed and why
|
| 533 |
+
|
| 534 |
+
2. **Make Code Edit:**
|
| 535 |
+
- Use IN-SITU PATCH FORMAT
|
| 536 |
+
- Include inline FIX notes (e.g., `# FIX: create_proper_embedded_bundle.py:2882`)
|
| 537 |
+
- Reference file paths and line numbers
|
| 538 |
+
|
| 539 |
+
3. **Run Verification Command:**
|
| 540 |
+
- Use appropriate canonical command
|
| 541 |
+
- Archive stdout/stderr alongside the PDF artifact path in your notes or session summary (traceability)
|
| 542 |
+
- Save PDF artifact path
|
| 543 |
+
|
| 544 |
+
4. **Confirm Output:**
|
| 545 |
+
- Run: `ls -lh /home/mrdbo/court_data/CourtBundleOutput/court_bundle*.pdf`
|
| 546 |
+
- Attach snippet to report
|
| 547 |
+
- Verify file sizes and timestamps
|
| 548 |
+
|
| 549 |
+
5. **Summarize Changes:**
|
| 550 |
+
- What changed
|
| 551 |
+
- Which files were touched
|
| 552 |
+
- Which Chain-of-Verification checkpoints fired
|
| 553 |
+
- Any new issues discovered
|
| 554 |
+
|
| 555 |
+
---
|
| 556 |
+
|
| 557 |
+
## PENALTY SYSTEM
|
| 558 |
+
|
| 559 |
+
### Violations & Consequences
|
| 560 |
+
|
| 561 |
+
| Violation | Consequence | Recovery |
|
| 562 |
+
|-----------|-------------|----------|
|
| 563 |
+
| **Broken Code Provided** | Session ends immediately, all output invalidated | Start fresh session, provide working code |
|
| 564 |
+
| **Placeholder Code** | Task rejected, must re-plan | Implement full logic or request help |
|
| 565 |
+
| **Hallucinated Files/Success** | Confidence score → 0%, all claims invalidated | Re-verify everything with `ls` commands |
|
| 566 |
+
| **Skipped Verification** | All subsequent output marked UNVERIFIED | Run full verification loop, provide proof |
|
| 567 |
+
| **Assumed File Exists** | Must re-verify with explicit commands | Show actual file contents or command output |
|
| 568 |
+
| **Bypassing Rules** | Session paused, requires explicit re-authorization | Acknowledge violation, commit to rules |
|
| 569 |
+
| **Re-enabling compliance/validation that blocks pipeline** | Session paused; change reverted | Compliance bypass is intentional; do not add checks that block discovery, embedding, or bundle output |
|
| 570 |
+
|
| 571 |
+
### Escalation
|
| 572 |
+
|
| 573 |
+
- **First violation:** Warning + correction required
|
| 574 |
+
- **Second violation:** Session reset, start from verification
|
| 575 |
+
- **Third violation:** Task marked FAILED, human intervention required
|
| 576 |
+
|
| 577 |
+
---
|
| 578 |
+
|
| 579 |
+
## EXHIBIT & DB REFERENCE SYNC (COURT FORMAT)
|
| 580 |
+
|
| 581 |
+
- **Rule:** DB numbers without filename and without bundle initial letter+number are **not adequate** for any document receiving amendments, edits, or insertions. All such documents must use **Exhibit [Letter][Number] (DB-[N]) — [Filename]**.
|
| 582 |
+
- **Exhibit number** = [Bundle letter][Sequential] (e.g. A15, G7). Set only in the bundler; never overwritten by the DB registry.
|
| 583 |
+
- **DB reference** = DB-[N] (e.g. DB-125). Set in `lib/db_registry.py`; never used as the exhibit number.
|
| 584 |
+
- **lib/db_registry.py:** Writes only `dbReference` / `DB Reference` / `DB_Reference`. Does **not** set `exhibitNo` or `Exhibit No.` to a DB value. Provides `sync_exhibit_db_references()` to fill DB refs without touching exhibit numbers.
|
| 585 |
+
- **generate_bundles_final_corrected.py (P3):** Uses `bundle_exhibit_no` for `exhibitNo` / `Exhibit No.` and `db_ref` from registry for `dbReference`. TOC row update sets only `dbReference`, not `Exhibit No.`.
|
| 586 |
+
- **Authoritative DB list:** `legal_emails/Phase8/DB_Evidence_List.txt` (DB1–DB170). Bundle letter assignment from `BUNDLE_GROUPS_WITH_FULL_EVIDENCE_FILE_NAMES.md`.
|
| 587 |
+
- **Legal documents (witness statement, N244, SRA):** Reference evidence as **Exhibit [Letter][Number] (DB-[N]) — [Filename]**. Do not use bare "DB-[●]". See `PROMPTS/EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `PROMPTS/LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `PROMPTS/HOW_TO_MAKE_AGENTS_AWARE.md`. Cursor rule in `.cursorrules`; Moltbot/Qwen clients should send the instruction as system message (fetch from Engine `GET /prompts/legal-exhibit-instruction` when in legal-document mode).
|
| 588 |
+
- **Status (amendments / edit sources):** See **`PROMPTS/STATUS_EXHIBIT_AND_EDIT_SOURCES.md`** — what is in sync, what edit sources (AI Advisor, Moltbot, Qwen) must do to output the full format for flag updates/amendments/inserts.
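As an illustration of the required reference format, the sketch below builds and validates a string of the form shown in the rule above. Both function names are hypothetical and are not part of `lib/db_registry.py` or the bundler; the long dash in the format is written as the escape `\u2014`.

```python
import re

DASH = "\u2014"  # the long dash used in the required exhibit format

# Exhibit [Letter][Number] (DB-[N]) <dash> [Filename]
EXHIBIT_RE = re.compile(r"^Exhibit ([A-Z])(\d+) \(DB-(\d+)\) " + DASH + r" (.+)$")


def format_exhibit_reference(bundle_letter, sequence, db_number, filename):
    """Build the full reference: exhibit number, DB reference, and filename."""
    return f"Exhibit {bundle_letter}{sequence} (DB-{db_number}) {DASH} {filename}"


def is_full_exhibit_reference(text):
    """True only when exhibit number, DB reference, and filename are all present."""
    return EXHIBIT_RE.match(text) is not None
```

A bare `DB-125` or `Exhibit DB-125` fails the check, which matches the rule that DB numbers alone are not adequate.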
|
| 589 |
+
|
| 590 |
+
---
|
| 591 |
+
|
| 592 |
+
## CURRENT PROJECT STATE (2026-02-13)
|
| 593 |
+
|
| 594 |
+
### Recent Changes
|
| 595 |
+
|
| 596 |
+
- **2026-02-13:** Exhibit/DB sync: db_registry no longer overwrites exhibitNo; bundler uses bundle_exhibit_no for exhibit, db_ref for DB only; PROMPTS for legal document referencing (LEGAL_WRITING_EXHIBIT_INSTRUCTION, EXHIBIT_REFERENCING_FOR_LEGAL_DOCS, HOW_TO_MAKE_AGENTS_AWARE); .cursorrules legal exhibit rule; moltbot-hybrid-engine `GET /prompts/legal-exhibit-instruction`.
|
| 597 |
+
- **2026-02-13:** Cross-project impact audit: `tools/cross_project_impact_audit.py` supports `--entry project:path` for runtime-chain focus (BFS from entrypoints across projects).
|
| 598 |
+
- **2026-02-06:** Desktop space converted from CLI to FastAPI web server (`app.py`), Dockerfile v2.0
|
| 599 |
+
- **2026-02-06:** DBRegistry import guarded with double-fallback + `HAS_DB_REGISTRY` in Desktop
|
| 600 |
+
- **2026-02-06:** Hybrid Engine space deployed with Dockerfile v4.0 (Dev Mode compatible); added `prompts/legal_exhibit_instruction.txt` and GET `/prompts/legal-exhibit-instruction`
|
| 601 |
+
- **2026-02-06:** `start.sh` v3.2 for Hybrid Engine — installs Ollama at runtime, pulls Qwen 2.5 in background
|
| 602 |
+
- **2026-02-06:** Automated sync system created: `sync_to_desktop.sh` + `install_sync_hook.sh`
|
| 603 |
+
- **2026-02-06:** `link_validator.py` recovered from git in P3
|
| 604 |
+
- **2026-01-22:** Unrestricted discovery enabled in `config/path_config.py`
|
| 605 |
+
- **2026-01-15:** PDF verification telemetry added to `embedding_utils/telemetry.py`
|
| 606 |
+
- **2026-01-14:** Effective limit computation fixed in `enhanced_bundler_wrapper.patched.py`
|
| 607 |
+
|
| 608 |
+
### Active Issues
|
| 609 |
+
|
| 610 |
+
- [ ] Finish lazy-import plan for `DualCategoryEvidenceProcessor` to prevent circular imports
|
| 611 |
+
- [ ] Ensure `EnhancedEmbeddingFeatures` uses prompt system singleton consistently
|
| 612 |
+
- [ ] Verify all files in `/evidence/Repairs` are discoverable
|
| 613 |
+
- [ ] Configure Jira integration (currently missing JIRA_URL, JIRA_EMAIL, JIRA_TOKEN env vars)
|
| 614 |
+
- [ ] Verify Hybrid Engine Ollama/Qwen model pull completes on HF free tier
|
| 615 |
+
|
| 616 |
+
### Key Files
|
| 617 |
+
|
| 618 |
+
- `enhanced_bundler_wrapper.patched.py` - Main wrapper (Project 2)
|
| 619 |
+
- `create_proper_embedded_bundle.py` - Direct bundler
|
| 620 |
+
- `generate_bundles_final_corrected.py` - Main generator (Project 3)
|
| 621 |
+
- `lib/db_registry.py` - DB assignment; seeds from `legal_emails/Phase8/DB_Evidence_List.txt`; does not overwrite exhibitNo
|
| 622 |
+
- `cohesive_unified_evidence_processor.py` - Evidence processing
|
| 623 |
+
- `category_mapping.py` - Category classification
|
| 624 |
+
- `dual_category_evidence_processor.py` - Dual categorization
|
| 625 |
+
- `embedding_utils/prompt_system_integration.py` - Prompt system
|
| 626 |
+
- `config/path_config.py` - Discovery roots configuration
|
| 627 |
+
- `file_resolution_bridge.py` - File resolution with caching
|
| 628 |
+
- **PROMPTS:** `PROMPT_HEADER_13_12_25.md`, `EXHIBIT_REFERENCING_FOR_LEGAL_DOCS.md`, `LEGAL_WRITING_EXHIBIT_INSTRUCTION.md`, `HOW_TO_MAKE_AGENTS_AWARE.md`
|
| 629 |
+
- **Audit:** `tools/cross_project_impact_audit.py` (optional `--entry` for runtime chain), `tools/audit_active_bundling_files.py` (live chain → gb3_deps.json)
|
| 630 |
+
|
| 631 |
+
---
|
| 632 |
+
|
| 633 |
+
## IMMEDIATE OBJECTIVES
|
| 634 |
+
|
| 635 |
+
1. **Complete Discovery Verification:**
|
| 636 |
+
- Confirm all `/evidence/Repairs` files are found
|
| 637 |
+
- Run: `python3 tools/audit_discovery_coverage.py`
|
| 638 |
+
- Fix any remaining blind spots
|
| 639 |
+
|
| 640 |
+
2. **Eliminate Circular Imports:**
|
| 641 |
+
- Finish lazy-import for `DualCategoryEvidenceProcessor`
|
| 642 |
+
- Test: `python3 -c "import category_mapping; print('OK')"`
|
| 643 |
+
|
| 644 |
+
3. **Validate Compliance:**
|
| 645 |
+
- Run full integration test with all flags
|
| 646 |
+
- Verify zero missing evidence
|
| 647 |
+
- Confirm court compliance (TOC, pagination, exhibit numbers)
|
| 648 |
+
|
| 649 |
+
---
|
| 650 |
+
|
| 651 |
+
## REFERENCE ARCHITECTURE
|
| 652 |
+
|
| 653 |
+
### Chain of Verification Components
|
| 654 |
+
|
| 655 |
+
```
|
| 656 |
+
User Command
|
| 657 |
+
↓
|
| 658 |
+
enhanced_bundler_wrapper.patched.py (argparse + flags)
|
| 659 |
+
↓
|
| 660 |
+
AntiHallucinationManager.__init__ (initialize protocols)
|
| 661 |
+
↓
|
| 662 |
+
config/path_config.py::get_authoritative_discovery_roots() (get search paths)
|
| 663 |
+
↓
|
| 664 |
+
EnhancedFuzzyResolver.resolve_evidence_paths() (find files)
|
| 665 |
+
↓
|
| 666 |
+
UnifiedEvidenceBridge.get_unified_evidence() (consolidate evidence)
|
| 667 |
+
↓
|
| 668 |
+
create_proper_embedded_bundle.py (generate PDF)
|
| 669 |
+
↓
|
| 670 |
+
Verification: ls -lh <output_path>
|
| 671 |
+
↓
|
| 672 |
+
Verification: cat missing_evidence_summary.json
|
| 673 |
+
```
|
| 674 |
+
|
| 675 |
+
### Critical Integration Points
|
| 676 |
+
|
| 677 |
+
1. **Path Configuration** → `config/path_config.py`
|
| 678 |
+
2. **File Resolution** → `file_resolution_bridge.py` + `enhanced_fuzzy_filename_resolver.py`
|
| 679 |
+
3. **Evidence Consolidation** → `unified_evidence_bridge.py`
|
| 680 |
+
4. **Categorization** → `category_mapping.py` + `dual_category_evidence_processor.py`
|
| 681 |
+
5. **Bundle Generation** → `create_proper_embedded_bundle.py`
|
| 682 |
+
6. **Compliance Validation** → `config/bundle_compliance.json`
|
| 683 |
+
|
| 684 |
+
---
|
| 685 |
+
|
| 686 |
+
## NOTES FOR AGENT BUILDERS
|
| 687 |
+
|
| 688 |
+
### When Using This File for Gemini/Other Agents
|
| 689 |
+
|
| 690 |
+
1. **Agent Instructions (Main):** Use sections 1-3 (CRITICAL DIRECTIVES, ANTI-HALLUCINATION, ROBUST CODE STANDARDS) **and** COMPLIANCE, VERIFICATION & VALIDATION — MUST NOT BLOCK (compliance bypass is intentional; do not re-enable blocking).
|
| 691 |
+
|
| 692 |
+
2. **Session Prompt:** Use sections 4-6 (ENFORCEMENT, ENTRY POINTS, VERIFICATION)
|
| 693 |
+
|
| 694 |
+
3. **Task Execution:** Use sections 7-9 (PATCH FORMAT, ASKING PROTOCOL, DISCOVERY)
|
| 695 |
+
|
| 696 |
+
4. **Knowledge Base:** Use sections 10-13 (COMPLETION GATES, STEPS, PENALTIES, CURRENT STATE)
|
| 697 |
+
|
| 698 |
+
### Testing Protocol
|
| 699 |
+
|
| 700 |
+
```bash
|
| 701 |
+
# Test 1: Verify discovery coverage
|
| 702 |
+
python3 tools/audit_discovery_coverage.py
|
| 703 |
+
|
| 704 |
+
# Test 2: Test file resolution
|
| 705 |
+
python3 -c "from file_resolution_bridge import FileResolutionBridge; print(FileResolutionBridge().find_file('test.pdf'))"
|
| 706 |
+
|
| 707 |
+
# Test 3: Run with minimal flags
|
| 708 |
+
python3 enhanced_bundler_wrapper.patched.py --output-dir /home/mrdbo/court_data/CourtBundleOutput --limit 1
|
| 709 |
+
|
| 710 |
+
# Test 4: Verify output
|
| 711 |
+
ls -lh /home/mrdbo/court_data/CourtBundleOutput/*.pdf
|
| 712 |
+
cat /home/mrdbo/court_data/CourtBundleOutput/missing_evidence_summary.json
|
| 713 |
+
```
|
| 714 |
+
|
| 715 |
+
---
|
| 716 |
+
|
| 717 |
+
## AGENT AUDIT & VERIFICATION COMMANDS (ABSOLUTE TASK COMPLETION)
|
| 718 |
+
|
| 719 |
+
**Canonical reference:** `/home/mrdbo/projects/courtBundleGenerator2/AUDITING_COMMANDS_23_1_26.md` — agents MUST use these for verification loops, audits, and diagnostics before claiming task completion.
|
| 720 |
+
|
| 721 |
+
### Page-level verification (mandatory before claiming bundle success)
|
| 722 |
+
|
| 723 |
+
- **Project 3 (generate_bundles_final_corrected.py):**
|
| 724 |
+
|
| 725 |
+
```bash
|
| 726 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && python3 pdf_page_verifier_enhanced.py /home/mrdbo/court_data/2nd_CourtBundleOutput
|
| 727 |
+
```
|
| 728 |
+
|
| 729 |
+
- **Project 2 (enhanced_bundler_wrapper / create_proper_embedded_bundle):**
|
| 730 |
+
|
| 731 |
+
```bash
|
| 732 |
+
cd /home/mrdbo/projects/courtBundleGenerator2 && python3 embedding_utils/pdf_page_verifier.py /home/mrdbo/court_data/CourtBundleOutput
|
| 733 |
+
```
|
| 734 |
+
|
| 735 |
+
### Prevention & diagnostics
|
| 736 |
+
|
| 737 |
+
- **Audit prevention measures in codebase + optional bundle verify:**
|
| 738 |
+
|
| 739 |
+
```bash
|
| 740 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && python3 tools/audit_bundle_prevention.py
|
| 741 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && python3 tools/audit_bundle_prevention.py --verify-bundles /home/mrdbo/court_data/2nd_CourtBundleOutput
|
| 742 |
+
```
|
| 743 |
+
|
| 744 |
+
- **Test cloud download URLs:**
|
| 745 |
+
|
| 746 |
+
```bash
|
| 747 |
+
cd /home/mrdbo/projects/courtBundleGenerator3 && python3 tools/test_download_links.py
|
| 748 |
+
```
|
| 749 |
+
|
| 750 |
+
- **Runtime chain / dependency audit:**
|
| 751 |
+
|
| 752 |
+
```bash
|
| 753 |
+
cd /home/mrdbo/projects/courtBundleGenerator2 && python3 tools/audit_runtime_chain.py --root . --out code_analysis/Dec25/audit_runtime_report.json
|
| 754 |
+
cd /home/mrdbo/projects/courtBundleGenerator2 && python3 tools/audit_active_bundling_files.py --root . --entry /home/mrdbo/projects/courtBundleGenerator3/generate_bundles_final_corrected.py --out code_analysis/gb3_deps.json
|
| 755 |
+
```
|
| 756 |
+
|
| 757 |
+
- **Runtime blockers (before touching discovery):**
|
| 758 |
+
|
| 759 |
+
```bash
|
| 760 |
+
cd /home/mrdbo/projects/courtBundleGenerator2 && python3 tools/audit_runtime_blockers.py
|
| 761 |
+
```
|
| 762 |
+
|
| 763 |
+
### Completion gate (all must pass)
|
| 764 |
+
|
| 765 |
+
- Bundles generated in chosen output dir; PDFs non-empty.
|
| 766 |
+
- Page-level verification run (pdf_page_verifier_enhanced or embedding_utils/pdf_page_verifier) with 0 missing or fully enumerated.
|
| 767 |
+
- missing_evidence_summary.json empty or with reason codes.
|
| 768 |
+
- TOC sync issues == 0; DB numbers present in TOC and evidence pages.
|
| 769 |
+
- No raw paths on PDF pages (embedding failure); continue verification loop until 0 missing.
|
| 770 |
+
|
| 771 |
+
---
|
| 772 |
+
|
| 773 |
+
**END OF CRITICAL INSTRUCTIONS v6.0**
|
| 774 |
+
|
| 775 |
+
*Last verified: 2026-02-13*
|
| 776 |
+
*Next review: When major architectural changes occur or after 10 successful bundle generations*
|
| 777 |
+
|
| 778 |
+
# ------------------------------------------------------------------
|
| 779 |
+
|
| 780 |
+
# COMPREHENSIVE SAFETY & FORMATTING PROTOCOLS (MANDATORY)
|
| 781 |
+
|
| 782 |
+
# ------------------------------------------------------------------
|
| 783 |
+
|
| 784 |
+
## 9. MANDATORY CODE EDITING PROTOCOL (THE 4-POINT ANCHOR)
|
| 785 |
+
|
| 786 |
+
**CRITICAL:** To prevent "NameErrors" and context loss, "naked" code blocks are PROHIBITED.
|
| 787 |
+
You must use this exact format for EVERY code change:
|
| 788 |
+
|
| 789 |
+
1. **FILE & CONTEXT HEADER:** `# File: /absolute/path/to/file.py`
|
| 790 |
+
`# Context: Class [Name], Function [Name], Line Approx [X]`
|
| 791 |
+
|
| 792 |
+
2. **ANCHOR (Pre-Verification):**
|
| 793 |
+
|
| 794 |
+
```python
|
| 795 |
+
TEXT ABOVE (Unchanged - Minimum 3 lines):
|
| 796 |
+
[Paste exact existing code here to prove you know the location]
|
| 797 |
+
```
|
| 798 |
+
|
| 799 |
+
3. **DELETION (Explicit Warning):**
|
| 800 |
+
|
| 801 |
+
```python
|
| 802 |
+
❌ DELETING / OVERWRITING:
|
| 803 |
+
[Paste the exact lines being removed. If nothing, write "NO DELETION"]
|
| 804 |
+
```
|
| 805 |
+
|
| 806 |
+
4. **INSERTION (The Change):**
|
| 807 |
+
|
| 808 |
+
```python
|
| 809 |
+
✅ INSERTING:
|
| 810 |
+
[The new code]
|
| 811 |
+
```
|
| 812 |
+
|
| 813 |
+
5. **ANCHOR (Post-Verification):**
|
| 814 |
+
|
| 815 |
+
```python
|
| 816 |
+
TEXT BELOW (Unchanged - Minimum 3 lines):
|
| 817 |
+
[Paste exact existing code here to confirm safe exit]
|
| 818 |
+
```
|
| 819 |
+
|
| 820 |
+
## 10. INFRASTRUCTURE & GIT SAFETY (ZERO DATA LOSS)
|
| 821 |
+
|
| 822 |
+
**DEFINITION OF RISK:** Risk includes overwriting remote files, deleting local files, force-pushing, or changing environment binaries.
|
| 823 |
+
|
| 824 |
+
1. **GIT VERIFICATION:** Before ANY `git push`, you MUST run and display:
|
| 825 |
+
- `git status` (Detect deletions - STOP if any "deleted:" lines appear)
|
| 826 |
+
- `git diff --stat` (Detect mass changes)
|
| 827 |
+
- `git remote -v` (Verify target)
|
| 828 |
+
|
| 829 |
+
2. **REMOTE EQUALITY:** Local files are NOT the only truth. You must check if Remote has files that Local is missing before syncing.
|
| 830 |
+
|
| 831 |
+
3. **BINARY EXCLUSION:** You MUST verify `.gitignore` and `.dockerignore` contain `ollama`, `venv`, `__pycache__` before pushing to Cloud.
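The binary-exclusion check can be approximated with a few lines of Python. This is a simplified sketch: the helper name is hypothetical, and it only recognises entries written literally (optionally with leading or trailing slashes), so glob forms such as `**/venv` would need extra handling.

```python
def missing_ignore_entries(ignore_text, required=("ollama", "venv", "__pycache__")):
    """Return the required names that are not listed in the ignore file text."""
    patterns = set()
    for raw in ignore_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Treat "venv/" and "/venv" the same as "venv".
        patterns.add(line.strip("/"))
    return [name for name in required if name not in patterns]
```

Run it once on the contents of `.gitignore` and once on `.dockerignore`; any name it returns is missing from that file.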
|
| 832 |
+
|
| 833 |
+
## 11. ANTI-HALLUCINATION / EMPIRICAL FIRST
|
| 834 |
+
|
| 835 |
+
**RULE:** You may not answer "I think", "It should be", or "Most likely".
|
| 836 |
+
|
| 837 |
+
1. **PRE-COMPUTATION:** You must run a CLI command (ls, cat, grep, git status) to verify a fact BEFORE stating it.
|
| 838 |
+
2. **ADMISSION OF IGNORANCE:** If you cannot verify a file/state via CLI, you must state "I cannot verify X" and ask for the user's help.
|
| 839 |
+
|
| 840 |
+
# ------------------------------------------------------------------
|
| 841 |
+
|
| 842 |
+
# UPDATED SECURITY PROTOCOLS (MANDATORY ENFORCEMENT)
|
| 843 |
+
|
| 844 |
+
# ------------------------------------------------------------------
|
| 845 |
+
|
| 846 |
+
## 9. INFRASTRUCTURE & GIT SAFETY (ZERO DATA LOSS)
|
| 847 |
+
|
| 848 |
+
**DEFINITION OF RISK:** Risk explicitly includes overwriting remote files, deleting local files, force-pushing, or changing environment binaries.
|
| 849 |
+
|
| 850 |
+
1. **GIT VERIFICATION:** Before ANY `git push`, you MUST run and display:
|
| 851 |
+
- `git status` (Detect deletions)
|
| 852 |
+
- `git diff --stat` (Detect mass changes)
|
| 853 |
+
- `git remote -v` (Verify target)
|
| 854 |
+
2. **BINARY EXCLUSION:** You MUST verify `.gitignore` and `.dockerignore` contain `ollama`, `venv`, `__pycache__` before pushing.
|
| 855 |
+
3. **NO FORCE PUSH:** `git push --force` is STRICTLY PROHIBITED.
|
| 856 |
+
4. **REMOTE EQUALITY:** Local files are NOT the only truth. You must check if Remote has files that Local is missing before syncing.
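The "detect deletions" part of the git verification step can be sketched as a small parser. Note the swap: instead of reading the human-oriented `git status` text, this assumes the caller supplies the output of `git status --porcelain`, where each line is a two-character status followed by a space and the path. The function name is hypothetical.

```python
def deleted_paths(porcelain_output):
    """Return paths that `git status --porcelain` reports as deleted."""
    deleted = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        # Columns 0-1 hold the index and working-tree status; the path starts at column 3.
        status, path = line[:2], line[3:]
        if "D" in status:
            deleted.append(path)
    return deleted
```

A caller could obtain the input with `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout` and stop before any push when the returned list is non-empty.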
|
| 857 |
+
|
| 858 |
+
## 10. MANDATORY CODE EDITING PROTOCOL (THE 4-POINT ANCHOR)
|
| 859 |
+
|
| 860 |
+
**CRITICAL:** Standard/Naked code blocks are PROHIBITED. You must use this format to prove you are not guessing location:
|
| 861 |
+
|
| 862 |
+
1. **FILE HEADER:** `# File: /absolute/path/to/file.py`
|
| 863 |
+
2. **ANCHOR (Pre-Verification):**
|
| 864 |
+
|
| 865 |
+
```python
|
| 866 |
+
TEXT ABOVE (Unchanged - Minimum 3 lines):
|
| 867 |
+
[Paste exact existing code here to prove location]
|
| 868 |
+
```
|
| 869 |
+
|
| 870 |
+
3. **DELETION (Explicit Warning):**
|
| 871 |
+
|
| 872 |
+
```python
|
| 873 |
+
❌ DELETING / OVERWRITING:
|
| 874 |
+
[Paste the exact lines being removed. If nothing, write "NO DELETION"]
|
| 875 |
+
```
|
| 876 |
+
|
| 877 |
+
4. **INSERTION (The Change):**
|
| 878 |
+
|
| 879 |
+
```python
|
| 880 |
+
✅ INSERTING:
|
| 881 |
+
[The new code]
|
| 882 |
+
```
|
| 883 |
+
|
| 884 |
+
5. **ANCHOR (Post-Verification):**
|
| 885 |
+
|
| 886 |
+
```python
|
| 887 |
+
TEXT BELOW (Unchanged - Minimum 3 lines):
|
| 888 |
+
[Paste exact existing code here to confirm safe exit]
|
| 889 |
+
```
|
| 890 |
+
|
| 891 |
+
## 11. ANTI-HALLUCINATION / EMPIRICAL FIRST
|
| 892 |
+
|
| 893 |
+
**RULE:** You may not answer "I think", "It should be", or "Most likely".
|
| 894 |
+
|
| 895 |
+
1. **PRE-COMPUTATION:** You must run a CLI command (ls, cat, grep, git status) to verify a fact BEFORE stating it.
|
| 896 |
+
2. **ADMISSION OF IGNORANCE:** If you cannot verify a file/state via CLI, you must state "I cannot verify X" and ask for the user's help.
|
| 897 |
+
|
| 898 |
+
## 12. THE "I DON'T KNOW" PROTOCOL (EPISTEMIC SECURITY)
|
| 899 |
+
|
| 900 |
+
**RULE:** You are strictly prohibited from filling gaps with assumptions, "most likely" scenarios, or inferred file paths.
|
| 901 |
+
|
| 902 |
+
1. **THE STOP CONDITION:** If you do not have **CLI Output** (ls, cat, grep, git status) currently visible in the context that proves a fact, you **DO NOT KNOW** that fact.
|
| 903 |
+
2. **THE MANDATORY RESPONSE:**
|
| 904 |
+
- **Incorrect:** "The file is likely in /evidence..."
|
| 905 |
+
- **Correct:** "❌ KNOWLEDGE GAP: I do not know the location of [file]. I cannot proceed."
|
| 906 |
+
3. **THE REQUIRED ACTION (ANALYSIS FIRST):**
|
| 907 |
+
- Immediately stop execution.
|
| 908 |
+
- Generate a specific **Diagnostic Script** (Python or Bash) to discover the missing information.
|
| 909 |
+
- Ask the user to run it.
|
| 910 |
+
4. **PROHIBITED PHRASES:**
|
| 911 |
+
- "Assuming that..."
|
| 912 |
+
- "It should be..."
|
| 913 |
+
- "Typically..."
|
| 914 |
+
- "Based on standard structure..."
|
| 915 |
+
|
| 934 |
+
## 12. THE "I DON'T KNOW" PROTOCOL (EPISTEMIC SECURITY)
|
| 935 |
+
|
| 936 |
+
**RULE:** You are strictly prohibited from filling gaps with assumptions.
|
| 937 |
+
|
| 938 |
+
1. **THE STOP CONDITION:** If you do not have CLI Output proving a fact, you DO NOT KNOW it.
|
| 939 |
+
2. **THE MANDATORY RESPONSE:** "❌ KNOWLEDGE GAP: I do not know [X]. I cannot proceed."
|
| 940 |
+
3. **THE REQUIRED ACTION:** Generate a diagnostic script to find the answer empirically.
|
| 941 |
+
|
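The "required action" in the protocol above, generating a diagnostic script rather than guessing a path, might look like the following minimal sketch. The function names `locate_file` and `report` and the idea of searching under a caller-supplied root are assumptions made for illustration; they are not part of this repository.

```python
from pathlib import Path


def locate_file(filename: str, search_root: str) -> list[str]:
    """Return every regular file under search_root whose name equals filename.

    Hypothetical helper: it only reports what is actually on disk and
    never infers a location.
    """
    root = Path(search_root)
    if not root.is_dir():
        # The search root itself could not be verified, so nothing is known.
        return []
    # Compare names exactly instead of using filename as a glob pattern,
    # so characters such as "[" in a filename are not misinterpreted.
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and path.name == filename
    )


def report(filename: str, search_root: str) -> str:
    """Format the result in the wording the protocol requires."""
    matches = locate_file(filename, search_root)
    if not matches:
        return (
            f"❌ KNOWLEDGE GAP: I do not know the location of {filename}. "
            "I cannot proceed."
        )
    return "\n".join(f"VERIFIED: {match}" for match in matches)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python locate_file.py <filename> <search_root>
    if len(sys.argv) == 3:
        print(report(sys.argv[1], sys.argv[2]))
    else:
        print("usage: locate_file.py <filename> <search_root>")
```

A script of this shape keeps the agent inside the stop condition: it either prints paths that were observed on disk or emits the mandated knowledge-gap message, and never a "most likely" location.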
prompts/legal_exhibit_instruction.txt
CHANGED
|
@@ -1,4 +1,6 @@
|
| 1 |
-
|
| 2 |
|
| 3 |
Required format: Exhibit [Bundle letter][Number] (DB-[N]) — [Filename]
|
| 4 |
Example: Exhibit G7 (DB-125) — 16_12_25_Lamberth_Email_Complaint_Response_Rent_account_UFN40981138.pdf
|
| 1 |
+
#/home/mrdbo/projects/moltbot-hybrid-engine/prompts/legal_exhibit_instruction.txt
|
| 2 |
+
|
| 3 |
+
When referencing evidence in legal documents (witness statements, N244, SRA complaint, or any court filing), use this format. A DB number without the filename and the bundle letter and number is not adequate for documents that will receive amendments, edits, or insertions. Do not use bare "DB-[●]" or "Exhibit DB-[Number]" as the main identifier.
|
| 4 |
|
| 5 |
Required format: Exhibit [Bundle letter][Number] (DB-[N]) — [Filename]
|
| 6 |
Example: Exhibit G7 (DB-125) — 16_12_25_Lamberth_Email_Complaint_Response_Rent_account_UFN40981138.pdf
|