# G.U.I.D.E. — Technical Architecture & Specification

## 1. Overview

G.U.I.D.E. (Grievance Utility for Information Extraction, Drafting and Enrichment)
is a **four-layer, spec-driven system** built for consumer complaint resolution.
Every component has a clear contract; layers communicate through defined interfaces
so each piece can be tested and replaced independently.

```
┌─────────────────────────────────────────────────────────────────┐
│                        GRADIO FRONTEND                          │
│  Chat · Verify (HITL) · Documents · Draft · Escalation · About │
└──────────────────────────┬──────────────────────────────────────┘
                           │ HTTP (REST)
┌──────────────────────────▼──────────────────────────────────────┐
│                       FASTAPI BACKEND                           │
│   /api/session  /api/message  /api/upload                       │
│   /api/session/{id}/validate-entities   /api/health             │
└──┬──────────────────┬────────────────────────────────────────────┘
   │                  │
   │ Step 1 (local)   │ Step 2 (external API — after redaction)
   │                  │
┌──▼──────────────┐  ┌▼────────────────────────────────────────────┐
│  PRESIDIO       │  │  CLAUDE MANAGED AGENT (CMA)                 │
│  PIIRedactor    │  │                                             │
│  (runs locally) │  │  Tools:                                     │
│                 │  │  - classify_domain()  ──► DomainClassifier  │
│  Redacts:       │  │  - extract_entities() ──► EvidenceNER       │
│  PERSON         │  │  - process_document() ──► OCR / ViT + NER  │
│  PHONE_NUMBER   │  │  - draft_complaint()  ──► Claude (internal) │
│  EMAIL_ADDRESS  │  │  - recommend_action() ──► NextActionPredict │
│  CREDIT_CARD    │  │                                             │
│  IN_AADHAAR     │  │  HITL gate: agent pauses before drafting    │
│  IN_PAN         │  │  and requests user confirmation of entities │
│  IBAN_CODE      │  │                                             │
│  ...            │  │  Memory: per-user session state             │
└─────────────────┘  └─────────────────────────────────────────────┘
                                      │
                         ┌────────────▼─────────────────────────────┐
                         │   DEEP LEARNING LAYER                    │
                         │                                          │
                         │  1. DomainClassifier (DistilBERT)        │
                         │  2. EvidenceNER      (DistilBERT tokens) │
                         │  3. DocumentViT      (ViT image encoder) │
                         │  4. NextActionPredictor (MLP)            │
                         └──────────────────────────────────────────┘
                                      │
                         ┌────────────▼─────────────┐
                         │   DOCUMENT PROCESSOR     │
                         │  Tesseract OCR            │
                         │  pdfplumber (PDF parse)   │
                         │  PIL (image pre-process)  │
                         │  ViT (image understanding)│
                         └──────────────────────────┘
```

---

## 2. Component Specifications

### 2.0 Privacy Preprocessing — Microsoft Presidio

This layer runs **locally** before any message is forwarded to Claude or any
external service. It is the first step in the pipeline.

| Attribute    | Value                                                     |
|--------------|-----------------------------------------------------------|
| Library      | `presidio-analyzer` + `presidio-anonymizer`               |
| NLP engine   | spaCy `en_core_web_lg` (local, no network call)           |
| Trigger      | Every `/api/session/{id}/message` call                    |
| Entity types | `PERSON`, `PHONE_NUMBER`, `EMAIL_ADDRESS`, `CREDIT_CARD`, `IBAN_CODE`, `US_BANK_NUMBER`, `IN_AADHAAR`, `IN_PAN`, `IN_VEHICLE_REGISTRATION` |
| Replacement  | Each detected span → `<ENTITY_TYPE>` placeholder          |
| Output       | `RedactionResult(redacted_text, pii_types_found, pii_redacted)` |
| Failure mode | On any error, original text is returned unchanged (fail-open so pipeline is never blocked) |

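**Why local?**  
The user's name, account number, and Aadhaar UID should not leave the local machine
in plaintext. Running Presidio in-process (in the same Python server as the API)
means redaction happens before any bytes are written to the Anthropic API. The one
exception is the fail-open failure mode listed above: if redaction itself raises an
error, the original text is forwarded unchanged.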

**What the DL models receive:**  
The local DL models (DomainClassifier, EvidenceNER) are called by Claude through
tool calls — they therefore also receive redacted text. The EvidenceNER model is
still effective because it targets structural patterns (amounts, dates, reference
IDs) that are not redacted by Presidio.
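
The replacement and fail-open rules from the table can be sketched in plain Python. This is a minimal illustration, not the project's `PIIRedactor`: the detector is passed in as a callable that returns `(entity_type, start, end)` tuples (an assumed shape standing in for Presidio's analyzer output), and the field types on `RedactionResult` are guesses based on the field names above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# (entity_type, start, end) character offsets. The tuple shape is an
# assumption standing in for whatever the real analyzer returns.
Span = Tuple[str, int, int]


@dataclass
class RedactionResult:
    redacted_text: str
    pii_types_found: List[str] = field(default_factory=list)
    pii_redacted: bool = False


def redact(text: str, detect: Callable[[str], List[Span]]) -> RedactionResult:
    """Replace each detected span with <ENTITY_TYPE>; fail open on any error."""
    try:
        spans = sorted(detect(text), key=lambda s: s[1])
        out, cursor, types = [], 0, []
        for entity_type, start, end in spans:
            if start < cursor:  # skip a span overlapping one already replaced
                continue
            out.append(text[cursor:start])
            out.append(f"<{entity_type}>")
            cursor = end
            if entity_type not in types:
                types.append(entity_type)
        out.append(text[cursor:])
        return RedactionResult("".join(out), types, bool(types))
    except Exception:
        # Fail-open, as specified in the table: return the input unchanged.
        return RedactionResult(text, [], False)


def fake_detector(text: str) -> List[Span]:
    """Hypothetical stand-in for the real analyzer: finds one fixed number."""
    i = text.find("9876543210")
    return [("PHONE_NUMBER", i, i + 10)] if i >= 0 else []


def broken_detector(text: str) -> List[Span]:
    raise RuntimeError("analyzer unavailable")


result = redact("My phone is 9876543210.", fake_detector)
fallback = redact("My phone is 9876543210.", broken_detector)
print(result.redacted_text)
print(fallback.pii_redacted)
```

Because the failure branch returns `pii_redacted=False`, a caller that wants to block instead of forwarding unredacted text would need a separate signal; the sketch keeps the fail-open behaviour exactly as the table states it.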

---

### 2.1 Deep Learning Layer

#### 2.1.1 DomainClassifier

| Attribute            | Value                                              |
|----------------------|----------------------------------------------------|
| Architecture         | `distilbert-base-uncased` + linear classification head |
| Task                 | Multi-class text classification                    |
| Classes (6)          | `ecommerce`, `telecom`, `banking`, `cibil`, `insurance`, `general` |
| Training data        | CFPB Consumer Complaint Database (3M+ rows) — one-time download from Kaggle. Save as `data/raw/complaints.csv`. |
| Script               | `python -m src.classifier.train --cfpb_csv data/raw/complaints.csv --output_dir models/domain_classifier` |
| Input                | Redacted complaint text (string, max 512 tokens)   |
| Output               | `DomainResult(domain: str, confidence: float, all_probs: dict, low_confidence: bool)` |
| Confidence threshold | `0.50` — results below this set `low_confidence=True` |
| Low-confidence path  | Agent asks user one clarifying domain question; does not proceed until user confirms |
| Keyword fallback     | Used when no checkpoint exists; always returns `confidence=0.0`, `low_confidence=True` |
| Fine-tune time       | ~30 min CPU / ~5 min GPU (T4)                      |
| Library              | HuggingFace `transformers` + `datasets`            |

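**Why DistilBERT?**  
DistilBERT is about 40% smaller and 60% faster than BERT-base while retaining
roughly 97% of its language-understanding performance, as reported by its authors.
For a project with limited compute, that makes it a practical starting point.
The CFPB product labels are remapped onto our 6 classes (`CFPB_PRODUCT_MAP` in
`src/classifier/model.py`).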

**Low-confidence handling:**  
`general` is the intentional catch-all class — the model never returns an error, only a
domain + probability. However, a low probability on all classes (e.g., the complaint text
is too short or ambiguous) means the winning domain is unreliable. When `confidence < 0.50`
the `low_confidence` flag is set and the CMA agent pauses to ask the user one clarifying
question ("Is this about e-commerce, telecom, banking, credit score, insurance, or other?")
before continuing. The user's answer overrides the model's suggestion and is stored with
`domain_source = "user_confirmed"` so later tools know the domain is authoritative.
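
The threshold and override logic described above could look roughly like this. It is a sketch under assumptions: `to_domain_result` and `resolve_domain` are hypothetical helper names, and the `"model"` value for `domain_source` is invented for illustration (only `"user_confirmed"` appears in the spec).

```python
from dataclasses import dataclass
from typing import Dict, Optional

CONFIDENCE_THRESHOLD = 0.50  # from the spec table


@dataclass
class DomainResult:
    domain: str
    confidence: float
    all_probs: Dict[str, float]
    low_confidence: bool


def to_domain_result(all_probs: Dict[str, float]) -> DomainResult:
    """Pick the top class and flag it when it falls below the threshold."""
    domain = max(all_probs, key=all_probs.get)
    confidence = all_probs[domain]
    return DomainResult(domain, confidence, all_probs,
                        confidence < CONFIDENCE_THRESHOLD)


def resolve_domain(result: DomainResult,
                   user_answer: Optional[str] = None) -> Optional[Dict[str, str]]:
    """Return the values to store in memory, or None if the agent must ask."""
    if user_answer is not None:
        # The user's answer always overrides the model's suggestion.
        return {"domain": user_answer, "domain_source": "user_confirmed"}
    if result.low_confidence:
        return None  # agent asks the clarifying question first
    return {"domain": result.domain, "domain_source": "model"}  # hypothetical value


ambiguous = to_domain_result({
    "ecommerce": 0.31, "telecom": 0.22, "banking": 0.20,
    "cibil": 0.09, "insurance": 0.08, "general": 0.10,
})
pending = resolve_domain(ambiguous)
confirmed = resolve_domain(ambiguous, user_answer="telecom")
```

Returning `None` for the pending case keeps the "do not proceed until the user confirms" rule visible in the type signature rather than hiding it in a sentinel string.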

---

#### 2.1.2 EvidenceNER

| Attribute    | Value                                                      |
|--------------|------------------------------------------------------------|
| Architecture | `distilbert-base-uncased` with token classification head   |
| Task         | Named Entity Recognition (NER) on complaint text           |
| Entity types | `ORG`, `AMOUNT`, `DATE`, `REF_ID`, `ACCOUNT`, `PERSON`     |
| Training     | ~4,000 synthetic complaint sentences generated in-memory by `src/ner/train.py` (no download needed). Optionally augmented with CoNLL-2003 via HuggingFace if internet is available (maps PER→PERSON, ORG→ORG; discards LOC/MISC). |
| Script       | `python -m src.ner.train --output_dir models/evidence_ner` |
| Input        | Redacted text (from user or OCR)                           |
| Output       | List of `{text, label, start, end, confidence}` spans      |

**Entities and their use in drafting:**

| Entity   | Example                         | Used for                    |
|----------|---------------------------------|-----------------------------|
| ORG      | "Flipkart", "HDFC Bank"         | Complaint addressee         |
| AMOUNT   | "₹4,299", "Rs. 1,200"           | Financial loss quantified   |
| DATE     | "12 March 2024", "last Tuesday" | Incident timeline           |
| REF_ID   | "Order #OD-2930291", "TXN123"   | Evidence reference          |
| ACCOUNT  | "XXXX-1234", "loan account"     | Dispute target              |
| PERSON   | "customer care executive"       | Named witness/contact       |
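
One way the span list might feed the drafting fields is to keep the highest-confidence span per label. The helper below is a hypothetical sketch, not part of `src/ner/predict.py`; it relies only on the `{text, label, start, end, confidence}` schema stated above, and the confidence values are made up for the example.

```python
from typing import Dict, List, TypedDict


class EntitySpan(TypedDict):
    text: str
    label: str
    start: int
    end: int
    confidence: float


def make_span(source: str, text: str, label: str, confidence: float) -> EntitySpan:
    """Build a span with offsets computed from the source string."""
    start = source.index(text)
    return {"text": text, "label": label, "start": start,
            "end": start + len(text), "confidence": confidence}


def best_span_per_label(spans: List[EntitySpan]) -> Dict[str, EntitySpan]:
    """Keep only the most confident span for each entity label."""
    best: Dict[str, EntitySpan] = {}
    for span in spans:
        current = best.get(span["label"])
        if current is None or span["confidence"] > current["confidence"]:
            best[span["label"]] = span
    return best


sentence = "Flipkart hasn't refunded ₹4,299 (not ₹299) for order OD-123."
spans = [
    make_span(sentence, "Flipkart", "ORG", 0.98),
    make_span(sentence, "₹4,299", "AMOUNT", 0.95),
    make_span(sentence, "₹299", "AMOUNT", 0.60),
    make_span(sentence, "OD-123", "REF_ID", 0.91),
]
fields = best_span_per_label(spans)
```

Whatever selection rule is used, the result is only a pre-fill: the user still confirms or edits every value at the HITL gate.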

---

#### 2.1.3 DocumentViT

| Attribute    | Value                                                         |
|--------------|---------------------------------------------------------------|
| Architecture | Vision Transformer (`google/vit-base-patch16-224` fine-tuned)|
| Task         | Structured evidence extraction from document images           |
| Input        | Scanned receipt / bill / screenshot (PIL Image)               |
| Output       | List of `{text, label, confidence}` spans (NER schema without `start`/`end` offsets) |
| When used    | After OCR; ViT runs as a complementary pass on image-type docs|
| Library      | HuggingFace `transformers` (`ViTForImageClassification` + custom head) |

**Why ViT alongside OCR?**  
Tesseract OCR excels at clean printed text but struggles with handwriting, logos,
and table structures. The ViT model, fine-tuned on receipt and bill images, directly
classifies image regions and extracts amount/date/provider fields — especially
useful for blurry screenshots and poorly-scanned documents.

---

#### 2.1.4 NextActionPredictor

| Attribute     | Value                                                             |
|---------------|-------------------------------------------------------------------|
| Architecture  | 2-hidden-layer MLP (12-dim input → 64 → 64 → 6)                 |
| Input         | 12-dim feature vector: domain one-hot (6) + entity flags (5) + prior_contact (1) |
| Output        | Ranked list of `{action, authority, url, confidence}`             |
| Actions       | 6 classes: `company_support`, `nch`, `trai`, `rbi_ombudsman`, `irdai`, `legal` |
| Training data | ~6,000 synthetic (domain, entity_flags, prior_contact → action) examples generated in-memory from `DOMAIN_ACTION_PRIORS`; no download needed. Trains in < 30 seconds on CPU. |
| Script        | `python -m src.next_action.train --output_dir models/next_action` |
| Fallback      | If no checkpoint exists, `DOMAIN_ACTION_PRIORS` rule-based mapping is used so the pipeline always works. |
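
The 12-dimensional input can be assembled as shown below. This is a sketch with a guessed signature, not the actual `build_feature_vector()` in `src/next_action/model.py`. The spec does not say which five entity types become flags, so the choice of `ORG`, `AMOUNT`, `DATE`, `REF_ID`, and `ACCOUNT` (leaving out `PERSON`) is an assumption.

```python
from typing import Iterable, List

DOMAINS = ["ecommerce", "telecom", "banking", "cibil", "insurance", "general"]
# Assumed flag set: the spec only says there are five entity flags.
ENTITY_FLAGS = ["ORG", "AMOUNT", "DATE", "REF_ID", "ACCOUNT"]


def build_feature_vector(domain: str,
                         entity_labels: Iterable[str],
                         prior_contact: bool) -> List[float]:
    """Concatenate domain one-hot (6) + entity flags (5) + prior_contact (1)."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    present = set(entity_labels)
    one_hot = [1.0 if d == domain else 0.0 for d in DOMAINS]
    flags = [1.0 if label in present else 0.0 for label in ENTITY_FLAGS]
    return one_hot + flags + [1.0 if prior_contact else 0.0]


vec = build_feature_vector(
    "ecommerce", ["ORG", "AMOUNT", "REF_ID", "DATE"], prior_contact=False
)
print(len(vec))
```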

**Escalation routing logic:**

| Domain    | Primary Authority         | Secondary          |
|-----------|---------------------------|--------------------|
| E-commerce| Company support → NCH     | Consumer Forum     |
| Telecom   | Company support → TRAI    | NCH                |
| Banking   | Company support → RBI BO  | Banking Ombudsman  |
| CIBIL     | Bureau direct → RBI BO    | SEBI (if investment)|
| Insurance | Company support → IRDAI   | Insurance Ombudsman|
| General   | Company support → NCH     | Consumer Forum     |
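
A rule-based fallback built on the routing table might look like the sketch below. The structure of `DOMAIN_ACTION_PRIORS` is not given in this document, so a plain domain-to-ordered-list mapping is assumed here, using only the primary chain and the action names from the spec table. Dropping `company_support` when the user has already contacted the company is also an assumption about how `prior_contact` is meant to be used.

```python
from typing import Dict, List

# Assumed shape: domain -> escalation chain in priority order.
DOMAIN_ACTION_PRIORS: Dict[str, List[str]] = {
    "ecommerce": ["company_support", "nch"],
    "telecom": ["company_support", "trai"],
    "banking": ["company_support", "rbi_ombudsman"],
    # The table says the bureau is contacted directly first; that step is
    # mapped to company_support here for lack of a dedicated action class.
    "cibil": ["company_support", "rbi_ombudsman"],
    "insurance": ["company_support", "irdai"],
    "general": ["company_support", "nch"],
}


def recommend_fallback(domain: str, prior_contact: bool) -> List[str]:
    """Return the escalation chain for a domain without using the MLP."""
    chain = DOMAIN_ACTION_PRIORS.get(domain, DOMAIN_ACTION_PRIORS["general"])
    if prior_contact and len(chain) > 1:
        chain = chain[1:]  # user already tried the company; skip that step
    return list(chain)


first_time = recommend_fallback("telecom", prior_contact=False)
already_tried = recommend_fallback("telecom", prior_contact=True)
```

Unknown domains fall through to the `general` chain, which mirrors the classifier's use of `general` as the catch-all class.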

---

### 2.2 Claude Managed Agent (CMA)

The CMA is the orchestration layer. It maintains **per-user session state**
(conversation history, extracted entities, uploaded docs, draft versions) and
decides at each turn which tool to invoke.

**Key constraint:** Claude only ever sees **redacted text** (PII replaced with
`<ENTITY_TYPE>` placeholders by Presidio before the API call). This is documented
in the system prompt so Claude knows not to try to recover original values.

#### Agent System Prompt Summary
```
You are G.U.I.D.E., an expert consumer complaint assistant.
PII has already been redacted locally — work with placeholders as-is.

Rules:
1. Always classify the domain first using classify_domain().
   • If low_confidence=false (≥ 0.50): store domain and proceed.
   • If low_confidence=true (< 0.50 or keyword fallback): ask the user ONE
     clarifying question ("Is this about e-commerce, telecom, banking, credit
     score, insurance, or other?") before continuing. Store domain_source=
     "user_confirmed" when the domain comes from the user.
   • If classify_domain() errors: same clarifying question as above.
2. Ask ONE targeted follow-up question at a time if information is missing.
3. If documents are uploaded, always run process_document before drafting.
4. HITL gate: Before calling draft_complaint, present extracted details
   as a numbered summary and ask the user to confirm them.  Wait for
   [USER CONFIRMED] message before proceeding.
5. Never draft until domain, provider, date, amount, prior contact, and
   desired resolution are all known AND user-confirmed.
6. Generate drafts in formal English: Subject / To / Body / From.
7. Always recommend the next escalation step with specific portal URLs.
```

#### Tool Specifications

| Tool              | Input                        | Output                          | Calls             |
|-------------------|------------------------------|---------------------------------|-------------------|
| `classify_domain` | `complaint_text: str`        | `DomainResult`                  | DL DomainClassifier |
| `extract_entities`| `text: str`                  | `List[Entity]`                  | DL EvidenceNER    |
| `process_document`| `file_path: str`             | `{raw_text, entities}`          | OCR + ViT + EvidenceNER |
| `draft_complaint` | `complaint_context: dict`    | `ComplaintDraft`                | Claude (internal) |
| `recommend_action`| `domain: str, entities: dict`| `List[EscalationAction]`        | DL NextAction     |
| `store_memory`    | `key: str, value: any`       | `None`                          | CMA Memory Store  |
| `get_memory`      | `key: str`                   | `any`                           | CMA Memory Store  |
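
As an illustration of the table, here is how one tool definition and a dispatcher could be written. The definition uses the `name` / `description` / `input_schema` layout of Anthropic tool definitions. The handler is a stub that mimics the keyword fallback (the real one would call the DomainClassifier), and the error-payload format and the body of `execute_tool()` are assumptions rather than the contents of `src/agent/tools.py`.

```python
import json
from typing import Any, Callable, Dict

CLASSIFY_DOMAIN_TOOL: Dict[str, Any] = {
    "name": "classify_domain",
    "description": "Classify a redacted complaint into one of six domains.",
    "input_schema": {
        "type": "object",
        "properties": {"complaint_text": {"type": "string"}},
        "required": ["complaint_text"],
    },
}


def _classify_domain_stub(complaint_text: str) -> Dict[str, Any]:
    # Placeholder mirroring the keyword fallback: confidence 0.0, flagged low.
    return {"domain": "general", "confidence": 0.0,
            "all_probs": {}, "low_confidence": True}


TOOL_HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "classify_domain": _classify_domain_stub,
}


def execute_tool(name: str, tool_input: Dict[str, Any]) -> str:
    """Run a tool by name and return a JSON string for the agent."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(handler(**tool_input))
    except Exception as exc:  # surface failures to the agent as data
        return json.dumps({"error": str(exc)})


reply = json.loads(execute_tool("classify_domain", {"complaint_text": "refund"}))
missing = json.loads(execute_tool("no_such_tool", {}))
```

Returning errors as JSON rather than raising lets the agent apply rule 1 of the system prompt (ask the clarifying question when `classify_domain()` errors) instead of ending the turn.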

#### CMA Decision Flow (per user turn)

```
User message received
        │
        ▼  ── PRESIDIO (API layer, before agent) ───────────────
  PII redacted locally → redacted_text forwarded to Claude
        │
        ▼
  Is domain known? ──No──► call classify_domain() ──► store in memory
        │
       Yes
        │
        ▼
  Are minimum fields complete? ──No──► ask ONE follow-up question
  (provider, date, amount, ref)
        │
       Yes
        │
        ▼
  Was a document uploaded? ──Yes──► call process_document() ──► merge entities
        │
       No
        │
        ▼  ── HITL GATE ──────────────────────────────────────────
  Present extracted details summary → ask user to confirm
        │
  Wait for [USER CONFIRMED] message (from /validate-entities endpoint)
        │
        ▼
  Has user confirmed entities? ──Yes──► call draft_complaint() ──► show draft
        │
        ▼
  Has user asked next steps? ──Yes──► call recommend_action() ──► show escalation
```
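
The flow above can also be read as a small state check that runs once per turn. The sketch below is illustrative only: in the real system the decision is made by the agent following its system prompt, and `SessionState`, its field names, and the returned step labels are all hypothetical. The required fields are the four named in the diagram.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

REQUIRED_FIELDS = ["provider", "date", "amount", "ref"]  # from the diagram


@dataclass
class SessionState:
    domain: Optional[str] = None
    fields: Dict[str, str] = field(default_factory=dict)
    pending_documents: List[str] = field(default_factory=list)
    entities_confirmed: bool = False
    draft_shown: bool = False
    asked_next_steps: bool = False


def next_step(state: SessionState) -> str:
    """Return the next action for this turn, checked in diagram order."""
    if state.domain is None:
        return "classify_domain"
    missing = [name for name in REQUIRED_FIELDS if name not in state.fields]
    if missing:
        return f"ask_follow_up:{missing[0]}"  # one question at a time
    if state.pending_documents:
        return "process_document"
    if not state.entities_confirmed:
        return "request_confirmation"  # HITL gate
    if not state.draft_shown:
        return "draft_complaint"
    if state.asked_next_steps:
        return "recommend_action"
    return "wait_for_user"


state = SessionState()
steps = [next_step(state)]
state.domain = "ecommerce"
steps.append(next_step(state))
state.fields = {"provider": "Flipkart", "date": "3 weeks ago",
                "amount": "₹4,299", "ref": "OD-123"}
steps.append(next_step(state))
state.entities_confirmed = True
steps.append(next_step(state))
```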

---

### 2.3 Human-in-the-Loop (HITL) Validation

After all required fields are collected and before draft generation, the system
pauses and requires explicit user confirmation of the extracted entities.

| Step | Component | Description |
|------|-----------|-------------|
| 1    | CMA       | Presents extracted entities as a numbered summary in chat |
| 2    | Frontend  | Populates the **Verify Entities** tab with pre-filled editable fields |
| 3    | User      | Reviews, edits any incorrect value, and clicks "Confirm & Generate Draft" |
| 4    | API       | `POST /api/session/{id}/validate-entities` sends verified entities to CMA |
| 5    | CMA       | Receives `[USER CONFIRMED]` message and calls `draft_complaint()` |
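
Steps 4 and 5 hinge on turning the verified fields into a message the agent recognises. The exact wire format is not specified in this document, so the numbered layout below is an assumption; only the `[USER CONFIRMED]` marker comes from the spec, and `build_confirmation_message` is a hypothetical helper name.

```python
from typing import Dict


def build_confirmation_message(entities: Dict[str, str]) -> str:
    """Format user-verified entities as a [USER CONFIRMED] message."""
    lines = ["[USER CONFIRMED]"]
    for index, (name, value) in enumerate(entities.items(), start=1):
        value = value.strip()
        if not value:
            # Refuse to confirm a blank field; the draft would be incomplete.
            raise ValueError(f"empty value for {name!r}")
        lines.append(f"{index}. {name}: {value}")
    return "\n".join(lines)


message = build_confirmation_message({
    "provider": "Flipkart",
    "amount": "₹4,299 ",
    "ref": "OD-123",
})
print(message)
```

Rejecting empty values at this point keeps rule 5 of the system prompt (never draft until all fields are known and confirmed) enforceable on the API side as well as in the prompt.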

**Why HITL?**  
PII redaction replaces some values with placeholders (e.g., a name becomes
`<PERSON>`), and the NER model can misread others. The HITL step lets the user
correct extracted values and supply readable text for placeholders (e.g., their
own name in place of `<PERSON>`) before they appear in the final draft, improving
both accuracy and trust in the generated complaint.

---

### 2.4 Document Processor

| Feature      | Implementation                        |
|--------------|---------------------------------------|
| PDF parsing  | `pdfplumber` (text-native PDFs)       |
| Image OCR    | `pytesseract` + `Pillow` (pre-process)|
| ViT pass     | `google/vit-base-patch16-224` fine-tuned on receipt/bill images |
| Pre-process  | Greyscale → adaptive threshold → deskew |
| Output       | Clean extracted text + NER entities   |
| Formats      | PDF, PNG, JPG, JPEG, WEBP             |
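
Routing an upload to the right pipeline comes down to a check on the file extension. The sketch below covers only that routing step, using the formats listed in the table; the function name and the stage labels are hypothetical, and the actual OCR and ViT calls are left out.

```python
from pathlib import Path
from typing import List

PDF_SUFFIXES = {".pdf"}
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def plan_stages(file_path: str) -> List[str]:
    """Return the processing stages for a file, based on its extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix in PDF_SUFFIXES:
        # Text-native PDFs are parsed directly, then passed to NER.
        return ["pdfplumber", "evidence_ner"]
    if suffix in IMAGE_SUFFIXES:
        # Images are pre-processed, OCR'd, then given the complementary ViT pass.
        return ["preprocess", "tesseract_ocr", "vit", "evidence_ner"]
    raise ValueError(f"unsupported format: {suffix or file_path!r}")


pdf_plan = plan_stages("uploads/bill.PDF")
image_plan = plan_stages("uploads/receipt.jpeg")
```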

---

### 2.5 FastAPI Backend

| Endpoint                              | Method | Description                            |
|---------------------------------------|--------|----------------------------------------|
| `/api/health`                         | GET    | Health check (all components)          |
| `/api/session/create`                 | POST   | Create new CMA session                 |
| `/api/session/{id}/message`           | POST   | Send message → Presidio redact → agent |
| `/api/session/{id}/upload`            | POST   | Upload a document to session           |
| `/api/session/{id}/validate-entities` | POST   | HITL: submit user-confirmed entities   |
| `/api/session/{id}/history`           | GET    | Retrieve conversation history          |
| `/api/classify`                       | POST   | Direct DL classification (debug)       |
| `/api/extract`                        | POST   | Direct NER extraction (debug)          |

---

### 2.6 Gradio Frontend

Tabs:
1. **Chat** — Conversational interface; shows 🔒 privacy badge when PII is redacted
2. **Verify Entities** — HITL panel: editable entity fields + "Confirm & Generate Draft"
3. **Documents** — Drag-and-drop upload; shows extracted entities
4. **Complaint Draft** — Rendered complaint with copy/download
5. **Escalation Guide** — Recommended authorities with portal links
6. **About** — Architecture diagram, model cards, tech stack

---

## 3. Technology Stack

| Layer              | Technology                     | Reason                                        |
|--------------------|--------------------------------|-----------------------------------------------|
| Launcher           | `start.py` (stdlib only)       | Single script — trains all models then starts servers |
| Privacy            | Microsoft Presidio + spaCy     | Local PII redaction, no cloud call            |
| DL Models          | HuggingFace Transformers       | Industry standard for NLP + ViT               |
| Classifier data    | CFPB dataset (Kaggle, one-time)| 3M+ real complaints, public license           |
| NER data           | Synthetic in-memory            | Template-generated; no download required      |
| NextAction data    | Synthetic in-memory            | Generated from domain priors; no download     |
| Agent              | Anthropic CMA (default `claude-sonnet-4-6`, set via `GUIDE_MODEL`) | Stateful, tool-using agent     |
| Backend            | FastAPI + Uvicorn              | Async, fast, OpenAPI auto-docs                |
| Frontend           | Gradio 4.x                     | ML-native UI, file upload, chat               |
| OCR                | pytesseract + pdfplumber       | Proven, open-source                           |
| ViT doc model      | HuggingFace ViT                | Image-based evidence extraction               |
| Env                | Python 3.10+                   | Required by CMA SDK                           |
| Config             | python-dotenv                  | Secure API key management                     |
| Notebooks (optional) | Jupyter                      | EDA and demo only; not required to run system |

---

## 4. Data Flow — End to End

```
User types: "Rahul Sharma — Flipkart hasn't refunded ₹4,299 for order OD-123
             cancelled 3 weeks ago. My phone is 9876543210."
                │
                ▼  ── LOCAL ONLY ─────────────────────────────────────────────
        Presidio PIIRedactor detects: PERSON("Rahul Sharma"), PHONE_NUMBER("9876543210")
        Redacted: "<PERSON> — Flipkart hasn't refunded ₹4,299 for order OD-123
                   cancelled 3 weeks ago. My phone is <PHONE_NUMBER>."
                │
                ▼  ── EXTERNAL API ────────────────────────────────────────────
        FastAPI → CMA session with redacted text
                │
        Claude CMA agent processes redacted message
                │
                ├──► classify_domain(...)
                │         └──► DomainClassifier → {domain: "ecommerce", confidence: 0.97}
                │
                ├──► extract_entities(...)
                │         └──► EvidenceNER → [ORG:"Flipkart", AMOUNT:"₹4,299",
                │                             REF_ID:"OD-123", DATE:"3 weeks ago"]
                │
                └──► HITL gate: "I have extracted the following — please confirm:
                                  - Company: Flipkart
                                  - Amount: ₹4,299
                                  - Order ID: OD-123
                                  - Date: 3 weeks ago
                                 Is this correct?"
                │
        User reviews in "Verify Entities" tab → edits if needed → clicks Confirm
                │
                ▼
        POST /validate-entities → [USER CONFIRMED] → draft_complaint()
                │
        ComplaintDraft generated and shown in Draft tab
                │
        recommend_action(domain="ecommerce", entities=...) → [NCH, Consumer Forum]
                │
        Gradio renders: Draft · Evidence table · Escalation panel
```

---

## 5. Project Directory Structure

```
Project_ResolveAI/
│
├── start.py                       ← SINGLE ENTRY POINT — trains all models then starts servers
│
├── docs/
│   ├── abstract.md                ← project abstract (G.U.I.D.E.)
│   └── architecture.md            ← this file (spec)
│
├── src/
│   ├── __init__.py
│   ├── privacy/                   # Presidio PII redaction (runs before any external call)
│   │   ├── __init__.py
│   │   └── redactor.py            ← PIIRedactor singleton
│   │
│   ├── classifier/                # DL Domain Classifier (DistilBERT)
│   │   ├── model.py               ← DomainClassifier, CFPB_PRODUCT_MAP, LABEL2ID
│   │   ├── dataset.py             ← load_cfpb_csv(), clean_complaint_text(), ComplaintDataset
│   │   ├── train.py               ← CLI: --cfpb_csv, --output_dir, --epochs, --batch_size
│   │   └── predict.py             ← DomainPredictor singleton, classify_domain()
│   │
│   ├── ner/                       # DL NER model (DistilBERT token classifier)
│   │   ├── model.py               ← EvidenceNER, NER_LABELS, NER_LABEL2ID
│   │   ├── train.py               ← Generates synthetic data in-memory; CLI: --output_dir
│   │   └── predict.py             ← NERPredictor singleton, extract_entities()
│   │
│   ├── next_action/               # Next-action MLP predictor
│   │   ├── model.py               ← NextActionMLP, DOMAIN_ACTION_PRIORS, build_feature_vector()
│   │   ├── train.py               ← Generates synthetic features; CLI: --output_dir, --epochs
│   │   └── predict.py             ← NextActionPredictor singleton (MLP or rule-based fallback)
│   │
│   ├── document_processor/        # OCR + PDF parsing + ViT
│   │   ├── ocr.py                 ← Tesseract + pdfplumber pipeline
│   │   └── vit_extractor.py       ← ViT-based image evidence extraction
│   │
│   ├── agent/                     # Claude CMA integration
│   │   ├── tools.py               ← Tool definitions (JSON Schema) + execute_tool()
│   │   ├── prompts.py             ← SYSTEM_PROMPT (HITL rule 4, privacy context)
│   │   └── session.py             ← AgentManager singleton, send_message()
│   │
│   └── api/                       # FastAPI application
│       ├── main.py                ← Lifespan: Presidio → DL models → CMA agent
│       ├── routes.py              ← /message (Presidio→agent), /validate-entities (HITL)
│       └── schemas.py             ← Pydantic models incl. HITL + pii_redacted fields
│
├── ui/
│   └── app.py                     ← Gradio: Chat · Verify · Docs · Draft · Escalation · About
│
├── notebooks/                     # OPTIONAL — EDA and interactive demos only
│   ├── 01_data_exploration.ipynb  ← Explore CFPB dataset, save processed CSV
│   ├── 02_classifier_training.ipynb
│   ├── 04_cma_agent_demo.ipynb
│   └── 05_end_to_end_demo.ipynb
│
├── data/
│   ├── raw/                       ← Place CFPB complaints.csv here (one-time download)
│   ├── processed/                 ← Output of EDA notebook; not required for training
│   └── sample_complaints/         ← Synthetic domain-specific CSVs for augmentation
│
├── models/                        ← Created by training; populated by start.py
│   ├── domain_classifier/         ← best_model.pt + tokenizer files
│   ├── evidence_ner/              ← best_model.pt + tokenizer files
│   ├── document_vit/              ← ViT checkpoint
│   └── next_action/               ← best_model.pt
│
├── CLAUDE.md                      ← Guidance for Claude Code
├── requirements.txt               ← presidio-analyzer, presidio-anonymizer, spacy, torch, etc.
├── .env.example                   ← Template — copy to .env and add ANTHROPIC_API_KEY
├── .gitignore
└── README.md
```

---

## 6. Setup and Running

### Step 1 — Get an Anthropic API key

1. Go to https://console.anthropic.com → Sign up (free tier available)
2. Navigate to **API Keys** → **Create Key**
3. Copy the key (shown only once)
4. In the project root, copy the template and fill in your key:
   ```
   cp .env.example .env
   # then edit .env:
   ANTHROPIC_API_KEY=sk-ant-...
   ```
   Never commit `.env` to git (listed in `.gitignore`).

### Step 2 — Install dependencies

```bash
pip install -r requirements.txt
python -m spacy download en_core_web_lg      # Presidio NLP model (local, ~750 MB)
```

### Step 3 — Download CFPB data (first run only)

The DomainClassifier requires the CFPB Consumer Complaint Database:

- Download from Kaggle: `consumer-complaint-database` dataset
- Save the CSV to `data/raw/complaints.csv`
- Size: ~600 MB; one-time download; not committed to git

This is only needed to train the classifier. The NER and NextAction models generate their training data in-memory automatically.

### Step 4 — Run (single command)

```bash
# First run — trains all models then starts servers:
python start.py --cfpb_csv data/raw/complaints.csv

# After first run — models already trained, skip training:
python start.py --no-train

# Force retrain everything:
python start.py --cfpb_csv data/raw/complaints.csv --train

# Train only (no servers):
python start.py --cfpb_csv data/raw/complaints.csv --train-only
```

When running, `start.py` will:

1. Validate `.env` and `ANTHROPIC_API_KEY`
2. Train **DomainClassifier** (~30 min CPU / ~5 min GPU T4) — skipped if checkpoint exists
3. Train **EvidenceNER** (~10 min CPU) — skipped if checkpoint exists
4. Train **NextActionMLP** (< 30 sec CPU) — skipped if checkpoint exists
5. Start **FastAPI** at `http://localhost:8000` (Swagger docs at `/docs`)
6. Start **Gradio UI** at `http://localhost:7860`

Both servers print to the same terminal, prefixed with `[API]` or `[UI]`. Press **Ctrl+C** to stop everything cleanly.
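
The skip-if-checkpoint-exists behaviour and the four flags can be summarised in a few lines. This is a sketch of the decision logic only, not the actual `start.py`: the function names are hypothetical, and the checkpoint paths are taken from the directory layout in section 5.

```python
import argparse
from pathlib import Path
from typing import List

# Checkpoint locations follow the models/ layout in section 5.
CHECKPOINTS = {
    "domain_classifier": "models/domain_classifier/best_model.pt",
    "evidence_ner": "models/evidence_ner/best_model.pt",
    "next_action": "models/next_action/best_model.pt",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="start.py")
    parser.add_argument("--cfpb_csv", help="path to the CFPB complaints CSV")
    parser.add_argument("--train", action="store_true",
                        help="retrain even if checkpoints exist")
    parser.add_argument("--no-train", action="store_true",
                        help="skip training entirely")
    parser.add_argument("--train-only", action="store_true",
                        help="train, then exit without starting servers")
    return parser


def models_to_train(args: argparse.Namespace, root: Path) -> List[str]:
    """List the models that need training for this invocation."""
    if args.no_train:
        return []
    todo = []
    for name, relative_path in CHECKPOINTS.items():
        if args.train or not (root / relative_path).exists():
            todo.append(name)
    return todo


forced = models_to_train(build_parser().parse_args(["--train"]), Path("."))
skipped = models_to_train(build_parser().parse_args(["--no-train"]), Path("."))
```

With `--train`, every model is listed regardless of what is on disk; without it, only models whose `best_model.pt` is missing are trained, which is what makes a second plain `python start.py` run start quickly.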

---

## 7. Development Phases

| Phase | Deliverable                                    | Status       |
|-------|------------------------------------------------|--------------|
| 0     | Abstract submitted (G.U.I.D.E.)                | Done ✓       |
| 1     | Architecture + spec (`docs/architecture.md`)   | Done ✓       |
| 2     | Project scaffold + environment setup           | Done ✓       |
| 3     | Presidio PII redaction layer (`src/privacy/`)  | Done ✓       |
| 4     | DL: DomainClassifier — model, dataset, train   | Done ✓       |
| 5     | DL: EvidenceNER + NextActionMLP — model, train | Done ✓       |
| 6     | DL: ViT document extractor (fine-tuning)       | In progress  |
| 7     | CMA agent + tools integration                  | Done ✓       |
| 8     | Document processor (OCR + ViT pipeline)        | Done ✓       |
| 9     | FastAPI backend (HITL endpoint, schemas)       | Done ✓       |
| 10    | Gradio UI (Verify tab, privacy badge, HITL)    | Done ✓       |
| 11    | Single launcher (`start.py`) + CLAUDE.md       | Done ✓       |
| 12    | Integration testing + demo notebooks           | Remaining    |
| 13    | Final report + presentation                    | Remaining    |