
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

FDAM AI Pipeline - Fire Damage Assessment Methodology v4.0.1 implementation. An AI-powered system that generates professional Cleaning Specifications / Scope of Work documents for fire damage restoration.

- **Deployment:** HuggingFace Spaces with Nvidia 4xL4 (96GB VRAM total, 24GB per GPU)
- **Local Dev:** RTX 4090 (24GB) - insufficient for the full model stack; use mock models locally
- **Spec Document:** `FDAM_AI_Pipeline_Technical_Spec.md` is the authoritative technical reference

## Critical Constraints

1. **No External API Calls** - 100% locally-owned models only (no Claude/OpenAI APIs)
2. **Memory Budget** - 4xL4 96GB total: 58GB vision (30B BF16) + ~16GB embedding + ~16GB reranker (90GB used, ~6GB headroom)
3. **Processing Time** - 60-90 seconds per assessment is acceptable
4. **MVP Scope** - Phase 1 (PRE) and Phase 2 (PRA) only; no lab results processing yet
5. **Static RAG** - Knowledge base is pre-indexed; no user document uploads

## Tech Stack

| Component | Technology |
|---|---|
| UI Framework | Gradio 6.x |
| Vision/Generation | Qwen3-VL-30B-A3B-Instruct |
| Embeddings | Qwen3-VL-Embedding-8B |
| Reranker | Qwen3-VL-Reranker-8B |
| Vector Store | ChromaDB 0.4.x |
| Validation | Pydantic 2.x |
| PDF Generation | Pandoc 3.x |
| Package Manager | pip + requirements.txt |

## UI Components (Gradio 6.x)

The frontend uses optimized input components:

| Field | Component | Notes |
|---|---|---|
| State | `gr.Dropdown` | 50 US states + DC + territories |
| Dates | `gr.DateTime` | Calendar picker, no time selection |
| ZIP Code | `gr.Textbox` + blur validation | Real-time format validation |
| Credentials | `gr.Dropdown(multiselect=True)` | CIH, CSP, PE, etc. |
| Floor | `gr.Dropdown` | Basement through Roof |
| Ceiling Height | `gr.Dropdown` + custom option | 8-20 ft presets |
| Image Upload | `gr.Files(file_count="multiple")` | Batch upload support |
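The blur validation on the ZIP field can be a pure checker that returns an error message; a minimal sketch (the function name, message text, and the commented `zip_box.blur(...)` wiring are illustrative, not existing project code):

```python
import re

def validate_zip(zip_code: str) -> str:
    """Return an error message for a malformed ZIP, or "" if it is valid."""
    if re.fullmatch(r"\d{5}(-\d{4})?", zip_code.strip()):
        return ""
    return "ZIP must be 5 digits, optionally ZIP+4 (e.g. 12345-6789)."

# Hypothetical wiring inside the Gradio Blocks context:
# zip_box.blur(validate_zip, inputs=zip_box, outputs=zip_error_md)
```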

**Keyboard Shortcuts:**

- `Ctrl+1` through `Ctrl+5`: Navigate between tabs

## Development Commands

```bash
# Install dependencies
pip install -r requirements.txt

# Run locally with mock models
MOCK_MODELS=true python app.py

# Run with real models (HuggingFace Spaces 4xL4 only)
python app.py

# Recommended tooling (install as dev dependencies)
ruff check .              # Linting
ruff format .             # Formatting
pytest tests/ -v          # Testing
mypy .                    # Type checking
```

## Architecture

### 6-Stage Processing Pipeline

1. **Input Validation** - Pydantic schema validation (`schemas/input.py`)
2. **Vision Analysis** - Per-image zone/material/condition detection (`pipeline/vision.py`)
3. **RAG Retrieval** - Disposition lookup, thresholds, methods (`rag/retriever.py`)
4. **FDAM Logic** - Disposition matrix application (`pipeline/main.py`)
5. **Calculations** - Surface areas, ACH, labor estimates (`pipeline/calculations.py`)
6. **Document Generation** - SOW, sampling plan, confidence report (`pipeline/generator.py`)
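The six stages above can be chained as a simple fold over a shared context dict; a sketch, assuming each stage takes and returns a dict (the real modules may pass richer Pydantic objects):

```python
from typing import Any, Callable

# Each stage receives the accumulated context and returns an updated copy.
Stage = Callable[[dict[str, Any]], dict[str, Any]]

def run_pipeline(payload: dict[str, Any], stages: list[Stage]) -> dict[str, Any]:
    """Run stages in order, threading the context through each one."""
    ctx = dict(payload)
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```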

### Target Project Structure

```
├── app.py                 # Gradio entry point
├── config/                # Inference and app settings
├── models/                # Model loading (mock vs real)
├── rag/                   # Chunking, vectorstore, retrieval
├── schemas/               # Pydantic input/output models
├── pipeline/              # Main processing logic
├── ui/                    # Gradio UI components
├── RAG-KB/                # Knowledge base source files
├── chroma_db/             # ChromaDB persistence (generated)
└── tests/
```

## Domain Knowledge

### Zone Classifications

- **Burn Zone:** Direct fire involvement, structural char, exposed/damaged elements
- **Near-Field:** Adjacent to burn zone, heavy smoke/heat exposure, visible contamination
- **Far-Field:** Smoke migration only, light deposits, no structural damage

### Condition Levels

- **Background:** No visible contamination
- **Light:** Faint discoloration, minimal deposits
- **Moderate:** Visible film/deposits, surface color altered
- **Heavy:** Thick deposits, surface texture obscured
- **Structural Damage:** Physical damage requiring repair before cleaning

### Dispositions (FDAM §4.3)

- **No Action:** Document only
- **Clean:** Standard cleaning protocol
- **Evaluate:** Requires professional judgment
- **Remove:** Material must be removed
- **Remove/Repair:** Remove and repair/replace

### Facility Classifications (affects thresholds)

- **Operational:** Active workplace (higher thresholds: 500 µg/100cm² lead)
- **Non-Operational:** Unoccupied (lower thresholds: 22 µg/100cm² lead)
- **Public/Childcare:** Most stringent (EPA/HUD Oct 2024: 0.54 µg/100cm² floors)

### Key Calculations

- **ACH Formula:** Units = (Volume × 4) / (CFM × 60) per NADCA ACR 2021
- **Sample Density:** Varies by area size per FDAM §2.3
- **Ceiling Deck:** Enhanced sampling (1 per 2,500 SF per FDAM §4.5)
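The ACH formula above can be turned into a unit count directly; a minimal sketch (function and parameter names are illustrative; `volume_cf` is room volume in cubic feet, `unit_cfm` the rated CFM of one air scrubber):

```python
import math

def air_scrubber_units(volume_cf: float, unit_cfm: float, target_ach: float = 4.0) -> int:
    """Whole units needed to achieve the target air changes per hour.

    Units = (Volume * ACH) / (CFM * 60), rounded up to whole machines.
    """
    return math.ceil((volume_cf * target_ach) / (unit_cfm * 60.0))
```

For example, a 12,000 cf space with 500 CFM scrubbers gives (12,000 × 4) / (500 × 60) = 1.6, rounded up to 2 units.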

## RAG Knowledge Base

Source documents in `/RAG-KB/`:

- FDAM v4.0.1 methodology (primary reference)
- BNL SOP IH75190 (metals clearance thresholds)
- IICRC/RIA/CIRI Technical Guide (wildfire restoration)
- Lab method guides (PLM, ICP-MS)

Chunking rules:

- Keep tables intact (never split markdown tables)
- Preserve headers with content
- Include metadata (source, category, section)
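The chunking rules imply each chunk carries its text plus the three metadata fields; a sketch of the record shape (the `Chunk` class and `to_chroma` helper are hypothetical, not existing project code):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """One knowledge-base chunk with the metadata fields listed above."""
    text: str
    source: str    # e.g. "FDAM v4.0.1"
    category: str  # e.g. "thresholds"
    section: str   # e.g. "4.3"

    def to_chroma(self) -> tuple[str, dict[str, str]]:
        """Split into (document, metadata) in the shape ChromaDB's add() expects."""
        return self.text, {
            "source": self.source,
            "category": self.category,
            "section": self.section,
        }
```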

## Confidence Framework

| Score | Level | Action |
|---|---|---|
| ≥90% | Very High | Accept without review |
| 70-89% | High | Accept, note in report |
| 50-69% | Moderate | Flag for human review |
| <50% | Low | Require human verification |
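The score bands above are a simple threshold ladder; a sketch (the function name is illustrative):

```python
def confidence_action(score: float) -> tuple[str, str]:
    """Map a 0-100 confidence score to (level, action) per the table above."""
    if score >= 90:
        return "Very High", "Accept without review"
    if score >= 70:
        return "High", "Accept, note in report"
    if score >= 50:
        return "Moderate", "Flag for human review"
    return "Low", "Require human verification"
```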

## Multi-GPU Model Loading

The 4xL4 setup requires models to be distributed across GPUs. Use `device_map="auto"` in transformers:

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Qwen/Qwen3-VL-30B-A3B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # Automatically distributes across available GPUs
    trust_remote_code=True,
)
```

Expected distribution (BF16, ~90GB total):

- Vision model (30B): ~58GB spread across GPUs via `device_map="auto"`
- Embedding model (8B): ~16GB
- Reranker model (8B): ~16GB
- Headroom: ~6GB for KV cache
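The budget arithmetic above can be sanity-checked in one place (numbers copied from this document; the dict keys are illustrative):

```python
# Per-model VRAM budget in GB, per the figures in this document.
BUDGET_GB = {"vision_30b_bf16": 58, "embedding_8b": 16, "reranker_8b": 16}
TOTAL_VRAM_GB = 96  # 4 x L4 at 24 GB each

def headroom_gb() -> int:
    """VRAM left over for KV cache after loading all three models."""
    return TOTAL_VRAM_GB - sum(BUDGET_GB.values())
```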

**Fallback:** If VRAM issues arise, use `Qwen/Qwen3-VL-8B-Instruct` (~16GB) instead of the 30B model.

## Local Development Strategy

The RTX 4090 (24GB VRAM) cannot run the full model stack (~90GB required). Use this workflow:

1. Set the `MOCK_MODELS=true` environment variable
2. Mock responses return realistic JSON matching the vision output schema
3. Test pipeline logic, UI, and calculations without real inference
4. Deploy to HuggingFace Spaces for real model testing
5. Request build logs after deployment to confirm success
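Step 1's toggle can be read with a small helper (illustrative; the actual loader in `models/` may parse the variable differently):

```python
import os

def use_mock_models() -> bool:
    """True when the MOCK_MODELS env var is set to a truthy string."""
    return os.environ.get("MOCK_MODELS", "false").strip().lower() in ("1", "true", "yes")
```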

## Code Style

- Use `Literal["a", "b", "c"]` unions instead of `Enum` for simple string choices
- Pydantic models for all input/output validation
- Explicit return types on public functions
- Result types or explicit error returns over thrown exceptions
- Group imports: stdlib → third-party → local

## WSL Note

Dev servers must bind to all interfaces to be reachable from outside WSL. Pass `server_name="0.0.0.0"` to Gradio:

```python
app.launch(server_name="0.0.0.0", server_port=7860)
```