# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
FDAM AI Pipeline - Fire Damage Assessment Methodology v4.0.1 implementation. An AI-powered system that generates professional Cleaning Specifications / Scope of Work documents for fire damage restoration.
- Deployment: HuggingFace Spaces with Nvidia 4xL4 (96GB VRAM total, 24GB per GPU)
- Local Dev: RTX 4090 (24GB) - insufficient for full model stack; use mock models locally
- Spec Document: `FDAM_AI_Pipeline_Technical_Spec.md` is the authoritative technical reference
## Critical Constraints
- No External API Calls - 100% locally-owned models only (no Claude/OpenAI APIs)
- Memory Budget - 4xL4, 96GB total: ~58GB vision (30B BF16) + ~16GB embedding + ~16GB reranker (~90GB used, ~6GB headroom)
- Processing Time - 60-90 seconds per assessment is acceptable
- MVP Scope - Phase 1 (PRE) and Phase 2 (PRA) only; no lab results processing yet
- Static RAG - Knowledge base is pre-indexed; no user document uploads
## Tech Stack
| Component | Technology |
|---|---|
| UI Framework | Gradio 6.x |
| Vision/Generation | Qwen3-VL-30B-A3B-Instruct |
| Embeddings | Qwen3-VL-Embedding-8B |
| Reranker | Qwen3-VL-Reranker-8B |
| Vector Store | ChromaDB 0.4.x |
| Validation | Pydantic 2.x |
| PDF Generation | Pandoc 3.x |
| Package Manager | pip + requirements.txt |
## UI Components (Gradio 6.x)
The frontend uses optimized input components:
| Field | Component | Notes |
|---|---|---|
| State | gr.Dropdown | 50 US states + DC + territories |
| Dates | gr.DateTime | Calendar picker, no time selection |
| ZIP Code | gr.Textbox + blur validation | Real-time format validation |
| Credentials | gr.Dropdown(multiselect=True) | CIH, CSP, PE, etc. |
| Floor | gr.Dropdown | Basement through Roof |
| Ceiling Height | gr.Dropdown + custom option | 8-20 ft presets |
| Image Upload | gr.Files(file_count="multiple") | Batch upload support |
Keyboard Shortcuts:
- `Ctrl+1` through `Ctrl+5`: Navigate between tabs
## Development Commands

```bash
# Install dependencies
pip install -r requirements.txt

# Run locally with mock models
MOCK_MODELS=true python app.py

# Run with real models (HuggingFace Spaces only - requires the full 4xL4 GPU stack)
python app.py

# Recommended tooling (install as dev dependencies)
ruff check .      # Linting
ruff format .     # Formatting
pytest tests/ -v  # Testing
mypy .            # Type checking
```
## Architecture

### 6-Stage Processing Pipeline
1. Input Validation - Pydantic schema validation (`schemas/input.py`)
2. Vision Analysis - Per-image zone/material/condition detection (`pipeline/vision.py`)
3. RAG Retrieval - Disposition lookup, thresholds, methods (`rag/retriever.py`)
4. FDAM Logic - Disposition matrix application (`pipeline/main.py`)
5. Calculations - Surface areas, ACH, labor estimates (`pipeline/calculations.py`)
6. Document Generation - SOW, sampling plan, confidence report (`pipeline/generator.py`)
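The six stages can be orchestrated as a simple sequential fold over a shared context; the stage functions below are hypothetical placeholders, not the actual signatures in `pipeline/`:

```python
from typing import Any, Callable

Context = dict[str, Any]
Stage = Callable[[Context], Context]

def run_pipeline(ctx: Context, stages: list[Stage]) -> Context:
    """Run each stage in order, threading the shared context through."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx

# Placeholder stages standing in for schemas/input.py, pipeline/vision.py, etc.
def validate_input(ctx: Context) -> Context:
    ctx["validated"] = True
    return ctx

def vision_analysis(ctx: Context) -> Context:
    ctx["zones"] = ["near-field"]
    return ctx

result = run_pipeline({"images": []}, [validate_input, vision_analysis])
```

Keeping stages as plain context-to-context functions makes each one independently testable with mock inputs.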
### Target Project Structure

```
├── app.py        # Gradio entry point
├── config/       # Inference and app settings
├── models/       # Model loading (mock vs real)
├── rag/          # Chunking, vectorstore, retrieval
├── schemas/      # Pydantic input/output models
├── pipeline/     # Main processing logic
├── ui/           # Gradio UI components
├── RAG-KB/       # Knowledge base source files
├── chroma_db/    # ChromaDB persistence (generated)
└── tests/
```
## Domain Knowledge

### Zone Classifications
- Burn Zone: Direct fire involvement, structural char, exposed/damaged elements
- Near-Field: Adjacent to burn zone, heavy smoke/heat exposure, visible contamination
- Far-Field: Smoke migration only, light deposits, no structural damage
### Condition Levels
- Background: No visible contamination
- Light: Faint discoloration, minimal deposits
- Moderate: Visible film/deposits, surface color altered
- Heavy: Thick deposits, surface texture obscured
- Structural Damage: Physical damage requiring repair before cleaning
### Dispositions (FDAM §4.3)
- No Action: Document only
- Clean: Standard cleaning protocol
- Evaluate: Requires professional judgment
- Remove: Material must be removed
- Remove/Repair: Remove and repair/replace
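The FDAM logic stage applies a disposition matrix keyed on zone and condition. A minimal lookup sketch follows; the specific (zone, condition) pairings here are illustrative only, since the real §4.3 matrix values come from the RAG knowledge base:

```python
# Illustrative matrix entries - NOT the authoritative FDAM §4.3 values.
DISPOSITION_MATRIX: dict[tuple[str, str], str] = {
    ("far-field", "background"): "No Action",
    ("far-field", "light"): "Clean",
    ("near-field", "moderate"): "Clean",
    ("near-field", "heavy"): "Evaluate",
    ("burn", "heavy"): "Remove",
    ("burn", "structural_damage"): "Remove/Repair",
}

def lookup_disposition(zone: str, condition: str) -> str:
    # Unmapped pairs fall back to "Evaluate" (professional judgment).
    return DISPOSITION_MATRIX.get((zone, condition), "Evaluate")
```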
### Facility Classifications (affects thresholds)
- Operational: Active workplace (higher thresholds: 500 µg/100cm² lead)
- Non-Operational: Unoccupied (lower thresholds: 22 µg/100cm² lead)
- Public/Childcare: Most stringent (EPA/HUD Oct 2024: 0.54 µg/100cm² floors)
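The lead thresholds above can be encoded as a simple lookup; the dictionary keys are illustrative identifiers, and note the public/childcare value applies to floors per EPA/HUD Oct 2024:

```python
# Lead clearance thresholds in µg/100cm², per facility classification.
LEAD_THRESHOLDS_UG_PER_100CM2: dict[str, float] = {
    "operational": 500.0,       # active workplace
    "non_operational": 22.0,    # unoccupied
    "public_childcare": 0.54,   # floors, EPA/HUD Oct 2024
}

def lead_exceeds_threshold(facility: str, measured_ug: float) -> bool:
    """True when a measured lead level exceeds the facility's threshold."""
    return measured_ug > LEAD_THRESHOLDS_UG_PER_100CM2[facility]
```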
## Key Calculations
- ACH Formula: Units = (Volume × 4) / (CFM × 60), per NADCA ACR 2021
- Sample Density: Varies by area size per FDAM §2.3
- Ceiling Deck: Enhanced sampling (1 per 2,500 SF per FDAM §4.5)
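The ACH formula translates directly to code; rounding up to whole units is an assumption here (you can't deploy a fraction of an air scrubber):

```python
import math

def air_scrubber_units(volume_ft3: float, unit_cfm: float, ach: float = 4.0) -> int:
    """Units = (Volume × ACH) / (CFM × 60), rounded up. ACH=4 per NADCA ACR 2021."""
    return math.ceil((volume_ft3 * ach) / (unit_cfm * 60.0))
```

For example, a 24,000 ft³ space with 500 CFM units needs (24000 × 4) / (500 × 60) = 3.2, rounded up to 4 units.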
## RAG Knowledge Base

Source documents in `/RAG-KB/`:
- FDAM v4.0.1 methodology (primary reference)
- BNL SOP IH75190 (metals clearance thresholds)
- IICRC/RIA/CIRI Technical Guide (wildfire restoration)
- Lab method guides (PLM, ICP-MS)
Chunking rules:
- Keep tables intact (never split markdown tables)
- Preserve headers with content
- Include metadata (source, category, section)
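One way to satisfy the "never split tables, keep headers with content" rules is to only start new chunks at markdown headers, so runs of table rows can never be divided; this is a sketch of the rule, not the actual `rag/` chunker:

```python
def chunk_markdown(text: str) -> list[str]:
    """Split markdown into chunks at headers; table rows stay together."""
    chunks: list[list[str]] = [[]]
    for line in text.splitlines():
        if line.startswith("#") and chunks[-1]:
            chunks.append([])  # a header opens a new chunk, keeping it with its body
        chunks[-1].append(line)
    return ["\n".join(c) for c in chunks if any(s.strip() for s in c)]

sample = "# A\ntext\n| h |\n|---|\n| v |\n# B\nmore"
chunks = chunk_markdown(sample)
```

A production chunker would also enforce a size cap and attach the source/category/section metadata mentioned above.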
## Confidence Framework
| Score | Level | Action |
|---|---|---|
| ≥90% | Very High | Accept without review |
| 70-89% | High | Accept, note in report |
| 50-69% | Moderate | Flag for human review |
| <50% | Low | Require human verification |
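The table maps directly to a threshold function; the action identifiers below are illustrative labels, not names from the codebase:

```python
def confidence_action(score: float) -> str:
    """Map a confidence score (0-100) to the review action in the table above."""
    if score >= 90:
        return "accept"
    if score >= 70:
        return "accept_with_note"
    if score >= 50:
        return "flag_for_review"
    return "require_human_verification"
```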
## Multi-GPU Model Loading

The 4xL4 setup requires models to be distributed across GPUs. Use `device_map="auto"` in transformers:

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Qwen/Qwen3-VL-30B-A3B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # Automatically distributes across available GPUs
    trust_remote_code=True,
)
```
Expected distribution (BF16, ~90GB total):
- Vision model (30B): ~58GB spread across GPUs via device_map="auto"
- Embedding model (8B): ~16GB
- Reranker model (8B): ~16GB
- Headroom: ~6GB for KV cache
Fallback: If VRAM issues arise, use Qwen/Qwen3-VL-8B-Instruct (~16GB) instead of 30B
## Local Development Strategy

The RTX 4090 (24GB VRAM) cannot run the full model stack (~90GB required). Use this workflow:
1. Set the `MOCK_MODELS=true` environment variable
2. Mock responses return realistic JSON matching the vision output schema
3. Test pipeline logic, UI, and calculations without real inference
4. Deploy to HuggingFace Spaces for real model testing
5. Request build logs after deployment to confirm success
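The mock/real switch can be as simple as an environment check at load time; `MockVisionModel` and its canned response are hypothetical stand-ins for whatever `models/` actually provides:

```python
import os

class MockVisionModel:
    """Stand-in that returns JSON shaped like the real vision output schema."""

    def analyze(self, image_path: str) -> dict:
        # Canned but schema-valid response for exercising downstream stages.
        return {"zone": "near-field", "condition": "moderate", "materials": ["drywall"]}

def load_vision_model():
    if os.environ.get("MOCK_MODELS", "").lower() == "true":
        return MockVisionModel()
    raise RuntimeError("Real model loading requires the HuggingFace Spaces GPU stack")
```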
## Code Style

- Use `Literal["a", "b", "c"]` unions instead of Enum for simple string choices
- Pydantic models for all input/output validation
- Explicit return types on public functions
- Result types or explicit error returns over thrown exceptions
- Group imports: stdlib → third-party → local
## WSL Note

Dev servers must be exposed for WSL access. Bind to 0.0.0.0 via Gradio's `server_name`:

```python
app.launch(server_name="0.0.0.0", server_port=7860)
```