Asadrizvi64 committed on
Commit 5666923 · 1 Parent(s): 352967a

Electrical Outlets diagnostic pipeline v1.0
.gitignore CHANGED
@@ -1,10 +1,38 @@
- venv/
- __pycache__/
- *.pt
- *.pdf
- notebooks/
- ELECTRICAL*
- electrical_outlets_sounds_100/
- *.wav
- *.jpg
- *.png
+ ELECTRICAL OUTLETS-20260106T153508Z-3-001/
+ electrical_outlets_sounds_100/
+ 111/
+
+ # Model weights (upload separately via LFS)
+ weights/*.pt
+
+ # Binary files
+ *.pdf
+ *.jpg
+ *.jpeg
+ *.png
+ *.wav
+ *.mp3
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *.egg-info/
+ venv/
+ .venv/
+ env/
+
+ # IDE
+ .vscode/
+ .idea/
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Notebooks
+ notebooks/
+
+ # Misc
+ tmp/
+ *.log
+ wandb/
README.md ADDED
@@ -0,0 +1,237 @@
+ # Electrical Outlets & Switches Diagnostic Pipeline
+
+ Non-intrusive AI diagnostic system for electrical outlets and switches using **image classification** and **audio analysis** with decision-level fusion.
+
+ ## Overview
+
+ This pipeline analyzes photos and/or audio recordings of electrical outlets to detect potential safety issues without requiring physical inspection. It uses two independent models fused at the decision level for robust predictions.
+
+ ### Image Model
+ - **Architecture:** EfficientNet-B0 (frozen backbone) + MLP head (512 → 5 classes)
+ - **Classes:** burn/overheating, cracked faceplate, loose outlet, normal, water exposed
+ - **Performance:** 77.3% accuracy, 66.7% minimum per-class recall
+ - **Training data:** 1,299 images across 10 source categories merged into 5 classes
+
+ ### Audio Model
+ - **Architecture:** 3-layer spectrogram CNN (32→64→128 channels + adaptive pooling)
+ - **Classes:** normal, buzzing, crackling/arcing, arcing pop
+ - **Performance:** 100% macro recall on validation
+ - **Training data:** 100 WAV files (22050 Hz, mel spectrograms with SpecAugment)
+
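SpecAugment, mentioned above, masks random frequency and time bands of the mel spectrogram during training. A minimal NumPy sketch of the masking idea (mask counts and widths are illustrative, not the values used by `train_audio.py`):

```python
import numpy as np

def spec_augment(mel, n_freq_masks=2, n_time_masks=2, max_width=16, rng=None):
    """Zero out random frequency rows and time columns of a (n_mels, T) spectrogram."""
    rng = rng if rng is not None else np.random.default_rng()
    mel = mel.copy()  # leave the caller's spectrogram untouched
    n_mels, n_steps = mel.shape
    for _ in range(n_freq_masks):
        w = int(rng.integers(1, max_width + 1))
        f0 = int(rng.integers(0, max(1, n_mels - w)))
        mel[f0:f0 + w, :] = 0.0   # frequency mask
    for _ in range(n_time_masks):
        w = int(rng.integers(1, max_width + 1))
        t0 = int(rng.integers(0, max(1, n_steps - w)))
        mel[:, t0:t0 + w] = 0.0   # time mask
    return mel
```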
+ ### Fusion
+ - Decision-level fusion combining both modalities
+ - Safety-first: prefers "uncertain" over "normal" when in doubt
+ - Severity = max(image_severity, audio_severity)
+ - Configurable confidence thresholds in `config/thresholds.yaml`
+
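The `max(image_severity, audio_severity)` rule compares labels by their rank in the `severity_order` list from `config/thresholds.yaml`; a minimal sketch:

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]  # from config/thresholds.yaml

def max_severity(a: str, b: str) -> str:
    """Return the more severe of two labels by position in SEVERITY_ORDER."""
    return max(a, b, key=SEVERITY_ORDER.index)
```

For example, `max_severity("medium", "critical")` returns `"critical"`.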
+ ## Project Structure
+
+ ```
+ CV/
+ ├── config/
+ │   ├── label_mapping.json          # Class definitions & folder→class mapping
+ │   ├── image_train_config.yaml     # Image training hyperparameters
+ │   ├── audio_train_config.yaml     # Audio training hyperparameters
+ │   ├── thresholds.yaml             # Fusion confidence thresholds
+ │   └── schema.yaml                 # API output schema
+ ├── src/
+ │   ├── data/
+ │   │   ├── image_dataset.py        # Image dataset with stratified splits
+ │   │   └── audio_dataset.py        # Audio dataset with stratified splits
+ │   ├── models/
+ │   │   ├── image_model.py          # EfficientNet-B0 + MLP classifier
+ │   │   └── audio_model.py          # Spectrogram CNN classifier
+ │   ├── fusion/
+ │   │   └── fusion_logic.py         # Decision-level fusion
+ │   └── inference/
+ │       └── wrapper.py              # End-to-end inference pipeline
+ ├── training/
+ │   ├── train_image.py              # Image model training (2-stage)
+ │   └── train_audio.py              # Audio model training
+ ├── api/
+ │   └── main.py                     # FastAPI endpoint
+ ├── weights/
+ │   ├── electrical_outlets_image_best.pt   # Trained image model
+ │   └── electrical_outlets_audio_best.pt   # Trained audio model
+ ├── tests/
+ │   └── test_fusion.py              # Fusion logic tests
+ ├── test_single_image.py            # Quick single-image testing
+ ├── requirements.txt
+ └── README.md
+ ```
+
+ ## Setup
+
+ ### Requirements
+
+ - Python 3.10+
+ - NVIDIA GPU with CUDA (recommended: RTX 3090 or better)
+
+ ### Installation
+
+ ```bash
+ git clone https://huggingface.co/<your-repo>/electrical-outlets-diagnostic
+ cd electrical-outlets-diagnostic
+
+ pip install -r requirements.txt
+
+ # If GPU: install CUDA-enabled PyTorch
+ pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
+ # Also needed on Windows:
+ pip install soundfile
+ ```
+
+ ### Download Weights
+
+ Download the model weights from the HuggingFace repository and place them in `weights/`:
+
+ ```
+ weights/
+ ├── electrical_outlets_image_best.pt   (~ 17 MB)
+ └── electrical_outlets_audio_best.pt   (~ 2 MB)
+ ```
+
+ ## Usage
+
+ ### Test a Single Image
+
+ ```bash
+ python test_single_image.py --image path/to/outlet_photo.jpg
+ ```
+
+ Output:
+ ```
+ ==================================================
+ burned_outlet.jpg
+ ==================================================
+ → burn_overheating (high severity)
+ → 87.3% confidence
+ → issue_detected
+
+ burn_overheating    87.3%  ██████████████████████████ ◄
+ cracked_faceplate    5.2%  █
+ loose_outlet         3.1%  ▊
+ normal               2.8%  ▊
+ water_exposed        1.6%  ▍
+ ```
+
+ ### API Server
+
+ ```bash
+ uvicorn api.main:app --host 0.0.0.0 --port 8000
+ ```
+
+ #### Endpoints
+
+ **POST** `/v1/diagnose/electrical_outlets`
+
+ Upload image and/or audio for diagnosis:
+ ```bash
+ # Image only
+ curl -X POST http://localhost:8000/v1/diagnose/electrical_outlets \
+   -F "image=@outlet_photo.jpg"
+
+ # Image + Audio
+ curl -X POST http://localhost:8000/v1/diagnose/electrical_outlets \
+   -F "image=@outlet_photo.jpg" \
+   -F "audio=@outlet_recording.wav"
+ ```
+
+ Response:
+ ```json
+ {
+   "diagnostic_element": "electrical_outlets",
+   "result": "issue_detected",
+   "issue_type": "burn_overheating",
+   "severity": "high",
+   "confidence": 0.873,
+   "modality_contributions": null,
+   "primary_issue": "burn_overheating",
+   "secondary_issue": null
+ }
+ ```
+
+ **GET** `/health` — Check model availability
+
+ ### Python API
+
+ ```python
+ from src.inference.wrapper import run_electrical_outlets_inference
+
+ result = run_electrical_outlets_inference(
+     image_path="path/to/photo.jpg",
+     audio_path="path/to/recording.wav",  # optional
+ )
+ print(result)
+ ```
+
+ ## Training
+
+ ### Image Model
+
+ ```bash
+ python training/train_image.py --device cuda
+ ```
+
+ Two-stage training:
+ 1. **Stage 1:** Frozen EfficientNet-B0 backbone, train MLP head only (80-100 epochs)
+ 2. **Stage 2:** Unfreeze last 2 backbone blocks, fine-tune with low LR (25 epochs)
+
+ ### Audio Model
+
+ ```bash
+ python training/train_audio.py --device cuda
+ ```
+
+ Single-stage with SpecAugment, class-weighted loss, cosine LR schedule.
+
+ ## Class Mapping
+
+ ### Image Classes (5)
+
+ | Class | Issue Type | Severity | Source Folders |
+ |-------|-----------|----------|----------------|
+ | 0 | burn_overheating | high | Burn marks (250), Discoloration (100), Sparking damage (150) |
+ | 1 | cracked_faceplate | medium | Cracked faceplate (150), Damaged switches (50) |
+ | 2 | loose_outlet | medium | Loose outlet (200), Exposed wiring (150) |
+ | 3 | normal | low | Normal outlets (50), Normal switches (50) |
+ | 4 | water_exposed | high | Water intrusion (150) |
+
+ ### Audio Classes (4)
+
+ | Class | Issue Type | Severity |
+ |-------|-----------|----------|
+ | 0 | normal | low |
+ | 1 | buzzing | high |
+ | 2 | crackling_arcing | high |
+ | 3 | arcing_pop | critical |
+
+ ## Severity Levels
+
+ | Level | Action Required |
+ |-------|----------------|
+ | **low** | Monitor — no immediate action |
+ | **medium** | Schedule repair |
+ | **high** | Shut off circuit immediately |
+ | **critical** | Shut off main breaker immediately |
+
+ ## Fusion Logic
+
+ The fusion layer combines image and audio predictions:
+
+ - If **both agree** on issue → `issue_detected` with max severity
+ - If **both agree** on normal with high confidence → `normal`
+ - If **they disagree** → `uncertain` (unless one has >92% confidence)
+ - **Safety-first:** defaults to `uncertain` over `normal` when confidence is low
+
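These rules correspond to the `fuse_modalities(...)` call visible in `app.py`; a self-contained sketch of the decision logic follows. The real implementation lives in `src/fusion/fusion_logic.py`, and taking the minimum of the two confidences as the fused confidence is an assumption here:

```python
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = ["low", "medium", "high", "critical"]

@dataclass
class ModalityOutput:
    result: str                 # "issue_detected" | "normal" | "uncertain"
    issue_type: Optional[str]
    severity: str
    confidence: float

def fuse_modalities(img, aud, confidence_issue_min=0.6, confidence_normal_min=0.75,
                    uncertain_if_disagree=True, high_confidence_override=0.92):
    max_sev = max(img.severity, aud.severity, key=SEVERITY_ORDER.index)
    conf = min(img.confidence, aud.confidence)  # assumed aggregation

    if img.result == "issue_detected" and aud.result == "issue_detected":
        # agreement on an issue -> issue_detected with max severity
        return {"result": "issue_detected", "severity": max_sev, "confidence": conf}
    if img.result == "normal" and aud.result == "normal":
        # "normal" only when both are confidently normal; otherwise stay uncertain
        if conf >= confidence_normal_min:
            return {"result": "normal", "severity": "low", "confidence": conf}
        return {"result": "uncertain", "severity": "low", "confidence": conf}
    # disagreement: a single very confident issue report can override
    for m in (img, aud):
        if m.result == "issue_detected" and m.confidence >= high_confidence_override:
            return {"result": "issue_detected", "severity": m.severity,
                    "confidence": m.confidence}
    return {"result": "uncertain", "severity": max_sev, "confidence": conf}
```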
+ ## Limitations
+
+ - Image model trained on web-sourced images (some watermarked/AI-generated)
+ - Audio model trained on 100 synthetic clips — use as supporting evidence only
+ - Water damage and cracked faceplate classes have lower recall (64-67%)
+ - No GFCI failure detection (no training data available)
+ - Real-world accuracy will be lower than validation metrics
+
+ ## License
+
+ Proprietary — for use in the Electrical Outlets diagnostic pipeline only.
app.py ADDED
@@ -0,0 +1,262 @@
+ """
+ Electrical Outlets Diagnostic — Gradio Demo
+ Install: pip install gradio
+ Run: python app.py
+ """
+ from pathlib import Path
+ import sys
+ import json
+
+ import torch
+ import numpy as np
+ from torchvision import transforms
+ from PIL import Image
+
+ ROOT = Path(__file__).resolve().parent
+ sys.path.insert(0, str(ROOT))
+
+ DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+ IMAGE_MODEL = None
+ IMAGE_TEMP = 1.0
+ AUDIO_MODEL = None
+ AUDIO_TEMP = 1.0
+ AUDIO_CFG = {}
+
+
+ def load_models():
+     global IMAGE_MODEL, IMAGE_TEMP, AUDIO_MODEL, AUDIO_TEMP, AUDIO_CFG
+
+     img_weights = ROOT / "weights" / "electrical_outlets_image_best.pt"
+     mapping = ROOT / "config" / "label_mapping.json"
+
+     if img_weights.exists():
+         from src.models.image_model import ElectricalOutletsImageModel
+         ckpt = torch.load(img_weights, map_location=DEVICE, weights_only=False)
+         head_hidden = ckpt["model_state_dict"]["head.1.weight"].shape[0]
+         IMAGE_MODEL = ElectricalOutletsImageModel(
+             num_classes=ckpt["num_classes"], label_mapping_path=mapping,
+             pretrained=False, head_hidden=head_hidden,
+         )
+         IMAGE_MODEL.load_state_dict(ckpt["model_state_dict"])
+         IMAGE_MODEL.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+         IMAGE_MODEL.idx_to_severity = ckpt.get("idx_to_severity")
+         IMAGE_MODEL.eval().to(DEVICE)
+         T = ckpt.get("temperature", 1.0)
+         IMAGE_TEMP = T if 0 < T < 10 else 1.0
+         print(f"Image model loaded ({ckpt['num_classes']} classes, head={head_hidden})")
+
+     audio_weights = ROOT / "weights" / "electrical_outlets_audio_best.pt"
+     if audio_weights.exists():
+         from src.models.audio_model import ElectricalOutletsAudioModel
+         import yaml
+         ckpt = torch.load(audio_weights, map_location=DEVICE, weights_only=False)
+         audio_cfg_path = ROOT / "config" / "audio_train_config.yaml"
+         n_mels, time_steps = 128, 128
+         if audio_cfg_path.exists():
+             with open(audio_cfg_path) as f:
+                 AUDIO_CFG = yaml.safe_load(f)
+             n_mels = AUDIO_CFG.get("model", {}).get("n_mels", 128)
+             time_steps = AUDIO_CFG.get("model", {}).get("time_steps", 128)
+         AUDIO_MODEL = ElectricalOutletsAudioModel(
+             num_classes=ckpt["num_classes"], label_mapping_path=mapping,
+             n_mels=n_mels, time_steps=time_steps,
+         )
+         AUDIO_MODEL.load_state_dict(ckpt["model_state_dict"])
+         AUDIO_MODEL.idx_to_label = ckpt.get("idx_to_label")
+         AUDIO_MODEL.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+         AUDIO_MODEL.idx_to_severity = ckpt.get("idx_to_severity")
+         AUDIO_MODEL.eval().to(DEVICE)
+         T = ckpt.get("temperature", 1.0)
+         AUDIO_TEMP = T if 0 < T < 10 else 1.0
+         print(f"Audio model loaded ({ckpt['num_classes']} classes)")
+
+
+ SEV_COLORS = {"low": "#22c55e", "medium": "#f59e0b", "high": "#ef4444", "critical": "#dc2626"}
+ SEV_ICONS = {"low": "✅", "medium": "⚠️", "high": "🔴", "critical": "🚨"}
+
+
+ def make_bar_html(probs_dict, highlight=None):
+     rows = ""
+     for name, prob in sorted(probs_dict.items(), key=lambda x: -x[1]):
+         pct = prob * 100
+         color = "#60a5fa" if name != highlight else "#f59e0b"
+         rows += f"""
+         <div style="display:flex;align-items:center;gap:8px;margin:3px 0;">
+           <div style="width:140px;font-size:13px;text-align:right;color:#ccc;">{name.replace('_', ' ')}</div>
+           <div style="flex:1;background:#2a2a3e;border-radius:4px;height:20px;overflow:hidden;">
+             <div style="width:{pct}%;background:{color};height:100%;border-radius:4px;"></div>
+           </div>
+           <div style="width:55px;font-size:13px;color:#eee;">{pct:.1f}%</div>
+         </div>"""
+     return f'<div style="padding:8px 0;">{rows}</div>'
+
+
+ def make_result_html(pred, title, probs_dict=None):
+     sev = pred.get("severity", "low")
+     color = SEV_COLORS.get(sev, "#666")
+     sev_icon = SEV_ICONS.get(sev, "")
+     conf = pred.get("confidence", 0)
+     issue = (pred.get("issue_type") or "uncertain").replace("_", " ").title()
+     result_text = pred.get("result", "").replace("_", " ").title()
+     bars = make_bar_html(probs_dict, pred.get("issue_type")) if probs_dict else ""
+
+     return f"""
+     <div style="background:#1a1a2e;border-radius:12px;padding:20px;margin:8px 0;
+                 border-left:4px solid {color};color:#e0e0e0;font-family:system-ui;">
+       <div style="font-size:12px;color:#888;text-transform:uppercase;letter-spacing:1px;margin-bottom:10px;">{title}</div>
+       <div style="font-size:26px;font-weight:700;margin-bottom:6px;">{result_text}</div>
+       <div style="font-size:18px;color:{color};font-weight:600;margin-bottom:14px;">{issue}</div>
+       <div style="display:flex;gap:32px;">
+         <div><div style="font-size:11px;color:#888;text-transform:uppercase;">Severity</div>
+         <div style="font-size:15px;font-weight:600;color:{color};">{sev_icon} {sev.upper()}</div></div>
+         <div><div style="font-size:11px;color:#888;text-transform:uppercase;">Confidence</div>
+         <div style="font-size:15px;font-weight:600;">{conf:.1%}</div></div>
+       </div>
+       {bars}
+     </div>"""
+
+
+ def predict_image_fn(img):
+     if IMAGE_MODEL is None:
+         return None, None
+     tf = transforms.Compose([
+         transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
+         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+     ])
+     x = tf(img.convert("RGB")).unsqueeze(0).to(DEVICE)
+     with torch.no_grad():
+         logits = IMAGE_MODEL(x) / IMAGE_TEMP
+     probs = torch.softmax(logits, dim=-1)[0]
+     pred = IMAGE_MODEL.predict_to_schema(logits)
+     probs_dict = {IMAGE_MODEL.idx_to_issue_type[i]: p for i, p in enumerate(probs.tolist())}
+     return pred, probs_dict
+
+
+ def predict_audio_fn(audio_tuple):
+     if AUDIO_MODEL is None:
+         return None, None
+     import torchaudio
+     sr_in, audio_data = audio_tuple
+     if isinstance(audio_data, np.ndarray):
+         waveform = torch.from_numpy(audio_data.astype(np.float32))
+         if waveform.dim() == 1:
+             waveform = waveform.unsqueeze(0)
+         elif waveform.dim() == 2:
+             if waveform.shape[1] <= 2:
+                 waveform = waveform.T
+             if waveform.shape[0] > 1:
+                 waveform = waveform.mean(dim=0, keepdim=True)
+         mx = waveform.abs().max()
+         if mx > 0:
+             waveform = waveform / mx
+     else:
+         return None, None
+
+     sample_rate = AUDIO_CFG.get("data", {}).get("sample_rate", 22050)
+     if sr_in != sample_rate:
+         waveform = torchaudio.functional.resample(waveform, sr_in, sample_rate)
+     target_len = int(AUDIO_CFG.get("data", {}).get("target_length_sec", 5.0) * sample_rate)
+     if waveform.shape[1] >= target_len:
+         s = (waveform.shape[1] - target_len) // 2
+         waveform = waveform[:, s:s + target_len]
+     else:
+         waveform = torch.nn.functional.pad(waveform, (0, target_len - waveform.shape[1]))
+
+     sc = AUDIO_CFG.get("spectrogram", {})
+     mel = torchaudio.transforms.MelSpectrogram(
+         sample_rate=sample_rate, n_fft=sc.get("n_fft", 1024),
+         hop_length=sc.get("hop_length", 512), win_length=sc.get("win_length", 1024),
+         n_mels=sc.get("n_mels", 128),
+     )(waveform)
+     log_mel = torch.log(mel.clamp(min=1e-5)).unsqueeze(0).to(DEVICE)
+
+     with torch.no_grad():
+         logits = AUDIO_MODEL(log_mel) / AUDIO_TEMP
+     probs = torch.softmax(logits, dim=-1)[0]
+     pred = AUDIO_MODEL.predict_to_schema(logits)
+     labels = AUDIO_MODEL.idx_to_label or [f"class_{i}" for i in range(AUDIO_MODEL.num_classes)]
+     probs_dict = {labels[i]: p for i, p in enumerate(probs.tolist())}
+     return pred, probs_dict
+
+
+ def fuse_fn(image_pred, audio_pred):
+     from src.fusion.fusion_logic import fuse_modalities, ModalityOutput
+     import yaml
+     th_path = ROOT / "config" / "thresholds.yaml"
+     th = {}
+     if th_path.exists():
+         with open(th_path) as f:
+             th = yaml.safe_load(f) or {}
+     img_out = ModalityOutput(result=image_pred["result"], issue_type=image_pred.get("issue_type"),
+                              severity=image_pred["severity"], confidence=image_pred["confidence"])
+     aud_out = ModalityOutput(result=audio_pred["result"], issue_type=audio_pred.get("issue_type"),
+                              severity=audio_pred["severity"], confidence=audio_pred["confidence"])
+     return fuse_modalities(img_out, aud_out,
+                            confidence_issue_min=th.get("confidence_issue_min", 0.6),
+                            confidence_normal_min=th.get("confidence_normal_min", 0.75),
+                            uncertain_if_disagree=th.get("uncertain_if_disagree", True),
+                            high_confidence_override=th.get("high_confidence_override", 0.92))
+
+
+ def diagnose(image, audio):
+     if image is None and audio is None:
+         return '<div style="padding:40px;color:#888;text-align:center;font-style:italic;">Upload an image or audio to begin diagnosis...</div>'
+
+     img_pred, img_probs, aud_pred, aud_probs = None, None, None, None
+     try:
+         if image is not None:
+             img = Image.fromarray(image) if isinstance(image, np.ndarray) else image
+             img_pred, img_probs = predict_image_fn(img)
+         if audio is not None:
+             aud_pred, aud_probs = predict_audio_fn(audio)
+     except Exception as e:
+         return f'<div style="padding:20px;color:#f87171;">Error: {e}</div>'
+
+     html = ""
+     if img_pred and aud_pred:
+         fused = fuse_fn(img_pred, aud_pred)
+         html += make_result_html(fused, "⚡ Fused Diagnosis")
+         html += '<div style="display:flex;gap:12px;">'
+         html += f'<div style="flex:1;">{make_result_html(img_pred, "📷 Image", img_probs)}</div>'
+         html += f'<div style="flex:1;">{make_result_html(aud_pred, "🎤 Audio", aud_probs)}</div>'
+         html += '</div>'
+     elif img_pred:
+         html += make_result_html(img_pred, "📷 Image Diagnosis", img_probs)
+     elif aud_pred:
+         html += make_result_html(aud_pred, "🎤 Audio Diagnosis", aud_probs)
+     else:
+         html = '<div style="padding:20px;color:#f87171;">Could not process input.</div>'
+     return html
+
+
+ if __name__ == "__main__":
+     import gradio as gr
+
+     print("Loading models...")
+     load_models()
+     print(f"Device: {DEVICE}\n")
+
+     with gr.Blocks(
+         title="Electrical Outlets Diagnostic",
+         theme=gr.themes.Base(primary_hue="red", secondary_hue="amber", neutral_hue="slate",
+                              font=gr.themes.GoogleFont("Inter")),
+         css=".gradio-container{max-width:960px!important} footer{display:none!important}"
+     ) as demo:
+
+         gr.Markdown("# ⚡ Electrical Outlets Diagnostic\nUpload a **photo** and/or **audio** to detect safety issues.")
+
+         with gr.Row():
+             with gr.Column(scale=1):
+                 image_input = gr.Image(label="📷 Outlet Photo", type="numpy", height=300)
+                 audio_input = gr.Audio(label="🎤 Audio Recording", type="numpy")
+                 btn = gr.Button("🔍 Diagnose", variant="primary", size="lg")
+             with gr.Column(scale=1):
+                 output = gr.HTML(value='<div style="padding:40px;color:#888;text-align:center;font-style:italic;">Upload an image or audio to begin...</div>')
+
+         btn.click(fn=diagnose, inputs=[image_input, audio_input], outputs=[output])
+         image_input.change(fn=diagnose, inputs=[image_input, audio_input], outputs=[output])
+         audio_input.change(fn=diagnose, inputs=[image_input, audio_input], outputs=[output])
+
+         gr.Markdown("---\n| Severity | Action |\n|--|--|\n| ✅ Low | Monitor |\n| ⚠️ Medium | Schedule repair |\n| 🔴 High | Shut off circuit |\n| 🚨 Critical | Shut off main breaker |")
+
+     demo.launch(server_name="127.0.0.1", server_port=7860, share=False, show_error=True)
config/audio_train_config.yaml ADDED
@@ -0,0 +1,41 @@
+ # Audio model training config - Electrical Outlets
+ # 100 samples: heavy augmentation, balanced batching, treat as preliminary
+
+ data:
+   root: "electrical_outlets_sounds_100"
+   label_mapping: "config/label_mapping.json"
+   train_ratio: 0.7
+   val_ratio: 0.15
+   seed: 42
+   batch_size: 16
+   num_workers: 0
+   target_length_sec: 5.0
+   sample_rate: 16000
+
+ spectrogram:
+   n_mels: 64
+   n_fft: 512
+   hop_length: 256
+   win_length: 512
+
+ model:
+   num_classes: 4
+   n_mels: 64
+   time_steps: 128
+
+ training:
+   epochs: 80
+   lr: 1.0e-3
+   weight_decay: 1.0e-4
+   use_class_weights: true
+   early_stopping_patience: 12
+   early_stopping_metric: "val_macro_recall"
+
+ calibration:
+   use_temperature_scaling: true
+   val_fraction_for_calibration: 0.5
+
+ output:
+   weights_dir: "weights"
+   best_name: "electrical_outlets_audio_best.pt"
+   report_name: "audio_model_report.md"
config/image_train_config.yaml ADDED
@@ -0,0 +1,43 @@
+ # v5.1 — Push past 63% min recall
+ # Changes: higher finetune LR, bigger head, fixed temp scaling
+
+ data:
+   root: "ELECTRICAL OUTLETS-20260106T153508Z-3-001"
+   label_mapping: "config/label_mapping.json"
+   train_ratio: 0.7
+   val_ratio: 0.15
+   seed: 42
+   batch_size: 64
+   num_workers: 4
+
+ augmentation:
+   resize: 256
+   crop: 224
+
+ model:
+   num_classes: 5
+   pretrained: true
+   head_hidden: 512      # was 256 — more capacity with 1300 images
+   head_dropout: 0.5     # was 0.4 — stronger regularization
+
+ training:
+   epochs: 100                      # was 80 — give head more time
+   lr: 3.0e-3
+   weight_decay: 1.0e-3
+   use_class_weights: true
+   use_focal: true
+   focal_alpha: 0.25
+   focal_gamma: 2.0
+   early_stopping_patience: 25      # was 20
+   early_stopping_metric: "val_min_recall"
+   finetune_last_blocks: true
+   finetune_lr: 2.0e-4              # was 5e-5 — 4x higher, backbone needs to adapt more
+   finetune_epochs: 30              # was 25
+
+ calibration:
+   use_temperature_scaling: false   # DISABLED — was producing negative T
+
+ output:
+   weights_dir: "weights"
+   best_name: "electrical_outlets_image_best.pt"
+   report_name: "image_model_report.md"
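Temperature scaling was disabled here after producing a negative T (and `app.py` guards against out-of-range T at load time). Searching only over positive temperatures makes a negative T impossible by construction. A NumPy grid-search sketch of the idea; the training script presumably uses a gradient-based fit, so this is illustrative only:

```python
import numpy as np

def fit_temperature(logits, labels, grid=np.linspace(0.05, 10.0, 200)):
    """Pick T > 0 minimizing validation NLL of softmax(logits / T)."""
    def nll(T):
        z = logits / T
        z = z - z.max(axis=1, keepdims=True)                      # stable softmax
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(labels)), labels].mean()  # mean NLL
    return min(grid, key=nll)   # positivity guaranteed by the search grid
```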
config/label_mapping.json ADDED
@@ -0,0 +1,93 @@
+ {
+   "image": {
+     "classes": [
+       {
+         "folder_key": "burn_marks_overheating",
+         "issue_type": "burn_overheating",
+         "severity": "high",
+         "description": "Fire, overheating, sparking, discoloration"
+       },
+       {
+         "folder_key": "cracked_faceplates",
+         "issue_type": "cracked_faceplate",
+         "severity": "medium",
+         "description": "Cracked/broken faceplate, damaged switches"
+       },
+       {
+         "folder_key": "loose_outlets",
+         "issue_type": "loose_outlet",
+         "severity": "medium",
+         "description": "Loose outlet, pulled from wall, exposed wiring"
+       },
+       {
+         "folder_key": "normal_outlets",
+         "issue_type": "normal",
+         "severity": "low",
+         "description": "Normal outlet/switch condition"
+       },
+       {
+         "folder_key": "water_exposed",
+         "issue_type": "water_exposed",
+         "severity": "high",
+         "description": "Water intrusion near outlet"
+       }
+     ],
+     "folder_to_class": {
+       "Burn marks - overheating 250": "burn_marks_overheating",
+       "Discoloration (heat aging) 100": "burn_marks_overheating",
+       "Sparking damage evidence 150": "burn_marks_overheating",
+       "Cracked faceplate 150": "cracked_faceplates",
+       "Damaged switches 50": "cracked_faceplates",
+       "Loose outlet - pulled from wall 200": "loose_outlets",
+       "Exposed wiring 150": "loose_outlets",
+       "Normal outlets 50": "normal_outlets",
+       "Normal switches 50": "normal_outlets",
+       "Water intrusion near outlet 150": "water_exposed"
+     },
+     "class_to_idx": {
+       "burn_marks_overheating": 0,
+       "cracked_faceplates": 1,
+       "loose_outlets": 2,
+       "normal_outlets": 3,
+       "water_exposed": 4
+     },
+     "idx_to_issue_type": [
+       "burn_overheating",
+       "cracked_faceplate",
+       "loose_outlet",
+       "normal",
+       "water_exposed"
+     ],
+     "idx_to_severity": ["high", "medium", "medium", "low", "high"]
+   },
+   "audio": {
+     "file_pattern_to_label": {
+       "normal_near_silent": "normal",
+       "plug_insert_remove_clicks": "normal",
+       "load_switching": "normal",
+       "buzzing_outlet": "buzzing",
+       "loose_contact_crackle": "crackling_arcing",
+       "arcing_pop": "arcing_pop"
+     },
+     "label_to_severity": {
+       "normal": "low",
+       "buzzing": "high",
+       "crackling_arcing": "high",
+       "arcing_pop": "critical"
+     },
+     "label_to_issue_type": {
+       "normal": "normal",
+       "buzzing": "buzzing",
+       "crackling_arcing": "crackling_arcing",
+       "arcing_pop": "arcing_pop"
+     },
+     "class_to_idx": {
+       "normal": 0,
+       "buzzing": 1,
+       "crackling_arcing": 2,
+       "arcing_pop": 3
+     },
+     "idx_to_label": ["normal", "buzzing", "crackling_arcing", "arcing_pop"],
+     "num_classes": 4
+   }
+ }
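The `folder_to_class` table merges the 10 source folders into 5 training classes. A sketch of how a dataset loader might resolve an image's label, using an inline excerpt of the mapping above (the paths are hypothetical; the real loaders are `src/data/image_dataset.py` and friends):

```python
from pathlib import Path

# Excerpt of config/label_mapping.json, "image" section
IMAGE_MAPPING = {
    "folder_to_class": {
        "Burn marks - overheating 250": "burn_marks_overheating",
        "Damaged switches 50": "cracked_faceplates",
        "Normal outlets 50": "normal_outlets",
    },
    "class_to_idx": {
        "burn_marks_overheating": 0, "cracked_faceplates": 1,
        "loose_outlets": 2, "normal_outlets": 3, "water_exposed": 4,
    },
}

def label_for(path, mapping=IMAGE_MAPPING):
    """Resolve '<root>/<source folder>/img.jpg' to a merged class index."""
    folder = Path(path).parent.name                        # e.g. "Damaged switches 50"
    class_key = mapping["folder_to_class"][folder]         # e.g. "cracked_faceplates"
    return mapping["class_to_idx"][class_key]
```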
config/schema.yaml ADDED
@@ -0,0 +1,81 @@
+ # Canonical output schema for Electrical Outlets diagnostic element
+ # Used by image model, audio model, fusion layer, and API
+ # Aligned to client PDF: Electrical outlet & switches diagnostics
+
+ diagnostic_element: electrical_outlets
+
+ result:
+   type: string
+   enum:
+     - issue_detected
+     - normal
+     - uncertain
+   description: "Final outcome; uncertain triggers backend-guided adjustment or escalation"
+
+ issue_type:
+   type: string
+   nullable: true
+   enum:
+     # Image-derived (NOT OPEN, PDF diagnostics 1-38)
+     - burn_overheating
+     - cracked_faceplate
+     - gfci_failure
+     - loose_outlet
+     - water_exposed
+     # Audio-derived (PDF diagnostics 21-28)
+     - buzzing
+     - humming
+     - crackling_arcing
+     - arcing_pop
+     - sizzling
+     - clicking_idle
+     # Combined / generic
+     - normal
+   description: "Primary issue type when result is issue_detected; null for normal/uncertain when no single type"
+
+ severity:
+   type: string
+   enum:
+     - low
+     - medium
+     - high
+     - critical
+   description: "Per PDF: low=monitor, medium=repair, high=shut circuit, critical=shut main breaker"
+
+ confidence:
+   type: number
+   minimum: 0
+   maximum: 1
+   description: "Calibrated probability; drives uncertain path when below threshold"
+
+ modality_contributions:
+   type: object
+   nullable: true
+   properties:
+     image:
+       type: object
+       nullable: true
+       properties:
+         result: { type: string }
+         issue_type: { type: string, nullable: true }
+         severity: { type: string }
+         confidence: { type: number }
+     audio:
+       type: object
+       nullable: true
+       properties:
+         result: { type: string }
+         issue_type: { type: string, nullable: true }
+         severity: { type: string }
+         confidence: { type: number }
+   description: "Per-modality outputs for transparency; present when both image and audio provided"
+
+ # For fusion when both modalities detect different issues
+ primary_issue:
+   type: string
+   nullable: true
+   description: "Higher-severity issue when both modalities detect issues"
+ secondary_issue:
+   type: string
+   nullable: true
+   description: "Other issue when both modalities detect different issues"
config/thresholds.yaml ADDED
@@ -0,0 +1,13 @@
+ # Threshold and safety configuration - Electrical Outlets
+ # Prefer "uncertain" over "normal" when in doubt (minimize false negatives)
+
+ confidence_issue_min: 0.6       # below this -> result = uncertain when issue_detected
+ confidence_normal_min: 0.75     # both modalities must exceed this to return "normal"
+ uncertain_if_disagree: true     # image defect + audio normal (or vice versa) -> uncertain unless one side very high
+ high_confidence_override: 0.92  # if one modality >= this and says issue_detected, can override disagree
+
+ severity_order:
+   - low
+   - medium
+   - high
+   - critical
releases.md ADDED
@@ -0,0 +1,8 @@
+ # Releases
+
+ ## v1.0 — February 2026
+ - 5-class image model (EfficientNet-B0 + MLP head): 77% accuracy, 67% min recall
+ - 4-class audio model (Spectrogram CNN): 100% recall
+ - Decision-level fusion with configurable thresholds
+ - Gradio demo app
+ - FastAPI endpoint
requirements.txt ADDED
@@ -0,0 +1,145 @@
+ aiofiles==23.2.1
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.11.18
+ aiosignal==1.3.2
+ altair==5.5.0
+ annotated-doc==0.0.4
+ annotated-types==0.7.0
+ anyio==4.9.0
+ async-timeout==4.0.3
+ attrs==25.3.0
+ beautifulsoup4==4.13.4
+ Brotli @ file:///D:/bld/brotli-split_1725267609074/work
+ certifi @ file:///home/conda/feedstock_root/build_artifacts/certifi_1739515848642/work/certifi
+ cffi @ file:///D:/bld/cffi_1725560792189/work
+ charset-normalizer @ file:///home/conda/feedstock_root/build_artifacts/charset-normalizer_1746214863626/work
+ click==8.1.8
+ colorama==0.4.6
+ comtypes==1.4.10
+ contourpy==1.3.0
+ cycler==0.12.1
+ dataclasses-json==0.6.7
+ docopt==0.6.2
+ exceptiongroup==1.2.2
+ fastapi==0.95.2
+ ffmpy==1.0.0
+ filelock==3.18.0
+ fonttools==4.60.2
+ frozenlist==1.6.0
+ fsspec==2025.3.2
+ gradio==3.50.2
+ gradio_client==0.6.1
+ greenlet==3.2.1
+ h11==0.16.0
+ h2 @ file:///home/conda/feedstock_root/build_artifacts/h2_1738578511449/work
+ hpack @ file:///home/conda/feedstock_root/build_artifacts/hpack_1737618293087/work
+ httpcore==1.0.9
+ httpx==0.28.1
+ httpx-sse==0.4.0
+ huggingface-hub==0.31.1
+ hyperframe @ file:///home/conda/feedstock_root/build_artifacts/hyperframe_1737618333194/work
+ idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1733211830134/work
+ importlib_resources==6.5.2
+ Jinja2==3.1.6
+ joblib==1.5.3
+ Js2Py==0.74
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jsonschema==4.25.1
+ jsonschema-specifications==2025.9.1
+ kiwisolver==1.4.7
+ langchain==0.3.25
+ langchain-community==0.3.23
+ langchain-core==0.3.58
+ langchain-ollama==0.3.2
+ langchain-text-splitters==0.3.8
+ langgraph==0.4.1
+ langgraph-checkpoint==2.0.25
+ langgraph-prebuilt==0.1.8
+ langgraph-sdk==0.1.66
+ langsmith==0.3.42
+ llvmlite==0.43.0
+ markdown-it-py==3.0.0
+ MarkupSafe==2.1.5
+ marshmallow==3.26.1
+ matplotlib==3.9.4
+ mdurl==0.1.2
+ more-itertools==10.7.0
+ mpmath==1.3.0
+ multidict==6.4.3
+ mypy_extensions==1.1.0
+ narwhals==2.17.0
+ networkx==3.2.1
+ numba==0.60.0
+ numpy==1.26.4
+ ollama==0.4.8
+ openai-whisper==20240930
+ orjson==3.10.18
+ ormsgpack==1.9.1
+ packaging==24.2
+ pandas==2.3.3
+ pillow==10.4.0
+ pipwin==0.5.2
+ propcache==0.3.1
+ PyAudio==0.2.14
+ pycparser @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_pycparser_1733195786/work
+ pydantic==1.10.13
+ pydantic-settings==2.9.1
+ pydantic_core==2.41.5
+ pydub==0.25.1
+ pygame==2.6.1
+ Pygments==2.19.2
+ pyjsparser==2.7.1
+ pyparsing==3.3.2
+ pypiwin32==223
+ PyPrind==2.11.3
+ pySmartDL==1.3.4
+ PySocks @ file:///D:/bld/pysocks_1733217287171/work
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.1.0
+ python-multipart==0.0.20
+ pyttsx3==2.98
+ pytz==2025.2
+ pywin32==310
+ PyYAML==6.0.2
+ referencing==0.36.2
+ regex==2024.11.6
+ requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1733217035951/work
+ requests-toolbelt==1.0.0
+ rich==14.3.3
+ rpds-py==0.27.1
+ ruff==0.15.2
+ scikit-learn==1.6.1
+ scipy==1.13.1
+ semantic-version==2.10.0
+ shellingham==1.5.4
+ six==1.17.0
+ sniffio==1.3.1
+ soundfile==0.13.1
+ soupsieve==2.7
+ SpeechRecognition @ file:///home/conda/feedstock_root/build_artifacts/speechrecognition_1742707644995/work
+ SQLAlchemy==2.0.40
+ starlette==0.27.0
+ sympy==1.13.1
+ tenacity==9.1.2
+ threadpoolctl==3.6.0
+ tiktoken==0.9.0
+ tomlkit==0.12.0
+ torch==2.6.0+cu124
+ torchaudio==2.6.0+cu124
+ torchvision==0.21.0+cu124
+ tqdm==4.67.1
+ typer==0.23.2
+ typing-inspect==0.9.0
+ typing-inspection==0.4.2
+ typing_extensions==4.15.0
+ tzdata==2025.2
+ tzlocal==5.3.1
+ urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1744323578849/work
+ uvicorn==0.39.0
+ websockets==11.0.3
+ win_inet_pton @ file:///D:/bld/win_inet_pton_1733130564612/work
+ xxhash==3.5.0
+ yarl==1.20.0
+ zipp==3.23.0
+ zstandard==0.23.0
src/__init__.py ADDED
@@ -0,0 +1,4 @@
+ from .models.image_model import ElectricalOutletsImageModel
+ from .models.audio_model import ElectricalOutletsAudioModel
+
+ __all__ = ["ElectricalOutletsImageModel", "ElectricalOutletsAudioModel"]
src/data/audio_dataset.py ADDED
@@ -0,0 +1,105 @@
+ """
+ Audio dataset for Electrical Outlets. Uses README/file naming and config/label_mapping.json.
+ PATCHED: rglob for subfolders, torchaudio import at module level, stratified splits.
+ """
+ from pathlib import Path
+ import json
+ import logging
+ from collections import defaultdict
+ from typing import Optional, Callable, List, Tuple
+
+ import torch
+ import torchaudio
+ from torch.utils.data import Dataset
+
+ logger = logging.getLogger(__name__)
+
+
+ def _label_from_filename(filename: str, file_pattern_to_label: dict) -> str:
+     for pattern, label in file_pattern_to_label.items():
+         if filename.startswith(pattern) or pattern in filename:
+             return label
+     return "normal"
+
+
+ class ElectricalOutletsAudioDataset(Dataset):
+     """Audio dataset from electrical_outlets_sounds_100 WAVs."""
+
+     def __init__(
+         self,
+         root: Path,
+         label_mapping_path: Path,
+         split: str = "train",
+         train_ratio: float = 0.7,
+         val_ratio: float = 0.15,
+         seed: int = 42,
+         transform: Optional[Callable] = None,
+         target_length_sec: float = 5.0,
+         sample_rate: int = 22050,
+     ):
+         self.root = Path(root)
+         self.transform = transform
+         self.target_length_sec = target_length_sec
+         self.sample_rate = sample_rate
+         with open(label_mapping_path) as f:
+             lm = json.load(f)
+         self.file_pattern_to_label = lm["audio"]["file_pattern_to_label"]
+         self.class_to_idx = lm["audio"]["class_to_idx"]
+         self.idx_to_label = lm["audio"]["idx_to_label"]
+         self.label_to_severity = lm["audio"]["label_to_severity"]
+         self.label_to_issue_type = lm["audio"]["label_to_issue_type"]
+         self.num_classes = len(self.class_to_idx)
+
+         self.samples: List[Tuple[Path, int]] = []
+         # rglob to search subfolders
+         for wav in self.root.rglob("*.wav"):
+             label = _label_from_filename(wav.stem, self.file_pattern_to_label)
+             if label not in self.class_to_idx:
+                 logger.warning(f"Unmatched audio file: {wav.name} → label '{label}' not in class_to_idx")
+                 continue
+             self.samples.append((wav, self.class_to_idx[label]))
+
+         # Stratified split
+         by_class = defaultdict(list)
+         for i, (_, cls) in enumerate(self.samples):
+             by_class[cls].append(i)
+
+         train_idx, val_idx, test_idx = [], [], []
+         for cls in sorted(by_class.keys()):
+             indices = by_class[cls]
+             g = torch.Generator().manual_seed(seed)
+             perm = torch.randperm(len(indices), generator=g).tolist()
+             n_cls = len(indices)
+             n_tr = int(n_cls * train_ratio)
+             n_va = int(n_cls * val_ratio)
+             train_idx.extend([indices[p] for p in perm[:n_tr]])
+             val_idx.extend([indices[p] for p in perm[n_tr:n_tr + n_va]])
+             test_idx.extend([indices[p] for p in perm[n_tr + n_va:]])
+
+         if split == "train":
+             self.indices = train_idx
+         elif split == "val":
+             self.indices = val_idx
+         else:
+             self.indices = test_idx
+
+     def __len__(self) -> int:
+         return len(self.indices)
+
+     def __getitem__(self, idx: int):
+         i = self.indices[idx]
+         path, cls = self.samples[i]
+         waveform, sr = torchaudio.load(str(path))
+         if sr != self.sample_rate:
+             waveform = torchaudio.functional.resample(waveform, sr, self.sample_rate)
+         if waveform.shape[0] > 1:
+             waveform = waveform.mean(dim=0, keepdim=True)
+         target_len = int(self.target_length_sec * self.sample_rate)
+         if waveform.shape[1] >= target_len:
+             start = (waveform.shape[1] - target_len) // 2
+             waveform = waveform[:, start : start + target_len]
+         else:
+             waveform = torch.nn.functional.pad(waveform, (0, target_len - waveform.shape[1]))
+         if self.transform:
+             waveform = self.transform(waveform)
+         return waveform, cls
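Both datasets share the same per-class 70/15/15 split so that every class appears in every split. A standalone sketch of that logic, using `random.Random` in place of `torch.Generator` for portability (function name is illustrative):

```python
import random
from collections import defaultdict

def stratified_split(labels, train_ratio=0.7, val_ratio=0.15, seed=42):
    # labels: one class id per sample; returns (train, val, test) index lists,
    # shuffled and sliced per class so split ratios hold within each class
    by_class = defaultdict(list)
    for i, c in enumerate(labels):
        by_class[c].append(i)
    train, val, test = [], [], []
    for c in sorted(by_class):
        idx = list(by_class[c])
        random.Random(seed).shuffle(idx)       # deterministic per-class shuffle
        n_tr = int(len(idx) * train_ratio)
        n_va = int(len(idx) * val_ratio)
        train += idx[:n_tr]
        val += idx[n_tr:n_tr + n_va]
        test += idx[n_tr + n_va:]
    return train, val, test
```

The test split receives the remainder, so no sample is dropped even when the ratios do not divide a class evenly.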
src/data/image_dataset.py ADDED
@@ -0,0 +1,155 @@
+ """
+ Image dataset for Electrical Outlets.
+ FINAL v5: Direct folder_to_class mapping - no pattern matching, no ambiguity.
+ """
+ from pathlib import Path
+ import json
+ import logging
+ from collections import defaultdict
+ from typing import Optional, Callable, List, Tuple
+
+ import torch
+ from torch.utils.data import Dataset
+ from PIL import Image
+
+ logger = logging.getLogger(__name__)
+ logging.basicConfig(level=logging.INFO, format="%(message)s")
+
+
+ class ElectricalOutletsImageDataset(Dataset):
+
+     def __init__(
+         self,
+         root: Path,
+         label_mapping_path: Path,
+         split: str = "train",
+         train_ratio: float = 0.7,
+         val_ratio: float = 0.15,
+         seed: int = 42,
+         transform: Optional[Callable] = None,
+         extensions: Tuple[str, ...] = (".jpg", ".jpeg", ".png"),
+     ):
+         self.root = Path(root)
+         self.transform = transform
+         self.extensions = extensions
+         self.split = split
+
+         with open(label_mapping_path) as f:
+             lm = json.load(f)
+
+         self.folder_to_class = lm["image"]["folder_to_class"]
+         self.class_to_idx = lm["image"]["class_to_idx"]
+         self.idx_to_issue_type = lm["image"]["idx_to_issue_type"]
+         self.idx_to_severity = lm["image"]["idx_to_severity"]
+         self.num_classes = len(self.class_to_idx)
+
+         # Build samples list
+         self.samples: List[Tuple[Path, int]] = []
+         class_counts = defaultdict(int)
+         matched_folders = []
+         unmatched_folders = []
+
+         for folder in sorted(self.root.iterdir()):
+             if not folder.is_dir():
+                 continue
+             # Direct lookup by exact folder name
+             class_key = self.folder_to_class.get(folder.name)
+             if class_key is None:
+                 unmatched_folders.append(folder.name)
+                 continue
+             cls_idx = self.class_to_idx[class_key]
+             count = 0
+             for f in folder.iterdir():
+                 if f.suffix.lower() in self.extensions:
+                     self.samples.append((f, cls_idx))
+                     count += 1
+             class_counts[cls_idx] += count
+             matched_folders.append(f"  ✓ {folder.name} → {class_key} (idx={cls_idx}): {count} images")
+
+         # Log results
+         logger.info(f"\n{'='*60}")
+         logger.info(f"Dataset loading from: {self.root}")
+         logger.info(f"{'='*60}")
+         for line in matched_folders:
+             logger.info(line)
+         for uf in unmatched_folders:
+             logger.warning(f"  ✗ SKIPPED: '{uf}' (not in folder_to_class)")
+         logger.info("\nClass distribution:")
+         for idx in sorted(class_counts.keys()):
+             name = [k for k, v in self.class_to_idx.items() if v == idx][0]
+             logger.info(f"  Class {idx} ({name}): {class_counts[idx]} images")
+         logger.info(f"Total: {len(self.samples)} images in {self.num_classes} classes")
+
+         if len(self.samples) == 0:
+             logger.error("NO SAMPLES FOUND! Check that data_root points to the folder containing your class subfolders.")
+             raise ValueError(f"No images found in {self.root}. Check folder names match label_mapping.json folder_to_class keys.")
+
+         # Stratified split
+         by_class = defaultdict(list)
+         for i, (_, cls) in enumerate(self.samples):
+             by_class[cls].append(i)
+
+         train_idx, val_idx, test_idx = [], [], []
+         for cls in sorted(by_class.keys()):
+             indices = by_class[cls]
+             g = torch.Generator().manual_seed(seed)
+             perm = torch.randperm(len(indices), generator=g).tolist()
+             n_cls = len(indices)
+             n_tr = int(n_cls * train_ratio)
+             n_va = int(n_cls * val_ratio)
+             train_idx.extend([indices[p] for p in perm[:n_tr]])
+             val_idx.extend([indices[p] for p in perm[n_tr:n_tr + n_va]])
+             test_idx.extend([indices[p] for p in perm[n_tr + n_va:]])
+
+         if split == "train":
+             self.indices = train_idx
+         elif split == "val":
+             self.indices = val_idx
+         else:
+             self.indices = test_idx
+
+         logger.info(f"Split '{split}': {len(self.indices)} samples\n")
+
+     def __len__(self):
+         return len(self.indices)
+
+     def __getitem__(self, idx):
+         i = self.indices[idx]
+         path, cls = self.samples[i]
+         img = Image.open(path).convert("RGB")
+         if self.transform:
+             img = self.transform(img)
+         return img, cls
+
+     def get_issue_type(self, class_idx: int) -> str:
+         return self.idx_to_issue_type[class_idx]
+
+     def get_severity(self, class_idx: int) -> str:
+         return self.idx_to_severity[class_idx]
+
+
+ def get_image_class_weights(label_mapping_path: Path, root: Path) -> torch.Tensor:
+     """Compute inverse frequency weights for class-weighted loss."""
+     with open(label_mapping_path) as f:
+         lm = json.load(f)
+     folder_to_class = lm["image"]["folder_to_class"]
+     class_to_idx = lm["image"]["class_to_idx"]
+     num_classes = len(class_to_idx)
+     counts = [0] * num_classes
+
+     root = Path(root)
+     for folder in root.iterdir():
+         if not folder.is_dir():
+             continue
+         class_key = folder_to_class.get(folder.name)
+         if class_key is None:
+             continue
+         cls_idx = class_to_idx[class_key]
+         n = sum(1 for f in folder.iterdir() if f.suffix.lower() in (".jpg", ".jpeg", ".png"))
+         counts[cls_idx] += n
+
+     total = sum(counts)
+     if total == 0:
+         return torch.ones(num_classes)
+     weights = [total / (num_classes * c) if c else 1.0 for c in counts]
+     return torch.tensor(weights, dtype=torch.float32)
src/fusion/fusion_logic.py ADDED
@@ -0,0 +1,129 @@
+ """
+ Decision-level fusion for Electrical Outlets. No early fusion.
+ Rules: final_severity = max(image_severity, audio_severity); result = issue_detected | normal | uncertain.
+ """
+ from typing import Optional, Dict, Any
+ from dataclasses import dataclass
+
+
+ @dataclass
+ class ModalityOutput:
+     result: str  # issue_detected | normal | uncertain
+     issue_type: Optional[str] = None
+     severity: str = "low"
+     confidence: float = 0.0
+
+
+ def _severity_rank(s: str, order: list) -> int:
+     try:
+         return order.index(s)
+     except ValueError:
+         return 0
+
+
+ def fuse_modalities(
+     image_out: Optional[ModalityOutput],
+     audio_out: Optional[ModalityOutput],
+     confidence_issue_min: float = 0.6,
+     confidence_normal_min: float = 0.75,
+     uncertain_if_disagree: bool = True,
+     high_confidence_override: float = 0.92,
+     severity_order: Optional[list] = None,
+ ) -> Dict[str, Any]:
+     """
+     Fuse image and/or audio outputs into a single diagnostic result.
+     Prefer uncertain over normal when in doubt.
+     """
+     if severity_order is None:
+         severity_order = ["low", "medium", "high", "critical"]
+
+     modality_contributions = {}
+     outputs = []
+     if image_out is not None:
+         outputs.append(("image", image_out))
+         modality_contributions["image"] = {
+             "result": image_out.result,
+             "issue_type": image_out.issue_type,
+             "severity": image_out.severity,
+             "confidence": image_out.confidence,
+         }
+     if audio_out is not None:
+         outputs.append(("audio", audio_out))
+         modality_contributions["audio"] = {
+             "result": audio_out.result,
+             "issue_type": audio_out.issue_type,
+             "severity": audio_out.severity,
+             "confidence": audio_out.confidence,
+         }
+
+     if not outputs:
+         return {
+             "diagnostic_element": "electrical_outlets",
+             "result": "uncertain",
+             "issue_type": None,
+             "severity": "low",
+             "confidence": 0.0,
+             "modality_contributions": None,
+             "primary_issue": None,
+             "secondary_issue": None,
+         }
+
+     # Severity: max across modalities
+     max_severity_rank = -1
+     max_severity = "low"
+     for _, out in outputs:
+         r = _severity_rank(out.severity, severity_order)
+         if r > max_severity_rank:
+             max_severity_rank = r
+             max_severity = out.severity
+
+     # Result and issue_type
+     primary_issue = None
+     secondary_issue = None
+     has_issue = any(o.result == "issue_detected" for _, o in outputs)
+     all_normal = all(o.result == "normal" for _, o in outputs)
+     max_conf = max(o.confidence for _, o in outputs)
+     disagree = len(outputs) == 2 and (
+         (outputs[0][1].result == "issue_detected" and outputs[1][1].result == "normal")
+         or (outputs[0][1].result == "normal" and outputs[1][1].result == "issue_detected")
+     )
+
+     if has_issue and max_conf >= confidence_issue_min:
+         if disagree and uncertain_if_disagree:
+             override = any(o.confidence >= high_confidence_override and o.result == "issue_detected" for _, o in outputs)
+             if override:
+                 result = "issue_detected"
+                 issue_type = next(o.issue_type for _, o in outputs if o.result == "issue_detected" and o.confidence >= high_confidence_override)
+                 primary_issue = issue_type
+             else:
+                 result = "uncertain"
+                 issue_type = None
+         else:
+             result = "issue_detected"
+             defect_outs = [(n, o) for n, o in outputs if o.result == "issue_detected"]
+             if len(defect_outs) >= 2:
+                 defect_outs.sort(key=lambda x: _severity_rank(x[1].severity, severity_order), reverse=True)
+                 issue_type = defect_outs[0][1].issue_type
+                 primary_issue = defect_outs[0][1].issue_type
+                 secondary_issue = defect_outs[1][1].issue_type if defect_outs[0][1].issue_type != defect_outs[1][1].issue_type else None
+             else:
+                 issue_type = defect_outs[0][1].issue_type
+                 primary_issue = issue_type
+     elif all_normal and all(o.confidence >= confidence_normal_min for _, o in outputs):
+         result = "normal"
+         issue_type = "normal"
+     else:
+         result = "uncertain"
+         issue_type = None
+
+     confidence = max_conf if result != "uncertain" else min(o.confidence for _, o in outputs)
+
+     return {
+         "diagnostic_element": "electrical_outlets",
+         "result": result,
+         "issue_type": issue_type,
+         "severity": max_severity,
+         "confidence": round(confidence, 4),
+         "modality_contributions": modality_contributions if len(modality_contributions) > 1 else None,
+         "primary_issue": primary_issue if result == "issue_detected" else None,
+         "secondary_issue": secondary_issue,
+     }
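The max-over-modalities severity rule can be shown in isolation. An illustrative standalone helper that mirrors `_severity_rank`'s unknown-label-ranks-lowest behavior:

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

def max_severity(severities, order=SEVERITY_ORDER):
    # highest-ranked severity wins; strings outside the order rank lowest
    def rank(s):
        return order.index(s) if s in order else 0
    return max(severities, key=rank)
```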
src/inference/wrapper.py ADDED
@@ -0,0 +1,144 @@
+ """
+ Inference wrapper: load image + audio models, run modalities present, apply fusion, return schema.
+ """
+ from pathlib import Path
+ from typing import Optional, Dict, Any, BinaryIO
+ import torch
+ import torchaudio
+ from torchvision import transforms
+ from PIL import Image
+
+ # Make the repo root importable so src.* modules resolve
+ import sys
+ ROOT = Path(__file__).resolve().parent.parent.parent
+ sys.path.insert(0, str(ROOT))
+
+
+ def _load_image_model(weights_path: Path, label_mapping_path: Path, device: str):
+     from src.models.image_model import ElectricalOutletsImageModel
+     ckpt = torch.load(weights_path, map_location=device)
+     model = ElectricalOutletsImageModel(
+         num_classes=ckpt["num_classes"],
+         label_mapping_path=label_mapping_path,
+         pretrained=False,
+     )
+     model.load_state_dict(ckpt["model_state_dict"])
+     model.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+     model.idx_to_severity = ckpt.get("idx_to_severity")
+     model.eval()
+     return model.to(device), ckpt.get("temperature", 1.0)
+
+
+ def _load_audio_model(weights_path: Path, label_mapping_path: Path, device: str, config: dict):
+     from src.models.audio_model import ElectricalOutletsAudioModel
+     ckpt = torch.load(weights_path, map_location=device)
+     model = ElectricalOutletsAudioModel(
+         num_classes=ckpt["num_classes"],
+         label_mapping_path=label_mapping_path,
+         n_mels=config.get("n_mels", 64),
+         time_steps=config.get("time_steps", 128),
+     )
+     model.load_state_dict(ckpt["model_state_dict"])
+     model.idx_to_label = ckpt.get("idx_to_label")
+     model.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+     model.idx_to_severity = ckpt.get("idx_to_severity")
+     model.eval()
+     return model.to(device), ckpt.get("temperature", 1.0)
+
+
+ def run_electrical_outlets_inference(
+     image_path: Optional[Path] = None,
+     image_fp: Optional[BinaryIO] = None,
+     audio_path: Optional[Path] = None,
+     audio_fp: Optional[BinaryIO] = None,
+     weights_dir: Optional[Path] = None,
+     config_dir: Optional[Path] = None,
+     device: Optional[str] = None,
+ ) -> Dict[str, Any]:
+     """
+     Run image and/or audio model, then fuse. Returns canonical schema dict.
+     """
+     if weights_dir is None:
+         weights_dir = ROOT / "weights"
+     if config_dir is None:
+         config_dir = ROOT / "config"
+     if device is None:
+         device = "cuda" if torch.cuda.is_available() else "cpu"
+
+     label_mapping_path = config_dir / "label_mapping.json"
+     thresholds_path = config_dir / "thresholds.yaml"
+     import yaml
+     with open(thresholds_path) as f:
+         thresholds = yaml.safe_load(f)
+
+     image_out = None
+     if image_path or image_fp:
+         img = Image.open(image_path or image_fp).convert("RGB")
+         tf = transforms.Compose([
+             transforms.Resize(256),
+             transforms.CenterCrop(224),
+             transforms.ToTensor(),
+             transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+         ])
+         x = tf(img).unsqueeze(0).to(device)
+         model, T = _load_image_model(weights_dir / "electrical_outlets_image_best.pt", label_mapping_path, device)
+         with torch.no_grad():
+             logits = model(x) / T
+         from src.fusion.fusion_logic import ModalityOutput
+         pred = model.predict_to_schema(logits)
+         image_out = ModalityOutput(
+             result=pred["result"],
+             issue_type=pred.get("issue_type"),
+             severity=pred["severity"],
+             confidence=pred["confidence"],
+         )
+
+     audio_out = None
+     if (audio_path or audio_fp) and (weights_dir / "electrical_outlets_audio_best.pt").exists():
+         if audio_path:
+             waveform, sr = torchaudio.load(str(audio_path))
+         else:
+             import io
+             waveform, sr = torchaudio.load(io.BytesIO(audio_fp.read()))
+         if sr != 16000:
+             waveform = torchaudio.functional.resample(waveform, sr, 16000)
+         if waveform.shape[0] > 1:
+             waveform = waveform.mean(dim=0, keepdim=True)
+         target_len = int(5.0 * 16000)
+         if waveform.shape[1] >= target_len:
+             start = (waveform.shape[1] - target_len) // 2
+             waveform = waveform[:, start : start + target_len]
+         else:
+             waveform = torch.nn.functional.pad(waveform, (0, target_len - waveform.shape[1]))
+         mel = torchaudio.transforms.MelSpectrogram(
+             sample_rate=16000, n_fft=512, hop_length=256, win_length=512, n_mels=64,
+         )(waveform)
+         log_mel = torch.log(mel.clamp(min=1e-5)).unsqueeze(0).to(device)
+         model, T = _load_audio_model(
+             weights_dir / "electrical_outlets_audio_best.pt",
+             label_mapping_path,
+             device,
+             {"n_mels": 64, "time_steps": 128},
+         )
+         with torch.no_grad():
+             logits = model(log_mel) / T
+         from src.fusion.fusion_logic import ModalityOutput
+         pred = model.predict_to_schema(logits)
+         audio_out = ModalityOutput(
+             result=pred["result"],
+             issue_type=pred.get("issue_type"),
+             severity=pred["severity"],
+             confidence=pred["confidence"],
+         )
+
+     from src.fusion.fusion_logic import fuse_modalities
+     return fuse_modalities(
+         image_out,
+         audio_out,
+         confidence_issue_min=thresholds.get("confidence_issue_min", 0.6),
+         confidence_normal_min=thresholds.get("confidence_normal_min", 0.75),
+         uncertain_if_disagree=thresholds.get("uncertain_if_disagree", True),
+         high_confidence_override=thresholds.get("high_confidence_override", 0.92),
+         severity_order=thresholds.get("severity_order"),
+     )
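Both loaders return a calibration temperature `T` that divides the logits before softmax. A standalone sketch (helper name is illustrative) of why `T > 1` lowers the reported confidence:

```python
import math

def softmax_conf(logits, T=1.0):
    # max softmax probability after temperature scaling (logits / T)
    scaled = [l / T for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]   # numerically stable softmax
    return max(exps) / sum(exps)
```

Dividing by `T > 1` flattens the distribution, so an over-confident model reports lower, better-calibrated probabilities; the test script additionally clamps degenerate stored values of `T` back to 1.0.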
src/models/audio_model.py ADDED
@@ -0,0 +1,87 @@
+ """
+ Audio classifier for Electrical Outlets. Expects spectrogram or waveform; outputs class logits.
+ Severity from label_mapping. Small CNN for 100-sample regime.
+ """
+ from pathlib import Path
+ from typing import Dict, Any, Optional
+ import json
+ import torch
+ import torch.nn as nn
+
+
+ class SpectrogramCNN(nn.Module):
+     """Lightweight CNN on mel spectrogram (n_mels x time)."""
+
+     def __init__(self, n_mels: int = 64, time_steps: int = 128, num_classes: int = 4):
+         super().__init__()
+         self.conv = nn.Sequential(
+             nn.Conv2d(1, 32, 3, padding=1),
+             nn.BatchNorm2d(32),
+             nn.ReLU(),
+             nn.MaxPool2d(2),
+             nn.Conv2d(32, 64, 3, padding=1),
+             nn.BatchNorm2d(64),
+             nn.ReLU(),
+             nn.MaxPool2d(2),
+             nn.Conv2d(64, 128, 3, padding=1),
+             nn.BatchNorm2d(128),
+             nn.ReLU(),
+             nn.AdaptiveAvgPool2d(1),
+         )
+         self.fc = nn.Linear(128, num_classes)
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         if x.dim() == 2:
+             x = x.unsqueeze(0).unsqueeze(0)
+         elif x.dim() == 3:
+             x = x.unsqueeze(1)
+         x = self.conv(x)
+         x = x.flatten(1)
+         return self.fc(x)
+
+
+ class ElectricalOutletsAudioModel(nn.Module):
+     """Wrapper: optional mel transform then SpectrogramCNN. Severity from mapping."""
+
+     def __init__(
+         self,
+         num_classes: int = 4,
+         label_mapping_path: Optional[Path] = None,
+         n_mels: int = 64,
+         time_steps: int = 128,
+     ):
+         super().__init__()
+         self.num_classes = num_classes
+         self.n_mels = n_mels
+         self.time_steps = time_steps
+         self.backbone = SpectrogramCNN(n_mels=n_mels, time_steps=time_steps, num_classes=num_classes)
+         self.idx_to_label = None
+         self.idx_to_issue_type = None
+         self.idx_to_severity = None
+         if label_mapping_path and Path(label_mapping_path).exists():
+             with open(label_mapping_path) as f:
+                 lm = json.load(f)
+             self.idx_to_label = lm["audio"]["idx_to_label"]
+             self.idx_to_issue_type = [lm["audio"]["label_to_issue_type"].get(lbl, "normal") for lbl in lm["audio"]["idx_to_label"]]
+             self.idx_to_severity = [lm["audio"]["label_to_severity"].get(lm["audio"]["idx_to_label"][i], "medium") for i in range(num_classes)]
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         return self.backbone(x)
+
+     def predict_to_schema(self, logits: torch.Tensor) -> Dict[str, Any]:
+         probs = torch.softmax(logits, dim=-1)
+         if logits.dim() == 1:
+             probs = probs.unsqueeze(0)
+         conf, pred = probs.max(dim=-1)
+         pred = pred.item() if pred.numel() == 1 else pred
+         conf = conf.item() if conf.numel() == 1 else conf
+         issue_type = (self.idx_to_issue_type or ["normal"] * self.num_classes)[pred]
+         severity = (self.idx_to_severity or ["medium"] * self.num_classes)[pred]
+         result = "normal" if issue_type == "normal" else "issue_detected"
+         return {
+             "result": result,
+             "issue_type": issue_type,
+             "severity": severity,
+             "confidence": float(conf),
+             "class_idx": int(pred),
+         }
src/models/image_model.py ADDED
@@ -0,0 +1,67 @@
+ """
+ Image classifier for Electrical Outlets. EfficientNet-B0 backbone + MLP head.
+ FINAL v5: 5 classes (no GFCI).
+ """
+ from pathlib import Path
+ from typing import Dict, Any, Optional
+ import json
+ import torch
+ import torch.nn as nn
+ from torchvision import models
+
+
+ class ElectricalOutletsImageModel(nn.Module):
+
+     def __init__(
+         self,
+         num_classes: int = 5,
+         label_mapping_path: Optional[Path] = None,
+         pretrained: bool = True,
+         head_hidden: int = 256,
+         head_dropout: float = 0.4,
+     ):
+         super().__init__()
+         self.num_classes = num_classes
+         self.backbone = models.efficientnet_b0(
+             weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1 if pretrained else None
+         )
+         in_features = self.backbone.classifier[1].in_features  # 1280
+         self.backbone.classifier = nn.Identity()
+
+         self.head = nn.Sequential(
+             nn.Dropout(head_dropout),
+             nn.Linear(in_features, head_hidden),
+             nn.ReLU(),
+             nn.Dropout(head_dropout * 0.5),
+             nn.Linear(head_hidden, num_classes),
+         )
+
+         self.idx_to_issue_type = None
+         self.idx_to_severity = None
+         if label_mapping_path and Path(label_mapping_path).exists():
+             with open(label_mapping_path) as f:
+                 lm = json.load(f)
+             self.idx_to_issue_type = lm["image"]["idx_to_issue_type"]
+             self.idx_to_severity = lm["image"]["idx_to_severity"]
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         features = self.backbone(x)
+         return self.head(features)
+
+     def predict_to_schema(self, logits: torch.Tensor) -> Dict[str, Any]:
+         probs = torch.softmax(logits, dim=-1)
+         if logits.dim() == 1:
+             probs = probs.unsqueeze(0)
+         conf, pred = probs.max(dim=-1)
+         pred = pred.item() if pred.numel() == 1 else pred
+         conf = conf.item() if conf.numel() == 1 else conf
+         issue_type = (self.idx_to_issue_type or ["unknown"] * self.num_classes)[pred]
+         severity = (self.idx_to_severity or ["medium"] * self.num_classes)[pred]
+         result = "normal" if issue_type == "normal" else "issue_detected"
+         return {
+             "result": result,
+             "issue_type": issue_type,
+             "severity": severity,
+             "confidence": float(conf),
+             "class_idx": int(pred),
+         }
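The `predict_to_schema` mapping can be sketched without torch. An illustrative pure-Python version of the same softmax, argmax, and schema-lookup step:

```python
import math

def predict_to_schema(logits, idx_to_issue_type, idx_to_severity):
    # Pure-Python sketch of the models' predict_to_schema (no torch);
    # argument names mirror the modules above, values are illustrative.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    conf = max(probs)
    pred = probs.index(conf)
    issue_type = idx_to_issue_type[pred]
    return {
        "result": "normal" if issue_type == "normal" else "issue_detected",
        "issue_type": issue_type,
        "severity": idx_to_severity[pred],
        "confidence": round(conf, 4),
        "class_idx": pred,
    }
```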
test.py ADDED
@@ -0,0 +1,388 @@
+ """
+ Test script for Electrical Outlets diagnostic pipeline.
+
+ Usage:
+     python test.py --image path/to/outlet.jpg               # Test image only
+     python test.py --audio path/to/recording.wav            # Test audio only
+     python test.py --image photo.jpg --audio recording.wav  # Test both (fusion)
+     python test.py --list                                   # List sample images from dataset
+     python test.py --eval                                   # Run full validation set evaluation
+
+ Requirements:
+     pip install torch torchvision torchaudio Pillow PyYAML soundfile
+ """
+ from pathlib import Path
+ import sys
+ import argparse
+ import json
+ from collections import defaultdict
+
+ import torch
+ from torchvision import transforms
+ from PIL import Image
+
+ ROOT = Path(__file__).resolve().parent
+ sys.path.insert(0, str(ROOT))
+
+
+ def load_image_model(weights_path, mapping_path, device):
+     from src.models.image_model import ElectricalOutletsImageModel
+
+     ckpt = torch.load(weights_path, map_location=device, weights_only=False)
+     # Infer head_hidden from saved weights (head.1 is the first Linear)
+     head_hidden = ckpt["model_state_dict"]["head.1.weight"].shape[0]
+     model = ElectricalOutletsImageModel(
+         num_classes=ckpt["num_classes"],
+         label_mapping_path=Path(mapping_path),
+         pretrained=False,
+         head_hidden=head_hidden,
+     )
+     model.load_state_dict(ckpt["model_state_dict"])
+     model.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+     model.idx_to_severity = ckpt.get("idx_to_severity")
+     model.eval().to(device)
+     T = ckpt.get("temperature", 1.0)
+     # Clamp bad temperature values
+     if T <= 0 or T > 10:
+         T = 1.0
+     return model, T
+
+
+ def load_audio_model(weights_path, mapping_path, device):
+     from src.models.audio_model import ElectricalOutletsAudioModel
+     import yaml
+
+     ckpt = torch.load(weights_path, map_location=device, weights_only=False)
+
+     # Load audio config for n_mels
+     audio_cfg_path = ROOT / "config" / "audio_train_config.yaml"
+     n_mels, time_steps = 128, 128
+     if audio_cfg_path.exists():
+         with open(audio_cfg_path) as f:
+             acfg = yaml.safe_load(f)
+         n_mels = acfg.get("model", {}).get("n_mels", 128)
+         time_steps = acfg.get("model", {}).get("time_steps", 128)
+
+     model = ElectricalOutletsAudioModel(
+         num_classes=ckpt["num_classes"],
+         label_mapping_path=Path(mapping_path),
+         n_mels=n_mels,
+         time_steps=time_steps,
+     )
+     model.load_state_dict(ckpt["model_state_dict"])
+     model.idx_to_label = ckpt.get("idx_to_label")
+     model.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+     model.idx_to_severity = ckpt.get("idx_to_severity")
+     model.eval().to(device)
+     T = ckpt.get("temperature", 1.0)
+     if T <= 0 or T > 10:
+         T = 1.0
+     return model, T
+
+
+ def predict_image(image_path, device="cuda"):
+     weights = ROOT / "weights" / "electrical_outlets_image_best.pt"
+     mapping = ROOT / "config" / "label_mapping.json"
+
+     if not weights.exists():
+         print(f"ERROR: Image weights not found at {weights}")
+         return None
+
+     model, T = load_image_model(weights, mapping, device)
+
+     tf = transforms.Compose([
+         transforms.Resize(256),
+         transforms.CenterCrop(224),
+         transforms.ToTensor(),
+         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+     ])
+     img = Image.open(image_path).convert("RGB")
+     x = tf(img).unsqueeze(0).to(device)
+
+     with torch.no_grad():
+         logits = model(x) / T
+         probs = torch.softmax(logits, dim=-1)
+
+     pred = model.predict_to_schema(logits)
+
+     print(f"\n{'='*55}")
+     print(f"  IMAGE: {Path(image_path).name}")
+     print(f"{'='*55}")
+     print(f"  Prediction: {pred['issue_type']}")
+     print(f"  Severity:   {pred['severity']}")
+     print(f"  Confidence: {pred['confidence']:.1%}")
+     print(f"  Result:     {pred['result']}")
+     print(f"\n  Class probabilities:")
+     for i, p in enumerate(probs[0].tolist()):
+         name = model.idx_to_issue_type[i] if model.idx_to_issue_type else f"class_{i}"
+         bar = "█" * int(p * 30)
+         tag = "  ◄" if i == pred["class_idx"] else ""
+         print(f"    {name:20s} {p:6.1%} {bar}{tag}")
+
+     return pred
+
+
+ def predict_audio(audio_path, device="cuda"):
+     import torchaudio
+     import yaml
+
+     weights = ROOT / "weights" / "electrical_outlets_audio_best.pt"
+     mapping = ROOT / "config" / "label_mapping.json"
+
+     if not weights.exists():
+         print(f"ERROR: Audio weights not found at {weights}")
+         return None
+
+     model, T = load_audio_model(weights, mapping, device)
+
+     # Load audio config
+     audio_cfg_path = ROOT / "config" / "audio_train_config.yaml"
+     sample_rate, n_mels, n_fft, hop, win = 22050, 128, 1024, 512, 1024
+     target_sec = 5.0
+     if audio_cfg_path.exists():
+         with open(audio_cfg_path) as f:
+             acfg = yaml.safe_load(f)
+         sample_rate = acfg["data"].get("sample_rate", 22050)
+         target_sec = acfg["data"].get("target_length_sec", 5.0)
+         sc = acfg.get("spectrogram", {})
+         n_mels = sc.get("n_mels", 128)
+         n_fft = sc.get("n_fft", 1024)
+         hop = sc.get("hop_length", 512)
+         win = sc.get("win_length", 1024)
+
+     waveform, sr = torchaudio.load(str(audio_path))
+     if sr != sample_rate:
+         waveform = torchaudio.functional.resample(waveform, sr, sample_rate)
+     if waveform.shape[0] > 1:
+         waveform = waveform.mean(dim=0, keepdim=True)
+
+     target_len = int(target_sec * sample_rate)
+     if waveform.shape[1] >= target_len:
+         start = (waveform.shape[1] - target_len) // 2
+         waveform = waveform[:, start:start + target_len]
+     else:
+         waveform = torch.nn.functional.pad(waveform, (0, target_len - waveform.shape[1]))
+
+     mel = torchaudio.transforms.MelSpectrogram(
+         sample_rate=sample_rate, n_fft=n_fft, hop_length=hop,
+         win_length=win, n_mels=n_mels,
+     )(waveform)
+     log_mel = torch.log(mel.clamp(min=1e-5)).unsqueeze(0).to(device)
+
+     with torch.no_grad():
+         logits = model(log_mel) / T
+         probs = torch.softmax(logits, dim=-1)
+
+     pred = model.predict_to_schema(logits)
+
+     print(f"\n{'='*55}")
+     print(f"  AUDIO: {Path(audio_path).name}")
+     print(f"{'='*55}")
+     print(f"  Prediction: {pred['issue_type']}")
+     print(f"  Severity:   {pred['severity']}")
+     print(f"  Confidence: {pred['confidence']:.1%}")
+     print(f"  Result:     {pred['result']}")
+     print(f"\n  Class probabilities:")
+     labels = model.idx_to_label or [f"class_{i}" for i in range(model.num_classes)]
+     for i, p in enumerate(probs[0].tolist()):
+         bar = "█" * int(p * 30)
+         tag = "  ◄" if i == pred["class_idx"] else ""
+         print(f"    {labels[i]:20s} {p:6.1%} {bar}{tag}")
+
+     return pred
+
+
+ def run_fusion(image_pred, audio_pred):
+     from src.fusion.fusion_logic import fuse_modalities, ModalityOutput
+     import yaml
+
+     thresholds_path = ROOT / "config" / "thresholds.yaml"
+     thresholds = {}
+     if thresholds_path.exists():
+         with open(thresholds_path) as f:
+             thresholds = yaml.safe_load(f)
+
+     image_out = ModalityOutput(
+         result=image_pred["result"],
+         issue_type=image_pred.get("issue_type"),
+         severity=image_pred["severity"],
+         confidence=image_pred["confidence"],
+     ) if image_pred else None
+
+     audio_out = ModalityOutput(
+         result=audio_pred["result"],
+         issue_type=audio_pred.get("issue_type"),
+         severity=audio_pred["severity"],
+         confidence=audio_pred["confidence"],
+     ) if audio_pred else None
+
+     result = fuse_modalities(
+         image_out, audio_out,
+         confidence_issue_min=thresholds.get("confidence_issue_min", 0.6),
+         confidence_normal_min=thresholds.get("confidence_normal_min", 0.75),
+         uncertain_if_disagree=thresholds.get("uncertain_if_disagree", True),
+         high_confidence_override=thresholds.get("high_confidence_override", 0.92),
+         severity_order=thresholds.get("severity_order"),
+     )
+
+     print(f"\n{'='*55}")
+     print(f"  FUSED RESULT")
+     print(f"{'='*55}")
+     print(f"  Result:     {result['result']}")
+     print(f"  Issue:      {result['issue_type']}")
+     print(f"  Severity:   {result['severity']}")
+     print(f"  Confidence: {result['confidence']:.1%}")
+     if result.get("primary_issue"):
+         print(f"  Primary:    {result['primary_issue']}")
+     if result.get("secondary_issue"):
+         print(f"  Secondary:  {result['secondary_issue']}")
+
+     return result
+
+
+ def list_samples():
+     mapping_path = ROOT / "config" / "label_mapping.json"
+     with open(mapping_path) as f:
+         lm = json.load(f)
+
+     data_root = ROOT / "ELECTRICAL OUTLETS-20260106T153508Z-3-001"
+     if not data_root.exists():
+         print(f"Dataset not found at {data_root}")
+         return
+
+     print(f"\nDataset: {data_root}")
+     print(f"{'='*60}")
+     for folder in sorted(data_root.iterdir()):
+         if not folder.is_dir():
+             continue
+         cls = lm["image"]["folder_to_class"].get(folder.name, "UNMAPPED")
+         imgs = list(folder.glob("*.jpg")) + list(folder.glob("*.jpeg")) + list(folder.glob("*.png"))
+         print(f"\n  {folder.name}")
+         print(f"    → class: {cls} | {len(imgs)} images")
+         for img in imgs[:3]:
+             print(f"      {img}")
+
+     # Audio
+     audio_root = ROOT / "electrical_outlets_sounds_100"
+     if audio_root.exists():
+         print(f"\n\nAudio: {audio_root}")
+         print(f"{'='*60}")
+         for folder in sorted(audio_root.iterdir()):
+             if folder.is_dir():
+                 wavs = list(folder.glob("*.wav"))
+                 print(f"  {folder.name}: {len(wavs)} files")
+                 for w in wavs[:2]:
+                     print(f"    {w}")
+
+
+ def run_eval(device="cuda"):
+     """Run full evaluation on validation split."""
+     weights = ROOT / "weights" / "electrical_outlets_image_best.pt"
+     mapping = ROOT / "config" / "label_mapping.json"
+
+     if not weights.exists():
+         print("No image weights found.")
+         return
+
+     model, T = load_image_model(weights, mapping, device)
+
+     import yaml
+     cfg_path = ROOT / "config" / "image_train_config.yaml"
+     with open(cfg_path) as f:
+         cfg = yaml.safe_load(f)
+
+     from src.data.image_dataset import ElectricalOutletsImageDataset
+     val_tf = transforms.Compose([
+         transforms.Resize(256),
+         transforms.CenterCrop(224),
+         transforms.ToTensor(),
+         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+     ])
+     data_root = ROOT / cfg["data"]["root"]
+     val_ds = ElectricalOutletsImageDataset(
+         data_root, mapping, split="val",
+         train_ratio=cfg["data"]["train_ratio"],
+         val_ratio=cfg["data"]["val_ratio"],
+         seed=cfg["data"].get("seed", 42),
+         transform=val_tf,
+     )
+
+     with open(mapping) as f:
+         lm = json.load(f)
+     issue_names = lm["image"]["idx_to_issue_type"]
+
+     correct = 0
+     total = 0
+     class_correct = defaultdict(int)
+     class_total = defaultdict(int)
+     confusion = defaultdict(lambda: defaultdict(int))
+
+     model.eval()
+     with torch.no_grad():
+         for i in range(len(val_ds)):
+             x, y = val_ds[i]
+             logits = model(x.unsqueeze(0).to(device)) / T
+             pred = logits.argmax(1).item()
+             correct += (pred == y)
+             total += 1
+             class_correct[y] += (pred == y)
+             class_total[y] += 1
+             confusion[y][pred] += 1
+
+     print(f"\n{'='*55}")
+     print(f"  VALIDATION RESULTS ({total} samples)")
+     print(f"{'='*55}")
+     print(f"  Overall accuracy: {correct/total:.1%}")
+     print(f"\n  Per-class recall:")
+     for c in sorted(class_total.keys()):
+         name = issue_names[c] if c < len(issue_names) else f"class_{c}"
+         recall = class_correct[c] / class_total[c] if class_total[c] > 0 else 0
+         bar = "█" * int(recall * 20)
+         print(f"    {name:20s} {recall:6.1%} ({class_correct[c]}/{class_total[c]}) {bar}")
+
+     print(f"\n  Confusion matrix:")
+     classes = sorted(class_total.keys())
+     header = "  Actual \\ Pred " + "".join(f"{issue_names[c][:8]:>9s}" for c in classes)
+     print(header)
+     for actual in classes:
+         row = f"  {issue_names[actual][:14]:14s}"
+         for pred_c in classes:
+             count = confusion[actual][pred_c]
+             row += f" {count:6d}" if count > 0 else f" {'·':>6s}"
+         row += " "
+         print(row)
+
+
+ if __name__ == "__main__":
+     parser = argparse.ArgumentParser(description="Test Electrical Outlets Diagnostic Pipeline")
+     parser.add_argument("--image", type=str, help="Path to image file")
+     parser.add_argument("--audio", type=str, help="Path to audio WAV file")
+     parser.add_argument("--list", action="store_true", help="List sample files from dataset")
+     parser.add_argument("--eval", action="store_true", help="Run full validation evaluation")
+     parser.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu")
+     args = parser.parse_args()
+
+     if args.list:
+         list_samples()
+     elif args.eval:
+         run_eval(args.device)
+     elif args.image or args.audio:
+         img_pred = predict_image(args.image, args.device) if args.image else None
+         audio_pred = predict_audio(args.audio, args.device) if args.audio else None
+         if img_pred and audio_pred:
+             run_fusion(img_pred, audio_pred)
+         print()
+     else:
+         print("Electrical Outlets Diagnostic Pipeline — Test Script")
+         print("=" * 55)
+         print()
+         print("Usage:")
+         print("  python test.py --image path/to/photo.jpg")
+         print("  python test.py --audio path/to/recording.wav")
+         print("  python test.py --image photo.jpg --audio recording.wav")
+         print("  python test.py --list")
+         print("  python test.py --eval")
+         print()
+         print("Examples:")
+         print('  python test.py --image "ELECTRICAL OUTLETS-20260106T153508Z-3-001\\Burn marks - overheating 250\\img_001.jpg"')
+         print('  python test.py --audio "electrical_outlets_sounds_100\\buzzing_outlet\\buzzing_outlet_060.wav"')
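`predict_audio` normalizes every clip to a fixed duration before computing the mel spectrogram: center-crop when the recording is longer than `target_length_sec`, zero-pad at the end when it is shorter. A minimal pure-Python sketch of that crop/pad rule (operating on a plain list instead of a torch tensor, for illustration only):

```python
def fix_length(samples, target_len):
    """Center-crop if too long, zero-pad at the end if too short."""
    if len(samples) >= target_len:
        # Take a centered window of exactly target_len samples
        start = (len(samples) - target_len) // 2
        return samples[start:start + target_len]
    # Append zeros, matching F.pad(waveform, (0, deficit))
    return samples + [0.0] * (target_len - len(samples))

fix_length([1, 2, 3, 4, 5], 3)  # → [2, 3, 4]
fix_length([1, 2], 4)           # → [1, 2, 0.0, 0.0]
```

In the script itself `target_len` is `int(target_length_sec * sample_rate)`, so a 5 s target at 22050 Hz yields 110250 samples per clip.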
test_single_image.py ADDED
@@ -0,0 +1,90 @@
+ """
+ Quick test: classify a single image.
+     python test_single_image.py --image "path/to/image.jpg"
+     python test_single_image.py --list
+ """
+ from pathlib import Path
+ import sys
+ import argparse
+ import json
+ import torch
+ from torchvision import transforms
+ from PIL import Image
+
+ ROOT = Path(__file__).resolve().parent
+ sys.path.insert(0, str(ROOT))
+ from src.models.image_model import ElectricalOutletsImageModel
+
+
+ def predict(image_path, weights="weights/electrical_outlets_image_best.pt",
+             mapping="config/label_mapping.json", device=None):
+     if device is None:
+         device = "cuda" if torch.cuda.is_available() else "cpu"
+
+     ckpt = torch.load(weights, map_location=device, weights_only=False)
+     head_hidden = ckpt["model_state_dict"]["head.1.weight"].shape[0]
+     model = ElectricalOutletsImageModel(
+         num_classes=ckpt["num_classes"],
+         label_mapping_path=Path(mapping),
+         pretrained=False,
+         head_hidden=head_hidden,
+     )
+     model.load_state_dict(ckpt["model_state_dict"])
+     model.idx_to_issue_type = ckpt.get("idx_to_issue_type")
+     model.idx_to_severity = ckpt.get("idx_to_severity")
+     model.eval().to(device)
+     T = ckpt.get("temperature", 1.0)
+     if T <= 0 or T > 10:
+         T = 1.0
+
+     tf = transforms.Compose([
+         transforms.Resize(256), transforms.CenterCrop(224),
+         transforms.ToTensor(),
+         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+     ])
+     img = Image.open(image_path).convert("RGB")
+     x = tf(img).unsqueeze(0).to(device)
+
+     with torch.no_grad():
+         logits = model(x) / T
+         probs = torch.softmax(logits, dim=-1)
+     pred = model.predict_to_schema(logits)
+
+     print(f"\n{'='*50}")
+     print(f"  {Path(image_path).name}")
+     print(f"{'='*50}")
+     print(f"  -> {pred['issue_type']} ({pred['severity']} severity)")
+     print(f"  -> {pred['confidence']:.1%} confidence")
+     print(f"  -> {pred['result']}")
+     print()
+     for i, p in enumerate(probs[0].tolist()):
+         name = model.idx_to_issue_type[i]
+         bar = "█" * int(p * 30)
+         tag = "  ◄" if i == pred["class_idx"] else ""
+         print(f"    {name:20s} {p:6.1%} {bar}{tag}")
+     print()
+
+
+ if __name__ == "__main__":
+     p = argparse.ArgumentParser()
+     p.add_argument("--image", type=str)
+     p.add_argument("--list", action="store_true")
+     p.add_argument("--weights", default="weights/electrical_outlets_image_best.pt")
+     args = p.parse_args()
+
+     if args.list:
+         with open("config/label_mapping.json") as f:
+             lm = json.load(f)
+         root = Path("ELECTRICAL OUTLETS-20260106T153508Z-3-001")
+         for folder in sorted(root.iterdir()):
+             if folder.is_dir():
+                 imgs = list(folder.glob("*.jpg")) + list(folder.glob("*.jpeg")) + list(folder.glob("*.png"))
+                 cls = lm["image"]["folder_to_class"].get(folder.name, "UNMAPPED")
+                 print(f"\n{folder.name} -> {cls} ({len(imgs)} imgs)")
+                 for img in imgs[:2]:
+                     print(f"  {img}")
+     elif args.image:
+         predict(args.image, args.weights)
+     else:
+         print("python test_single_image.py --image path/to/img.jpg")
+         print("python test_single_image.py --list")
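Both test scripts divide the logits by the calibrated temperature `T` before the softmax. A quick pure-Python illustration of why (stdlib only, toy logits; not the checkpoint's actual values): `T > 1` flattens the probability distribution, so an over-confident model reports more honest confidences, while `T = 1` leaves it unchanged.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

logits = [2.0, 1.0, 0.1]
p_raw = softmax(logits)                      # uncalibrated probabilities
p_cal = softmax([z / 2.0 for z in logits])   # temperature T = 2 flattens them

assert max(p_cal) < max(p_raw)  # calibrated top-class confidence is lower
```

The argmax (and thus the predicted class) is unchanged by temperature scaling; only the reported confidence shifts, which is what the fusion thresholds consume.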
tests/test_fusion.py ADDED
@@ -0,0 +1,60 @@
+ """Unit tests for decision-level fusion."""
+ import sys
+ from pathlib import Path
+ sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+ from src.fusion.fusion_logic import fuse_modalities, ModalityOutput
+
+
+ def test_image_only_issue():
+     out = fuse_modalities(
+         image_out=ModalityOutput("issue_detected", "burn_overheating", "high", 0.9),
+         audio_out=None,
+     )
+     assert out["result"] == "issue_detected"
+     assert out["severity"] == "high"
+     assert out["issue_type"] == "burn_overheating"
+
+
+ def test_both_normal_high_conf():
+     out = fuse_modalities(
+         image_out=ModalityOutput("normal", "normal", "low", 0.85),
+         audio_out=ModalityOutput("normal", "normal", "low", 0.8),
+         confidence_normal_min=0.75,
+     )
+     assert out["result"] == "normal"
+     assert out["severity"] == "low"
+
+
+ def test_severity_max():
+     out = fuse_modalities(
+         image_out=ModalityOutput("issue_detected", "cracked_faceplate", "medium", 0.88),
+         audio_out=ModalityOutput("issue_detected", "arcing_pop", "critical", 0.85),
+     )
+     assert out["severity"] == "critical"
+     assert out["result"] == "issue_detected"
+
+
+ def test_uncertain_low_confidence():
+     out = fuse_modalities(
+         image_out=ModalityOutput("issue_detected", "buzzing", "high", 0.5),
+         audio_out=None,
+         confidence_issue_min=0.6,
+     )
+     assert out["result"] == "uncertain"
+
+
+ def test_uncertain_disagree():
+     out = fuse_modalities(
+         image_out=ModalityOutput("issue_detected", "burn_overheating", "high", 0.7),
+         audio_out=ModalityOutput("normal", "normal", "low", 0.7),
+         uncertain_if_disagree=True,
+         high_confidence_override=0.92,
+     )
+     assert out["result"] == "uncertain"
+
+
+ def test_no_input():
+     out = fuse_modalities(None, None)
+     assert out["result"] == "uncertain"
+     assert out["confidence"] == 0.0
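`test_severity_max` pins down the fusion rule that when both modalities detect an issue, the fused severity is the *worse* of the two. A minimal sketch of that rule, assuming the default ordering implied by the test (the actual ordering is configurable via `severity_order` in `thresholds.yaml`):

```python
# Assumed default ordering, least to most severe
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

def max_severity(a, b, order=SEVERITY_ORDER):
    # Compare severities by their position in the ordering list
    return max(a, b, key=order.index)

assert max_severity("medium", "critical") == "critical"
assert max_severity("high", "low") == "high"
```

This is why an image-side "medium" plus an audio-side "critical" fuses to "critical" in the test above: safety-critical signals must not be averaged away.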
training/train_audio.py ADDED
@@ -0,0 +1,202 @@
+ """
+ Train Electrical Outlets audio model. Spectrogram CNN, class weights, per-class recall, early stopping.
+ """
+ from pathlib import Path
+ import sys
+ import argparse
+ from typing import Dict
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+ from torch.utils.data import DataLoader
+
+ ROOT = Path(__file__).resolve().parent.parent
+ sys.path.insert(0, str(ROOT))
+
+ from src.data.audio_dataset import ElectricalOutletsAudioDataset
+ from src.models.audio_model import ElectricalOutletsAudioModel
+
+
+ def load_config(config_path: Path) -> dict:
+     import yaml
+     with open(config_path) as f:
+         return yaml.safe_load(f)
+
+
+ def _wave_to_mel(waveform: torch.Tensor, n_mels: int, n_fft: int, hop: int, win: int) -> torch.Tensor:
+     import torchaudio
+     mel = torchaudio.transforms.MelSpectrogram(
+         sample_rate=16000, n_fft=n_fft, hop_length=hop, win_length=win, n_mels=n_mels,
+     )(waveform)
+     log_mel = torch.log(mel.clamp(min=1e-5))
+     return log_mel
+
+
+ def per_class_recall(logits: torch.Tensor, targets: torch.Tensor, num_classes: int) -> Dict[int, float]:
+     preds = logits.argmax(dim=1)
+     recall = {}
+     for c in range(num_classes):
+         mask = targets == c
+         if mask.sum() == 0:
+             recall[c] = 0.0
+         else:
+             recall[c] = (preds[mask] == c).float().mean().item()
+     return recall
+
+
+ def run_training(
+     data_root: Path,
+     label_mapping_path: Path,
+     config: dict,
+     weights_dir: Path,
+     device: str = "cuda",
+ ):
+     train_ratio = config["data"]["train_ratio"]
+     val_ratio = config["data"]["val_ratio"]
+     seed = config["data"].get("seed", 42)
+     batch_size = config["data"]["batch_size"]
+     num_workers = config["data"].get("num_workers", 0)
+     spec_cfg = config.get("spectrogram", {})
+     n_mels = spec_cfg.get("n_mels", 64)
+     n_fft = spec_cfg.get("n_fft", 512)
+     hop = spec_cfg.get("hop_length", 256)
+     win = spec_cfg.get("win_length", 512)
+
+     def to_mel(x):
+         return _wave_to_mel(x, n_mels, n_fft, hop, win)
+
+     train_ds = ElectricalOutletsAudioDataset(
+         data_root, label_mapping_path, split="train",
+         train_ratio=train_ratio, val_ratio=val_ratio, seed=seed, transform=to_mel,
+         target_length_sec=config["data"].get("target_length_sec", 5.0),
+         sample_rate=config["data"].get("sample_rate", 16000),
+     )
+     val_ds = ElectricalOutletsAudioDataset(
+         data_root, label_mapping_path, split="val",
+         train_ratio=train_ratio, val_ratio=val_ratio, seed=seed, transform=to_mel,
+         target_length_sec=config["data"].get("target_length_sec", 5.0),
+         sample_rate=config["data"].get("sample_rate", 16000),
+     )
+     train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True, num_workers=num_workers)
+     val_loader = DataLoader(val_ds, batch_size=batch_size, shuffle=False, num_workers=num_workers)
+
+     num_classes = train_ds.num_classes
+     model = ElectricalOutletsAudioModel(
+         num_classes=num_classes,
+         label_mapping_path=label_mapping_path,
+         n_mels=config["model"].get("n_mels", 64),
+         time_steps=config["model"].get("time_steps", 128),
+     ).to(device)
+     opt = torch.optim.AdamW(
+         model.parameters(),
+         lr=config["training"]["lr"],
+         weight_decay=config["training"].get("weight_decay", 1e-4),
+     )
+     criterion = nn.CrossEntropyLoss()
+     epochs = config["training"]["epochs"]
+     patience = config["training"].get("early_stopping_patience", 12)
+     best_metric = -1.0
+     best_epoch = 0
+     wait = 0
+     recall = {}
+
+     for epoch in range(epochs):
+         model.train()
+         for x, y in train_loader:
+             x, y = x.to(device), y.to(device)
+             opt.zero_grad()
+             logits = model(x)
+             loss = criterion(logits, y)
+             loss.backward()
+             opt.step()
+
+         model.eval()
+         val_logits, val_targets = [], []
+         with torch.no_grad():
+             for x, y in val_loader:
+                 x = x.to(device)
+                 val_logits.append(model(x).cpu())
+                 val_targets.append(y)
+         val_logits = torch.cat(val_logits, dim=0)
+         val_targets = torch.cat(val_targets, dim=0)
+         recall = per_class_recall(val_logits, val_targets, num_classes)
+         min_recall = min(recall.values())
+         macro_recall = sum(recall.values()) / num_classes
+         metric = macro_recall
+         if metric > best_metric:
+             best_metric = metric
+             best_epoch = epoch
+             wait = 0
+             weights_dir.mkdir(parents=True, exist_ok=True)
+             torch.save({
+                 "model_state_dict": model.state_dict(),
+                 "num_classes": num_classes,
+                 "idx_to_label": model.idx_to_label,
+                 "idx_to_issue_type": model.idx_to_issue_type,
+                 "idx_to_severity": model.idx_to_severity,
+             }, weights_dir / config["output"]["best_name"])
+         else:
+             wait += 1
+         print(f"Epoch {epoch} min_recall={min_recall:.4f} macro_recall={macro_recall:.4f} best={best_metric:.4f}")
+         if wait >= patience:
+             print("Early stopping at epoch", epoch)
+             break
+
+     if config.get("calibration", {}).get("use_temperature_scaling", False):
+         model.load_state_dict(torch.load(weights_dir / config["output"]["best_name"], map_location=device)["model_state_dict"])
+         model.eval()
+         n_val = len(val_ds)
+         cal_size = max(1, int(n_val * config["calibration"].get("val_fraction_for_calibration", 0.5)))
+         cal_logits, cal_targets = [], []
+         for i in range(cal_size):
+             x, y = val_ds[i]
+             x = x.unsqueeze(0).to(device)
+             with torch.no_grad():
+                 cal_logits.append(model(x).cpu())
+             cal_targets.append(y)
+         cal_logits = torch.cat(cal_logits, dim=0)
+         cal_targets = torch.tensor(cal_targets)
+         temp = nn.Parameter(torch.ones(1) * 1.5)
+         opt_cal = torch.optim.LBFGS([temp], lr=0.01, max_iter=50)
+         def eval_cal():
+             opt_cal.zero_grad()
+             loss = F.cross_entropy(cal_logits / temp, cal_targets)
+             loss.backward()
+             return loss
+         opt_cal.step(eval_cal)
+         ckpt = torch.load(weights_dir / config["output"]["best_name"], map_location="cpu")
+         ckpt["temperature"] = temp.item()
+         torch.save(ckpt, weights_dir / config["output"]["best_name"])
+
+     return {"best_epoch": best_epoch, "best_metric": best_metric, "recall_per_class": recall}
+
+
+ def main():
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--config", default="config/audio_train_config.yaml")
+     parser.add_argument("--data_root", default=None)
+     parser.add_argument("--weights_dir", default="weights")
+     parser.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu")
+     args = parser.parse_args()
+     root = Path(__file__).resolve().parent.parent
+     config = load_config(root / args.config)
+     data_root = Path(args.data_root) if args.data_root else root / config["data"]["root"]
+     label_mapping_path = root / config["data"]["label_mapping"]
+     weights_dir = root / args.weights_dir
+     results = run_training(data_root, label_mapping_path, config, weights_dir, args.device)
+     report_path = root / "docs" / config["output"]["report_name"]
+     report_path.parent.mkdir(parents=True, exist_ok=True)
+     with open(report_path, "w") as f:
+         f.write("# Audio Model Report (Electrical Outlets)\n\n")
+         f.write("- **Preliminary model.** 100 samples is very small; recommend collecting more data.\n")
+         f.write(f"- Best epoch: {results['best_epoch']}, best metric: {results['best_metric']:.4f}\n\n")
+         f.write("## Per-class recall (validation)\n\n")
+         for c, r in results.get("recall_per_class", {}).items():
+             f.write(f"- Class {c}: {r:.4f}\n")
+         f.write("\n## Limitations\n- Small dataset; use audio as support in fusion. Do not rely on audio-only for critical decisions.\n")
+     print("Report written to", report_path)
+
+
+ if __name__ == "__main__":
+     main()
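Both trainers select checkpoints on `per_class_recall` rather than raw accuracy, so a majority class cannot mask a failing minority class. The same computation in pure Python, over index lists instead of tensors (illustrative mirror of the tensor version above):

```python
def per_class_recall(preds, targets, num_classes):
    """Fraction of samples of each class that were predicted correctly."""
    recall = {}
    for c in range(num_classes):
        idx = [i for i, t in enumerate(targets) if t == c]
        if not idx:
            recall[c] = 0.0  # class absent from this split
        else:
            recall[c] = sum(preds[i] == c for i in idx) / len(idx)
    return recall

r = per_class_recall(preds=[0, 1, 1, 2], targets=[0, 1, 2, 2], num_classes=3)
# class 2 has one of its two samples misclassified as class 1
macro = sum(r.values()) / 3
```

The audio trainer early-stops on the macro average of these values; the image trainer can optionally switch to the minimum via `early_stopping_metric: val_min_recall`, which is stricter still.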
training/train_image.py ADDED
@@ -0,0 +1,329 @@
+ """
+ Train Electrical Outlets image model.
+ FINAL v5: Frozen backbone → partial unfreeze. 5 classes, 1300 images.
+ """
+ from pathlib import Path
+ import sys
+ import argparse
+ from typing import Dict
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+ from torch.utils.data import DataLoader
+ from torchvision import transforms
+
+ ROOT = Path(__file__).resolve().parent.parent
+ sys.path.insert(0, str(ROOT))
+
+ from src.data.image_dataset import ElectricalOutletsImageDataset, get_image_class_weights
+ from src.models.image_model import ElectricalOutletsImageModel
+
+
+ def load_config(path):
+     import yaml
+     with open(path) as f:
+         return yaml.safe_load(f)
+
+
+ def focal_loss(logits, targets, alpha=0.25, gamma=2.0, weight=None):
+     ce = F.cross_entropy(logits, targets, reduction="none", weight=weight)
+     pt = torch.exp(-ce)
+     return (alpha * (1 - pt) ** gamma * ce).mean()
+
+
+ def per_class_recall(logits, targets, num_classes):
+     preds = logits.argmax(dim=1)
+     recall = {}
+     for c in range(num_classes):
+         mask = targets == c
+         recall[c] = (preds[mask] == c).float().mean().item() if mask.sum() > 0 else 0.0
+     return recall
+
+
+ def run_training(data_root, label_mapping_path, config, weights_dir, device="cuda"):
+     cfg_data = config["data"]
+     cfg_train = config["training"]
+     cfg_aug = config["augmentation"]
+     cfg_model = config["model"]
+
+     # Transforms
+     train_tf = transforms.Compose([
+         transforms.Resize(cfg_aug["resize"]),
+         transforms.RandomResizedCrop(cfg_aug["crop"], scale=(0.65, 1.0)),
+         transforms.RandomHorizontalFlip(0.5),
+         transforms.RandomRotation(15),
+         transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.05),
+         transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),
+         transforms.GaussianBlur(3, sigma=(0.1, 2.0)),
+         transforms.ToTensor(),
+         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+         transforms.RandomErasing(p=0.15),
+     ])
+     val_tf = transforms.Compose([
+         transforms.Resize(cfg_aug["resize"]),
+         transforms.CenterCrop(cfg_aug["crop"]),
+         transforms.ToTensor(),
+         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+     ])
+
+     # Datasets
+     train_ds = ElectricalOutletsImageDataset(
+         data_root, label_mapping_path, split="train",
+         train_ratio=cfg_data["train_ratio"], val_ratio=cfg_data["val_ratio"],
+         seed=cfg_data.get("seed", 42), transform=train_tf,
+     )
+     val_ds = ElectricalOutletsImageDataset(
+         data_root, label_mapping_path, split="val",
+         train_ratio=cfg_data["train_ratio"], val_ratio=cfg_data["val_ratio"],
+         seed=cfg_data.get("seed", 42), transform=val_tf,
+     )
+     train_loader = DataLoader(train_ds, batch_size=cfg_data["batch_size"], shuffle=True,
+                               num_workers=cfg_data.get("num_workers", 4), pin_memory=True)
+     val_loader = DataLoader(val_ds, batch_size=cfg_data["batch_size"], shuffle=False,
+                             num_workers=cfg_data.get("num_workers", 4))
+
+     num_classes = train_ds.num_classes
+     print(f"\nTrain: {len(train_ds)}, Val: {len(val_ds)}, Classes: {num_classes}")
+
+     # Class weights
+     class_weights = None
+     if cfg_train.get("use_class_weights", True):
+         class_weights = get_image_class_weights(label_mapping_path, data_root).to(device)
+         print(f"Class weights: {[f'{w:.3f}' for w in class_weights.tolist()]}")
+
+     use_focal = cfg_train.get("use_focal", True)
+     criterion_ce = nn.CrossEntropyLoss(weight=class_weights, label_smoothing=0.1)
+
+     # Model
+     model = ElectricalOutletsImageModel(
+         num_classes=num_classes,
+         label_mapping_path=label_mapping_path,
+         pretrained=True,
+         head_hidden=cfg_model.get("head_hidden", 256),
+         head_dropout=cfg_model.get("head_dropout", 0.4),
+     ).to(device)
+
+     # ══════════════════════════════════════════════
+     # STAGE 1: Frozen backbone — train head only
+     # ══════════════════════════════════════════════
+     for p in model.backbone.parameters():
+         p.requires_grad = False
+
+     trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
+     total_params = sum(p.numel() for p in model.parameters())
+     print(f"Params: {trainable:,} trainable / {total_params:,} total ({100*trainable/total_params:.1f}%)")
+
+     epochs = cfg_train["epochs"]
+     patience = cfg_train.get("early_stopping_patience", 20)
+     lr = cfg_train.get("lr", 3e-3)
+
+     opt = torch.optim.AdamW(
+         filter(lambda p: p.requires_grad, model.parameters()),
+         lr=lr, weight_decay=cfg_train.get("weight_decay", 1e-3),
+     )
+     sched = torch.optim.lr_scheduler.OneCycleLR(
+         opt, max_lr=lr, epochs=epochs,
+         steps_per_epoch=len(train_loader), pct_start=0.15,
+     )
+
+     print(f"\n{'='*60}")
+     print(f"  Stage 1: Frozen backbone, lr={lr}, {epochs} epochs max")
+     print(f"{'='*60}")
+
+     best_metric = -1.0
+     best_epoch = 0
+     wait = 0
+     recall = {}
+
+     for epoch in range(epochs):
+         model.train()
+         epoch_loss = 0
+         for x, y in train_loader:
+             x, y = x.to(device), y.to(device)
+             opt.zero_grad()
+             logits = model(x)
+             loss = focal_loss(logits, y, weight=class_weights) if use_focal else criterion_ce(logits, y)
+             loss.backward()
+             opt.step()
+             sched.step()
+             epoch_loss += loss.item()
+
+         # Validate
+         model.eval()
+         vl, vt = [], []
+         with torch.no_grad():
+             for x, y in val_loader:
+                 vl.append(model(x.to(device)).cpu())
+                 vt.append(y)
+         vl, vt = torch.cat(vl), torch.cat(vt)
+         recall = per_class_recall(vl, vt, num_classes)
+         min_r = min(recall.values())
+         macro_r = sum(recall.values()) / num_classes
+         val_acc = (vl.argmax(1) == vt).float().mean().item()
+         metric = min_r if cfg_train.get("early_stopping_metric") == "val_min_recall" else macro_r
+
+         star = ""
+         if metric > best_metric:
+             best_metric = metric
+             best_epoch = epoch
+             wait = 0
+             weights_dir.mkdir(parents=True, exist_ok=True)
+ weights_dir.mkdir(parents=True, exist_ok=True)
172
+ torch.save({
173
+ "model_state_dict": model.state_dict(),
174
+ "num_classes": num_classes,
175
+ "idx_to_issue_type": model.idx_to_issue_type,
176
+ "idx_to_severity": model.idx_to_severity,
177
+ }, weights_dir / config["output"]["best_name"])
178
+ star = " β˜…"
179
+ else:
180
+ wait += 1
181
+
182
+ print(f"E{epoch:3d} loss={epoch_loss/len(train_loader):.4f} acc={val_acc:.3f} "
183
+ f"min_r={min_r:.3f} macro={macro_r:.3f} best={best_metric:.3f}@{best_epoch}{star}")
184
+
185
+ if wait >= patience:
186
+ print(f"Early stop @ {epoch}")
187
+ break
188
+
+     # ══════════════════════════════════════════════
+     # STAGE 2: Unfreeze last 2 backbone blocks
+     # ══════════════════════════════════════════════
+     if cfg_train.get("finetune_last_blocks", True) and best_metric > 0.15:
+         print(f"\n{'='*60}")
+         print(f" Stage 2: Partial unfreeze (last 2 blocks)")
+         print(f"{'='*60}")
+
+         ckpt = torch.load(weights_dir / config["output"]["best_name"], map_location=device)
+         model.load_state_dict(ckpt["model_state_dict"])
+
+         for p in model.backbone.parameters():
+             p.requires_grad = False
+         for name, p in model.backbone.named_parameters():
+             if "features.7" in name or "features.8" in name:
+                 p.requires_grad = True
+         # Head stays trainable
+         for p in model.head.parameters():
+             p.requires_grad = True
+
+         ft_lr = cfg_train.get("finetune_lr", 5e-5)
+         ft_epochs = cfg_train.get("finetune_epochs", 25)
+         opt2 = torch.optim.AdamW(
+             filter(lambda p: p.requires_grad, model.parameters()),
+             lr=ft_lr, weight_decay=1e-3,
+         )
+         sched2 = torch.optim.lr_scheduler.CosineAnnealingLR(opt2, T_max=ft_epochs, eta_min=1e-6)
+         wait2 = 0
+
+         for epoch in range(ft_epochs):
+             model.train()
+             el = 0
+             for x, y in train_loader:
+                 x, y = x.to(device), y.to(device)
+                 opt2.zero_grad()
+                 logits = model(x)
+                 loss = focal_loss(logits, y, weight=class_weights) if use_focal else criterion_ce(logits, y)
+                 loss.backward()
+                 torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
+                 opt2.step()
+                 el += loss.item()
+             sched2.step()
+
+             model.eval()
+             vl, vt = [], []
+             with torch.no_grad():
+                 for x, y in val_loader:
+                     vl.append(model(x.to(device)).cpu())
+                     vt.append(y)
+             vl, vt = torch.cat(vl), torch.cat(vt)
+             recall = per_class_recall(vl, vt, num_classes)
+             min_r = min(recall.values())
+             macro_r = sum(recall.values()) / num_classes
+             val_acc = (vl.argmax(1) == vt).float().mean().item()
+             metric = min_r if cfg_train.get("early_stopping_metric") == "val_min_recall" else macro_r
+
+             star = ""
+             if metric > best_metric:
+                 best_metric = metric
+                 best_epoch = epoch + 1000
+                 wait2 = 0
+                 torch.save({
+                     "model_state_dict": model.state_dict(),
+                     "num_classes": num_classes,
+                     "idx_to_issue_type": model.idx_to_issue_type,
+                     "idx_to_severity": model.idx_to_severity,
+                 }, weights_dir / config["output"]["best_name"])
+                 star = " ★"
+             else:
+                 wait2 += 1
+
+             print(f" FT{epoch:3d} loss={el/len(train_loader):.4f} acc={val_acc:.3f} "
+                   f"min_r={min_r:.3f} macro={macro_r:.3f} best={best_metric:.3f}{star}")
+             if wait2 >= 10:
+                 print(f" FT early stop @ {epoch}")
+                 break
+
+     # Temperature scaling
+     if config.get("calibration", {}).get("use_temperature_scaling", False):
+         ckpt = torch.load(weights_dir / config["output"]["best_name"], map_location=device)
+         model.load_state_dict(ckpt["model_state_dict"])
+         model.eval()
+         cal_size = max(1, int(len(val_ds) * 0.5))
+         cl, ct = [], []
+         for i in range(cal_size):
+             x, y = val_ds[i]
+             with torch.no_grad():
+                 cl.append(model(x.unsqueeze(0).to(device)).cpu())
+             ct.append(y)
+         cl, ct = torch.cat(cl), torch.tensor(ct)
+         temp = nn.Parameter(torch.ones(1) * 1.5)
+         opt_c = torch.optim.LBFGS([temp], lr=0.01, max_iter=50)
+         def eval_c():
+             opt_c.zero_grad()
+             l = F.cross_entropy(cl / temp, ct)
+             l.backward()
+             return l
+         opt_c.step(eval_c)
+         ckpt["temperature"] = temp.item()
+         torch.save(ckpt, weights_dir / config["output"]["best_name"])
+         print(f"Temperature T={temp.item():.4f}")
+
+     print(f"\n{'='*60}")
+     print(f" DONE — Best: {best_metric:.4f}")
+     per_cls = " | ".join([f"C{c}={r:.2f}" for c, r in recall.items()])
+     print(f" Recall: {per_cls}")
+     print(f"{'='*60}\n")
+
+     return {"best_epoch": best_epoch, "best_metric": best_metric, "recall_per_class": recall}
+
+
+ def main():
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--config", default="config/image_train_config.yaml")
+     parser.add_argument("--data_root", default=None)
+     parser.add_argument("--weights_dir", default="weights")
+     parser.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu")
+     args = parser.parse_args()
+     root = ROOT
+     config = load_config(root / args.config)
+     data_root = Path(args.data_root) if args.data_root else root / config["data"]["root"]
+     label_mapping_path = root / config["data"]["label_mapping"]
+     weights_dir = root / args.weights_dir
+     results = run_training(data_root, label_mapping_path, config, weights_dir, args.device)
+
+     report_path = root / "docs" / config["output"]["report_name"]
+     report_path.parent.mkdir(parents=True, exist_ok=True)
+     with open(report_path, "w") as f:
+         f.write("# Image Model Report (Electrical Outlets)\n\n")
+         f.write(f"- Best metric: {results['best_metric']:.4f}\n")
+         f.write(f"- Classes: 5 (burn, cracked, loose, normal, water)\n\n")
+         f.write("## Per-class recall\n\n")
+         issue_names = ["burn_overheating", "cracked_faceplate", "loose_outlet", "normal", "water_exposed"]
+         for c, r in results.get("recall_per_class", {}).items():
+             name = issue_names[c] if c < len(issue_names) else f"class_{c}"
+             f.write(f"- {name}: {r:.4f}\n")
+     print("Report:", report_path)
+
+
+ if __name__ == "__main__":
+     main()
weights/.gitkeep ADDED
@@ -0,0 +1 @@
+