Electrical Outlets & Switches Diagnostic Pipeline

Non-intrusive AI diagnostic system for electrical outlets and switches using image classification and audio analysis with decision-level fusion.

Overview

This pipeline analyzes photos and/or audio recordings of electrical outlets and switches to flag potential safety issues without physical inspection. Two independent models, one per modality, produce predictions that are combined at the decision level.

Image Model

Architecture: EfficientNet-B0 (frozen backbone) + MLP head (512 hidden units → 5 classes)
Classes: burn/overheating, cracked faceplate, loose outlet, normal, water exposed
Performance: 77.3% accuracy, 66.7% minimum per-class recall
Training data: 1,299 images across 10 source categories merged into 5 classes

Audio Model

Architecture: 3-layer Spectrogram CNN (32→64→128 channels + adaptive pooling)
Classes: normal, buzzing, crackling/arcing, arcing pop
Performance: 100% macro recall on validation
Training data: 100 WAV files (22,050 Hz sample rate, converted to mel spectrograms and augmented with SpecAugment)

Fusion

Decision-level fusion combining both modalities
Safety-first: prefers "uncertain" over "normal" when in doubt
Severity = max(image_severity, audio_severity)
Configurable confidence thresholds in config/thresholds.yaml
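
The max rule for severity can be sketched in a few lines of Python. This is a minimal illustration, not the code in src/fusion/fusion_logic.py: the function name `max_severity` and the treatment of a missing modality (passed as `None`) are assumptions, while the ordering follows the Severity Levels table.

```python
# Severity ordering from the Severity Levels table: low < medium < high < critical.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def max_severity(image_severity, audio_severity):
    """Return the more severe of the two levels.

    Assumption: a modality that was not supplied is passed as None and ignored.
    """
    present = [s for s in (image_severity, audio_severity) if s is not None]
    if not present:
        raise ValueError("at least one modality must report a severity")
    # Rank each level by its position in SEVERITY_ORDER and keep the highest.
    return max(present, key=SEVERITY_ORDER.index)


print(max_severity("medium", "critical"))
print(max_severity("high", None))
```

Comparing positions in an ordered list avoids comparing the level names as strings, which would sort alphabetically rather than by severity.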

Project Structure

CV/
├── config/
│   ├── label_mapping.json          # Class definitions & folder→class mapping
│   ├── image_train_config.yaml     # Image training hyperparameters
│   ├── audio_train_config.yaml     # Audio training hyperparameters
│   ├── thresholds.yaml             # Fusion confidence thresholds
│   └── schema.yaml                 # API output schema
├── src/
│   ├── data/
│   │   ├── image_dataset.py        # Image dataset with stratified splits
│   │   └── audio_dataset.py        # Audio dataset with stratified splits
│   ├── models/
│   │   ├── image_model.py          # EfficientNet-B0 + MLP classifier
│   │   └── audio_model.py          # Spectrogram CNN classifier
│   ├── fusion/
│   │   └── fusion_logic.py         # Decision-level fusion
│   └── inference/
│       └── wrapper.py              # End-to-end inference pipeline
├── training/
│   ├── train_image.py              # Image model training (2-stage)
│   └── train_audio.py              # Audio model training
├── api/
│   └── main.py                     # FastAPI endpoint
├── weights/
│   ├── electrical_outlets_image_best.pt   # Trained image model
│   └── electrical_outlets_audio_best.pt   # Trained audio model
├── tests/
│   └── test_fusion.py              # Fusion logic tests
├── test_single_image.py            # Quick single-image testing
├── requirements.txt
└── README.md

Setup

Requirements

Python 3.10+
NVIDIA GPU with CUDA (recommended: RTX 3090 or better)

Installation

git clone https://huggingface.co/<your-repo>/electrical-outlets-diagnostic
cd electrical-outlets-diagnostic

pip install -r requirements.txt

# With an NVIDIA GPU: install the CUDA 12.4 build of PyTorch
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
# Also needed on Windows:
pip install soundfile

Download Weights

Download the model weights from the HuggingFace repository and place them in weights/:

weights/
├── electrical_outlets_image_best.pt   (~ 17 MB)
└── electrical_outlets_audio_best.pt   (~ 2 MB)

Usage

Test a Single Image

python test_single_image.py --image path/to/outlet_photo.jpg

Output:

```
burned_outlet.jpg

→ burn_overheating (high severity)
→ 87.3% confidence
→ issue_detected

burn_overheating   87.3% ██████████████████████████ ◄
cracked_faceplate   5.2% █
loose_outlet        3.1% ▊
normal              2.8% ▊
water_exposed       1.6% ▍
```

API Server

```bash
uvicorn api.main:app --host 0.0.0.0 --port 8000
```

Endpoints

POST /v1/diagnose/electrical_outlets

Upload image and/or audio for diagnosis:

# Image only
curl -X POST http://localhost:8000/v1/diagnose/electrical_outlets \
  -F "image=@outlet_photo.jpg"

# Image + Audio
curl -X POST http://localhost:8000/v1/diagnose/electrical_outlets \
  -F "image=@outlet_photo.jpg" \
  -F "audio=@outlet_recording.wav"

Response:

{
  "diagnostic_element": "electrical_outlets",
  "result": "issue_detected",
  "issue_type": "burn_overheating",
  "severity": "high",
  "confidence": 0.873,
  "modality_contributions": null,
  "primary_issue": "burn_overheating",
  "secondary_issue": null
}
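
A client will usually want to turn this response into a recommended action. The sketch below parses the JSON shown above and looks up the action from the Severity Levels table. The function name `summarize` and the output wording are illustrative assumptions; only the field names and severity levels come from this README.

```python
import json

# Actions paraphrased from the Severity Levels table in this README.
ACTIONS = {
    "low": "Monitor (no immediate action)",
    "medium": "Schedule repair",
    "high": "Shut off circuit immediately",
    "critical": "Shut off main breaker immediately",
}


def summarize(response_text):
    """Turn a /v1/diagnose/electrical_outlets response body into one line of text."""
    data = json.loads(response_text)
    if data["result"] == "issue_detected":
        action = ACTIONS[data["severity"]]
        return f'{data["issue_type"]} ({data["confidence"]:.1%}): {action}'
    # "normal" and "uncertain" carry no issue type, so report the result as-is.
    return data["result"]


example = """{
  "diagnostic_element": "electrical_outlets",
  "result": "issue_detected",
  "issue_type": "burn_overheating",
  "severity": "high",
  "confidence": 0.873,
  "modality_contributions": null,
  "primary_issue": "burn_overheating",
  "secondary_issue": null
}"""
print(summarize(example))
```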

GET /health — Check model availability

Python API

from src.inference.wrapper import run_electrical_outlets_inference

result = run_electrical_outlets_inference(
    image_path="path/to/photo.jpg",
    audio_path="path/to/recording.wav",  # optional
)
print(result)

Training

Image Model

python training/train_image.py --device cuda

Two-stage training:

Stage 1: Frozen EfficientNet-B0 backbone, train MLP head only (80-100 epochs)
Stage 2: Unfreeze last 2 backbone blocks, fine-tune with low LR (25 epochs)

Audio Model

python training/train_audio.py --device cuda

Single-stage with SpecAugment, class-weighted loss, cosine LR schedule.
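
For readers unfamiliar with these two ingredients, here is a minimal pure-Python sketch of a cosine learning-rate schedule and of inverse-frequency class weights. It is not taken from training/train_audio.py: the epoch count, base learning rate, and per-class clip counts are hypothetical, and the weighting formula is one common scheme rather than necessarily the one used here.

```python
import math


def cosine_lr(epoch, total_epochs, base_lr, min_lr=0.0):
    """Cosine annealing: base_lr at epoch 0, decaying smoothly to min_lr at total_epochs."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count), so rarer classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


# Hypothetical settings: 40 epochs, base LR 1e-3, 100 clips split 40/20/20/20.
for epoch in (0, 20, 40):
    print(epoch, cosine_lr(epoch, total_epochs=40, base_lr=1e-3))
print(inverse_frequency_weights([40, 20, 20, 20]))
```

In PyTorch the same ideas are available as `torch.optim.lr_scheduler.CosineAnnealingLR` and the `weight` argument of `torch.nn.CrossEntropyLoss`.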

Class Mapping

Image Classes (5)

Class	Issue Type	Severity	Source Folders
0	burn_overheating	high	Burn marks (250), Discoloration (100), Sparking damage (150)
1	cracked_faceplate	medium	Cracked faceplate (150), Damaged switches (50)
2	loose_outlet	medium	Loose outlet (200), Exposed wiring (150)
3	normal	low	Normal outlets (50), Normal switches (50)
4	water_exposed	high	Water intrusion (150)
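
The folder-to-class merge in the table can be expressed as a small lookup. The sketch below copies the folder names and counts from the table, but the dictionary layout is illustrative only; the actual schema of config/label_mapping.json may differ. The table counts sum to 1,300, one more than the reported total of 1,299, so treat them as approximate.

```python
# Folder name -> (class name, approximate image count), copied from the table above.
# The layout is an assumption, not the real config/label_mapping.json schema.
FOLDER_TO_CLASS = {
    "Burn marks": ("burn_overheating", 250),
    "Discoloration": ("burn_overheating", 100),
    "Sparking damage": ("burn_overheating", 150),
    "Cracked faceplate": ("cracked_faceplate", 150),
    "Damaged switches": ("cracked_faceplate", 50),
    "Loose outlet": ("loose_outlet", 200),
    "Exposed wiring": ("loose_outlet", 150),
    "Normal outlets": ("normal", 50),
    "Normal switches": ("normal", 50),
    "Water intrusion": ("water_exposed", 150),
}


def images_per_class(mapping):
    """Sum the folder counts for each merged class."""
    totals = {}
    for class_name, count in mapping.values():
        totals[class_name] = totals.get(class_name, 0) + count
    return totals


totals = images_per_class(FOLDER_TO_CLASS)
print(totals)
print(sum(totals.values()))
```

The per-class totals make the imbalance visible: burn_overheating has five times as many images as normal.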

Audio Classes (4)

Class	Issue Type	Severity
0	normal	low
1	buzzing	high
2	crackling_arcing	high
3	arcing_pop	critical

Severity Levels

Level	Action Required
low	Monitor — no immediate action
medium	Schedule repair
high	Shut off circuit immediately
critical	Shut off main breaker immediately

Fusion Logic

The fusion layer combines image and audio predictions:

If both modalities predict an issue → issue_detected with the higher of the two severities
If both predict normal with high confidence → normal
If they disagree → uncertain, unless one modality exceeds 92% confidence
Safety-first: when confidence is low, the result defaults to uncertain rather than normal
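
The rules above can be sketched as a single function. This is a hedged illustration rather than the contents of src/fusion/fusion_logic.py. The names `Prediction` and `fuse` and the `NORMAL_MIN_CONF` value are hypothetical (the real thresholds live in config/thresholds.yaml). The sketch also assumes that when one modality exceeds 92% confidence during a disagreement, its prediction is followed.

```python
from dataclasses import dataclass

SEVERITY_ORDER = ["low", "medium", "high", "critical"]
OVERRIDE_CONF = 0.92    # the ">92% confidence" rule from the list above
NORMAL_MIN_CONF = 0.80  # hypothetical; see config/thresholds.yaml for real values


@dataclass
class Prediction:
    label: str         # e.g. "normal", "burn_overheating", "buzzing"
    severity: str      # one of SEVERITY_ORDER
    confidence: float  # top-class probability in [0, 1]


def fuse(image: Prediction, audio: Prediction) -> dict:
    """Combine one image and one audio prediction into a single result."""
    worst = max(image, audio, key=lambda p: SEVERITY_ORDER.index(p.severity))
    image_normal = image.label == "normal"
    audio_normal = audio.label == "normal"

    # Both modalities report an issue: keep the more severe one.
    if not image_normal and not audio_normal:
        return {"result": "issue_detected", "issue_type": worst.label,
                "severity": worst.severity}

    # Both report normal: accept only if both are confident (safety-first).
    if image_normal and audio_normal:
        confident = min(image.confidence, audio.confidence) >= NORMAL_MIN_CONF
        return {"result": "normal" if confident else "uncertain",
                "issue_type": None, "severity": "low"}

    # Disagreement: follow one modality only if it is very confident.
    strongest = max(image, audio, key=lambda p: p.confidence)
    if strongest.confidence > OVERRIDE_CONF:
        if strongest.label == "normal":
            return {"result": "normal", "issue_type": None, "severity": "low"}
        return {"result": "issue_detected", "issue_type": strongest.label,
                "severity": strongest.severity}
    return {"result": "uncertain", "issue_type": None, "severity": worst.severity}


print(fuse(Prediction("burn_overheating", "high", 0.70),
           Prediction("arcing_pop", "critical", 0.60)))
print(fuse(Prediction("burn_overheating", "high", 0.60),
           Prediction("normal", "low", 0.70)))
```

Because image and audio use different label sets, "agreement" here means both modalities predict a non-normal class, not that the labels are identical.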

Limitations

Image model trained on web-sourced images (some watermarked/AI-generated)
Audio model trained on 100 synthetic clips — use as supporting evidence only
Water damage and cracked faceplate classes have lower recall (64-67%)
No GFCI failure detection (no training data available)
Real-world accuracy will be lower than validation metrics

Evaluation Results

Image Model (V5.1 – Final)

Dataset: 1,299 images
Validation Split: 194 images
Best Epoch: 76

Overall Metrics

Metric	Value
Accuracy	77.3%
Minimum Per-Class Recall	66.7%
Macro Recall	77.0%
Trainable Parameters	658,437 (14.1%)
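
For reference, accuracy, per-class recall, macro recall, and minimum per-class recall all derive from a confusion matrix. The sketch below shows the arithmetic on a toy 3-class matrix; the numbers are made up for illustration and are not the model's validation results.

```python
def recall_metrics(confusion):
    """confusion[i][j] = number of samples with true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    # Recall for class i = correctly predicted class-i samples / all true class-i samples.
    per_class = [row[i] / sum(row) for i, row in enumerate(confusion)]
    return {
        "accuracy": correct / total,
        "per_class_recall": per_class,
        "macro_recall": sum(per_class) / len(per_class),
        "min_recall": min(per_class),
    }


# Toy data, illustrative only.
toy = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 0, 10],
]
print(recall_metrics(toy))
```

Macro recall averages the per-class values without weighting by class size, so a small class such as normal counts as much as a large one such as burn_overheating.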

Per-Class Recall

Class	Recall	Notes
burn_overheating	68%	Confused with dark loose_outlet cases
cracked_faceplate	63%	Limited data (200 images)
loose_outlet	98%	Strong visual pattern
normal	93%	Despite only 100 images
water_exposed	64%	Subtle cues, limited data

Audio Model

Dataset: 100 WAV files
Validation Recall: 100% macro recall
Converged: Epoch 15

⚠ Audio validation dataset is small and partially synthetic; real-world generalization may differ.

Training Configuration

Image Model

Backbone: EfficientNet-B0 (ImageNet pretrained)
Stage 1: Frozen backbone, head training (80–100 epochs)
Stage 2: Partial unfreeze (last 2 blocks), low LR fine-tuning
Optimizer: AdamW
Fine-tune LR: 2e-4
Head: 512 hidden units, 0.5 dropout
Early stopping: patience=25
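
The 658,437 trainable parameters reported under Overall Metrics are consistent with a two-layer head on top of EfficientNet-B0's 1280-dimensional pooled features. The check below assumes the head is Linear(1280 → 512) followed by Linear(512 → 5) with biases; that layout is inferred from the numbers, not confirmed by src/models/image_model.py. The backbone size is derived from the parameter count torchvision lists for efficientnet_b0 (5,288,548) minus its 1000-class classifier.

```python
def linear_params(in_features, out_features):
    """Parameter count of a fully connected layer: weights plus biases."""
    return in_features * out_features + out_features


BACKBONE_FEATURES = 1280  # EfficientNet-B0 pooled feature width
HIDDEN_UNITS = 512        # "Head: 512 hidden units"
NUM_CLASSES = 5

head_params = (linear_params(BACKBONE_FEATURES, HIDDEN_UNITS)
               + linear_params(HIDDEN_UNITS, NUM_CLASSES))

# Assumption: backbone = torchvision efficientnet_b0 without its ImageNet classifier.
backbone_params = 5_288_548 - linear_params(BACKBONE_FEATURES, 1000)
trainable_share = head_params / (backbone_params + head_params)

print(head_params)
print(f"{trainable_share:.1%}")
```

With the backbone frozen, only the head is trained, which matches the reported 14.1% trainable share.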

Audio Model

3-layer CNN
Mel spectrogram input (22050 Hz)
SpecAugment enabled
Cosine LR scheduler
Class-weighted cross-entropy

Model Evolution Summary

Version	Min Recall	Accuracy	Key Change
V1	31.8%	47%	Baseline
V2	26.7%	44%	High LR → overfitting
V3	27.2%	52%	Frozen backbone
V4	0%	—	Folder mapping bug
V5	63.6%	77.3%	Fixed dataset loading
V5.1	66.7%	77.3%	Larger head + improved LR

Total improvement (V1 → V5.1):
+34.9 pts minimum recall (31.8% → 66.7%)
+30.3 pts accuracy (47% → 77.3%)

Bias, Risks & Safety Considerations

Trained on web-sourced images → may not generalize to low-light industrial environments
Audio dataset is small and partially synthetic
Some image classes are underrepresented (cracked_faceplate, water_exposed)
Not certified for electrical compliance decisions
Should not replace licensed electrical inspection

Recommended use: screening / preliminary diagnostics only

Future Improvements

Add 100–200 more real cracked/water samples
Clean watermarked images
Upgrade backbone to ConvNeXt-Tiny or EfficientNet-B2
Collect real-world buzzing/arcing audio

License

Proprietary — for use in the Electrical Outlets diagnostic pipeline only.
