ViT Model in DCRM Pipeline - Complete Explanation

What is `vitResult`?

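vitResult is the output of an ensemble that combines a Vision Transformer (ViT) with Google Gemini. The ensemble analyzes the DCRM (Dynamic Contact Resistance Measurement) resistance plot image and assigns it to one of five classes: Healthy or one of four contact defects.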

📊 Complete Flow (Step-by-Step)

Step 1: Generate Resistance Plot

File: core/models/vit_classifier.py → plot_resistance_for_vit()

# Creates a plot with 3 lines:
# - Green line: Resistance profile
# - Blue line: Current profile  
# - Red line: Travel profile

# Saves as temporary PNG file: temp_vit_plot_{phase}_{uuid}.png

Example: temp_vit_plot_r_a3f8d2b1.png

Step 2: ViT Model Analysis (Remote API)

File: core/models/vit_classifier.py → get_remote_vit_probabilities()

# Sends image to deployed ViT model API
DEPLOYED_VIT_URL = "http://143.110.244.235/predict"

# ViT is trained on DCRM images to distinguish 5 classes (Healthy plus 4 defect types):
CLASSES = [
    "Healthy",
    "Arcing_Contact_Misalignment",
    "Arcing_Contact_Wear",
    "Main Contact Misalignment",
    "main_contact_wear"
]

# Returns probability distribution for each class
vit_probs = {
    "Healthy": 0.507,
    "Arcing_Contact_Misalignment": 0.120,
    "Arcing_Contact_Wear": 0.044,
    "Main Contact Misalignment": 0.142,
    "main_contact_wear": 0.186
}

How ViT Works:

ViT (Vision Transformer) is a deep learning model trained on DCRM plot images.
It learned visual patterns from thousands of circuit breaker test plots.
It analyzes waveform shapes, spikes, plateaus, and transitions.
It outputs a probability for each of the five classes.
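
Before a returned distribution is used downstream, it can be sanity-checked. The following is a minimal sketch, not the project's code: `top_class` is a hypothetical helper and the 0.01 tolerance is an assumed value. It checks that every class is present and that the probabilities sum to roughly 1, then returns the most likely class.

```python
# Hypothetical helper for illustration; not part of core/models/vit_classifier.py.
CLASSES = [
    "Healthy",
    "Arcing_Contact_Misalignment",
    "Arcing_Contact_Wear",
    "Main Contact Misalignment",
    "main_contact_wear",
]


def top_class(probs, classes=CLASSES, tol=0.01):
    """Validate a probability dict and return (label, probability) of its maximum."""
    missing = [c for c in classes if c not in probs]
    if missing:
        raise ValueError(f"missing classes: {missing}")
    total = sum(probs[c] for c in classes)
    if abs(total - 1.0) > tol:  # assumed tolerance for rounding error
        raise ValueError(f"probabilities sum to {total:.3f}, expected about 1")
    best = max(classes, key=lambda c: probs[c])
    return best, probs[best]


# Example values copied from the vit_probs listing above.
vit_probs = {
    "Healthy": 0.507,
    "Arcing_Contact_Misalignment": 0.120,
    "Arcing_Contact_Wear": 0.044,
    "Main Contact Misalignment": 0.142,
    "main_contact_wear": 0.186,
}
print(top_class(vit_probs))  # ('Healthy', 0.507)
```

With these example values, ViT on its own would pick Healthy at 0.507, so the later ensemble step is what changes the final class.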

Step 3: Gemini AI Analysis

File: core/models/vit_classifier.py → get_gemini_prediction()

# Sends same image to Google Gemini 2.0 Flash
# Uses expert prompt with diagnostic heuristics:

Diagnostic Rules:
1. "The Significant Grass" → Main Contact Corrosion
   - Jagged, irregular resistance plateau (> 15-20μΩ variance)
   
2. "Big Spikes & Short Wipe" → Arcing Contact Wear
   - Large amplitude spikes, shortened arcing zone
   
3. "The Struggle to Settle" → Main Misalignment
   - High-amplitude peaks before plateau (> 3-5ms)
   
4. "Rough Entry" → Arcing Misalignment
   - Erratic spikes during initial entry
   
5. "Stretched Time" → Slow Mechanism
   - Elongated resistance profile on X-axis

# Returns probability distribution
gemini_probs = {
    "Healthy": 0.05,
    "Arcing_Contact_Misalignment": 0.02,
    "Arcing_Contact_Wear": 0.01,
    "Main Contact Misalignment": 0.02,
    "main_contact_wear": 0.90  # High confidence!
}
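
The reply format of get_gemini_prediction() is not shown in this document. As a hedged sketch, assuming the model is asked to answer with a single JSON object keyed by class name, a reply could be turned into a normalized distribution as below. `parse_probs` and the sample reply string are hypothetical.

```python
import json
import re

CLASSES = [
    "Healthy",
    "Arcing_Contact_Misalignment",
    "Arcing_Contact_Wear",
    "Main Contact Misalignment",
    "main_contact_wear",
]


def parse_probs(reply_text, classes=CLASSES):
    """Extract a JSON object from an LLM reply and return normalized class probabilities."""
    # Assumption: the reply contains exactly one flat JSON object, possibly
    # surrounded by prose. Take everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", reply_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    raw = json.loads(match.group(0))
    # Unknown keys are ignored, missing classes count as 0, negatives are clipped.
    probs = {c: max(float(raw.get(c, 0.0)), 0.0) for c in classes}
    total = sum(probs.values())
    if total == 0:
        raise ValueError("reply contained no usable probabilities")
    return {c: p / total for c, p in probs.items()}


# Made-up reply text that carries the gemini_probs values shown above.
reply = (
    'Assessment: {"Healthy": 0.05, "Arcing_Contact_Misalignment": 0.02, '
    '"Arcing_Contact_Wear": 0.01, "Main Contact Misalignment": 0.02, '
    '"main_contact_wear": 0.90}'
)
gemini_probs = parse_probs(reply)
print(max(gemini_probs, key=gemini_probs.get))  # main_contact_wear
```

Normalizing after parsing keeps the Gemini distribution on the same scale as the ViT one even if the model's numbers do not sum exactly to 1.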

Step 4: Ensemble Prediction

File: core/models/vit_classifier.py → predict_dcrm_image()

# Combines ViT + Gemini predictions
# ensemble_score = vit_prob + gemini_prob

ensemble_scores = {
    "Healthy": 0.557,                      # 0.507 + 0.05
    "Arcing_Contact_Misalignment": 0.140,  # 0.120 + 0.02
    "Arcing_Contact_Wear": 0.054,          # 0.044 + 0.01
    "Main Contact Misalignment": 0.162,    # 0.142 + 0.02
    "main_contact_wear": 1.086             # 0.186 + 0.90  ✅ HIGHEST!
}

# Selects class with highest ensemble score
predicted_class = "main_contact_wear"
confidence = 0.543  # Normalized confidence: in this example, 1.086 / 2 models
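
The arithmetic above can be sketched as a small function. This is an illustration rather than the body of predict_dcrm_image(): the function name `ensemble` is hypothetical, and the division by 2 is inferred from the example values (1.086 / 2 = 0.543), not confirmed against the source.

```python
def ensemble(vit_probs, gemini_probs):
    """Sum per-class probabilities and pick the class with the highest score."""
    scores = {c: vit_probs[c] + gemini_probs.get(c, 0.0) for c in vit_probs}
    predicted = max(scores, key=scores.get)
    # Assumption: confidence is the winning score divided by the number of
    # models (2), which is the mean of the two model probabilities.
    confidence = scores[predicted] / 2
    return predicted, confidence, scores


vit_probs = {
    "Healthy": 0.507,
    "Arcing_Contact_Misalignment": 0.120,
    "Arcing_Contact_Wear": 0.044,
    "Main Contact Misalignment": 0.142,
    "main_contact_wear": 0.186,
}
gemini_probs = {
    "Healthy": 0.05,
    "Arcing_Contact_Misalignment": 0.02,
    "Arcing_Contact_Wear": 0.01,
    "Main Contact Misalignment": 0.02,
    "main_contact_wear": 0.90,
}

predicted_class, confidence, ensemble_scores = ensemble(vit_probs, gemini_probs)
print(predicted_class, round(confidence, 3))  # main_contact_wear 0.543
```

Because the scores are a plain sum, a single very confident model can outvote the other, which is what happens here: Gemini's 0.90 outweighs ViT's preference for Healthy.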

Step 5: Integration into Pipeline

File: apps/flask_server.py → process_single_phase_csv()

# Lines 155-183
# Requires `import os` and `import uuid` at module level
vit_result = None
vit_plot_path = f"temp_vit_plot_{phase_name}_{uuid.uuid4().hex[:8]}.png"

try:
    # Generate plot
    if plot_resistance_for_vit(df, vit_plot_path):
        # Get prediction
        vit_class, vit_conf, vit_details = predict_dcrm_image(vit_plot_path, api_key=api_key)

        vit_result = {
            "class": vit_class,           # "main_contact_wear"
            "confidence": vit_conf,       # 0.5429375439882278
            "details": vit_details        # Full breakdown below
        }
finally:
    # Cleanup temp file; it may not exist if plotting failed
    if os.path.exists(vit_plot_path):
        os.remove(vit_plot_path)

📦 vitResult Structure Breakdown

{
  "class": "main_contact_wear",           // ✅ FINAL PREDICTION
  "confidence": 0.5429375439882278,       // ✅ NORMALIZED CONFIDENCE
  "details": {
    "vit_probs": {                        // 🤖 Vision Transformer probabilities
      "Healthy": 0.5076556205749512,
      "Arcing_Contact_Misalignment": 0.12034504860639572,
      "Arcing_Contact_Wear": 0.04370640590786934,
      "Main Contact Misalignment": 0.1424178034067154,
      "main_contact_wear": 0.1858750879764557
    },
    "gemini_probs": {                     // 🧠 Gemini AI probabilities
      "Healthy": 0.05,
      "Arcing_Contact_Misalignment": 0.02,
      "Arcing_Contact_Wear": 0.01,
      "Main Contact Misalignment": 0.02,
      "main_contact_wear": 0.9            // Gemini is very confident!
    },
    "ensemble_scores": {                  // 🎯 COMBINED SCORES
      "Healthy": 0.5576556205749512,
      "Arcing_Contact_Misalignment": 0.1403450486063957,
      "Arcing_Contact_Wear": 0.05370640590786934,
      "Main Contact Misalignment": 0.16241780340671538,
      "main_contact_wear": 1.0858750879764556  // ✅ HIGHEST → WINNER
    }
  }
}
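
One way to read the details block is to check whether the two models agree before trusting the final class. The sketch below is a hypothetical consumer of the structure above, with values rounded for brevity; `model_agreement` is not a function from the project.

```python
def model_agreement(vit_result):
    """Report each model's top class and whether the two models agree."""
    details = vit_result["details"]
    vit_top = max(details["vit_probs"], key=details["vit_probs"].get)
    gemini_top = max(details["gemini_probs"], key=details["gemini_probs"].get)
    return {
        "vit_top": vit_top,
        "gemini_top": gemini_top,
        "final": vit_result["class"],
        "models_agree": vit_top == gemini_top,
    }


# Rounded copy of the example vitResult shown above.
vit_result = {
    "class": "main_contact_wear",
    "confidence": 0.543,
    "details": {
        "vit_probs": {
            "Healthy": 0.508,
            "Arcing_Contact_Misalignment": 0.120,
            "Arcing_Contact_Wear": 0.044,
            "Main Contact Misalignment": 0.142,
            "main_contact_wear": 0.186,
        },
        "gemini_probs": {
            "Healthy": 0.05,
            "Arcing_Contact_Misalignment": 0.02,
            "Arcing_Contact_Wear": 0.01,
            "Main Contact Misalignment": 0.02,
            "main_contact_wear": 0.9,
        },
    },
}

summary = model_agreement(vit_result)
print(summary["vit_top"], summary["gemini_top"], summary["models_agree"])
```

For this example the models disagree (ViT favors Healthy, Gemini favors main_contact_wear), which helps explain why the normalized confidence is only about 0.54.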

🔍 Why Two Models?

Model	Strengths	Weaknesses
ViT	Trained on real DCRM data; fast inference; consistent	May overfit to training data; limited to visual patterns
Gemini	Expert reasoning; contextual understanding; adapts to new cases	May hallucinate; slower; requires API calls
Ensemble	✅ Best of both worlds; ViT provides baseline; Gemini adds expertise	Slightly higher computational cost

🎯 How It's Used in the Pipeline

The vitResult is:

Generated in flask_server.py (lines 155-183)
Passed to report_generator.py
Included in final JSON output under each phase (r, y, b)
Referenced in fault summaries for LLM context

Example Usage:

# In report_generator.py
if vit_result:
    faults_summary += f"\nViT Model Prediction:\n- Class: {vit_result.get('class', 'Unknown')}\n- Confidence: {vit_result.get('confidence', 0)*100:.2f}%\n"

📊 Visual Flow Diagram

Input CSV Data
      ↓
Extract Resistance, Current, Travel
      ↓
Generate Plot (matplotlib)
      ↓  
  temp_vit_plot.png
      ↓
      ├──→ [ViT API]      → vit_probs
      └──→ [Gemini AI]    → gemini_probs
            ↓
      Ensemble Combination
            ↓
      ensemble_scores
            ↓
   Select MAX score → predicted_class
            ↓
      vitResult JSON
            ↓
  Included in final report

🛠️ Configuration

ViT API Endpoint:

DEPLOYED_VIT_URL = "http://143.110.244.235/predict"

Gemini Model:

model = genai.GenerativeModel('gemini-2.0-flash')

API Key (from environment):

GOOGLE_API_KEY  # Main key
GOOGLE_API_KEY_1  # For R phase
GOOGLE_API_KEY_2  # For Y phase  
GOOGLE_API_KEY_3  # For B phase
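
How the keys are selected is not shown here. A minimal sketch, assuming each phase prefers its own variable and falls back to the main key when that variable is unset, could look like this. `resolve_api_key` and the fallback behavior are assumptions; only the variable names come from the list above.

```python
import os

# Phase-to-variable mapping taken from the list above.
PHASE_KEY_VARS = {
    "r": "GOOGLE_API_KEY_1",
    "y": "GOOGLE_API_KEY_2",
    "b": "GOOGLE_API_KEY_3",
}


def resolve_api_key(phase, env=None):
    """Return the API key for a phase, falling back to GOOGLE_API_KEY."""
    env = os.environ if env is None else env
    var = PHASE_KEY_VARS.get(phase.lower())
    # Assumption: an unset or empty phase key falls back to the main key.
    key = (env.get(var) if var else None) or env.get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError(f"no Gemini API key configured for phase {phase!r}")
    return key


# Usage with an explicit mapping instead of the real environment.
demo_env = {"GOOGLE_API_KEY": "main-key", "GOOGLE_API_KEY_2": "y-key"}
print(resolve_api_key("y", demo_env))  # y-key
print(resolve_api_key("r", demo_env))  # main-key
```

Passing the environment as a parameter keeps the helper easy to test without touching real credentials.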

🚨 Error Handling

If ViT or Gemini fails:

if not vit_result:
    # Pipeline continues without ViT analysis
    # Other components (Rule Engine, AI Agent) still work
    print("ViT prediction unavailable, continuing with other analyses...")

The pipeline is resilient: if the ViT step fails, the analysis still completes using the Rule Engine and the AI Agent.
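
A hedged sketch of that fallback follows. `safe_predict` is a hypothetical wrapper, and the two stub functions stand in for predict_dcrm_image(), whose failure modes are not documented here. Any exception is turned into a `None` result so that the caller can continue.

```python
def safe_predict(predict_fn, image_path, **kwargs):
    """Run a prediction function; return a vitResult-style dict, or None on failure."""
    try:
        vit_class, vit_conf, vit_details = predict_fn(image_path, **kwargs)
    except Exception as exc:  # broad on purpose: no ViT failure should stop the pipeline
        print(f"ViT prediction unavailable ({exc}), continuing with other analyses...")
        return None
    return {"class": vit_class, "confidence": vit_conf, "details": vit_details}


# Stub predictors for illustration only.
def failing_predict(image_path, **kwargs):
    raise ConnectionError("remote ViT API unreachable")


def working_predict(image_path, **kwargs):
    return "Healthy", 0.8, {}


print(safe_predict(failing_predict, "plot.png"))            # None, after the warning
print(safe_predict(working_predict, "plot.png")["class"])   # Healthy
```

Returning `None` matches the `if not vit_result:` check shown above, so downstream code needs no special error type.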

📝 Summary

vitResult provides:

✅ Image-based defect classification
✅ Visual pattern recognition (ViT)
✅ Expert reasoning (Gemini)
✅ Ensemble confidence scoring
✅ Detailed probability breakdown
✅ Complements KPI-based and time-series analysis

It is the third of three independent diagnostic methods:

Rule Engine (deterministic thresholds)
AI Agent (LLM-based fault detection)
ViT Model (image classification) ← This one!

All three methods are combined to provide comprehensive, multi-faceted circuit breaker diagnostics.