ViT Model in DCRM Pipeline - Complete Explanation
What is vitResult?
vitResult is the output of an ensemble of a Vision Transformer (ViT) and Gemini AI. The ensemble analyzes the DCRM (Dynamic Contact Resistance Measurement) resistance plot image to classify circuit breaker defects.
Complete Flow (Step-by-Step)
Step 1: Generate Resistance Plot
File: core/models/vit_classifier.py → plot_resistance_for_vit()
# Creates a plot with 3 lines:
# - Green line: Resistance profile
# - Blue line: Current profile
# - Red line: Travel profile
# Saves as temporary PNG file: temp_vit_plot_{phase}_{uuid}.png
Example: temp_vit_plot_r_a3f8d2b1.png
Step 2: ViT Model Analysis (Remote API)
File: core/models/vit_classifier.py → get_remote_vit_probabilities()
# Sends image to deployed ViT model API
DEPLOYED_VIT_URL = "http://143.110.244.235/predict"
# ViT is trained on DCRM images to distinguish 5 classes
# (Healthy plus 4 defect types):
CLASSES = [
    "Healthy",
    "Arcing_Contact_Misalignment",
    "Arcing_Contact_Wear",
    "Main Contact Misalignment",
    "main_contact_wear"
]
# Returns a probability for each class
vit_probs = {
    "Healthy": 0.507,
    "Arcing_Contact_Misalignment": 0.120,
    "Arcing_Contact_Wear": 0.044,
    "Main Contact Misalignment": 0.142,
    "main_contact_wear": 0.186
}
How ViT Works:
- ViT (Vision Transformer) is a deep learning model trained on DCRM plot images
- It learned visual patterns from thousands of circuit breaker test plots
- Analyzes waveform shapes, spikes, plateaus, and transitions
- Outputs a probability for each of the 5 classes
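As a hedged illustration of how a caller might sanity-check the distribution returned by the ViT API, the sketch below verifies that a response covers exactly the five classes and sums to roughly 1. The helper name `validate_probs` and the tolerance are assumptions made for this example; they are not taken from vit_classifier.py.

```python
import math

CLASSES = [
    "Healthy",
    "Arcing_Contact_Misalignment",
    "Arcing_Contact_Wear",
    "Main Contact Misalignment",
    "main_contact_wear",
]


def validate_probs(probs, classes=CLASSES, tol=0.01):
    """Hypothetical check: exact class set, values in [0, 1], total within tol of 1."""
    if set(probs) != set(classes):
        return False
    if any(not 0.0 <= p <= 1.0 for p in probs.values()):
        return False
    return math.isclose(sum(probs.values()), 1.0, abs_tol=tol)


# Rounded example values from Step 2.
vit_probs = {
    "Healthy": 0.507,
    "Arcing_Contact_Misalignment": 0.120,
    "Arcing_Contact_Wear": 0.044,
    "Main Contact Misalignment": 0.142,
    "main_contact_wear": 0.186,
}
print(validate_probs(vit_probs))
```

A check like this catches a response with a missing or misspelled class name, which matters here because the class labels mix naming styles (underscores, spaces, and different capitalization).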
Step 3: Gemini AI Analysis
File: core/models/vit_classifier.py → get_gemini_prediction()
# Sends same image to Google Gemini 2.0 Flash
# Uses expert prompt with diagnostic heuristics:
Diagnostic Rules:
1. "The Significant Grass" → Main Contact Corrosion
   - Jagged, irregular resistance plateau (> 15-20 μΩ variance)
2. "Big Spikes & Short Wipe" → Arcing Contact Wear
   - Large amplitude spikes, shortened arcing zone
3. "The Struggle to Settle" → Main Misalignment
   - High-amplitude peaks before plateau (> 3-5 ms)
4. "Rough Entry" → Arcing Misalignment
   - Erratic spikes during initial entry
5. "Stretched Time" → Slow Mechanism
   - Elongated resistance profile on X-axis
# Returns a probability distribution
gemini_probs = {
    "Healthy": 0.05,
    "Arcing_Contact_Misalignment": 0.02,
    "Arcing_Contact_Wear": 0.01,
    "Main Contact Misalignment": 0.02,
    "main_contact_wear": 0.90  # High confidence!
}
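The source does not show how the Gemini reply is turned into `gemini_probs`. As a minimal sketch, assuming the model is asked to answer with a JSON object keyed by class name, a parser might look like the following. The function `parse_class_probs` is hypothetical, and the real get_gemini_prediction() may work differently.

```python
import json
import re

CLASSES = [
    "Healthy",
    "Arcing_Contact_Misalignment",
    "Arcing_Contact_Wear",
    "Main Contact Misalignment",
    "main_contact_wear",
]


def parse_class_probs(reply_text, classes=CLASSES):
    """Hypothetical parser: read a JSON object from an LLM reply and normalize it."""
    # Take the text between the first '{' and the last '}' so that any
    # prose around the JSON object is ignored.
    match = re.search(r"\{.*\}", reply_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    raw = json.loads(match.group(0))
    # Classes missing from the reply default to 0; negative values are clamped to 0.
    probs = {c: max(float(raw.get(c, 0.0)), 0.0) for c in classes}
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("reply contains no positive probabilities")
    # Rescale so the values sum to 1.
    return {c: p / total for c, p in probs.items()}


reply = (
    'Assessment: {"Healthy": 0.05, "Arcing_Contact_Misalignment": 0.02, '
    '"Arcing_Contact_Wear": 0.01, "Main Contact Misalignment": 0.02, '
    '"main_contact_wear": 0.90} End of report.'
)
gemini_probs = parse_class_probs(reply)
print(max(gemini_probs, key=gemini_probs.get))
```

Normalizing after parsing keeps the Gemini output on the same scale as the ViT probabilities even if the model's numbers do not sum exactly to 1.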
Step 4: Ensemble Prediction
File: core/models/vit_classifier.py → predict_dcrm_image()
# Combines ViT + Gemini predictions by summing the per-class probabilities
# ensemble_score = vit_prob + gemini_prob
ensemble_scores = {
    "Healthy": 0.557,                      # 0.507 + 0.05
    "Arcing_Contact_Misalignment": 0.140,  # 0.120 + 0.02
    "Arcing_Contact_Wear": 0.054,          # 0.044 + 0.01
    "Main Contact Misalignment": 0.162,    # 0.142 + 0.02
    "main_contact_wear": 1.086             # 0.186 + 0.90 (highest)
}
# Selects class with highest ensemble score
predicted_class = "main_contact_wear"
confidence = 0.543  # Normalized confidence (matches 1.086 / 2, the winning score averaged over the two models)
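The combination step can be sketched in a few lines. This is an illustration under one assumption: the normalized confidence is the winning score divided by the number of models (2), which is consistent with the numbers shown here but is not confirmed by the source. The function name `ensemble_predict` is hypothetical.

```python
def ensemble_predict(vit_probs, gemini_probs):
    """Sum per-class probabilities and pick the class with the highest total."""
    # Preserve class order from the ViT output, then add any extra Gemini keys.
    classes = list(dict.fromkeys(list(vit_probs) + list(gemini_probs)))
    scores = {c: vit_probs.get(c, 0.0) + gemini_probs.get(c, 0.0) for c in classes}
    predicted = max(scores, key=scores.get)
    # Assumption: confidence = winning score / number of models.
    confidence = scores[predicted] / 2
    return predicted, confidence, scores


vit_probs = {
    "Healthy": 0.507,
    "Arcing_Contact_Misalignment": 0.120,
    "Arcing_Contact_Wear": 0.044,
    "Main Contact Misalignment": 0.142,
    "main_contact_wear": 0.186,
}
gemini_probs = {
    "Healthy": 0.05,
    "Arcing_Contact_Misalignment": 0.02,
    "Arcing_Contact_Wear": 0.01,
    "Main Contact Misalignment": 0.02,
    "main_contact_wear": 0.90,
}
predicted_class, confidence, ensemble_scores = ensemble_predict(vit_probs, gemini_probs)
print(predicted_class, round(confidence, 3))
```

With these inputs the sketch selects main_contact_wear with a confidence of about 0.543. Because the scores are a plain sum, a single very confident model (Gemini at 0.90 here) can outvote the other model's top class (ViT's Healthy at 0.507).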
Step 5: Integration into Pipeline
File: apps/flask_server.py → process_single_phase_csv()
# Lines 155-183
vit_result = None
vit_plot_path = f"temp_vit_plot_{phase_name}_{uuid.uuid4().hex[:8]}.png"
# Generate plot
if plot_resistance_for_vit(df, vit_plot_path):
    # Get prediction
    vit_class, vit_conf, vit_details = predict_dcrm_image(vit_plot_path, api_key=api_key)
    vit_result = {
        "class": vit_class,        # "main_contact_wear"
        "confidence": vit_conf,    # 0.5429375439882278
        "details": vit_details     # Full breakdown below
    }
# Cleanup temp file (guarded, in case the plot was never written)
if os.path.exists(vit_plot_path):
    os.remove(vit_plot_path)
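One way to make sure the temporary PNG is removed even when the prediction call raises is a try/finally block. The sketch below is an illustration of that pattern, not the code in flask_server.py: `run_with_temp_plot`, the stub functions, and the use of the system temp directory are all assumptions for the example.

```python
import os
import tempfile
import uuid


def run_with_temp_plot(phase_name, make_plot, predict, directory=None):
    """Create a temp plot path, run make_plot and predict, and always delete the file."""
    directory = directory or tempfile.gettempdir()
    plot_path = os.path.join(
        directory, f"temp_vit_plot_{phase_name}_{uuid.uuid4().hex[:8]}.png"
    )
    result = None
    try:
        if make_plot(plot_path):
            vit_class, vit_conf, vit_details = predict(plot_path)
            result = {"class": vit_class, "confidence": vit_conf, "details": vit_details}
    finally:
        # Runs even if predict() raises, so no stale PNG is left behind.
        if os.path.exists(plot_path):
            os.remove(plot_path)
    return result


# Stand-ins for plot_resistance_for_vit() and predict_dcrm_image().
created_paths = []


def fake_plot(path):
    with open(path, "wb") as handle:
        handle.write(b"placeholder image bytes")
    created_paths.append(path)
    return True


def fake_predict(path):
    return "main_contact_wear", 0.543, {}


vit_result = run_with_temp_plot("r", fake_plot, fake_predict)
print(vit_result["class"], os.path.exists(created_paths[0]))
```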
vitResult Structure Breakdown
{
  "class": "main_contact_wear",            // FINAL PREDICTION
  "confidence": 0.5429375439882278,        // NORMALIZED CONFIDENCE
  "details": {
    "vit_probs": {                         // Vision Transformer probabilities
      "Healthy": 0.5076556205749512,
      "Arcing_Contact_Misalignment": 0.12034504860639572,
      "Arcing_Contact_Wear": 0.04370640590786934,
      "Main Contact Misalignment": 0.1424178034067154,
      "main_contact_wear": 0.1858750879764557
    },
    "gemini_probs": {                      // Gemini AI probabilities
      "Healthy": 0.05,
      "Arcing_Contact_Misalignment": 0.02,
      "Arcing_Contact_Wear": 0.01,
      "Main Contact Misalignment": 0.02,
      "main_contact_wear": 0.9             // Gemini is very confident!
    },
    "ensemble_scores": {                   // COMBINED SCORES
      "Healthy": 0.5576556205749512,
      "Arcing_Contact_Misalignment": 0.1403450486063957,
      "Arcing_Contact_Wear": 0.05370640590786934,
      "Main Contact Misalignment": 0.16241780340671538,
      "main_contact_wear": 1.0858750879764556   // HIGHEST → WINNER
    }
  }
}
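Because the structure keeps each model's probabilities alongside the final class, a consumer can tell whether the two models agreed. The following sketch shows one possible way to read the structure; `summarize_vit_result` is a hypothetical helper and is not part of the pipeline described here. The values are rounded from the example above.

```python
def top_class(probs):
    """Return the class name with the highest probability."""
    return max(probs, key=probs.get)


def summarize_vit_result(vit_result):
    """Hypothetical helper: report each model's top class and whether they agree."""
    details = vit_result["details"]
    vit_top = top_class(details["vit_probs"])
    gemini_top = top_class(details["gemini_probs"])
    return {
        "final": vit_result["class"],
        "vit_top": vit_top,
        "gemini_top": gemini_top,
        "models_agree": vit_top == gemini_top,
    }


vit_result = {
    "class": "main_contact_wear",
    "confidence": 0.543,
    "details": {
        "vit_probs": {
            "Healthy": 0.508,
            "Arcing_Contact_Misalignment": 0.120,
            "Arcing_Contact_Wear": 0.044,
            "Main Contact Misalignment": 0.142,
            "main_contact_wear": 0.186,
        },
        "gemini_probs": {
            "Healthy": 0.05,
            "Arcing_Contact_Misalignment": 0.02,
            "Arcing_Contact_Wear": 0.01,
            "Main Contact Misalignment": 0.02,
            "main_contact_wear": 0.9,
        },
    },
}
summary = summarize_vit_result(vit_result)
print(summary)
```

In this example the models disagree: ViT's top class is Healthy while Gemini's is main_contact_wear, and the summed score follows Gemini.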
Why Two Models?
| Model | Strengths | Weaknesses |
|---|---|---|
| ViT | Trained on real DCRM data; fast inference; consistent | May overfit to training data; limited to visual patterns |
| Gemini | Expert reasoning; contextual understanding; adapts to new cases | May hallucinate; slower; requires API calls |
| Ensemble | Best of both worlds: ViT provides baseline, Gemini adds expertise | Slightly higher computational cost |
How It's Used in the Pipeline
The vitResult is:
- Generated in flask_server.py (lines 155-183)
- Passed to report_generator.py
- Included in final JSON output under each phase (r, y, b)
- Referenced in fault summaries for LLM context
Example Usage:
# In report_generator.py
if vit_result:
    faults_summary += f"\nViT Model Prediction:\n- Class: {vit_result.get('class', 'Unknown')}\n- Confidence: {vit_result.get('confidence', 0)*100:.2f}%\n"
Visual Flow Diagram
Input CSV Data
    ↓
Extract Resistance, Current, Travel
    ↓
Generate Plot (matplotlib)
    ↓
temp_vit_plot.png
    ↓
    ├──→ [ViT API] → vit_probs
    └──→ [Gemini AI] → gemini_probs
    ↓
Ensemble Combination
    ↓
ensemble_scores
    ↓
Select MAX score → predicted_class
    ↓
vitResult JSON
    ↓
Included in final report
Configuration
ViT API Endpoint:
DEPLOYED_VIT_URL = "http://143.110.244.235/predict"
Gemini Model:
model = genai.GenerativeModel('gemini-2.0-flash')
API Key (from environment):
GOOGLE_API_KEY # Main key
GOOGLE_API_KEY_1 # For R phase
GOOGLE_API_KEY_2 # For Y phase
GOOGLE_API_KEY_3 # For B phase
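A minimal sketch of how a per-phase key could be resolved from these variables follows. The phase-to-variable mapping comes from the list above; the fallback to GOOGLE_API_KEY when a phase key is unset is an assumption, and `api_key_for_phase` is a hypothetical helper name.

```python
import os

# Mapping taken from the configuration list above.
PHASE_KEY_VARS = {
    "r": "GOOGLE_API_KEY_1",
    "y": "GOOGLE_API_KEY_2",
    "b": "GOOGLE_API_KEY_3",
}


def api_key_for_phase(phase, env=None):
    """Return the phase-specific key, falling back to GOOGLE_API_KEY (assumed behavior)."""
    env = os.environ if env is None else env
    var_name = PHASE_KEY_VARS.get(phase.lower())
    key = env.get(var_name) if var_name else None
    return key or env.get("GOOGLE_API_KEY")


# Example with an explicit mapping instead of the real environment.
example_env = {"GOOGLE_API_KEY": "main-key", "GOOGLE_API_KEY_2": "y-key"}
print(api_key_for_phase("y", example_env))
print(api_key_for_phase("r", example_env))
```

Passing the environment as a parameter keeps the helper easy to test without touching real credentials.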
Error Handling
If ViT or Gemini fails:
if not vit_result:
    # Pipeline continues without ViT analysis
    # Other components (Rule Engine, AI Agent) still work
    print("ViT prediction unavailable, continuing with other analyses...")
The pipeline is resilient: if the ViT step fails, analysis still completes using the Rule Engine and AI Agent.
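The source does not show how a failure becomes an empty `vit_result`. One common pattern, sketched here as an assumption, is to wrap the prediction call so that any exception yields None and the caller carries on. `safe_vit_prediction` and the stub functions are hypothetical.

```python
def safe_vit_prediction(predict, plot_path):
    """Return a vitResult-style dict, or None if the prediction step raises."""
    try:
        vit_class, vit_conf, vit_details = predict(plot_path)
    except Exception as exc:  # broad on purpose: any failure should degrade gracefully
        print(f"ViT prediction unavailable, continuing with other analyses... ({exc})")
        return None
    return {"class": vit_class, "confidence": vit_conf, "details": vit_details}


def failing_predict(path):
    # Stand-in for a remote call that times out.
    raise TimeoutError("ViT API did not respond")


def working_predict(path):
    return "Healthy", 0.8, {}


report = {
    "rule_engine": "ok",  # placeholder for the other analyses
    "vitResult": safe_vit_prediction(failing_predict, "plot.png"),
}
print(report)
```

The report is still produced, with vitResult set to None, so downstream code such as the `if vit_result:` check in report_generator.py can skip the ViT section.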
Summary
vitResult provides:
- Image-based defect classification
- Visual pattern recognition (ViT)
- Expert reasoning (Gemini)
- Ensemble confidence scoring
- Detailed probability breakdown
- Complements KPI-based and time-series analysis
It is one of three independent diagnostic methods:
- Rule Engine (deterministic thresholds)
- AI Agent (LLM-based fault detection)
- ViT Model (image classification) ← this one
All three methods are combined to provide comprehensive, multi-faceted circuit breaker diagnostics.