| # ViT Model in DCRM Pipeline - Complete Explanation | |
| ## What is `vitResult`? | |
| The `vitResult` is the output from a **Vision Transformer (ViT) + Gemini AI Ensemble Model** that analyzes the DCRM resistance plot image to classify circuit breaker defects. | |
| --- | |
| ## π Complete Flow (Step-by-Step) | |
| ### **Step 1: Generate Resistance Plot** | |
| **File**: `core/models/vit_classifier.py` β `plot_resistance_for_vit()` | |
| ```python | |
| # Creates a plot with 3 lines: | |
| # - Green line: Resistance profile | |
| # - Blue line: Current profile | |
| # - Red line: Travel profile | |
| # Saves as temporary PNG file: temp_vit_plot_{phase}_{uuid}.png | |
| ``` | |
| **Example**: `temp_vit_plot_r_a3f8d2b1.png` | |
| --- | |
| ### **Step 2: ViT Model Analysis** (Remote API) | |
| **File**: `core/models/vit_classifier.py` β `get_remote_vit_probabilities()` | |
| ```python | |
| # Sends image to deployed ViT model API | |
| DEPLOYED_VIT_URL = "http://143.110.244.235/predict" | |
| # ViT is trained on DCRM images to detect 5 defect classes: | |
| CLASSES = [ | |
| "Healthy", | |
| "Arcing_Contact_Misalignment", | |
| "Arcing_Contact_Wear", | |
| "Main Contact Misalignment", | |
| "main_contact_wear" | |
| ] | |
| # Returns probability distribution for each class | |
| vit_probs = { | |
| "Healthy": 0.507, | |
| "Arcing_Contact_Misalignment": 0.120, | |
| "Arcing_Contact_Wear": 0.044, | |
| "Main Contact Misalignment": 0.142, | |
| "main_contact_wear": 0.186 | |
| } | |
| ``` | |
| **How ViT Works**: | |
| - ViT (Vision Transformer) is a deep learning model trained on DCRM plot images | |
| - It learned visual patterns from thousands of circuit breaker test plots | |
| - Analyzes waveform shapes, spikes, plateaus, and transitions | |
| - Outputs probability for each defect type | |
| --- | |
| ### **Step 3: Gemini AI Analysis** | |
| **File**: `core/models/vit_classifier.py` β `get_gemini_prediction()` | |
| ```python | |
| # Sends same image to Google Gemini 2.0 Flash | |
| # Uses expert prompt with diagnostic heuristics: | |
| Diagnostic Rules: | |
| 1. "The Significant Grass" β Main Contact Corrosion | |
| - Jagged, irregular resistance plateau (> 15-20ΞΌΞ© variance) | |
| 2. "Big Spikes & Short Wipe" β Arcing Contact Wear | |
| - Large amplitude spikes, shortened arcing zone | |
| 3. "The Struggle to Settle" β Main Misalignment | |
| - High-amplitude peaks before plateau (> 3-5ms) | |
| 4. "Rough Entry" β Arcing Misalignment | |
| - Erratic spikes during initial entry | |
| 5. "Stretched Time" β Slow Mechanism | |
| - Elongated resistance profile on X-axis | |
| # Returns probability distribution | |
| gemini_probs = { | |
| "Healthy": 0.05, | |
| "Arcing_Contact_Misalignment": 0.02, | |
| "Arcing_Contact_Wear": 0.01, | |
| "Main Contact Misalignment": 0.02, | |
| "main_contact_wear": 0.90 # High confidence! | |
| } | |
| ``` | |
| --- | |
| ### **Step 4: Ensemble Prediction** | |
| **File**: `core/models/vit_classifier.py` β `predict_dcrm_image()` | |
| ```python | |
| # Combines ViT + Gemini predictions | |
| # ensemble_score = vit_prob + gemini_prob | |
| ensemble_scores = { | |
| "Healthy": 0.507 + 0.05 = 0.557, | |
| "Arcing_Contact_Misalignment": 0.120 + 0.02 = 0.140, | |
| "Arcing_Contact_Wear": 0.044 + 0.01 = 0.054, | |
| "Main Contact Misalignment": 0.142 + 0.02 = 0.162, | |
| "main_contact_wear": 0.186 + 0.90 = 1.086 # β HIGHEST! | |
| } | |
| # Selects class with highest ensemble score | |
| predicted_class = "main_contact_wear" | |
| confidence = 0.543 # Normalized confidence | |
| ``` | |
| --- | |
| ### **Step 5: Integration into Pipeline** | |
| **File**: `apps/flask_server.py` β `process_single_phase_csv()` | |
| ```python | |
| # Lines 155-183 | |
| vit_result = None | |
| vit_plot_path = f"temp_vit_plot_{phase_name}_{uuid.uuid4().hex[:8]}.png" | |
| # Generate plot | |
| if plot_resistance_for_vit(df, vit_plot_path): | |
| # Get prediction | |
| vit_class, vit_conf, vit_details = predict_dcrm_image(vit_plot_path, api_key=api_key) | |
| vit_result = { | |
| "class": vit_class, # "main_contact_wear" | |
| "confidence": vit_conf, # 0.5429375439882278 | |
| "details": vit_details # Full breakdown below | |
| } | |
| # Cleanup temp file | |
| os.remove(vit_plot_path) | |
| ``` | |
| --- | |
| ## π¦ vitResult Structure Breakdown | |
| ```json | |
| { | |
| "class": "main_contact_wear", // β FINAL PREDICTION | |
| "confidence": 0.5429375439882278, // β NORMALIZED CONFIDENCE | |
| "details": { | |
| "vit_probs": { // π€ Vision Transformer probabilities | |
| "Healthy": 0.5076556205749512, | |
| "Arcing_Contact_Misalignment": 0.12034504860639572, | |
| "Arcing_Contact_Wear": 0.04370640590786934, | |
| "Main Contact Misalignment": 0.1424178034067154, | |
| "main_contact_wear": 0.1858750879764557 | |
| }, | |
| "gemini_probs": { // π§ Gemini AI probabilities | |
| "Healthy": 0.05, | |
| "Arcing_Contact_Misalignment": 0.02, | |
| "Arcing_Contact_Wear": 0.01, | |
| "Main Contact Misalignment": 0.02, | |
| "main_contact_wear": 0.9 // Gemini is very confident! | |
| }, | |
| "ensemble_scores": { // π― COMBINED SCORES | |
| "Healthy": 0.5576556205749512, | |
| "Arcing_Contact_Misalignment": 0.1403450486063957, | |
| "Arcing_Contact_Wear": 0.05370640590786934, | |
| "Main Contact Misalignment": 0.16241780340671538, | |
| "main_contact_wear": 1.0858750879764556 // β HIGHEST β WINNER | |
| } | |
| } | |
| } | |
| ``` | |
| --- | |
| ## π Why Two Models? | |
| | Model | Strengths | Weaknesses | | |
| |-------|-----------|------------| | |
| | **ViT** | - Trained on real DCRM data<br>- Fast inference<br>- Consistent | - May overfit to training data<br>- Limited to visual patterns | | |
| | **Gemini** | - Expert reasoning<br>- Contextual understanding<br>- Adapts to new cases | - May hallucinate<br>- Slower<br>- Requires API calls | | |
| | **Ensemble** | β **Best of both worlds**<br>- ViT provides baseline<br>- Gemini adds expertise | - Slightly higher computational cost | | |
| --- | |
| ## π― How It's Used in the Pipeline | |
| The `vitResult` is: | |
| 1. **Generated** in `flask_server.py` (lines 155-183) | |
| 2. **Passed to** `report_generator.py` | |
| 3. **Included in** final JSON output under each phase (r, y, b) | |
| 4. **Referenced** in fault summaries for LLM context | |
| **Example Usage**: | |
| ```python | |
| # In report_generator.py | |
| if vit_result: | |
| faults_summary += f"\nViT Model Prediction:\n- Class: {vit_result.get('class', 'Unknown')}\n- Confidence: {vit_result.get('confidence', 0)*100:.2f}%\n" | |
| ``` | |
| --- | |
| ## π Visual Flow Diagram | |
| ``` | |
| Input CSV Data | |
| β | |
| Extract Resistance, Current, Travel | |
| β | |
| Generate Plot (matplotlib) | |
| β | |
| temp_vit_plot.png | |
| β | |
| ββββ [ViT API] β vit_probs | |
| ββββ [Gemini AI] β gemini_probs | |
| β | |
| Ensemble Combination | |
| β | |
| ensemble_scores | |
| β | |
| Select MAX score β predicted_class | |
| β | |
| vitResult JSON | |
| β | |
| Included in final report | |
| ``` | |
| --- | |
| ## π οΈ Configuration | |
| **ViT API Endpoint**: | |
| ```python | |
| DEPLOYED_VIT_URL = "http://143.110.244.235/predict" | |
| ``` | |
| **Gemini Model**: | |
| ```python | |
| model = genai.GenerativeModel('gemini-2.0-flash') | |
| ``` | |
| **API Key** (from environment): | |
| ```python | |
| GOOGLE_API_KEY # Main key | |
| GOOGLE_API_KEY_1 # For R phase | |
| GOOGLE_API_KEY_2 # For Y phase | |
| GOOGLE_API_KEY_3 # For B phase | |
| ``` | |
| --- | |
| ## π¨ Error Handling | |
| If ViT or Gemini fails: | |
| ```python | |
| if not vit_result: | |
| # Pipeline continues without ViT analysis | |
| # Other components (Rule Engine, AI Agent) still work | |
| print("ViT prediction unavailable, continuing with other analyses...") | |
| ``` | |
| The pipeline is **resilient** - if ViT fails, analysis still completes using Rule Engine + AI Agent. | |
| --- | |
| ## π Summary | |
| **vitResult provides**: | |
| - β Image-based defect classification | |
| - β Visual pattern recognition (ViT) | |
| - β Expert reasoning (Gemini) | |
| - β Ensemble confidence scoring | |
| - β Detailed probability breakdown | |
| - β Complements KPI-based and time-series analysis | |
| It's a **3rd independent diagnostic method** alongside: | |
| 1. Rule Engine (deterministic thresholds) | |
| 2. AI Agent (LLM-based fault detection) | |
| 3. **ViT Model (image classification)** β This one! | |
| All three methods are combined to provide comprehensive, multi-faceted circuit breaker diagnostics. | |