Spaces: sikeaditya/dcrm-analysis-api

Aditya Adaki

Add DCRM Analysis API

fdcec08 about 1 month ago

19.6 kB

	# llm.py
	import google.generativeai as genai
	import json
	import PIL.Image
	import io

	def get_dcrm_prompt(data_str):
	    """Build the zone-segmentation prompt around the sampled DCRM data table."""
	    return f"""
	I have extracted data from a DCRM (Dynamic Contact Resistance Measurement) graph.
	Data (Sampled): {data_str}

	The columns are:
	- 'time': Time in milliseconds.
	- 'curr': Current signal amplitude (Blue curve) - represents the test current flowing through the contacts.
	- 'res': Dynamic Resistance amplitude (Green curve) - represents the contact resistance in micro-ohms (µΩ).
	- 'travel': Travel signal amplitude (Red curve) - represents the mechanical position/displacement of the moving contact.

	IMPORTANT: Higher values mean the signal is HIGHER on the graph.

	I have also provided the image of the graph. Use the visual information from the image to cross-reference with the data.

	=== HEALTHY DCRM SIGNATURE REFERENCE ===

	Resistance (Green) - Healthy Characteristics:
	- Pre-contact: Infinite/Very High (off-scale or flat at top)
	- Arcing engagement: Drops sharply with moderate spikes (arcing activity), typically 100-500 µΩ
	- Main conduction: LOW and STABLE (30-80 µΩ for healthy contacts), minimal oscillation (<10 µΩ variance)
	- Parting: Sharp rise with spikes (arcing during separation)
	- Final open: Returns to infinite/very high (off-scale)

	Current (Blue) - Healthy Characteristics:
	- Pre-contact: Near zero baseline
	- Arcing engagement: Begins rising as circuit closes
	- Main conduction: Stable at test current level (plateau)
	- Parting: Maintained until final separation
	- Final open: Drops to zero

	Travel (Red) - Healthy Characteristics:
	- Pre-contact: Increasing linearly (contacts approaching)
	- Arcing engagement: Continues increasing
	- Main conduction: Reaches MAXIMUM and plateaus (fully closed position)
	- Parting: Decreases linearly (contacts separating)
	- Final open: Stabilizes at minimum (fully open position)

	=== TASK: SEGMENT INTO 5 KINEMATIC ZONES ===

	Use ALL THREE curves together for accurate boundary detection. Each zone represents a distinct physical state of the circuit breaker.

	Zone 1: Pre-Contact Travel (Initial Closing Motion)
	* Physical Meaning: The moving contact is traveling toward the stationary contact but has NOT yet made electrical contact. This is pure mechanical motion with no current flow.
	* Start: time = 0 ms
	* End Boundary: Detect when CURRENT (blue) FIRST starts rising significantly from baseline.
	* Cross-reference: Resistance (green) should still be very high/infinite
	* Cross-reference: Travel (red) should be steadily increasing
	* Typical Duration: 80-120 ms
	* Detection Logic: Find the point where 'curr' rises above baseline noise (e.g., >5% of max current)

	Zone 2: Arcing Contact Engagement (Initial Electrical Contact)
	* Physical Meaning: The arcing contacts (W-Cu tips) make first contact and establish an electrical path. Current begins flowing through a small contact area, causing arcing and resistance fluctuations. This is the "make" transition.
	* Start: End of Zone 1
	* End Boundary: Detect when resistance SETTLES after initial spike activity.
	* Primary indicator: Resistance (green) drops from high values, exhibits spikes, then STABILIZES to low plateau
	* Cross-reference: Current (blue) should be rising/stabilizing
	* Cross-reference: Travel (red) continues increasing toward maximum
	* Typical Duration: 20-40 ms (Zone 2 typically ends around 110-150 ms total time)
	* Detection Logic: Find where 'res' completes its descent and spike activity, settling into a stable low range

	Zone 3: Main Contact Conduction (Fully Closed State)
	* Physical Meaning: The main contacts (Ag-plated) are fully engaged, providing a large, stable contact area. This is the "healthy contact" signature zone - resistance should be at its MINIMUM and STABLE. The breaker is in its fully closed, current-carrying state.
	* Start: End of Zone 2
	* End Boundary: Detect when the breaker begins OPENING (travel reverses direction).
	* Primary indicator: Travel (red) reaches MAXIMUM and starts to DESCEND
	* Cross-reference: Resistance (green) should remain low and stable throughout this zone
	* Cross-reference: Current (blue) should be stable at test level
	* Typical Duration: 100-200 ms (this is the longest zone, representing the dwell time)
	* Detection Logic: Find the peak of 'travel' curve and the point where it starts decreasing

	Zone 4: Main Contact Parting (Breaking/Opening Transition)
	* Physical Meaning: The main contacts are separating. As the contact area decreases, resistance rises sharply. Arcing occurs during the final separation of the arcing contacts. This is the "break" transition - the most critical phase for fault detection.
	* Start: End of Zone 3
	* End Boundary: Detect when resistance STABILIZES at high value after parting spikes.
	* Primary indicator: Resistance (green) shoots UP, exhibits parting spikes, then STABILIZES at high/infinite value
	* Cross-reference: Travel (red) should be decreasing (opening motion)
	* Cross-reference: Current (blue) may drop or fluctuate during final arc extinction
	* Typical Duration: 40-80 ms (Zone 4 typically ends around 280-340 ms total time)
	* Detection Logic: Find where 'res' completes its rise and spike activity, becoming constant at high value
	* CRITICAL: Do NOT extend this zone too long - end AS SOON AS resistance stabilizes

	Zone 5: Final Open State (Fully Open)
	* Physical Meaning: The contacts are fully separated with an air gap. No current flows, resistance is infinite. The breaker is in its fully open, non-conducting state.
	* Start: End of Zone 4
	* End: The last time point in the dataset
	* Characteristics:
	* Resistance (green): Very high/infinite (flat line at top)
	* Current (blue): Zero or near-zero
	* Travel (red): Stable at minimum (fully open position)

	MULTI-CURVE ANALYSIS STRATEGY:
	1. Use Current (blue) to identify Zone 1 → Zone 2 transition (first current rise)
	2. Use Resistance (green) to identify Zone 2 → Zone 3 transition (resistance settles to low plateau)
	3. Use Travel (red) to identify Zone 3 → Zone 4 transition (travel peak and reversal)
	4. Use Resistance (green) to identify Zone 4 → Zone 5 transition (resistance stabilizes at high value)
	5. Always cross-validate boundaries using all three curves for consistency

	OUTPUT FORMAT (Strict JSON)
	Return ONLY this JSON object:
	{{
	  "zones": {{
	    "zone_1_pre_contact": {{ "start_ms": float, "end_ms": float, "justification": "string (explain which curve indicators were used)" }},
	    "zone_2_arcing_engagement": {{ "start_ms": float, "end_ms": float, "justification": "string (explain which curve indicators were used)" }},
	    "zone_3_main_conduction": {{ "start_ms": float, "end_ms": float, "justification": "string (explain which curve indicators were used)" }},
	    "zone_4_parting": {{ "start_ms": float, "end_ms": float, "justification": "string (explain which curve indicators were used)" }},
	    "zone_5_final_open": {{ "start_ms": float, "end_ms": float, "justification": "string (explain which curve indicators were used)" }}
	  }},
	  "report_card": {{
	    "opening_speed": {{ "status": "Pass"|"Warning"|"Fail", "comment": "Assessment of travel curve steepness" }},
	    "contact_wear": {{ "status": "Pass"|"Warning"|"Fail", "comment": "Based on resistance fluctuations in Zone 2/4" }},
	    "timing_consistency": {{ "status": "Pass"|"Warning"|"Fail", "comment": "Are phases within expected ranges?" }},
	    "overall_health": {{ "status": "Healthy"|"Needs Review"|"Critical", "comment": "Overall summary" }}
	  }},
	  "detailed_analysis": "Provide a comprehensive technical analysis (in Markdown)..."
	}}
	"""
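
The prompt's "Detection Logic" hints for the Zone 1 and Zone 3 boundaries are simple enough to compute directly, which gives a rough cross-check on whatever the LLM returns. The sketch below is illustrative only: the helper names `first_current_rise` and `travel_peak_time` are hypothetical and not part of `llm.py`, it assumes the current baseline sits near zero, and the sample values are made up.

```python
def first_current_rise(time_ms, curr, fraction=0.05):
    """Return the first time at which `curr` exceeds `fraction` of its peak.

    Mirrors the Zone 1 hint in the prompt (>5% of max current).
    Returns None if the signal never crosses the threshold.
    """
    if not curr:
        return None
    threshold = fraction * max(curr)
    for t, c in zip(time_ms, curr):
        if c > threshold:
            return t
    return None


def travel_peak_time(time_ms, travel):
    """Return the time of the last sample at the maximum travel value.

    Mirrors the Zone 3 hint: travel reaches its peak, then starts to descend.
    """
    if not travel:
        return None
    peak = max(travel)
    last_peak_index = max(i for i, v in enumerate(travel) if v == peak)
    return time_ms[last_peak_index]


# Tiny synthetic trace (made-up values, for illustration only).
time_ms = [0, 10, 20, 30, 40, 50, 60]
curr = [0.0, 0.1, 0.2, 6.0, 100.0, 100.0, 0.0]
travel = [0, 20, 40, 60, 80, 80, 30]

zone_1_end = first_current_rise(time_ms, curr)
zone_3_end = travel_peak_time(time_ms, travel)
print(zone_1_end, zone_3_end)
```

A large disagreement between these estimates and the LLM's `end_ms` values would be a reasonable trigger for manual review, though real traces are noisier than this example.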

	def ask_llm_for_breakage(df, api_key, model_name, image_bytes=None):
	    """
	    Sends the DataFrame and an optional image to the LLM (Gemini) for zone segmentation.
	    Returns (df, result_json). On success, df gains a 'Zone' column; on failure,
	    result_json is None (no API key) or a dict with an "error" key.
	    """
	    if not api_key:
	        return df, None

	    try:
	        genai.configure(api_key=api_key)

	        # Configure safety settings: set every harm category to BLOCK_NONE
	        safety_settings = [
	            {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
	            {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
	            {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
	            {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
	        ]

	        model = genai.GenerativeModel(
	            model_name=model_name,
	            safety_settings=safety_settings
	        )
	    except Exception as e:
	        return df, {"error": f"Failed to initialize Gemini API: {str(e)}"}

	    # Prepare the data: keep the four signal columns and rename them for LLM clarity
	    df_llm = df[['Time (ms)', 'Current', 'Resistance', 'Travel']].copy()
	    df_llm.columns = ['time', 'curr', 'res', 'travel']

	    # Round values to one decimal place
	    df_llm = df_llm.round(1)

	    # Sample every 5th row to keep the prompt size manageable for large CSVs,
	    # then serialize the sample with to_string(index=False).
	    df_sampled = df_llm.iloc[::5, :]

	    data_str = df_sampled.to_string(index=False)

	    prompt = get_dcrm_prompt(data_str)

	    content = [prompt]
	    if image_bytes:
	        try:
	            image = PIL.Image.open(io.BytesIO(image_bytes))
	            content.append(image)
	        except Exception as e:
	            return df, {"error": f"Failed to process image: {str(e)}"}

	    try:
	        response = model.generate_content(content)

	        if not response.text:
	            if hasattr(response, 'prompt_feedback'):
	                return df, {
	                    "error": "Response blocked by safety filters",
	                    "raw_response": str(response.prompt_feedback)
	                }
	            return df, {"error": "LLM returned empty response"}

	        result = response.text.strip()

	        # Remove Markdown code fences around the JSON, if present
	        if "```json" in result:
	            result = result.split("```json")[1].split("```")[0].strip()
	        elif "```" in result:
	            result = result.split("```")[1].strip()

	        # Parse JSON
	        try:
	            result_json = json.loads(result)
	            zones = result_json.get("zones", {})

	            # Enrich the DataFrame with a zone label per row
	            df['Zone'] = "Unknown"

	            for zone_name, details in zones.items():
	                start = details.get("start_ms")
	                end = details.get("end_ms")
	                if start is not None and end is not None:
	                    # Map the zone key to a short label, e.g. "zone_1_pre_contact" -> "Zone 1"
	                    short_name = zone_name.split('_')[1]
	                    mask = (df['Time (ms)'] >= start) & (df['Time (ms)'] <= end)
	                    df.loc[mask, 'Zone'] = f"Zone {short_name}"

	            return df, result_json

	        except json.JSONDecodeError as je:
	            return df, {
	                "error": f"JSON parsing failed: {str(je)}",
	                "raw_response": result[:1000]
	            }

	    except Exception as e:
	        return df, {"error": f"LLM API error: {str(e)}"}
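
The zone-labelling loop above can be exercised without pandas or an API call. The following is a minimal sketch of the same idea on plain lists, with a guard for malformed zone keys. `zone_label` and `label_samples` are hypothetical helper names, not functions from `llm.py`, and the zone values are invented for the example.

```python
def zone_label(zone_key):
    """Turn 'zone_3_main_conduction' into 'Zone 3'; return None if malformed."""
    parts = zone_key.split("_")
    if len(parts) < 2 or not parts[1].isdigit():
        return None
    return f"Zone {parts[1]}"


def label_samples(times_ms, zones):
    """Assign each timestamp a zone label, defaulting to 'Unknown'.

    Ranges are inclusive on both ends and applied in dict order, so a sample
    on a shared boundary ends up with the later zone.
    """
    labels = ["Unknown"] * len(times_ms)
    for key, details in zones.items():
        label = zone_label(key)
        start, end = details.get("start_ms"), details.get("end_ms")
        if label is None or start is None or end is None:
            continue  # skip zones that cannot be mapped or have no bounds
        for i, t in enumerate(times_ms):
            if start <= t <= end:
                labels[i] = label
    return labels


# Invented zone boundaries, shaped like the "zones" object the prompt asks for.
zones = {
    "zone_1_pre_contact": {"start_ms": 0.0, "end_ms": 100.0},
    "zone_2_arcing_engagement": {"start_ms": 100.0, "end_ms": 130.0},
}
labels = label_samples([0.0, 50.0, 100.0, 120.0, 200.0], zones)
print(labels)
```

Because both ends of each range are inclusive, the sample at 100.0 ms is first labelled "Zone 1" and then overwritten with "Zone 2"; samples outside every range stay "Unknown".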

	def analyze_health_with_llm(image_bytes, api_key, model_name, numerical_context=None):
	    """
	    Sends the DCRM image to Gemini for expert diagnostic analysis.
	    numerical_context is an optional dict of extracted values (e.g. 'min_resistance',
	    'median_resistance') that is included in the prompt to reduce hallucination.
	    Returns the raw response text, a dict with an "error" key, or None if inputs are missing.
	    """
	    if not api_key or not image_bytes:
	        return None

	    try:
	        genai.configure(api_key=api_key)

	        safety_settings = [
	            {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
	            {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
	            {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
	            {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
	        ]

	        model = genai.GenerativeModel(
	            model_name=model_name,
	            safety_settings=safety_settings
	        )

	        # Build context string
	        context_str = ""
	        if numerical_context:
	            context_str = f"""
	NUMERICAL DATA CONTEXT (From Raw Extraction):
	- Minimum Static Resistance Found: {numerical_context.get('min_resistance', 'N/A')} µΩ
	- Median Resistance Found: {numerical_context.get('median_resistance', 'N/A')} µΩ

	NOTE: If the extracted resistance is HIGH (e.g. >200 µΩ) but the curve looks flat and healthy,
	the data extraction scale is likely uncalibrated while the relative health is good.
	Trust the SHAPE (flatness/noise) over the absolute number if they conflict, but mention the value.
	"""

	        prompt = f"""
	System Role: Principal DCRM & Kinematic Analyst
	Role:
	You are an expert High-Voltage Circuit Breaker Diagnostician. Your task is to interpret Dynamic Contact Resistance (DCRM) traces to detect specific electrical and mechanical faults.

	{context_str}

	Critical "Anti-Overfitting" Directive:
	You must distinguish between Systematic Defects and Artifacts.
	Sensor/Manufacturing Noise: A totally flat line is rare in real-world data. Slight "fuzz" or very minute "grassiness" (amplitude < 10 μΩ) is often sensor noise, ADC quantization, or normal manufacturing surface variance. Do not flag this as a defect.
	True Degradation: Flag issues only when the visual signature is statistically significant and exceeds the "noise floor."

	Capability:
	Identify Multiple Concurrent Issues if present (e.g., a breaker can have both misalignment and contact wear).
	The input will usually contain three line charts:
	- Green: resistance profile
	- Blue: current profile
	- Red: travel profile

	1. Diagnostic Heuristics & Defect Taxonomy
	Map the visual DCRM trace to ONLY the following defect types. Use the specific Visual Heuristics to confirm detection.

	Defect Type | Visual Heuristic (The "Hint") | Mechanical Significance (Root Cause)
	--- | --- | ---
	Main Contact Issue (Corrosion/Oxidation) | "The Significant Grass"<br>In the fully closed plateau, look for pronounced, erratic instability. <br>• Ignore: Uniform, low-amplitude fuzz (sensor noise).<br>• Flag: Jagged, irregular peaks/valleys with significant amplitude (e.g., > 15–20 μΩ variance). The trace looks like a "rough rocky road," not just a "gravel path." | Surface Pathology: The Silver (Ag) plating is compromised (fretting corrosion) or heavy oxidation has occurred. The current path is constantly shifting through microscopic non-conductive spots.
	Arcing Contact Wear | "Big Spikes & Short Wipe"<br>Resistance spikes are frequent and significantly large (high amplitude). Crucially, the duration of the arcing zone (the time between first touch and main contact touch) is noticeably shorter than expected. | Ablation: The Tungsten-Copper (W-Cu) tips are heavily eroded. The contact length has physically diminished, risking failure to commutate current during opening.
	Misalignment (Main) | "The Struggle to Settle"<br>There are significant, high-amplitude peaks just before the trace tries to settle into the stable plateau. These are not bounces; they are "struggles" to mate that persist longer than 3-5ms. | Mechanical Centering: The moving contact pin is hitting the side or edge of the stationary rosette fingers before forcing its way in. Caused by loose nuts, kinematic play, or guide ring failure.
	Misalignment (Arcing) | "Rough Entry"<br>Erratic resistance spikes occurring specifically during the initial entry (commutation), well before the main contacts engage. | Tip Eccentricity: The arcing pin is not entering the nozzle concentrically. It is scraping the nozzle throat or hitting the side, indicating a bent rod or skewed interrupter.
	Slow Mechanism | "Stretched Time"<br>The entire resistance profile is elongated along the X-axis. Events happen later than normal. | Energy Starvation: Low spring charge, hydraulic pressure loss, or high friction due to hardened grease in the linkage.

	2. Analysis Logic (The "Signal-to-Noise" Filter)
	Before declaring a defect, run these logic checks:
	The "Noise Floor" Test (For Main Contacts):
	Is the plateau variance uniform and small (< 10 μΩ)? -> Classify as Healthy (Sensor/Manufacturing artifact).
	Is the variance erratic, jagged, and large (> 15 μΩ)? -> Classify as Corrosion/Oxidation.
	The "Duration" Test (For Misalignment):
	Are the pre-plateau peaks < 2ms? -> Ignore (Benign Bounce).
	Do the peaks persist > 3-5ms before settling? -> Classify as Misalignment.
	The "Combination" Check:
	Does the trace show both "Rough Entry" AND "Stretched Time"? -> Report Both (Misalignment + Slow Mechanism).

	3. Output Structure
	Provide a concise Executive Lead followed by the JSON.

	Executive Lead (3-4 Lines)
	Status: Healthy | Warning | Critical.
	Key Findings: Summary of valid defects found (ignoring sensor noise).
	Action: "Return to service" or specific repair instruction.

	JSON Schema
	```json
	{{
	  "image_url": "string",
	  "overall_condition": "Healthy|Warning|Critical",
	  "health_score": "integer (0-100) where 100 is perfect condition",
	  "detected_issues": [
	    {{
	      "issue_type": "Main Contact Issue (Corrosion/Oxidation)|Arcing Contact Wear|Misalignment (Main)|Misalignment (Arcing)|Slow Mechanism",
	      "confidence": "High|Medium|Low",
	      "visual_evidence": "string (e.g., 'Plateau instability >20 micro-ohms detected, exceeding sensor noise threshold.')",
	      "mechanical_significance": "string (Root cause from table)",
	      "severity": "Low|Medium|High"
	    }}
	  ],
	  "analysis_metrics": {{
	    "static_resistance_Rp_uOhm": "float",
	    "signal_noise_level": "Low (Sensor/Mfg)|High (Defect)",
	    "wipe_quality": "Normal|Short|Erratic"
	  }},
	  "maintenance_recommendation": "string"
	}}
	```
	"""

	        image = PIL.Image.open(io.BytesIO(image_bytes))

	        response = model.generate_content([prompt, image])

	        if not response.text:
	            return {"error": "LLM returned empty response"}

	        return response.text

	    except Exception as e:
	        return {"error": f"LLM Analysis Error: {str(e)}"}
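
`analyze_health_with_llm` returns the raw reply text, which the prompt asks to be an Executive Lead followed by a fenced JSON object. A caller will probably want to separate the two. The sketch below shows one way to do that, assuming the model follows the requested layout. `split_lead_and_json` is a hypothetical helper, not part of `llm.py`, and the sample reply is invented.

```python
import json
import re

# Build the fence marker programmatically so this example contains no literal fence.
FENCE = "`" * 3
JSON_BLOCK_RE = re.compile(
    FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL
)


def split_lead_and_json(text):
    """Return (lead_text, parsed_json_or_None) from a prose + fenced-JSON reply."""
    match = JSON_BLOCK_RE.search(text)
    if match is None:
        return text.strip(), None
    lead = text[:match.start()].strip()
    try:
        return lead, json.loads(match.group(1))
    except json.JSONDecodeError:
        # Keep the lead even if the JSON part is malformed.
        return lead, None


# Invented reply in the layout the prompt requests.
reply = (
    "Status: Warning.\n"
    "Key Findings: Plateau instability above the noise floor.\n"
    "Action: Inspect main contacts.\n\n"
    + FENCE + "json\n"
    + '{"overall_condition": "Warning", "health_score": 72, '
    + '"detected_issues": [{"severity": "Medium"}]}\n'
    + FENCE
)

lead, report = split_lead_and_json(reply)
print(lead)
print(report["health_score"])
```

Returning `None` for the JSON part instead of raising lets the caller still display the Executive Lead when the model's JSON is malformed or missing.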