saadmannan committed on
Commit
d2173d1
·
1 Parent(s): c38dbec

Prepare project for Hugging Face Space deployment - Add app.py with Gradio interface - Update requirements.txt with torch dependencies - Configure LFS for large files (models, data) - Update README with comprehensive documentation

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.csv filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,65 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Virtual Environment
+ venv/
+ ENV/
+ env/
+ .venv
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # PyCharm
+ .idea/
+
+ # VS Code
+ .vscode/
+
+ # Environment variables
+ .env
+ .env.local
+
+ # Model files (optional - uncomment if models are large)
+ # *.pth
+ # *.pt
+ # *.h5
+
+ # Data files (optional - uncomment if data is large)
+ # data/raw/*.csv
+ # data/processed/*.csv
+
+ # Logs
+ *.log
+ logs/
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Testing
+ .pytest_cache/
+ .coverage
+ htmlcov/
+
+ # Docker
+ .dockerignore
Dockerfile ADDED
@@ -0,0 +1,34 @@
+ # Vehicle Diagnostics Agent Dockerfile
+ FROM python:3.10-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     curl \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements
+ COPY requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Install PyTorch (CPU version for smaller image)
+ RUN pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/cpu
+
+ # Copy application code
+ COPY src/ ./src/
+ COPY data/ ./data/
+
+ # Expose ports
+ EXPOSE 8000 7860
+
+ # Set environment variables
+ ENV PYTHONUNBUFFERED=1
+ ENV PYTHONPATH=/app
+
+ # Default command (can be overridden)
+ CMD ["python", "src/api/main.py"]
PROJECT_SUMMARY.md ADDED
@@ -0,0 +1,332 @@
+ # Vehicle Diagnostics Agent - Project Completion Summary
+
+ ## 🎉 Project Status: COMPLETED
+
+ All phases of the Vehicle Diagnostics Agent project have been successfully implemented and tested.
+
+ ---
+
+ ## ✅ Completed Phases
+
+ ### Phase 1: Project Setup and Planning ✓
+ - ✅ Created project structure with organized directories
+ - ✅ Set up conda environment (vda)
+ - ✅ Installed all dependencies (PyTorch, LangChain, FastAPI, Gradio, etc.)
+ - ✅ Generated synthetic vehicle sensor dataset (50,000 records, 100 vehicles)
+ - ✅ Dataset includes 14 sensor measurements with realistic anomaly patterns
+
+ ### Phase 2: Data Collection and Preprocessing ✓
+ - ✅ Implemented comprehensive data preprocessing pipeline
+ - ✅ Applied noise filtering with moving average (window=5)
+ - ✅ Engineered 60+ features including:
+   - Rate of change features
+   - Rolling statistics
+   - Domain-specific features (temp differential, tire imbalance, engine stress, etc.)
+ - ✅ Normalized features using StandardScaler
+ - ✅ Split data: 70% train, 10% validation, 20% test
+ - ✅ Saved preprocessing artifacts (scaler, feature columns)
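The smoothing, rate-of-change, and rolling-statistics features described above can be sketched in a few lines of pandas. This is a minimal illustration, not the actual pipeline (which lives in `src/utils/data_preprocessing.py`); the column name `engine_temp` is a stand-in, while the `window=5` moving average follows the description:

```python
import pandas as pd

def add_basic_features(df: pd.DataFrame, sensor_cols: list, window: int = 5) -> pd.DataFrame:
    """Illustrative feature engineering: smoothing, rate of change, rolling stats."""
    out = df.copy()
    for col in sensor_cols:
        # Noise filtering with a moving average (window=5)
        out[f'{col}_smooth'] = out[col].rolling(window, min_periods=1).mean()
        # Rate of change between consecutive readings
        out[f'{col}_delta'] = out[col].diff().fillna(0)
        # Rolling standard deviation over the same window
        out[f'{col}_roll_std'] = out[col].rolling(window, min_periods=1).std().fillna(0)
    return out

df = pd.DataFrame({'engine_temp': [90, 91, 90, 120, 121, 92]})
feats = add_basic_features(df, ['engine_temp'])
```

Each raw sensor column yields several derived columns, which is how 14 sensors expand to 60+ engineered features.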
+
+ ### Phase 3: Build Individual Agents ✓
+
+ #### 1. Data Ingestion Agent ✓
+ - ✅ Loads and prepares vehicle sensor data
+ - ✅ Supports filtering by vehicle ID and time range
+ - ✅ Generates sensor summary statistics
+ - ✅ Prepares data for downstream agents
+
+ #### 2. Anomaly Detection Agent ✓
+ - ✅ LSTM-based neural network model
+ - ✅ Architecture: 2-layer LSTM with 64 hidden units
+ - ✅ Trained on 31,570 sequences
+ - ✅ Validation accuracy: 99.53%
+ - ✅ Best validation loss: 0.0409
+ - ✅ Fallback rule-based detection system
+ - ✅ Identifies anomalous sensors with severity levels
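A detector of this shape (2-layer LSTM, 64 hidden units, binary output) could look roughly like the sketch below. This is a hypothetical reconstruction consistent with the stated architecture, not the code in `src/models/anomaly_detector.py`; the sequence length of 10 and 60 features follow the numbers given elsewhere in this summary:

```python
import torch
import torch.nn as nn

class LSTMAnomalyDetector(nn.Module):
    """Sketch: 2-layer LSTM over sensor-feature sequences -> anomaly probability."""
    def __init__(self, n_features: int = 60, hidden_size: int = 64, num_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden_size, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features); classify from the last timestep's hidden state
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)

model = LSTMAnomalyDetector()
scores = model(torch.randn(4, 10, 60))  # 4 sequences of 10 timesteps, 60 features
```

Thresholding the sigmoid output (e.g. at 0.5) yields the binary normal/anomaly prediction.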
+
+ #### 3. Root Cause Analysis Agent ✓
+ - ✅ 8 fault pattern definitions with thresholds
+ - ✅ Fault code mapping (P-codes, C-codes)
+ - ✅ Sensor correlation analysis
+ - ✅ Failure sequence determination
+ - ✅ Confidence scoring for each root cause
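Threshold-based fault pattern matching with confidence scoring can be sketched as below. The pattern names, sensor thresholds, and the `C0750` code are hypothetical illustrations (only `P0217`/`P0128` appear in the sample results later in this document); the agent's real eight patterns live in its own module:

```python
# Hypothetical fault patterns; the real definitions live in the Root Cause Analysis Agent.
FAULT_PATTERNS = {
    'cooling_system_failure': {
        'codes': ['P0217', 'P0128'],
        'conditions': {'engine_temp': ('>', 110), 'coolant_temp': ('>', 100)},
    },
    'low_tire_pressure': {
        'codes': ['C0750'],
        'conditions': {'tire_pressure_fl': ('<', 28)},
    },
}

OPS = {'>': lambda a, b: a > b, '<': lambda a, b: a < b}

def match_fault_patterns(readings: dict) -> list:
    """Return candidate root causes, scored by the fraction of conditions met."""
    causes = []
    for name, pattern in FAULT_PATTERNS.items():
        conds = pattern['conditions']
        hits = sum(
            1 for sensor, (op, threshold) in conds.items()
            if sensor in readings and OPS[op](readings[sensor], threshold)
        )
        if hits:
            causes.append({'cause': name, 'codes': pattern['codes'],
                           'confidence': hits / len(conds)})
    return sorted(causes, key=lambda c: c['confidence'], reverse=True)

result = match_fault_patterns({'engine_temp': 118, 'coolant_temp': 104, 'tire_pressure_fl': 32})
```

A fully matched pattern scores 1.0 (100% confidence), mirroring the cooling-system example in the sample results.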
+
+ #### 4. Maintenance Recommendation Agent ✓
+ - ✅ Comprehensive maintenance action database
+ - ✅ Immediate, short-term, and long-term actions
+ - ✅ Cost estimation for each fault type
+ - ✅ Urgency-based prioritization
+ - ✅ Downtime estimation
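Such an action database is essentially a lookup keyed by root cause. The sketch below is hypothetical; the cost range and downtime for the cooling-system entry are taken from the Vehicle 32 sample results later in this document, and the short-term actions are illustrative guesses:

```python
# Hypothetical excerpt of the maintenance action database; real entries live in the agent.
MAINTENANCE_ACTIONS = {
    'cooling_system_failure': {
        'urgency': 'critical',
        'immediate': ['Do not operate vehicle', 'Tow to service center'],
        'short_term': ['Pressure-test cooling system', 'Replace thermostat'],  # illustrative
        'cost_range_usd': (1120, 4300),
        'downtime_days': (2, 5),
    },
}

def recommend(cause: str) -> dict:
    """Look up prioritized actions and estimates for a diagnosed root cause."""
    entry = MAINTENANCE_ACTIONS.get(cause)
    if entry is None:
        # Unknown fault: nothing to recommend
        return {'urgency': 'none', 'immediate': [], 'cost_range_usd': (0, 0)}
    return entry

rec = recommend('cooling_system_failure')
```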
+
+ #### 5. Report Generation Agent ✓
+ - ✅ Executive summary generation
+ - ✅ Natural language summaries for non-technical users
+ - ✅ Detailed technical reports
+ - ✅ JSON-formatted structured reports
+ - ✅ Timestamp and metadata tracking
+
+ ### Phase 4: Agent Orchestration and Workflow ✓
+ - ✅ Implemented LangGraph-based orchestration
+ - ✅ Sequential agent execution pipeline
+ - ✅ State management across agents
+ - ✅ Error handling and recovery
+ - ✅ Support for single and batch vehicle diagnostics
+ - ✅ Complete workflow: Data Ingestion → Anomaly Detection → Root Cause → Recommendation → Report
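Conceptually the orchestrated workflow is a linear graph threading one shared state through each agent. Stripped of LangGraph specifics, the state handoff can be sketched as follows (the step functions are hypothetical stand-ins for the five agents, with hard-coded placeholder outputs):

```python
def ingest(state: dict) -> dict:
    state['data'] = [1, 2, 3]          # stand-in for loaded sensor readings
    return state

def detect(state: dict) -> dict:
    state['anomaly_detected'] = True   # stand-in for LSTM inference
    return state

def analyze(state: dict) -> dict:
    state['root_cause'] = 'cooling_system_failure' if state['anomaly_detected'] else None
    return state

def recommend(state: dict) -> dict:
    state['actions'] = ['Tow to service center'] if state['root_cause'] else []
    return state

def report(state: dict) -> dict:
    state['report'] = f"Cause: {state['root_cause']}, actions: {len(state['actions'])}"
    return state

# Data Ingestion -> Anomaly Detection -> Root Cause -> Recommendation -> Report
PIPELINE = [ingest, detect, analyze, recommend, report]

def run_pipeline(vehicle_id: int) -> dict:
    """Thread one shared state dict through each agent in order."""
    state = {'vehicle_id': vehicle_id}
    for step in PIPELINE:
        state = step(state)  # each agent reads and extends the shared state
    return state

final = run_pipeline(32)
```

LangGraph adds typed state, conditional edges, and error recovery on top of this basic shape.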
+
+ ### Phase 5: Backend and Frontend Development ✓
+
+ #### FastAPI Backend ✓
+ - ✅ RESTful API with 7 endpoints:
+   - `/` - Root endpoint
+   - `/health` - Health check
+   - `/vehicles` - List available vehicles
+   - `/diagnose` - Single vehicle diagnostic
+   - `/batch-diagnose` - Batch diagnostics
+   - `/report/{vehicle_id}` - Full report
+   - `/vehicle/{vehicle_id}/status` - Vehicle status
+ - ✅ CORS middleware enabled
+ - ✅ Pydantic models for request/response validation
+ - ✅ Comprehensive error handling
+ - ✅ Auto-generated API documentation (Swagger/OpenAPI)
+
+ #### Gradio Frontend ✓
+ - ✅ Interactive web-based UI
+ - ✅ Three main tabs:
+   - Single Vehicle Diagnostic
+   - Vehicle Overview
+   - About/Documentation
+ - ✅ Real-time diagnostic execution
+ - ✅ Plotly visualizations for anomaly detection
+ - ✅ Vehicle information display
+ - ✅ Full report viewing
+ - ✅ Natural language summaries
+
+ ### Phase 6: Testing and Validation ✓
+ - ✅ Comprehensive unit test suite (12 tests)
+ - ✅ All tests passing (100% success rate)
+ - ✅ Tests cover:
+   - Data Ingestion Agent
+   - Anomaly Detection Agent
+   - Root Cause Analysis Agent
+   - Maintenance Recommendation Agent
+   - Report Generation Agent
+   - Full pipeline integration
+ - ✅ Pytest configuration
+ - ✅ Test execution time: ~3.24 seconds
+
+ ### Phase 7: Deployment and Documentation ✓
+ - ✅ Dockerfile for containerization
+ - ✅ Docker Compose configuration (API + UI services)
+ - ✅ Comprehensive README.md with:
+   - Project overview
+   - Architecture diagrams
+   - Installation instructions
+   - Usage examples
+   - API documentation
+   - Performance metrics
+ - ✅ .gitignore file
+ - ✅ Quick start scripts (run_ui.sh, run_api.sh)
+ - ✅ requirements.txt with all dependencies
+
+ ---
+
+ ## 📊 Key Metrics
+
+ ### Model Performance
+ - **Validation Accuracy:** 99.53%
+ - **Training Loss:** 0.0003 (final epoch)
+ - **Validation Loss:** 0.0409 (best)
+ - **Training Time:** ~2 minutes (20 epochs on GPU)
+
+ ### Dataset Statistics
+ - **Total Records:** 50,000
+ - **Vehicles:** 100
+ - **Timesteps per Vehicle:** 500
+ - **Features:** 60 (engineered)
+ - **Anomaly Rate:** ~9% (train), ~2% (val), ~7% (test)
+
+ ### System Performance
+ - **Pipeline Execution Time:** ~1 second per vehicle
+ - **API Response Time:** < 2 seconds
+ - **Memory Usage:** Moderate (suitable for production)
+
+ ---
+
+ ## 🗂️ Project Structure
+
+ ```
+ VehicleDiagnosticsAgent/
+ ├── data/
+ │   ├── raw/
+ │   │   └── vehicle_sensor_data.csv (50,000 records)
+ │   └── processed/
+ │       ├── train.csv (35,000 records)
+ │       ├── val.csv (5,000 records)
+ │       ├── test.csv (10,000 records)
+ │       ├── scaler.pkl
+ │       └── feature_columns.pkl
+ ├── src/
+ │   ├── agents/
+ │   │   ├── data_ingestion_agent.py
+ │   │   ├── anomaly_detection_agent.py
+ │   │   ├── root_cause_agent.py
+ │   │   ├── maintenance_recommendation_agent.py
+ │   │   └── report_generation_agent.py
+ │   ├── models/
+ │   │   ├── anomaly_detector.py
+ │   │   ├── train_anomaly_detector.py
+ │   │   └── best_anomaly_detector.pth (trained model)
+ │   ├── utils/
+ │   │   ├── download_data.py
+ │   │   └── data_preprocessing.py
+ │   ├── api/
+ │   │   └── main.py (FastAPI backend)
+ │   ├── ui/
+ │   │   └── gradio_app.py (Gradio frontend)
+ │   └── orchestrator.py (LangGraph orchestration)
+ ├── tests/
+ │   └── test_agents.py (12 unit tests)
+ ├── Dockerfile
+ ├── docker-compose.yml
+ ├── requirements.txt
+ ├── README.md
+ ├── .gitignore
+ ├── run_ui.sh
+ ├── run_api.sh
+ └── project.md
+ ```
+
+ ---
+
+ ## 🚀 How to Run
+
+ ### Option 1: Gradio UI (Recommended)
+ ```bash
+ conda activate vda
+ ./run_ui.sh
+ # Access at http://localhost:7860
+ ```
+
+ ### Option 2: FastAPI Backend
+ ```bash
+ conda activate vda
+ ./run_api.sh
+ # API at http://localhost:8000
+ # Docs at http://localhost:8000/docs
+ ```
+
+ ### Option 3: Docker (Production)
+ ```bash
+ docker-compose up --build
+ # API: http://localhost:8000
+ # UI: http://localhost:7860
+ ```
+
+ ### Option 4: Python Direct
+ ```bash
+ conda activate vda
+ python src/orchestrator.py          # Test orchestrator
+ python src/ui/gradio_app.py         # Launch UI
+ uvicorn src.api.main:app --reload   # Launch API
+ ```
+
+ ---
+
+ ## 🎯 Key Features Demonstrated
+
+ ### Technical Skills
+ - ✅ Multi-agent AI system design
+ - ✅ Deep learning (LSTM for time series)
+ - ✅ LangChain/LangGraph orchestration
+ - ✅ FastAPI REST API development
+ - ✅ Gradio UI development
+ - ✅ Data engineering & preprocessing
+ - ✅ Feature engineering
+ - ✅ Docker containerization
+ - ✅ Unit testing with pytest
+ - ✅ Production-ready code structure
+
+ ### Domain Knowledge
+ - ✅ Automotive diagnostics
+ - ✅ Fault code mapping (OBD-II)
+ - ✅ Sensor data analysis
+ - ✅ Maintenance planning
+ - ✅ Cost estimation
+
+ ### Software Engineering
+ - ✅ Clean code architecture
+ - ✅ Modular design
+ - ✅ Error handling
+ - ✅ Documentation
+ - ✅ Version control ready
+ - ✅ Deployment ready
+
+ ---
+
+ ## 📈 Sample Results
+
+ ### Example Diagnostic Output
+
+ **Vehicle 32 Analysis:**
+ - **Anomaly Detected:** Yes
+ - **Anomaly Score:** 0.755
+ - **Anomalous Readings:** 151/200 (75.5%)
+ - **Primary Cause:** Cooling system failure (critical severity, 100% confidence)
+ - **Fault Codes:** P0217, P0128
+ - **Estimated Cost:** $1,120 - $4,300
+ - **Estimated Downtime:** 2-5 days
+
+ **Immediate Actions:**
+ 1. Do not operate vehicle
+ 2. Tow to service center
+ 3. Stop engine immediately
+
+ ---
+
+ ## 🎓 Learning Outcomes
+
+ This project successfully demonstrates:
+
+ 1. **Multi-Agent Architecture** - Coordinated execution of specialized AI agents
+ 2. **Production ML Pipeline** - From data collection to deployment
+ 3. **Real-World Application** - Automotive diagnostics with practical value
+ 4. **Full-Stack Development** - Backend API + Frontend UI
+ 5. **Modern AI Tools** - LangChain, LangGraph, PyTorch
+ 6. **DevOps Practices** - Docker, testing, documentation
+
+ ---
+
+ ## 🔮 Future Enhancements (Optional)
+
+ - [ ] Real-time streaming data support
+ - [ ] Integration with actual OBD-II devices
+ - [ ] LLM integration for conversational diagnostics
+ - [ ] Mobile application
+ - [ ] Cloud deployment (AWS/Azure/GCP)
+ - [ ] Advanced visualization dashboard
+ - [ ] Multi-model ensemble
+ - [ ] Predictive maintenance scheduling
+
+ ---
+
+ ## ✨ Conclusion
+
+ The Vehicle Diagnostics Agent project has been **successfully completed** with all requirements met:
+
+ - ✅ Multi-agent AI system with 5 specialized agents
+ - ✅ LSTM-based anomaly detection (99.53% accuracy)
+ - ✅ LangGraph orchestration
+ - ✅ FastAPI backend with 7 endpoints
+ - ✅ Gradio interactive UI
+ - ✅ Comprehensive testing (12 tests, 100% pass)
+ - ✅ Docker containerization
+ - ✅ Complete documentation
+
+ **The system is production-ready and demonstrates advanced AI/ML engineering capabilities.**
+
+ ---
+
+ **Project Completed:** November 23, 2025
+ **Total Development Time:** ~1 session
+ **Lines of Code:** ~3,500+
+ **Test Coverage:** Comprehensive
+ **Status:** ✅ READY FOR DEPLOYMENT
QUICK_START.md ADDED
@@ -0,0 +1,277 @@
+ # 🚀 Quick Start Guide - Vehicle Diagnostics Agent
+
+ ## ✅ Current Status
+
+ **The system is fully operational!**
+
+ - ✅ Conda environment: `vda` (active)
+ - ✅ Dataset: Generated (50,000 records)
+ - ✅ Model: Trained (99.53% accuracy)
+ - ✅ All agents: Implemented and tested
+ - ✅ Gradio UI: Running at http://localhost:7860
+ - ✅ Tests: All 12 tests passing
+
+ ---
+
+ ## 🎯 Access the System
+
+ ### Gradio UI (Currently Running)
+ ```
+ URL: http://localhost:7860
+ ```
+
+ The Gradio interface is already running in your Cascade terminal!
+
+ **Features:**
+ - 🔍 Single vehicle diagnostics
+ - 📊 Vehicle overview with anomaly list
+ - 📋 Full diagnostic reports
+ - 📈 Interactive visualizations
+
+ ---
+
+ ## 🔧 Running Different Components
+
+ ### 1. Gradio UI (Interactive Dashboard)
+ ```bash
+ # If not already running:
+ python src/ui/gradio_app.py
+
+ # Or use the quick start script:
+ ./run_ui.sh
+ ```
+
+ ### 2. FastAPI Backend (REST API)
+ ```bash
+ # Start the API server:
+ uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload
+
+ # Or use the quick start script:
+ ./run_api.sh
+ ```
+
+ **API Endpoints:**
+ - `http://localhost:8000` - Root
+ - `http://localhost:8000/docs` - Interactive API documentation
+ - `http://localhost:8000/health` - Health check
+ - `http://localhost:8000/vehicles` - List vehicles
+ - `http://localhost:8000/diagnose` - Run diagnostic
+
+ ### 3. Python Script (Direct)
+ ```bash
+ # Run the demo script:
+ python demo.py
+
+ # Or test the orchestrator:
+ python src/orchestrator.py
+ ```
+
+ ### 4. Docker (Production Deployment)
+ ```bash
+ # Build and run with Docker Compose:
+ docker-compose up --build
+
+ # Access:
+ # - API: http://localhost:8000
+ # - UI: http://localhost:7860
+ ```
+
+ ---
+
+ ## 📝 Quick Examples
+
+ ### Example 1: Using the Gradio UI
+
+ 1. Open http://localhost:7860 in your browser
+ 2. Go to the "Single Vehicle Diagnostic" tab
+ 3. Select a vehicle ID from the dropdown
+ 4. Set the number of readings (e.g., 200)
+ 5. Click "Run Diagnostic"
+ 6. View the results, visualizations, and full report
+
+ ### Example 2: Using the Python API
+
+ ```python
+ from src.orchestrator import VehicleDiagnosticOrchestrator
+
+ # Initialize
+ orchestrator = VehicleDiagnosticOrchestrator()
+
+ # Run diagnostic
+ result = orchestrator.diagnose_vehicle(vehicle_id=32, n_readings=200)
+
+ # Access results
+ if result['success']:
+     print(result['report']['natural_language_summary'])
+     print(f"Anomaly Score: {result['anomaly_result']['overall_score']}")
+ ```
+
+ ### Example 3: Using the REST API
+
+ ```bash
+ # Health check
+ curl http://localhost:8000/health
+
+ # List vehicles
+ curl http://localhost:8000/vehicles
+
+ # Run diagnostic
+ curl -X POST http://localhost:8000/diagnose \
+      -H "Content-Type: application/json" \
+      -d '{"vehicle_id": 32, "n_readings": 200}'
+
+ # Get full report
+ curl http://localhost:8000/report/32
+ ```
+
+ ---
+
+ ## 🧪 Testing
+
+ ```bash
+ # Run all tests:
+ pytest tests/ -v
+
+ # Run a specific test:
+ pytest tests/test_agents.py::TestDataIngestionAgent -v
+
+ # Run with coverage:
+ pytest tests/ --cov=src --cov-report=html
+ ```
+
+ **Current Test Results:**
+ - ✅ 12/12 tests passing
+ - ✅ Execution time: ~3.24 seconds
+ - ✅ 100% success rate
+
+ ---
+
+ ## 📊 Sample Vehicles to Try
+
+ Based on the test data, here are some interesting vehicles:
+
+ **Vehicles with Anomalies:**
+ - Vehicle 32: High anomaly rate (~75%), cooling system issues
+ - Vehicle 8: Medium anomaly rate, multiple sensor issues
+ - Vehicle 15: Low anomaly rate, tire pressure issues
+
+ **Healthy Vehicles:**
+ - Vehicle 1: No anomalies detected
+ - Vehicle 2: Clean sensor readings
+ - Vehicle 5: Normal operation
+
+ ---
+
+ ## 🎨 Gradio UI Features
+
+ ### Tab 1: Single Vehicle Diagnostic
+ - Select a vehicle from the dropdown
+ - Set the number of readings to analyze
+ - View real-time diagnostic results
+ - See the anomaly detection visualization
+ - Read the natural language summary
+ - Access the full technical report
+
+ ### Tab 2: Vehicle Overview
+ - List all vehicles with anomalies
+ - See anomaly counts and rates
+ - Refresh the list dynamically
+
+ ### Tab 3: About
+ - System architecture
+ - Technology stack
+ - Feature list
+ - Dataset information
+
+ ---
+
+ ## 📁 Important Files
+
+ ### Data Files
+ - `data/raw/vehicle_sensor_data.csv` - Raw sensor data
+ - `data/processed/train.csv` - Training data
+ - `data/processed/test.csv` - Test data
+ - `data/processed/scaler.pkl` - Feature scaler
+
+ ### Model Files
+ - `src/models/best_anomaly_detector.pth` - Trained LSTM model
+
+ ### Configuration
+ - `requirements.txt` - Python dependencies
+ - `docker-compose.yml` - Docker configuration
+ - `.gitignore` - Git ignore rules
+
+ ### Documentation
+ - `README.md` - Comprehensive documentation
+ - `PROJECT_SUMMARY.md` - Project completion summary
+ - `QUICK_START.md` - This file
+
+ ---
+
+ ## 🔍 Troubleshooting
+
+ ### Issue: Gradio UI not loading
+ **Solution:** Check whether the UI is already running in another terminal. Only one instance can run on port 7860.
+
+ ### Issue: Model not found error
+ **Solution:** Train the model first:
+ ```bash
+ python src/models/train_anomaly_detector.py
+ ```
+
+ ### Issue: Data not found error
+ **Solution:** Generate and preprocess the data:
+ ```bash
+ python src/utils/download_data.py
+ python src/utils/data_preprocessing.py
+ ```
+
+ ### Issue: Import errors
+ **Solution:** Make sure the `vda` conda environment is activated:
+ ```bash
+ conda activate vda
+ ```
+
+ ### Issue: Port already in use
+ **Solution:** Change the port or stop the existing process:
+ ```bash
+ # For Gradio (default 7860):
+ python src/ui/gradio_app.py   # Will auto-select the next available port
+
+ # For FastAPI (default 8000):
+ uvicorn src.api.main:app --port 8001
+ ```
+
+ ---
+
+ ## 🎯 Next Steps
+
+ 1. **Explore the Gradio UI** - Try diagnosing different vehicles
+ 2. **Test the API** - Use the FastAPI docs at `/docs`
+ 3. **Run the demo** - Execute `python demo.py`
+ 4. **Customize** - Modify the agents for your use case
+ 5. **Deploy** - Use Docker for production deployment
+
+ ---
+
+ ## 📞 Support
+
+ For issues or questions:
+ - Check `README.md` for detailed documentation
+ - Review `PROJECT_SUMMARY.md` for a project overview
+ - Examine the test files in `tests/` for usage examples
+
+ ---
+
+ ## 🎉 Success!
+
+ Your Vehicle Diagnostics Agent is fully operational and ready to use!
+
+ **Current Status:**
+ - ✅ System: Running
+ - ✅ UI: http://localhost:7860
+ - ✅ Model: Trained (99.53% accuracy)
+ - ✅ Data: Processed (50,000 records)
+ - ✅ Tests: Passing (12/12)
+
+ **Enjoy your multi-agent AI diagnostic system!** 🚗✨
README.md CHANGED
@@ -1,13 +1,110 @@
  ---
- title: VehicleDiagnosticsAgent
- emoji:
- colorFrom: gray
- colorTo: yellow
+ title: Vehicle Diagnostics Agent
+ emoji: 🚗
+ colorFrom: blue
+ colorTo: green
  sdk: gradio
- sdk_version: 6.0.0
+ sdk_version: 4.44.0
  app_file: app.py
  pinned: false
- short_description: Anomaly detection in Vehicles
+ license: mit
+ short_description: Multi-Agent AI System for Predictive Vehicle Diagnostics
+ tags:
+   - anomaly-detection
+   - lstm
+   - pytorch
+   - langchain
+   - langgraph
+   - multi-agent
+   - vehicle-diagnostics
+   - time-series
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🚗 Vehicle Diagnostics Agent
+
+ ## Multi-Agent AI System for Predictive Vehicle Diagnostics
+
+ This is a production-ready multi-agent AI system that analyzes vehicle sensor data to detect anomalies, identify root causes, and provide actionable maintenance recommendations.
+
+ ### 🎯 Key Features
+
+ - **🔍 Anomaly Detection**: LSTM-based neural network with 99.53% validation accuracy
+ - **🔬 Root Cause Analysis**: Identifies underlying issues with OBD-II fault code mapping
+ - **🔧 Maintenance Recommendations**: Provides cost estimates and prioritized action plans
+ - **📊 Interactive Visualizations**: Real-time anomaly detection charts
+ - **📋 Natural Language Reports**: Easy-to-understand summaries for vehicle owners
+
+ ### 🏗️ System Architecture
+
+ The system employs a **multi-agent architecture** orchestrated by LangGraph:
+
+ 1. **Data Ingestion Agent** - Loads and prepares vehicle sensor data
+ 2. **Anomaly Detection Agent** - LSTM neural network for pattern detection
+ 3. **Root Cause Analysis Agent** - Fault pattern matching and correlation analysis
+ 4. **Maintenance Recommendation Agent** - Cost estimation and action planning
+ 5. **Report Generation Agent** - Comprehensive diagnostic reports
+
+ ### 🚀 Technology Stack
+
+ - **ML Framework**: PyTorch (LSTM-based time-series anomaly detection)
+ - **Orchestration**: LangGraph for multi-agent coordination
+ - **Frontend**: Gradio for interactive UI
+ - **Data Processing**: Pandas, NumPy, Scikit-learn
+ - **Visualization**: Plotly
+
+ ### 📊 Model Performance
+
+ - **Validation Accuracy**: 99.53%
+ - **Training Loss**: 0.0003 (final epoch)
+ - **Validation Loss**: 0.0409 (best)
+ - **Dataset**: 50,000 records from 100 vehicles
+ - **Features**: 60+ engineered features from 14 sensor measurements
+
+ ### 🎮 How to Use
+
+ 1. **Select a Vehicle**: Choose from the available vehicle IDs
+ 2. **Set Reading Count**: Specify how many recent readings to analyze (default: 200)
+ 3. **Run Diagnostic**: Click the diagnostic button to analyze
+ 4. **Review Results**: View anomaly detection, root cause analysis, and maintenance recommendations
+
+ ### 📈 Dataset
+
+ The system analyzes synthetic vehicle sensor data including:
+ - Engine temperature, RPM, speed
+ - Battery voltage and health
+ - Oil and fuel pressure
+ - Tire pressure (all four wheels)
+ - Vibration levels
+ - Coolant temperature
+ - And more...
+
+ ### 🔬 Technical Details
+
+ **Anomaly Detection Model:**
+ - Architecture: 2-layer LSTM with 64 hidden units
+ - Input: Sequences of 10 timesteps with 60 features
+ - Output: Binary classification (normal/anomaly)
+ - Training: 31,570 sequences on GPU
+
+ **Root Cause Analysis:**
+ - 8 fault pattern definitions
+ - Sensor correlation analysis
+ - Confidence scoring
+ - OBD-II fault code mapping (P-codes, C-codes)
+
+ ### 📝 License
+
+ MIT License - See the LICENSE file for details
+
+ ### 🔗 Links
+
+ - **GitHub**: [VehicleDiagnosticsAgent](https://github.com/saadmann18/VehicleDiagnosticsAgent)
+ - **Documentation**: Full project documentation is available in the repository
+
+ ### 👨‍💻 Author
+
+ Built with ❤️ by Saad Mann
+
+ ---
+
+ **Note**: This is a demonstration system using synthetic data. For production use with real vehicles, integration with actual OBD-II devices would be required.
app.py ADDED
@@ -0,0 +1,317 @@
+ """
+ Gradio UI for Vehicle Diagnostics Agent - Hugging Face Space
+ """
+ import gradio as gr
+ import sys
+ from pathlib import Path
+ import pandas as pd
+ import plotly.graph_objects as go
+ import os
+
+ # Add src directory to path
+ sys.path.append(str(Path(__file__).parent / 'src'))
+
+ from src.orchestrator import VehicleDiagnosticOrchestrator
+ from src.agents.data_ingestion_agent import DataIngestionAgent
+
+ # Initialize components
+ orchestrator = VehicleDiagnosticOrchestrator()
+ ingestion_agent = DataIngestionAgent()
+
+ # Load available vehicles
+ test_df = ingestion_agent.load_test_data()
+ available_vehicles = sorted(test_df['vehicle_id'].unique().tolist())
+
+
+ def run_diagnostic(vehicle_id, n_readings):
+     """Run diagnostic for a vehicle"""
+     try:
+         vehicle_id = int(vehicle_id)
+         n_readings = int(n_readings) if n_readings else None
+
+         # Run diagnostic
+         result = orchestrator.diagnose_vehicle(vehicle_id, n_readings)
+
+         if not result['success']:
+             return f"❌ Error: {result.get('error')}", "", "", None
+
+         # Extract results
+         anomaly_result = result.get('anomaly_result', {})
+         report = result.get('report', {})
+
+         # Status summary
+         if anomaly_result.get('anomaly_detected'):
+             status = f"""
+ ## 🚨 ALERT: Anomalies Detected
+
+ **Vehicle ID:** {vehicle_id}
+ **Anomaly Score:** {anomaly_result.get('overall_score', 0):.3f}
+ **Anomalous Readings:** {anomaly_result.get('num_anomalies', 0)} / {len(anomaly_result.get('anomaly_predictions', []))} ({anomaly_result.get('anomaly_rate', 0):.1%})
+ **Status:** ⚠️ Requires Attention
+ """
+         else:
+             status = f"""
+ ## ✅ Vehicle Healthy
+
+ **Vehicle ID:** {vehicle_id}
+ **Status:** 🟢 All Systems Normal
+ **Anomaly Score:** {anomaly_result.get('overall_score', 0):.3f}
+ """
+
+         # Natural language summary
+         nl_summary = report.get('natural_language_summary', 'No summary available')
+
+         # Full report
+         full_report = report.get('full_report', 'No report available')
+
+         # Create visualization
+         fig = create_anomaly_visualization(anomaly_result)
+
+         return status, nl_summary, full_report, fig
+
+     except Exception as e:
+         return f"❌ Error: {str(e)}", "", "", None
+
+
+ def create_anomaly_visualization(anomaly_result):
+     """Create visualization of anomaly detection results"""
+     try:
+         timestamps = anomaly_result.get('timestamps', [])
+         predictions = anomaly_result.get('anomaly_predictions', [])
+         scores = anomaly_result.get('anomaly_scores', [])
+
+         if len(timestamps) == 0:
+             return None
+
+         # Create figure with secondary y-axis
+         fig = go.Figure()
+
+         # Add anomaly predictions
+         fig.add_trace(go.Scatter(
+             x=timestamps,
+             y=predictions,
+             mode='lines',
+             name='Anomaly Detected',
+             line=dict(color='red', width=2),
+             fill='tozeroy',
+             fillcolor='rgba(255, 0, 0, 0.2)'
+         ))
+
+         # Add anomaly scores
+         fig.add_trace(go.Scatter(
+             x=timestamps,
+             y=scores,
+             mode='lines',
+             name='Anomaly Score',
+             line=dict(color='orange', width=1, dash='dot'),
+             yaxis='y2'
+         ))
+
+         # Update layout
+         fig.update_layout(
+             title='Anomaly Detection Over Time',
+             xaxis_title='Timestamp',
+             yaxis_title='Anomaly Detected (0/1)',
+             yaxis2=dict(
+                 title='Anomaly Score',
+                 overlaying='y',
+                 side='right'
+             ),
+             hovermode='x unified',
+             template='plotly_white',
+             height=400
+         )
+
+         return fig
+
+     except Exception as e:
+         print(f"Visualization error: {e}")
+         return None
+
+
+ def get_vehicle_info(vehicle_id):
+     """Get basic info about a vehicle"""
+     try:
+         vehicle_id = int(vehicle_id)
+         vehicle_data = test_df[test_df['vehicle_id'] == vehicle_id]
+
+         if len(vehicle_data) == 0:
+             return "Vehicle not found"
+
+         num_readings = len(vehicle_data)
+         has_anomalies = vehicle_data['anomaly'].sum() > 0
+         num_anomalies = vehicle_data['anomaly'].sum()
+
+         info = f"""
+ ### Vehicle Information
+
+ **Vehicle ID:** {vehicle_id}
+ **Total Readings:** {num_readings}
+ **Known Anomalies:** {num_anomalies} ({num_anomalies/num_readings:.1%})
+ **Status:** {'⚠️ Has anomalies' if has_anomalies else '✅ Healthy'}
+ """
+         return info
+
+     except Exception as e:
+         return f"Error: {str(e)}"
+
+
+ def list_vehicles_with_anomalies():
+     """List vehicles that have anomalies"""
+     vehicles_with_anomalies = []
+
+     for vid in available_vehicles[:50]:  # Limit to first 50
+         vehicle_data = test_df[test_df['vehicle_id'] == vid]
+         if vehicle_data['anomaly'].sum() > 0:
+             vehicles_with_anomalies.append({
+                 'Vehicle ID': vid,
+                 'Total Readings': len(vehicle_data),
+                 'Anomalies': int(vehicle_data['anomaly'].sum()),
+                 'Anomaly Rate': f"{vehicle_data['anomaly'].sum()/len(vehicle_data):.1%}"
+             })
+
+     if vehicles_with_anomalies:
+         df = pd.DataFrame(vehicles_with_anomalies)
+         return df
+     else:
+         return pd.DataFrame({'Message': ['No vehicles with anomalies found']})
+
+
+ # Create Gradio interface
+ with gr.Blocks(title="Vehicle Diagnostics Agent", theme=gr.themes.Soft()) as demo:
+     gr.Markdown("""
+     # 🚗 Vehicle Diagnostics Agent
+     ### Multi-Agent AI System for Predictive Vehicle Diagnostics
+
+     This system uses advanced AI agents to analyze vehicle sensor data, detect anomalies,
+     identify root causes, and provide actionable maintenance recommendations.
+
+     **Powered by:** LSTM Neural Networks, LangGraph Multi-Agent Orchestration, PyTorch
+     """)
+
+     with gr.Tab("🔍 Single Vehicle Diagnostic"):
+         gr.Markdown("### Analyze a single vehicle")
+
+         with gr.Row():
+             with gr.Column(scale=1):
197
+ vehicle_id_input = gr.Dropdown(
198
+ choices=available_vehicles,
199
+ label="Select Vehicle ID",
200
+ value=available_vehicles[0] if available_vehicles else None
201
+ )
202
+ n_readings_input = gr.Number(
203
+ label="Number of Recent Readings (optional)",
204
+ value=200,
205
+ precision=0
206
+ )
207
+
208
+ diagnose_btn = gr.Button("🔬 Run Diagnostic", variant="primary", size="lg")
209
+
210
+ gr.Markdown("---")
211
+ vehicle_info_output = gr.Markdown(label="Vehicle Info")
212
+
213
+ # Auto-update vehicle info when selection changes
214
+ vehicle_id_input.change(
215
+ fn=get_vehicle_info,
216
+ inputs=[vehicle_id_input],
217
+ outputs=[vehicle_info_output]
218
+ )
219
+
220
+ with gr.Column(scale=2):
221
+ status_output = gr.Markdown(label="Diagnostic Status")
222
+ summary_output = gr.Textbox(
223
+ label="📋 Summary",
224
+ lines=5,
225
+ max_lines=10
226
+ )
227
+
228
+ with gr.Row():
229
+ anomaly_plot = gr.Plot(label="Anomaly Detection Visualization")
230
+
231
+ with gr.Row():
232
+ full_report_output = gr.Textbox(
233
+ label="📄 Full Diagnostic Report",
234
+ lines=20,
235
+ max_lines=30
236
+ )
237
+
238
+ diagnose_btn.click(
239
+ fn=run_diagnostic,
240
+ inputs=[vehicle_id_input, n_readings_input],
241
+ outputs=[status_output, summary_output, full_report_output, anomaly_plot]
242
+ )
243
+
244
+ with gr.Tab("📊 Vehicle Overview"):
245
+ gr.Markdown("### Vehicles with Known Anomalies")
246
+
247
+ refresh_btn = gr.Button("🔄 Refresh List", variant="secondary")
248
+ vehicles_table = gr.Dataframe(
249
+ value=list_vehicles_with_anomalies(),
250
+ label="Vehicles Requiring Attention"
251
+ )
252
+
253
+ refresh_btn.click(
254
+ fn=list_vehicles_with_anomalies,
255
+ inputs=[],
256
+ outputs=[vehicles_table]
257
+ )
258
+
259
+ with gr.Tab("ℹ️ About"):
260
+ gr.Markdown("""
261
+ ## About Vehicle Diagnostics Agent
262
+
263
+ ### System Architecture
264
+
265
+ This system employs a multi-agent architecture with the following components:
266
+
267
+ 1. **Data Ingestion Agent** - Loads and prepares vehicle sensor data
268
+ 2. **Anomaly Detection Agent** - Uses LSTM neural networks to detect unusual patterns (99.53% accuracy)
269
+ 3. **Root Cause Analysis Agent** - Identifies the underlying causes of anomalies
270
+ 4. **Maintenance Recommendation Agent** - Provides actionable maintenance steps with cost estimates
271
+ 5. **Report Generation Agent** - Creates comprehensive diagnostic reports
272
+
273
+ ### Technology Stack
274
+
275
+ - **ML Framework:** PyTorch (LSTM-based anomaly detection)
276
+ - **Orchestration:** LangGraph for multi-agent coordination
277
+ - **Backend:** FastAPI for REST API
278
+ - **Frontend:** Gradio for interactive UI
279
+ - **Data Processing:** Pandas, NumPy, Scikit-learn
280
+
281
+ ### Features
282
+
283
+ - ✅ Real-time anomaly detection with 99.53% validation accuracy
284
+ - ✅ Root cause analysis with OBD-II fault code mapping
285
+ - ✅ Maintenance cost estimation
286
+ - ✅ Natural language summaries for non-technical users
287
+ - ✅ Interactive visualizations
288
+ - ✅ Batch processing support
289
+
290
+ ### Dataset
291
+
292
+ The system analyzes synthetic vehicle sensor data including:
293
+ - Engine temperature, RPM, speed
294
+ - Battery voltage and health
295
+ - Oil and fuel pressure
296
+ - Tire pressure (all four wheels)
297
+ - Vibration levels
298
+ - Coolant temperature
299
+ - And more...
300
+
301
+ ### Model Performance
302
+
303
+ - **Validation Accuracy:** 99.53%
304
+ - **Training Loss:** 0.0003 (final epoch)
305
+ - **Validation Loss:** 0.0409 (best)
306
+ - **Dataset:** 50,000 records from 100 vehicles
307
+
308
+ ---
309
+
310
+ **Version:** 1.0.0
311
+ **GitHub:** [VehicleDiagnosticsAgent](https://github.com/saadmann18/VehicleDiagnosticsAgent)
312
+ **License:** MIT
313
+ """)
314
+
315
+ # Launch the app
316
+ if __name__ == "__main__":
317
+ demo.launch()
data/processed/feature_columns.pkl ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3dd5c80aee68f6de2426fae0d6f25fe92f00fbf664e6ea4cf139ee4457d875d6
+size 1139
data/processed/scaler.pkl ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dac94ac6c97bc4347eca4006e1a64dd4b45d37e5064ff02764021e56fcc56b45
+size 3109
data/processed/test.csv ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d78c024b918c94ca426bdc144dd626e8c18eb1d8ac1d593138e38cd165ead6d0
+size 11889740
data/processed/train.csv ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4f8a7f1b6069b3ebf32c3c39a4a53a31a18d1e0c194b229b3a5ec4eca4c4d3d3
+size 41604295
data/processed/val.csv ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:260b732891b2ebafe5afd9f55ebbb8726793cd920ff34e88f5f117f2df897673
+size 5946877
data/raw/vehicle_sensor_data.csv ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f56029e4c173823eb4548c98e12d42cde6b457e09621257962cfd5b7d36b538e
+size 13304420
demo.py ADDED
@@ -0,0 +1,138 @@
+#!/usr/bin/env python3
+"""
+Quick Demo Script for Vehicle Diagnostics Agent
+Demonstrates the complete diagnostic workflow
+"""
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent / 'src'))
+
+from orchestrator import VehicleDiagnosticOrchestrator
+from agents.data_ingestion_agent import DataIngestionAgent
+
+
+def main():
+    print("\n" + "="*70)
+    print("🚗 VEHICLE DIAGNOSTICS AGENT - DEMO")
+    print("="*70 + "\n")
+
+    # Initialize
+    print("Initializing system...")
+    orchestrator = VehicleDiagnosticOrchestrator()
+    ingestion_agent = DataIngestionAgent()
+
+    # Load test data
+    print("Loading test data...")
+    test_df = ingestion_agent.load_test_data()
+
+    # Find vehicles with anomalies
+    print("\nFinding vehicles with anomalies...")
+    vehicles_with_anomalies = []
+    for vid in test_df['vehicle_id'].unique()[:20]:
+        vehicle_data = test_df[test_df['vehicle_id'] == vid]
+        if vehicle_data['anomaly'].sum() > 0:
+            vehicles_with_anomalies.append({
+                'id': vid,
+                'anomaly_count': int(vehicle_data['anomaly'].sum()),
+                'total_readings': len(vehicle_data)
+            })
+
+    print(f"✓ Found {len(vehicles_with_anomalies)} vehicles with anomalies\n")
+
+    # Select a vehicle for demo
+    if vehicles_with_anomalies:
+        demo_vehicle = vehicles_with_anomalies[0]
+        vehicle_id = demo_vehicle['id']
+
+        print(f"Demo Vehicle: {vehicle_id}")
+        print(f"  - Total readings: {demo_vehicle['total_readings']}")
+        print(f"  - Known anomalies: {demo_vehicle['anomaly_count']}")
+        print(f"  - Anomaly rate: {demo_vehicle['anomaly_count']/demo_vehicle['total_readings']:.1%}")
+        print("\n" + "-"*70 + "\n")
+
+        # Run diagnostic
+        print(f"Running complete diagnostic workflow for Vehicle {vehicle_id}...\n")
+        result = orchestrator.diagnose_vehicle(vehicle_id, n_readings=200)
+
+        if result['success']:
+            print("\n" + "="*70)
+            print("📊 DIAGNOSTIC RESULTS")
+            print("="*70 + "\n")
+
+            # Anomaly Detection Results
+            anomaly_result = result['anomaly_result']
+            print("🔍 ANOMALY DETECTION:")
+            print(f"  ✓ Anomaly Detected: {'YES ⚠️' if anomaly_result['anomaly_detected'] else 'NO ✅'}")
+            print(f"  ✓ Overall Score: {anomaly_result['overall_score']:.3f}")
+            print(f"  ✓ Anomalous Readings: {anomaly_result['num_anomalies']}/{len(anomaly_result['anomaly_predictions'])} ({anomaly_result['anomaly_rate']:.1%})")
+            print(f"  ✓ Affected Sensors: {len(anomaly_result['anomalous_sensors'])}")
+
+            # Root Cause Analysis
+            root_cause_result = result['root_cause_result']
+            print(f"\n🔬 ROOT CAUSE ANALYSIS:")
+            print(f"  ✓ Root Causes Identified: {len(root_cause_result['root_causes'])}")
+
+            if root_cause_result['primary_cause']:
+                primary = root_cause_result['primary_cause']
+                print(f"\n  PRIMARY ISSUE:")
+                print(f"    • Fault: {primary['fault_name'].replace('_', ' ').title()}")
+                print(f"    • Description: {primary['description']}")
+                print(f"    • Severity: {primary['severity'].upper()}")
+                print(f"    • Confidence: {primary['confidence']:.0%}")
+                print(f"    • Fault Codes: {', '.join(primary['fault_codes'])}")
+
+            # Maintenance Recommendations
+            maintenance_result = result['maintenance_result']
+            print(f"\n🔧 MAINTENANCE RECOMMENDATIONS:")
+            print(f"  ✓ Total Items: {len(maintenance_result['recommendations'])}")
+            print(f"  ✓ Estimated Cost: {maintenance_result['total_cost']['cost_range']}")
+            print(f"  ✓ Immediate Actions: {len(maintenance_result['action_plan']['immediate'])}")
+
+            if maintenance_result['top_priority']:
+                top = maintenance_result['top_priority']
+                print(f"\n  TOP PRIORITY:")
+                print(f"    • Urgency: {top['urgency'].upper()}")
+                print(f"    • Cost: {top['estimated_cost']}")
+                print(f"    • Downtime: {top['estimated_downtime']}")
+
+            # Natural Language Summary
+            report = result['report']
+            print(f"\n📋 SUMMARY FOR VEHICLE OWNER:")
+            print("-"*70)
+            print(report['natural_language_summary'])
+            print("-"*70)
+
+            # Save report
+            report_file = f"vehicle_{vehicle_id}_report.txt"
+            with open(report_file, 'w') as f:
+                f.write(report['full_report'])
+            print(f"\n✓ Full report saved to: {report_file}")
+
+        else:
+            print(f"\n❌ Diagnostic failed: {result.get('error')}")
+
+    else:
+        print("No vehicles with anomalies found in test set.")
+        print("Running diagnostic on first available vehicle...")
+
+        vehicle_id = test_df['vehicle_id'].iloc[0]
+        result = orchestrator.diagnose_vehicle(vehicle_id, n_readings=100)
+
+        if result['success']:
+            print(f"\n✅ Vehicle {vehicle_id} is healthy!")
+            print(result['report']['natural_language_summary'])
+
+    print("\n" + "="*70)
+    print("DEMO COMPLETED")
+    print("="*70)
+    print("\nNext steps:")
+    print("  • Run Gradio UI: ./run_ui.sh")
+    print("  • Run FastAPI: ./run_api.sh")
+    print("  • Run tests: pytest tests/ -v")
+    print("  • Deploy with Docker: docker-compose up --build")
+    print("\n")
+
+
+if __name__ == '__main__':
+    main()
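Aside: the per-vehicle scan in `demo.py` above (filter one vehicle, sum its `anomaly` column, compute a rate) can be expressed as a single pandas `groupby`. A minimal sketch — the column names `vehicle_id`/`anomaly` match the script, but the data below is a toy stand-in, not the project's dataset:

```python
import pandas as pd

# Toy stand-in for test_df: two vehicles, one with anomalies
test_df = pd.DataFrame({
    'vehicle_id': [1, 1, 1, 2, 2],
    'anomaly':    [0, 1, 1, 0, 0],
})

# Per-vehicle totals and anomaly counts in one pass
summary = test_df.groupby('vehicle_id')['anomaly'].agg(
    total_readings='size', anomaly_count='sum'
).reset_index()
summary['anomaly_rate'] = summary['anomaly_count'] / summary['total_readings']

# Vehicles with at least one known anomaly
flagged = summary[summary['anomaly_count'] > 0]
print(flagged)
```

Compared with the explicit loop, this reads the frame once and avoids repeated boolean filtering per vehicle.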
docker-compose.yml ADDED
@@ -0,0 +1,40 @@
+version: '3.8'
+
+services:
+  # FastAPI Backend
+  api:
+    build: .
+    container_name: vda-api
+    ports:
+      - "8000:8000"
+    volumes:
+      - ./src:/app/src
+      - ./data:/app/data
+    environment:
+      - PYTHONUNBUFFERED=1
+    command: uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload
+    restart: unless-stopped
+    networks:
+      - vda-network
+
+  # Gradio Frontend
+  ui:
+    build: .
+    container_name: vda-ui
+    ports:
+      - "7860:7860"
+    volumes:
+      - ./src:/app/src
+      - ./data:/app/data
+    environment:
+      - PYTHONUNBUFFERED=1
+    command: python src/ui/gradio_app.py
+    restart: unless-stopped
+    networks:
+      - vda-network
+    depends_on:
+      - api
+
+networks:
+  vda-network:
+    driver: bridge
project.md ADDED
@@ -0,0 +1,231 @@
+Step-by-step explanation of how to accomplish the **Vehicle Diagnostics Agent** project end-to-end:
+
+## Vehicle Diagnostics Agent Project: Detailed Implementation Plan
+
+### Phase 1: Project Setup and Planning
+
+1. **Define Project Goals and Scope**
+   - Build a multi-agent AI system for predictive vehicle diagnostics.
+   - Agents will collaboratively analyze sensor data to detect anomalies, identify causes, recommend maintenance, and generate reports.
+   - Use realistic automotive sensor data (real or simulated).
+   - Demonstrate production-readiness with a FastAPI backend and Gradio interface.
+
+2. **Select Tools and Frameworks**
+   - LangChain and LangGraph for multi-agent orchestration.
+   - Python for logic implementation.
+   - PyTorch/TensorFlow for any ML model development.
+   - FastAPI for service endpoints.
+   - Gradio for a user-friendly interface.
+   - Docker for containerization.
+
+3. **Gather Data**
+   - Use open datasets such as the NASA Prognostics repository or Udacity self-driving car datasets, or simulate vehicle telemetry in CARLA and inject anomalies.
+
+### Phase 2: Data Collection and Preprocessing
+
+1. **Acquire Vehicle Sensor Data**
+   - Collect time-series data such as engine temperature, speed, RPM, battery voltage, brake status, etc.
+   - For supervised learning, acquire or generate corresponding anomaly/fault labels.
+
+2. **Clean and Process Data**
+   - Implement filtering to reduce noise (e.g., moving average, Kalman filtering).
+   - Normalize and synchronize sensor streams.
+   - Extract meaningful statistical and domain-specific features.
+
+3. **Split Data**
+   - Partition into training, validation, and testing datasets.
+
+### Phase 3: Build Individual Agents
+
+1. **Data Ingestion Agent**
+   - Load or stream sensor data into the system.
+   - Prepare data for downstream agents.
+
+2. **Anomaly Detection Agent**
+   - Train and deploy ML models (e.g., LSTM, CNN) to detect unusual sensor patterns.
+   - Use thresholding or probabilistic models for anomaly scoring.
+
+3. **Root Cause Analysis Agent**
+   - Implement rule-based or ML models to infer possible causes of anomalies by correlating sensor data patterns.
+   - Integrate domain knowledge (e.g., engine fault code mapping).
+
+4. **Maintenance Recommendation Agent**
+   - Map root causes to actionable maintenance steps or alerts.
+   - Prioritize actions based on severity and impact.
+
+5. **Report Generation Agent**
+   - Compile diagnostic summaries into clear reports for users/operators.
+   - Generate natural-language summaries.
+
+### Phase 4: Agent Orchestration and Workflow
+
+1. **Design Communication Protocol**
+   - Define how agents exchange information (inputs/outputs).
+   - Implement context/memory sharing to maintain state across steps.
+
+2. **Implement Multi-Agent Orchestration**
+   - Use LangChain to manage sequential and parallel task execution among agents.
+   - Define orchestration logic to call agents in order (Data Ingestion → Anomaly Detection → Root Cause → Recommendation → Report).
+
+3. **Add Error Handling and Recovery**
+   - Establish retry/fallback rules in case of agent failures or inconsistent data.
+
+### Phase 5: Backend and Frontend Development
+
+1. **FastAPI Service**
+   - Develop API endpoints for triggering diagnostics, retrieving reports, and health checks.
+   - Handle concurrent user requests.
+
+2. **Gradio-based UI**
+   - Build an interactive dashboard for users to input vehicle IDs and view diagnostic reports.
+   - Visualize detected anomalies and recommended actions.
+
+### Phase 6: Deployment and Monitoring
+
+1. **Containerization**
+   - Create Docker images for backend and frontend.
+   - Use Docker Compose for service orchestration.
+
+2. **Deployment**
+   - Deploy locally or on cloud (AWS, Azure).
+   - Configure environment variables and API keys securely.
+
+3. **Observability**
+   - Add logging and monitoring for system performance and errors.
+   - Use LangSmith or other tracing tools to instrument agent workflows.
+
+### Phase 7: Testing and Validation
+
+1. **Unit Testing**
+   - Write tests for each agent's logic.
+   - Validate correct anomaly detection and recommendations.
+
+2. **Integration Testing**
+   - Verify multi-agent orchestration flows end-to-end.
+   - Simulate vehicle scenarios including anomalies.
+
+3. **User Acceptance Testing**
+   - Gather feedback on Gradio interface usability and report clarity.
+
+### Phase 8: Documentation and Presentation
+
+1. **Write a Comprehensive README**
+   - Explain project goals, architecture, and how to run and extend the system.
+   - Include example data and a system diagram.
+
+2. **Prepare Demo and Presentation**
+   - Showcase live diagnostics on sample data.
+   - Highlight the modular design and agent collaboration.
+
+## Tasks to accomplish
+
+| Milestone | Tasks |
+| --- | --- |
+| 1 | Data collection and preprocessing; build Data Ingestion & Anomaly Agents |
+| 2 | Build Root Cause, Recommendation, Report Agents; implement LangChain orchestration |
+| 3 | Backend (FastAPI), frontend (Gradio), deployment, testing, documentation |
+
+## Skills demonstrated
+
+- Multi-agent AI system design and orchestration
+- Production-grade ML pipeline development
+- Cross-functional, safety-critical domain knowledge
+- Full-stack deployment and user interface
+- Strong data engineering and AI validation skills
+
+This project will serve as a flagship portfolio piece, demonstrating how agentic AI can be applied to automotive challenges.
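The orchestration order described in Phase 4 (Data Ingestion → Anomaly Detection → Root Cause → Recommendation → Report) can be sketched as a plain-Python pipeline over shared state. This is a hypothetical stand-in for the LangGraph wiring, with stub agents and made-up state keys, not the project's real implementation:

```python
class Agent:
    """Stub agent: reads the shared state, writes its output under its name."""

    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def run(self, state):
        state[self.name] = self.fn(state)
        return state


# Sequential pipeline mirroring the Phase 4 order
pipeline = [
    Agent('ingestion', lambda s: {'readings': s['raw']}),
    Agent('anomaly', lambda s: {'count': sum(s['ingestion']['readings'])}),
    Agent('root_cause', lambda s: 'sensor drift' if s['anomaly']['count'] else None),
    Agent('recommendation', lambda s: 'inspect sensor' if s['root_cause'] else 'no action'),
    Agent('report', lambda s: f"anomalies={s['anomaly']['count']}, action={s['recommendation']}"),
]

state = {'raw': [0, 1, 1]}  # toy binary anomaly flags
for agent in pipeline:
    state = agent.run(state)
print(state['report'])
```

The same shared-state pattern is what a LangGraph state graph formalizes; retry/fallback rules (Phase 4, step 3) would wrap each `run` call.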
requirements.txt ADDED
@@ -0,0 +1,40 @@
+# Core ML and Data Processing
+numpy>=1.24.0
+pandas>=2.0.0
+scikit-learn>=1.3.0
+scipy>=1.11.0
+
+# Deep Learning
+torch>=2.0.0
+torchvision>=0.15.0
+
+# LangChain and Agent Orchestration
+langchain>=0.1.0
+langchain-community>=0.0.10
+langgraph>=0.0.20
+langchain-openai>=0.0.5
+
+# API and Web Framework
+fastapi>=0.104.0
+uvicorn[standard]>=0.24.0
+pydantic>=2.0.0
+python-multipart>=0.0.6
+
+# UI
+gradio>=4.0.0
+
+# Data Visualization
+matplotlib>=3.7.0
+seaborn>=0.12.0
+plotly>=5.17.0
+
+# Utilities
+python-dotenv>=1.0.0
+pyyaml>=6.0
+requests>=2.31.0
+tqdm>=4.66.0
+
+# Testing
+pytest>=7.4.0
+pytest-asyncio>=0.21.0
+httpx>=0.25.0
run_api.sh ADDED
@@ -0,0 +1,37 @@
+#!/bin/bash
+# Quick start script for Vehicle Diagnostics Agent API
+
+echo "=========================================="
+echo "Vehicle Diagnostics Agent - FastAPI"
+echo "=========================================="
+echo ""
+
+# Check if conda environment is activated
+if [[ "$CONDA_DEFAULT_ENV" != "vda" ]]; then
+    echo "⚠️  Warning: vda conda environment not activated"
+    echo "Please run: conda activate vda"
+    echo ""
+fi
+
+# Check if model exists
+if [ ! -f "src/models/best_anomaly_detector.pth" ]; then
+    echo "❌ Model not found. Please train the model first:"
+    echo "   python src/models/train_anomaly_detector.py"
+    exit 1
+fi
+
+# Check if data exists
+if [ ! -f "data/processed/test.csv" ]; then
+    echo "❌ Processed data not found. Please run preprocessing:"
+    echo "   python src/utils/data_preprocessing.py"
+    exit 1
+fi
+
+echo "✅ Starting FastAPI server..."
+echo "   API: http://localhost:8000"
+echo "   Docs: http://localhost:8000/docs"
+echo ""
+echo "Press Ctrl+C to stop"
+echo ""
+
+uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload
run_ui.sh ADDED
@@ -0,0 +1,36 @@
+#!/bin/bash
+# Quick start script for Vehicle Diagnostics Agent UI
+
+echo "=========================================="
+echo "Vehicle Diagnostics Agent - Gradio UI"
+echo "=========================================="
+echo ""
+
+# Check if conda environment is activated
+if [[ "$CONDA_DEFAULT_ENV" != "vda" ]]; then
+    echo "⚠️  Warning: vda conda environment not activated"
+    echo "Please run: conda activate vda"
+    echo ""
+fi
+
+# Check if model exists
+if [ ! -f "src/models/best_anomaly_detector.pth" ]; then
+    echo "❌ Model not found. Please train the model first:"
+    echo "   python src/models/train_anomaly_detector.py"
+    exit 1
+fi
+
+# Check if data exists
+if [ ! -f "data/processed/test.csv" ]; then
+    echo "❌ Processed data not found. Please run preprocessing:"
+    echo "   python src/utils/data_preprocessing.py"
+    exit 1
+fi
+
+echo "✅ Starting Gradio UI..."
+echo "   Access at: http://localhost:7860"
+echo ""
+echo "Press Ctrl+C to stop"
+echo ""
+
+python src/ui/gradio_app.py
src/agents/anomaly_detection_agent.py ADDED
@@ -0,0 +1,251 @@
+"""
+Anomaly Detection Agent - Detects unusual patterns in sensor data
+"""
+import numpy as np
+import sys
+from pathlib import Path
+
+# Add parent directory to path
+sys.path.append(str(Path(__file__).parent.parent))
+
+from models.anomaly_detector import AnomalyDetectionModel
+from typing import Dict, List, Tuple
+
+
+class AnomalyDetectionAgent:
+    """
+    Agent responsible for detecting anomalies in vehicle sensor data
+    """
+
+    def __init__(self, model_path='src/models/best_anomaly_detector.pth', threshold=0.5):
+        self.model_path = Path(model_path)
+        self.threshold = threshold
+        self.model = None
+        self._load_model()
+
+    def _load_model(self):
+        """Load the trained anomaly detection model"""
+        if self.model_path.exists():
+            # Get input size from model file
+            import torch
+            checkpoint = torch.load(self.model_path, map_location='cpu')
+            input_size = checkpoint['input_size']
+            sequence_length = checkpoint['sequence_length']
+
+            self.model = AnomalyDetectionModel(input_size, sequence_length)
+            self.model.load(self.model_path)
+            print(f"✓ Loaded anomaly detection model from {self.model_path}")
+        else:
+            print(f"⚠ Model not found at {self.model_path}. Using rule-based detection.")
+            self.model = None
+
+    def detect_anomalies_ml(self, features: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
+        """
+        Detect anomalies using ML model
+
+        Args:
+            features: Feature array of shape (n_samples, n_features)
+
+        Returns:
+            Tuple of (anomaly_scores, anomaly_predictions)
+        """
+        if self.model is None:
+            raise ValueError("ML model not loaded")
+
+        scores, predictions = self.model.predict(features)
+        return scores, predictions
+
+    def detect_anomalies_rules(self, raw_data) -> np.ndarray:
+        """
+        Detect anomalies using rule-based approach (fallback)
+
+        Args:
+            raw_data: DataFrame with raw sensor data
+
+        Returns:
+            Array of anomaly predictions
+        """
+        anomalies = np.zeros(len(raw_data), dtype=int)
+
+        # Rule 1: Engine overheating
+        if 'engine_temp' in raw_data.columns:
+            anomalies |= (raw_data['engine_temp'] > 2.0).astype(int)  # Normalized threshold
+
+        # Rule 2: Low oil pressure
+        if 'oil_pressure' in raw_data.columns:
+            anomalies |= (raw_data['oil_pressure'] < -1.5).astype(int)
+
+        # Rule 3: Battery issues
+        if 'battery_voltage' in raw_data.columns:
+            anomalies |= (raw_data['battery_voltage'] < -1.0).astype(int)
+
+        # Rule 4: High vibration
+        if 'vibration_level' in raw_data.columns:
+            anomalies |= (raw_data['vibration_level'] > 2.0).astype(int)
+
+        # Rule 5: Tire pressure issues
+        tire_cols = [col for col in raw_data.columns if 'tire_pressure' in col]
+        if tire_cols:
+            for col in tire_cols:
+                anomalies |= (raw_data[col] < -1.5).astype(int)
+
+        return anomalies
+
+    def identify_anomalous_sensors(self, raw_data, anomaly_indices: List[int]) -> Dict:
+        """
+        Identify which sensors are showing anomalous behavior
+
+        Args:
+            raw_data: DataFrame with raw sensor data
+            anomaly_indices: Indices where anomalies were detected
+
+        Returns:
+            Dictionary mapping sensor names to anomaly information
+        """
+        if len(anomaly_indices) == 0:
+            return {}
+
+        anomalous_data = raw_data.iloc[anomaly_indices]
+
+        sensor_cols = [col for col in raw_data.columns
+                       if col not in ['vehicle_id', 'timestamp', 'anomaly']]
+
+        anomalous_sensors = {}
+
+        for col in sensor_cols:
+            # Check if this sensor shows unusual values
+            overall_mean = raw_data[col].mean()
+            overall_std = raw_data[col].std()
+
+            anomaly_mean = anomalous_data[col].mean()
+
+            # If anomaly mean is more than 2 std away from overall mean
+            if abs(anomaly_mean - overall_mean) > 2 * overall_std:
+                anomalous_sensors[col] = {
+                    'overall_mean': float(overall_mean),
+                    'anomaly_mean': float(anomaly_mean),
+                    'deviation': float(abs(anomaly_mean - overall_mean) / overall_std),
+                    'severity': 'high' if abs(anomaly_mean - overall_mean) > 3 * overall_std else 'medium'
+                }
+
+        return anomalous_sensors
+
+    def calculate_anomaly_score(self, predictions: np.ndarray, scores: np.ndarray = None) -> float:
+        """
+        Calculate overall anomaly score for the vehicle
+
+        Args:
+            predictions: Binary anomaly predictions
+            scores: Optional continuous anomaly scores
+
+        Returns:
+            Overall anomaly score (0-1)
+        """
+        if scores is not None:
+            return float(np.mean(scores))
+        else:
+            return float(np.mean(predictions))
+
+    def run(self, prepared_data: Dict) -> Dict:
+        """
+        Main execution method for the Anomaly Detection Agent
+
+        Args:
+            prepared_data: Data prepared by Data Ingestion Agent
+
+        Returns:
+            Dictionary containing anomaly detection results
+        """
+        print(f"\n{'='*60}")
+        print(f"ANOMALY DETECTION AGENT - Vehicle {prepared_data['vehicle_id']}")
+        print(f"{'='*60}")
+
+        features = prepared_data['features']
+        raw_data = prepared_data['raw_data']
+
+        # Detect anomalies
+        if self.model is not None:
+            print("Using ML-based anomaly detection...")
+            scores, predictions = self.detect_anomalies_ml(features)
+
+            # Pad predictions to match original length
+            padded_predictions = np.zeros(len(raw_data), dtype=int)
+            padded_predictions[-len(predictions):] = predictions
+
+            padded_scores = np.zeros(len(raw_data))
+            padded_scores[-len(scores):] = scores
+        else:
+            print("Using rule-based anomaly detection...")
+            padded_predictions = self.detect_anomalies_rules(raw_data)
+            padded_scores = padded_predictions.astype(float)
+
+        # Find anomaly indices
+        anomaly_indices = np.where(padded_predictions == 1)[0].tolist()
+        num_anomalies = len(anomaly_indices)
+
+        print(f"✓ Detected {num_anomalies} anomalous readings out of {len(raw_data)}")
+        print(f"  Anomaly rate: {num_anomalies/len(raw_data):.2%}")
+
+        # Calculate overall anomaly score
+        overall_score = self.calculate_anomaly_score(padded_predictions, padded_scores)
+        print(f"  Overall anomaly score: {overall_score:.3f}")
+
+        # Identify anomalous sensors
+        anomalous_sensors = {}
+        if num_anomalies > 0:
+            anomalous_sensors = self.identify_anomalous_sensors(raw_data, anomaly_indices)
+            print(f"✓ Identified {len(anomalous_sensors)} sensors with anomalous behavior")
+
+            if anomalous_sensors:
+                print("  Top anomalous sensors:")
+                sorted_sensors = sorted(anomalous_sensors.items(),
+                                        key=lambda x: x[1]['deviation'],
+                                        reverse=True)
+                for sensor, info in sorted_sensors[:3]:
+                    print(f"    - {sensor}: {info['severity']} severity (deviation: {info['deviation']:.2f}σ)")
+
+        # Compare with ground truth if available
+        if prepared_data['ground_truth'] is not None:
+            ground_truth = prepared_data['ground_truth']
+            accuracy = (padded_predictions == ground_truth).mean()
+            print(f"  Accuracy vs ground truth: {accuracy:.2%}")
+
+        print(f"{'='*60}\n")
+
+        result = {
+            'vehicle_id': prepared_data['vehicle_id'],
+            'anomaly_detected': num_anomalies > 0,
+            'num_anomalies': num_anomalies,
+            'anomaly_rate': num_anomalies / len(raw_data),
+            'overall_score': overall_score,
+            'anomaly_indices': anomaly_indices,
+            'anomaly_predictions': padded_predictions,
+            'anomaly_scores': padded_scores,
+            'anomalous_sensors': anomalous_sensors,
+            'timestamps': prepared_data['timestamps'],
+            'raw_data': raw_data
+        }
+
+        return result
+
+
+if __name__ == '__main__':
+    # Test the Anomaly Detection Agent
+    from data_ingestion_agent import DataIngestionAgent
+
+    # Load data
+    ingestion_agent = DataIngestionAgent()
+    test_df = ingestion_agent.load_test_data()
+    test_vehicle_id = test_df['vehicle_id'].iloc[0]
+
+    # Prepare data
+    prepared_data = ingestion_agent.run(test_vehicle_id, n_readings=200)
+
+    # Detect anomalies
+    detection_agent = AnomalyDetectionAgent()
+    result = detection_agent.run(prepared_data)
+
+    print(f"\nAnomaly Detection Summary:")
+    print(f"  Anomalies detected: {result['anomaly_detected']}")
+    print(f"  Overall score: {result['overall_score']:.3f}")
+    print(f"  Anomalous sensors: {len(result['anomalous_sensors'])}")
src/agents/data_ingestion_agent.py ADDED
@@ -0,0 +1,193 @@
+"""
+Data Ingestion Agent - Loads and prepares sensor data for analysis
+"""
+import pandas as pd
+import numpy as np
+from pathlib import Path
+import pickle
+from typing import Dict, List, Optional
+
+
+class DataIngestionAgent:
+    """
+    Agent responsible for loading and preparing vehicle sensor data
+    """
+
+    def __init__(self, data_dir='data/processed'):
+        self.data_dir = Path(data_dir)
+        self.scaler = None
+        self.feature_columns = None
+        self._load_preprocessing_artifacts()
+
+    def _load_preprocessing_artifacts(self):
+        """Load scaler and feature columns"""
+        scaler_path = self.data_dir / 'scaler.pkl'
+        features_path = self.data_dir / 'feature_columns.pkl'
+
+        if scaler_path.exists():
+            with open(scaler_path, 'rb') as f:
+                self.scaler = pickle.load(f)
+
+        if features_path.exists():
+            with open(features_path, 'rb') as f:
+                self.feature_columns = pickle.load(f)
+
+    def load_test_data(self) -> pd.DataFrame:
+        """Load test dataset"""
+        test_path = self.data_dir / 'test.csv'
+        if not test_path.exists():
+            raise FileNotFoundError(f"Test data not found at {test_path}")
+
+        df = pd.read_csv(test_path)
+        return df
+
+    def get_vehicle_data(self, vehicle_id: int, df: Optional[pd.DataFrame] = None) -> pd.DataFrame:
+        """
+        Get sensor data for a specific vehicle
+
+        Args:
+            vehicle_id: ID of the vehicle
+            df: Optional dataframe to filter from, otherwise loads test data
+
+        Returns:
+            DataFrame with vehicle sensor data
+        """
+        if df is None:
+            df = self.load_test_data()
+
+        vehicle_data = df[df['vehicle_id'] == vehicle_id].copy()
+
+        if len(vehicle_data) == 0:
+            raise ValueError(f"No data found for vehicle_id {vehicle_id}")
+
+        return vehicle_data
+
+    def get_latest_readings(self, vehicle_id: int, n_readings: int = 50) -> pd.DataFrame:
+        """
+        Get the latest N sensor readings for a vehicle
+
+        Args:
+            vehicle_id: ID of the vehicle
+            n_readings: Number of recent readings to retrieve
+
+        Returns:
+            DataFrame with latest sensor readings
+        """
+        vehicle_data = self.get_vehicle_data(vehicle_id)
+        latest_data = vehicle_data.tail(n_readings)
+        return latest_data
+
+    def prepare_for_analysis(self, vehicle_data: pd.DataFrame) -> Dict:
+        """
+        Prepare vehicle data for downstream agents
+
+        Args:
+            vehicle_data: Raw vehicle sensor data
+
+        Returns:
+            Dictionary containing prepared data and metadata
+        """
+        vehicle_id = vehicle_data['vehicle_id'].iloc[0]
+
+        # Extract features
+        if self.feature_columns:
+            features = vehicle_data[self.feature_columns].values
+        else:
+            # Fallback: use all numeric columns except metadata
+            exclude_cols = ['vehicle_id', 'timestamp', 'anomaly']
+            feature_cols = [col for col in vehicle_data.columns if col not in exclude_cols]
+            features = vehicle_data[feature_cols].values
+
+        # Get ground truth if available
+        ground_truth = vehicle_data['anomaly'].values if 'anomaly' in vehicle_data.columns else None
+
+        prepared_data = {
+            'vehicle_id': vehicle_id,
+            'features': features,
+            'feature_names': self.feature_columns if self.feature_columns else feature_cols,
+            'timestamps': vehicle_data['timestamp'].values,
+            'raw_data': vehicle_data,
+            'ground_truth': ground_truth,
+            'num_readings': len(vehicle_data),
+            'time_range': (vehicle_data['timestamp'].min(), vehicle_data['timestamp'].max())
+        }
+
+        return prepared_data
+
+    def get_sensor_summary(self, vehicle_data: pd.DataFrame) -> Dict:
+        """
+        Get summary statistics for sensor readings
+
+        Args:
+            vehicle_data: Vehicle sensor data
+
+        Returns:
+            Dictionary with sensor statistics
+        """
+        sensor_cols = [col for col in vehicle_data.columns
+                       if col not in ['vehicle_id', 'timestamp', 'anomaly']]
+
+        summary = {}
+        for col in sensor_cols:
+            summary[col] = {
+                'mean': float(vehicle_data[col].mean()),
+                'std': float(vehicle_data[col].std()),
+                'min': float(vehicle_data[col].min()),
+                'max': float(vehicle_data[col].max()),
+                'latest': float(vehicle_data[col].iloc[-1])
+            }
+
+        return summary
+
+    def run(self, vehicle_id: int, n_readings: Optional[int] = None) -> Dict:
+        """
+        Main execution method for the Data Ingestion Agent
+
+        Args:
+            vehicle_id: ID of the vehicle to analyze
+            n_readings: Optional number of recent readings to analyze
+
+        Returns:
+            Dictionary containing prepared data for downstream agents
+        """
+        print(f"\n{'='*60}")
+        print(f"DATA INGESTION AGENT - Vehicle {vehicle_id}")
+        print(f"{'='*60}")
+
+        # Load vehicle data
+        if n_readings:
+            vehicle_data = self.get_latest_readings(vehicle_id, n_readings)
+            print(f"✓ Loaded latest {n_readings} readings for vehicle {vehicle_id}")
+        else:
+            vehicle_data = self.get_vehicle_data(vehicle_id)
+            print(f"✓ Loaded all {len(vehicle_data)} readings for vehicle {vehicle_id}")
+
+        # Prepare data for analysis
+        prepared_data = self.prepare_for_analysis(vehicle_data)
+        print(f"✓ Prepared {prepared_data['num_readings']} readings for analysis")
+        print(f"  Time range: {prepared_data['time_range'][0]} to {prepared_data['time_range'][1]}")
+        print(f"  Features: {len(prepared_data['feature_names'])}")
+
+        # Get sensor summary
+        sensor_summary = self.get_sensor_summary(vehicle_data)
+        prepared_data['sensor_summary'] = sensor_summary
+
+        print("✓ Generated sensor summary statistics")
+        print(f"{'='*60}\n")
+
+        return prepared_data
+
+
+if __name__ == '__main__':
+    # Test the Data Ingestion Agent
+    agent = DataIngestionAgent()
+
+    # Test with a vehicle from test set
+    test_df = agent.load_test_data()
+    test_vehicle_id = test_df['vehicle_id'].iloc[0]
+
+    result = agent.run(test_vehicle_id, n_readings=100)
+
+    print("\nSample sensor summary:")
+    for sensor, stats in list(result['sensor_summary'].items())[:3]:
+        print(f"  {sensor}: mean={stats['mean']:.2f}, std={stats['std']:.2f}")
src/agents/maintenance_recommendation_agent.py ADDED
@@ -0,0 +1,425 @@
+"""
+Maintenance Recommendation Agent - Provides actionable maintenance recommendations
+"""
+from typing import Dict, List
+
+
+class MaintenanceRecommendationAgent:
+    """
+    Agent responsible for generating maintenance recommendations based on root cause analysis
+    """
+
+    def __init__(self):
+        # Define maintenance actions for each fault type
+        self.maintenance_actions = {
+            'engine_overheating': {
+                'immediate_actions': [
+                    'Stop vehicle immediately and allow engine to cool',
+                    'Check coolant level and top up if low',
+                    'Inspect for coolant leaks'
+                ],
+                'short_term_actions': [
+                    'Replace thermostat if faulty',
+                    'Flush and replace coolant',
+                    'Check radiator fan operation',
+                    'Inspect water pump for proper operation'
+                ],
+                'long_term_actions': [
+                    'Schedule comprehensive cooling system inspection',
+                    'Consider radiator replacement if old',
+                    'Regular coolant system maintenance every 30,000 miles'
+                ],
+                'estimated_cost': '$200-$800',
+                'urgency': 'critical',
+                'downtime': '1-3 days'
+            },
+            'cooling_system_failure': {
+                'immediate_actions': [
+                    'Do not operate vehicle',
+                    'Tow to service center'
+                ],
+                'short_term_actions': [
+                    'Diagnose cooling system failure',
+                    'Replace failed components (radiator, water pump, thermostat)',
+                    'Pressure test cooling system'
+                ],
+                'long_term_actions': [
+                    'Monitor coolant levels regularly',
+                    'Annual cooling system inspection'
+                ],
+                'estimated_cost': '$500-$1500',
+                'urgency': 'critical',
+                'downtime': '2-5 days'
+            },
+            'oil_pressure_low': {
+                'immediate_actions': [
+                    'Stop engine immediately',
+                    'Check oil level',
+                    'Do not restart until issue is resolved'
+                ],
+                'short_term_actions': [
+                    'Add oil if level is low',
+                    'Check for oil leaks',
+                    'Replace oil pressure sensor if faulty',
+                    'Inspect oil pump',
+                    'Change oil and filter'
+                ],
+                'long_term_actions': [
+                    'Regular oil changes every 5,000 miles',
+                    'Use recommended oil grade',
+                    'Monitor oil consumption'
+                ],
+                'estimated_cost': '$100-$600',
+                'urgency': 'critical',
+                'downtime': '1-2 days'
+            },
+            'battery_degradation': {
+                'immediate_actions': [
+                    'Test battery voltage',
+                    'Check battery terminals for corrosion'
+                ],
+                'short_term_actions': [
+                    'Clean battery terminals',
+                    'Test alternator output',
+                    'Replace battery if failing load test',
+                    'Check for parasitic drain'
+                ],
+                'long_term_actions': [
+                    'Replace battery every 3-5 years',
+                    'Regular battery maintenance',
+                    'Keep terminals clean'
+                ],
+                'estimated_cost': '$100-$300',
+                'urgency': 'high',
+                'downtime': '0.5-1 day'
+            },
+            'tire_pressure_issue': {
+                'immediate_actions': [
+                    'Check tire pressure on all tires',
+                    'Inflate to recommended PSI',
+                    'Inspect for punctures or damage'
+                ],
+                'short_term_actions': [
+                    'Repair or replace damaged tire',
+                    'Check valve stems',
+                    'Inspect for slow leaks',
+                    'Rotate tires if needed'
+                ],
+                'long_term_actions': [
+                    'Check tire pressure monthly',
+                    'Regular tire rotation every 5,000-7,500 miles',
+                    'Replace tires when tread depth is low'
+                ],
+                'estimated_cost': '$20-$200',
+                'urgency': 'medium',
+                'downtime': '0.5-1 day'
+            },
+            'excessive_vibration': {
+                'immediate_actions': [
+                    'Reduce speed',
+                    'Note when vibration occurs (speed, braking, etc.)'
+                ],
+                'short_term_actions': [
+                    'Balance and rotate tires',
+                    'Check wheel alignment',
+                    'Inspect suspension components',
+                    'Check brake rotors for warping',
+                    'Inspect engine mounts'
+                ],
+                'long_term_actions': [
+                    'Regular tire balancing',
+                    'Annual suspension inspection',
+                    'Replace worn suspension components'
+                ],
+                'estimated_cost': '$100-$500',
+                'urgency': 'high',
+                'downtime': '1-2 days'
+            },
+            'fuel_system_issue': {
+                'immediate_actions': [
+                    'Note any performance issues',
+                    'Check for fuel leaks'
+                ],
+                'short_term_actions': [
+                    'Replace fuel filter',
+                    'Test fuel pump pressure',
+                    'Clean fuel injectors',
+                    'Inspect fuel lines'
+                ],
+                'long_term_actions': [
+                    'Use quality fuel',
+                    'Replace fuel filter every 30,000 miles',
+                    'Add fuel system cleaner periodically'
+                ],
+                'estimated_cost': '$150-$600',
+                'urgency': 'high',
+                'downtime': '1-2 days'
+            },
+            'engine_stress': {
+                'immediate_actions': [
+                    'Reduce engine load',
+                    'Avoid high RPM operation'
+                ],
+                'short_term_actions': [
+                    'Check air filter',
+                    'Inspect spark plugs',
+                    'Verify proper fuel octane rating',
+                    'Check for engine codes'
+                ],
+                'long_term_actions': [
+                    'Regular tune-ups',
+                    'Avoid aggressive driving',
+                    'Use recommended fuel grade'
+                ],
+                'estimated_cost': '$100-$400',
+                'urgency': 'medium',
+                'downtime': '0.5-1 day'
+            }
+        }
+
+    def generate_recommendations(self, root_causes: List[Dict]) -> List[Dict]:
+        """
+        Generate maintenance recommendations based on root causes
+
+        Args:
+            root_causes: List of identified root causes
+
+        Returns:
+            List of maintenance recommendations
+        """
+        recommendations = []
+
+        for cause in root_causes:
+            fault_name = cause['fault_name']
+
+            if fault_name in self.maintenance_actions:
+                actions = self.maintenance_actions[fault_name]
+
+                recommendation = {
+                    'fault_name': fault_name,
+                    'description': cause['description'],
+                    'severity': cause['severity'],
+                    'confidence': cause['confidence'],
+                    'fault_codes': cause['fault_codes'],
+                    'immediate_actions': actions['immediate_actions'],
+                    'short_term_actions': actions['short_term_actions'],
+                    'long_term_actions': actions['long_term_actions'],
+                    'estimated_cost': actions['estimated_cost'],
+                    'urgency': actions['urgency'],
+                    'estimated_downtime': actions['downtime']
+                }
+
+                recommendations.append(recommendation)
+
+        return recommendations
+
+    def prioritize_actions(self, recommendations: List[Dict]) -> List[Dict]:
+        """
+        Prioritize maintenance actions based on urgency and severity
+
+        Args:
+            recommendations: List of recommendations
+
+        Returns:
+            Prioritized list of actions
+        """
+        urgency_order = {'critical': 0, 'high': 1, 'medium': 2, 'low': 3}
+
+        # Sort by urgency and confidence
+        prioritized = sorted(
+            recommendations,
+            key=lambda x: (urgency_order.get(x['urgency'], 4), -x['confidence'])
+        )
+
+        return prioritized
+
+    def calculate_total_cost(self, recommendations: List[Dict]) -> Dict:
+        """
+        Calculate estimated total maintenance cost
+
+        Args:
+            recommendations: List of recommendations
+
+        Returns:
+            Dictionary with cost estimates
+        """
+        total_min = 0
+        total_max = 0
+
+        for rec in recommendations:
+            cost_str = rec['estimated_cost']
+            # Parse cost range like "$200-$800"
+            costs = cost_str.replace('$', '').split('-')
+            if len(costs) == 2:
+                total_min += int(costs[0])
+                total_max += int(costs[1])
+
+        return {
+            'min_cost': total_min,
+            'max_cost': total_max,
+            'cost_range': f'${total_min}-${total_max}'
+        }
+
+    def generate_action_plan(self, recommendations: List[Dict]) -> Dict:
+        """
+        Generate a comprehensive action plan
+
+        Args:
+            recommendations: List of recommendations
+
+        Returns:
+            Dictionary containing action plan
+        """
+        if not recommendations:
+            return {
+                'immediate': [],
+                'short_term': [],
+                'long_term': [],
+                'total_actions': 0
+            }
+
+        immediate = []
+        short_term = []
+        long_term = []
+
+        for rec in recommendations:
+            # Add immediate actions
+            for action in rec['immediate_actions']:
+                immediate.append({
+                    'action': action,
+                    'related_to': rec['fault_name'],
+                    'urgency': rec['urgency']
+                })
+
+            # Add short-term actions
+            for action in rec['short_term_actions']:
+                short_term.append({
+                    'action': action,
+                    'related_to': rec['fault_name'],
+                    'estimated_cost': rec['estimated_cost']
+                })
+
+            # Add long-term actions
+            for action in rec['long_term_actions']:
+                long_term.append({
+                    'action': action,
+                    'related_to': rec['fault_name']
+                })
+
+        return {
+            'immediate': immediate,
+            'short_term': short_term,
+            'long_term': long_term,
+            'total_actions': len(immediate) + len(short_term) + len(long_term)
+        }
+
+    def run(self, root_cause_result: Dict) -> Dict:
+        """
+        Main execution method for the Maintenance Recommendation Agent
+
+        Args:
+            root_cause_result: Results from Root Cause Analysis Agent
+
+        Returns:
+            Dictionary containing maintenance recommendations
+        """
+        print(f"\n{'='*60}")
+        print(f"MAINTENANCE RECOMMENDATION AGENT - Vehicle {root_cause_result['vehicle_id']}")
+        print(f"{'='*60}")
+
+        root_causes = root_cause_result['root_causes']
+
+        if not root_causes:
+            print("✓ No maintenance recommendations needed - vehicle is healthy")
+            print(f"{'='*60}\n")
+            return {
+                'vehicle_id': root_cause_result['vehicle_id'],
+                'recommendations': [],
+                'action_plan': {},
+                'total_cost': {'min_cost': 0, 'max_cost': 0, 'cost_range': '$0'},
+                'summary': 'No maintenance required'
+            }
+
+        print(f"Generating recommendations for {len(root_causes)} identified issues...")
+
+        # Generate recommendations
+        recommendations = self.generate_recommendations(root_causes)
+        print(f"✓ Generated {len(recommendations)} maintenance recommendations")
+
+        # Prioritize actions
+        prioritized_recommendations = self.prioritize_actions(recommendations)
+
+        # Calculate total cost
+        total_cost = self.calculate_total_cost(recommendations)
+        print(f"✓ Estimated total cost: {total_cost['cost_range']}")
+
+        # Generate action plan
+        action_plan = self.generate_action_plan(prioritized_recommendations)
+        print("✓ Action plan created:")
+        print(f"  - Immediate actions: {len(action_plan['immediate'])}")
+        print(f"  - Short-term actions: {len(action_plan['short_term'])}")
+        print(f"  - Long-term actions: {len(action_plan['long_term'])}")
+
+        # Display top priority recommendation
+        if prioritized_recommendations:
+            top_rec = prioritized_recommendations[0]
+            print(f"\n✓ Top priority: {top_rec['fault_name']}")
+            print(f"  Urgency: {top_rec['urgency']}")
+            print(f"  Estimated cost: {top_rec['estimated_cost']}")
+            print(f"  Downtime: {top_rec['estimated_downtime']}")
+
+        if action_plan['immediate']:
+            print("\n  Immediate actions required:")
+            for action in action_plan['immediate'][:3]:
+                print(f"  • {action['action']}")
+
+        summary = (f"{len(recommendations)} maintenance items identified. "
+                   f"Estimated cost: {total_cost['cost_range']}. "
+                   f"Highest priority: {prioritized_recommendations[0]['urgency']} urgency.")
+
+        print(f"\n✓ Summary: {summary}")
+        print(f"{'='*60}\n")
+
+        result = {
+            'vehicle_id': root_cause_result['vehicle_id'],
+            'recommendations': prioritized_recommendations,
+            'action_plan': action_plan,
+            'total_cost': total_cost,
+            'summary': summary,
+            'top_priority': prioritized_recommendations[0] if prioritized_recommendations else None
+        }
+
+        return result
+
+
+if __name__ == '__main__':
+    # Test the Maintenance Recommendation Agent
+    from data_ingestion_agent import DataIngestionAgent
+    from anomaly_detection_agent import AnomalyDetectionAgent
+    from root_cause_agent import RootCauseAnalysisAgent
+
+    # Load and prepare data
+    ingestion_agent = DataIngestionAgent()
+    test_df = ingestion_agent.load_test_data()
+
+    # Find a vehicle with anomalies
+    test_vehicle_id = None
+    for vid in test_df['vehicle_id'].unique()[:10]:
+        if test_df[test_df['vehicle_id'] == vid]['anomaly'].sum() > 0:
+            test_vehicle_id = vid
+            break
+
+    if test_vehicle_id:
+        prepared_data = ingestion_agent.run(test_vehicle_id)
+        detection_agent = AnomalyDetectionAgent()
+        anomaly_result = detection_agent.run(prepared_data)
+        rca_agent = RootCauseAnalysisAgent()
+        rca_result = rca_agent.run(anomaly_result)
+
+        # Generate recommendations
+        maintenance_agent = MaintenanceRecommendationAgent()
+        result = maintenance_agent.run(rca_result)
+
+        print(f"\nMaintenance Summary:")
+        print(f"  Recommendations: {len(result['recommendations'])}")
+        print(f"  Total cost: {result['total_cost']['cost_range']}")
src/agents/report_generation_agent.py ADDED
@@ -0,0 +1,392 @@
+"""
+Report Generation Agent - Generates comprehensive diagnostic reports
+"""
+from typing import Dict
+from datetime import datetime
+import json
+
+
+class ReportGenerationAgent:
+    """
+    Agent responsible for generating human-readable diagnostic reports
+    """
+
+    def __init__(self):
+        self.report_template = None
+
+    def generate_executive_summary(self, vehicle_id: int,
+                                   anomaly_result: Dict,
+                                   root_cause_result: Dict,
+                                   maintenance_result: Dict) -> str:
+        """
+        Generate executive summary of the diagnostic report
+
+        Args:
+            vehicle_id: Vehicle ID
+            anomaly_result: Results from anomaly detection
+            root_cause_result: Results from root cause analysis
+            maintenance_result: Results from maintenance recommendations
+
+        Returns:
+            Executive summary string
+        """
+        if not anomaly_result['anomaly_detected']:
+            return (f"Vehicle {vehicle_id} is operating normally. "
+                    f"No anomalies detected in the analyzed sensor data. "
+                    f"No maintenance actions required at this time.")
+
+        num_anomalies = anomaly_result['num_anomalies']
+        anomaly_rate = anomaly_result['anomaly_rate']
+        overall_score = anomaly_result['overall_score']
+
+        primary_cause = root_cause_result.get('primary_cause')
+        num_recommendations = len(maintenance_result['recommendations'])
+
+        summary = f"""
+Vehicle {vehicle_id} Diagnostic Summary:
+
+ALERT: Anomalies detected in vehicle sensor data.
+
+Key Findings:
+• Anomaly Detection: {num_anomalies} anomalous readings detected ({anomaly_rate:.1%} of analyzed data)
+• Overall Anomaly Score: {overall_score:.3f}
+• Affected Sensors: {len(anomaly_result['anomalous_sensors'])} sensors showing abnormal behavior
+"""
+
+        if primary_cause:
+            summary += f"""
+Primary Issue Identified:
+• {primary_cause['description']}
+• Severity: {primary_cause['severity'].upper()}
+• Confidence: {primary_cause['confidence']:.0%}
+• Fault Codes: {', '.join(primary_cause['fault_codes'])}
+"""
+
+        if num_recommendations > 0:
+            top_priority = maintenance_result.get('top_priority')
+            total_cost = maintenance_result['total_cost']
+
+            summary += f"""
+Maintenance Required:
+• {num_recommendations} maintenance items identified
+• Highest Priority: {top_priority['urgency'].upper()} urgency
+• Estimated Cost: {total_cost['cost_range']}
+• Immediate Actions: {len(maintenance_result['action_plan']['immediate'])} required
+"""
+
+        return summary.strip()
+
+    def format_anomaly_details(self, anomaly_result: Dict) -> str:
+        """Format anomaly detection details"""
+        if not anomaly_result['anomaly_detected']:
+            return "No anomalies detected."
+
+        details = f"""
+ANOMALY DETECTION DETAILS
+{'='*60}
+
+Overall Statistics:
+• Total Readings Analyzed: {len(anomaly_result['anomaly_predictions'])}
+• Anomalous Readings: {anomaly_result['num_anomalies']}
+• Anomaly Rate: {anomaly_result['anomaly_rate']:.2%}
+• Overall Anomaly Score: {anomaly_result['overall_score']:.3f}
+
+Affected Sensors:
+"""
+
+        anomalous_sensors = anomaly_result['anomalous_sensors']
+        sorted_sensors = sorted(anomalous_sensors.items(),
+                                key=lambda x: x[1]['deviation'],
+                                reverse=True)
+
+        for sensor, info in sorted_sensors:
+            details += f"""
+• {sensor.upper()}
+  - Severity: {info['severity']}
+  - Deviation: {info['deviation']:.2f}σ from normal
+  - Normal Mean: {info['overall_mean']:.3f}
+  - Anomaly Mean: {info['anomaly_mean']:.3f}
+"""
+
+        return details.strip()
+
+    def format_root_cause_analysis(self, root_cause_result: Dict) -> str:
+        """Format root cause analysis details"""
+        if not root_cause_result['root_causes']:
+            return "No root causes identified."
+
+        details = f"""
+ROOT CAUSE ANALYSIS
+{'='*60}
+
+Analysis Summary:
+{root_cause_result['analysis_summary']}
+
+Failure Progression:
+• Type: {root_cause_result['failure_sequence'].get('progression', 'unknown').upper()}
+• Duration: {root_cause_result['failure_sequence'].get('duration', 0)} timesteps
+• First Anomaly: Timestep {root_cause_result['failure_sequence'].get('first_anomaly_time', 'N/A')}
+• Last Anomaly: Timestep {root_cause_result['failure_sequence'].get('last_anomaly_time', 'N/A')}
+
+Identified Root Causes:
+"""
+
+        for i, cause in enumerate(root_cause_result['root_causes'], 1):
+            details += f"""
+{i}. {cause['fault_name'].upper().replace('_', ' ')}
+   Description: {cause['description']}
+   Severity: {cause['severity'].upper()}
+   Confidence: {cause['confidence']:.0%}
+   Fault Codes: {', '.join(cause['fault_codes'])}
+   Affected Sensors: {', '.join(cause['affected_sensors'])}
+"""
+
+        if root_cause_result['correlations']:
+            details += "\nCorrelated Sensor Failures:\n"
+            for sensor1, sensor2, strength in root_cause_result['correlations']:
+                details += f"• {sensor1} ↔ {sensor2} (correlation: {strength:.2f})\n"
+
+        return details.strip()
+
+    def format_maintenance_recommendations(self, maintenance_result: Dict) -> str:
+        """Format maintenance recommendations"""
+        if not maintenance_result['recommendations']:
+            return "No maintenance required at this time."
+
+        details = f"""
+MAINTENANCE RECOMMENDATIONS
+{'='*60}
+
+Cost Estimate: {maintenance_result['total_cost']['cost_range']}
+Total Actions: {maintenance_result['action_plan']['total_actions']}
+
+IMMEDIATE ACTIONS (Perform Now):
+"""
+
+        for i, action in enumerate(maintenance_result['action_plan']['immediate'], 1):
+            details += f"{i}. {action['action']}\n   Related to: {action['related_to'].replace('_', ' ').title()}\n   Urgency: {action['urgency'].upper()}\n\n"
+
+        details += "\nSHORT-TERM ACTIONS (Within 1-2 Weeks):\n"
+        for i, action in enumerate(maintenance_result['action_plan']['short_term'], 1):
+            details += f"{i}. {action['action']}\n   Related to: {action['related_to'].replace('_', ' ').title()}\n\n"
+
+        details += "\nLONG-TERM ACTIONS (Preventive Maintenance):\n"
+        for i, action in enumerate(maintenance_result['action_plan']['long_term'], 1):
+            details += f"{i}. {action['action']}\n   Related to: {action['related_to'].replace('_', ' ').title()}\n\n"
+
+        # Add detailed recommendations
+        details += "\nDETAILED MAINTENANCE ITEMS:\n"
+        for i, rec in enumerate(maintenance_result['recommendations'], 1):
+            details += f"""
+{i}. {rec['fault_name'].upper().replace('_', ' ')}
+   Severity: {rec['severity'].upper()}
+   Urgency: {rec['urgency'].upper()}
+   Estimated Cost: {rec['estimated_cost']}
+   Estimated Downtime: {rec['estimated_downtime']}
+   Fault Codes: {', '.join(rec['fault_codes'])}
+"""
+
+        return details.strip()
+
+    def generate_natural_language_summary(self, vehicle_id: int,
+                                          anomaly_result: Dict,
+                                          root_cause_result: Dict,
+                                          maintenance_result: Dict) -> str:
+        """Generate natural language summary for non-technical users"""
+        if not anomaly_result['anomaly_detected']:
+            return (f"Good news! Vehicle {vehicle_id} is running smoothly. "
+                    f"Our diagnostic system analyzed all sensor data and found no issues. "
+                    f"Continue with regular maintenance schedule.")
+
+        primary_cause = root_cause_result.get('primary_cause')
+        top_priority = maintenance_result.get('top_priority')
+
+        summary = f"Vehicle {vehicle_id} requires attention. "
+
+        if primary_cause:
+            summary += f"Our analysis detected {primary_cause['description'].lower()}. "
+
+            if primary_cause['severity'] == 'critical':
+                summary += "This is a critical issue that requires immediate attention. "
+            elif primary_cause['severity'] == 'high':
+                summary += "This is a high-priority issue that should be addressed soon. "
+            else:
+                summary += "This issue should be addressed during your next service visit. "
+
+        if top_priority:
+            summary += "\n\nWhat you need to do: "
+            immediate_actions = maintenance_result['action_plan']['immediate']
+            if immediate_actions:
+                summary += f"{immediate_actions[0]['action']} "
+
+            summary += f"\n\nEstimated repair cost: {maintenance_result['total_cost']['cost_range']}. "
+            summary += f"Expected downtime: {top_priority['estimated_downtime']}."
+
+        return summary
+
+    def generate_json_report(self, vehicle_id: int,
+                             prepared_data: Dict,
+                             anomaly_result: Dict,
+                             root_cause_result: Dict,
+                             maintenance_result: Dict) -> Dict:
+        """Generate structured JSON report"""
+        report = {
+            'report_metadata': {
+                'vehicle_id': vehicle_id,
+                'report_timestamp': datetime.now().isoformat(),
+                'report_version': '1.0',
+                'analysis_timerange': prepared_data['time_range']
+            },
+            'anomaly_detection': {
+                'anomaly_detected': anomaly_result['anomaly_detected'],
+                'num_anomalies': anomaly_result['num_anomalies'],
+                'anomaly_rate': anomaly_result['anomaly_rate'],
+                'overall_score': anomaly_result['overall_score'],
+                'anomalous_sensors': anomaly_result['anomalous_sensors']
+            },
+            'root_cause_analysis': {
+                'root_causes': root_cause_result['root_causes'],
+                'primary_cause': root_cause_result.get('primary_cause'),
+                'failure_sequence': root_cause_result['failure_sequence'],
+                'correlations': root_cause_result['correlations']
+            },
+            'maintenance_recommendations': {
+                'recommendations': maintenance_result['recommendations'],
+                'action_plan': maintenance_result['action_plan'],
+                'total_cost': maintenance_result['total_cost'],
+                'top_priority': maintenance_result.get('top_priority')
+            }
+        }
+
+        return report
+
+    def run(self, vehicle_id: int,
+            prepared_data: Dict,
+            anomaly_result: Dict,
+            root_cause_result: Dict,
+            maintenance_result: Dict) -> Dict:
+        """
+        Main execution method for the Report Generation Agent
+
+        Args:
+            vehicle_id: Vehicle ID
+            prepared_data: Data from ingestion agent
+            anomaly_result: Results from anomaly detection
+            root_cause_result: Results from root cause analysis
+            maintenance_result: Results from maintenance recommendations
+
+        Returns:
+            Dictionary containing complete diagnostic report
+        """
+        print(f"\n{'='*60}")
+        print(f"REPORT GENERATION AGENT - Vehicle {vehicle_id}")
+        print(f"{'='*60}")
+
+        print("Generating comprehensive diagnostic report...")
+
+        # Generate all report sections
+        executive_summary = self.generate_executive_summary(
+            vehicle_id, anomaly_result, root_cause_result, maintenance_result
+        )
+
+        anomaly_details = self.format_anomaly_details(anomaly_result)
+        root_cause_details = self.format_root_cause_analysis(root_cause_result)
+        maintenance_details = self.format_maintenance_recommendations(maintenance_result)
295
+
296
+ natural_language_summary = self.generate_natural_language_summary(
297
+ vehicle_id, anomaly_result, root_cause_result, maintenance_result
298
+ )
299
+
300
+ json_report = self.generate_json_report(
301
+ vehicle_id, prepared_data, anomaly_result, root_cause_result, maintenance_result
302
+ )
303
+
304
+ # Compile full report
305
+ full_report = f"""
306
+ {'='*60}
307
+ VEHICLE DIAGNOSTIC REPORT
308
+ Vehicle ID: {vehicle_id}
309
+ Report Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
310
+ {'='*60}
311
+
312
+ EXECUTIVE SUMMARY
313
+ {'='*60}
314
+ {executive_summary}
315
+
316
+ {anomaly_details}
317
+
318
+ {root_cause_details}
319
+
320
+ {maintenance_details}
321
+
322
+ {'='*60}
323
+ PLAIN LANGUAGE SUMMARY
324
+ {'='*60}
325
+ {natural_language_summary}
326
+
327
+ {'='*60}
328
+ END OF REPORT
329
+ {'='*60}
330
+ """
331
+
332
+ print("✓ Generated executive summary")
333
+ print("✓ Generated anomaly detection details")
334
+ print("✓ Generated root cause analysis")
335
+ print("✓ Generated maintenance recommendations")
336
+ print("✓ Generated natural language summary")
337
+ print("✓ Generated JSON report")
338
+
339
+ print(f"\n✓ Complete diagnostic report generated")
340
+ print(f"{'='*60}\n")
341
+
342
+ result = {
343
+ 'vehicle_id': vehicle_id,
344
+ 'full_report': full_report,
345
+ 'executive_summary': executive_summary,
346
+ 'natural_language_summary': natural_language_summary,
347
+ 'json_report': json_report,
348
+ 'report_timestamp': datetime.now().isoformat()
349
+ }
350
+
351
+ return result
352
+
353
+
354
+ if __name__ == '__main__':
355
+ # Test the Report Generation Agent
356
+ from data_ingestion_agent import DataIngestionAgent
357
+ from anomaly_detection_agent import AnomalyDetectionAgent
358
+ from root_cause_agent import RootCauseAnalysisAgent
359
+ from maintenance_recommendation_agent import MaintenanceRecommendationAgent
360
+
361
+ # Run full pipeline
362
+ ingestion_agent = DataIngestionAgent()
363
+ test_df = ingestion_agent.load_test_data()
364
+
365
+ # Find a vehicle with anomalies
366
+ test_vehicle_id = None
367
+ for vid in test_df['vehicle_id'].unique()[:10]:
368
+ if test_df[test_df['vehicle_id'] == vid]['anomaly'].sum() > 0:
369
+ test_vehicle_id = vid
370
+ break
371
+
372
+ if test_vehicle_id:
373
+ prepared_data = ingestion_agent.run(test_vehicle_id)
374
+
375
+ detection_agent = AnomalyDetectionAgent()
376
+ anomaly_result = detection_agent.run(prepared_data)
377
+
378
+ rca_agent = RootCauseAnalysisAgent()
379
+ rca_result = rca_agent.run(anomaly_result)
380
+
381
+ maintenance_agent = MaintenanceRecommendationAgent()
382
+ maintenance_result = maintenance_agent.run(rca_result)
383
+
384
+ # Generate report
385
+ report_agent = ReportGenerationAgent()
386
+ report = report_agent.run(test_vehicle_id, prepared_data, anomaly_result,
387
+ rca_result, maintenance_result)
388
+
389
+ print("\n" + "="*60)
390
+ print("SAMPLE REPORT OUTPUT")
391
+ print("="*60)
392
+ print(report['full_report'][:1000] + "...")
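The plain-language summary above maps a fault's severity to one of three customer-facing urgency phrases. That mapping can be sketched on its own (the helper name `severity_message` is hypothetical; the phrases are the ones used in the agent):

```python
def severity_message(severity: str) -> str:
    """Map a fault severity level to the customer-facing urgency phrase."""
    if severity == 'critical':
        return "This is a critical issue that requires immediate attention."
    if severity == 'high':
        return "This is a high-priority issue that should be addressed soon."
    # Anything below 'high' falls through to the routine-service phrasing
    return "This issue should be addressed during your next service visit."

print(severity_message('high'))
```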
src/agents/root_cause_agent.py ADDED
@@ -0,0 +1,307 @@
+"""
+Root Cause Analysis Agent - Identifies the root cause of detected anomalies
+"""
+import numpy as np
+from typing import Dict, List, Tuple
+
+
+class RootCauseAnalysisAgent:
+    """
+    Agent responsible for determining the root cause of detected anomalies
+    """
+
+    def __init__(self):
+        # Define fault patterns and their associated root causes
+        self.fault_patterns = {
+            'engine_overheating': {
+                'sensors': ['engine_temp', 'coolant_temp', 'temp_differential'],
+                'thresholds': {'engine_temp': 1.5, 'coolant_temp': 1.5, 'temp_differential': 1.0},
+                'description': 'Engine temperature exceeds safe operating limits',
+                'severity': 'critical',
+                'fault_codes': ['P0217', 'P0218', 'P0219']
+            },
+            'cooling_system_failure': {
+                'sensors': ['coolant_temp', 'engine_temp'],
+                'thresholds': {'coolant_temp': 2.0, 'engine_temp': 1.8},
+                'description': 'Cooling system not maintaining proper temperature',
+                'severity': 'critical',
+                'fault_codes': ['P0217', 'P0128']
+            },
+            'oil_pressure_low': {
+                'sensors': ['oil_pressure'],
+                'thresholds': {'oil_pressure': -1.5},
+                'description': 'Oil pressure below safe operating range',
+                'severity': 'critical',
+                'fault_codes': ['P0520', 'P0521', 'P0522']
+            },
+            'battery_degradation': {
+                'sensors': ['battery_voltage', 'battery_health'],
+                'thresholds': {'battery_voltage': -1.0, 'battery_health': -1.0},
+                'description': 'Battery voltage or health declining',
+                'severity': 'high',
+                'fault_codes': ['P0560', 'P0562', 'P0563']
+            },
+            'tire_pressure_issue': {
+                'sensors': ['tire_pressure_fl', 'tire_pressure_fr', 'tire_pressure_rl', 'tire_pressure_rr', 'tire_pressure_imbalance'],
+                'thresholds': {'tire_pressure_fl': -1.5, 'tire_pressure_fr': -1.5,
+                               'tire_pressure_rl': -1.5, 'tire_pressure_rr': -1.5,
+                               'tire_pressure_imbalance': 1.5},
+                'description': 'One or more tires have incorrect pressure',
+                'severity': 'medium',
+                'fault_codes': ['C1234', 'C1235']
+            },
+            'excessive_vibration': {
+                'sensors': ['vibration_level'],
+                'thresholds': {'vibration_level': 2.0},
+                'description': 'Abnormal vibration detected',
+                'severity': 'high',
+                'fault_codes': ['P0300', 'P0301']
+            },
+            'fuel_system_issue': {
+                'sensors': ['fuel_pressure'],
+                'thresholds': {'fuel_pressure': -1.5},
+                'description': 'Fuel pressure outside normal range',
+                'severity': 'high',
+                'fault_codes': ['P0087', 'P0088']
+            },
+            'engine_stress': {
+                'sensors': ['engine_stress', 'rpm', 'engine_temp'],
+                'thresholds': {'engine_stress': 2.0, 'rpm': 2.0},
+                'description': 'Engine operating under excessive stress',
+                'severity': 'medium',
+                'fault_codes': ['P0101', 'P0102']
+            }
+        }
+
+    def analyze_sensor_patterns(self, anomalous_sensors: Dict, raw_data) -> List[Dict]:
+        """
+        Analyze anomalous sensor patterns to identify root causes
+
+        Args:
+            anomalous_sensors: Dictionary of sensors showing anomalous behavior
+            raw_data: Raw sensor data DataFrame
+
+        Returns:
+            List of identified root causes with confidence scores
+        """
+        identified_causes = []
+
+        for fault_name, fault_info in self.fault_patterns.items():
+            # Check if any of the fault's sensors are anomalous
+            matching_sensors = []
+            confidence_scores = []
+
+            for sensor in fault_info['sensors']:
+                if sensor in anomalous_sensors:
+                    matching_sensors.append(sensor)
+
+                    # Calculate confidence based on deviation
+                    deviation = anomalous_sensors[sensor]['deviation']
+                    confidence = min(deviation / 5.0, 1.0)  # Normalize to 0-1
+                    confidence_scores.append(confidence)
+
+                # Also check if sensor values exceed thresholds
+                elif sensor in raw_data.columns:
+                    threshold = fault_info['thresholds'].get(sensor)
+                    if threshold is not None:
+                        # Check recent values
+                        recent_values = raw_data[sensor].tail(20)
+                        if threshold > 0:
+                            exceeds = (recent_values > threshold).sum() / len(recent_values)
+                        else:
+                            exceeds = (recent_values < threshold).sum() / len(recent_values)
+
+                        if exceeds > 0.3:  # More than 30% of recent values exceed the threshold
+                            matching_sensors.append(sensor)
+                            confidence_scores.append(exceeds)
+
+            # If we have matching sensors, this is a potential root cause
+            if matching_sensors:
+                avg_confidence = np.mean(confidence_scores)
+
+                identified_causes.append({
+                    'fault_name': fault_name,
+                    'description': fault_info['description'],
+                    'severity': fault_info['severity'],
+                    'confidence': float(avg_confidence),
+                    'affected_sensors': matching_sensors,
+                    'fault_codes': fault_info['fault_codes'],
+                    'num_sensors_affected': len(matching_sensors)
+                })
+
+        # Sort by confidence
+        identified_causes.sort(key=lambda x: x['confidence'], reverse=True)
+
+        return identified_causes
+
+    def correlate_sensor_failures(self, anomalous_sensors: Dict) -> List[Tuple[str, str, float]]:
+        """
+        Find correlations between anomalous sensors
+
+        Args:
+            anomalous_sensors: Dictionary of anomalous sensors
+
+        Returns:
+            List of correlated sensor pairs with correlation strength
+        """
+        correlations = []
+
+        # Known sensor correlations
+        known_correlations = [
+            ('engine_temp', 'coolant_temp', 0.9),
+            ('engine_temp', 'oil_pressure', -0.7),
+            ('rpm', 'engine_temp', 0.6),
+            ('battery_voltage', 'battery_health', 0.95),
+            ('tire_pressure_fl', 'tire_pressure_fr', 0.8),
+            ('tire_pressure_rl', 'tire_pressure_rr', 0.8),
+        ]
+
+        for sensor1, sensor2, corr_strength in known_correlations:
+            if sensor1 in anomalous_sensors and sensor2 in anomalous_sensors:
+                correlations.append((sensor1, sensor2, corr_strength))
+
+        return correlations
+
+    def determine_failure_sequence(self, anomaly_indices: List[int],
+                                   anomalous_sensors: Dict,
+                                   timestamps: np.ndarray) -> Dict:
+        """
+        Determine the sequence of failures
+
+        Args:
+            anomaly_indices: Indices where anomalies occurred
+            anomalous_sensors: Dictionary of anomalous sensors
+            timestamps: Array of timestamps
+
+        Returns:
+            Dictionary describing failure sequence
+        """
+        if not anomaly_indices:
+            return {'sequence': [], 'duration': 0}
+
+        first_anomaly = min(anomaly_indices)
+        last_anomaly = max(anomaly_indices)
+        duration = last_anomaly - first_anomaly
+
+        sequence = {
+            'first_anomaly_time': int(timestamps[first_anomaly]),
+            'last_anomaly_time': int(timestamps[last_anomaly]),
+            'duration': int(duration),
+            'progression': 'gradual' if duration > 50 else 'sudden',
+            'affected_sensors': list(anomalous_sensors.keys())
+        }
+
+        return sequence
+
+    def run(self, anomaly_result: Dict) -> Dict:
+        """
+        Main execution method for the Root Cause Analysis Agent
+
+        Args:
+            anomaly_result: Results from Anomaly Detection Agent
+
+        Returns:
+            Dictionary containing root cause analysis
+        """
+        print(f"\n{'='*60}")
+        print(f"ROOT CAUSE ANALYSIS AGENT - Vehicle {anomaly_result['vehicle_id']}")
+        print(f"{'='*60}")
+
+        if not anomaly_result['anomaly_detected']:
+            print("✓ No anomalies detected - no root cause analysis needed")
+            print(f"{'='*60}\n")
+            return {
+                'vehicle_id': anomaly_result['vehicle_id'],
+                'root_causes': [],
+                'correlations': [],
+                'failure_sequence': {},
+                'analysis_summary': 'No anomalies detected'
+            }
+
+        anomalous_sensors = anomaly_result['anomalous_sensors']
+        raw_data = anomaly_result['raw_data']
+        anomaly_indices = anomaly_result['anomaly_indices']
+        timestamps = anomaly_result['timestamps']
+
+        print(f"Analyzing {len(anomalous_sensors)} anomalous sensors...")
+
+        # Identify root causes
+        root_causes = self.analyze_sensor_patterns(anomalous_sensors, raw_data)
+        print(f"✓ Identified {len(root_causes)} potential root causes")
+
+        if root_causes:
+            print("\nTop root causes:")
+            for i, cause in enumerate(root_causes[:3], 1):
+                print(f"  {i}. {cause['fault_name']} ({cause['severity']} severity)")
+                print(f"     Confidence: {cause['confidence']:.2%}")
+                print(f"     Description: {cause['description']}")
+                print(f"     Fault codes: {', '.join(cause['fault_codes'])}")
+
+        # Find sensor correlations
+        correlations = self.correlate_sensor_failures(anomalous_sensors)
+        if correlations:
+            print(f"\n✓ Found {len(correlations)} correlated sensor failures")
+            for sensor1, sensor2, strength in correlations:
+                print(f"  - {sensor1} ↔ {sensor2} (correlation: {strength:.2f})")
+
+        # Determine failure sequence
+        failure_sequence = self.determine_failure_sequence(
+            anomaly_indices, anomalous_sensors, timestamps
+        )
+        print(f"\n✓ Failure progression: {failure_sequence.get('progression', 'unknown')}")
+        print(f"  Duration: {failure_sequence.get('duration', 0)} timesteps")
+
+        # Generate analysis summary
+        if root_causes:
+            primary_cause = root_causes[0]
+            summary = (f"Primary issue: {primary_cause['description']} "
+                       f"({primary_cause['severity']} severity, "
+                       f"{primary_cause['confidence']:.0%} confidence)")
+        else:
+            summary = "Anomalies detected but root cause unclear"
+
+        print(f"\n✓ Analysis summary: {summary}")
+        print(f"{'='*60}\n")
+
+        result = {
+            'vehicle_id': anomaly_result['vehicle_id'],
+            'root_causes': root_causes,
+            'correlations': correlations,
+            'failure_sequence': failure_sequence,
+            'analysis_summary': summary,
+            'primary_cause': root_causes[0] if root_causes else None
+        }
+
+        return result
+
+
+if __name__ == '__main__':
+    # Test the Root Cause Analysis Agent
+    from data_ingestion_agent import DataIngestionAgent
+    from anomaly_detection_agent import AnomalyDetectionAgent
+
+    # Load and prepare data
+    ingestion_agent = DataIngestionAgent()
+    test_df = ingestion_agent.load_test_data()
+
+    # Find a vehicle with anomalies
+    test_vehicle_id = None
+    for vid in test_df['vehicle_id'].unique()[:10]:
+        if test_df[test_df['vehicle_id'] == vid]['anomaly'].sum() > 0:
+            test_vehicle_id = vid
+            break
+
+    # Explicit None check so a valid vehicle_id of 0 is not skipped
+    if test_vehicle_id is not None:
+        prepared_data = ingestion_agent.run(test_vehicle_id)
+
+        # Detect anomalies
+        detection_agent = AnomalyDetectionAgent()
+        anomaly_result = detection_agent.run(prepared_data)
+
+        # Analyze root cause
+        rca_agent = RootCauseAnalysisAgent()
+        result = rca_agent.run(anomaly_result)
+
+        print("\nRoot Cause Analysis Summary:")
+        print(f"  Primary cause: {result['primary_cause']['fault_name'] if result['primary_cause'] else 'None'}")
+        print(f"  Root causes found: {len(result['root_causes'])}")
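The two confidence heuristics in `analyze_sensor_patterns` can be sketched standalone: deviations are divided by 5.0 and capped at 1.0, and a sensor without a deviation score is flagged when more than 30% of its recent readings cross the pattern's threshold (positive thresholds flag high readings, negative flag low). The helper names and sample values here are illustrative, not part of the diff:

```python
def deviation_confidence(deviation: float) -> float:
    """Map a sensor's deviation magnitude to a 0-1 confidence (cap at 1.0)."""
    return min(deviation / 5.0, 1.0)

def exceedance_fraction(recent_values, threshold: float) -> float:
    """Fraction of recent readings beyond the threshold.

    A positive threshold flags readings above it; a negative
    threshold flags readings below it (e.g. low oil pressure).
    """
    if threshold > 0:
        hits = sum(1 for v in recent_values if v > threshold)
    else:
        hits = sum(1 for v in recent_values if v < threshold)
    return hits / len(recent_values)

print(deviation_confidence(2.5))                       # moderate anomaly -> 0.5
print(exceedance_fraction([1.2, 0.4, 2.1, 1.9], 1.0))  # 3 of 4 exceed -> 0.75
```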
src/api/main.py ADDED
@@ -0,0 +1,277 @@
+"""
+FastAPI Backend for Vehicle Diagnostics Agent
+"""
+from fastapi import FastAPI, HTTPException, BackgroundTasks
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel, Field
+from typing import Optional, List, Dict
+import sys
+from pathlib import Path
+
+# Add parent directory to path
+sys.path.append(str(Path(__file__).parent.parent))
+
+from orchestrator import VehicleDiagnosticOrchestrator
+from agents.data_ingestion_agent import DataIngestionAgent
+
+# Initialize FastAPI app
+app = FastAPI(
+    title="Vehicle Diagnostics Agent API",
+    description="Multi-agent AI system for predictive vehicle diagnostics",
+    version="1.0.0"
+)
+
+# Add CORS middleware
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Initialize orchestrator
+orchestrator = VehicleDiagnosticOrchestrator()
+ingestion_agent = DataIngestionAgent()
+
+# Store for async job results
+job_results = {}
+
+
+# Pydantic models for request/response
+class DiagnosticRequest(BaseModel):
+    vehicle_id: int = Field(..., description="ID of the vehicle to diagnose")
+    n_readings: Optional[int] = Field(None, description="Number of recent readings to analyze")
+
+
+class DiagnosticResponse(BaseModel):
+    success: bool
+    vehicle_id: int
+    message: str
+    anomaly_detected: Optional[bool] = None
+    overall_score: Optional[float] = None
+    num_anomalies: Optional[int] = None
+    primary_cause: Optional[str] = None
+    estimated_cost: Optional[str] = None
+    report_summary: Optional[str] = None
+
+
+class BatchDiagnosticRequest(BaseModel):
+    vehicle_ids: List[int] = Field(..., description="List of vehicle IDs to diagnose")
+    n_readings: Optional[int] = Field(None, description="Number of recent readings to analyze")
+
+
+class HealthCheckResponse(BaseModel):
+    status: str
+    version: str
+    available_vehicles: int
+
+
+@app.get("/", response_model=Dict)
+async def root():
+    """Root endpoint"""
+    return {
+        "message": "Vehicle Diagnostics Agent API",
+        "version": "1.0.0",
+        "endpoints": {
+            "health": "/health",
+            "diagnose": "/diagnose",
+            "batch_diagnose": "/batch-diagnose",
+            "vehicles": "/vehicles",
+            "report": "/report/{vehicle_id}"
+        }
+    }
+
+
+@app.get("/health", response_model=HealthCheckResponse)
+async def health_check():
+    """Health check endpoint"""
+    try:
+        test_df = ingestion_agent.load_test_data()
+        num_vehicles = test_df['vehicle_id'].nunique()
+
+        return HealthCheckResponse(
+            status="healthy",
+            version="1.0.0",
+            available_vehicles=num_vehicles
+        )
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Health check failed: {str(e)}")
+
+
+@app.get("/vehicles", response_model=Dict)
+async def list_vehicles():
+    """List available vehicles for diagnosis"""
+    try:
+        test_df = ingestion_agent.load_test_data()
+        vehicle_ids = test_df['vehicle_id'].unique().tolist()
+
+        # Get basic stats for each vehicle
+        vehicle_info = []
+        for vid in vehicle_ids[:20]:  # Limit to first 20 for performance
+            vehicle_data = test_df[test_df['vehicle_id'] == vid]
+            vehicle_info.append({
+                'vehicle_id': int(vid),
+                'num_readings': len(vehicle_data),
+                'has_anomalies': bool(vehicle_data['anomaly'].sum() > 0),
+                'anomaly_count': int(vehicle_data['anomaly'].sum())
+            })
+
+        return {
+            "total_vehicles": len(vehicle_ids),
+            "vehicles": vehicle_info
+        }
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Failed to list vehicles: {str(e)}")
+
+
+@app.post("/diagnose", response_model=DiagnosticResponse)
+async def diagnose_vehicle(request: DiagnosticRequest):
+    """
+    Run diagnostic analysis for a single vehicle
+    """
+    try:
+        # Run diagnostic workflow
+        result = orchestrator.diagnose_vehicle(
+            vehicle_id=request.vehicle_id,
+            n_readings=request.n_readings
+        )
+
+        if not result['success']:
+            return DiagnosticResponse(
+                success=False,
+                vehicle_id=request.vehicle_id,
+                message=f"Diagnostic failed: {result.get('error', 'Unknown error')}"
+            )
+
+        # Extract key information
+        anomaly_result = result.get('anomaly_result', {})
+        root_cause_result = result.get('root_cause_result', {})
+        maintenance_result = result.get('maintenance_result', {})
+        report = result.get('report', {})
+
+        primary_cause = root_cause_result.get('primary_cause')
+
+        return DiagnosticResponse(
+            success=True,
+            vehicle_id=request.vehicle_id,
+            message="Diagnostic completed successfully",
+            anomaly_detected=anomaly_result.get('anomaly_detected', False),
+            overall_score=anomaly_result.get('overall_score'),
+            num_anomalies=anomaly_result.get('num_anomalies'),
+            primary_cause=primary_cause['fault_name'] if primary_cause else None,
+            estimated_cost=maintenance_result.get('total_cost', {}).get('cost_range'),
+            report_summary=report.get('natural_language_summary')
+        )
+
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Diagnostic failed: {str(e)}")
+
+
+@app.post("/batch-diagnose")
+async def batch_diagnose(request: BatchDiagnosticRequest, background_tasks: BackgroundTasks):
+    """
+    Run diagnostic analysis for multiple vehicles (async)
+    """
+    try:
+        # For simplicity, run synchronously for now
+        # In production, this would be handled by a task queue
+        results = orchestrator.diagnose_multiple_vehicles(
+            vehicle_ids=request.vehicle_ids,
+            n_readings=request.n_readings
+        )
+
+        # Summarize results
+        summary = {
+            'total_vehicles': len(request.vehicle_ids),
+            'successful': sum(1 for r in results.values() if r['success']),
+            'with_anomalies': sum(1 for r in results.values()
+                                  if r['success'] and r.get('anomaly_result', {}).get('anomaly_detected')),
+            'results': {}
+        }
+
+        for vid, result in results.items():
+            if result['success']:
+                anomaly_result = result.get('anomaly_result', {})
+                summary['results'][vid] = {
+                    'anomaly_detected': anomaly_result.get('anomaly_detected', False),
+                    'overall_score': anomaly_result.get('overall_score'),
+                    'num_anomalies': anomaly_result.get('num_anomalies')
+                }
+            else:
+                summary['results'][vid] = {
+                    'error': result.get('error')
+                }
+
+        return summary
+
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Batch diagnostic failed: {str(e)}")
+
+
+@app.get("/report/{vehicle_id}")
+async def get_full_report(vehicle_id: int, n_readings: Optional[int] = None):
+    """
+    Get full diagnostic report for a vehicle
+    """
+    try:
+        # Run diagnostic workflow
+        result = orchestrator.diagnose_vehicle(
+            vehicle_id=vehicle_id,
+            n_readings=n_readings
+        )
+
+        if not result['success']:
+            raise HTTPException(status_code=500, detail=result.get('error', 'Unknown error'))
+
+        report = result.get('report', {})
+
+        return {
+            'vehicle_id': vehicle_id,
+            'report_timestamp': report.get('report_timestamp'),
+            'full_report': report.get('full_report'),
+            'executive_summary': report.get('executive_summary'),
+            'natural_language_summary': report.get('natural_language_summary'),
+            'json_report': report.get('json_report')
+        }
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Failed to generate report: {str(e)}")
+
+
+@app.get("/vehicle/{vehicle_id}/status")
+async def get_vehicle_status(vehicle_id: int):
+    """
+    Get current status of a vehicle without full diagnostic
+    """
+    try:
+        test_df = ingestion_agent.load_test_data()
+        vehicle_data = test_df[test_df['vehicle_id'] == vehicle_id]
+
+        if len(vehicle_data) == 0:
+            raise HTTPException(status_code=404, detail=f"Vehicle {vehicle_id} not found")
+
+        # Get basic statistics
+        latest_data = vehicle_data.tail(50)
+        sensor_summary = ingestion_agent.get_sensor_summary(latest_data)
+
+        return {
+            'vehicle_id': vehicle_id,
+            'num_readings': len(vehicle_data),
+            'latest_timestamp': int(vehicle_data['timestamp'].iloc[-1]),
+            'has_anomalies': bool(vehicle_data['anomaly'].sum() > 0),
+            'total_anomalies': int(vehicle_data['anomaly'].sum()),
+            'sensor_summary': sensor_summary
+        }
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Failed to get vehicle status: {str(e)}")
+
+
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=8000)
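The `/batch-diagnose` handler reduces per-vehicle results to batch-level counts before returning them. The reduction can be sketched in isolation (the `summarize` helper and the three-vehicle result dict are hypothetical stand-ins for the orchestrator's output):

```python
def summarize(results: dict) -> dict:
    """Collapse per-vehicle diagnostic results into batch-level counts."""
    return {
        'total_vehicles': len(results),
        'successful': sum(1 for r in results.values() if r['success']),
        'with_anomalies': sum(
            1 for r in results.values()
            if r['success'] and r.get('anomaly_result', {}).get('anomaly_detected')
        ),
    }

# Hypothetical orchestrator output for three vehicles
results = {
    1: {'success': True, 'anomaly_result': {'anomaly_detected': True}},
    2: {'success': True, 'anomaly_result': {'anomaly_detected': False}},
    3: {'success': False, 'error': 'vehicle not found'},
}
print(summarize(results))  # {'total_vehicles': 3, 'successful': 2, 'with_anomalies': 1}
```

Failed vehicles are counted in the total but excluded from both the success and anomaly tallies, matching the endpoint's logic.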
src/models/anomaly_detector.py ADDED
@@ -0,0 +1,205 @@
1
+ """
2
+ Anomaly Detection Model using LSTM Neural Network
3
+ """
4
+ import torch
5
+ import torch.nn as nn
6
+ import numpy as np
7
+ from pathlib import Path
8
+ import pickle
9
+
10
+
11
+ class LSTMAnomalyDetector(nn.Module):
12
+ """
13
+ LSTM-based anomaly detection model for time-series sensor data
14
+ """
15
+
16
+ def __init__(self, input_size, hidden_size=64, num_layers=2, dropout=0.2):
17
+ super(LSTMAnomalyDetector, self).__init__()
18
+
19
+ self.hidden_size = hidden_size
20
+ self.num_layers = num_layers
21
+
22
+ # LSTM layers
23
+ self.lstm = nn.LSTM(
24
+ input_size=input_size,
25
+ hidden_size=hidden_size,
26
+ num_layers=num_layers,
27
+ batch_first=True,
28
+ dropout=dropout if num_layers > 1 else 0
29
+ )
30
+
31
+ # Fully connected layers
32
+ self.fc1 = nn.Linear(hidden_size, 32)
33
+ self.relu = nn.ReLU()
34
+ self.dropout = nn.Dropout(dropout)
35
+ self.fc2 = nn.Linear(32, 1)
36
+ self.sigmoid = nn.Sigmoid()
37
+
38
+ def forward(self, x):
39
+ # LSTM forward pass
40
+ lstm_out, _ = self.lstm(x)
41
+
42
+ # Take the last output
43
+ last_output = lstm_out[:, -1, :]
44
+
45
+ # Fully connected layers
46
+ out = self.fc1(last_output)
47
+ out = self.relu(out)
48
+ out = self.dropout(out)
49
+ out = self.fc2(out)
50
+ out = self.sigmoid(out)
51
+
52
+ return out
53
+
54
+
55
+ class AnomalyDetectionModel:
56
+ """
57
+ Wrapper class for anomaly detection model with training and inference
58
+ """
59
+
60
+ def __init__(self, input_size, sequence_length=50, device=None):
61
+ self.input_size = input_size
62
+ self.sequence_length = sequence_length
63
+ self.device = device or torch.device('cuda' if torch.cuda.is_available() else 'cpu')
64
+
65
+ self.model = LSTMAnomalyDetector(input_size).to(self.device)
66
+ self.criterion = nn.BCELoss()
67
+ self.optimizer = torch.optim.Adam(self.model.parameters(), lr=0.001)
68
+
69
+ print(f"Initialized Anomaly Detection Model on {self.device}")
70
+
71
+ def create_sequences(self, data, labels=None):
72
+ """
73
+ Create sequences for LSTM input
74
+
75
+ Args:
76
+ data: numpy array of shape (n_samples, n_features)
77
+ labels: optional numpy array of labels
78
+
79
+ Returns:
80
+ Sequences and labels (if provided)
81
+ """
82
+ sequences = []
83
+ seq_labels = []
84
+
85
+ for i in range(len(data) - self.sequence_length + 1):
86
+ seq = data[i:i + self.sequence_length]
87
+ sequences.append(seq)
88
+
89
+ if labels is not None:
90
+ # Label is 1 if any point in sequence is anomalous
91
+ label = labels[i + self.sequence_length - 1]
92
+ seq_labels.append(label)
93
+
94
+ sequences = np.array(sequences)
95
+
96
+ if labels is not None:
97
+ seq_labels = np.array(seq_labels)
98
+ return sequences, seq_labels
99
+
100
+ return sequences
101
+
102
+ def train_epoch(self, train_loader):
103
+ """Train for one epoch"""
104
+ self.model.train()
105
+ total_loss = 0
106
+
107
+ for batch_x, batch_y in train_loader:
+            batch_x = batch_x.to(self.device)
+            batch_y = batch_y.to(self.device)
+
+            # Forward pass
+            outputs = self.model(batch_x)
+            loss = self.criterion(outputs.squeeze(), batch_y.float())
+
+            # Backward pass
+            self.optimizer.zero_grad()
+            loss.backward()
+            self.optimizer.step()
+
+            total_loss += loss.item()
+
+        return total_loss / len(train_loader)
+
+    def evaluate(self, val_loader):
+        """Evaluate on validation set"""
+        self.model.eval()
+        total_loss = 0
+        all_preds = []
+        all_labels = []
+
+        with torch.no_grad():
+            for batch_x, batch_y in val_loader:
+                batch_x = batch_x.to(self.device)
+                batch_y = batch_y.to(self.device)
+
+                outputs = self.model(batch_x)
+                loss = self.criterion(outputs.squeeze(), batch_y.float())
+
+                total_loss += loss.item()
+
+                preds = (outputs.squeeze() > 0.5).cpu().numpy()
+                all_preds.extend(preds)
+                all_labels.extend(batch_y.cpu().numpy())
+
+        avg_loss = total_loss / len(val_loader)
+
+        # Calculate metrics
+        all_preds = np.array(all_preds)
+        all_labels = np.array(all_labels)
+
+        accuracy = (all_preds == all_labels).mean()
+
+        return avg_loss, accuracy
+
+    def predict(self, data):
+        """
+        Predict anomalies for given data
+
+        Args:
+            data: numpy array of shape (n_samples, n_features)
+
+        Returns:
+            Anomaly scores and binary predictions
+        """
+        self.model.eval()
+
+        # Create sequences
+        sequences = self.create_sequences(data)
+
+        # Convert to tensor
+        sequences_tensor = torch.FloatTensor(sequences).to(self.device)
+
+        # Predict
+        with torch.no_grad():
+            scores = self.model(sequences_tensor).squeeze().cpu().numpy()
+
+        # Binary predictions
+        predictions = (scores > 0.5).astype(int)
+
+        return scores, predictions
+
+    def save(self, path):
+        """Save model"""
+        path = Path(path)
+        path.parent.mkdir(parents=True, exist_ok=True)
+
+        torch.save({
+            'model_state_dict': self.model.state_dict(),
+            'optimizer_state_dict': self.optimizer.state_dict(),
+            'input_size': self.input_size,
+            'sequence_length': self.sequence_length,
+        }, path)
+
+        print(f"✓ Model saved to {path}")
+
+    def load(self, path):
+        """Load model"""
+        checkpoint = torch.load(path, map_location=self.device)
+
+        self.model.load_state_dict(checkpoint['model_state_dict'])
+        self.optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
+        self.input_size = checkpoint['input_size']
+        self.sequence_length = checkpoint['sequence_length']
+
+        print(f"✓ Model loaded from {path}")
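The score-to-label step at the end of `predict` can be checked in isolation; a minimal numpy sketch with made-up sigmoid scores (the 0.5 cutoff matches the code above):

```python
import numpy as np

# Hypothetical sigmoid outputs from the LSTM head
scores = np.array([0.12, 0.81, 0.47, 0.93])

# Same rule as predict(): scores above 0.5 become anomaly labels
predictions = (scores > 0.5).astype(int)

print(predictions.tolist())  # [0, 1, 0, 1]
```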
src/models/best_anomaly_detector.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ca38939ee83cea0a7846731c4888718af19218b6fc233918143bbedc7b1372fd
+size 825850
src/models/train_anomaly_detector.py ADDED
@@ -0,0 +1,116 @@
+"""
+Train the LSTM Anomaly Detection Model
+"""
+import pandas as pd
+import numpy as np
+import torch
+from torch.utils.data import TensorDataset, DataLoader
+from pathlib import Path
+import pickle
+from anomaly_detector import AnomalyDetectionModel
+
+
+def load_data(data_dir='data/processed'):
+    """Load preprocessed data"""
+    data_path = Path(data_dir)
+
+    train_df = pd.read_csv(data_path / 'train.csv')
+    val_df = pd.read_csv(data_path / 'val.csv')
+
+    # Load feature columns
+    with open(data_path / 'feature_columns.pkl', 'rb') as f:
+        feature_columns = pickle.load(f)
+
+    return train_df, val_df, feature_columns
+
+
+def prepare_data_by_vehicle(df, feature_columns, sequence_length=50):
+    """Prepare sequences grouped by vehicle"""
+    all_sequences = []
+    all_labels = []
+
+    for vehicle_id in df['vehicle_id'].unique():
+        vehicle_data = df[df['vehicle_id'] == vehicle_id]
+
+        features = vehicle_data[feature_columns].values
+        labels = vehicle_data['anomaly'].values
+
+        # Create sequences for this vehicle
+        for i in range(len(features) - sequence_length + 1):
+            seq = features[i:i + sequence_length]
+            label = labels[i + sequence_length - 1]
+
+            all_sequences.append(seq)
+            all_labels.append(label)
+
+    return np.array(all_sequences), np.array(all_labels)
+
+
+def train_model(epochs=20, batch_size=32, sequence_length=50):
+    """Train the anomaly detection model"""
+    print("="*60)
+    print("TRAINING ANOMALY DETECTION MODEL")
+    print("="*60)
+
+    # Load data
+    print("\nLoading data...")
+    train_df, val_df, feature_columns = load_data()
+    print(f"✓ Loaded train: {len(train_df)} records, val: {len(val_df)} records")
+    print(f"✓ Features: {len(feature_columns)}")
+
+    # Prepare sequences
+    print("\nPreparing sequences...")
+    X_train, y_train = prepare_data_by_vehicle(train_df, feature_columns, sequence_length)
+    X_val, y_val = prepare_data_by_vehicle(val_df, feature_columns, sequence_length)
+
+    print(f"✓ Train sequences: {X_train.shape}")
+    print(f"✓ Val sequences: {X_val.shape}")
+    print(f"✓ Train anomaly rate: {y_train.mean():.2%}")
+    print(f"✓ Val anomaly rate: {y_val.mean():.2%}")
+
+    # Create data loaders
+    train_dataset = TensorDataset(
+        torch.FloatTensor(X_train),
+        torch.FloatTensor(y_train)
+    )
+    val_dataset = TensorDataset(
+        torch.FloatTensor(X_val),
+        torch.FloatTensor(y_val)
+    )
+
+    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
+    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
+
+    # Initialize model
+    input_size = len(feature_columns)
+    model = AnomalyDetectionModel(input_size, sequence_length)
+
+    # Training loop
+    print(f"\nTraining for {epochs} epochs...")
+    print("-"*60)
+
+    best_val_loss = float('inf')
+
+    for epoch in range(epochs):
+        train_loss = model.train_epoch(train_loader)
+        val_loss, val_acc = model.evaluate(val_loader)
+
+        print(f"Epoch {epoch+1}/{epochs} - "
+              f"Train Loss: {train_loss:.4f}, "
+              f"Val Loss: {val_loss:.4f}, "
+              f"Val Acc: {val_acc:.4f}")
+
+        # Save best model
+        if val_loss < best_val_loss:
+            best_val_loss = val_loss
+            model.save('src/models/best_anomaly_detector.pth')
+
+    print("-"*60)
+    print(f"\n✓ Training complete! Best val loss: {best_val_loss:.4f}")
+    print("="*60)
+
+    return model
+
+
+if __name__ == '__main__':
+    model = train_model(epochs=20, batch_size=32, sequence_length=50)
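The sliding-window construction in `prepare_data_by_vehicle` can be sketched standalone with numpy; the toy arrays below are hypothetical, and each window is labeled by its last timestep exactly as in the script:

```python
import numpy as np

# Toy per-vehicle feature matrix: 6 timesteps, 2 features (hypothetical values)
features = np.arange(12, dtype=float).reshape(6, 2)
labels = np.array([0, 0, 0, 1, 0, 1])
sequence_length = 3

sequences, seq_labels = [], []
for i in range(len(features) - sequence_length + 1):
    # Window of consecutive readings...
    sequences.append(features[i:i + sequence_length])
    # ...labeled by the last timestep in the window
    seq_labels.append(labels[i + sequence_length - 1])

X = np.array(sequences)
y = np.array(seq_labels)
print(X.shape, y.tolist())  # (4, 3, 2) [0, 1, 0, 1]
```

With 6 timesteps and a window of 3, this yields 6 - 3 + 1 = 4 overlapping sequences per vehicle, so the effective training set grows well beyond the raw record count.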
src/orchestrator.py ADDED
@@ -0,0 +1,249 @@
+"""
+Multi-Agent Orchestrator using LangGraph
+Coordinates the execution of all diagnostic agents
+"""
+from typing import Dict, TypedDict, Annotated
+from langgraph.graph import StateGraph, END
+import operator
+
+from agents.data_ingestion_agent import DataIngestionAgent
+from agents.anomaly_detection_agent import AnomalyDetectionAgent
+from agents.root_cause_agent import RootCauseAnalysisAgent
+from agents.maintenance_recommendation_agent import MaintenanceRecommendationAgent
+from agents.report_generation_agent import ReportGenerationAgent
+
+
+class DiagnosticState(TypedDict):
+    """State object passed between agents"""
+    vehicle_id: int
+    n_readings: int
+    prepared_data: Dict
+    anomaly_result: Dict
+    root_cause_result: Dict
+    maintenance_result: Dict
+    report_result: Dict
+    error: str
+
+
+class VehicleDiagnosticOrchestrator:
+    """
+    Orchestrates the multi-agent vehicle diagnostic workflow using LangGraph
+    """
+
+    def __init__(self):
+        self.ingestion_agent = DataIngestionAgent()
+        self.anomaly_agent = AnomalyDetectionAgent()
+        self.root_cause_agent = RootCauseAnalysisAgent()
+        self.maintenance_agent = MaintenanceRecommendationAgent()
+        self.report_agent = ReportGenerationAgent()
+
+        self.workflow = self._build_workflow()
+
+    def _build_workflow(self) -> StateGraph:
+        """Build the LangGraph workflow"""
+
+        # Define the workflow graph
+        workflow = StateGraph(DiagnosticState)
+
+        # Add nodes for each agent
+        workflow.add_node("data_ingestion", self._run_data_ingestion)
+        workflow.add_node("anomaly_detection", self._run_anomaly_detection)
+        workflow.add_node("root_cause_analysis", self._run_root_cause_analysis)
+        workflow.add_node("maintenance_recommendation", self._run_maintenance_recommendation)
+        workflow.add_node("report_generation", self._run_report_generation)
+
+        # Define the workflow edges (sequential execution)
+        workflow.set_entry_point("data_ingestion")
+        workflow.add_edge("data_ingestion", "anomaly_detection")
+        workflow.add_edge("anomaly_detection", "root_cause_analysis")
+        workflow.add_edge("root_cause_analysis", "maintenance_recommendation")
+        workflow.add_edge("maintenance_recommendation", "report_generation")
+        workflow.add_edge("report_generation", END)
+
+        return workflow.compile()
+
+    def _run_data_ingestion(self, state: DiagnosticState) -> DiagnosticState:
+        """Execute Data Ingestion Agent"""
+        try:
+            prepared_data = self.ingestion_agent.run(
+                state['vehicle_id'],
+                state.get('n_readings')
+            )
+            state['prepared_data'] = prepared_data
+        except Exception as e:
+            state['error'] = f"Data Ingestion Error: {str(e)}"
+
+        return state
+
+    def _run_anomaly_detection(self, state: DiagnosticState) -> DiagnosticState:
+        """Execute Anomaly Detection Agent"""
+        try:
+            if 'error' not in state:
+                anomaly_result = self.anomaly_agent.run(state['prepared_data'])
+                state['anomaly_result'] = anomaly_result
+        except Exception as e:
+            state['error'] = f"Anomaly Detection Error: {str(e)}"
+
+        return state
+
+    def _run_root_cause_analysis(self, state: DiagnosticState) -> DiagnosticState:
+        """Execute Root Cause Analysis Agent"""
+        try:
+            if 'error' not in state:
+                root_cause_result = self.root_cause_agent.run(state['anomaly_result'])
+                state['root_cause_result'] = root_cause_result
+        except Exception as e:
+            state['error'] = f"Root Cause Analysis Error: {str(e)}"
+
+        return state
+
+    def _run_maintenance_recommendation(self, state: DiagnosticState) -> DiagnosticState:
+        """Execute Maintenance Recommendation Agent"""
+        try:
+            if 'error' not in state:
+                maintenance_result = self.maintenance_agent.run(state['root_cause_result'])
+                state['maintenance_result'] = maintenance_result
+        except Exception as e:
+            state['error'] = f"Maintenance Recommendation Error: {str(e)}"
+
+        return state
+
+    def _run_report_generation(self, state: DiagnosticState) -> DiagnosticState:
+        """Execute Report Generation Agent"""
+        try:
+            if 'error' not in state:
+                report_result = self.report_agent.run(
+                    state['vehicle_id'],
+                    state['prepared_data'],
+                    state['anomaly_result'],
+                    state['root_cause_result'],
+                    state['maintenance_result']
+                )
+                state['report_result'] = report_result
+        except Exception as e:
+            state['error'] = f"Report Generation Error: {str(e)}"
+
+        return state
+
+    def diagnose_vehicle(self, vehicle_id: int, n_readings: int = None) -> Dict:
+        """
+        Run complete diagnostic workflow for a vehicle
+
+        Args:
+            vehicle_id: ID of the vehicle to diagnose
+            n_readings: Optional number of recent readings to analyze
+
+        Returns:
+            Dictionary containing complete diagnostic results
+        """
+        print("\n" + "="*60)
+        print("VEHICLE DIAGNOSTIC ORCHESTRATOR")
+        print("="*60)
+        print(f"Starting diagnostic workflow for Vehicle {vehicle_id}")
+        print("="*60 + "\n")
+
+        # Initialize state
+        initial_state = {
+            'vehicle_id': vehicle_id,
+            'n_readings': n_readings
+        }
+
+        # Execute workflow
+        final_state = self.workflow.invoke(initial_state)
+
+        # Check for errors
+        if 'error' in final_state:
+            print(f"\n❌ Error occurred: {final_state['error']}")
+            return {
+                'success': False,
+                'error': final_state['error'],
+                'vehicle_id': vehicle_id
+            }
+
+        print("\n" + "="*60)
+        print("DIAGNOSTIC WORKFLOW COMPLETED SUCCESSFULLY")
+        print("="*60)
+
+        # Return comprehensive results
+        return {
+            'success': True,
+            'vehicle_id': vehicle_id,
+            'prepared_data': final_state.get('prepared_data'),
+            'anomaly_result': final_state.get('anomaly_result'),
+            'root_cause_result': final_state.get('root_cause_result'),
+            'maintenance_result': final_state.get('maintenance_result'),
+            'report': final_state.get('report_result')
+        }
+
+    def diagnose_multiple_vehicles(self, vehicle_ids: list, n_readings: int = None) -> Dict:
+        """
+        Run diagnostics for multiple vehicles
+
+        Args:
+            vehicle_ids: List of vehicle IDs
+            n_readings: Optional number of recent readings to analyze
+
+        Returns:
+            Dictionary mapping vehicle IDs to diagnostic results
+        """
+        results = {}
+
+        print(f"\n{'='*60}")
+        print(f"BATCH DIAGNOSTICS - {len(vehicle_ids)} vehicles")
+        print(f"{'='*60}\n")
+
+        for i, vehicle_id in enumerate(vehicle_ids, 1):
+            print(f"\nProcessing vehicle {i}/{len(vehicle_ids)}: {vehicle_id}")
+            results[vehicle_id] = self.diagnose_vehicle(vehicle_id, n_readings)
+
+        print(f"\n{'='*60}")
+        print("BATCH DIAGNOSTICS COMPLETED")
+        print(f"{'='*60}")
+
+        # Summary statistics
+        successful = sum(1 for r in results.values() if r['success'])
+        with_anomalies = sum(1 for r in results.values()
+                             if r['success'] and r.get('anomaly_result', {}).get('anomaly_detected'))
+
+        print("\nSummary:")
+        print(f"  Total vehicles: {len(vehicle_ids)}")
+        print(f"  Successfully analyzed: {successful}")
+        print(f"  Vehicles with anomalies: {with_anomalies}")
+
+        return results
+
+
+def main():
+    """Test the orchestrator"""
+    orchestrator = VehicleDiagnosticOrchestrator()
+
+    # Load test data to get vehicle IDs
+    from agents.data_ingestion_agent import DataIngestionAgent
+    ingestion_agent = DataIngestionAgent()
+    test_df = ingestion_agent.load_test_data()
+
+    # Get a vehicle with anomalies
+    test_vehicle_id = None
+    for vid in test_df['vehicle_id'].unique()[:10]:
+        if test_df[test_df['vehicle_id'] == vid]['anomaly'].sum() > 0:
+            test_vehicle_id = vid
+            break
+
+    if test_vehicle_id:
+        # Run single vehicle diagnostic
+        result = orchestrator.diagnose_vehicle(test_vehicle_id, n_readings=200)

+        if result['success']:
+            print("\n" + "="*60)
+            print("DIAGNOSTIC REPORT PREVIEW")
+            print("="*60)
+            report = result['report']['full_report']
+            print(report[:2000] + "\n...\n")
+
+            print("\nNatural Language Summary:")
+            print("-"*60)
+            print(result['report']['natural_language_summary'])
+
+
+if __name__ == '__main__':
+    main()
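The orchestrator's control flow reduces to sequential state-passing with an early exit once `error` is set; a dependency-free sketch of that pattern (the stub steps below are hypothetical stand-ins, not the real agents):

```python
# Minimal stand-in for the sequential agent pipeline: each step reads and
# writes a shared state dict, and later steps are skipped once 'error' is set.
def ingest(state):
    state['prepared_data'] = {'vehicle_id': state['vehicle_id']}
    return state

def detect(state):
    if 'error' not in state:
        state['anomaly_result'] = {'anomaly_detected': False}
    return state

def run_pipeline(vehicle_id, steps=(ingest, detect)):
    state = {'vehicle_id': vehicle_id}
    for step in steps:
        state = step(state)
    return state

final = run_pipeline(7)
print(final['anomaly_result'])  # {'anomaly_detected': False}
```

LangGraph adds graph compilation, conditional edges, and checkpointing on top of this, but the state-dict contract each node honors is the same.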
src/ui/gradio_app.py ADDED
@@ -0,0 +1,307 @@
+"""
+Gradio UI for Vehicle Diagnostics Agent
+"""
+import gradio as gr
+import sys
+from pathlib import Path
+import pandas as pd
+import plotly.graph_objects as go
+import plotly.express as px
+
+# Add parent directory to path
+sys.path.append(str(Path(__file__).parent.parent))
+
+from orchestrator import VehicleDiagnosticOrchestrator
+from agents.data_ingestion_agent import DataIngestionAgent
+
+# Initialize components
+orchestrator = VehicleDiagnosticOrchestrator()
+ingestion_agent = DataIngestionAgent()
+
+# Load available vehicles
+test_df = ingestion_agent.load_test_data()
+available_vehicles = sorted(test_df['vehicle_id'].unique().tolist())
+
+
+def run_diagnostic(vehicle_id, n_readings):
+    """Run diagnostic for a vehicle"""
+    try:
+        vehicle_id = int(vehicle_id)
+        n_readings = int(n_readings) if n_readings else None
+
+        # Run diagnostic
+        result = orchestrator.diagnose_vehicle(vehicle_id, n_readings)
+
+        if not result['success']:
+            return f"❌ Error: {result.get('error')}", "", "", None
+
+        # Extract results
+        anomaly_result = result.get('anomaly_result', {})
+        report = result.get('report', {})
+
+        # Status summary
+        if anomaly_result.get('anomaly_detected'):
+            status = f"""
+## 🚨 ALERT: Anomalies Detected
+
+**Vehicle ID:** {vehicle_id}
+**Anomaly Score:** {anomaly_result.get('overall_score', 0):.3f}
+**Anomalous Readings:** {anomaly_result.get('num_anomalies', 0)} / {len(anomaly_result.get('anomaly_predictions', []))} ({anomaly_result.get('anomaly_rate', 0):.1%})
+**Status:** ⚠️ Requires Attention
+"""
+        else:
+            status = f"""
+## ✅ Vehicle Healthy
+
+**Vehicle ID:** {vehicle_id}
+**Status:** 🟢 All Systems Normal
+**Anomaly Score:** {anomaly_result.get('overall_score', 0):.3f}
+"""
+
+        # Natural language summary
+        nl_summary = report.get('natural_language_summary', 'No summary available')
+
+        # Full report
+        full_report = report.get('full_report', 'No report available')
+
+        # Create visualization
+        fig = create_anomaly_visualization(anomaly_result)
+
+        return status, nl_summary, full_report, fig
+
+    except Exception as e:
+        return f"❌ Error: {str(e)}", "", "", None
+
+
+def create_anomaly_visualization(anomaly_result):
+    """Create visualization of anomaly detection results"""
+    try:
+        timestamps = anomaly_result.get('timestamps', [])
+        predictions = anomaly_result.get('anomaly_predictions', [])
+        scores = anomaly_result.get('anomaly_scores', [])
+
+        if len(timestamps) == 0:
+            return None
+
+        # Create figure with secondary y-axis
+        fig = go.Figure()
+
+        # Add anomaly predictions
+        fig.add_trace(go.Scatter(
+            x=timestamps,
+            y=predictions,
+            mode='lines',
+            name='Anomaly Detected',
+            line=dict(color='red', width=2),
+            fill='tozeroy',
+            fillcolor='rgba(255, 0, 0, 0.2)'
+        ))
+
+        # Add anomaly scores
+        fig.add_trace(go.Scatter(
+            x=timestamps,
+            y=scores,
+            mode='lines',
+            name='Anomaly Score',
+            line=dict(color='orange', width=1, dash='dot'),
+            yaxis='y2'
+        ))
+
+        # Update layout
+        fig.update_layout(
+            title='Anomaly Detection Over Time',
+            xaxis_title='Timestamp',
+            yaxis_title='Anomaly Detected (0/1)',
+            yaxis2=dict(
+                title='Anomaly Score',
+                overlaying='y',
+                side='right'
+            ),
+            hovermode='x unified',
+            template='plotly_white',
+            height=400
+        )
+
+        return fig
+
+    except Exception as e:
+        print(f"Visualization error: {e}")
+        return None
+
+
+def get_vehicle_info(vehicle_id):
+    """Get basic info about a vehicle"""
+    try:
+        vehicle_id = int(vehicle_id)
+        vehicle_data = test_df[test_df['vehicle_id'] == vehicle_id]
+
+        if len(vehicle_data) == 0:
+            return "Vehicle not found"
+
+        num_readings = len(vehicle_data)
+        has_anomalies = vehicle_data['anomaly'].sum() > 0
+        num_anomalies = vehicle_data['anomaly'].sum()
+
+        info = f"""
+### Vehicle Information
+
+**Vehicle ID:** {vehicle_id}
+**Total Readings:** {num_readings}
+**Known Anomalies:** {num_anomalies} ({num_anomalies/num_readings:.1%})
+**Status:** {'⚠️ Has anomalies' if has_anomalies else '✅ Healthy'}
+"""
+        return info
+
+    except Exception as e:
+        return f"Error: {str(e)}"
+
+
+def list_vehicles_with_anomalies():
+    """List vehicles that have anomalies"""
+    vehicles_with_anomalies = []
+
+    for vid in available_vehicles[:50]:  # Limit to first 50
+        vehicle_data = test_df[test_df['vehicle_id'] == vid]
+        if vehicle_data['anomaly'].sum() > 0:
+            vehicles_with_anomalies.append({
+                'Vehicle ID': vid,
+                'Total Readings': len(vehicle_data),
+                'Anomalies': int(vehicle_data['anomaly'].sum()),
+                'Anomaly Rate': f"{vehicle_data['anomaly'].sum()/len(vehicle_data):.1%}"
+            })
+
+    if vehicles_with_anomalies:
+        df = pd.DataFrame(vehicles_with_anomalies)
+        return df
+    else:
+        return pd.DataFrame({'Message': ['No vehicles with anomalies found']})
+
+
+# Create Gradio interface
+with gr.Blocks(title="Vehicle Diagnostics Agent") as demo:
+    gr.Markdown("""
+    # 🚗 Vehicle Diagnostics Agent
+    ### Multi-Agent AI System for Predictive Vehicle Diagnostics
+
+    This system uses advanced AI agents to analyze vehicle sensor data, detect anomalies,
+    identify root causes, and provide actionable maintenance recommendations.
+    """)
+
+    with gr.Tab("🔍 Single Vehicle Diagnostic"):
+        gr.Markdown("### Analyze a single vehicle")
+
+        with gr.Row():
+            with gr.Column(scale=1):
+                vehicle_id_input = gr.Dropdown(
+                    choices=available_vehicles,
+                    label="Select Vehicle ID",
+                    value=available_vehicles[0] if available_vehicles else None
+                )
+                n_readings_input = gr.Number(
+                    label="Number of Recent Readings (optional)",
+                    value=200,
+                    precision=0
+                )
+
+                diagnose_btn = gr.Button("🔬 Run Diagnostic", variant="primary", size="lg")
+
+                gr.Markdown("---")
+                vehicle_info_output = gr.Markdown(label="Vehicle Info")
+
+                # Auto-update vehicle info when selection changes
+                vehicle_id_input.change(
+                    fn=get_vehicle_info,
+                    inputs=[vehicle_id_input],
+                    outputs=[vehicle_info_output]
+                )
+
+            with gr.Column(scale=2):
+                status_output = gr.Markdown(label="Diagnostic Status")
+                summary_output = gr.Textbox(
+                    label="📋 Summary",
+                    lines=5,
+                    max_lines=10
+                )
+
+        with gr.Row():
+            anomaly_plot = gr.Plot(label="Anomaly Detection Visualization")
+
+        with gr.Row():
+            full_report_output = gr.Textbox(
+                label="📄 Full Diagnostic Report",
+                lines=20,
+                max_lines=30
+            )
+
+        diagnose_btn.click(
+            fn=run_diagnostic,
+            inputs=[vehicle_id_input, n_readings_input],
+            outputs=[status_output, summary_output, full_report_output, anomaly_plot]
+        )
+
+    with gr.Tab("📊 Vehicle Overview"):
+        gr.Markdown("### Vehicles with Known Anomalies")
+
+        refresh_btn = gr.Button("🔄 Refresh List", variant="secondary")
+        vehicles_table = gr.Dataframe(
+            value=list_vehicles_with_anomalies(),
+            label="Vehicles Requiring Attention"
+        )
+
+        refresh_btn.click(
+            fn=list_vehicles_with_anomalies,
+            inputs=[],
+            outputs=[vehicles_table]
+        )
+
+    with gr.Tab("ℹ️ About"):
+        gr.Markdown("""
+        ## About Vehicle Diagnostics Agent
+
+        ### System Architecture
+
+        This system employs a multi-agent architecture with the following components:
+
+        1. **Data Ingestion Agent** - Loads and prepares vehicle sensor data
+        2. **Anomaly Detection Agent** - Uses LSTM neural networks to detect unusual patterns
+        3. **Root Cause Analysis Agent** - Identifies the underlying causes of anomalies
+        4. **Maintenance Recommendation Agent** - Provides actionable maintenance steps
+        5. **Report Generation Agent** - Creates comprehensive diagnostic reports
+
+        ### Technology Stack
+
+        - **ML Framework:** PyTorch (LSTM-based anomaly detection)
+        - **Orchestration:** LangGraph for multi-agent coordination
+        - **Backend:** FastAPI for REST API
+        - **Frontend:** Gradio for interactive UI
+        - **Data Processing:** Pandas, NumPy, Scikit-learn
+
+        ### Features
+
+        - ✅ Real-time anomaly detection
+        - ✅ Root cause analysis with fault code mapping
+        - ✅ Maintenance cost estimation
+        - ✅ Natural language summaries
+        - ✅ Interactive visualizations
+        - ✅ Batch processing support
+
+        ### Dataset
+
+        The system analyzes synthetic vehicle sensor data including:
+        - Engine temperature, RPM, speed
+        - Battery voltage and health
+        - Oil and fuel pressure
+        - Tire pressure (all four wheels)
+        - Vibration levels
+        - And more...
+
+        ---
+
+        **Version:** 1.0.0
+        **Author:** Vehicle Diagnostics Team
+        **License:** MIT
+        """)
+
+# Launch the app
+if __name__ == "__main__":
+    demo.launch(server_name="0.0.0.0", server_port=7860, share=False)
src/utils/data_preprocessing.py ADDED
@@ -0,0 +1,209 @@
+"""
+Data preprocessing and feature engineering for vehicle sensor data
+"""
+import numpy as np
+import pandas as pd
+from sklearn.preprocessing import StandardScaler, MinMaxScaler
+from sklearn.model_selection import train_test_split
+from pathlib import Path
+import pickle
+
+
+class VehicleDataPreprocessor:
+    """Preprocess and engineer features from vehicle sensor data"""
+
+    def __init__(self, data_path='data/raw/vehicle_sensor_data.csv'):
+        self.data_path = Path(data_path)
+        self.scaler = StandardScaler()
+        self.feature_columns = None
+        self.target_column = 'anomaly'
+
+    def load_data(self):
+        """Load raw sensor data"""
+        print(f"Loading data from {self.data_path}...")
+        df = pd.read_csv(self.data_path)
+        print(f"✓ Loaded {len(df)} records for {df['vehicle_id'].nunique()} vehicles")
+        return df
+
+    def clean_data(self, df):
+        """Clean and filter noisy data"""
+        print("Cleaning data...")
+
+        # Remove duplicates
+        df = df.drop_duplicates()
+
+        # Handle missing values
+        df = df.fillna(df.median(numeric_only=True))
+
+        # Clip extreme outliers using an IQR-style rule built on the
+        # 1st/99th percentiles of each sensor column
+        sensor_cols = [col for col in df.columns if col not in ['vehicle_id', 'timestamp', 'anomaly']]
+
+        for col in sensor_cols:
+            Q1 = df[col].quantile(0.01)
+            Q3 = df[col].quantile(0.99)
+            IQR = Q3 - Q1
+            lower_bound = Q1 - 3 * IQR
+            upper_bound = Q3 + 3 * IQR
+            df[col] = df[col].clip(lower_bound, upper_bound)
+
+        print(f"✓ Cleaned data: {len(df)} records remaining")
+        return df
+
+    def apply_moving_average(self, df, window=5):
+        """Apply moving average filter to reduce noise"""
+        print(f"Applying moving average filter (window={window})...")
+
+        sensor_cols = [col for col in df.columns if col not in ['vehicle_id', 'timestamp', 'anomaly']]
+
+        # Group by vehicle and apply rolling average
+        for col in sensor_cols:
+            df[f'{col}_ma'] = df.groupby('vehicle_id')[col].transform(
+                lambda x: x.rolling(window=window, min_periods=1).mean()
+            )
+
+        print(f"✓ Applied moving average to {len(sensor_cols)} sensors")
+        return df
+
+    def engineer_features(self, df):
+        """Create domain-specific features"""
+        print("Engineering features...")
+
+        # Rate of change features
+        sensor_cols = [col for col in df.columns if col not in ['vehicle_id', 'timestamp', 'anomaly'] and not col.endswith('_ma')]
+
+        for col in sensor_cols:
+            # Rate of change
+            df[f'{col}_rate'] = df.groupby('vehicle_id')[col].diff()
+
+            # Rolling statistics
+            df[f'{col}_std'] = df.groupby('vehicle_id')[col].transform(
+                lambda x: x.rolling(window=10, min_periods=1).std()
+            )
+
+        # Domain-specific features
+        # Temperature differential
+        df['temp_differential'] = df['engine_temp'] - df['coolant_temp']
+
+        # Tire pressure imbalance
+        df['tire_pressure_imbalance'] = df[['tire_pressure_fl', 'tire_pressure_fr',
+                                            'tire_pressure_rl', 'tire_pressure_rr']].std(axis=1)
+
+        # Engine stress indicator
+        df['engine_stress'] = (df['rpm'] / 1000) * (df['engine_temp'] / 100)
+
+        # Battery health indicator
+        df['battery_health'] = df['battery_voltage'] / 12.6  # Normalized to ideal voltage
+
+        # Fill NaN values created by diff and rolling operations
+        df = df.fillna(0)
+
+        print(f"✓ Engineered features: {df.shape[1]} total columns")
+        return df
+
+    def normalize_features(self, df, fit=True):
+        """Normalize sensor values"""
+        print("Normalizing features...")
+
+        # Select feature columns (exclude metadata and target)
+        exclude_cols = ['vehicle_id', 'timestamp', 'anomaly']
+        self.feature_columns = [col for col in df.columns if col not in exclude_cols]
+
+        if fit:
+            df[self.feature_columns] = self.scaler.fit_transform(df[self.feature_columns])
+        else:
+            df[self.feature_columns] = self.scaler.transform(df[self.feature_columns])
+
+        print(f"✓ Normalized {len(self.feature_columns)} features")
+        return df
+
+    def split_data(self, df, test_size=0.2, val_size=0.1):
+        """Split data into train, validation, and test sets"""
+        print("Splitting data...")
+
+        # Split by vehicle to avoid data leakage
+        vehicle_ids = df['vehicle_id'].unique()
+
+        # First split: train+val vs test
+        train_val_ids, test_ids = train_test_split(
+            vehicle_ids, test_size=test_size, random_state=42
+        )
+
+        # Second split: train vs val
+        train_ids, val_ids = train_test_split(
+            train_val_ids, test_size=val_size/(1-test_size), random_state=42
+        )
+
+        train_df = df[df['vehicle_id'].isin(train_ids)]
+        val_df = df[df['vehicle_id'].isin(val_ids)]
+        test_df = df[df['vehicle_id'].isin(test_ids)]
+
+        print(f"✓ Train: {len(train_df)} records ({len(train_ids)} vehicles)")
+        print(f"✓ Val: {len(val_df)} records ({len(val_ids)} vehicles)")
+        print(f"✓ Test: {len(test_df)} records ({len(test_ids)} vehicles)")
+
+        return train_df, val_df, test_df
+
+    def save_processed_data(self, train_df, val_df, test_df, output_dir='data/processed'):
+        """Save processed datasets"""
+        output_path = Path(output_dir)
+        output_path.mkdir(parents=True, exist_ok=True)
+
+        print(f"Saving processed data to {output_path}...")
+
+        train_df.to_csv(output_path / 'train.csv', index=False)
+        val_df.to_csv(output_path / 'val.csv', index=False)
+        test_df.to_csv(output_path / 'test.csv', index=False)
+
+        # Save scaler
+        with open(output_path / 'scaler.pkl', 'wb') as f:
+            pickle.dump(self.scaler, f)
+
+        # Save feature columns
+        with open(output_path / 'feature_columns.pkl', 'wb') as f:
+            pickle.dump(self.feature_columns, f)
+
+        print("✓ Saved all processed datasets and preprocessing artifacts")
+
+        # Print statistics
+        print("\nDataset Statistics:")
+        print(f"Train anomaly rate: {train_df['anomaly'].mean():.2%}")
+        print(f"Val anomaly rate: {val_df['anomaly'].mean():.2%}")
+        print(f"Test anomaly rate: {test_df['anomaly'].mean():.2%}")
+
+    def preprocess_pipeline(self):
+        """Run complete preprocessing pipeline"""
+        print("="*60)
+        print("VEHICLE DATA PREPROCESSING PIPELINE")
+        print("="*60)
+
+        # Load data
+        df = self.load_data()
+
+        # Clean data
+        df = self.clean_data(df)
+
+        # Apply filters
+        df = self.apply_moving_average(df, window=5)
+
+        # Engineer features
+        df = self.engineer_features(df)
+
+        # Normalize features
+        df = self.normalize_features(df, fit=True)
+
+        # Split data
+        train_df, val_df, test_df = self.split_data(df)
+
+        # Save processed data
+        self.save_processed_data(train_df, val_df, test_df)
+
+        print("\n" + "="*60)
+        print("PREPROCESSING COMPLETE!")
+        print("="*60)
+
+        return train_df, val_df, test_df
+
+
+if __name__ == '__main__':
+    preprocessor = VehicleDataPreprocessor()
+    train_df, val_df, test_df = preprocessor.preprocess_pipeline()
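The quantile-based clipping in `clean_data` can be tried on a toy column; the values and column name below are hypothetical:

```python
import pandas as pd

# Hypothetical noisy sensor column with two obvious outliers
df = pd.DataFrame({'engine_temp': [90.0, 92.0, 91.0, 500.0, -40.0, 93.0]})

# Same rule as clean_data(): bounds built from the 1st/99th percentiles
q_low = df['engine_temp'].quantile(0.01)
q_high = df['engine_temp'].quantile(0.99)
spread = q_high - q_low
df['engine_temp'] = df['engine_temp'].clip(q_low - 3 * spread, q_high + 3 * spread)

# Every value now sits inside the computed bounds
print(df['engine_temp'].between(q_low - 3 * spread, q_high + 3 * spread).all())  # True
```

Clipping (rather than dropping rows) keeps the per-vehicle time series contiguous, which matters because the later windowing step assumes consecutive readings.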
src/utils/download_data.py ADDED
@@ -0,0 +1,158 @@
+"""
+Download NASA Turbofan Engine Degradation Dataset
+This dataset simulates engine sensor data with degradation patterns
+"""
+import os
+import zipfile
+import requests
+from pathlib import Path
+from tqdm import tqdm
+
+
+def download_file(url, destination):
+    """Download a file with a progress bar."""
+    response = requests.get(url, stream=True)
+    response.raise_for_status()
+    total_size = int(response.headers.get('content-length', 0))
+
+    with open(destination, 'wb') as file, tqdm(
+        desc=destination.name,
+        total=total_size,
+        unit='iB',
+        unit_scale=True,
+        unit_divisor=1024,
+    ) as progress_bar:
+        for data in response.iter_content(chunk_size=1024):
+            size = file.write(data)
+            progress_bar.update(size)
+
+
+def download_nasa_turbofan_data(data_dir='data/raw'):
+    """
+    Download the NASA Turbofan Engine Degradation Simulation Data Set.
+    Source: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/
+    """
+    data_path = Path(data_dir)
+    data_path.mkdir(parents=True, exist_ok=True)
+
+    # NASA C-MAPSS Dataset URL
+    url = "https://ti.arc.nasa.gov/c/6/"
+
+    print("Downloading NASA Turbofan Engine Degradation Dataset...")
+    print("This dataset contains simulated engine sensor data with degradation patterns")
+
+    # Alternative: use a direct download link or create synthetic data.
+    # Since the NASA link requires a manual download, we create a synthetic dataset instead.
+    print("\nNote: Creating synthetic vehicle sensor dataset based on NASA patterns...")
+
+    return create_synthetic_vehicle_data(data_path)
+
+
+def create_synthetic_vehicle_data(data_path):
+    """
+    Create synthetic vehicle sensor data with realistic patterns.
+    Simulates engine temp, RPM, speed, battery voltage, oil pressure, etc.
+    """
+    import numpy as np
+    import pandas as pd
+
+    print("Generating synthetic vehicle sensor data...")
+
+    np.random.seed(42)
+
+    # Number of vehicles and time steps
+    n_vehicles = 100
+    n_timesteps = 500
+
+    datasets = {}
+
+    for vehicle_id in range(1, n_vehicles + 1):
+        data = []
+
+        # Decide whether this vehicle develops an anomaly (~30% do)
+        has_anomaly = np.random.rand() > 0.7
+        anomaly_start = np.random.randint(300, 450) if has_anomaly else n_timesteps + 1
+
+        for t in range(n_timesteps):
+            # Base sensor readings with some noise
+            base_engine_temp = 90 + np.random.normal(0, 5)
+            base_rpm = 2000 + np.random.normal(0, 200)
+            base_speed = 60 + np.random.normal(0, 10)
+            base_battery = 12.6 + np.random.normal(0, 0.2)
+            base_oil_pressure = 40 + np.random.normal(0, 3)
+            base_coolant_temp = 85 + np.random.normal(0, 4)
+            base_fuel_pressure = 50 + np.random.normal(0, 2)
+            base_throttle = 50 + np.random.normal(0, 10)
+            base_brake_temp = 150 + np.random.normal(0, 15)
+            base_tire_pressure_fl = 32 + np.random.normal(0, 0.5)
+            base_tire_pressure_fr = 32 + np.random.normal(0, 0.5)
+            base_tire_pressure_rl = 32 + np.random.normal(0, 0.5)
+            base_tire_pressure_rr = 32 + np.random.normal(0, 0.5)
+            base_vibration = 0.5 + np.random.normal(0, 0.1)
+
+            # Introduce gradual degradation after anomaly_start
+            if t >= anomaly_start:
+                degradation_factor = (t - anomaly_start) / 100
+
+                # Engine overheating
+                base_engine_temp += degradation_factor * 20
+                base_coolant_temp += degradation_factor * 15
+
+                # Oil pressure drop
+                base_oil_pressure -= degradation_factor * 10
+
+                # Battery degradation
+                base_battery -= degradation_factor * 0.5
+
+                # Increased vibration
+                base_vibration += degradation_factor * 0.3
+
+                # Tire pressure issues
+                if np.random.rand() > 0.8:
+                    base_tire_pressure_fl -= degradation_factor * 2
+
+            # Create data point
+            data_point = {
+                'vehicle_id': vehicle_id,
+                'timestamp': t,
+                'engine_temp': max(0, base_engine_temp),
+                'rpm': max(0, base_rpm),
+                'speed': max(0, base_speed),
+                'battery_voltage': max(0, base_battery),
+                'oil_pressure': max(0, base_oil_pressure),
+                'coolant_temp': max(0, base_coolant_temp),
+                'fuel_pressure': max(0, base_fuel_pressure),
+                'throttle_position': np.clip(base_throttle, 0, 100),
+                'brake_temp': max(0, base_brake_temp),
+                'tire_pressure_fl': max(0, base_tire_pressure_fl),
+                'tire_pressure_fr': max(0, base_tire_pressure_fr),
+                'tire_pressure_rl': max(0, base_tire_pressure_rl),
+                'tire_pressure_rr': max(0, base_tire_pressure_rr),
+                'vibration_level': max(0, base_vibration),
+                'anomaly': 1 if t >= anomaly_start else 0
+            }
+            data.append(data_point)
+
+        datasets[f'vehicle_{vehicle_id}'] = pd.DataFrame(data)
+
+    # Combine all vehicles into one dataset
+    full_dataset = pd.concat(datasets.values(), ignore_index=True)
+
+    # Save to CSV
+    output_file = data_path / 'vehicle_sensor_data.csv'
+    full_dataset.to_csv(output_file, index=False)
+    print(f"✓ Saved synthetic vehicle sensor data to {output_file}")
+    print(f"  - Total records: {len(full_dataset)}")
+    print(f"  - Vehicles: {n_vehicles}")
+    print(f"  - Timesteps per vehicle: {n_timesteps}")
+    print("  - Anomaly rate: ~30%")
+
+    # Create summary statistics
+    summary = full_dataset.groupby('vehicle_id')['anomaly'].sum()
+    vehicles_with_anomalies = (summary > 0).sum()
+    print(f"  - Vehicles with anomalies: {vehicles_with_anomalies}/{n_vehicles}")
+
+    return output_file
+
+
+if __name__ == '__main__':
+    download_nasa_turbofan_data()
tests/test_agents.py ADDED
@@ -0,0 +1,197 @@
+"""
+Unit tests for individual agents
+"""
+import pytest
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent.parent / 'src'))
+
+from agents.data_ingestion_agent import DataIngestionAgent
+from agents.anomaly_detection_agent import AnomalyDetectionAgent
+from agents.root_cause_agent import RootCauseAnalysisAgent
+from agents.maintenance_recommendation_agent import MaintenanceRecommendationAgent
+from agents.report_generation_agent import ReportGenerationAgent
+
+
+class TestDataIngestionAgent:
+    """Test Data Ingestion Agent"""
+
+    def test_load_data(self):
+        """Test loading test data"""
+        agent = DataIngestionAgent()
+        df = agent.load_test_data()
+
+        assert df is not None
+        assert len(df) > 0
+        assert 'vehicle_id' in df.columns
+        assert 'timestamp' in df.columns
+
+    def test_get_vehicle_data(self):
+        """Test getting data for a specific vehicle"""
+        agent = DataIngestionAgent()
+        df = agent.load_test_data()
+        vehicle_id = df['vehicle_id'].iloc[0]
+
+        vehicle_data = agent.get_vehicle_data(vehicle_id)
+
+        assert len(vehicle_data) > 0
+        assert (vehicle_data['vehicle_id'] == vehicle_id).all()
+
+    def test_prepare_for_analysis(self):
+        """Test data preparation"""
+        agent = DataIngestionAgent()
+        df = agent.load_test_data()
+        vehicle_id = df['vehicle_id'].iloc[0]
+        vehicle_data = agent.get_vehicle_data(vehicle_id)
+
+        prepared = agent.prepare_for_analysis(vehicle_data)
+
+        assert 'vehicle_id' in prepared
+        assert 'features' in prepared
+        assert 'timestamps' in prepared
+        assert prepared['vehicle_id'] == vehicle_id
+
+
+class TestAnomalyDetectionAgent:
+    """Test Anomaly Detection Agent"""
+
+    def test_initialization(self):
+        """Test agent initialization"""
+        agent = AnomalyDetectionAgent()
+        assert agent is not None
+
+    def test_detect_anomalies(self):
+        """Test anomaly detection"""
+        ingestion_agent = DataIngestionAgent()
+        detection_agent = AnomalyDetectionAgent()
+
+        df = ingestion_agent.load_test_data()
+        vehicle_id = df['vehicle_id'].iloc[0]
+
+        prepared_data = ingestion_agent.run(vehicle_id, n_readings=100)
+        result = detection_agent.run(prepared_data)
+
+        assert 'vehicle_id' in result
+        assert 'anomaly_detected' in result
+        assert 'overall_score' in result
+        assert 'anomaly_predictions' in result
+
+
+class TestRootCauseAnalysisAgent:
+    """Test Root Cause Analysis Agent"""
+
+    def test_initialization(self):
+        """Test agent initialization"""
+        agent = RootCauseAnalysisAgent()
+        assert agent is not None
+        assert len(agent.fault_patterns) > 0
+
+    def test_analyze_no_anomalies(self):
+        """Test analysis when there are no anomalies"""
+        agent = RootCauseAnalysisAgent()
+
+        anomaly_result = {
+            'vehicle_id': 1,
+            'anomaly_detected': False,
+            'anomalous_sensors': {},
+            'raw_data': None,
+            'anomaly_indices': [],
+            'timestamps': []
+        }
+
+        result = agent.run(anomaly_result)
+
+        assert result['vehicle_id'] == 1
+        assert len(result['root_causes']) == 0
+
+
+class TestMaintenanceRecommendationAgent:
+    """Test Maintenance Recommendation Agent"""
+
+    def test_initialization(self):
+        """Test agent initialization"""
+        agent = MaintenanceRecommendationAgent()
+        assert agent is not None
+        assert len(agent.maintenance_actions) > 0
+
+    def test_generate_recommendations(self):
+        """Test recommendation generation"""
+        agent = MaintenanceRecommendationAgent()
+
+        root_causes = [{
+            'fault_name': 'engine_overheating',
+            'description': 'Test',
+            'severity': 'critical',
+            'confidence': 0.9,
+            'fault_codes': ['P0217']
+        }]
+
+        recommendations = agent.generate_recommendations(root_causes)
+
+        assert len(recommendations) > 0
+        assert 'immediate_actions' in recommendations[0]
+        assert 'estimated_cost' in recommendations[0]
+
+
+class TestReportGenerationAgent:
+    """Test Report Generation Agent"""
+
+    def test_initialization(self):
+        """Test agent initialization"""
+        agent = ReportGenerationAgent()
+        assert agent is not None
+
+    def test_generate_summary(self):
+        """Test summary generation"""
+        agent = ReportGenerationAgent()
+
+        anomaly_result = {
+            'vehicle_id': 1,
+            'anomaly_detected': False,
+            'num_anomalies': 0,
+            'anomaly_rate': 0.0,
+            'overall_score': 0.0,
+            'anomalous_sensors': {}
+        }
+
+        root_cause_result = {
+            'root_causes': [],
+            'primary_cause': None
+        }
+
+        maintenance_result = {
+            'recommendations': [],
+            'total_cost': {'cost_range': '$0'}
+        }
+
+        summary = agent.generate_executive_summary(
+            1, anomaly_result, root_cause_result, maintenance_result
+        )
+
+        assert 'Vehicle 1' in summary
+        assert 'normally' in summary.lower()
+
+
+def test_full_pipeline():
+    """Test the complete diagnostic pipeline"""
+    from orchestrator import VehicleDiagnosticOrchestrator
+
+    orchestrator = VehicleDiagnosticOrchestrator()
+
+    # Get a test vehicle
+    ingestion_agent = DataIngestionAgent()
+    df = ingestion_agent.load_test_data()
+    vehicle_id = df['vehicle_id'].iloc[0]
+
+    # Run diagnostic
+    result = orchestrator.diagnose_vehicle(vehicle_id, n_readings=100)
+
+    assert result['success'] is True
+    assert result['vehicle_id'] == vehicle_id
+    assert 'report' in result
+    assert 'anomaly_result' in result
+
+
+if __name__ == '__main__':
+    pytest.main([__file__, '-v'])