baiganinn committed
Commit 7058515 · 0 Parent(s)
.gitattributes ADDED

*.pkl filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
README.md ADDED
# 🛣️ Vehicle Anomaly Detection System

An advanced machine-learning anomaly detection system for GPS tracking data with a polished Gradio interface.

## 🚀 Features

- **Multiple ML Models**: Ensemble of Isolation Forest, One-Class SVM, and LSTM Autoencoder
- **Modern UI**: Gradio interface with interactive visualizations
- **Batch Processing**: Handles up to 2000 GPS points with detailed analysis
- **Comprehensive Output**: Point-by-point analysis, risk factors, and JSON export
- **Interactive Maps**: GPS route visualization with anomaly highlighting
- **Performance Analytics**: Speed, altitude, and confidence distribution charts

## 📊 Processing Performance

- **CPU-only processing**: 45-90 seconds for 2000 samples
- **HuggingFace Spaces ready**: Optimized for cloud deployment
- **Memory efficient**: Handles large datasets with rolling-window processing

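The memory-bounded approach above boils down to processing fixed-size chunks rather than the whole upload at once; a minimal sketch (`process_fn` and the chunk size of 500 are illustrative, mirroring `large_dataset_example` in `batch_production_pred.py`):

```python
def process_in_chunks(rows, process_fn, chunk_size=500):
    """Process a large list of GPS rows in fixed-size chunks to bound memory.

    `process_fn` is any callable that accepts a list of rows and returns a
    count; the name and chunk size are illustrative, not the app's API.
    """
    total = 0
    for i in range(0, len(rows), chunk_size):
        # Each slice is at most `chunk_size` rows, so peak memory stays bounded
        total += process_fn(rows[i:i + chunk_size])
    return total
```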
## 🔧 Installation

### Local Installation

```bash
# Clone or download the project
cd anomaly

# Install dependencies
pip install -r requirements.txt

# Run the Gradio app
python gradio_app.py
```

### HuggingFace Spaces Deployment

1. Create a new Space on HuggingFace
2. Upload all files, including the `models/` directory
3. Set `app_file` to `app.py` (the Spaces entry point) in the Space metadata
4. The app will launch automatically

## 📁 Input Format

Your CSV file must contain these columns:

| Column | Description | Range |
|--------|-------------|-------|
| `randomized_id` | Vehicle identifier | Any string |
| `lat` | Latitude | -90 to 90 |
| `lng` | Longitude | -180 to 180 |
| `spd` | Speed (km/h) | 0 to 300 |
| `azm` | Azimuth/heading (degrees) | 0 to 360 |
| `alt` | Altitude (meters) | Any number |

### Sample Data

```csv
randomized_id,lat,lng,spd,azm,alt
VEHICLE001,40.7128,-74.0060,45.5,90.0,100.0
VEHICLE001,40.7138,-74.0070,48.2,92.0,102.0
VEHICLE002,40.7500,-73.9800,35.2,180.0,90.0
```

- **Maximum**: 2000 samples per upload
- **Minimum**: 5 samples required

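A pre-upload check mirroring these rules might look like this (a sketch; `quick_validate` is a hypothetical helper, not part of the app — the real checks live in `validate_csv` in `gradio_app.py`):

```python
import pandas as pd

# Required columns and the row-count limits stated above
REQUIRED = ["randomized_id", "lat", "lng", "spd", "azm", "alt"]

def quick_validate(path) -> pd.DataFrame:
    """Check column presence, row count, and value ranges before uploading."""
    df = pd.read_csv(path)
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if not 5 <= len(df) <= 2000:
        raise ValueError(f"row count {len(df)} outside 5..2000")
    # Range checks from the table above
    checks = {"lat": (-90, 90), "lng": (-180, 180), "spd": (0, 300), "azm": (0, 360)}
    for col, (lo, hi) in checks.items():
        if not df[col].between(lo, hi).all():
            raise ValueError(f"{col} values outside [{lo}, {hi}]")
    return df
```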
## 🎯 Anomaly Detection

The system detects several types of anomalies:

### Speed Anomalies
- Excessive speeding (>120 km/h)
- Sudden acceleration or deceleration
- Speed inconsistencies

### Movement Anomalies
- Erratic GPS patterns
- Sharp turns at high speed
- Altitude inconsistencies

### Behavioral Patterns
- Route deviations
- Stop-and-go patterns
- Unusual driving sequences

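The speed rules above amount to a threshold-plus-delta check; a minimal sketch (the thresholds are illustrative defaults, not the model's trained values):

```python
def flag_speed_anomalies(speeds_kmh, limit=120.0, max_jump=30.0):
    """Flag points that exceed a speed limit or jump sharply between samples.

    `limit` matches the >120 km/h rule above; `max_jump` (km/h change between
    consecutive samples) is an illustrative stand-in for the acceleration rule.
    """
    flags = []
    prev = None
    for s in speeds_kmh:
        # Jump relative to the previous sample (0 for the first point)
        jump = abs(s - prev) if prev is not None else 0.0
        flags.append(s > limit or jump > max_jump)
        prev = s
    return flags
```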
## 📈 Output Features

### 1. Detailed Results
- Point-by-point analysis
- Normal vs. anomaly classification
- Confidence scores and alert levels
- Risk factor identification

### 2. Interactive Visualizations
- GPS route mapping with anomaly markers
- Speed and altitude profiles
- Confidence score distributions
- Multi-panel analysis dashboard

### 3. Summary Statistics
- Processing performance metrics
- Overall anomaly rates
- Alert level distributions
- Risk factor rankings

### 4. JSON Export
Complete machine-readable results including:
- All detection scores
- Driving metrics
- Risk assessments
- Timestamps and metadata

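An exported record might look roughly like this — an illustrative shape assembled from the fields listed above, not a guaranteed schema:

```python
import json

# Illustrative record; field names follow the sections above, and all
# values here are made up for demonstration.
record = {
    "vehicle_id": "VEHICLE001",
    "anomaly_detected": True,
    "alert_level": "HIGH",
    "confidence": 0.912,
    "raw_scores": {"isolation_forest": -0.12, "one_class_svm": -0.45, "lstm": 0.031},
    "driving_metrics": {"speed": 127.4},
    "risk_factors": ["excessive_speed"],
    "timestamp": "2024-01-01T12:00:00",
}
print(json.dumps(record, indent=2))
```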
## 🔬 Technical Details

### ML Models Used
1. **Isolation Forest**: Tree-based anomaly detection
2. **One-Class SVM**: Support-vector-based outlier detection
3. **LSTM Autoencoder**: Deep-learning sequence anomaly detection

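Conceptually, the three model scores are normalized and blended into a single confidence value; a minimal sketch (the project's actual `_calculate_ensemble_score` may weight and normalize differently — the weights and sigmoid mapping here are assumptions):

```python
import math

def ensemble_score(scores, weights=None):
    """Blend per-model anomaly scores into one value in [0, 1].

    Assumes sklearn-style decision_function scores where lower means more
    anomalous; the sigmoid inversion and weights are illustrative only.
    """
    if weights is None:
        weights = {"isolation_forest": 0.4, "one_class_svm": 0.3, "lstm": 0.3}
    total = weight_sum = 0.0
    for model, raw in scores.items():
        w = weights.get(model, 0.0)
        # Map a raw score to (0, 1): strongly negative -> close to 1 (anomalous)
        total += w * (1.0 / (1.0 + math.exp(raw)))
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```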
### Feature Engineering
18 engineered features, including:
- Speed patterns and statistics
- Acceleration and jerk calculations
- Angular velocity and curvature
- Rolling-window aggregations
- Risk scoring algorithms

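A few of these features can be derived directly from the input columns; a sketch of an illustrative subset (assuming `spd` in km/h and `azm` in degrees, sampled at a fixed interval — not the project's exact 18-feature pipeline):

```python
import pandas as pd

def basic_motion_features(df: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Derive acceleration, jerk, angular velocity, and rolling speed stats."""
    out = df.copy()
    out["accel"] = out["spd"].diff().fillna(0.0)   # speed change per sample
    out["jerk"] = out["accel"].diff().fillna(0.0)  # acceleration change
    # Smallest signed heading change, wrapped into [-180, 180)
    dh = out["azm"].diff().fillna(0.0)
    out["ang_vel"] = (dh + 180.0) % 360.0 - 180.0
    # Rolling-window aggregations over the last `window` samples
    out["spd_mean"] = out["spd"].rolling(window, min_periods=1).mean()
    out["spd_std"] = out["spd"].rolling(window, min_periods=1).std().fillna(0.0)
    return out
```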
### Performance Optimization
- Efficient batch processing
- Memory-optimized feature calculation
- CPU-friendly model inference
- Progressive result streaming

## 🛡️ Privacy & Security

- **Local Processing**: All analysis happens in your environment
- **No Data Upload**: Your GPS data never leaves the system
- **Real-time Analysis**: No data storage or logging
- **Secure Processing**: Industry-standard ML pipeline

## 🚀 Deployment Options

### Local Development
```bash
python gradio_app.py
# Access at http://localhost:7860
```

### HuggingFace Spaces
- Ideal for sharing and collaboration
- No setup required
- Automatic scaling
- Public or private deployment

### Docker (Optional)
```dockerfile
FROM python:3.9-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 7860
CMD ["python", "gradio_app.py"]
```

## 📞 Support

For issues or questions:
1. Check the sample data format
2. Ensure your CSV has all required columns
3. Verify data is within the expected ranges
4. Check for missing values or invalid entries

## 🔮 Future Enhancements

- Real-time streaming support
- Custom alert thresholds
- Historical trend analysis
- Fleet management dashboard
- Advanced route optimization
- Multi-vehicle correlation analysis

---

**Made with ❤️ using Gradio, PyTorch, and Advanced ML**
__pycache__/batch_production_pred.cpython-311.pyc ADDED
Binary file (26.4 kB)

__pycache__/gradio_app.cpython-311.pyc ADDED
Binary file (30.1 kB)

__pycache__/production_predictor.cpython-311.pyc ADDED
Binary file (34.8 kB)
app.py ADDED
#!/usr/bin/env python3
"""
HuggingFace Spaces entry point for Vehicle Anomaly Detection System
"""

from gradio_app import create_interface

if __name__ == "__main__":
    demo = create_interface()
    # share=True has no effect on Spaces; it creates a public link for local runs
    demo.launch(share=True, server_name="0.0.0.0", server_port=7860, debug=True)
batch_production_pred.py ADDED
import numpy as np
import pandas as pd
from typing import List, Dict, Optional, Tuple, Any
from datetime import datetime, timedelta
import logging

import torch

from production_predictor import ProductionAnomalyDetector, AnomalyResult, GPSPoint

logger = logging.getLogger(__name__)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class BatchAnomalyDetector(ProductionAnomalyDetector):
    """
    Extended ProductionAnomalyDetector with batch-processing capabilities.
    Processes data as a list of lists: [[id, lat, lng, azm, spd, alt], ...]
    """

    def __init__(self, model_dir: str, config: Dict = None):
        super().__init__(model_dir, config)
        self.batch_results = []
    def process_batch_list_of_lists(self,
                                    data: List[List],
                                    column_order: List[str] = None,
                                    sort_by_vehicle: bool = True,
                                    generate_timestamps: bool = True) -> Dict[str, Any]:
        """
        Process batch data supplied as a list of lists.

        Args:
            data: List of lists in the format [[id, lat, lng, azm, spd, alt], ...]
            column_order: Order of columns, if different from the default
            sort_by_vehicle: Whether to sort by vehicle id for a proper sequence
            generate_timestamps: Whether to generate timestamps automatically

        Returns:
            Dictionary with batch processing results
        """
        if column_order is None:
            column_order = ['vehicle_id', 'lat', 'lng', 'azm', 'spd', 'alt']

        print(f"🔄 Processing batch of {len(data)} GPS points...")

        # Convert the list of lists to a DataFrame
        df = pd.DataFrame(data, columns=column_order)

        # Rename to match the training format (only vehicle_id differs)
        if 'vehicle_id' in df.columns:
            df = df.rename(columns={'vehicle_id': 'randomized_id'})

        # Ensure we have the right columns
        required_columns = ['randomized_id', 'lat', 'lng', 'alt', 'spd', 'azm']
        missing_columns = [col for col in required_columns if col not in df.columns]
        if missing_columns:
            raise ValueError(f"Missing required columns: {missing_columns}")

        # Sort by vehicle (coordinates serve as a rough proxy for sequence
        # when no timestamps are supplied)
        if sort_by_vehicle:
            df = df.sort_values(['randomized_id', 'lat', 'lng']).reset_index(drop=True)

        # Generate timestamps if requested
        if generate_timestamps:
            df['timestamp'] = self._generate_timestamps(df)

        # Process the batch
        return self._process_dataframe_batch(df)
    def process_batch_by_vehicle(self,
                                 data: List[List],
                                 column_order: List[str] = None,
                                 time_interval_seconds: int = 2) -> Dict[str, List[AnomalyResult]]:
        """
        Process batch data vehicle by vehicle to maintain a proper sequence.

        Args:
            data: List-of-lists data
            column_order: Column order specification
            time_interval_seconds: Time interval between GPS points

        Returns:
            Dictionary with vehicle_id as key and list of results as value
        """
        if column_order is None:
            column_order = ['vehicle_id', 'lat', 'lng', 'azm', 'spd', 'alt']

        # Convert to a DataFrame
        df = pd.DataFrame(data, columns=column_order)

        # Group by vehicle
        vehicle_results = {}
        total_anomalies = 0

        print(f"🚛 Processing {df['vehicle_id'].nunique()} vehicles with {len(df)} total points...")

        for vehicle_id in df['vehicle_id'].unique():
            vehicle_data = df[df['vehicle_id'] == vehicle_id].copy()
            # Sorting by coordinates approximates travel order; with real
            # timestamps, sort on those instead
            vehicle_data = vehicle_data.sort_values(['lat', 'lng']).reset_index(drop=True)

            print(f"\n📍 Processing vehicle: {vehicle_id} ({len(vehicle_data)} points)")

            # Clear the vehicle buffer to start fresh
            if vehicle_id in self.vehicle_buffers:
                del self.vehicle_buffers[vehicle_id]

            vehicle_results[vehicle_id] = []
            vehicle_anomalies = 0

            # Process points sequentially for this vehicle
            for idx, row in vehicle_data.iterrows():
                timestamp = datetime.now() + timedelta(seconds=idx * time_interval_seconds)

                gps_point = GPSPoint(
                    vehicle_id=vehicle_id,
                    lat=row['lat'],
                    lng=row['lng'],
                    alt=row.get('alt', 0.0),
                    spd=row.get('spd', 0.0),
                    azm=row.get('azm', 0.0),
                    timestamp=timestamp.isoformat()
                )

                result = self.process_gps_point(gps_point)

                if result:
                    vehicle_results[vehicle_id].append(result)
                    if result.anomaly_detected:
                        vehicle_anomalies += 1
                        total_anomalies += 1

                        # Print anomaly details
                        print(f"   🚨 Point {idx+1}: {result.alert_level} "
                              f"(Speed: {result.driving_metrics['speed']:.1f} km/h, "
                              f"Conf: {result.confidence:.3f})")
                        print(f"      Risk factors: {result.risk_factors}")

            detection_rate = vehicle_anomalies / len(vehicle_results[vehicle_id]) if vehicle_results[vehicle_id] else 0
            print(f"   📊 Vehicle summary: {vehicle_anomalies} anomalies out of "
                  f"{len(vehicle_results[vehicle_id])} detections ({detection_rate:.1%})")

        print("\n🎯 Batch Summary:")
        print(f"   Total vehicles: {len(vehicle_results)}")
        print(f"   Total points processed: {len(df)}")
        print(f"   Total anomalies detected: {total_anomalies}")
        print(f"   Overall anomaly rate: {total_anomalies/len(df):.1%}")

        return vehicle_results
    def process_realtime_stream(self, data_stream: List[List],
                                column_order: List[str] = None,
                                delay_seconds: float = 2.0,
                                callback_function=None) -> List[AnomalyResult]:
        """
        Simulate real-time processing of list-of-lists data.

        Args:
            data_stream: List of lists to process as a real-time stream
            column_order: Column order
            delay_seconds: Delay between points (simulates real time)
            callback_function: Called whenever an anomaly is detected

        Returns:
            List of all detection results
        """
        import time

        if column_order is None:
            column_order = ['vehicle_id', 'lat', 'lng', 'azm', 'spd', 'alt']

        print(f"🔴 Starting real-time stream simulation with {len(data_stream)} points...")
        print(f"⏱️ Processing delay: {delay_seconds} seconds between points")

        all_results = []
        anomaly_count = 0

        for i, point_data in enumerate(data_stream):
            # Convert the list to a GPSPoint
            point_dict = dict(zip(column_order, point_data))

            gps_point = GPSPoint(
                vehicle_id=point_dict['vehicle_id'],
                lat=point_dict['lat'],
                lng=point_dict['lng'],
                alt=point_dict.get('alt', 0.0),
                spd=point_dict.get('spd', 0.0),
                azm=point_dict.get('azm', 0.0),
                timestamp=datetime.now().isoformat()
            )

            # Process the point
            result = self.process_gps_point(gps_point)

            if result:
                all_results.append(result)

                # Print status
                status_icon = "🟢" if result.alert_level == "NORMAL" else "🟡" if result.alert_level in ["LOW", "MEDIUM"] else "🔴"
                print(f"{status_icon} Point {i+1:3d}: {result.vehicle_id:12s} | "
                      f"{result.alert_level:8s} | Speed: {result.driving_metrics['speed']:5.1f} km/h | "
                      f"Conf: {result.confidence:.3f}")

                if result.anomaly_detected:
                    anomaly_count += 1
                    print(f"   🚨 ANOMALY DETECTED! {result.risk_factors}")

                    # Call the callback function if provided
                    if callback_function:
                        callback_function(result, gps_point)
            else:
                print(f"⏳ Point {i+1:3d}: {point_dict['vehicle_id']:12s} | Building buffer...")

            # Simulate the real-time delay (skip after the last point)
            if i < len(data_stream) - 1:
                time.sleep(delay_seconds)

        print("\n📊 Stream Complete:")
        print(f"   Points processed: {len(data_stream)}")
        print(f"   Detections made: {len(all_results)}")
        print(f"   Anomalies found: {anomaly_count}")
        if all_results:
            print(f"   Anomaly rate: {anomaly_count/len(all_results)*100:.1f}%")
        else:
            print("   No detections made")

        return all_results
    def _generate_timestamps(self, df: pd.DataFrame) -> List[str]:
        """Generate realistic timestamps for GPS data (2-second intervals).

        Assumes rows are grouped contiguously by `randomized_id`.
        """
        base_time = datetime.now()
        timestamps = []

        for vehicle_id in df['randomized_id'].unique():
            vehicle_mask = df['randomized_id'] == vehicle_id
            vehicle_count = vehicle_mask.sum()

            # Generate timestamps for this vehicle (2-second intervals)
            for i in range(vehicle_count):
                timestamp = base_time + timedelta(seconds=i * 2)
                timestamps.append(timestamp.isoformat())

        return timestamps

    def _process_dataframe_batch(self, df: pd.DataFrame) -> Dict[str, Any]:
        """Process a DataFrame using the existing feature pipeline."""

        # Use the exact training-time feature engineering pipeline
        features_df = self._calculate_features_exact_pipeline(df)

        if len(features_df) == 0:
            return {
                "status": "error",
                "message": "No features could be calculated",
                "processed": 0,
                "anomalies": 0
            }

        # Scale features
        features_scaled = self.scaler.transform(features_df)

        # Get anomaly scores for all points
        anomaly_results = []

        print("🔍 Running anomaly detection on all points...")

        for i in range(len(features_scaled)):
            point_scaled = features_scaled[i:i+1]

            # Collect scores from all models
            scores = {}

            # Isolation Forest
            if self.isolation_forest:
                scores['isolation_forest'] = float(self.isolation_forest.decision_function(point_scaled)[0])

            # One-Class SVM
            if self.one_class_svm:
                scores['one_class_svm'] = float(self.one_class_svm.decision_function(point_scaled)[0])

            # LSTM (only once enough sequence data is available)
            if self.lstm_autoencoder and i >= self.config['lstm_sequence_length'] - 1:
                try:
                    sequence_start = max(0, i - self.config['lstm_sequence_length'] + 1)
                    sequence_features = features_scaled[sequence_start:i+1]

                    if len(sequence_features) == self.config['lstm_sequence_length']:
                        sequence_tensor = torch.FloatTensor(sequence_features).unsqueeze(0).to(device)

                        with torch.no_grad():
                            reconstructed = self.lstm_autoencoder(sequence_tensor)
                            reconstruction_error = torch.mean((sequence_tensor - reconstructed) ** 2).item()
                            scores['lstm'] = float(reconstruction_error)
                except Exception:
                    # Fall back to a neutral score if LSTM inference fails
                    scores['lstm'] = 0.0

            # Calculate the ensemble score
            ensemble_score = self._calculate_ensemble_score(scores)
            alert_level = self._get_alert_level(ensemble_score)

            # Extract metrics
            feature_row = features_df.iloc[i]
            driving_metrics = self._extract_driving_metrics_from_features(feature_row)
            risk_factors = self._extract_risk_factors_from_features(feature_row)

            anomaly_results.append({
                'index': i,
                'vehicle_id': df.iloc[i]['randomized_id'],
                'anomaly_detected': ensemble_score > self.config['alert_threshold'],
                'confidence': ensemble_score,
                'alert_level': alert_level,
                'raw_scores': scores,
                'driving_metrics': driving_metrics,
                'risk_factors': risk_factors
            })

        # Generate the summary
        total_anomalies = sum(1 for r in anomaly_results if r['anomaly_detected'])

        return {
            "status": "completed",
            "processed": len(anomaly_results),
            "anomalies": total_anomalies,
            "anomaly_rate": total_anomalies / len(anomaly_results) if anomaly_results else 0,
            "results": anomaly_results,
            "summary": {
                "total_vehicles": df['randomized_id'].nunique(),
                "total_points": len(df),
                "detection_ready_points": len(anomaly_results),
                "anomalies_by_level": {
                    level: sum(1 for r in anomaly_results if r['alert_level'] == level)
                    for level in ['NORMAL', 'LOW', 'MEDIUM', 'HIGH', 'CRITICAL']
                }
            }
        }
# Example usage functions
def example_list_of_lists_usage():
    """Example of how to use the batch processor with list-of-lists data."""

    print("🔄 Example: Processing List of Lists Data")
    print("=" * 50)

    # Initialize the batch detector
    detector = BatchAnomalyDetector("/kaggle/working/anomaly_analysis_pytorch_fixed/models")

    # Sample data as a list of lists: [vehicle_id, lat, lng, azm, spd, alt]
    sample_data = [
        # Normal driving for vehicle_001
        ["vehicle_001", 55.7558, 37.6176, 90.0, 45.0, 156.0],
        ["vehicle_001", 55.7559, 37.6177, 92.0, 47.0, 157.0],
        ["vehicle_001", 55.7560, 37.6178, 94.0, 46.0, 158.0],
        ["vehicle_001", 55.7561, 37.6179, 96.0, 48.0, 159.0],
        ["vehicle_001", 55.7562, 37.6180, 98.0, 49.0, 160.0],

        # Aggressive driving for vehicle_002
        ["vehicle_002", 55.7600, 37.6200, 180.0, 70.0, 150.0],
        ["vehicle_002", 55.7601, 37.6201, 182.0, 125.0, 151.0],  # Speeding
        ["vehicle_002", 55.7602, 37.6202, 184.0, 15.0, 152.0],   # Hard braking
        ["vehicle_002", 55.7603, 37.6203, 250.0, 55.0, 153.0],   # Sharp turn

        # Mixed behavior for vehicle_003
        ["vehicle_003", 55.7700, 37.6300, 45.0, 40.0, 145.0],
        ["vehicle_003", 55.7701, 37.6301, 47.0, 42.0, 146.0],
        ["vehicle_003", 55.7702, 37.6302, 49.0, 110.0, 147.0],   # Speed violation
        ["vehicle_003", 55.7703, 37.6303, 51.0, 43.0, 148.0],
    ]

    print(f"Processing {len(sample_data)} GPS points from "
          f"{len(set(row[0] for row in sample_data))} vehicles...")

    # Method 1: Process as a batch
    print("\n📊 Method 1: Batch Processing")
    batch_results = detector.process_batch_list_of_lists(sample_data)

    print("Batch Results:")
    print(f"   Status: {batch_results['status']}")
    print(f"   Points processed: {batch_results['processed']}")
    print(f"   Anomalies detected: {batch_results['anomalies']}")
    print(f"   Anomaly rate: {batch_results['anomaly_rate']:.1%}")

    # Method 2: Process vehicle by vehicle
    print("\n🚛 Method 2: Vehicle-by-Vehicle Processing")
    vehicle_results = detector.process_batch_by_vehicle(sample_data)

    for vehicle_id, results in vehicle_results.items():
        anomaly_count = sum(1 for r in results if r.anomaly_detected)
        print(f"   {vehicle_id}: {anomaly_count} anomalies out of {len(results)} detections")

    # Method 3: Real-time simulation
    print("\n🔴 Method 3: Real-time Stream Simulation (first 8 points)")

    def anomaly_callback(result, gps_point):
        """Callback invoked when an anomaly is detected."""
        print(f"   📧 ALERT SENT: {result.vehicle_id} - {result.alert_level}")

    stream_results = detector.process_realtime_stream(
        sample_data[:8],       # First 8 points
        delay_seconds=0.5,     # Faster for the demo
        callback_function=anomaly_callback
    )
    return stream_results


def load_from_csv_example():
    """Example of loading data from CSV and converting it to a list of lists."""

    print("\n📁 Example: Loading from CSV")
    print("=" * 50)

    # Simulate CSV loading (in practice, use pd.read_csv('your_file.csv'))
    csv_data = """vehicle_id,lat,lng,azm,spd,alt
vehicle_001,55.7558,37.6176,90.0,45.0,156.0
vehicle_001,55.7559,37.6177,92.0,47.0,157.0
vehicle_002,55.7600,37.6200,180.0,125.0,150.0
vehicle_002,55.7601,37.6201,182.0,15.0,151.0"""

    # Parse the CSV text
    from io import StringIO
    df = pd.read_csv(StringIO(csv_data))

    # Convert the DataFrame to a list of lists
    data_as_lists = df.values.tolist()

    print(f"Loaded {len(data_as_lists)} rows from CSV")
    print(f"Column order: {df.columns.tolist()}")
    print(f"Sample data: {data_as_lists[0]}")

    # Process with the detector
    detector = BatchAnomalyDetector("/kaggle/working/anomaly_analysis_pytorch_fixed/models")
    results = detector.process_batch_list_of_lists(
        data_as_lists,
        column_order=df.columns.tolist()
    )

    print(f"Processing complete: {results['anomalies']} anomalies detected")


def large_dataset_example():
    """Example of processing large datasets efficiently."""

    print("\n🔢 Example: Large Dataset Processing")
    print("=" * 50)

    # Simulate a large dataset
    np.random.seed(42)
    large_data = []

    vehicles = [f"vehicle_{i:03d}" for i in range(1, 11)]  # 10 vehicles

    for vehicle in vehicles:
        for point in range(100):  # 100 points per vehicle
            lat = 55.7500 + np.random.uniform(-0.01, 0.01)
            lng = 37.6000 + np.random.uniform(-0.01, 0.01)
            azm = np.random.uniform(0, 360)
            # 10% of points simulate aggressive speeds
            spd = np.random.uniform(20, 80) if np.random.random() > 0.1 else np.random.uniform(90, 140)
            alt = 150 + np.random.uniform(-20, 20)

            large_data.append([vehicle, lat, lng, azm, spd, alt])

    print(f"Generated large dataset: {len(large_data)} points from {len(vehicles)} vehicles")

    # Process efficiently
    detector = BatchAnomalyDetector("/kaggle/working/anomaly_analysis_pytorch_fixed/models")

    # Process in chunks for memory efficiency
    chunk_size = 500
    total_anomalies = 0

    for i in range(0, len(large_data), chunk_size):
        chunk = large_data[i:i + chunk_size]
        print(f"Processing chunk {i//chunk_size + 1}: points {i+1}-{i+len(chunk)}")

        results = detector.process_batch_list_of_lists(chunk)
        total_anomalies += results['anomalies']

        print(f"   Chunk anomalies: {results['anomalies']}")

    print("\nLarge dataset complete:")
    print(f"   Total points: {len(large_data)}")
    print(f"   Total anomalies: {total_anomalies}")
    print(f"   Overall anomaly rate: {total_anomalies/len(large_data):.1%}")
gradio_app.py ADDED
import gradio as gr
import pandas as pd
import numpy as np
import json
import time
from datetime import datetime
from typing import Dict, List, Tuple, Any
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings("ignore")

# Import the ML solution
from batch_production_pred import BatchAnomalyDetector
from production_predictor import AnomalyResult


class AnomalyDetectionGradioApp:
    def __init__(self, model_dir: str = "./models"):
        """Initialize the Gradio app with ML models."""
        self.model_dir = model_dir
        self.detector = None
        self.load_models()

    def load_models(self):
        """Load the ML models."""
        try:
            self.detector = BatchAnomalyDetector(self.model_dir)
            print("✅ Models loaded successfully!")
        except Exception as e:
            print(f"❌ Error loading models: {e}")
            self.detector = None
    def validate_csv(self, file_path: str) -> Tuple[bool, str, pd.DataFrame]:
        """Validate an uploaded CSV file."""
        try:
            # Read the CSV
            df = pd.read_csv(file_path)

            # Check required columns
            required_cols = ['randomized_id', 'lat', 'lng', 'spd', 'azm', 'alt']
            missing_cols = [col for col in required_cols if col not in df.columns]

            if missing_cols:
                return False, f"❌ Missing required columns: {', '.join(missing_cols)}", None

            # Check the sample count
            if len(df) > 2000:
                return False, f"❌ Too many samples ({len(df)}). Maximum allowed: 2000", None

            if len(df) < 5:
                return False, f"❌ Too few samples ({len(df)}). Minimum required: 5", None

            # Check data types and ranges
            try:
                df['lat'] = pd.to_numeric(df['lat'])
                df['lng'] = pd.to_numeric(df['lng'])
                df['spd'] = pd.to_numeric(df['spd'])
                df['azm'] = pd.to_numeric(df['azm'])
                df['alt'] = pd.to_numeric(df['alt'])

                # Basic range validation
                if not df['lat'].between(-90, 90).all():
                    return False, "❌ Latitude values must be between -90 and 90", None
                if not df['lng'].between(-180, 180).all():
                    return False, "❌ Longitude values must be between -180 and 180", None
                if not df['spd'].between(0, 300).all():
                    return False, "❌ Speed values must be between 0 and 300 km/h", None
                if not df['azm'].between(0, 360).all():
                    return False, "❌ Azimuth values must be between 0 and 360 degrees", None

            except Exception as e:
                return False, f"❌ Data type error: {str(e)}", None

            return True, f"✅ Valid CSV: {len(df)} samples, {df['randomized_id'].nunique()} vehicles", df

        except Exception as e:
            return False, f"❌ Error reading CSV: {str(e)}", None
    def process_data(self, file_path: str, progress=gr.Progress()) -> Tuple[str, str, str, str]:
        """Process the uploaded CSV and return results."""
        if not self.detector:
            return "❌ Models not loaded", "", "", ""

        # Validate the CSV
        is_valid, message, df = self.validate_csv(file_path)
        if not is_valid:
            return message, "", "", ""

        progress(0.1, desc="Validating data...")

        try:
            # Convert the DataFrame to list-of-lists format
            data_list = df[['randomized_id', 'lat', 'lng', 'azm', 'spd', 'alt']].values.tolist()
            column_order = ['vehicle_id', 'lat', 'lng', 'azm', 'spd', 'alt']

            progress(0.2, desc="Starting anomaly detection...")

            # Process the batch, vehicle by vehicle
            start_time = time.time()
            vehicle_results = self.detector.process_batch_by_vehicle(
                data_list,
                column_order=column_order
            )
            processing_time = time.time() - start_time

            progress(0.8, desc="Generating detailed results...")

            # Generate detailed output
            detailed_results = self.generate_detailed_results(vehicle_results, df)
            summary_stats = self.generate_summary_stats(vehicle_results, processing_time)
            visualization = self.create_visualization(vehicle_results, df)
            json_output = self.generate_json_output(vehicle_results)

            progress(1.0, desc="Complete!")

            return detailed_results, summary_stats, visualization, json_output

        except Exception as e:
            return f"❌ Processing error: {str(e)}", "", "", ""
    def generate_detailed_results(self, vehicle_results: Dict, original_df: pd.DataFrame) -> str:
        """Generate a detailed point-by-point analysis."""
        output_lines = ["# 🔍 Detailed Anomaly Detection Results\n"]

        total_points = 0
        total_anomalies = 0

        for vehicle_id, results in vehicle_results.items():
            if not results:
                continue

            output_lines.append(f"## 🚗 Vehicle: {vehicle_id}")
            output_lines.append(f"**Points analyzed:** {len(results)}\n")

            vehicle_anomalies = 0

            for i, result in enumerate(results, 1):
                total_points += 1

                if result.anomaly_detected:
                    total_anomalies += 1
                    vehicle_anomalies += 1

                    # Get the original data point (assumes one result per input row)
                    vehicle_data = original_df[original_df['randomized_id'] == vehicle_id].iloc[i-1]

                    # Anomaly details
                    output_lines.append(f"### 🚨 Point {i}: **ANOMALY DETECTED!**")
                    output_lines.append(f"- **Alert Level:** {result.alert_level}")
                    output_lines.append(f"- **Confidence:** {result.confidence:.3f}")
                    output_lines.append(f"- **Location:** ({vehicle_data['lat']:.6f}, {vehicle_data['lng']:.6f})")
                    output_lines.append(f"- **Speed:** {result.driving_metrics.get('speed', 0):.1f} km/h")
                    output_lines.append(f"- **Altitude:** {vehicle_data['alt']:.1f} m")
                    output_lines.append(f"- **Heading:** {vehicle_data['azm']:.1f}°")

                    # Risk factors
                    risk_factors = [k for k, v in result.risk_factors.items() if v]
                    if risk_factors:
                        output_lines.append(f"- **Risk Factors:** {', '.join(risk_factors)}")

                    # Model scores
                    output_lines.append("- **Model Scores:**")
                    for model, score in result.raw_scores.items():
                        output_lines.append(f"  - {model}: {score:.3f}")

                    output_lines.append("")
                else:
                    # Normal point (abbreviated): show the first 5 and every 10th
                    if i <= 5 or i % 10 == 0:
                        output_lines.append(f"**Point {i}:** ✅ Normal (confidence: {result.confidence:.3f})")

            # Vehicle summary
            detection_rate = vehicle_anomalies / len(results) if results else 0
            output_lines.append(f"\n**Vehicle Summary:** {vehicle_anomalies} anomalies out of "
                                f"{len(results)} points ({detection_rate:.1%})\n")
            output_lines.append("---\n")

        # Overall summary
        overall_rate = total_anomalies / total_points if total_points > 0 else 0
        output_lines.append("## 📊 Overall Summary")
        output_lines.append(f"- **Total Points:** {total_points}")
182
+ output_lines.append(f"- **Total Anomalies:** {total_anomalies}")
183
+ output_lines.append(f"- **Detection Rate:** {overall_rate:.1%}")
184
+
185
+ return "\n".join(output_lines)
186
+
187
+ def generate_summary_stats(self, vehicle_results: Dict, processing_time: float) -> str:
188
+ """Generate summary statistics"""
189
+ total_vehicles = len(vehicle_results)
190
+ total_points = sum(len(results) for results in vehicle_results.values())
191
+ total_anomalies = sum(sum(1 for r in results if r.anomaly_detected)
192
+ for results in vehicle_results.values())
193
+
194
+ # Alert level distribution
195
+ alert_levels = {}
196
+ for results in vehicle_results.values():
197
+ for result in results:
198
+ if result.anomaly_detected:
199
+ level = result.alert_level
200
+ alert_levels[level] = alert_levels.get(level, 0) + 1
201
+
202
+ # Risk factor analysis
203
+ risk_factors = {}
204
+ for results in vehicle_results.values():
205
+ for result in results:
206
+ if result.anomaly_detected:
207
+ for factor, present in result.risk_factors.items():
208
+ if present:
209
+ risk_factors[factor] = risk_factors.get(factor, 0) + 1
210
+
211
+ output = f"""
212
+ # 📈 Processing Summary
213
+
214
+ ## ⚡ Performance Metrics
215
+ - **Processing Time:** {processing_time:.2f} seconds
216
+ - **Points per Second:** {(total_points/processing_time if processing_time else 0):.1f}
+ - **Average Time per Point:** {(1000*processing_time/total_points if total_points else 0):.1f} ms
218
+
219
+ ## 📊 Detection Statistics
220
+ - **Total Vehicles:** {total_vehicles}
221
+ - **Total GPS Points:** {total_points}
222
+ - **Anomalies Detected:** {total_anomalies}
223
+ - **Overall Anomaly Rate:** {(100*total_anomalies/total_points if total_points else 0):.2f}%
224
+
225
+ ## 🚨 Alert Level Distribution
226
+ """
227
+
228
+ for level, count in sorted(alert_levels.items()):
229
+ percentage = 100 * count / total_anomalies if total_anomalies > 0 else 0
230
+ output += f"- **{level}:** {count} ({percentage:.1f}%)\n"
231
+
232
+ if risk_factors:
233
+ output += "\n## ⚠️ Top Risk Factors\n"
234
+ sorted_risks = sorted(risk_factors.items(), key=lambda x: x[1], reverse=True)[:5]
235
+ for factor, count in sorted_risks:
236
+ percentage = 100 * count / total_anomalies if total_anomalies > 0 else 0
237
+ output += f"- **{factor}:** {count} occurrences ({percentage:.1f}%)\n"
238
+
239
+ return output
240
+
241
+ def create_visualization(self, vehicle_results: Dict, original_df: pd.DataFrame) -> gr.Plot:
242
+ """Create interactive visualization"""
243
+ # Prepare data for plotting
244
+ plot_data = []
245
+
246
+ for vehicle_id, results in vehicle_results.items():
247
+ vehicle_df = original_df[original_df['randomized_id'] == vehicle_id].copy()
248
+
249
+ for i, result in enumerate(results):
250
+ if i < len(vehicle_df):
251
+ row = vehicle_df.iloc[i]
252
+ plot_data.append({
253
+ 'vehicle_id': vehicle_id,
254
+ 'lat': row['lat'],
255
+ 'lng': row['lng'],
256
+ 'spd': row['spd'],
257
+ 'alt': row['alt'],
258
+ 'azm': row['azm'],
259
+ 'anomaly': result.anomaly_detected,
260
+ 'confidence': result.confidence,
261
+ 'alert_level': result.alert_level if result.anomaly_detected else 'Normal'
262
+ })
263
+
264
+ plot_df = pd.DataFrame(plot_data)
265
+
266
+ if len(plot_df) == 0:
267
+ return gr.Plot(value=go.Figure().add_annotation(text="No data to plot"))
268
+
269
+ # Create subplots
270
+ fig = make_subplots(
271
+ rows=2, cols=2,
272
+ subplot_titles=('GPS Route with Anomalies', 'Speed Profile',
273
+ 'Altitude Profile', 'Confidence Distribution'),
274
+ specs=[[{"type": "scattermapbox"}, {"type": "scatter"}],
275
+ [{"type": "scatter"}, {"type": "histogram"}]]
276
+ )
277
+
278
+ # GPS Route Map
279
+ normal_points = plot_df[~plot_df['anomaly']]
280
+ anomaly_points = plot_df[plot_df['anomaly']]
281
+
282
+ if len(normal_points) > 0:
283
+ fig.add_trace(
284
+ go.Scattermapbox(
285
+ lat=normal_points['lat'],
286
+ lon=normal_points['lng'],
287
+ mode='markers',
288
+ marker=dict(size=8, color='green'),
289
+ text=normal_points['vehicle_id'],
290
+ name='Normal',
291
+ hovertemplate='<b>%{text}</b><br>Lat: %{lat}<br>Lon: %{lon}<extra></extra>'
292
+ ),
293
+ row=1, col=1
294
+ )
295
+
296
+ if len(anomaly_points) > 0:
297
+ fig.add_trace(
298
+ go.Scattermapbox(
299
+ lat=anomaly_points['lat'],
300
+ lon=anomaly_points['lng'],
301
+ mode='markers',
302
+ marker=dict(size=12, color='red'),  # scattermapbox markers don't support 'diamond'; size/color distinguish anomalies
303
+ text=anomaly_points['alert_level'],
304
+ name='Anomaly',
305
+ hovertemplate='<b>%{text}</b><br>Lat: %{lat}<br>Lon: %{lon}<extra></extra>'
306
+ ),
307
+ row=1, col=1
308
+ )
309
+
310
+ # Speed Profile
311
+ fig.add_trace(
312
+ go.Scatter(
313
+ x=list(range(len(plot_df))),
314
+ y=plot_df['spd'],
315
+ mode='lines+markers',
316
+ marker=dict(color=plot_df['anomaly'].map({True: 'red', False: 'blue'})),
317
+ name='Speed',
318
+ hovertemplate='Point: %{x}<br>Speed: %{y} km/h<extra></extra>'
319
+ ),
320
+ row=1, col=2
321
+ )
322
+
323
+ # Altitude Profile
324
+ fig.add_trace(
325
+ go.Scatter(
326
+ x=list(range(len(plot_df))),
327
+ y=plot_df['alt'],
328
+ mode='lines+markers',
329
+ marker=dict(color=plot_df['anomaly'].map({True: 'red', False: 'green'})),
330
+ name='Altitude',
331
+ hovertemplate='Point: %{x}<br>Altitude: %{y} m<extra></extra>'
332
+ ),
333
+ row=2, col=1
334
+ )
335
+
336
+ # Confidence Distribution
337
+ fig.add_trace(
338
+ go.Histogram(
339
+ x=plot_df['confidence'],
340
+ nbinsx=20,
341
+ name='Confidence',
342
+ marker_color='lightblue'
343
+ ),
344
+ row=2, col=2
345
+ )
346
+
347
+ # Update layout
348
+ fig.update_layout(
349
+ mapbox=dict(
350
+ style="open-street-map",
351
+ center=dict(lat=plot_df['lat'].mean(), lon=plot_df['lng'].mean()),
352
+ zoom=10
353
+ ),
354
+ height=800,
355
+ showlegend=True,
356
+ title_text="🛣️ Vehicle Anomaly Detection Analysis"
357
+ )
358
+
359
+ fig.update_xaxes(title_text="Point Index", row=1, col=2)
360
+ fig.update_yaxes(title_text="Speed (km/h)", row=1, col=2)
361
+ fig.update_xaxes(title_text="Point Index", row=2, col=1)
362
+ fig.update_yaxes(title_text="Altitude (m)", row=2, col=1)
363
+ fig.update_xaxes(title_text="Confidence Score", row=2, col=2)
364
+ fig.update_yaxes(title_text="Count", row=2, col=2)
365
+
366
+ return gr.Plot(value=fig)
367
+
368
+ def generate_json_output(self, vehicle_results: Dict) -> str:
369
+ """Generate JSON output of all results"""
370
+ json_data = {
371
+ "detection_results": {},
372
+ "summary": {
373
+ "total_vehicles": len(vehicle_results),
374
+ "total_points": sum(len(results) for results in vehicle_results.values()),
375
+ "total_anomalies": sum(sum(1 for r in results if r.anomaly_detected)
376
+ for results in vehicle_results.values()),
377
+ "timestamp": datetime.now().isoformat()
378
+ }
379
+ }
380
+
381
+ for vehicle_id, results in vehicle_results.items():
382
+ json_data["detection_results"][vehicle_id] = []
383
+
384
+ for i, result in enumerate(results, 1):
385
+ result_dict = {
386
+ "point_number": i,
387
+ "anomaly_detected": result.anomaly_detected,
388
+ "confidence": round(result.confidence, 4),
389
+ "alert_level": result.alert_level,
390
+ "timestamp": result.timestamp,
391
+ "driving_metrics": result.driving_metrics,
392
+ "risk_factors": result.risk_factors,
393
+ "raw_scores": {k: round(v, 4) for k, v in result.raw_scores.items()}
394
+ }
395
+ json_data["detection_results"][vehicle_id].append(result_dict)
396
+
397
+ return json.dumps(json_data, indent=2)
398
+
399
+ # Initialize the app
400
+ app = AnomalyDetectionGradioApp()
401
+
402
+ def process_csv_file(file):
403
+ """Wrapper function for Gradio interface"""
404
+ if file is None:
405
+ return "Please upload a CSV file", "", "", ""
406
+
407
+ return app.process_data(file.name)
408
+
409
+ # Create the Gradio interface
410
+ def create_interface():
411
+ with gr.Blocks(
412
+ theme=gr.themes.Soft(),
413
+ title="🛣️ Vehicle Anomaly Detection System",
414
+ css="""
415
+ .gradio-container {
416
+ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
417
+ }
418
+ .main-header {
419
+ text-align: center;
420
+ background: linear-gradient(45deg, #1e3c72, #2a5298);
421
+ color: white;
422
+ padding: 2rem;
423
+ border-radius: 10px;
424
+ margin-bottom: 2rem;
425
+ }
426
+ .upload-area {
427
+ border: 2px dashed #4CAF50;
428
+ border-radius: 10px;
429
+ padding: 2rem;
430
+ text-align: center;
431
+ background-color: #f8f9fa;
432
+ }
433
+ """
434
+ ) as demo:
435
+
436
+ # Header
437
+ gr.HTML("""
438
+ <div class="main-header">
439
+ <h1>🛣️ Vehicle Anomaly Detection System</h1>
440
+ <p>Advanced ML-powered anomaly detection for GPS tracking data</p>
441
+ <p><strong>Upload your CSV with columns:</strong> randomized_id, lat, lng, spd, azm, alt (max 2000 samples)</p>
442
+ </div>
443
+ """)
444
+
445
+ with gr.Row():
446
+ with gr.Column(scale=1):
447
+ # File upload
448
+ gr.HTML('<div class="upload-area">')
449
+ file_upload = gr.File(
450
+ label="📁 Upload GPS Data CSV",
451
+ file_types=[".csv"],
452
+ type="filepath"
453
+ )
454
+ gr.HTML('</div>')
455
+
456
+ # Process button
457
+ process_btn = gr.Button(
458
+ "🚀 Analyze Anomalies",
459
+ variant="primary",
460
+ size="lg"
461
+ )
462
+
463
+ # Sample data info
464
+ gr.HTML("""
465
+ <div style="margin-top: 1rem; padding: 1rem; background-color: #e8f4fd; border-radius: 5px;">
466
+ <h4>📋 Expected CSV Format:</h4>
467
+ <code>
468
+ randomized_id,lat,lng,spd,azm,alt<br>
469
+ VEHICLE001,40.7128,-74.0060,45.5,90.0,100.0<br>
470
+ VEHICLE001,40.7138,-74.0070,48.2,92.0,102.0<br>
471
+ ...
472
+ </code>
473
+ <ul style="margin-top: 1rem;">
474
+ <li><strong>randomized_id:</strong> Vehicle identifier</li>
475
+ <li><strong>lat:</strong> Latitude (-90 to 90)</li>
476
+ <li><strong>lng:</strong> Longitude (-180 to 180)</li>
477
+ <li><strong>spd:</strong> Speed in km/h (0-300)</li>
478
+ <li><strong>azm:</strong> Azimuth/heading (0-360°)</li>
479
+ <li><strong>alt:</strong> Altitude in meters</li>
480
+ </ul>
481
+ </div>
482
+ """)
483
+
484
+ # Results tabs
485
+ with gr.Tabs():
486
+ with gr.Tab("📋 Detailed Results"):
487
+ detailed_output = gr.Markdown(
488
+ value="Upload a CSV file and click 'Analyze Anomalies' to see detailed results...",
489
+ elem_classes=["detailed-results"]
490
+ )
491
+
492
+ with gr.Tab("📊 Summary & Stats"):
493
+ summary_output = gr.Markdown(
494
+ value="Processing summary will appear here...",
495
+ elem_classes=["summary-stats"]
496
+ )
497
+
498
+ with gr.Tab("📈 Visualizations"):
499
+ viz_output = gr.Plot(
500
+ label="Interactive Analysis Charts"
501
+ )
502
+
503
+ with gr.Tab("💾 JSON Export"):
504
+ json_output = gr.Code(
505
+ language="json",
506
+ label="Complete Results JSON",
507
+ value="JSON results will appear here..."
508
+ )
509
+
510
+ # Connect the processing
511
+ process_btn.click(
512
+ fn=process_csv_file,
513
+ inputs=[file_upload],
514
+ outputs=[detailed_output, summary_output, viz_output, json_output],
515
+ show_progress=True
516
+ )
517
+
518
+ # Footer
519
+ gr.HTML("""
520
+ <div style="text-align: center; margin-top: 2rem; padding: 1rem; background-color: #f1f3f4; border-radius: 5px;">
521
+ <p>🔬 <strong>ML Models:</strong> Isolation Forest + One-Class SVM + LSTM Autoencoder</p>
522
+ <p>⚡ <strong>Processing:</strong> ~45-90 seconds for 2000 samples on CPU</p>
523
+ <p>🛡️ <strong>Privacy:</strong> All processing happens locally - your data never leaves this environment</p>
524
+ </div>
525
+ """)
526
+
527
+ return demo
528
+
529
+ if __name__ == "__main__":
530
+ demo = create_interface()
531
+ demo.launch(
532
+ server_name="0.0.0.0",
533
+ server_port=7860,
534
+ share=True,
535
+ show_error=True
536
+ )
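The upload form above documents the expected CSV schema and value ranges (lat −90..90, lng −180..180, spd 0–300, azm 0–360, max 2000 rows). A minimal standalone sketch of those checks — `check_gps_csv` is an illustrative name, not the app's actual `validate_csv` method:

```python
# Hypothetical sketch of the CSV range checks described in the UI;
# mirrors the documented column constraints, not the real validate_csv.
import io
import pandas as pd

REQUIRED_COLUMNS = ['randomized_id', 'lat', 'lng', 'spd', 'azm', 'alt']
RANGES = {'lat': (-90, 90), 'lng': (-180, 180), 'spd': (0, 300), 'azm': (0, 360)}

def check_gps_csv(csv_text: str, max_rows: int = 2000):
    df = pd.read_csv(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        return False, f"Missing columns: {missing}"
    if len(df) > max_rows:
        return False, f"Too many rows: {len(df)} > {max_rows}"
    for col, (lo, hi) in RANGES.items():
        bad = df[(df[col] < lo) | (df[col] > hi)]
        if len(bad):
            return False, f"{col} out of range [{lo}, {hi}] in {len(bad)} rows"
    return True, "OK"

sample = (
    "randomized_id,lat,lng,spd,azm,alt\n"
    "VEHICLE001,40.7128,-74.0060,45.5,90.0,100.0\n"
    "VEHICLE001,40.7138,-74.0070,48.2,92.0,102.0\n"
)
print(check_gps_csv(sample))  # → (True, 'OK')
```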
launch.bat ADDED
@@ -0,0 +1,12 @@
1
+ @echo off
+ rem Switch the console to UTF-8 so the emoji banners render correctly
+ chcp 65001 >nul
2
+ echo 🛣️ Vehicle Anomaly Detection System
3
+ echo =====================================
4
+ echo.
5
+ echo Installing dependencies...
6
+ pip install -r requirements.txt
7
+ echo.
8
+ echo Starting Gradio application...
9
+ echo Access the interface at: http://localhost:7860
10
+ echo.
11
+ python gradio_app.py
12
+ pause
models/feature_names.json ADDED
@@ -0,0 +1 @@
1
+ {"feature_names": ["spd", "acceleration", "jerk", "angular_velocity", "lateral_acceleration", "heading_change_rate", "curvature", "overall_risk", "speed_std_3", "speed_std_5", "speed_std_10", "accel_std_3", "accel_std_5", "accel_std_10", "acceleration_risk", "jerk_risk", "lateral_risk", "speed_risk"]}
models/isolation_forest.pkl ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8b4ee70e6ebf4a9e9d9ecd6b2ce0897303f12513078ca4870030d554ab155fdd
3
+ size 1710078
models/lstm_autoencoder.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:69846219391308a4980bf5475668bb3e24e387e666522134a5de13f49f493398
3
+ size 500332
models/lstm_threshold.json ADDED
@@ -0,0 +1 @@
1
+ {"lstm_threshold": 2.9153685569763184}
models/manifest.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "short_name": "React App",
3
+ "name": "Create React App Sample",
4
+ "icons": [
5
+ {
6
+ "src": "favicon.ico",
7
+ "sizes": "64x64 32x32 24x24 16x16",
8
+ "type": "image/x-icon"
9
+ },
10
+ {
11
+ "src": "logo192.png",
12
+ "type": "image/png",
13
+ "sizes": "192x192"
14
+ },
15
+ {
16
+ "src": "logo512.png",
17
+ "type": "image/png",
18
+ "sizes": "512x512"
19
+ }
20
+ ],
21
+ "start_url": ".",
22
+ "display": "standalone",
23
+ "theme_color": "#000000",
24
+ "background_color": "#ffffff"
25
+ }
models/model_metadata.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "models_saved": [
3
+ "Isolation Forest",
4
+ "One-Class SVM",
5
+ "LSTM Autoencoder",
6
+ "LSTM Threshold",
7
+ "Scaler",
8
+ "Feature Names"
9
+ ],
10
+ "save_timestamp": "2025-09-13T15:15:49.010561",
11
+ "device_used": "cuda",
12
+ "total_samples": 118166
13
+ }
models/one_class_svm.pkl ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:89b7add1680af2a043dc510902dfb31c64cb74ece7bd4d08175edec9cb117161
3
+ size 412575
models/optimization_model.joblib ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ab47250d98f86d183ed95f5b6aa8d4017597d0d510be8d4fb43abd623d4ae75c
3
+ size 409969
models/robots.txt ADDED
@@ -0,0 +1,3 @@
1
+ # https://www.robotstxt.org/robotstxt.html
2
+ User-agent: *
3
+ Disallow:
models/scaler.pkl ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1a84f6f969094f40ff8b1c5b6cfb81dc63b3689d8f0902aed115b57f96e33f47
3
+ size 1319
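`production_predictor.py` below combines the three detectors with the weights from `_default_config` (0.35 / 0.30 / 0.35) after min–max normalizing the Isolation Forest and One-Class SVM scores with the stored training ranges. A hedged sketch of that combination — `combine` is illustrative, since `_calculate_ensemble_score` itself is not shown in this excerpt, and the `1 - …` inversion assumes the scikit-learn convention that lower `decision_function` scores mean more anomalous:

```python
# Sketch of the ensemble weighting implied by _default_config and the
# normalization parameters in production_predictor.py. combine() is an
# assumption about _calculate_ensemble_score, whose body is not in this diff.
IF_MIN, IF_MAX = -0.2400, 0.1680        # Isolation Forest score range (training)
SVM_MIN, SVM_MAX = -381.6356, 106.7346  # One-Class SVM score range (training)
WEIGHTS = {'isolation_forest': 0.35, 'one_class_svm': 0.30, 'lstm': 0.35}

def minmax(x: float, lo: float, hi: float) -> float:
    # Scale a raw score into [0, 1], clipping values outside the training range
    return min(max((x - lo) / (hi - lo), 0.0), 1.0)

def combine(if_score: float, svm_score: float, lstm_score: float) -> float:
    # Lower decision_function scores are more anomalous, hence the inversion;
    # lstm_score is assumed already normalized to [0, 1] (higher = more anomalous).
    anomaly_if = 1.0 - minmax(if_score, IF_MIN, IF_MAX)
    anomaly_svm = 1.0 - minmax(svm_score, SVM_MIN, SVM_MAX)
    return (WEIGHTS['isolation_forest'] * anomaly_if
            + WEIGHTS['one_class_svm'] * anomaly_svm
            + WEIGHTS['lstm'] * lstm_score)
```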
production_predictor.py ADDED
@@ -0,0 +1,673 @@
1
+ import os
2
+ import json
3
+ import time
4
+ import logging
5
+ import numpy as np
6
+ import pandas as pd
7
+ import torch
8
+ import joblib
9
+ from datetime import datetime
10
+ from collections import deque
11
+ from typing import Dict, List, Optional, Any
12
+ import asyncio
13
+ import aiofiles
14
+ from dataclasses import dataclass, asdict
15
+ from pathlib import Path
16
+ from scipy.signal import savgol_filter
17
+
18
+ # Set up logging
19
+ logging.basicConfig(level=logging.INFO)
20
+ logger = logging.getLogger(__name__)
21
+
22
+ # Set device
23
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
24
+
25
+ @dataclass
26
+ class GPSPoint:
27
+ """GPS data point from tracker - matches your dataset structure"""
28
+ vehicle_id: str # This will be our randomized_id
29
+ lat: float
30
+ lng: float
31
+ alt: float
32
+ spd: float # speed in km/h
33
+ azm: float # azimuth/heading 0-360
34
+ timestamp: str = None # Added for real-time tracking
35
+
36
+ @classmethod
37
+ def from_tracker_data(cls, tracker_data: Dict) -> 'GPSPoint':
38
+ """Convert from real GPS tracker format to our dataset format"""
39
+ return cls(
40
+ vehicle_id=tracker_data.get('vehicle_id', tracker_data.get('device_id')),
41
+ lat=tracker_data['lat'],
42
+ lng=tracker_data['lng'],
43
+ alt=tracker_data.get('alt', tracker_data.get('altitude', 0.0)),
44
+ spd=tracker_data.get('spd', tracker_data.get('speed', 0.0)),
45
+ azm=tracker_data.get('azm', tracker_data.get('heading', 0.0)),
46
+ timestamp=tracker_data.get('timestamp', datetime.now().isoformat())
47
+ )
48
+
49
+ def to_dataset_format(self) -> Dict:
50
+ """Convert to the format expected by your trained model"""
51
+ return {
52
+ 'randomized_id': self.vehicle_id,
53
+ 'lat': self.lat,
54
+ 'lng': self.lng,
55
+ 'alt': self.alt,
56
+ 'spd': self.spd,
57
+ 'azm': self.azm
58
+ }
59
+
60
+ @dataclass
61
+ class AnomalyResult:
62
+ """Anomaly detection result"""
63
+ timestamp: str
64
+ vehicle_id: str
65
+ anomaly_detected: bool
66
+ confidence: float
67
+ alert_level: str
68
+ raw_scores: Dict[str, float]
69
+ driving_metrics: Dict[str, float]
70
+ risk_factors: Dict[str, bool]
71
+
72
+ def to_dict(self) -> Dict:
73
+ return asdict(self)
74
+
75
+ # Import the LSTM model from your training code
76
+ class LSTMAutoencoder(torch.nn.Module):
77
+ """LSTM Autoencoder - same as your training code"""
78
+
79
+ def __init__(self, input_dim, hidden_dim=64, latent_dim=10, num_layers=2, sequence_length=20):
80
+ super(LSTMAutoencoder, self).__init__()
81
+ self.input_dim = input_dim
82
+ self.hidden_dim = hidden_dim
83
+ self.latent_dim = latent_dim
84
+ self.num_layers = num_layers
85
+ self.sequence_length = sequence_length
86
+
87
+ # Encoder
88
+ self.encoder_lstm = torch.nn.LSTM(
89
+ input_dim, hidden_dim, num_layers,
90
+ batch_first=True, dropout=0.2 if num_layers > 1 else 0
91
+ )
92
+ self.encoder_fc = torch.nn.Linear(hidden_dim, latent_dim)
93
+
94
+ # Decoder
95
+ self.decoder_fc = torch.nn.Linear(latent_dim, hidden_dim)
96
+ self.decoder_lstm = torch.nn.LSTM(
97
+ hidden_dim, hidden_dim, num_layers,
98
+ batch_first=True, dropout=0.2 if num_layers > 1 else 0
99
+ )
100
+ self.output_projection = torch.nn.Linear(hidden_dim, input_dim)
101
+
102
+ self.dropout = torch.nn.Dropout(0.2)
103
+
104
+ def encode(self, x):
105
+ lstm_out, (hidden, cell) = self.encoder_lstm(x)
106
+ encoded = self.encoder_fc(hidden[-1])
107
+ return encoded
108
+
109
+ def decode(self, encoded):
110
+ batch_size = encoded.size(0)
111
+ decoded = self.decoder_fc(encoded)
112
+ decoded = decoded.unsqueeze(1).repeat(1, self.sequence_length, 1)
113
+ lstm_out, _ = self.decoder_lstm(decoded)
114
+ output = self.output_projection(lstm_out)
115
+ return output
116
+
117
+ def forward(self, x):
118
+ encoded = self.encode(x)
119
+ decoded = self.decode(encoded)
120
+ return decoded
121
+
122
+ class ProductionAnomalyDetector:
123
+ """
124
+ Production-ready driving anomaly detection system
125
+ Works with your exact dataset format: randomized_id,lat,lng,alt,spd,azm
126
+ """
127
+
128
+ def __init__(self, model_dir: str, config: Dict = None):
129
+ """
130
+ Initialize with pre-trained models
131
+ """
132
+ self.model_dir = Path(model_dir)
133
+ self.config = config or self._default_config()
134
+
135
+ # Model components
136
+ self.scaler = None
137
+ self.isolation_forest = None
138
+ self.one_class_svm = None
139
+ self.lstm_autoencoder = None
140
+ self.lstm_threshold = None
141
+
142
+ # Vehicle buffers for real-time processing
143
+ self.vehicle_buffers = {} # vehicle_id -> deque of GPS points
144
+ self.buffer_size = self.config['buffer_size']
145
+
146
+ # Normalization parameters
147
+ self.if_min = None
148
+ self.if_max = None
149
+ self.svm_min = None
150
+ self.svm_max = None
151
+
152
+ # Load models
153
+ self._load_models()
154
+
155
+ logger.info(f"ProductionAnomalyDetector initialized with models from {model_dir}")
156
+ logger.info(f"Using device: {device}")
157
+
158
+ def _default_config(self) -> Dict:
159
+ """Default configuration matching your training setup"""
160
+ return {
161
+ 'buffer_size': 20,
162
+ 'min_points_for_detection': 5,
163
+ 'lstm_sequence_length': 15,
164
+ 'alert_threshold': 0.3,
165
+ 'weights': {
166
+ 'isolation_forest': 0.35,
167
+ 'one_class_svm': 0.30,
168
+ 'lstm': 0.35
169
+ }
170
+ }
171
+
172
+ def _load_models(self):
173
+ """Load all pre-trained models"""
174
+ try:
175
+ # Load scaler (required)
176
+ scaler_path = self.model_dir / 'scaler.pkl'
177
+ if scaler_path.exists():
178
+ self.scaler = joblib.load(scaler_path)
179
+ logger.info("✓ Feature scaler loaded")
180
+ else:
181
+ raise FileNotFoundError(f"Feature scaler not found: {scaler_path}")
182
+
183
+ # Load Isolation Forest
184
+ if_path = self.model_dir / 'isolation_forest.pkl'
185
+ if if_path.exists():
186
+ self.isolation_forest = joblib.load(if_path)
187
+ logger.info("✓ Isolation Forest loaded")
188
+
189
+ # Load One-Class SVM
190
+ svm_path = self.model_dir / 'one_class_svm.pkl'
191
+ if svm_path.exists():
192
+ self.one_class_svm = joblib.load(svm_path)
193
+ logger.info("✓ One-Class SVM loaded")
194
+
195
+ # Load LSTM Autoencoder
196
+ lstm_path = self.model_dir / 'lstm_autoencoder.pth'
197
+ if lstm_path.exists():
198
+ checkpoint = torch.load(lstm_path, map_location=device)
199
+ lstm_config = checkpoint["model_config"]
200
+ self.lstm_autoencoder = LSTMAutoencoder(**lstm_config).to(device)
201
+
202
+ self.lstm_autoencoder.load_state_dict(checkpoint["model_state_dict"])
203
+ self.lstm_autoencoder.eval()
204
+ logger.info("✓ LSTM Autoencoder loaded")
205
+ # Prefer the stored threshold in lstm_threshold.json; fall back to the training value
+ threshold_path = self.model_dir / 'lstm_threshold.json'
+ if threshold_path.exists():
+ with open(threshold_path, 'r') as f:
+ self.lstm_threshold = json.load(f)['lstm_threshold']
+ else:
+ self.lstm_threshold = 2.9153685569763184 # fallback threshold
206
+ logger.info(f"✓ LSTM threshold: {self.lstm_threshold}")
207
+
208
+ # Load normalization parameters
209
+ norm_path = self.model_dir / 'normalization_params.json'
210
+ if norm_path.exists():
211
+ with open(norm_path, 'r') as f:
212
+ norm_params = json.load(f)
213
+ self.if_min = norm_params.get('if_min', -0.2400)
214
+ self.if_max = norm_params.get('if_max', 0.1680)
215
+ self.svm_min = norm_params.get('svm_min', -381.6356)
216
+ self.svm_max = norm_params.get('svm_max', 106.7346)
217
+ logger.info("✓ Normalization parameters loaded")
218
+ else:
219
+ # Use your actual training values
220
+ self.if_min, self.if_max = -0.2400, 0.1680
221
+ self.svm_min, self.svm_max = -381.6356, 106.7346
222
+ logger.info("Using training normalization parameters")
223
+
224
+ logger.info("All models loaded successfully!")
225
+
226
+ except Exception as e:
227
+ logger.error(f"Error loading models: {e}")
228
+ raise
229
+
230
+ def process_gps_point(self, gps_point: GPSPoint) -> Optional[AnomalyResult]:
231
+ """
232
+ Process a single GPS point - main entry point for real-time detection
233
+ """
234
+ vehicle_id = gps_point.vehicle_id
235
+
236
+ # Initialize vehicle buffer if needed
237
+ if vehicle_id not in self.vehicle_buffers:
238
+ self.vehicle_buffers[vehicle_id] = deque(maxlen=self.buffer_size)
239
+
240
+ # Add point to buffer
241
+ self.vehicle_buffers[vehicle_id].append(gps_point)
242
+ buffer = self.vehicle_buffers[vehicle_id]
243
+
244
+ # Need minimum points for detection
245
+ if len(buffer) < self.config['min_points_for_detection']:
246
+ return None
247
+
248
+ try:
249
+ # Convert buffer to DataFrame in your exact format
250
+ buffer_data = []
251
+ for point in buffer:
252
+ buffer_data.append(point.to_dataset_format())
253
+
254
+ df_buffer = pd.DataFrame(buffer_data)
255
+
256
+ # Calculate features using your exact feature engineering pipeline
257
+ features_df = self._calculate_features_exact_pipeline(df_buffer)
258
+
259
+ if len(features_df) == 0:
260
+ return None
261
+
262
+ # Get latest point features
263
+ latest_features = features_df.iloc[-1:].values
264
+ latest_scaled = self.scaler.transform(latest_features)
265
+
266
+ # Get anomaly scores
267
+ scores = self._get_anomaly_scores(features_df, latest_scaled)
268
+
269
+ # Calculate ensemble score
270
+ ensemble_score = self._calculate_ensemble_score(scores)
271
+
272
+ # Determine alert level
273
+ alert_level = self._get_alert_level(ensemble_score)
274
+
275
+ # Extract metrics from the processed features
276
+ latest_processed = features_df.iloc[-1]
277
+ driving_metrics = self._extract_driving_metrics_from_features(latest_processed)
278
+ risk_factors = self._extract_risk_factors_from_features(latest_processed)
279
+
280
+ return AnomalyResult(
281
+ timestamp=gps_point.timestamp or datetime.now().isoformat(),
282
+ vehicle_id=vehicle_id,
283
+ anomaly_detected=ensemble_score > self.config['alert_threshold'],
284
+ confidence=float(ensemble_score),
285
+ alert_level=alert_level,
286
+ raw_scores=scores,
287
+ driving_metrics=driving_metrics,
288
+ risk_factors=risk_factors
289
+ )
290
+
291
+ except Exception as e:
292
+ logger.error(f"Error processing GPS point for vehicle {vehicle_id}: {e}")
293
+ return None
294
+
295
+ def _calculate_features_exact_pipeline(self, df: pd.DataFrame) -> pd.DataFrame:
296
+ """
297
+ Calculate features using EXACT same pipeline as your training code
298
+ Input: DataFrame with columns [randomized_id, lat, lng, alt, spd, azm]
299
+ Output: DataFrame with 18 features ready for ML models
300
+ """
301
+ # Apply the EXACT same feature engineering as your training
302
+ df_processed = self._apply_physics_calculations(df.copy())
303
+ df_processed = self._apply_anomaly_feature_engineering(df_processed)
304
+ features_df = self._prepare_ml_features_exact(df_processed)
305
+
306
+ return features_df
307
+
308
+ def _apply_physics_calculations(self, df: pd.DataFrame) -> pd.DataFrame:
309
+ """Apply exact physics calculations from your training code"""
310
+
311
+ # Sort within each trip and create a sequence index; the dataset has no
+ # timestamp column, so points are ordered by position, as in training
312
+ df = df.sort_values(['randomized_id', 'lat', 'lng'])
313
+ df['sequence'] = df.groupby('randomized_id').cumcount()
314
+ df['time_delta'] = 1.0 # 1 second intervals
315
+
316
+ def calculate_trip_features(group):
317
+ if len(group) < 3:
318
+ # Fill with safe defaults for short trips
319
+ group['distance'] = 0.0
320
+ group['speed_smooth'] = group['spd']
321
+ group['acceleration'] = 0.0
322
+ group['jerk'] = 0.0
323
+ group['angular_velocity'] = 0.0
324
+ group['lateral_acceleration'] = 0.0
325
+ group['heading_change_rate'] = 0.0
326
+ group['curvature'] = 0.0
327
+ return group
328
+
329
+ # Haversine distance calculation
330
+ def haversine_distance(lat1, lon1, lat2, lon2):
331
+ R = 6371000 # Earth radius in meters
332
+ lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])
333
+ dlat = lat2 - lat1
334
+ dlon = lon2 - lon1
335
+ a = np.sin(dlat/2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2)**2
336
+ c = 2 * np.arcsin(np.sqrt(np.clip(a, 0, 1)))
337
+ return R * c
338
+
339
+ # Calculate distances
340
+ distances = [0]
341
+ for i in range(1, len(group)):
342
+ try:
343
+ dist = haversine_distance(
344
+ group.iloc[i-1]['lat'], group.iloc[i-1]['lng'],
345
+ group.iloc[i]['lat'], group.iloc[i]['lng']
346
+ )
347
+ dist = min(dist, 1000) # Cap at 1km to avoid GPS errors
348
+ distances.append(dist)
349
+ except Exception:
350
+ distances.append(0)
351
+
352
+ group['distance'] = distances
353
+
354
+ # Smooth speed data
355
+ if len(group) >= 5:
356
+ try:
357
+ group['speed_smooth'] = savgol_filter(group['spd'], 5, 2)
358
+ except Exception:
359
+ group['speed_smooth'] = group['spd']
360
+ else:
361
+ group['speed_smooth'] = group['spd']
362
+
363
+ group['speed_smooth'] = np.maximum(group['speed_smooth'], 0)
364
+
365
+ # Calculate acceleration
366
+ speed_ms = group['speed_smooth'] / 3.6 # km/h to m/s
367
+ try:
368
+ acceleration = np.gradient(speed_ms, group['time_delta'])
369
+ acceleration = np.clip(acceleration, -15, 15)
370
+ except Exception:
371
+ acceleration = np.zeros(len(group))
372
+ group['acceleration'] = acceleration
373
+
374
+ # Calculate jerk
375
+ try:
376
+ jerk = np.gradient(acceleration, group['time_delta'])
377
+ jerk = np.clip(jerk, -20, 20)
378
+ except Exception:
379
+ jerk = np.zeros(len(group))
380
+ group['jerk'] = jerk
381
+
382
+ # Calculate angular velocity
383
+ try:
384
+ azimuth_rad = np.radians(group['azm'])
385
+ azimuth_unwrapped = np.unwrap(azimuth_rad)
386
+ angular_velocity = np.gradient(azimuth_unwrapped, group['time_delta'])
387
+ angular_velocity = np.clip(angular_velocity, -np.pi, np.pi)
388
+ except Exception:
389
+ angular_velocity = np.zeros(len(group))
390
+ group['angular_velocity'] = angular_velocity
391
+
392
+ # Calculate lateral acceleration
393
+ lateral_acceleration = speed_ms * angular_velocity
394
+ lateral_acceleration = np.clip(lateral_acceleration, -20, 20)
395
+ group['lateral_acceleration'] = lateral_acceleration
396
+
397
+ # Calculate heading change rate
398
+ group['heading_change_rate'] = np.abs(angular_velocity)
399
+
400
+ # Calculate curvature with safe division
401
+ denominator = speed_ms + 0.1
402
+ group['curvature'] = np.divide(
403
+ np.abs(angular_velocity),
404
+ denominator,
405
+ out=np.zeros_like(angular_velocity),
406
+ where=denominator!=0
407
+ )
408
+
409
+ return group
410
+
411
+ df = df.groupby('randomized_id').apply(calculate_trip_features)
412
+ df = df.reset_index(drop=True)
413
+
414
+ # Clean any remaining NaN/inf values
415
+ numeric_columns = ['distance', 'speed_smooth', 'acceleration', 'jerk',
416
+ 'angular_velocity', 'lateral_acceleration', 'heading_change_rate', 'curvature']
417
+
418
+ for col in numeric_columns:
419
+ if col in df.columns:
420
+ df[col] = df[col].fillna(0)
421
+ df[col] = df[col].replace([np.inf, -np.inf], 0)
422
+
423
+ return df
424
+
425
+     def _apply_anomaly_feature_engineering(self, df: pd.DataFrame) -> pd.DataFrame:
+         """Apply the exact anomaly feature engineering used in training"""
+
+         # Rolling window statistics
+         window_sizes = [3, 5, 10]
+
+         for window in window_sizes:
+             try:
+                 # Speed patterns
+                 df[f'speed_std_{window}'] = df.groupby('randomized_id')['spd'].rolling(
+                     window, center=True, min_periods=1).std().reset_index(0, drop=True).fillna(0)
+                 df[f'speed_max_{window}'] = df.groupby('randomized_id')['spd'].rolling(
+                     window, center=True, min_periods=1).max().reset_index(0, drop=True).fillna(0)
+                 df[f'speed_min_{window}'] = df.groupby('randomized_id')['spd'].rolling(
+                     window, center=True, min_periods=1).min().reset_index(0, drop=True).fillna(0)
+
+                 # Acceleration patterns
+                 df[f'accel_std_{window}'] = df.groupby('randomized_id')['acceleration'].rolling(
+                     window, center=True, min_periods=1).std().reset_index(0, drop=True).fillna(0)
+                 df[f'accel_max_{window}'] = df.groupby('randomized_id')['acceleration'].rolling(
+                     window, center=True, min_periods=1).max().reset_index(0, drop=True).fillna(0)
+                 df[f'accel_min_{window}'] = df.groupby('randomized_id')['acceleration'].rolling(
+                     window, center=True, min_periods=1).min().reset_index(0, drop=True).fillna(0)
+             except Exception:
+                 # Fallback values
+                 df[f'speed_std_{window}'] = 0
+                 df[f'speed_max_{window}'] = df['spd']
+                 df[f'speed_min_{window}'] = df['spd']
+                 df[f'accel_std_{window}'] = 0
+                 df[f'accel_max_{window}'] = df['acceleration']
+                 df[f'accel_min_{window}'] = df['acceleration']
+
+         # Extreme behavior indicators (exact thresholds from training)
+         df['hard_braking'] = (df['acceleration'] < -4.0).astype(int)
+         df['hard_acceleration'] = (df['acceleration'] > 3.0).astype(int)
+         df['excessive_speed'] = (df['spd'] > 80).astype(int)
+         df['sharp_turn'] = (np.abs(df['lateral_acceleration']) > 4.0).astype(int)
+         df['erratic_steering'] = (np.abs(df['heading_change_rate']) > 0.5).astype(int)
+
+         # Composite risk scores (same calculations as training)
+         df['acceleration_risk'] = np.clip(np.abs(df['acceleration']) / 10.0, 0, 1)
+         df['jerk_risk'] = np.clip(np.abs(df['jerk']) / 5.0, 0, 1)
+         df['lateral_risk'] = np.clip(np.abs(df['lateral_acceleration']) / 8.0, 0, 1)
+         df['speed_risk'] = np.clip(np.maximum(0, (df['spd'] - 60) / 40.0), 0, 1)
+
+         # Overall risk score (same weights as training)
+         df['overall_risk'] = (
+             df['acceleration_risk'] * 0.25 +
+             df['jerk_risk'] * 0.20 +
+             df['lateral_risk'] * 0.25 +
+             df['speed_risk'] * 0.15 +
+             (df['hard_braking'] + df['hard_acceleration'] +
+              df['sharp_turn'] + df['erratic_steering']) * 0.15 / 4
+         )
+
+         df['overall_risk'] = np.clip(df['overall_risk'], 0, 1)
+
+         return df
+
+     def _prepare_ml_features_exact(self, df: pd.DataFrame) -> pd.DataFrame:
+         """Prepare the exact same 18 features as in training"""
+
+         # Same feature columns, in the same order, as training
+         feature_columns = [
+             'spd', 'acceleration', 'jerk', 'angular_velocity', 'lateral_acceleration',
+             'heading_change_rate', 'curvature', 'overall_risk',
+             'speed_std_3', 'speed_std_5', 'speed_std_10',
+             'accel_std_3', 'accel_std_5', 'accel_std_10',
+             'acceleration_risk', 'jerk_risk', 'lateral_risk', 'speed_risk'
+         ]
+
+         features_df = df[feature_columns].copy()
+
+         # Clean any remaining issues
+         for col in feature_columns:
+             features_df[col] = features_df[col].fillna(0)
+             features_df[col] = features_df[col].replace([np.inf, -np.inf], 0)
+
+         return features_df
+
+     def _get_anomaly_scores(self, features_df: pd.DataFrame, latest_scaled: np.ndarray) -> Dict[str, float]:
+         """Get anomaly scores from all models"""
+         scores = {}
+
+         # Isolation Forest
+         if self.isolation_forest:
+             scores['isolation_forest'] = float(self.isolation_forest.decision_function(latest_scaled)[0])
+
+         # One-Class SVM
+         if self.one_class_svm:
+             scores['one_class_svm'] = float(self.one_class_svm.decision_function(latest_scaled)[0])
+
+         # LSTM Autoencoder
+         if self.lstm_autoencoder and len(features_df) >= self.config['lstm_sequence_length']:
+             try:
+                 sequence_length = self.config['lstm_sequence_length']
+                 sequence_features = features_df.iloc[-sequence_length:].values
+                 sequence_scaled = self.scaler.transform(sequence_features)
+                 sequence_tensor = torch.FloatTensor(sequence_scaled).unsqueeze(0).to(device)
+
+                 with torch.no_grad():
+                     reconstructed = self.lstm_autoencoder(sequence_tensor)
+                     reconstruction_error = torch.mean((sequence_tensor - reconstructed) ** 2).item()
+                 scores['lstm'] = float(reconstruction_error)
+             except Exception as e:
+                 logger.warning(f"LSTM inference error: {e}")
+                 scores['lstm'] = 0.0
+
+         return scores
+
+     def _calculate_ensemble_score(self, scores: Dict[str, float]) -> float:
+         """Calculate ensemble score using the same logic as training"""
+         ensemble_score = 0.0
+         weights = self.config['weights']
+
+         # Isolation Forest (lower = more anomalous)
+         if 'isolation_forest' in scores:
+             if_range = self.if_max - self.if_min
+             if if_range > 0:
+                 if_normalized = (scores['isolation_forest'] - self.if_min) / if_range
+                 if_anomaly_score = 1.0 - np.clip(if_normalized, 0, 1)
+             else:
+                 if_anomaly_score = 0.5
+             ensemble_score += weights['isolation_forest'] * if_anomaly_score
+
+         # SVM (negative = more anomalous)
+         if 'one_class_svm' in scores:
+             svm_range = self.svm_max - self.svm_min
+             if svm_range > 0:
+                 svm_normalized = (scores['one_class_svm'] - self.svm_min) / svm_range
+                 svm_anomaly_score = 1.0 - np.clip(svm_normalized, 0, 1)
+             else:
+                 svm_anomaly_score = 0.5
+             ensemble_score += weights['one_class_svm'] * svm_anomaly_score
+
+         # LSTM (higher reconstruction error = more anomalous)
+         if 'lstm' in scores and self.lstm_threshold:
+             lstm_anomaly_score = np.clip(scores['lstm'] / self.lstm_threshold, 0, 1)
+             ensemble_score += weights['lstm'] * lstm_anomaly_score
+
+         return np.clip(ensemble_score, 0, 1)
+
+     def _get_alert_level(self, confidence: float) -> str:
+         """Determine alert level"""
+         if confidence > 0.8:
+             return 'CRITICAL'
+         elif confidence > 0.6:
+             return 'HIGH'
+         elif confidence > 0.4:
+             return 'MEDIUM'
+         elif confidence > 0.2:
+             return 'LOW'
+         else:
+             return 'NORMAL'
+
+     def _extract_driving_metrics_from_features(self, features_row: pd.Series) -> Dict[str, float]:
+         """Extract driving metrics from processed features"""
+         return {
+             'speed': float(features_row['spd']),
+             'acceleration': float(features_row['acceleration']),
+             'lateral_acceleration': float(features_row['lateral_acceleration']),
+             'jerk': float(features_row['jerk']),
+             'heading_change_rate': float(features_row['heading_change_rate']),
+             'overall_risk': float(features_row['overall_risk'])
+         }
+
+     def _extract_risk_factors_from_features(self, features_row):
+         """Extract boolean risk factors from a row of driving features."""
+         return {
+             'hard_braking': bool(features_row['acceleration'] < -2.5),            # sudden deceleration (m/s^2)
+             'hard_acceleration': bool(features_row['acceleration'] > 2.5),        # sudden acceleration (m/s^2)
+             'excessive_speed': bool(features_row['spd'] > 120),                   # overspeeding (km/h)
+             'sharp_turn': bool(abs(features_row['lateral_acceleration']) > 3.0),  # strong lateral g-force
+             'erratic_steering': bool(abs(features_row['angular_velocity']) > 0.5)  # rapid heading change (rad/s)
+         }
+
+     def get_vehicle_status(self, vehicle_id: str) -> Dict[str, Any]:
+         """Get current status of a vehicle"""
+         if vehicle_id not in self.vehicle_buffers:
+             return {'vehicle_id': vehicle_id, 'status': 'no_data'}
+
+         buffer = self.vehicle_buffers[vehicle_id]
+         return {
+             'vehicle_id': vehicle_id,
+             'buffer_size': len(buffer),
+             'last_update': buffer[-1].timestamp if buffer else None,
+             'ready_for_detection': len(buffer) >= self.config['min_points_for_detection']
+         }
+
+ # API input model matching the dataset's column structure
+ from fastapi import FastAPI, HTTPException
+ from pydantic import BaseModel
+ from typing import Optional
+
+ class GPSPointRequest(BaseModel):
+     """API request model matching the dataset columns"""
+     vehicle_id: str                    # maps to randomized_id
+     lat: float
+     lng: float
+     alt: float = 0.0
+     spd: float                         # speed in km/h
+     azm: float                         # azimuth/heading, 0-360 degrees
+     timestamp: Optional[str] = None
+
+ # Sample input/output for this exact data structure
+ sample_input_output = {
+     "input": {
+         "vehicle_id": "fleet_001",
+         "lat": 55.7558,
+         "lng": 37.6176,
+         "alt": 156.0,
+         "spd": 45.5,
+         "azm": 85.0,
+         "timestamp": "2025-09-13T10:31:18Z"
+     },
+     "output": {
+         "status": "detected",
+         "result": {
+             "timestamp": "2025-09-13T10:31:18Z",
+             "vehicle_id": "fleet_001",
+             "anomaly_detected": False,
+             "confidence": 0.156,
+             "alert_level": "NORMAL",
+             "raw_scores": {
+                 "isolation_forest": 0.045,
+                 "one_class_svm": 12.34,
+                 "lstm": 0.234
+             },
+             "driving_metrics": {
+                 "speed": 45.5,
+                 "acceleration": 0.12,
+                 "lateral_acceleration": 0.08,
+                 "jerk": 0.05,
+                 "heading_change_rate": 0.02,
+                 "overall_risk": 0.089
+             },
+             "risk_factors": {
+                 "hard_braking": False,
+                 "hard_acceleration": False,
+                 "excessive_speed": False,
+                 "sharp_turn": False,
+                 "erratic_steering": False
+             }
+         }
+     }
+ }
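The score fusion in `_calculate_ensemble_score` is min-max normalization followed by a weighted sum. A standalone sketch of that logic is below; the weights, calibration ranges, and LSTM threshold are illustrative placeholders, not the values shipped with the saved models:

```python
import numpy as np

def ensemble_score(scores, bounds, lstm_threshold, weights):
    """Blend raw detector outputs into one [0, 1] anomaly score.

    For Isolation Forest and One-Class SVM a LOWER decision-function value
    means more anomalous, so the min-max normalized value is inverted; the
    LSTM reconstruction error is simply scaled by its threshold.
    """
    total = 0.0
    for name in ('isolation_forest', 'one_class_svm'):
        if name in scores:
            lo, hi = bounds[name]
            norm = (scores[name] - lo) / (hi - lo) if hi > lo else 0.5
            total += weights[name] * (1.0 - np.clip(norm, 0, 1))
    if 'lstm' in scores and lstm_threshold:
        total += weights['lstm'] * np.clip(scores['lstm'] / lstm_threshold, 0, 1)
    return float(np.clip(total, 0, 1))

# Placeholder calibration values, for illustration only
weights = {'isolation_forest': 0.5, 'one_class_svm': 0.25, 'lstm': 0.25}
bounds = {'isolation_forest': (-0.2, 0.2), 'one_class_svm': (-5.0, 15.0)}

# Every detector at its most anomalous extreme -> maximum score
score = ensemble_score(
    {'isolation_forest': -0.2, 'one_class_svm': -5.0, 'lstm': 0.6},
    bounds, lstm_threshold=0.3, weights=weights,
)
print(score)  # -> 1.0
```

Because each per-model contribution is clipped to [0, 1] before weighting, a single wild raw score cannot push the ensemble past its weight's share.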
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ gradio>=4.0.0
+ pandas>=1.5.0
+ numpy>=1.21.0
+ torch>=1.12.0
+ scikit-learn>=1.1.0
+ plotly>=5.0.0
+ scipy>=1.9.0
+ joblib>=1.2.0
+ aiofiles>=22.0.0
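A quick way to confirm an environment satisfies these pins is to query installed package metadata. A small stdlib-only sketch (it parses the package name naively from `>=` pins, which is all this file uses):

```python
from importlib import metadata

def installed_versions(pins):
    """Map each '>='-pinned requirement to its installed version (None if absent)."""
    report = {}
    for line in pins:
        name = line.split('>=')[0].strip()
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report

pins = ['numpy>=1.21.0', 'scipy>=1.9.0']
print(installed_versions(pins))
```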
sample_data.csv ADDED
@@ -0,0 +1,31 @@
+ randomized_id,lat,lng,spd,azm,alt
+ VEHICLE001,40.7128,-74.0060,45.5,90.0,100.0
+ VEHICLE001,40.7138,-74.0070,48.2,92.0,102.0
+ VEHICLE001,40.7148,-74.0080,52.1,95.0,105.0
+ VEHICLE001,40.7158,-74.0090,85.3,98.0,108.0
+ VEHICLE001,40.7168,-74.0100,127.5,101.0,110.0
+ VEHICLE001,40.7178,-74.0110,156.2,105.0,112.0
+ VEHICLE001,40.7188,-74.0120,42.8,108.0,115.0
+ VEHICLE001,40.7198,-74.0130,38.5,110.0,118.0
+ VEHICLE002,40.7500,-73.9800,35.2,180.0,90.0
+ VEHICLE002,40.7510,-73.9810,38.1,182.0,92.0
+ VEHICLE002,40.7520,-73.9820,41.5,185.0,95.0
+ VEHICLE002,40.7530,-73.9830,165.8,188.0,98.0
+ VEHICLE002,40.7540,-73.9840,198.2,191.0,100.0
+ VEHICLE002,40.7550,-73.9850,43.7,195.0,102.0
+ VEHICLE002,40.7560,-73.9860,39.9,198.0,105.0
+ VEHICLE003,40.8000,-73.9500,55.0,270.0,200.0
+ VEHICLE003,40.8010,-73.9510,58.3,272.0,202.0
+ VEHICLE003,40.8020,-73.9520,62.1,275.0,205.0
+ VEHICLE003,40.8030,-73.9530,220.5,278.0,208.0
+ VEHICLE003,40.8040,-73.9540,245.8,281.0,210.0
+ VEHICLE003,40.8050,-73.9550,51.2,285.0,212.0
+ VEHICLE003,40.8060,-73.9560,48.7,288.0,215.0
+ VEHICLE004,40.6500,-74.1000,25.0,45.0,50.0
+ VEHICLE004,40.6510,-74.1010,28.5,47.0,52.0
+ VEHICLE004,40.6520,-74.1020,31.2,49.0,55.0
+ VEHICLE004,40.6530,-74.1030,34.8,52.0,58.0
+ VEHICLE004,40.6540,-74.1040,37.5,55.0,60.0
+ VEHICLE004,40.6550,-74.1050,40.1,58.0,62.0
+ VEHICLE004,40.6560,-74.1060,42.8,60.0,65.0
+ VEHICLE004,40.6570,-74.1070,45.5,62.0,68.0
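The sample file deliberately embeds speed spikes (127-245 km/h bursts inside otherwise 25-65 km/h traces) so each detector has something to flag. A minimal stdlib sketch of the simplest of those checks, the excessive-speed indicator, over rows in this format:

```python
import csv
import io

# Three rows in the sample_data.csv format (randomized_id,lat,lng,spd,azm,alt)
csv_text = """randomized_id,lat,lng,spd,azm,alt
VEHICLE001,40.7128,-74.0060,45.5,90.0,100.0
VEHICLE002,40.7530,-73.9830,165.8,188.0,98.0
VEHICLE004,40.6500,-74.1000,25.0,45.0,50.0
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
# Same 120 km/h cutoff as the risk-factor extraction code
flagged = [row['randomized_id'] for row in rows if float(row['spd']) > 120]
print(flagged)  # -> ['VEHICLE002']
```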