Upload 2 files
- README.md +51 -50
- requirements.txt +2 -1
README.md
CHANGED
|
@@ -4,74 +4,75 @@ emoji: 🕳️
|
|
| 4 |
colorFrom: yellow
|
| 5 |
colorTo: red
|
| 6 |
sdk: gradio
|
| 7 |
-
sdk_version: 4.
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
---
|
| 11 |
|
| 12 |
-
#
|
| 13 |
|
| 14 |
-
|
| 15 |
|
| 16 |
-
##
|
| 17 |
|
| 18 |
-
|
| 19 |
-
```bash
|
| 20 |
-
pip install numpy pandas scikit-learn xgboost shap matplotlib joblib
|
| 21 |
-
```
|
| 22 |
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
```
|
| 27 |
|
| 28 |
-
##
|
| 29 |
|
| 30 |
-
|
| 31 |
-
| :--- | :--- |
|
| 32 |
-
| `severity_model_pipeline.py` | Main end-to-end pipeline script. |
|
| 33 |
-
| `synthetic_pothole_data.csv` | The generated dataset (10k samples). |
|
| 34 |
-
| `severity_model.json` | Trained XGBoost model (Native JSON format). |
|
| 35 |
-
| `feature_scaler.pkl` | MinMaxScaler for normalizing real-time features. |
|
| 36 |
-
| `feature_list.json` | JSON list ensuring correct feature ordering during inference. |
|
| 37 |
-
| `shap_bar_plot.png` | Global feature importance visualization. |
|
| 38 |
-
| `shap_dot_plot.png` | Detailed SHAP summary plot showing feature impact. |
|
| 39 |
|
| 40 |
-
|
| 41 |
|
| 42 |
-
|
| 43 |
|
| 44 |
-
|
| 45 |
-
- **D**: Defect density (fragmentation level).
|
| 46 |
-
- **C**: Centrality (distance from road center).
|
| 47 |
-
- **Q**: Detection confidence (CV confidence score).
|
| 48 |
-
- **M**: Multi-user confirmation score (crowdsourced weight).
|
| 49 |
-
- **T**: Temporal persistence (time since detection).
|
| 50 |
-
- **R**: Traffic importance (Highway: 1.0, Main: 0.7, Local: 0.4).
|
| 51 |
-
- **P**: Proximity to critical infrastructure (Hospitals, schools).
|
| 52 |
-
- **F**: Recurrence frequency (historical patch failure).
|
| 53 |
-
- **X**: Resolution failure score (reopen count).
|
| 54 |
|
| 55 |
-
|
| 56 |
|
| 57 |
-
|
| 58 |
-
$S_{base} = 0.28A + 0.10D + 0.14C + 0.04Q + 0.08M + 0.07T + 0.09R + 0.10P + 0.06F + 0.04X$
|
| 59 |
-
- **Infrastructure Boost**: $K = 1 + 0.5P$
|
| 60 |
-
- **Final Target**: $S = \min(1, S_{base} \cdot K + \text{Gaussian noise})$
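
The target construction above can be sketched in a few lines of Python. This is a minimal illustration, not the code from `severity_model_pipeline.py`: the function name `severity_target` and the `noise_std` parameter are assumptions, since the README does not state the noise scale or how the pipeline draws it.

```python
import random

# Weights copied from the S_base formula above; they sum to 1.0.
WEIGHTS = {
    "A": 0.28, "D": 0.10, "C": 0.14, "Q": 0.04, "M": 0.08,
    "T": 0.07, "R": 0.09, "P": 0.10, "F": 0.06, "X": 0.04,
}


def severity_target(features, noise_std=0.0, rng=None):
    """Return S = min(1, S_base * K + noise) for one sample.

    `features` maps each feature letter to a value in [0, 1].
    `noise_std` is a hypothetical parameter; the README does not give it.
    """
    s_base = sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    k = 1.0 + 0.5 * features["P"]  # infrastructure boost K
    noise = (rng or random).gauss(0.0, noise_std) if noise_std > 0 else 0.0
    return min(1.0, s_base * k + noise)


# With every feature at 0.5 and no noise: S_base = 0.5, K = 1.25.
sample = {name: 0.5 for name in WEIGHTS}
print(round(severity_target(sample), 4))
```

Because `K` multiplies the whole base score, `P` influences the target twice: once through its 0.10 weight and again through the boost.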
|
| 61 |
|
| 62 |
-
|
| 63 |
|
| 64 |
-
|
| 65 |
|
| 66 |
-
|
| 67 |
|
| 68 |
-
|
| 69 |
-
from severity_model_pipeline import predict_severity, load_inference_artefacts
|
| 70 |
|
| 71 |
-
|
| 72 |
-
model, scaler, features = load_inference_artefacts()
|
| 73 |
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
```
|
|
|
|
| 4 |
colorFrom: yellow
|
| 5 |
colorTo: red
|
| 6 |
sdk: gradio
|
| 7 |
+
sdk_version: 4.42.0
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
+
license: mit
|
| 11 |
+
tags:
|
| 12 |
+
- xgboost
|
| 13 |
+
- tabular-regression
|
| 14 |
+
- civic-tech
|
| 15 |
+
- pothole-detection
|
| 16 |
---
|
| 17 |
|
| 18 |
+
# Model Card for Pothole Severity Scoring
|
| 19 |
|
| 20 |
+
## Model Details
|
| 21 |
|
| 22 |
+
### Model Description
|
| 23 |
|
| 24 |
+
This XGBoost regressor predicts a priority (severity) score for civic infrastructure issues, specifically potholes. It combines structural, environmental, and temporal features into a score bounded between 0 and 1, which helps civic authorities prioritize repairs and allocate resources.
|
| 25 |
|
| 26 |
+
- **Developed by:** Civic AI System (Demo)
|
| 27 |
+
- **Model type:** XGBoost Regressor
|
| 28 |
+
- **License:** MIT
|
| 29 |
|
| 30 |
+
## Uses
|
| 31 |
|
| 32 |
+
### Direct Use
|
| 33 |
|
| 34 |
+
The model takes 10 engineered features that characterize a reported pothole and outputs:
|
| 35 |
+
- A numeric severity score ($S \in [0,1]$).
|
| 36 |
+
- A qualitative priority label ("Low", "Medium", "High").
|
| 37 |
|
| 38 |
+
These outputs are intended for sorting and prioritizing civil works dispatch queues.
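
As a rough illustration of how a score could be turned into a label and used to order a queue, consider the sketch below. The cut-offs `medium_at=0.4` and `high_at=0.7` and the report dictionaries are hypothetical; the model card does not state the thresholds the Space actually uses.

```python
def priority_label(score, medium_at=0.4, high_at=0.7):
    """Map a severity score in [0, 1] to "Low", "Medium", or "High".

    The default thresholds are assumptions made for this example only.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score >= high_at:
        return "High"
    if score >= medium_at:
        return "Medium"
    return "Low"


# Hypothetical dispatch queue: most severe reports first.
reports = [
    {"id": "r1", "score": 0.35},
    {"id": "r2", "score": 0.82},
    {"id": "r3", "score": 0.55},
]
queue = sorted(reports, key=lambda r: r["score"], reverse=True)
for report in queue:
    print(report["id"], priority_label(report["score"]))
```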
|
| 39 |
|
| 40 |
+
## Bias, Risks, and Limitations
|
| 41 |
|
| 42 |
+
The model weights proximity to critical infrastructure (`P`) and road hierarchy (`R`) heavily. This prioritizes highways and areas near hospitals, but it may systematically delay repairs in local or already neglected neighborhoods that lack designated "critical infrastructure". Run disparate impact assessments periodically to ensure equitable civic maintenance.
|
| 43 |
|
| 44 |
+
## Training Details
|
| 45 |
|
| 46 |
+
### Training Data
|
| 47 |
+
|
| 48 |
+
The model was trained on a synthetically generated dataset of `10,000` samples designed to mirror realistic distributions of civic reporting. Features include:
|
| 49 |
+
- `A`: Defect area ratio
|
| 50 |
+
- `D`: Defect density
|
| 51 |
+
- `C`: Centrality (distance from road center)
|
| 52 |
+
- `Q`: Initial detection confidence
|
| 53 |
+
- `M`: Multi-user confirmation score
|
| 54 |
+
- `T`: Temporal persistence (days unresolved)
|
| 55 |
+
- `R`: Traffic importance tier
|
| 56 |
+
- `P`: Proximity to critical infrastructure
|
| 57 |
+
- `F`: Recurrence frequency
|
| 58 |
+
- `X`: Resolution failure count
|
| 59 |
+
|
| 60 |
+
All features are min-max scaled to `[0, 1]`.
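
Min-max scaling maps each raw feature to `(x - min) / (max - min)` using bounds learned from the training data. The sketch below shows the idea in plain Python. The helper names are hypothetical, and clipping unseen values into `[0, 1]` is an assumption of this example rather than documented behavior of the project's scaler.

```python
def fit_min_max(rows):
    """Return per-column (min, max) bounds from a list of equal-length rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def transform_min_max(row, bounds):
    """Scale one row with previously fitted bounds, clipping to [0, 1]."""
    scaled = []
    for value, (lo, hi) in zip(row, bounds):
        span = hi - lo
        x = 0.0 if span == 0 else (value - lo) / span
        scaled.append(min(1.0, max(0.0, x)))  # clip values outside the fitted range
    return scaled


# Two raw columns, e.g. days unresolved and reopen count (illustrative values).
training_rows = [[0, 10], [30, 20]]
bounds = fit_min_max(training_rows)
print(transform_min_max([15, 15], bounds))
```

At inference time the same fitted bounds must be reused, so that a new report is scaled exactly as the training data was.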
|
| 61 |
+
|
| 62 |
+
### Training Procedure
|
| 63 |
|
| 64 |
+
- **Algorithm:** XGBoost
|
| 65 |
+
- **Objective:** `reg:squarederror`
|
| 66 |
+
- **Trees:** 200
|
| 67 |
+
- **Max Depth:** 5
|
| 68 |
+
- **Learning Rate:** 0.05
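
For reference, the settings listed above might map onto the scikit-learn style `XGBRegressor` constructor roughly as follows. Reading "Trees" as `n_estimators` is an assumption, and `build_regressor` is a hypothetical helper, not a function from this repository.

```python
# Hyperparameters from the list above, keyed by XGBRegressor argument names.
XGB_PARAMS = {
    "objective": "reg:squarederror",
    "n_estimators": 200,   # "Trees" in the model card (assumed mapping)
    "max_depth": 5,
    "learning_rate": 0.05,
}


def build_regressor(params=None):
    """Create an untrained XGBRegressor with the documented settings."""
    # Imported lazily so the parameter dict can be inspected without xgboost installed.
    from xgboost import XGBRegressor

    return XGBRegressor(**(params or XGB_PARAMS))
```

A call such as `build_regressor().fit(X_train, y_train)` would then train the model, where `X_train` and `y_train` stand in for the scaled features and the severity target.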
|
| 69 |
|
| 70 |
+
## Evaluation
|
| 71 |
|
| 72 |
+
### Testing Data, Factors & Metrics
|
| 73 |
|
| 74 |
+
The model was evaluated on a 20% holdout set (`N=2000`).
|
| 75 |
|
| 76 |
+
- **RMSE:** 0.0312
|
| 77 |
+
- **MAE:** 0.0247
|
| 78 |
+
- **R² Score:** 0.8067
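
The three metrics above follow their standard definitions, which the sketch below spells out in plain Python. It is an illustration on toy values, not the project's evaluation code, and the function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, and R² for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Toy values for illustration only; these are not the model's predictions.
metrics = regression_metrics([0.0, 0.5, 1.0], [0.1, 0.5, 0.9])
print({name: round(value, 4) for name, value in metrics.items()})
```

Since R² equals one minus MSE divided by the target variance, the reported RMSE and R² together imply a holdout target standard deviation of roughly 0.07, assuming both were computed on the same holdout set.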
|
|
|
requirements.txt
CHANGED
|
@@ -3,4 +3,5 @@ pandas
|
|
| 3 |
scikit-learn
|
| 4 |
xgboost
|
| 5 |
joblib
|
| 6 |
-
gradio
|
|
|
|
|
|
| 3 |
scikit-learn
|
| 4 |
xgboost
|
| 5 |
joblib
|
| 6 |
+
gradio>=4.42.0
|
| 7 |
+
pyaudioop
|