Add KPI Health Check Panel with multi-RAT analysis, persistent degradation detection, and Excel export functionality
- documentations/kpi_health_check_plan.md +230 -0
- panel_app/kpi_health_check_panel.py +475 -0
- panel_app/panel_portal.py +115 -0
- panel_app/trafic_analysis_panel.py +18 -10
- process_kpi/__init__.py +0 -0
- process_kpi/kpi_health_check/__init__.py +0 -0
- process_kpi/kpi_health_check/engine.py +210 -0
- process_kpi/kpi_health_check/export.py +38 -0
- process_kpi/kpi_health_check/io.py +45 -0
- process_kpi/kpi_health_check/multi_rat.py +126 -0
- process_kpi/kpi_health_check/normalization.py +219 -0
- process_kpi/kpi_health_check/rules.py +31 -0
documentations/kpi_health_check_plan.md
ADDED
@@ -0,0 +1,230 @@

# KPI Health Check (Panel) — Overall plan

## 1) Context & objective

In NPO we receive customer complaints (impacted site(s)). The objective is to be able, within a few minutes, to check whether the radio KPIs are:

- Recently degraded
- Degraded for a long time (persistent)
- Recovering / resolved

The goal is a Panel application that is simple to use yet "expert-friendly", standardizes the analysis, and produces an exportable report.

## 2) Inputs & data formats

### 2.1 Files

- KPI reports per technology: 2G / 3G / LTE
- Format: `.csv`, or `.zip` containing one (or more) `.csv`
- Separator: `;` (latin1), as in the existing apps

### 2.2 Expected key columns

- Date: `PERIOD_START_TIME` (sometimes date only, sometimes date+time)
- NE/site/cell identifier (varies by RAT):
  - 2G: `BCF name` / `DN`
  - 3G: `WBTS name` / `DN`
  - LTE: `LNBTS name` / `DN`

### 2.3 KPIs (list provided)

- The KPI columns are numerous and heterogeneous.
- Two kinds must be distinguished:
  - "rate" KPIs (availability, success rate, CSSR, etc.) => mean aggregation
  - "counter/volume/traffic" KPIs => sum aggregation

## 3) Expected outputs

### 3.1 UI results

- "Health Summary" table per site:
  - Number of degraded KPIs (per RAT)
  - Number of persistent KPIs
  - Top most critical KPIs
- Per-site drill-down:
  - Time-series curves
  - Degraded days
  - Baseline vs recent comparison

### 3.2 Export

- Excel export of a full report:
  - Datasets summary
  - KPI rules (thresholds/direction)
  - Site summary
  - Per-KPI/site details
  - Daily series (optional / depending on volume)

## 4) Detection model ("expert" logic)

### 4.1 Normalization

- Parse the date into a `datetime`
- Build `date_only`
- Extract a `site_code` from the NE name (numeric pattern / split, as in the traffic app)
- Enrich via `physical_db` (City, Latitude/Longitude), as in `trafic_analysis_panel.py`

### 4.2 KPI rules (configurable)

Each KPI must have:

- `direction`:
  - `higher_is_better` (availability, success rate, CSSR, throughput, traffic)
  - `lower_is_better` (drop rate, blocking, congestion, loss, discard, RTWP, PRB usage)
- `sla` (optional): absolute threshold
- `agg`: `mean` or `sum`

A sample rules table is sketched below.
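For illustration, generated rule rows could look like the following sketch (KPI names and SLA values are hypothetical; in the current UI the `direction`/`sla` columns are inferred automatically and stay editable, and `agg` follows the rate-vs-counter distinction from §2.3):

```python
import pandas as pd

# Hypothetical rule rows; real rows are produced per uploaded report by
# rules.infer_kpi_direction / rules.infer_kpi_sla and edited in the Rules table.
rules_df = pd.DataFrame(
    [
        {"RAT": "LTE", "KPI": "Cell Availability", "direction": "higher_is_better", "sla": 98.0, "agg": "mean"},
        {"RAT": "LTE", "KPI": "E-RAB Drop Rate", "direction": "lower_is_better", "sla": 2.0, "agg": "mean"},
        {"RAT": "3G", "KPI": "HSDPA Traffic", "direction": "higher_is_better", "sla": None, "agg": "sum"},
    ]
)
```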
### 4.3 Time windows

Global parameters:

- Baseline window (e.g. 30 days)
- Recent window (e.g. 7 days)
- Min consecutive bad days (e.g. 3 days) => persistent

### 4.4 Degradation criteria

For a (site, KPI, RAT) triple:

- **Relative degradation** vs baseline
  - e.g. change > X% in the "bad" direction
- **Absolute degradation** vs SLA
  - e.g. availability < 98% or drop > 2%

### 4.5 Classification (states)

- `OK`: no degradation
- `DEGRADED`: recently degraded
- `PERSISTENT_DEGRADED`: recently degraded + streak >= N days
- `RESOLVED` (V2): previously degraded but OK over the last days
- `NO_DATA`: no usable data points

A minimal sketch of this decision follows.
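Taken together, §4.3-§4.5 reduce to a small decision per (site, KPI, RAT). A minimal sketch, assuming median aggregates over the two windows and the default 10% threshold (a simplified version of what `engine.py` implements later in this commit):

```python
def kpi_is_bad(value: float, baseline: float, direction: str,
               rel_threshold_pct: float = 10.0, sla: float | None = None) -> bool:
    """Return True if `value` is degraded vs the SLA or vs the baseline."""
    # Absolute criterion (§4.4): the value crosses the SLA in the bad direction.
    if sla is not None:
        if direction == "higher_is_better" and value < sla:
            return True
        if direction == "lower_is_better" and value > sla:
            return True
    # Relative criterion (§4.4): drift > X% from baseline in the bad direction.
    thr = rel_threshold_pct / 100.0
    if direction == "higher_is_better":
        return value < baseline * (1.0 - thr)
    return value > baseline * (1.0 + thr)


assert kpi_is_bad(97.0, 99.5, "higher_is_better", sla=98.0)  # availability < 98%
assert kpi_is_bad(2.6, 1.0, "lower_is_better")               # drop rate far above baseline
assert not kpi_is_bad(99.2, 99.5, "higher_is_better", sla=98.0)
```

The state then follows §4.5: bad today with a streak >= N days gives `PERSISTENT_DEGRADED`, bad today alone gives `DEGRADED`, and bad earlier in the recent window but OK today gives `RESOLVED`.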
## 5) UX / screens (Panel)

### 5.0 Multipage mode (Portal)

The target app is a **multipage Panel portal** that groups several pages (apps) on a single Panel server.

Initial pages:

- Global Traffic Analysis (existing page: `panel_app/trafic_analysis_panel.py`)
- KPI Health Check (new page: `panel_app/kpi_health_check_panel.py`)

Navigation:

- side menu (page selector)
- the sidebar and the main content change with the selected page

### 5.1 Sidebar (configuration)

- 2G/3G/LTE upload
- Optional analysis period
- Parameters: baseline/recent/threshold/persistence
- Buttons:
  - Load & build rules
  - Run health check
  - Excel export

### 5.2 Main

- Datasets summary
- KPI Rules table (editable)
- Site Summary table
- Drill-down:
  - Site selection
  - RAT selection
  - KPI selection
  - KPI curve
  - KPI/site table (statuses)

## 6) Code architecture & organization

### 6.1 Modules

Objective: a **modular app**, not one monolithic file.

- `panel_app/kpi_health_check_panel.py`
  - Panel UI (widgets, layout)
  - callback wiring
  - no heavy "business" logic
- `panel_app/panel_portal.py`
  - home page + multipage navigation
  - imports the pages and renders them via `get_page_components()`
- `process_kpi/kpi_health_check/io.py`
  - ZIP/CSV reading
  - multi-CSV ZIP support (V2)
- `process_kpi/kpi_health_check/normalization.py`
  - date column detection / parsing
  - `site_code` extraction from BCF/WBTS/LNBTS/DN
  - daily aggregation (mean vs sum)
  - `physical_db` enrichment (City/Lat/Lon)
- `process_kpi/kpi_health_check/rules.py`
  - KPI rules generation (direction, SLA, agg)
  - rules validation / normalization
- `process_kpi/kpi_health_check/engine.py`
  - baseline vs recent computation
  - classification (OK / DEGRADED / PERSISTENT_DEGRADED / RESOLVED)
  - output table construction
- `process_kpi/kpi_health_check/multi_rat.py`
  - cross-RAT synthesis per site
  - multi-RAT "top anomalies"
- `process_kpi/kpi_health_check/export.py`
  - builds the Excel bytes (reuses `panel_app/convert_to_excel_panel.py`)

### 6.2 Key functions

- ZIP/CSV reading
- Date & ID column detection
- Daily dataset construction
- KPI rules generation
- Health check evaluation
- Excel export

### 6.3 `site_code` rule + physical DB enrichment (as in the traffic app)

- **Site code extraction**
  - main strategy: logic close to `trafic_analysis_panel.py` (split / numeric prefix of the name)
  - fallback: regex over a digit sequence in the name
- **Enrichment**
  - load `physical_db/physical_database.csv` via `get_physical_db()`
  - build `code` from `Code_Sector` (`split('_')[0]`), then cast to int
  - join on the code to retrieve `City`, `Longitude`, `Latitude`

A sketch of this extraction and join follows.
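A minimal sketch of the fallback extraction and the enrichment join (the physical DB rows and NE names below are made up):

```python
import re

import pandas as pd


def extract_site_code(name: str) -> int | None:
    # Fallback rule: first 4-7 digit run found in the NE name.
    m = re.search(r"(\d{4,7})", str(name))
    return int(m.group(1)) if m else None


# Hypothetical physical DB extract; the real one comes from get_physical_db().
phys = pd.DataFrame(
    {
        "Code_Sector": ["12345_1", "12345_2", "67890_1"],
        "City": ["CityA", "CityA", "CityB"],
        "Latitude": [36.8, 36.8, 34.7],
        "Longitude": [10.2, 10.2, 10.8],
    }
)
phys["code"] = phys["Code_Sector"].str.split("_").str[0].astype(int)
sites = phys.drop_duplicates("code")[["code", "City", "Latitude", "Longitude"]]

kpi_df = pd.DataFrame({"LNBTS name": ["LTE_12345_North", "LTE_67890_South"]})
kpi_df["site_code"] = kpi_df["LNBTS name"].map(extract_site_code)
enriched = kpi_df.merge(sites, left_on="site_code", right_on="code", how="left")
```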
## 7) Roadmap / iterations

### V1 (MVP)

- [DONE] 2G/3G/LTE upload (multi-RAT)
- [DONE] Numeric KPI detection
- [DONE] Editable KPI rules
- [DONE] DEGRADED / PERSISTENT_DEGRADED / OK detection
- [DONE] Simple drill-down + export

### V2 (expert)

- [DONE] RESOLVED (degraded, then OK)
- [DONE] Multi-CSV ZIP support
- [N/A] "cell-level" vs "site-level" switch (KPIs confirmed to be per site)
- [TODO] Criticality score (weight by traffic, population, customer criticality)
- [DONE] Multi-RAT "Top anomalies" table (cross-RAT)
- [TODO] Advanced visualizations (per-day heatmap, histograms, etc.)

### V3 (industrialization)

- [TODO] Rule presets per operator
- [TODO] Profile management / configuration saving
- [TODO] Automatic import of the "complaint sites list"
- [TODO] PDF generation (optional) and evidence pack

## 8) Open points to confirm

- [DONE] KPIs are per site
- [DONE] Do the ZIPs sometimes contain several CSVs? (multi-CSV support implemented)
- [PARTIAL] Exact `PERIOD_START_TIME` format across all reports? (parsing hardened, to be validated on your files)
- [TODO] Site code extraction: single rule, or naming-dependent?

## 9) Success criteria

- Load a KPI report and get a top list of degraded sites in < 1 minute
- Quickly isolate: since when, on which KPIs, and whether it is persistent
- An Excel export usable for internal sharing
panel_app/kpi_health_check_panel.py
ADDED
@@ -0,0 +1,475 @@
import io
import os
import sys
from datetime import date

import pandas as pd
import panel as pn
import plotly.express as px

ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if ROOT_DIR not in sys.path:
    sys.path.insert(0, ROOT_DIR)

from process_kpi.kpi_health_check.engine import evaluate_health_check
from process_kpi.kpi_health_check.export import build_export_bytes
from process_kpi.kpi_health_check.io import read_bytes_to_df
from process_kpi.kpi_health_check.multi_rat import compute_multirat_views
from process_kpi.kpi_health_check.normalization import (
    build_daily_kpi,
    infer_date_col,
    infer_id_col,
)
from process_kpi.kpi_health_check.rules import infer_kpi_direction, infer_kpi_sla

pn.extension("plotly", "tabulator")

PLOTLY_CONFIG = {"displaylogo": False, "scrollZoom": True, "displayModeBar": True}


def read_fileinput_to_df(file_input: pn.widgets.FileInput) -> pd.DataFrame | None:
    if file_input is None or not file_input.value:
        return None

    return read_bytes_to_df(file_input.value, file_input.filename or "")


current_daily_by_rat: dict[str, pd.DataFrame] = {}
current_rules_df: pd.DataFrame | None = None
current_status_df: pd.DataFrame | None = None
current_summary_df: pd.DataFrame | None = None
current_multirat_df: pd.DataFrame | None = None
current_top_anomalies_df: pd.DataFrame | None = None
current_export_bytes: bytes | None = None

file_2g = pn.widgets.FileInput(name="2G KPI report", accept=".csv,.zip")
file_3g = pn.widgets.FileInput(name="3G KPI report", accept=".csv,.zip")
file_lte = pn.widgets.FileInput(name="LTE KPI report", accept=".csv,.zip")

analysis_range = pn.widgets.DateRangePicker(name="Analysis date range (optional)")
baseline_days = pn.widgets.IntInput(name="Baseline window (days)", value=30)
recent_days = pn.widgets.IntInput(name="Recent window (days)", value=7)
rel_threshold_pct = pn.widgets.FloatInput(
    name="Relative change threshold (%)", value=10.0, step=1.0
)
min_consecutive_days = pn.widgets.IntInput(
    name="Min consecutive bad days (persistent)", value=3
)

load_button = pn.widgets.Button(
    name="Load datasets & build rules", button_type="primary"
)
run_button = pn.widgets.Button(name="Run health check", button_type="primary")

status_pane = pn.pane.Alert(
    "Upload KPI reports (ZIP/CSV), then load datasets and run health check.",
    alert_type="primary",
)

datasets_table = pn.widgets.Tabulator(
    height=180, sizing_mode="stretch_width", layout="fit_data_table"
)
rules_table = pn.widgets.Tabulator(
    height=260, sizing_mode="stretch_width", layout="fit_data_table"
)

try:
    rules_table.editable = True
except Exception:  # noqa: BLE001
    try:
        cfg = dict(rules_table.configuration or {})
        cfg["editable"] = True
        rules_table.configuration = cfg
    except Exception:  # noqa: BLE001
        pass

site_summary_table = pn.widgets.Tabulator(
    height=260, sizing_mode="stretch_width", layout="fit_data_table"
)

multirat_summary_table = pn.widgets.Tabulator(
    height=260, sizing_mode="stretch_width", layout="fit_data_table"
)

top_anomalies_table = pn.widgets.Tabulator(
    height=260, sizing_mode="stretch_width", layout="fit_data_table"
)

site_select = pn.widgets.AutocompleteInput(
    name="Select a site (Type to search)",
    options={},
    case_sensitive=False,
    search_strategy="includes",
    restrict=True,
)
rat_select = pn.widgets.RadioButtonGroup(
    name="RAT", options=["2G", "3G", "LTE"], value="LTE"
)
kpi_select = pn.widgets.Select(name="KPI", options=[])

site_kpi_table = pn.widgets.Tabulator(
    height=260, sizing_mode="stretch_width", layout="fit_data_table"
)
trend_plot_pane = pn.pane.Plotly(sizing_mode="stretch_both", config=PLOTLY_CONFIG)

export_button = pn.widgets.FileDownload(
    label="Download KPI Health Check report",
    filename="KPI_Health_Check_Report.xlsx",
    button_type="primary",
)


def _filtered_daily(df: pd.DataFrame) -> pd.DataFrame:
    if df is None or df.empty:
        return pd.DataFrame()
    if (
        analysis_range.value
        and len(analysis_range.value) == 2
        and analysis_range.value[0]
        and analysis_range.value[1]
    ):
        start, end = analysis_range.value
        mask = (df["date_only"] >= start) & (df["date_only"] <= end)
        return df[mask].copy()
    return df


def _update_site_options() -> None:
    all_sites = []
    for df in current_daily_by_rat.values():
        if df is None or df.empty:
            continue
        cols = [c for c in ["site_code", "City"] if c in df.columns]
        all_sites.append(df[cols].drop_duplicates("site_code"))

    if not all_sites:
        site_select.options = {}
        site_select.value = None
        return

    sites_df = pd.concat(all_sites, ignore_index=True).drop_duplicates("site_code")
    if "City" not in sites_df.columns:
        sites_df["City"] = pd.NA

    sites_df = sites_df.sort_values(by=["City", "site_code"], na_position="last")

    opts: dict[str, int] = {}
    for _, row in sites_df.iterrows():
        label = (
            f"{row['City']}_{row['site_code']}"
            if pd.notna(row.get("City"))
            else str(row["site_code"])
        )
        opts[str(label)] = int(row["site_code"])

    site_select.options = opts
    if opts and site_select.value not in opts.values():
        site_select.value = next(iter(opts.values()))


def _update_kpi_options() -> None:
    rat = rat_select.value
    df = current_daily_by_rat.get(rat)
    if df is None or df.empty:
        kpi_select.options = []
        kpi_select.value = None
        return

    kpis = [
        c
        for c in df.columns
        if c not in {"site_code", "date_only", "Longitude", "Latitude", "City", "RAT"}
    ]
    kpis = sorted([str(c) for c in kpis])
    kpi_select.options = kpis
    if kpis and kpi_select.value not in kpis:
        kpi_select.value = kpis[0]


def _update_site_view(event=None) -> None:
    if current_status_df is None or current_status_df.empty:
        site_kpi_table.value = pd.DataFrame()
        trend_plot_pane.object = None
        return

    code = site_select.value
    rat = rat_select.value
    kpi = kpi_select.value

    if code is None or rat is None:
        site_kpi_table.value = pd.DataFrame()
        trend_plot_pane.object = None
        return

    site_df = current_status_df[
        (current_status_df["site_code"] == int(code))
        & (current_status_df["RAT"] == rat)
    ].copy()
    site_kpi_table.value = site_df

    daily = current_daily_by_rat.get(rat)
    if daily is None or daily.empty or not kpi or kpi not in daily.columns:
        trend_plot_pane.object = None
        return

    d = _filtered_daily(daily)
    s = d[d["site_code"] == int(code)].copy().sort_values("date_only")
    if s.empty:
        trend_plot_pane.object = None
        return

    title = f"{rat} - {kpi} - site {int(code)}"
    fig = px.line(s, x="date_only", y=kpi, markers=True)
    fig.update_layout(template="plotly_white", title=title)
    trend_plot_pane.object = fig


def load_datasets(event=None) -> None:
    try:
        status_pane.alert_type = "primary"
        status_pane.object = "Loading datasets..."

        global current_daily_by_rat, current_rules_df
        global current_status_df, current_summary_df, current_export_bytes
        global current_multirat_df, current_top_anomalies_df

        current_daily_by_rat = {}
        current_rules_df = None
        current_status_df = None
        current_summary_df = None
        current_multirat_df = None
        current_top_anomalies_df = None
        current_export_bytes = None

        site_summary_table.value = pd.DataFrame()
        multirat_summary_table.value = pd.DataFrame()
        top_anomalies_table.value = pd.DataFrame()
        site_kpi_table.value = pd.DataFrame()
        trend_plot_pane.object = None

        inputs = {"2G": file_2g, "3G": file_3g, "LTE": file_lte}
        rows = []
        rules_rows = []

        loaded_any = False
        for rat, widget in inputs.items():
            df_raw = read_fileinput_to_df(widget)
            if df_raw is None:
                continue
            loaded_any = True

            date_col = None
            id_col = None
            try:
                date_col = infer_date_col(df_raw)
            except Exception:  # noqa: BLE001
                date_col = None
            try:
                id_col = infer_id_col(df_raw, rat)
            except Exception:  # noqa: BLE001
                id_col = None

            daily, kpi_cols = build_daily_kpi(df_raw, rat)
            current_daily_by_rat[rat] = daily

            d = _filtered_daily(daily)
            rows.append(
                {
                    "RAT": rat,
                    "rows_raw": int(df_raw.shape[0]),
                    "cols_raw": int(df_raw.shape[1]),
                    "date_col": date_col,
                    "id_col": id_col,
                    "sites": int(d["site_code"].nunique()),
                    "days": int(d["date_only"].nunique()),
                    "kpis": int(len(kpi_cols)),
                }
            )

            for kpi in kpi_cols:
                direction = infer_kpi_direction(kpi)
                rules_rows.append(
                    {
                        "RAT": rat,
                        "KPI": kpi,
                        "direction": direction,
                        "sla": infer_kpi_sla(kpi, direction),
                    }
                )

        if not loaded_any:
            raise ValueError("Please upload at least one KPI report")

        datasets_table.value = pd.DataFrame(rows)

        rules_df = (
            pd.DataFrame(rules_rows)
            .drop_duplicates(subset=["RAT", "KPI"])
            .sort_values(by=["RAT", "KPI"])
        )
        current_rules_df = rules_df
        rules_table.value = rules_df

        _update_site_options()
        _update_kpi_options()

        status_pane.alert_type = "success"
        status_pane.object = (
            "Datasets loaded. Edit KPI rules if needed, then run health check."
        )

    except Exception as exc:  # noqa: BLE001
        status_pane.alert_type = "danger"
        status_pane.object = f"Error: {exc}"


def run_health_check(event=None) -> None:
    try:
        status_pane.alert_type = "primary"
        status_pane.object = "Running health check..."

        global current_status_df, current_summary_df, current_export_bytes
        global current_multirat_df, current_top_anomalies_df

        rules_df = (
            rules_table.value
            if isinstance(rules_table.value, pd.DataFrame)
            else pd.DataFrame()
        )
        if rules_df.empty:
            raise ValueError("KPI rules table is empty")

        all_status = []
        all_summary = []

        for rat, daily in current_daily_by_rat.items():
            d = _filtered_daily(daily)
            status_df, summary_df = evaluate_health_check(
                d,
                rat,
                rules_df,
                int(baseline_days.value),
                int(recent_days.value),
                float(rel_threshold_pct.value),
                int(min_consecutive_days.value),
            )
            if not status_df.empty:
                all_status.append(status_df)
            if not summary_df.empty:
                all_summary.append(summary_df)

        current_status_df = (
            pd.concat(all_status, ignore_index=True) if all_status else pd.DataFrame()
        )
        current_summary_df = (
            pd.concat(all_summary, ignore_index=True) if all_summary else pd.DataFrame()
        )
        site_summary_table.value = current_summary_df

        current_multirat_df, current_top_anomalies_df = compute_multirat_views(
            current_status_df
        )
        multirat_summary_table.value = current_multirat_df
        top_anomalies_table.value = current_top_anomalies_df

        current_export_bytes = _build_export_bytes()

        _update_site_view()

        status_pane.alert_type = "success"
        status_pane.object = "Health check completed."

    except Exception as exc:  # noqa: BLE001
        status_pane.alert_type = "danger"
        status_pane.object = f"Error: {exc}"


def _build_export_bytes() -> bytes:
    return build_export_bytes(
        (
            datasets_table.value
            if isinstance(datasets_table.value, pd.DataFrame)
            else None
        ),
        rules_table.value if isinstance(rules_table.value, pd.DataFrame) else None,
        current_summary_df if isinstance(current_summary_df, pd.DataFrame) else None,
        current_status_df if isinstance(current_status_df, pd.DataFrame) else None,
        (
            current_multirat_df
            if isinstance(current_multirat_df, pd.DataFrame)
            else None
        ),
        (
            current_top_anomalies_df
            if isinstance(current_top_anomalies_df, pd.DataFrame)
            else None
        ),
    )


def _export_callback() -> io.BytesIO:
    data = current_export_bytes or b""
    if not data:
        return io.BytesIO()
    return io.BytesIO(data)


load_button.on_click(load_datasets)
run_button.on_click(run_health_check)

rat_select.param.watch(lambda e: (_update_kpi_options(), _update_site_view()), "value")
site_select.param.watch(_update_site_view, "value")
kpi_select.param.watch(_update_site_view, "value")

export_button.callback = _export_callback


# Page layout components (used by the multipage portal)
sidebar = pn.Column(
    file_2g,
    file_3g,
    file_lte,
    "---",
    analysis_range,
    baseline_days,
    recent_days,
    rel_threshold_pct,
    min_consecutive_days,
    "---",
    load_button,
    run_button,
    "---",
    export_button,
)

main = pn.Column(
    status_pane,
    pn.pane.Markdown("## Datasets"),
    datasets_table,
    pn.pane.Markdown("## KPI Rules (editable)"),
    rules_table,
    pn.pane.Markdown("## Site Summary"),
    site_summary_table,
    pn.pane.Markdown("## Multi-RAT Summary"),
    multirat_summary_table,
    pn.pane.Markdown("## Top anomalies (cross-RAT)"),
    top_anomalies_table,
    pn.layout.Divider(),
    pn.pane.Markdown("## Drill-down"),
    pn.Row(site_select, rat_select, kpi_select),
    pn.Row(
        pn.Column(site_kpi_table, sizing_mode="stretch_width"),
        pn.Column(trend_plot_pane, sizing_mode="stretch_both"),
    ),
)


def get_page_components():
    return sidebar, main


if __name__ == "__main__":
    template = pn.template.MaterialTemplate(title="KPI Health Check - Panel")
    template.sidebar.append(sidebar)
    template.main.append(main)
    template.servable()
panel_app/panel_portal.py
ADDED
@@ -0,0 +1,115 @@
import os
import sys

import panel as pn

ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if ROOT_DIR not in sys.path:
    sys.path.insert(0, ROOT_DIR)

pn.extension("plotly", "tabulator")

import kpi_health_check_panel

# Import pages (kept as modules, not nested templates)
import trafic_analysis_panel

PAGES = {
    "📊 Global Traffic Analysis": {
        "get_components": trafic_analysis_panel.get_page_components,
        "description": "Analyse trafic multi-RAT + cartes + exports.",
    },
    "📈 KPI Health Check": {
        "get_components": kpi_health_check_panel.get_page_components,
        "description": "Détection KPI dégradés/persistants/résolus + drill-down + export.",
    },
}

HOME_PAGE = "🏠 Gallery"

page_sidebar_container = pn.Column(sizing_mode="stretch_width")
page_main_container = pn.Column(sizing_mode="stretch_both")

page_title = pn.pane.Markdown("", sizing_mode="stretch_width")
back_button = pn.widgets.Button(
    name="← Back to gallery",
    button_type="primary",
    width=180,
)

home_button = pn.widgets.Button(
    name=HOME_PAGE,
    button_type="default",
    width_policy="max",
)


def _load_page(page_name: str) -> None:
    if page_name == HOME_PAGE:
        page_title.object = "## Applications"

        tiles = []
        for title, meta in PAGES.items():
            btn = pn.widgets.Button(name="Open", button_type="primary", width=120)
            btn.on_click(lambda e, t=title: _load_page(t))

            tile = pn.Column(
                pn.pane.Markdown(f"### {title}\n\n{meta.get('description', '')}"),
                btn,
                sizing_mode="stretch_width",
                margin=(10, 10, 10, 10),
            )
            tiles.append(tile)

        gallery = pn.GridBox(*tiles, ncols=2, sizing_mode="stretch_width")
        page_sidebar_container.objects = [
            pn.pane.Markdown(
                """### Bienvenue\n\nChoisis une application dans la gallery."""
            )
        ]
        page_main_container.objects = [page_title, gallery]
        return

    meta = PAGES.get(page_name)
    if meta is None:
        page_sidebar_container.objects = [
            pn.pane.Alert("Unknown page", alert_type="danger")
        ]
        page_main_container.objects = []
        return

    sidebar, main = meta["get_components"]()
    page_title.object = f"## {page_name}"
    page_sidebar_container.objects = [sidebar]
    page_main_container.objects = [
        pn.Row(back_button, pn.Spacer(), sizing_mode="stretch_width"),
        page_title,
        main,
    ]


template = pn.template.MaterialTemplate(title="OML DB - Panel Portal")


def _go_home(event=None) -> None:
    _load_page(HOME_PAGE)


back_button.on_click(_go_home)
home_button.on_click(_go_home)

_load_page(HOME_PAGE)

template.sidebar.append(
    pn.Column(
        pn.pane.Markdown("## Navigation"),
        home_button,
        pn.layout.Divider(),
        page_sidebar_container,
        sizing_mode="stretch_width",
    )
)

template.main.append(page_main_container)

template.servable()
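The page contract used by the portal is simply a module-level `get_page_components()` returning `(sidebar, main)`. A new page could plug in as in this sketch (`my_new_panel` is a hypothetical module name):

```python
# panel_app/my_new_panel.py (hypothetical page module)
import panel as pn

sidebar = pn.Column(pn.pane.Markdown("### My settings"))
main = pn.Column(pn.pane.Markdown("## My page content"))


def get_page_components():
    return sidebar, main


# Then, in panel_portal.py:
#   import my_new_panel
#   PAGES["🧪 My New Page"] = {
#       "get_components": my_new_panel.get_page_components,
#       "description": "Example page.",
#   }
```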
panel_app/trafic_analysis_panel.py
CHANGED
@@ -16,12 +16,16 @@ if ROOT_DIR not in sys.path:
 from panel_app.convert_to_excel_panel import write_dfs_to_excel
 from utils.utils_vars import get_physical_db
 
-pn.extension(
-    "
-    "
-
-
-
+pn.extension(
+    "plotly",
+    "tabulator",
+    raw_css=[
+        ":fullscreen { background-color: white; overflow: auto; }",
+        "::backdrop { background-color: white; }",
+        ".plot-fullscreen-wrapper:fullscreen { padding: 20px; display: flex; flex-direction: column; }",
+        ".plot-fullscreen-wrapper:fullscreen > * { height: 100% !important; width: 100% !important; }",
+    ],
+)
 
 
 def read_fileinput_to_df(file_input: pn.widgets.FileInput) -> pd.DataFrame | None:
@@ -1480,7 +1484,7 @@ def _update_site_view(event=None) -> None: # noqa: D401, ARG001
     ]
     first_row = site_detail_df.iloc[0]
     site_label = f"{first_row['code']}"
-    if pd.notna(first_row.get(
+    if pd.notna(first_row.get("City")):
         site_label += f" ({first_row['City']})"
 
     if traffic_cols:
@@ -2389,8 +2393,12 @@ main_content = pn.Column(
     export_button,
 )
 
-
-
+
+def get_page_components():
+    return sidebar_content, main_content
 
 
-
+if __name__ == "__main__":
+    template.sidebar.append(sidebar_content)
+    template.main.append(main_content)
+    template.servable()
process_kpi/__init__.py
ADDED
File without changes
process_kpi/kpi_health_check/__init__.py
ADDED
File without changes
process_kpi/kpi_health_check/engine.py
ADDED
@@ -0,0 +1,210 @@
from datetime import date, timedelta

import numpy as np
import pandas as pd


def window_bounds(end_date: date, days: int) -> tuple[date, date]:
    start = end_date - timedelta(days=days - 1)
    return start, end_date


def is_bad(
    value: float | None,
    baseline: float | None,
    direction: str,
    rel_threshold_pct: float,
    sla: float | None,
) -> bool:
    if value is None or (isinstance(value, float) and np.isnan(value)):
        return False
    bad = False
    if sla is not None and not (isinstance(sla, float) and np.isnan(sla)):
        if direction == "higher_is_better":
            bad = bad or (value < float(sla))
        else:
            bad = bad or (value > float(sla))

    if baseline is None or (isinstance(baseline, float) and np.isnan(baseline)):
        return bad

    thr = float(rel_threshold_pct) / 100.0
    if direction == "higher_is_better":
        return bad or (value < baseline * (1.0 - thr))
    return bad or (value > baseline * (1.0 + thr))


def max_consecutive_days(dates: list[date]) -> int:
    if not dates:
        return 0
    dates_sorted = sorted(set(dates))
    streak = 1
    best = 1
    for prev, cur in zip(dates_sorted, dates_sorted[1:]):
        if cur == prev + timedelta(days=1):
            streak += 1
        else:
            streak = 1
        if streak > best:
            best = streak
    return best


def evaluate_health_check(
    daily: pd.DataFrame,
    rat: str,
    rules_df: pd.DataFrame,
    baseline_days_n: int,
    recent_days_n: int,
    rel_threshold_pct: float,
    min_consecutive_days: int,
) -> tuple[pd.DataFrame, pd.DataFrame]:
    if daily.empty:
        return pd.DataFrame(), pd.DataFrame()

    end_date = max(daily["date_only"])
    recent_start, recent_end = window_bounds(end_date, int(recent_days_n))
    baseline_end = recent_start - timedelta(days=1)
    baseline_start = baseline_end - timedelta(days=int(baseline_days_n) - 1)

    rat_rules = rules_df[rules_df["RAT"] == rat].copy()
    kpis = [k for k in rat_rules["KPI"].tolist() if k in daily.columns]

    rows = []

    for site_code, g_site in daily.groupby("site_code"):
        city = (
            g_site["City"].dropna().iloc[0]
            if ("City" in g_site.columns and g_site["City"].notna().any())
            else None
        )
        g_site = g_site.sort_values("date_only")

        for kpi in kpis:
            rule = rat_rules[rat_rules["KPI"] == kpi].iloc[0]
            direction = str(rule.get("direction", "higher_is_better"))
            sla = rule.get("sla", np.nan)
            try:
                sla_val = float(sla) if pd.notna(sla) else None
            except Exception:
                sla_val = None

            s = g_site[["date_only", kpi]].dropna(subset=[kpi])
            if s.empty:
                rows.append(
                    {
                        "RAT": rat,
                        "site_code": int(site_code),
                        "City": city,
                        "KPI": kpi,
                        "status": "NO_DATA",
                    }
                )
                continue

            baseline_mask = (s["date_only"] >= baseline_start) & (
                s["date_only"] <= baseline_end
            )
            recent_mask = (s["date_only"] >= recent_start) & (
                s["date_only"] <= recent_end
            )

            baseline = (
                s.loc[baseline_mask, kpi].median() if baseline_mask.any() else np.nan
            )
            recent = s.loc[recent_mask, kpi].median() if recent_mask.any() else np.nan

            daily_recent = s.loc[recent_mask, ["date_only", kpi]].copy()
            bad_dates = []
            if not daily_recent.empty:
                for d, v in zip(
                    daily_recent["date_only"].tolist(), daily_recent[kpi].tolist()
                ):
                    if is_bad(
                        float(v) if pd.notna(v) else None,
                        float(baseline) if pd.notna(baseline) else None,
                        direction,
                        rel_threshold_pct,
                        sla_val,
                    ):
                        bad_dates.append(d)

            max_streak = max_consecutive_days(bad_dates)
            persistent = max_streak >= int(min_consecutive_days)

            is_bad_recent = is_bad(
                float(recent) if pd.notna(recent) else None,
                float(baseline) if pd.notna(baseline) else None,
                direction,
                rel_threshold_pct,
                sla_val,
            )

            is_bad_current = is_bad_recent
            if not daily_recent.empty:
                last_row = daily_recent.sort_values("date_only").iloc[-1]
                last_val = last_row[kpi]
                is_bad_current = is_bad(
                    float(last_val) if pd.notna(last_val) else None,
                    float(baseline) if pd.notna(baseline) else None,
                    direction,
                    rel_threshold_pct,
                    sla_val,
                )

            had_bad_recent = (len(bad_dates) > 0) or bool(is_bad_recent)

            if is_bad_current and persistent:
                status = "PERSISTENT_DEGRADED"
            elif is_bad_current:
                status = "DEGRADED"
            elif had_bad_recent:
                status = "RESOLVED"
            else:
                status = "OK"

            rows.append(
                {
                    "RAT": rat,
                    "site_code": int(site_code),
                    "City": city,
                    "KPI": kpi,
                    "direction": direction,
                    "sla": sla_val,
                    "baseline_median": baseline,
                    "recent_median": recent,
                    "bad_days_recent": len(bad_dates),
                    "max_streak_recent": int(max_streak),
                    "status": status,
                }
            )

    status_df = pd.DataFrame(rows)

    summary_rows = []
    for site_code, g in status_df.groupby("site_code"):
        city = (
            g["City"].dropna().iloc[0]
            if ("City" in g.columns and g["City"].notna().any())
            else None
        )
        degraded_cnt = int(g["status"].isin(["DEGRADED", "PERSISTENT_DEGRADED"]).sum())
        persistent_cnt = int((g["status"] == "PERSISTENT_DEGRADED").sum())
        resolved_cnt = int((g["status"] == "RESOLVED").sum())
        summary_rows.append(
            {
                "RAT": rat,
                "site_code": int(site_code),
                "City": city,
                "degraded_kpis": degraded_cnt,
                "persistent_kpis": persistent_cnt,
                "resolved_kpis": resolved_cnt,
            }
        )

    summary_df = pd.DataFrame(summary_rows).sort_values(
        by=["degraded_kpis", "persistent_kpis", "resolved_kpis"],
        ascending=[False, False, False],
    )

    return status_df, summary_df
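A quick sanity check of the two helpers above (assuming the repo root is on `sys.path` so the module imports):

```python
from datetime import date

from process_kpi.kpi_health_check.engine import is_bad, max_consecutive_days

# Relative rule: 88.0 is more than 10% below a 99.5 baseline (higher_is_better).
assert is_bad(88.0, 99.5, "higher_is_better", rel_threshold_pct=10.0, sla=None)
# Absolute rule alone: 97.0 < SLA 98.0, even with no baseline available.
assert is_bad(97.0, None, "higher_is_better", rel_threshold_pct=10.0, sla=98.0)

# Streak detection: Jan 2-4 is a 3-day run; the gap before Jan 7 resets it.
days = [date(2024, 1, 2), date(2024, 1, 3), date(2024, 1, 4), date(2024, 1, 7)]
assert max_consecutive_days(days) == 3
```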
process_kpi/kpi_health_check/export.py
ADDED
@@ -0,0 +1,38 @@
import pandas as pd

from panel_app.convert_to_excel_panel import write_dfs_to_excel


def build_export_bytes(
    datasets_df: pd.DataFrame | None,
    rules_df: pd.DataFrame | None,
    summary_df: pd.DataFrame | None,
    status_df: pd.DataFrame | None,
    multirat_summary_df: pd.DataFrame | None = None,
    top_anomalies_df: pd.DataFrame | None = None,
) -> bytes:
    dfs = [
        datasets_df if isinstance(datasets_df, pd.DataFrame) else pd.DataFrame(),
        rules_df if isinstance(rules_df, pd.DataFrame) else pd.DataFrame(),
        summary_df if isinstance(summary_df, pd.DataFrame) else pd.DataFrame(),
        status_df if isinstance(status_df, pd.DataFrame) else pd.DataFrame(),
        (
            multirat_summary_df
            if isinstance(multirat_summary_df, pd.DataFrame)
            else pd.DataFrame()
        ),
        (
            top_anomalies_df
            if isinstance(top_anomalies_df, pd.DataFrame)
            else pd.DataFrame()
        ),
    ]
    sheet_names = [
        "Datasets",
        "KPI_Rules",
        "Site_Summary",
        "Site_KPI_Status",
        "MultiRAT_Summary",
        "Top_Anomalies",
    ]
    return write_dfs_to_excel(dfs, sheet_names, index=False)
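Usage is a direct pass-through to the shared Excel writer; a minimal sketch with placeholder tables (assumes the repo's `write_dfs_to_excel` is importable):

```python
import pandas as pd

from process_kpi.kpi_health_check.export import build_export_bytes

xlsx_bytes = build_export_bytes(
    datasets_df=pd.DataFrame({"RAT": ["LTE"], "sites": [1]}),
    rules_df=None,  # missing tables are written as empty sheets
    summary_df=None,
    status_df=None,
)
with open("KPI_Health_Check_Report.xlsx", "wb") as f:
    f.write(xlsx_bytes)
```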
process_kpi/kpi_health_check/io.py
ADDED
@@ -0,0 +1,45 @@
import io
import zipfile

import pandas as pd


def read_bytes_to_df(file_bytes: bytes, filename: str) -> pd.DataFrame:
    if not file_bytes:
        raise ValueError("Empty file")

    filename_l = (filename or "").lower()
    data = io.BytesIO(file_bytes)

    if filename_l.endswith(".zip"):
        with zipfile.ZipFile(data) as z:
            csv_files = [f for f in z.namelist() if f.lower().endswith(".csv")]
            if not csv_files:
                raise ValueError("No CSV file found in the ZIP archive")
            dfs = []
            for csv_name in csv_files:
                try:
                    with z.open(csv_name) as f:
                        df = pd.read_csv(
                            f,
                            encoding="latin1",
                            sep=";",
                            low_memory=False,
                        )
                    if isinstance(df, pd.DataFrame) and not df.empty:
                        dfs.append(df)
                except Exception:
                    continue

        if not dfs:
            raise ValueError("No readable CSV content found in the ZIP archive")

        if len(dfs) == 1:
            return dfs[0]

        return pd.concat(dfs, ignore_index=True, sort=False)

    if filename_l.endswith(".csv"):
        return pd.read_csv(data, encoding="latin1", sep=";", low_memory=False)

    raise ValueError("Unsupported file format. Please upload a ZIP or CSV file.")
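A self-contained check of the multi-CSV ZIP path, building the archive in memory (file and column names are arbitrary):

```python
import io
import zipfile

from process_kpi.kpi_health_check.io import read_bytes_to_df

csv_text = "PERIOD_START_TIME;LNBTS name;kpi_a\n01.01.2024;LTE_12345;99,5\n"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("part1.csv", csv_text)
    z.writestr("part2.csv", csv_text)

df = read_bytes_to_df(buf.getvalue(), "report.zip")
assert len(df) == 2  # both CSVs were read and concatenated
```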
process_kpi/kpi_health_check/multi_rat.py
ADDED
@@ -0,0 +1,126 @@
import pandas as pd


def compute_multirat_views(
    status_df: pd.DataFrame,
) -> tuple[pd.DataFrame, pd.DataFrame]:
    if status_df is None or status_df.empty:
        return pd.DataFrame(), pd.DataFrame()

    df = status_df.copy()
    df["is_degraded"] = df["status"].isin(["DEGRADED", "PERSISTENT_DEGRADED"])
    df["is_persistent"] = df["status"].isin(["PERSISTENT_DEGRADED"])
    df["is_resolved"] = df["status"].isin(["RESOLVED"])

    def _first_city(s: pd.Series):
        s2 = s.dropna()
        return s2.iloc[0] if not s2.empty else None

    base = (
        df.groupby("site_code", as_index=False)
        .agg(
            City=("City", _first_city),
            degraded_kpis_total=("is_degraded", "sum"),
            persistent_kpis_total=("is_persistent", "sum"),
            resolved_kpis_total=("is_resolved", "sum"),
        )
        .copy()
    )

    impacted = (
        df[df["is_degraded"]]
        .groupby("site_code")["RAT"]
        .nunique()
        .rename("impacted_rats")
        .reset_index()
    )

    resolved_pivot = (
        df[df["is_resolved"]]
        .pivot_table(
            index="site_code",
            columns="RAT",
            values="KPI",
            aggfunc="count",
            fill_value=0,
        )
        .rename(columns=lambda c: f"resolved_{c}")
        .reset_index()
    )

    base = pd.merge(base, impacted, on="site_code", how="left")
    base["impacted_rats"] = base["impacted_rats"].fillna(0).astype(int)

    degraded_pivot = (
        df[df["is_degraded"]]
        .pivot_table(
            index="site_code",
            columns="RAT",
            values="KPI",
            aggfunc="count",
            fill_value=0,
        )
        .rename(columns=lambda c: f"degraded_{c}")
        .reset_index()
    )

    persistent_pivot = (
        df[df["is_persistent"]]
        .pivot_table(
            index="site_code",
            columns="RAT",
            values="KPI",
            aggfunc="count",
            fill_value=0,
        )
        .rename(columns=lambda c: f"persistent_{c}")
        .reset_index()
    )

    out = base
    if not degraded_pivot.empty:
        out = pd.merge(out, degraded_pivot, on="site_code", how="left")
    if not persistent_pivot.empty:
        out = pd.merge(out, persistent_pivot, on="site_code", how="left")
    if not resolved_pivot.empty:
        out = pd.merge(out, resolved_pivot, on="site_code", how="left")

    metric_cols = [c for c in out.columns if c != "City"]
    out[metric_cols] = out[metric_cols].fillna(0)
    out = out.sort_values(
        by=["persistent_kpis_total", "degraded_kpis_total", "impacted_rats"],
        ascending=[False, False, False],
    )

    top = df[df["is_degraded"]].copy()
    sev = {"PERSISTENT_DEGRADED": 2, "DEGRADED": 1}
    top["severity"] = top["status"].map(sev).fillna(0).astype(int)

    for col in ["bad_days_recent", "max_streak_recent"]:
        if col not in top.columns:
            top[col] = pd.NA

    top = top.sort_values(
        by=["severity", "max_streak_recent", "bad_days_recent"],
        ascending=[False, False, False],
    )

    top_cols = [
        c
        for c in [
            "severity",
            "RAT",
            "site_code",
            "City",
            "KPI",
            "status",
            "baseline_median",
            "recent_median",
            "bad_days_recent",
            "max_streak_recent",
        ]
        if c in top.columns
    ]
    top = top[top_cols].head(300)

    return out, top
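A toy run of `compute_multirat_views` on a hand-made status table (values are illustrative):

```python
import pandas as pd

from process_kpi.kpi_health_check.multi_rat import compute_multirat_views

status_df = pd.DataFrame(
    [
        {"RAT": "LTE", "site_code": 12345, "City": "CityA", "KPI": "k1",
         "status": "PERSISTENT_DEGRADED", "bad_days_recent": 5, "max_streak_recent": 5},
        {"RAT": "3G", "site_code": 12345, "City": "CityA", "KPI": "k2",
         "status": "DEGRADED", "bad_days_recent": 2, "max_streak_recent": 2},
        {"RAT": "2G", "site_code": 67890, "City": "CityB", "KPI": "k3",
         "status": "RESOLVED", "bad_days_recent": 1, "max_streak_recent": 1},
    ]
)
site_view, top = compute_multirat_views(status_df)

# Site 12345 is degraded on 2 RATs (LTE + 3G), so it ranks first.
assert int(site_view.iloc[0]["site_code"]) == 12345
assert int(site_view.iloc[0]["impacted_rats"]) == 2
assert int(top.iloc[0]["severity"]) == 2  # PERSISTENT_DEGRADED sorts first
```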
process_kpi/kpi_health_check/normalization.py
ADDED
|
@@ -0,0 +1,219 @@
```python
import re

import numpy as np
import pandas as pd

from utils.utils_vars import get_physical_db


def to_numeric(series: pd.Series) -> pd.Series:
    """Coerce a possibly locale-formatted column to floats."""
    if pd.api.types.is_numeric_dtype(series):
        return pd.to_numeric(series, errors="coerce")
    s = series.astype(str)
    s = s.str.replace("\u00a0", "", regex=False)  # non-breaking spaces
    s = s.str.replace(" ", "", regex=False)  # space thousands separators
    s = s.str.replace("%", "", regex=False)  # percent suffixes
    s = s.str.replace(",", ".", regex=False)  # decimal commas
    s = s.replace({"nan": np.nan, "None": np.nan, "": np.nan})
    return pd.to_numeric(s, errors="coerce")
```
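A quick illustration of what `to_numeric` tolerates (sample values): percent signs, decimal commas, and space or non-breaking-space thousands separators all coerce to floats, while unparseable tokens become NaN:

```python
import pandas as pd

s = pd.Series(["98,7%", "1\u00a0234,5", "12 345", "N/A", ""])
print(to_numeric(s).tolist())  # [98.7, 1234.5, 12345.0, nan, nan]
```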
+
def parse_datetime(series: pd.Series) -> pd.Series:
|
| 22 |
+
if series.empty:
|
| 23 |
+
return pd.to_datetime(series, errors="coerce")
|
| 24 |
+
first = series.dropna().astype(str).iloc[0] if series.dropna().any() else ""
|
| 25 |
+
|
| 26 |
+
formats: list[str | None] = []
|
| 27 |
+
if len(first) > 10:
|
| 28 |
+
formats.extend(
|
| 29 |
+
[
|
| 30 |
+
"%m.%d.%Y %H:%M:%S",
|
| 31 |
+
"%d.%m.%Y %H:%M:%S",
|
| 32 |
+
"%Y-%m-%d %H:%M:%S",
|
| 33 |
+
"%Y/%m/%d %H:%M:%S",
|
| 34 |
+
"%d/%m/%Y %H:%M:%S",
|
| 35 |
+
"%m/%d/%Y %H:%M:%S",
|
| 36 |
+
]
|
| 37 |
+
)
|
| 38 |
+
formats.extend(
|
| 39 |
+
[
|
| 40 |
+
"%m.%d.%Y",
|
| 41 |
+
"%d.%m.%Y",
|
| 42 |
+
"%Y-%m-%d",
|
| 43 |
+
"%Y/%m/%d",
|
| 44 |
+
"%d/%m/%Y",
|
| 45 |
+
"%m/%d/%Y",
|
| 46 |
+
]
|
| 47 |
+
)
|
| 48 |
+
|
| 49 |
+
for fmt in formats:
|
| 50 |
+
dt = pd.to_datetime(series, errors="coerce", format=fmt)
|
| 51 |
+
if dt.notna().any():
|
| 52 |
+
return dt
|
| 53 |
+
|
| 54 |
+
return pd.to_datetime(series, errors="coerce")
|
| 55 |
+
|
| 56 |
+
|
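Because the loop returns on the first format that parses at least one value, unambiguous day-first data falls through cleanly (sample values below), but a source whose rows are all ambiguous (e.g. `01.02.2024`) will be read with the month-first format; worth keeping in mind when reports mix conventions:

```python
import pandas as pd

s = pd.Series(["15.01.2024", "16.01.2024"])
# "%m.%d.%Y" parses nothing here (there is no month 15 or 16), so the
# loop falls through to "%d.%m.%Y".
print(parse_datetime(s).dt.strftime("%Y-%m-%d").tolist())
# ['2024-01-15', '2024-01-16']
```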
```python
def extract_site_code(value: object) -> int | None:
    """Pull the first 4-7 digit run out of a DN / NE name."""
    if value is None or (isinstance(value, float) and np.isnan(value)):
        return None
    s = str(value)
    m = re.search(r"(\d{4,7})", s)
    if not m:
        return None
    try:
        return int(m.group(1))
    except ValueError:
        return None
```
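For example, on DN-style strings (illustrative values), short digit runs such as BSC numbers are skipped and the first 4-7 digit run is taken as the site code:

```python
print(extract_site_code("PLMN-PLMN/BSC-101/BCF-12345"))  # 12345 ("101" is too short)
print(extract_site_code("LNBTS-4071203"))                # 4071203
print(extract_site_code("no digits"))                    # None
```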
+
def infer_date_col(df: pd.DataFrame) -> str:
|
| 71 |
+
for c in ["PERIOD_START_TIME", "PERIOD_START_DATE", "date", "Date", "DATE"]:
|
| 72 |
+
if c in df.columns:
|
| 73 |
+
return c
|
| 74 |
+
raise ValueError("Cannot find a date column (expected PERIOD_START_TIME)")
|
| 75 |
+
|
| 76 |
+
|
| 77 |
+
def infer_id_col(df: pd.DataFrame, rat: str) -> str:
|
| 78 |
+
rat_candidates = {
|
| 79 |
+
"2G": ["BCF name", "BCF", "BTS name", "BSC name", "DN"],
|
| 80 |
+
"3G": ["WBTS name", "WBTS ID", "DN"],
|
| 81 |
+
"LTE": ["LNBTS name", "MRBTS/SBTS name", "DN"],
|
| 82 |
+
}
|
| 83 |
+
|
| 84 |
+
candidates = [c for c in rat_candidates.get(rat, []) if c in df.columns]
|
| 85 |
+
if not candidates and "DN" in df.columns:
|
| 86 |
+
candidates = ["DN"]
|
| 87 |
+
if not candidates:
|
| 88 |
+
raise ValueError(f"Cannot infer an entity/site column for {rat} dataset")
|
| 89 |
+
|
| 90 |
+
physical_codes: set[int] | None = None
|
| 91 |
+
try:
|
| 92 |
+
physical = load_physical_db()
|
| 93 |
+
if not physical.empty and "code" in physical.columns:
|
| 94 |
+
physical_codes = set(
|
| 95 |
+
pd.to_numeric(physical["code"], errors="coerce")
|
| 96 |
+
.dropna()
|
| 97 |
+
.astype(int)
|
| 98 |
+
.tolist()
|
| 99 |
+
)
|
| 100 |
+
except Exception:
|
| 101 |
+
physical_codes = None
|
| 102 |
+
|
| 103 |
+
if not physical_codes:
|
| 104 |
+
return candidates[0]
|
| 105 |
+
|
| 106 |
+
best_col = candidates[0]
|
| 107 |
+
best_score = -1.0
|
| 108 |
+
for c in candidates:
|
| 109 |
+
sample = df[c].head(2000)
|
| 110 |
+
codes = sample.apply(extract_site_code)
|
| 111 |
+
non_null = float(codes.notna().mean()) if len(codes) else 0.0
|
| 112 |
+
|
| 113 |
+
if physical_codes:
|
| 114 |
+
match = (
|
| 115 |
+
float(codes.dropna().astype(int).isin(physical_codes).mean())
|
| 116 |
+
if codes.notna().any()
|
| 117 |
+
else 0.0
|
| 118 |
+
)
|
| 119 |
+
score = match * 10.0 + non_null
|
| 120 |
+
else:
|
| 121 |
+
score = non_null
|
| 122 |
+
|
| 123 |
+
if score > best_score:
|
| 124 |
+
best_score = score
|
| 125 |
+
best_col = c
|
| 126 |
+
|
| 127 |
+
return best_col
|
| 128 |
+
|
| 129 |
+
|
| 130 |
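A worked example of the scoring (toy rates): a column whose extracted codes mostly match the physical DB beats one that is merely well filled, because the match rate carries a 10x weight:

```python
# e.g. "DN": always filled but rarely yields a known site code
score_dn = 0.10 * 10.0 + 0.99  # 1.99
# e.g. "LNBTS name": slightly sparser but codes match the physical DB
score_lnbts = 0.95 * 10.0 + 0.90  # 10.40 -> "LNBTS name" is selected
print(score_dn, score_lnbts)
```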
+
def non_kpi_identifier_cols(df: pd.DataFrame, rat: str) -> set[str]:
|
| 131 |
+
common = {
|
| 132 |
+
"DN",
|
| 133 |
+
"PLMN name",
|
| 134 |
+
"RNC name",
|
| 135 |
+
"BSC name",
|
| 136 |
+
"BCF name",
|
| 137 |
+
"MRBTS/SBTS name",
|
| 138 |
+
"LNBTS name",
|
| 139 |
+
"WBTS name",
|
| 140 |
+
"WBTS ID",
|
| 141 |
+
}
|
| 142 |
+
rat_specific = {
|
| 143 |
+
"2G": {"BSC name", "BSC", "BCF name", "BCF", "BTS name"},
|
| 144 |
+
"3G": {"PLMN name", "RNC name", "WBTS name", "WBTS ID"},
|
| 145 |
+
"LTE": {"MRBTS/SBTS name", "LNBTS name"},
|
| 146 |
+
}
|
| 147 |
+
cols = set()
|
| 148 |
+
for c in common.union(rat_specific.get(rat, set())):
|
| 149 |
+
if c in df.columns:
|
| 150 |
+
cols.add(c)
|
| 151 |
+
return cols
|
| 152 |
+
|
| 153 |
+
|
| 154 |
+
def infer_agg(kpi: str) -> str:
|
| 155 |
+
k = str(kpi).lower()
|
| 156 |
+
if any(x in k for x in ["traffic", "volume", "erl", "total", "gbytes", "gb"]):
|
| 157 |
+
return "sum"
|
| 158 |
+
return "mean"
|
| 159 |
+
|
| 160 |
+
|
| 161 |
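Sample classifications (KPI names invented for illustration): traffic and volume counters are summed over the day, everything else is averaged:

```python
for kpi in ["Data Volume (GBytes)", "TCH traffic (Erl)", "CSSR CS", "Cell Availability"]:
    print(kpi, "->", infer_agg(kpi))
# Data Volume (GBytes) -> sum
# TCH traffic (Erl)    -> sum
# CSSR CS              -> mean
# Cell Availability    -> mean
```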
+
def load_physical_db() -> pd.DataFrame:
|
| 162 |
+
physical_db = get_physical_db().copy()
|
| 163 |
+
physical_db["code"] = physical_db["Code_Sector"].str.split("_").str[0]
|
| 164 |
+
physical_db["code"] = pd.to_numeric(physical_db["code"], errors="coerce")
|
| 165 |
+
physical_db = physical_db.dropna(subset=["code"])
|
| 166 |
+
physical_db["code"] = physical_db["code"].astype(int)
|
| 167 |
+
keep = [
|
| 168 |
+
c for c in ["code", "Longitude", "Latitude", "City"] if c in physical_db.columns
|
| 169 |
+
]
|
| 170 |
+
return physical_db[keep].drop_duplicates("code")
|
| 171 |
+
|
| 172 |
+
|
| 173 |
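The `Code_Sector` split assumes identifiers shaped like `<site>_<sector>` (illustrative values below); rows whose prefix is not numeric are removed by the `dropna` step:

```python
import pandas as pd

code_sector = pd.Series(["12345_1", "12345_2", "ABC_3"])
codes = pd.to_numeric(code_sector.str.split("_").str[0], errors="coerce")
print(codes.tolist())  # [12345.0, 12345.0, nan] -> "ABC_3" is discarded
```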
+
def build_daily_kpi(df_raw: pd.DataFrame, rat: str) -> tuple[pd.DataFrame, list[str]]:
|
| 174 |
+
df = df_raw.copy()
|
| 175 |
+
date_col = infer_date_col(df)
|
| 176 |
+
id_col = infer_id_col(df, rat)
|
| 177 |
+
|
| 178 |
+
df["date"] = parse_datetime(df[date_col])
|
| 179 |
+
df = df.dropna(subset=["date"])
|
| 180 |
+
df["date_only"] = df["date"].dt.date
|
| 181 |
+
|
| 182 |
+
df["site_code"] = df[id_col].apply(extract_site_code)
|
| 183 |
+
df = df.dropna(subset=["site_code"])
|
| 184 |
+
df["site_code"] = df["site_code"].astype(int)
|
| 185 |
+
|
| 186 |
+
meta = {date_col, id_col, "date", "date_only", "site_code"}
|
| 187 |
+
meta = meta.union(non_kpi_identifier_cols(df, rat))
|
| 188 |
+
candidate_cols = [c for c in df.columns if c not in meta]
|
| 189 |
+
|
| 190 |
+
numeric_cols: dict[str, pd.Series] = {}
|
| 191 |
+
for c in candidate_cols:
|
| 192 |
+
numeric_cols[c] = to_numeric(df[c])
|
| 193 |
+
|
| 194 |
+
numeric_df = pd.DataFrame(numeric_cols)
|
| 195 |
+
kpi_cols = [c for c in numeric_df.columns if numeric_df[c].notna().any()]
|
| 196 |
+
if not kpi_cols:
|
| 197 |
+
raise ValueError(f"No numeric KPI columns detected for {rat}")
|
| 198 |
+
|
| 199 |
+
base = pd.concat(
|
| 200 |
+
[
|
| 201 |
+
df[["site_code", "date_only"]].reset_index(drop=True),
|
| 202 |
+
numeric_df[kpi_cols].reset_index(drop=True),
|
| 203 |
+
],
|
| 204 |
+
axis=1,
|
| 205 |
+
)
|
| 206 |
+
|
| 207 |
+
agg_dict = {k: infer_agg(k) for k in kpi_cols}
|
| 208 |
+
daily = base.groupby(["site_code", "date_only"], as_index=False).agg(agg_dict)
|
| 209 |
+
|
| 210 |
+
physical = load_physical_db()
|
| 211 |
+
if not physical.empty:
|
| 212 |
+
daily = pd.merge(
|
| 213 |
+
daily, physical, left_on="site_code", right_on="code", how="left"
|
| 214 |
+
)
|
| 215 |
+
daily = daily.drop(columns=[c for c in ["code"] if c in daily.columns])
|
| 216 |
+
|
| 217 |
+
daily["RAT"] = rat
|
| 218 |
+
|
| 219 |
+
return daily, kpi_cols
|
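End to end, the loader is meant to be called once per RAT on the raw report. A sketch (the file name is illustrative; `;`/latin1 is the repo's CSV convention):

```python
import pandas as pd

raw = pd.read_csv("lte_report.csv", sep=";", encoding="latin1")
daily, kpi_cols = build_daily_kpi(raw, rat="LTE")
# daily: one row per (site_code, date_only), enriched with City/Longitude/
# Latitude from the physical DB plus a constant RAT column;
# kpi_cols: the numeric KPI columns that were detected and aggregated.
```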
process_kpi/kpi_health_check/rules.py
ADDED
@@ -0,0 +1,31 @@
```python
def infer_kpi_direction(kpi: str) -> str:
    k = str(kpi).lower()
    lower_is_better = [
        "drop",
        "dcr",
        "blocking",
        "block",
        "congestion",
        "loss",
        "discard",
        "rtwp",
        "prb usage",
        "usage",
        "fail",
    ]
    if any(x in k for x in lower_is_better):
        return "lower_is_better"
    return "higher_is_better"


def infer_kpi_sla(kpi: str, direction: str) -> float | None:
    k = str(kpi).lower()
    if direction == "higher_is_better" and any(
        x in k for x in ["availability", "cssr", "success", " sr"]
    ):
        return 98.0
    if direction == "lower_is_better" and any(
        x in k for x in ["drop", "dcr", "blocking", "congestion", "loss", "discard"]
    ):
        return 2.0
    return None
```
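Sample outputs (KPI names invented for illustration); note that a KPI can have a known direction but no default SLA:

```python
for kpi in ["Cell Availability", "CSSR CS", "TCH drop call ratio", "Average RTWP"]:
    direction = infer_kpi_direction(kpi)
    print(kpi, direction, infer_kpi_sla(kpi, direction))
# Cell Availability    higher_is_better 98.0
# CSSR CS              higher_is_better 98.0
# TCH drop call ratio  lower_is_better  2.0
# Average RTWP         lower_is_better  None
```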