Upload 5 files
- README.md +42 -12
- app.py +1247 -0
- btech.png +0 -0
- data.csv +0 -0
- requirements.txt +13 -2
README.md
CHANGED
@@ -1,19 +1,49 @@
---
title: MineVision AI - Advanced Fatigue Analytics
emoji: ⛏️
colorFrom: blue
colorTo: red
sdk: streamlit
sdk_version: 1.38.0 # Replace with the Streamlit version in use
app_file: app.py
pinned: false
license: apache-2.0
---

# MineVision AI - Advanced Fatigue Analytics

## Description
This application is a web-based fatigue analytics dashboard designed for mining operations. Using data from fatigue detection systems (such as Wenco DSS), it provides real-time insights and analysis to help identify, assess, and manage operator fatigue risk. The goal is to improve workplace safety and productivity by reducing fatigue-related accidents.

## Key Features
* **Executive Dashboard**: Displays key safety metrics such as total alerts, the number of operators and assets, and the average event duration.
* **Trend Analysis**: Visualizes fatigue trends by hour, shift, day of the week, and week.
* **Advanced Analysis**: Analysis by fleet type, speed vs. hour, duration vs. hour, speed distribution, and operator distribution per shift.
* **Fatigue Risk Categorization**: Analyzes events against a fatigue risk matrix (Critical, High, Medium, Low) based on speed and time.
* **AI-Based Insights**: Automatic summaries and insights based on the analyzed data.
* **Interactive AI Assistant**: A simple chatbot for asking about the fatigue data (operator with the most events, shift with the most events, etc.).
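
A fatigue risk matrix of the kind listed above could be sketched as follows. This is a minimal illustration, not the logic in `app.py`: the function name `categorize_fatigue_risk`, the speed thresholds (20 and 40 km/h), and the 00:00-05:59 high-risk window are all assumptions chosen for the example.

```python
def categorize_fatigue_risk(speed_kmh: float, hour: int) -> str:
    """Map one fatigue event to Critical/High/Medium/Low (hypothetical thresholds)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # Assumption: early-morning hours (before 06:00) carry extra fatigue risk.
    night = hour < 6
    # Assumption: speed bands in km/h; tune these to the site's own rules.
    if speed_kmh >= 40:
        speed_band = 2
    elif speed_kmh >= 20:
        speed_band = 1
    else:
        speed_band = 0
    # Combine both axes of the matrix into a single 0..3 score.
    score = speed_band + (1 if night else 0)
    return ["Low", "Medium", "High", "Critical"][score]
```

Keeping the matrix in one small pure function makes the thresholds easy to review and to adjust without touching the dashboard code.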

## Technologies Used
* **Streamlit**: Framework for building interactive web applications in Python.
* **Pandas**: Data manipulation and analysis.
* **Plotly/Plotly Express**: Interactive data visualization.
* **Openpyxl**: Reading Excel files.

## How to Use
1. Open the application via its Hugging Face Spaces URL.
2. Use the sidebar filters to narrow the data by Year, Month, Week, Date Range, Operator, Shift, and Hour Range.
3. Explore the dashboard sections to understand fatigue patterns.
4. Use the "MineVision AI Assistant" chat box at the top to ask specific questions about the data.
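
The sidebar filtering in step 2 can be thought of as a chain of optional predicates over the event records. The sketch below uses only the standard library; the `FatigueEvent` fields and the `filter_events` helper are hypothetical names for illustration, not the app's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Tuple


@dataclass
class FatigueEvent:
    timestamp: datetime
    operator: str
    shift: str


def filter_events(
    events: Iterable[FatigueEvent],
    year: Optional[int] = None,
    operator: Optional[str] = None,
    shift: Optional[str] = None,
    hour_range: Optional[Tuple[int, int]] = None,
) -> List[FatigueEvent]:
    """Keep events matching every filter that is set; None means 'no filter'."""
    result = []
    for ev in events:
        if year is not None and ev.timestamp.year != year:
            continue
        if operator is not None and ev.operator != operator:
            continue
        if shift is not None and ev.shift != shift:
            continue
        if hour_range is not None:
            start, end = hour_range  # inclusive bounds
            if not start <= ev.timestamp.hour <= end:
                continue
        result.append(ev)
    return result


# Made-up sample records for illustration only.
events = [
    FatigueEvent(datetime(2024, 5, 1, 2, 30), "OP-01", "Night"),
    FatigueEvent(datetime(2024, 5, 1, 14, 0), "OP-02", "Day"),
    FatigueEvent(datetime(2023, 12, 9, 3, 15), "OP-01", "Night"),
]
night_2024 = filter_events(events, year=2024, shift="Night", hour_range=(0, 5))
```

Treating each unset filter as "match everything" mirrors a sidebar where every control defaults to "All".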

## Project Structure
* `app.py`: Main file containing the Streamlit application code.
* `requirements.txt`: List of Python dependencies required to run the application.
* `manual fatique.xlsx`: Sample input data file (if included in the repository).

## Notes
* This application is designed to analyze operator fatigue data from an Excel file. Make sure the input data structure matches, or adapt the code to read data from another source.
* Insights and recommendations are based on analysis of historical data and the principles of a fatigue risk management system (FRMS).
* The AI assistant currently provides simple rule-based answers based on the available data and general FRMS information. It is not an advanced AI model such as GPT.
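
A rule-based assistant of the kind described in the last note can be as simple as keyword matching over pre-computed counts. The sketch below is an illustration under stated assumptions: the `answer` function, the keyword rules, and the record layout are invented for the example and are not taken from `app.py`.

```python
from collections import Counter
from typing import Dict, List


def answer(question: str, records: List[Dict[str, str]]) -> str:
    """Answer a question about fatigue records using simple keyword rules."""
    q = question.lower()
    if not records:
        return "No data available."
    if "operator" in q:
        name, count = Counter(r["operator"] for r in records).most_common(1)[0]
        return f"Operator with the most events: {name} ({count})"
    if "shift" in q:
        name, count = Counter(r["shift"] for r in records).most_common(1)[0]
        return f"Shift with the most events: {name} ({count})"
    # Fallback when no rule matches (no language model is involved).
    return "Sorry, I can only answer questions about operators or shifts."


# Made-up sample records for illustration only.
records = [
    {"operator": "OP-01", "shift": "Night"},
    {"operator": "OP-01", "shift": "Night"},
    {"operator": "OP-02", "shift": "Day"},
]
```

Because every answer comes from an explicit rule, the assistant's behavior is predictable, at the cost of only handling the questions it has rules for.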

## License
Apache 2.0
app.py
ADDED
@@ -0,0 +1,1247 @@
import streamlit as st
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
from datetime import datetime, timedelta
from typing import List
import os

# =================== PAGE CONFIG ===================
st.set_page_config(
    page_title="PLN Audit Insight & Intelligence Dashboard",
    page_icon="",
    layout="wide",
    initial_sidebar_state="expanded"
)

# =================== CUSTOM CSS (Updated for PLN Colors) ===================
st.markdown("""<style>
.main-header {
    background-color: white;
    padding: 25px;
    border-radius: 12px;
    margin-bottom: 25px;
    box-shadow: 0 4px 12px rgba(0,0,0,0.06);
    border: 1px solid #e0e0e0;
}
h1, h2, h3, h4, h5, .stMarkdown h1, .stMarkdown h2, .stMarkdown h3 {
    text-align: center;
    font-weight: 700;
    color: #003DA5; /* Dark Blue - PLN Color */
}
.metric-card {
    background: white;
    padding: 16px;
    border-radius: 10px;
    box-shadow: 0 3px 10px rgba(0,0,0,0.05);
    text-align: center;
    border: 1px solid #f0f0f0;
}
.ai-insight {
    background: #f0f4ff; /* Light Blue */
    padding: 14px 18px;
    border-left: 4px solid #003DA5; /* PLN Blue */
    margin: 10px 0;
    border-radius: 0 6px 6px 0;
    font-size: 0.95em;
}
.ai-recommendation {
    background: #e8f5e9;
    padding: 14px 18px;
    border-left: 4px solid #4caf50;
    margin: 10px 0;
    border-radius: 0 6px 6px 0;
    font-size: 0.95em;
}
.risk-very-high { color: #c62828; font-weight: bold; }
.risk-high { color: #d32f2f; }
.risk-moderate { color: #f57c00; }
.risk-slight { color: #388e3c; }
.trend-worsening { color: #d32f2f; }
.trend-improvement { color: #388e3c; }
.trend-stable { color: #616161; }
.chart-container {
    border: 1px solid #e0e0e0;
    border-radius: 8px;
    padding: 15px;
    margin: 10px 0;
    background-color: white;
    box-shadow: 0 2px 6px rgba(0,0,0,0.03);
}
.section-title {
    color: #003DA5; /* PLN Blue */
    font-weight: 700;
    font-size: 1.5em;
    text-align: left;
    margin-top: 20px;
    margin-bottom: 10px;
}
.ai-section {
    background: #ffffff;
    padding: 20px;
    border-radius: 8px;
    margin: 10px 0;
    box-shadow: 0 2px 6px rgba(0,0,0,0.03);
}
/* PLN Styled Selectbox and Multiselect */
.stSelectbox > label, .stMultiselect > label {
    color: #003DA5; /* PLN Blue */
    font-weight: bold;
}
.stSelectbox > div > div, .stMultiselect > div > div {
    border: 2px solid #003DA5; /* PLN Blue Border */
    border-radius: 8px;
}
.st-bq {
    background-color: #f0f4ff; /* Light Blue Background */
}
.stButton > button {
    background-color: #003DA5; /* PLN Blue Button */
    color: white;
    border: none;
    border-radius: 8px;
    padding: 8px 16px;
    font-weight: bold;
}
.stButton > button:hover {
    background-color: #0050A0; /* Darker PLN Blue on Hover */
}
/* Filter Container Styling */
.filter-container {
    background-color: #f9f9f9;
    border-radius: 10px;
    padding: 15px;
    margin-bottom: 15px;
    box-shadow: 0 2px 4px rgba(0,0,0,0.05);
}
.filter-title {
    color: #003DA5;
    font-weight: bold;
    font-size: 1.1em;
    margin-bottom: 10px;
    text-align: center;
}
</style>""", unsafe_allow_html=True)

# =================== DATA LOADING (FROM data.xlsx) ===================
@st.cache_data(ttl=300)  # refresh every 5 min
def load_data():
    file_path = "data.xlsx"
    if not os.path.exists(file_path):
        st.error(f"❌ File **`{file_path}`** not found. Please ensure it's in the same directory as this script.")
        return pd.DataFrame()
    try:
        # Load Excel file
        df = pd.read_excel(file_path, sheet_name='Sheet1', engine='openpyxl')
        # Check for required columns
        required_cols = ['created_at']
        missing = [c for c in required_cols if c not in df.columns]
        if missing:
            st.error(f"❌ Missing required columns: {missing}. Available: {list(df.columns)}")
            return pd.DataFrame()
        # Parse datetime
        df['created_at'] = pd.to_datetime(df['created_at'], errors='coerce')
        if df['created_at'].isna().all():
            st.error("❌ `created_at` column could not be parsed as datetime.")
            return pd.DataFrame()
        # Optional: close_at
        if 'close_at' in df.columns:
            df['close_at'] = pd.to_datetime(df['close_at'], errors='coerce')
            df['days_to_close'] = (df['close_at'] - df['created_at']).dt.total_seconds() / (24 * 3600)
            # Negative durations (closed before created) are treated as invalid
            df['days_to_close'] = df['days_to_close'].apply(lambda x: x if x >= 0 else np.nan)
        else:
            df['days_to_close'] = np.nan
        # Derived columns
        df['created_month'] = df['created_at'].dt.to_period('M')
        df['created_date'] = df['created_at'].dt.date
        df['created_week'] = df['created_at'].dt.to_period('W')
        # Keep only valid rows
        df = df.dropna(subset=['created_at']).copy()
        # Log shape
        st.sidebar.success(f"Loaded {len(df):,} Audit Findings from `data.xlsx`")
        return df
    except Exception as e:
        # st.exception expects an exception object, so report the message with st.error
        st.error(f"Error loading data.xlsx: {e}")
        return pd.DataFrame()

df = load_data()
if df.empty:
    st.stop()

# =================== SIDEBAR FILTERS (Fixed) ===================
st.sidebar.markdown('<div class="filter-container">', unsafe_allow_html=True)
st.sidebar.markdown('<h4 class="filter-title">Filter Dashboard</h4>', unsafe_allow_html=True)

# Initialize df_filtered
df_filtered = df.copy()

# Flag to track if filters were applied
filters_applied = False

# 1. Date Range Filter
min_date = df['created_at'].min().date()
max_date = df['created_at'].max().date()
date_range = st.sidebar.date_input(
    "Date Range",
    value=(min_date, max_date),
    min_value=min_date,
    max_value=max_date
)

# 2. Filter by Vendor (nama_perusahaan) - Default to All
if 'nama_perusahaan' in df.columns:
    unique_vendors = sorted(df['nama_perusahaan'].dropna().astype(str).unique())
    all_vendors_option = "All Vendors"
    vendor_options = [all_vendors_option] + list(unique_vendors)
    selected_vendor = st.sidebar.selectbox("Vendor", vendor_options, index=0)  # Default to "All"
    if selected_vendor != all_vendors_option:
        df_filtered = df_filtered[df_filtered['nama_perusahaan'].astype(str) == selected_vendor]
        filters_applied = True

# 3. Filter by Area/Unit Type (temuan_nama_distrik or creator_nama_distrik) - Renamed
area_col = None
if 'temuan_nama_distrik' in df_filtered.columns:
    area_col = 'temuan_nama_distrik'
elif 'creator_nama_distrik' in df_filtered.columns:
    area_col = 'creator_nama_distrik'

if area_col:
    # Define mapping for display names
    area_mapping = {
        'UMRO': 'Unit Maintenance',
        'UP GRESIK': 'Unit Pembangkit'
    }
    unique_areas_raw = sorted(df_filtered[area_col].dropna().astype(str).unique())
    # Map raw values to display names, keep unmapped values as is
    unique_areas_display = [area_mapping.get(area, area) for area in unique_areas_raw]
    # Prepend "All" option
    all_areas_option = "All Units"
    area_options = [all_areas_option] + unique_areas_display
    selected_area_display = st.sidebar.selectbox("Unit Type", area_options, index=0)  # Default to "All"
    if selected_area_display != all_areas_option:
        # Reverse map the selected display name back to the raw value for filtering
        selected_area_raw = next((raw for raw, disp in area_mapping.items() if disp == selected_area_display), selected_area_display)
        df_filtered = df_filtered[df_filtered[area_col].astype(str) == selected_area_raw]
        filters_applied = True

# 4. Status filter - Changed to selectbox (dropdown)
status_filter_applied = False
if 'temuan_status' in df_filtered.columns:
    all_status = sorted(df_filtered['temuan_status'].dropna().astype(str).unique())
    # Prepend "All" option
    all_status_option = "All Status"
    status_options = [all_status_option] + list(all_status)
    selected_status = st.sidebar.selectbox(
        "Status",
        status_options,
        index=0  # Default to "All"
    )
    if selected_status != all_status_option:
        df_filtered = df_filtered[df_filtered['temuan_status'].astype(str) == selected_status]
        status_filter_applied = True
        filters_applied = True

# Apply date filter *after* other filters
if len(date_range) == 2:
    df_filtered = df_filtered[
        (df_filtered['created_at'].dt.date >= date_range[0]) &
        (df_filtered['created_at'].dt.date <= date_range[1])
    ]
    if date_range[0] != min_date or date_range[1] != max_date:
        filters_applied = True

# Submit Button
submit_clicked = st.sidebar.button("Apply Filters")

# Apply filters logic when button is clicked
if submit_clicked:
    # The filtering based on selections already happened above
    # Here we just update the summary based on the current state of df_filtered
    active_filters = []
    if 'selected_vendor' in locals() and selected_vendor != all_vendors_option:
        active_filters.append(f"Vendor: {selected_vendor}")
    if 'selected_area_display' in locals() and selected_area_display != all_areas_option:
        active_filters.append(f"Unit: {selected_area_display}")
    if 'selected_status' in locals() and selected_status != all_status_option:
        active_filters.append(f"Status: {selected_status}")
    if len(date_range) == 2 and (date_range[0] != min_date or date_range[1] != max_date):
        active_filters.append(f"Date: {date_range[0]} to {date_range[1]}")

    if active_filters:
        st.sidebar.success("**Active Filters:**")
        for f in active_filters:
            st.sidebar.markdown(f"- {f}")
        st.sidebar.info(f"Showing {len(df_filtered)} records based on filters.")
    else:
        st.sidebar.info("No specific filters applied (showing all records).")
else:
    # Show default message when not submitted yet
    st.sidebar.info("Set filters and click 'Apply Filters'.")

st.sidebar.markdown('</div>', unsafe_allow_html=True)


# =================== HEADER ===================
st.markdown("""
<div class="main-header">
<h1>PLN Audit Insight & Intelligence Dashboard</h1>
<p style="text-align:center; color:#546e7a; font-size:1.1em; margin-top:8px;">
Operational Risk Intelligence for Audit & Compliance
</p>
</div>
""", unsafe_allow_html=True)


# =================== 1. Pie Charts: Findings/Person by Company (PG & UM) - FIXED ===================
st.markdown("<h3 class='section-title'>OBJECTIVE 1 - Company Reporting Activity: Who Reports the Most?</h3>", unsafe_allow_html=True)

# Assume df_filtered is the main dataset, already filtered
df_local = df_filtered.copy()

# Add the month column
df_local['created_month'] = df_local['created_at'].dt.to_period('M')

# --- Derive Area_Type (PG / UM) directly, without filtering ---

if 'temuan_kode_distrik' in df_local.columns:

    df_local['Area_Type'] = df_local['temuan_kode_distrik'].apply(
        lambda x: 'PG' if 'PG' in str(x).upper()
        else 'UM' if 'UM' in str(x).upper()
        else 'Other'
    )

    # Split the dataset automatically
    df_pg = df_local[df_local['Area_Type'] == 'PG'].copy()
    df_um = df_local[df_local['Area_Type'] == 'UM'].copy()

else:
    df_pg = pd.DataFrame()
    df_um = pd.DataFrame()

# --- Function to compute the per-company ratio ---
def calculate_avg_ratio_per_company(df_area):
    if df_area.empty:
        # The area was not selected, or the data is empty after filtering
        return pd.DataFrame()
    # Count findings per month per company
    findings_by_company_month = df_area.groupby(['created_month', 'nama_perusahaan']).size().reset_index(name='findings_count')
    # Count unique reporters per month per company
    creators_by_company_month = df_area.groupby(['created_month', 'nama_perusahaan'])['creator_nid'].nunique().reset_index(name='unique_creators')
    # Merge
    merged = findings_by_company_month.merge(creators_by_company_month, on=['created_month', 'nama_perusahaan'], how='outer')
    # Fill NaN with 0 for columns that may be missing after the merge
    merged = merged.fillna({'findings_count': 0, 'unique_creators': 0})
    # Filter to avoid division by zero:
    # only compute the ratio when the number of reporters is > 0.
    # .copy() avoids pandas' SettingWithCopyWarning on the assignments below.
    merged = merged[merged['unique_creators'] > 0].copy()
    # Compute the ratio (ignore NaN)
    # Division by 0 would produce inf, so replace inf with NaN
    merged['ratio'] = merged['findings_count'] / merged['unique_creators']
    merged['ratio'] = merged['ratio'].replace([np.inf, -np.inf], np.nan)

    # If no valid rows remain after filtering, return an empty DataFrame
    if merged.empty:
        return pd.DataFrame()

    # Monthly average per company:
    # group by nama_perusahaan and take the mean of the ratio;
    # mean() skips NaN by default
    avg_ratio = merged.groupby('nama_perusahaan')['ratio'].mean().reset_index(name='avg_monthly_ratio')

    # If the result is all NaN (every company's ratio is NaN), return an empty DataFrame
    if avg_ratio['avg_monthly_ratio'].isna().all():
        return pd.DataFrame()

    return avg_ratio

# Compute for each area
avg_ratio_pg = calculate_avg_ratio_per_company(df_pg)
avg_ratio_um = calculate_avg_ratio_per_company(df_um)

# Function to choose the chart colors
def get_color_map(company_series):
    pln_color = "#FFD700"  # Yellow for PLN
    # List of blue shades (dark to light)
    blue_colors = ["#1E90FF", "#87CEEB", "#B0E0E6", "#ADD8E6", "#E0F6FF"]
    color_map = {}
    for company in company_series:
        if 'PLN' in str(company).upper():
            color_map[company] = pln_color
        else:
            # Pick a blue shade by index, cycling through the list if needed
            idx = len([c for c in color_map.values() if c != pln_color]) % len(blue_colors)
            color_map[company] = blue_colors[idx]
    return color_map

# Plot
col1, col2 = st.columns(2)

with col1:
    st.markdown("<h5>Avg Monthly Finding by Company</h5>", unsafe_allow_html=True)
    if not avg_ratio_pg.empty:
        color_discrete_map_pg = get_color_map(avg_ratio_pg['nama_perusahaan'])
        fig_pg = px.pie(
            avg_ratio_pg,
            values='avg_monthly_ratio',
            names='nama_perusahaan',
            title='Unit Pembangkit Company',
            color='nama_perusahaan',
            color_discrete_map=color_discrete_map_pg
        )
        st.plotly_chart(fig_pg, use_container_width=True)

    # AI insight for PG
    if not avg_ratio_pg.empty:
        # Find the companies with the highest and lowest ratio in PG
        top_company_pg = avg_ratio_pg.loc[avg_ratio_pg['avg_monthly_ratio'].idxmax()]
        low_company_pg = avg_ratio_pg.loc[avg_ratio_pg['avg_monthly_ratio'].idxmin()]

        st.markdown("### Insight")
        insight_text = (
            f"<div class='ai-insight'>"
            f"In PG Area, <strong>{top_company_pg['nama_perusahaan']}</strong> has the highest average finding-to-person ratio "
            f"(<strong>{top_company_pg['avg_monthly_ratio']:.2f}</strong>), indicating potentially high exposure or active reporting. "
            f"Consider reviewing their operational procedures. "
            f"Conversely, <strong>{low_company_pg['nama_perusahaan']}</strong> has the lowest ratio "
            f"(<strong>{low_company_pg['avg_monthly_ratio']:.2f}</strong>), suggesting effective risk management or lower activity levels."
            f"</div>"
        )
        st.markdown(insight_text, unsafe_allow_html=True)
    else:
        st.warning("No data for PG area or all ratios are NaN.")

with col2:
    st.markdown("<h5>Avg Monthly Finding by Company</h5>", unsafe_allow_html=True)
    if not avg_ratio_um.empty:
        color_discrete_map_um = get_color_map(avg_ratio_um['nama_perusahaan'])
        fig_um = px.pie(
            avg_ratio_um,
            values='avg_monthly_ratio',
            names='nama_perusahaan',
            title='Unit Maintenance',
            color='nama_perusahaan',
            color_discrete_map=color_discrete_map_um
        )
        st.plotly_chart(fig_um, use_container_width=True)

    # AI insight for UM
    if not avg_ratio_um.empty:
        # Find the companies with the highest and lowest ratio in UM
        top_company_um = avg_ratio_um.loc[avg_ratio_um['avg_monthly_ratio'].idxmax()]
        low_company_um = avg_ratio_um.loc[avg_ratio_um['avg_monthly_ratio'].idxmin()]

        st.markdown("### Insight")
        insight_text = (
            f"<div class='ai-insight'>"
            f"In UM Area, <strong>{top_company_um['nama_perusahaan']}</strong> exhibits the highest average finding-to-person ratio "
            f"(<strong>{top_company_um['avg_monthly_ratio']:.2f}</strong>), warranting a focused safety audit. "
            f"<strong>{low_company_um['nama_perusahaan']}</strong> shows the lowest ratio "
            f"(<strong>{low_company_um['avg_monthly_ratio']:.2f}</strong>), which could reflect strong safety practices or requires verification of reporting completeness."
            f"</div>"
        )
        st.markdown(insight_text, unsafe_allow_html=True)
    else:
        st.warning("No data for UM area or all ratios are NaN.")
|
| 448 |
+
|
| 449 |
+
|
| 450 |
+
# =================== 2. Treemap: Distribution of Findings per Area (nama_lokasi_full) - FIXED ===================
|
| 451 |
+
st.markdown("<h3 class='section-title'>OBJECTIVE 2 - Active vs Inactive Locations: Who Leads?</h3>", unsafe_allow_html=True)
|
| 452 |
+
|
| 453 |
+
# Count findings per month per location
|
| 454 |
+
findings_by_location_month = df_local.groupby(['created_month', 'nama_lokasi_full']).size().reset_index(name='findings_count')
|
| 455 |
+
# Count unique people per month per location
|
| 456 |
+
creators_by_location_month = df_local.groupby(['created_month', 'nama_lokasi_full'])['creator_nid'].nunique().reset_index(name='unique_creators')
|
| 457 |
+
# Merge
|
| 458 |
+
merged_loc = findings_by_location_month.merge(creators_by_location_month, on=['created_month', 'nama_lokasi_full'], how='outer')
|
| 459 |
+
# Fill NaN with 0 for columns that may be missing after the merge
|
| 460 |
+
merged_loc = merged_loc.fillna({'findings_count': 0, 'unique_creators': 0})
|
| 461 |
+
# Filter to avoid division by zero
|
| 462 |
+
merged_loc = merged_loc[merged_loc['unique_creators'] > 0]
|
| 463 |
+
# Compute the ratio (NaN is ignored)
|
| 464 |
+
# Division by 0 would produce inf, so replace inf with NaN (defensive: zero-creator rows were already filtered out above)
|
| 465 |
+
merged_loc['ratio'] = merged_loc['findings_count'] / merged_loc['unique_creators']
|
| 466 |
+
merged_loc['ratio'] = merged_loc['ratio'].replace([np.inf, -np.inf], np.nan)
|
| 467 |
+
|
| 468 |
+
# Monthly average per location
|
| 469 |
+
# Group by nama_lokasi_full and take the mean of the ratio
|
| 470 |
+
# mean() skips NaN by default
|
| 471 |
+
avg_ratio_per_location = merged_loc.groupby('nama_lokasi_full')['ratio'].mean().reset_index(name='avg_monthly_ratio')
|
| 472 |
+
|
| 473 |
+
# Filter the final result to drop NaN
|
| 474 |
+
avg_ratio_per_location = avg_ratio_per_location.dropna(subset=['avg_monthly_ratio'])
|
| 475 |
+
|
| 476 |
+
# Plot Treemap
|
| 477 |
+
if not avg_ratio_per_location.empty:
|
| 478 |
+
# Add a column for color based on the criteria below
|
| 479 |
+
def categorize_risk(r):
|
| 480 |
+
if r > 1.3:
|
| 481 |
+
return 'High Activity (> 1.3)' # Green
|
| 482 |
+
elif r > 1.0:
|
| 483 |
+
return 'Medium Activity (1.0 - 1.3)' # Yellow
|
| 484 |
+
else:
|
| 485 |
+
return 'Low Activity (<= 1.0)' # Red
|
| 486 |
+
|
| 487 |
+
avg_ratio_per_location['Activity_Category'] = avg_ratio_per_location['avg_monthly_ratio'].apply(categorize_risk)
|
| 488 |
+
|
| 489 |
+
# Color map
|
| 490 |
+
color_map = {
|
| 491 |
+
'High Activity (> 1.3)': '#4CAF50', # Green
|
| 492 |
+
'Medium Activity (1.0 - 1.3)': '#FFB300', # Yellow
|
| 493 |
+
'Low Activity (<= 1.0)': '#D32F2F' # Red
|
| 494 |
+
}
|
| 495 |
+
|
| 496 |
+
# Use a treemap where tile size reflects the average ratio and color reflects the activity category
|
| 497 |
+
fig_treemap = px.treemap(
|
| 498 |
+
avg_ratio_per_location,
|
| 499 |
+
path=['nama_lokasi_full'], # Hierarchy path (only one level here)
|
| 500 |
+
values='avg_monthly_ratio', # Value that determines the tile area
|
| 501 |
+
title='Avg Monthly Finding by Location',
|
| 502 |
+
labels={'avg_monthly_ratio': 'Avg Monthly Finding/Person Ratio', 'nama_lokasi_full': 'Location'},
|
| 503 |
+
color='Activity_Category', # Color by activity category
|
| 504 |
+
color_discrete_map=color_map
|
| 505 |
+
)
|
| 506 |
+
# Format hover
|
| 507 |
+
fig_treemap.update_traces(
|
| 508 |
+
hovertemplate="<b>%{label}</b><br>Avg Ratio: %{value:.2f}<br>Activity Level: %{color}<extra></extra>"
|
| 509 |
+
)
|
| 510 |
+
fig_treemap.update_layout(height=600)
|
| 511 |
+
st.plotly_chart(fig_treemap, use_container_width=True)
|
| 512 |
+
|
| 513 |
+
# AI insight for the location treemap (business-focused)
|
| 514 |
+
if not avg_ratio_per_location.empty:
|
| 515 |
+
# Find the locations with the highest and lowest ratios
|
| 516 |
+
top_location = avg_ratio_per_location.loc[avg_ratio_per_location['avg_monthly_ratio'].idxmax()]
|
| 517 |
+
low_location = avg_ratio_per_location.loc[avg_ratio_per_location['avg_monthly_ratio'].idxmin()]
|
| 518 |
+
|
| 519 |
+
st.markdown("### Insight")
|
| 520 |
+
insight_text = (
|
| 521 |
+
f"<div class='ai-insight'>"
|
| 522 |
+
f"The treemap visualizes the average finding-to-person ratio per location, indicating reporting activity levels. "
|
| 523 |
+
f"Locations with <span style='color:#4CAF50; font-weight:bold;'>green</span> color have a high ratio, indicating frequent reporting. "
|
| 524 |
+
f"Those with <span style='color:#FFB300; font-weight:bold;'>yellow</span> color have a medium ratio, indicating areas with moderate reporting. "
|
| 525 |
+
f"Locations with <span style='color:#D32F2F; font-weight:bold;'>red</span> color have a low ratio, indicating lower activity levels or potential under-reporting. "
|
| 526 |
+
f"<strong>{top_location['nama_lokasi_full']}</strong> shows the highest activity level "
|
| 527 |
+
f"(<strong>{top_location['avg_monthly_ratio']:.2f}</strong>, color: {top_location['Activity_Category']}). "
|
| 528 |
+
f"<strong>{low_location['nama_lokasi_full']}</strong> shows the lowest activity level "
|
| 529 |
+
f"(<strong>{low_location['avg_monthly_ratio']:.2f}</strong>, color: {low_location['Activity_Category']}). "
|
| 530 |
+
f"Areas with high activity (green) warrant investigation into the underlying causes of frequent findings. "
|
| 531 |
+
f"Areas with low activity (red) should be reviewed to ensure reporting completeness and identify any hidden risks."
|
| 532 |
+
f"</div>"
|
| 533 |
+
)
|
| 534 |
+
st.markdown(insight_text, unsafe_allow_html=True)
|
| 535 |
+
else:
|
| 536 |
+
st.warning("No data available for location ratio calculation or all ratios are NaN.")
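The ratio behind the treemap can be sketched in plain Python, without pandas. This is only an illustration under assumptions: the helper names `avg_monthly_ratio` and `categorize` and the sample records are hypothetical and not part of `app.py`, and every record is assumed to carry a non-missing creator id. For each month and location it divides the number of findings by the number of distinct reporters, averages those monthly ratios per location, and applies the same 1.0 and 1.3 thresholds as `categorize_risk`.

```python
from collections import defaultdict


def avg_monthly_ratio(records):
    """records: iterable of (month, location, creator_id), one tuple per finding."""
    findings = defaultdict(int)   # (month, location) -> number of findings
    creators = defaultdict(set)   # (month, location) -> distinct reporters
    for month, location, creator_id in records:
        findings[(month, location)] += 1
        creators[(month, location)].add(creator_id)

    monthly = defaultdict(list)   # location -> list of monthly ratios
    for (month, location), count in findings.items():
        monthly[location].append(count / len(creators[(month, location)]))

    # Average the monthly ratios per location
    return {loc: sum(vals) / len(vals) for loc, vals in monthly.items()}


def categorize(ratio):
    """Same thresholds as categorize_risk in the source."""
    if ratio > 1.3:
        return "High Activity (> 1.3)"
    if ratio > 1.0:
        return "Medium Activity (1.0 - 1.3)"
    return "Low Activity (<= 1.0)"


# Hypothetical sample data
sample = [
    ("2024-01", "Pit A", "u1"),
    ("2024-01", "Pit A", "u1"),
    ("2024-01", "Pit A", "u2"),
    ("2024-02", "Pit A", "u1"),
    ("2024-01", "Workshop", "u3"),
]
ratios = avg_monthly_ratio(sample)
categories = {loc: categorize(r) for loc, r in ratios.items()}
```

Because the monthly ratios are averaged per location, a location that reports in only one month is weighted the same as one that reports every month, which is worth keeping in mind when reading the tile sizes.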
|
| 537 |
+
|
| 538 |
+
import plotly.express as px
|
| 539 |
+
import numpy as np
|
| 540 |
+
|
| 543 |
+
# =================== 3. Reporter & Executor Analysis (3a, 3b, 3c, 3d) ===================
|
| 544 |
+
st.markdown("<h3 class='section-title'>OBJECTIVE 3 - Frequency & Response Time: Who Reports Well? Who Executes Well?</h3>", unsafe_allow_html=True)
|
| 545 |
+
|
| 546 |
+
# 3a & 3b: Reporter Frequency & Executor Lead Time by nama (Average Monthly Rate per Division)
|
| 547 |
+
col_3a, col_3b = st.columns(2)
|
| 548 |
+
|
| 549 |
+
with col_3a:
|
| 550 |
+
st.markdown("<h5>3a. Average Finding by Division (Reporter)</h5>", unsafe_allow_html=True)
|
| 551 |
+
if 'nama' in df_local.columns:
|
| 552 |
+
# Count findings per month per division (nama)
|
| 553 |
+
findings_by_nama_month = df_local.groupby(['created_month', 'nama']).size().reset_index(name='findings_count')
|
| 554 |
+
# Count unique people per month per division (nama)
|
| 555 |
+
creators_by_nama_month = df_local.groupby(['created_month', 'nama'])['creator_nid'].nunique().reset_index(name='unique_creators')
|
| 556 |
+
# Merge
|
| 557 |
+
merged_rep = findings_by_nama_month.merge(creators_by_nama_month, on=['created_month', 'nama'], how='outer')
|
| 558 |
+
# Fill NaN with 0 for columns that may be missing after the merge
|
| 559 |
+
merged_rep = merged_rep.fillna({'findings_count': 0, 'unique_creators': 0})
|
| 560 |
+
# Filter to avoid division by zero
|
| 561 |
+
merged_rep = merged_rep[merged_rep['unique_creators'] > 0]
|
| 562 |
+
# Compute the ratio (NaN is ignored)
|
| 563 |
+
merged_rep['ratio'] = merged_rep['findings_count'] / merged_rep['unique_creators']
|
| 564 |
+
merged_rep['ratio'] = merged_rep['ratio'].replace([np.inf, -np.inf], np.nan)
|
| 565 |
+
|
| 566 |
+
# Monthly average per division (nama)
|
| 567 |
+
avg_ratio_per_nama = merged_rep.groupby('nama')['ratio'].mean().reset_index(name='avg_monthly_ratio')
|
| 568 |
+
|
| 569 |
+
# Filter the final result to drop NaN
|
| 570 |
+
avg_ratio_per_nama = avg_ratio_per_nama.dropna(subset=['avg_monthly_ratio'])
|
| 571 |
+
if not avg_ratio_per_nama.empty:
|
| 572 |
+
# Add a color column TO THE DATAFRAME
|
| 573 |
+
# Sort to identify the top 5
|
| 574 |
+
avg_ratio_per_nama_sorted = avg_ratio_per_nama.sort_values('avg_monthly_ratio', ascending=True)
|
| 575 |
+
top_5_indices = avg_ratio_per_nama_sorted.tail(5).index
|
| 576 |
+
# Set a default color, then override it for the top 5
|
| 577 |
+
avg_ratio_per_nama_sorted['color'] = '#1f77b4' # Plotly default color
|
| 578 |
+
avg_ratio_per_nama_sorted.loc[avg_ratio_per_nama_sorted.index.isin(top_5_indices), 'color'] = '#4CAF50' # Green for the top 5
|
| 579 |
+
|
| 580 |
+
# Sort options
|
| 581 |
+
sort_option_3a = st.selectbox("Sort 3a by:", ["Lowest First", "Highest First"], key='sort_3a')
|
| 582 |
+
if sort_option_3a == "Highest First":
|
| 583 |
+
avg_ratio_per_nama_sorted = avg_ratio_per_nama_sorted.sort_values('avg_monthly_ratio', ascending=False)
|
| 584 |
+
# For "Lowest First", the frame is already sorted ascending above
|
| 585 |
+
|
| 586 |
+
fig_rep_nama = px.bar(
|
| 587 |
+
avg_ratio_per_nama_sorted,
|
| 588 |
+
x='avg_monthly_ratio',
|
| 589 |
+
y='nama',
|
| 590 |
+
orientation='h',
|
| 591 |
+
title='Avg Monthly Finding by Division',
|
| 592 |
+
labels={'avg_monthly_ratio': 'Avg Monthly Finding/Person Ratio', 'nama': 'Division'},
|
| 593 |
+
color='color', # Use the color column added above
|
| 594 |
+
color_discrete_map={c: c for c in avg_ratio_per_nama_sorted['color'].unique()}, # Identity color map
|
| 595 |
+
text=avg_ratio_per_nama_sorted['avg_monthly_ratio'].apply(lambda x: f'{x:.2f}') # Format to 2 decimal places
|
| 596 |
+
)
|
| 597 |
+
# Hide the color legend because it is not informative
|
| 598 |
+
fig_rep_nama.update_layout(yaxis={'categoryorder': 'total ascending'}, height=500, showlegend=False)
|
| 599 |
+
fig_rep_nama.update_traces(textposition='auto') # Automatic text position
|
| 600 |
+
st.plotly_chart(fig_rep_nama, use_container_width=True)
|
| 601 |
+
|
| 602 |
+
# AI Insight for 3a
|
| 603 |
+
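top_nama = avg_ratio_per_nama_sorted.loc[avg_ratio_per_nama_sorted['avg_monthly_ratio'].idxmax()] if not avg_ratio_per_nama_sorted.empty else None
|
| 604 |
+
low_nama = avg_ratio_per_nama_sorted.loc[avg_ratio_per_nama_sorted['avg_monthly_ratio'].idxmin()] if not avg_ratio_per_nama_sorted.empty else None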
|
| 605 |
+
if top_nama is not None and low_nama is not None:
|
| 606 |
+
st.markdown("### Insight")
|
| 607 |
+
insight_text = (
|
| 608 |
+
f"<div class='ai-insight'>"
|
| 609 |
+
f"The division <strong>{top_nama['nama']}</strong> has the highest average finding-to-person ratio "
|
| 610 |
+
f"(<strong>{top_nama['avg_monthly_ratio']:.2f}</strong>), indicating potentially high reporting activity or exposure. "
|
| 611 |
+
f"Conversely, <strong>{low_nama['nama']}</strong> has the lowest ratio "
|
| 612 |
+
f"(<strong>{low_nama['avg_monthly_ratio']:.2f}</strong>), suggesting lower activity or potentially under-reporting. "
|
| 613 |
+
f"Monitor high-ratio divisions for potential systemic issues and verify reporting completeness in low-ratio ones."
|
| 614 |
+
f"</div>"
|
| 615 |
+
)
|
| 616 |
+
st.markdown(insight_text, unsafe_allow_html=True)
|
| 617 |
+
else:
|
| 618 |
+
st.warning("No data or all ratios are NaN for reporter analysis by division.")
|
| 619 |
+
else:
|
| 620 |
+
st.warning("Column 'nama' not available for reporter analysis (3a).")
|
| 621 |
+
|
| 622 |
+
with col_3b:
|
| 623 |
+
st.markdown("<h5>3b. Average by Division (Executor)</h5>", unsafe_allow_html=True)
|
| 624 |
+
if 'nama' in df_local.columns and 'days_to_close' in df_local.columns:
|
| 625 |
+
# Compute the average lead time per division (nama) per month
|
| 626 |
+
leadtime_by_nama_month = df_local.groupby(['created_month', 'nama'])['days_to_close'].mean().reset_index(name='avg_leadtime')
|
| 627 |
+
# Overall monthly average per division (nama)
|
| 628 |
+
avg_leadtime_nama = leadtime_by_nama_month.groupby('nama')['avg_leadtime'].mean().reset_index(name='avg_monthly_leadtime')
|
| 629 |
+
|
| 630 |
+
# Filter the final result to drop NaN
|
| 631 |
+
avg_leadtime_nama = avg_leadtime_nama.dropna(subset=['avg_monthly_leadtime'])
|
| 632 |
+
if not avg_leadtime_nama.empty:
|
| 633 |
+
# Add a color column TO THE DATAFRAME
|
| 634 |
+
# Sort to identify the top 5
|
| 635 |
+
avg_leadtime_nama_sorted = avg_leadtime_nama.sort_values('avg_monthly_leadtime', ascending=True)
|
| 636 |
+
top_5_indices = avg_leadtime_nama_sorted.tail(5).index
|
| 637 |
+
# Set a default color, then override it for the top 5
|
| 638 |
+
avg_leadtime_nama_sorted['color'] = '#1f77b4' # Plotly default color
|
| 639 |
+
avg_leadtime_nama_sorted.loc[avg_leadtime_nama_sorted.index.isin(top_5_indices), 'color'] = '#D32F2F' # Red for the top 5 (slowest)
|
| 640 |
+
|
| 641 |
+
# Sort options
|
| 642 |
+
sort_option_3b = st.selectbox("Sort 3b by:", ["Fastest First", "Slowest First"], key='sort_3b')
|
| 643 |
+
if sort_option_3b == "Slowest First":
|
| 644 |
+
avg_leadtime_nama_sorted = avg_leadtime_nama_sorted.sort_values('avg_monthly_leadtime', ascending=False)
|
| 645 |
+
# For "Fastest First", the frame is already sorted ascending above
|
| 646 |
+
|
| 647 |
+
fig_exec_nama = px.bar(
|
| 648 |
+
avg_leadtime_nama_sorted,
|
| 649 |
+
x='avg_monthly_leadtime',
|
| 650 |
+
y='nama',
|
| 651 |
+
orientation='h',
|
| 652 |
+
title='Avg Monthly Lead Time by Division',
|
| 653 |
+
labels={'avg_monthly_leadtime': 'Avg Lead Time (Days)', 'nama': 'Division'},
|
| 654 |
+
color='color', # Use the color column added above
|
| 655 |
+
color_discrete_map={c: c for c in avg_leadtime_nama_sorted['color'].unique()}, # Identity color map
|
| 656 |
+
text=avg_leadtime_nama_sorted['avg_monthly_leadtime'].apply(lambda x: f'{x:.2f}') # Format to 2 decimal places
|
| 657 |
+
)
|
| 658 |
+
# Hide the color legend because it is not informative
|
| 659 |
+
fig_exec_nama.update_layout(yaxis={'categoryorder': 'total ascending'}, height=500, showlegend=False)
|
| 660 |
+
fig_exec_nama.update_traces(textposition='auto') # Automatic text position
|
| 661 |
+
st.plotly_chart(fig_exec_nama, use_container_width=True)
|
| 662 |
+
|
| 663 |
+
# AI Insight for 3b
|
| 664 |
+
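top_nama = avg_leadtime_nama_sorted.loc[avg_leadtime_nama_sorted['avg_monthly_leadtime'].idxmax()] if not avg_leadtime_nama_sorted.empty else None
|
| 665 |
+
low_nama = avg_leadtime_nama_sorted.loc[avg_leadtime_nama_sorted['avg_monthly_leadtime'].idxmin()] if not avg_leadtime_nama_sorted.empty else None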
|
| 666 |
+
if top_nama is not None and low_nama is not None:
|
| 667 |
+
st.markdown("### Insight")
|
| 668 |
+
insight_text = (
|
| 669 |
+
f"<div class='ai-insight'>"
|
| 670 |
+
f"The division <strong>{top_nama['nama']}</strong> has the highest average lead time "
|
| 671 |
+
f"(<strong>{top_nama['avg_monthly_leadtime']:.2f} days</strong>), indicating slower resolution. "
|
| 672 |
+
f"<strong>{low_nama['nama']}</strong> has the fastest average resolution "
|
| 673 |
+
f"(<strong>{low_nama['avg_monthly_leadtime']:.2f} days</strong>). "
|
| 674 |
+
f"Focus on improving SLA compliance in divisions with longer lead times."
|
| 675 |
+
f"</div>"
|
| 676 |
+
)
|
| 677 |
+
st.markdown(insight_text, unsafe_allow_html=True)
|
| 678 |
+
else:
|
| 679 |
+
st.warning("No data or all lead times are NaN for executor analysis by division.")
|
| 680 |
+
else:
|
| 681 |
+
st.warning("Columns 'nama' or 'days_to_close' not available for executor analysis (3b).")
|
| 682 |
+
|
| 683 |
+
# 3c & 3d: Reporter Frequency & Executor Lead Time by creator_name and nama_pic (Average Monthly Rate per Person)
|
| 684 |
+
col_3c, col_3d = st.columns(2)
|
| 685 |
+
|
| 686 |
+
with col_3c:
|
| 687 |
+
st.markdown("<h5>3c. Average Finding Rate per Reporter (Name)</h5>", unsafe_allow_html=True)
|
| 688 |
+
if 'creator_name' in df_local.columns:
|
| 689 |
+
# Count findings per month per creator_name
|
| 690 |
+
findings_by_creator_month = df_local.groupby(['created_month', 'creator_name']).size().reset_index(name='findings_count')
|
| 691 |
+
# Count active months per creator_name
|
| 692 |
+
active_months_by_creator = findings_by_creator_month.groupby('creator_name')['created_month'].nunique().reset_index(name='active_months')
|
| 693 |
+
# Aggregate to get the total findings per creator
|
| 694 |
+
total_findings_by_creator = findings_by_creator_month.groupby('creator_name')['findings_count'].sum().reset_index()
|
| 695 |
+
# Merge everything
|
| 696 |
+
merged_rep_creator = total_findings_by_creator.merge(active_months_by_creator, on='creator_name', how='outer')
|
| 697 |
+
# Fill NaN with 0
|
| 698 |
+
merged_rep_creator = merged_rep_creator.fillna({'findings_count': 0, 'active_months': 0})
|
| 699 |
+
# Filter to avoid division by zero (in case someone was inactive for the whole period)
|
| 700 |
+
merged_rep_creator = merged_rep_creator[merged_rep_creator['active_months'] > 0]
|
| 701 |
+
# Compute the monthly average (NaN is ignored)
|
| 702 |
+
merged_rep_creator['avg_monthly_rate'] = merged_rep_creator['findings_count'] / merged_rep_creator['active_months']
|
| 703 |
+
merged_rep_creator['avg_monthly_rate'] = merged_rep_creator['avg_monthly_rate'].replace([np.inf, -np.inf], np.nan)
|
| 704 |
+
|
| 705 |
+
# Filter the final result to drop NaN
|
| 706 |
+
avg_rate_per_creator = merged_rep_creator.dropna(subset=['avg_monthly_rate'])
|
| 707 |
+
if not avg_rate_per_creator.empty:
|
| 708 |
+
# Add a color column TO THE DATAFRAME
|
| 709 |
+
# Sort to identify the top 5
|
| 710 |
+
avg_rate_per_creator_sorted = avg_rate_per_creator.sort_values('avg_monthly_rate', ascending=True)
|
| 711 |
+
top_5_indices = avg_rate_per_creator_sorted.tail(5).index
|
| 712 |
+
# Set a default color, then override it for the top 5
|
| 713 |
+
avg_rate_per_creator_sorted['color'] = '#1f77b4' # Plotly default color
|
| 714 |
+
avg_rate_per_creator_sorted.loc[avg_rate_per_creator_sorted.index.isin(top_5_indices), 'color'] = '#4CAF50' # Green for the top 5
|
| 715 |
+
|
| 716 |
+
# Sort options
|
| 717 |
+
sort_option_3c = st.selectbox("Sort 3c by:", ["Lowest First", "Highest First"], key='sort_3c')
|
| 718 |
+
if sort_option_3c == "Highest First":
|
| 719 |
+
avg_rate_per_creator_sorted = avg_rate_per_creator_sorted.sort_values('avg_monthly_rate', ascending=False)
|
| 720 |
+
# For "Lowest First", the frame is already sorted ascending above
|
| 721 |
+
|
| 722 |
+
# Select rows for the chart (the cap is 1000 rows, despite the "top10" variable name)
|
| 723 |
+
top10_creators = avg_rate_per_creator_sorted.tail(1000) # Last 1000 rows after sorting
|
| 724 |
+
fig_rep_creator = px.bar(
|
| 725 |
+
top10_creators,
|
| 726 |
+
x='avg_monthly_rate',
|
| 727 |
+
y='creator_name',
|
| 728 |
+
orientation='h',
|
| 729 |
+
title='Avg Monthly Finding by Creator Name',
|
| 730 |
+
labels={'avg_monthly_rate': 'Avg Monthly Finding Rate', 'creator_name': 'Creator Name'},
|
| 731 |
+
color='color', # Use the color column added above
|
| 732 |
+
color_discrete_map={c: c for c in top10_creators['color'].unique()}, # Identity color map
|
| 733 |
+
text=top10_creators['avg_monthly_rate'].apply(lambda x: f'{x:.2f}') # Format to 2 decimal places
|
| 734 |
+
)
|
| 735 |
+
# Hide the color legend because it is not informative
|
| 736 |
+
fig_rep_creator.update_layout(yaxis={'categoryorder': 'total ascending'}, height=500, showlegend=False)
|
| 737 |
+
fig_rep_creator.update_traces(textposition='auto') # Automatic text position
|
| 738 |
+
st.plotly_chart(fig_rep_creator, use_container_width=True)
|
| 739 |
+
|
| 740 |
+
# AI Insight for 3c
|
| 741 |
+
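top_creator = avg_rate_per_creator_sorted.loc[avg_rate_per_creator_sorted['avg_monthly_rate'].idxmax()] if not avg_rate_per_creator_sorted.empty else None
|
| 742 |
+
low_creator = avg_rate_per_creator_sorted.loc[avg_rate_per_creator_sorted['avg_monthly_rate'].idxmin()] if not avg_rate_per_creator_sorted.empty else None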
|
| 743 |
+
if top_creator is not None and low_creator is not None:
|
| 744 |
+
st.markdown("### Insight")
|
| 745 |
+
insight_text = (
|
| 746 |
+
f"<div class='ai-insight'>"
|
| 747 |
+
f"The reporter <strong>{top_creator['creator_name']}</strong> has the highest average monthly finding rate "
|
| 748 |
+
f"(<strong>{top_creator['avg_monthly_rate']:.2f}</strong>), indicating active engagement. "
|
| 749 |
+
f"<strong>{low_creator['creator_name']}</strong> has the lowest rate "
|
| 750 |
+
f"(<strong>{low_creator['avg_monthly_rate']:.2f}</strong>), which might indicate lower activity or under-reporting. "
|
| 751 |
+
f"Recognize high performers and investigate low performers."
|
| 752 |
+
f"</div>"
|
| 753 |
+
)
|
| 754 |
+
st.markdown(insight_text, unsafe_allow_html=True)
|
| 755 |
+
else:
|
| 756 |
+
st.warning("No data or all rates are NaN for reporter analysis by creator_name.")
|
| 757 |
+
else:
|
| 758 |
+
st.warning("Column 'creator_name' not available for reporter analysis (3c).")
|
| 759 |
+
|
| 760 |
+
with col_3d:
|
| 761 |
+
st.markdown("<h5>3d. Average Lead Time by Executor (Name)</h5>", unsafe_allow_html=True)
|
| 762 |
+
if 'nama_pic' in df_local.columns and 'days_to_close' in df_local.columns:
|
| 763 |
+
# Compute the average lead time per executor per month
|
| 764 |
+
leadtime_by_executor_month = df_local.groupby(['created_month', 'nama_pic'])['days_to_close'].mean().reset_index(name='avg_leadtime')
|
| 765 |
+
# Count active months per executor
|
| 766 |
+
active_months_by_executor = leadtime_by_executor_month.groupby('nama_pic')['created_month'].nunique().reset_index(name='active_months')
|
| 767 |
+
# Sum the monthly average lead times per executor
|
| 768 |
+
total_leadtime_by_executor = leadtime_by_executor_month.groupby('nama_pic')['avg_leadtime'].sum().reset_index()
|
| 769 |
+
# Merge everything
|
| 770 |
+
merged_exec_pic = total_leadtime_by_executor.merge(active_months_by_executor, on='nama_pic', how='outer')
|
| 771 |
+
# Fill NaN with 0
|
| 772 |
+
merged_exec_pic = merged_exec_pic.fillna({'avg_leadtime': 0, 'active_months': 0})
|
| 773 |
+
# Filter to avoid division by zero
|
| 774 |
+
merged_exec_pic = merged_exec_pic[merged_exec_pic['active_months'] > 0]
|
| 775 |
+
# Compute the monthly average (NaN is ignored)
|
| 776 |
+
merged_exec_pic['avg_monthly_leadtime'] = merged_exec_pic['avg_leadtime'] / merged_exec_pic['active_months']
|
| 777 |
+
merged_exec_pic['avg_monthly_leadtime'] = merged_exec_pic['avg_monthly_leadtime'].replace([np.inf, -np.inf], np.nan)
|
| 778 |
+
|
| 779 |
+
# Filter the final result to drop NaN
|
| 780 |
+
avg_leadtime_per_executor = merged_exec_pic.dropna(subset=['avg_monthly_leadtime'])
|
| 781 |
+
if not avg_leadtime_per_executor.empty:
|
| 782 |
+
# Add a color column TO THE DATAFRAME
|
| 783 |
+
# Sort to identify the top 5
|
| 784 |
+
avg_leadtime_per_executor_sorted = avg_leadtime_per_executor.sort_values('avg_monthly_leadtime', ascending=True)
|
| 785 |
+
top_5_indices = avg_leadtime_per_executor_sorted.tail(5).index
|
| 786 |
+
# Set a default color, then override it for the top 5
|
| 787 |
+
avg_leadtime_per_executor_sorted['color'] = '#1f77b4' # Plotly default color
|
| 788 |
+
avg_leadtime_per_executor_sorted.loc[avg_leadtime_per_executor_sorted.index.isin(top_5_indices), 'color'] = '#D32F2F' # Red for the top 5 (slowest)
|
| 789 |
+
|
| 790 |
+
# Sort options
|
| 791 |
+
sort_option_3d = st.selectbox("Sort 3d by:", ["Fastest First", "Slowest First"], key='sort_3d')
|
| 792 |
+
if sort_option_3d == "Slowest First":
|
| 793 |
+
avg_leadtime_per_executor_sorted = avg_leadtime_per_executor_sorted.sort_values('avg_monthly_leadtime', ascending=False)
|
| 794 |
+
# For "Fastest First", the frame is already sorted ascending above
|
| 795 |
+
|
| 796 |
+
# Select rows for the chart (the cap is 1000 rows, despite the "top10" variable name)
|
| 797 |
+
top10_executors = avg_leadtime_per_executor_sorted.nlargest(1000, 'avg_monthly_leadtime') # Up to 1000 slowest executors
|
| 798 |
+
fig_exec_pic = px.bar(
|
| 799 |
+
top10_executors,
|
| 800 |
+
x='avg_monthly_leadtime',
|
| 801 |
+
y='nama_pic',
|
| 802 |
+
orientation='h',
|
| 803 |
+
title='Avg Monthly Lead Time by Executor (Name)',
|
| 804 |
+
labels={'avg_monthly_leadtime': 'Avg Monthly Lead Time (Days)', 'nama_pic': 'Executor Name'},
|
| 805 |
+
color='color', # Use the color column added above
|
| 806 |
+
color_discrete_map={c: c for c in top10_executors['color'].unique()}, # Identity color map
|
| 807 |
+
text=top10_executors['avg_monthly_leadtime'].apply(lambda x: f'{x:.2f}') # Format to 2 decimal places
|
| 808 |
+
)
|
| 809 |
+
# Hide the color legend because it is not informative
|
| 810 |
+
fig_exec_pic.update_layout(yaxis={'categoryorder': 'total ascending'}, height=500, showlegend=False)
|
| 811 |
+
fig_exec_pic.update_traces(textposition='auto') # Automatic text position
|
| 812 |
+
st.plotly_chart(fig_exec_pic, use_container_width=True)
|
| 813 |
+
|
| 814 |
+
# AI Insight for 3d
|
| 815 |
+
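top_executor = avg_leadtime_per_executor_sorted.loc[avg_leadtime_per_executor_sorted['avg_monthly_leadtime'].idxmax()] if not avg_leadtime_per_executor_sorted.empty else None
|
| 816 |
+
low_executor = avg_leadtime_per_executor_sorted.loc[avg_leadtime_per_executor_sorted['avg_monthly_leadtime'].idxmin()] if not avg_leadtime_per_executor_sorted.empty else None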
|
| 817 |
+
if top_executor is not None and low_executor is not None:
|
| 818 |
+
st.markdown("### Insight")
|
| 819 |
+
insight_text = (
|
| 820 |
+
f"<div class='ai-insight'>"
|
| 821 |
+
f"The executor <strong>{top_executor['nama_pic']}</strong> has the highest average monthly lead time "
|
| 822 |
+
f"(<strong>{top_executor['avg_monthly_leadtime']:.2f} days</strong>), indicating slower resolution. "
|
| 823 |
+
f"<strong>{low_executor['nama_pic']}</strong> resolves tasks fastest on average "
|
| 824 |
+
f"(<strong>{low_executor['avg_monthly_leadtime']:.2f} days</strong>). "
|
| 825 |
+
f"Focus on improving SLA compliance for executors with longer lead times."
|
| 826 |
+
f"</div>"
|
| 827 |
+
)
|
| 828 |
+
st.markdown(insight_text, unsafe_allow_html=True)
|
| 829 |
+
else:
|
| 830 |
+
st.warning("No data or all lead times are NaN for executor analysis by nama_pic.")
|
| 831 |
+
else:
|
| 832 |
+
st.warning("Columns 'nama_pic' or 'days_to_close' not available for executor analysis (3d).")
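The insight text in 3a to 3d names the highest and the lowest entry. A positional lookup such as "last row" or "first row" only gives those when the rows are sorted ascending, and the sort select box can flip that order. A minimal plain-Python sketch of choosing by value instead follows; the helper name `extremes` and the sample rows are hypothetical and not part of `app.py`.

```python
def extremes(rows, key):
    """Return (highest, lowest) rows by `key`, regardless of list order."""
    if not rows:
        return None, None
    highest = max(rows, key=lambda r: r[key])
    lowest = min(rows, key=lambda r: r[key])
    return highest, lowest


# Hypothetical sample rows
rows = [
    {"nama": "Div A", "avg_monthly_ratio": 1.8},
    {"nama": "Div B", "avg_monthly_ratio": 0.6},
    {"nama": "Div C", "avg_monthly_ratio": 1.1},
]

ascending = sorted(rows, key=lambda r: r["avg_monthly_ratio"])
descending = list(reversed(ascending))

# The same entries are picked for either display order
top_a, low_a = extremes(ascending, "avg_monthly_ratio")
top_d, low_d = extremes(descending, "avg_monthly_ratio")
```

In pandas the equivalent value-based lookup is `df.loc[df[col].idxmax()]` and `df.loc[df[col].idxmin()]`, which likewise does not depend on how the frame is currently sorted.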
|
| 833 |
+
# =================== 4. Global Text Insights (Word Clouds) ===================
|
| 834 |
+
try:
|
| 835 |
+
from wordcloud import WordCloud
|
| 836 |
+
import matplotlib.pyplot as plt
|
| 837 |
+
WORDCLOUD_AVAILABLE = True
|
| 838 |
+
except ImportError:
|
| 839 |
+
WORDCLOUD_AVAILABLE = False
|
| 840 |
+
st.warning("⚠️ Library `wordcloud` or `matplotlib` not found. Install with `pip install wordcloud matplotlib` to enable the WordCloud feature.")
|
| 841 |
+
|
| 842 |
+
if WORDCLOUD_AVAILABLE:
|
| 843 |
+
st.markdown("<h3 class='section-title'>4. Global Text Insights (Word Clouds)</h3>", unsafe_allow_html=True)
|
| 844 |
+
|
| 845 |
+
col_wc1, col_wc2, col_wc3 = st.columns(3)
|
| 846 |
+
|
| 847 |
+
# Helper that builds and displays a word cloud
|
| 848 |
+
def generate_wordcloud(text_data, title, col):
|
| 849 |
+
# Check whether text_data is None or an empty Series
|
| 850 |
+
if text_data is None or text_data.empty:
|
| 851 |
+
col.warning(f"No data available in series for {title}.")
|
| 852 |
+
return
|
| 853 |
+
# Check whether all values are NaN
|
| 854 |
+
if text_data.isna().all():
|
| 855 |
+
col.warning(f"All data is NaN for {title}.")
|
| 856 |
+
return
|
| 857 |
+
# Join all text into a single string
|
| 858 |
+
text = ' '.join(text_data.dropna().astype(str))
|
| 859 |
+
# Strip everything except ASCII letters and whitespace, digits included (optional)
|
| 860 |
+
import re
|
| 861 |
+
text = re.sub(r'[^a-zA-Z\s]', ' ', text)
|
| 862 |
+
if text.strip(): # Make sure the text is not empty after cleaning
|
| 863 |
+
# Build the WordCloud
|
| 864 |
+
wordcloud = WordCloud(
|
| 865 |
+
width=400,
|
| 866 |
+
height=300,
|
| 867 |
+
background_color='white',
|
| 868 |
+
colormap='viridis',
|
| 869 |
+
max_words=100,
|
| 870 |
+
relative_scaling=0.5,
|
| 871 |
+
random_state=42
|
| 872 |
+
).generate(text)
|
| 873 |
+
|
| 874 |
+
# Plot with matplotlib
|
| 875 |
+
fig, ax = plt.subplots(figsize=(8, 6))
|
| 876 |
+
ax.imshow(wordcloud, interpolation='bilinear')
|
| 877 |
+
ax.axis('off')
|
| 878 |
+
ax.set_title(title, fontsize=16)
|
| 879 |
+
plt.tight_layout()
|
| 880 |
+
|
| 881 |
+
# Display in Streamlit
|
| 882 |
+
col.pyplot(fig, use_container_width=True)
|
| 883 |
+
else:
|
| 884 |
+
col.warning(f"No valid text data for {title} after cleaning.")
|
| 885 |
+
|
| 886 |
+
# Title column (judul)
|
| 887 |
+
with col_wc1:
|
| 888 |
+
if 'judul' in df_local.columns:
|
| 889 |
+
generate_wordcloud(df_local['judul'], "Word Cloud: Judul", col_wc1)
|
| 890 |
+
else:
|
| 891 |
+
col_wc1.warning("Column 'judul' not available.")
|
| 892 |
+
|
| 893 |
+
# Condition column (kondisi)
|
| 894 |
+
with col_wc2:
|
| 895 |
+
if 'kondisi' in df_local.columns:
|
| 896 |
+
generate_wordcloud(df_local['kondisi'], "Word Cloud: Kondisi", col_wc2)
|
| 897 |
+
else:
|
| 898 |
+
col_wc2.warning("Column 'kondisi' not available.")
|
| 899 |
+
|
| 900 |
+
# Recommendation column (rekomendasi)
|
| 901 |
+
with col_wc3:
|
| 902 |
+
if 'rekomendasi' in df_local.columns:
|
| 903 |
+
generate_wordcloud(df_local['rekomendasi'], "Word Cloud: Rekomendasi", col_wc3)
|
| 904 |
+
else:
|
| 905 |
+
col_wc3.warning("Column 'rekomendasi' not available.")
|
| 906 |
+
else:
|
| 907 |
+
st.markdown("<h3 class='section-title'>4. Global Text Insights (Word Clouds)</h3>", unsafe_allow_html=True)
|
| 908 |
+
st.info("WordCloud library not installed. Install `wordcloud` and `matplotlib` to enable this feature.")
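The text preparation inside `generate_wordcloud` can be tried on its own, without the `wordcloud` package. The sketch below is an illustration under assumptions: `clean_text` is a hypothetical helper name, the sample strings are made up, and the final whitespace collapse is an extra step that the source does not perform. It shows that the pattern `[^a-zA-Z\s]` removes digits and punctuation as well as symbols.

```python
import re


def clean_text(values):
    """Join non-missing values and keep only ASCII letters and whitespace."""
    # v == v is False only for NaN, so this drops None and NaN entries
    parts = [str(v) for v in values if v is not None and v == v]
    text = " ".join(parts)
    # Same pattern as the source: anything that is not a letter or whitespace becomes a space
    text = re.sub(r"[^a-zA-Z\s]", " ", text)
    # Extra step (not in the source): collapse repeated whitespace
    return " ".join(text.split())


# Hypothetical sample values
cleaned = clean_text(["APD tidak lengkap!", None, float("nan"), "Unit DT-102 bocor"])
empty = clean_text([None, float("nan")])
```

Equipment codes such as `DT-102` lose their numeric part under this pattern, so a word cloud built this way groups all units under the letter prefix.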
|
| 909 |
+
|
| 910 |
+
# =================== 5. Matrix (Retained) ===================
|
| 911 |
+
st.markdown("<h3 class='section-title'>OBJECTIVE 5 - Findings vs Lead Time: Which Companies Move Slow?</h3>", unsafe_allow_html=True)
|
| 912 |
+
|
| 913 |
+
import math
|
| 914 |
+
import plotly.express as px
|
| 915 |
+
import pandas as pd
|
| 916 |
+
try:
|
| 917 |
+
df_local_matrix = df.copy()
|
| 918 |
+
# ============================
|
| 919 |
+
# 0. Filter: ONLY 1 COMPANY & 1 PROFILE (if applicable)
|
| 920 |
+
# ============================
|
| 921 |
+
# (Skipped for general dashboard view)
|
| 922 |
+
# ============================
|
| 923 |
+
# 1. Exclude Positive findings
|
| 924 |
+
# ============================
|
| 925 |
+
if 'temuan_kategori' in df_local_matrix.columns:
|
| 926 |
+
df_local_matrix = df_local_matrix[df_local_matrix["temuan_kategori"] != "Positive"]
|
| 927 |
+
# ============================
|
| 928 |
+
# 2. Ensure datetime columns
|
| 929 |
+
# ============================
|
| 930 |
+
df_local_matrix['created_at'] = pd.to_datetime(df_local_matrix['created_at'], errors='coerce')
|
| 931 |
+
df_local_matrix['close_at'] = pd.to_datetime(df_local_matrix['close_at'], errors='coerce')
|
| 932 |
+
# ============================
|
| 933 |
+
# 3. Compute LEAD TIME
|
| 934 |
+
# ============================
|
| 935 |
+
df_local_matrix['lead_time_days'] = (df_local_matrix['close_at'] - df_local_matrix['created_at']).dt.days
|
| 936 |
+
df_local_matrix['lead_time_days'] = df_local_matrix['lead_time_days'].fillna(0)
|
| 937 |
+
# ============================
|
| 938 |
+
# 4. Average Monthly Finding Count per Operator
|
| 939 |
+
# ============================
|
| 940 |
+
if 'nama' not in df_local_matrix.columns:
|
| 941 |
+
st.error("❌ Column 'nama' (operator) not found.")
|
| 942 |
+
# st.stop() # Stop can be omitted so the script keeps running
|
| 943 |
+
else:
|
| 944 |
+
# Create a month column (YYYY-MM)
|
| 945 |
+
df_local_matrix = df_local_matrix.assign(month=df_local_matrix['created_at'].dt.to_period('M').astype(str))
|
| 946 |
+
# Count findings per operator per month
|
| 947 |
+
monthly_counts = (
|
| 948 |
+
df_local_matrix
|
| 949 |
+
.groupby(['nama', 'month'])['kode_temuan']
|
| 950 |
+
.nunique()
|
| 951 |
+
.reset_index(name='monthly_count')
|
| 952 |
+
)
|
| 953 |
+
# Compute the monthly average per operator
|
| 954 |
+
operator_avg = (
|
| 955 |
+
monthly_counts
|
| 956 |
+
.groupby('nama')['monthly_count']
|
| 957 |
+
.mean() # <-- AVERAGE per month (not the total!)
|
| 958 |
+
.reset_index(name='Finding Count')
|
| 959 |
+
)
|
| 960 |
+
# ============================
|
| 961 |
+
# 5. Average Lead Time per Operator
|
| 962 |
+
# ============================
|
| 963 |
+
operator_lead = (
|
| 964 |
+
df_local_matrix.groupby('nama')['lead_time_days']
|
| 965 |
+
.mean()
|
| 966 |
+
.reset_index(name='Average Lead Time')
|
| 967 |
+
)
|
| 968 |
+
# ============================
|
| 969 |
+
# 6. Merge Risk Matrix
|
| 970 |
+
# ============================
|
| 971 |
+
risk_matrix = operator_avg.merge(operator_lead, on='nama', how='left')
|
| 972 |
+
risk_matrix = risk_matrix.rename(columns={'nama': 'Operator Name'})
|
| 973 |
+
# Handle operators without a lead time (e.g., findings not yet closed)
|
| 974 |
+
risk_matrix['Average Lead Time'] = risk_matrix['Average Lead Time'].fillna(0)
|
| 975 |
+
# ============================
|
| 976 |
+
# 7. Quadrant Logic (unchanged)
|
| 977 |
+
# ============================
|
| 978 |
+
X_LIMIT = 20
|
| 979 |
+
Y_LIMIT = 3
|
| 980 |
+
def assign_quadrant(row):
|
| 981 |
+
if row['Finding Count'] >= X_LIMIT and row['Average Lead Time'] >= Y_LIMIT:
|
| 982 |
+
return "Quadrant I – High Leadtime & High Count"
|
| 983 |
+
elif row['Finding Count'] < X_LIMIT and row['Average Lead Time'] >= Y_LIMIT:
|
| 984 |
+
return "Quadrant II – High Leadtime but Low Count"
|
| 985 |
+
elif row['Finding Count'] >= X_LIMIT and row['Average Lead Time'] < Y_LIMIT:
|
| 986 |
+
return "Quadrant III – Low Leadtime but High Count"
|
| 987 |
+
else:
|
| 988 |
+
return "Quadrant IV – Low Leadtime & Low Count"
|
| 989 |
+
risk_matrix['quadrant'] = risk_matrix.apply(assign_quadrant, axis=1)
|
| 990 |
+
quadrant_count = risk_matrix['quadrant'].value_counts()
|
| 991 |
+
# ============================
|
| 992 |
+
# 8. Scatter Plot (visual format kept exactly the same)
|
| 993 |
+
# ============================
|
| 994 |
+
max_x = max(X_LIMIT, risk_matrix['Finding Count'].max()) + 1  # keep the right-hand quadrants visible when every count is below X_LIMIT
|
| 995 |
+
max_y = risk_matrix['Average Lead Time'].max() + 5
|
| 996 |
+
fig = px.scatter(
|
| 997 |
+
risk_matrix,
|
| 998 |
+
x='Finding Count',
|
| 999 |
+
y='Average Lead Time',
|
| 1000 |
+
hover_name="Operator Name",
|
| 1001 |
+
size=[12] * len(risk_matrix),
|
| 1002 |
+
size_max=15,
|
| 1003 |
+
title="Audit Findings Risk Matrix: Avg Monthly Count vs Lead Time"
|
| 1004 |
+
)
|
| 1005 |
+
# Background quadrant (same as original)
|
| 1006 |
+
fig.add_shape(type="rect", x0=X_LIMIT, x1=max_x, y0=Y_LIMIT, y1=max_y,
|
| 1007 |
+
fillcolor="rgba(255,0,0,0.25)", line_width=0) # Q1
|
| 1008 |
+
fig.add_shape(type="rect", x0=0, x1=X_LIMIT, y0=Y_LIMIT, y1=max_y,
|
| 1009 |
+
fillcolor="rgba(255,150,50,0.25)", line_width=0) # Q2
|
| 1010 |
+
fig.add_shape(type="rect", x0=X_LIMIT, x1=max_x, y0=0, y1=Y_LIMIT,
|
| 1011 |
+
fillcolor="rgba(255,200,200,0.25)", line_width=0) # Q3
|
| 1012 |
+
fig.add_shape(type="rect", x0=0, x1=X_LIMIT, y0=0, y1=Y_LIMIT,
|
| 1013 |
+
fillcolor="rgba(0,120,255,0.15)", line_width=0) # Q4
|
| 1014 |
+
fig.add_vline(x=X_LIMIT, line_dash="dash", line_color="black")
|
| 1015 |
+
fig.add_hline(y=Y_LIMIT, line_dash="dash", line_color="black")
|
| 1016 |
+
# Quadrant count annotations (same positions & style)
|
| 1017 |
+
fig.add_annotation(x=X_LIMIT + (max_x - X_LIMIT)/2,
|
| 1018 |
+
y=Y_LIMIT + (max_y - Y_LIMIT)/2,
|
| 1019 |
+
text=f"<b>{quadrant_count.get('Quadrant I – High Leadtime & High Count',0)}</b>",
|
| 1020 |
+
showarrow=False, font=dict(size=22, color="darkred"))
|
| 1021 |
+
fig.add_annotation(x=X_LIMIT/2,
|
| 1022 |
+
y=Y_LIMIT + (max_y - Y_LIMIT)/2,
|
| 1023 |
+
text=f"<b>{quadrant_count.get('Quadrant II – High Leadtime but Low Count',0)}</b>",
|
| 1024 |
+
showarrow=False, font=dict(size=22, color="orange"))
|
| 1025 |
+
fig.add_annotation(x=X_LIMIT + (max_x - X_LIMIT)/2,
|
| 1026 |
+
y=Y_LIMIT/2,
|
| 1027 |
+
text=f"<b>{quadrant_count.get('Quadrant III – Low Leadtime but High Count',0)}</b>",
|
| 1028 |
+
showarrow=False, font=dict(size=22, color="red"))
|
| 1029 |
+
fig.add_annotation(x=X_LIMIT/2,
|
| 1030 |
+
y=Y_LIMIT/2,
|
| 1031 |
+
text=f"<b>{quadrant_count.get('Quadrant IV – Low Leadtime & Low Count',0)}</b>",
|
| 1032 |
+
showarrow=False, font=dict(size=22, color="green"))
|
| 1033 |
+
st.plotly_chart(fig, use_container_width=True)
|
| 1034 |
+
# ============================
|
| 1035 |
+
# 9. Summary Table
|
| 1036 |
+
# ============================
|
| 1037 |
+
st.subheader("Summary (Avg Monthly Count vs Avg Lead Time)")
|
| 1038 |
+
st.dataframe(
|
| 1039 |
+
risk_matrix.sort_values("Finding Count", ascending=False),
|
| 1040 |
+
use_container_width=True
|
| 1041 |
+
)
|
| 1042 |
+
except Exception as e:
|
| 1043 |
+
st.error(f"⚠️ Error Risk Matrix: {e}")
|
| 1044 |
+
# st.exception(e) # Uncomment for debugging
|
| 1045 |
+
|
| 1046 |
+
# =================== 6. ✅ AI INSIGHT ENGINE (NEW - BASED ON DATA & RATIO) ===================
|
| 1047 |
+
st.markdown("## 6. Insight & Recommendation")
|
| 1048 |
+
|
| 1049 |
+
def compute_ai_insights(df: pd.DataFrame) -> List[dict]:
|
| 1050 |
+
"""
|
| 1051 |
+
Generates insights and recommendations based on the current data and average monthly ratios.
|
| 1052 |
+
Returns a list of dictionaries, each containing an 'insight' and a 'recommendation'.
|
| 1053 |
+
"""
|
| 1054 |
+
insight_recommendations = []
|
| 1055 |
+
|
| 1056 |
+
if df.empty:
|
| 1057 |
+
return insight_recommendations
|
| 1058 |
+
|
| 1059 |
+
total_findings = len(df)
|
| 1060 |
+
total_locations = df['nama_lokasi_full'].nunique() if 'nama_lokasi_full' in df.columns else 0
|
| 1061 |
+
total_companies = df['nama_perusahaan'].nunique() if 'nama_perusahaan' in df.columns else 0
|
| 1062 |
+
total_divisions = df['nama'].nunique() if 'nama' in df.columns else 0
|
| 1063 |
+
|
| 1064 |
+
# --- 1. Insight & Recommendation: Average Monthly Findings-per-Person Ratio by Company ---
|
| 1065 |
+
if 'nama_perusahaan' in df.columns and 'creator_nid' in df.columns:
|
| 1066 |
+
df_with_month = df.copy()
|
| 1067 |
+
df_with_month['created_month'] = df_with_month['created_at'].dt.to_period('M')
|
| 1068 |
+
|
| 1069 |
+
# Count findings per month per company
|
| 1070 |
+
findings_by_company_month = df_with_month.groupby(['created_month', 'nama_perusahaan']).size().reset_index(name='findings_count')
|
| 1071 |
+
# Count unique people per month per company
|
| 1072 |
+
creators_by_company_month = df_with_month.groupby(['created_month', 'nama_perusahaan'])['creator_nid'].nunique().reset_index(name='unique_creators')
|
| 1073 |
+
# Merge
|
| 1074 |
+
merged_ratio = findings_by_company_month.merge(creators_by_company_month, on=['created_month', 'nama_perusahaan'], how='outer')
|
| 1075 |
+
# Filter to avoid division by zero
|
| 1076 |
+
merged_ratio = merged_ratio[merged_ratio['unique_creators'] > 0]
|
| 1077 |
+
# Compute the ratio (ignore NaN)
|
| 1078 |
+
merged_ratio['ratio'] = merged_ratio['findings_count'] / merged_ratio['unique_creators']
|
| 1079 |
+
merged_ratio['ratio'] = merged_ratio['ratio'].replace([np.inf, -np.inf], np.nan)
|
| 1080 |
+
|
| 1081 |
+
# Monthly average per company
|
| 1082 |
+
avg_ratio_per_company = merged_ratio.groupby('nama_perusahaan')['ratio'].mean().reset_index(name='avg_monthly_ratio')
|
| 1083 |
+
# Filter the final result to drop NaN
|
| 1084 |
+
avg_ratio_per_company = avg_ratio_per_company.dropna(subset=['avg_monthly_ratio'])
|
| 1085 |
+
|
| 1086 |
+
if not avg_ratio_per_company.empty:
|
| 1087 |
+
# Find the companies with the highest and lowest ratio
|
| 1088 |
+
top_company_ratio = avg_ratio_per_company.loc[avg_ratio_per_company['avg_monthly_ratio'].idxmax()]
|
| 1089 |
+
low_company_ratio = avg_ratio_per_company.loc[avg_ratio_per_company['avg_monthly_ratio'].idxmin()]
|
| 1090 |
+
|
| 1091 |
+
insight_text = (
|
| 1092 |
+
f"Based on the average monthly finding-to-person ratio, "
|
| 1093 |
+
f"Company '{top_company_ratio['nama_perusahaan']}' has the highest activity level ({top_company_ratio['avg_monthly_ratio']:.2f} findings/person/month), "
|
| 1094 |
+
f"while '{low_company_ratio['nama_perusahaan']}' has the lowest ({low_company_ratio['avg_monthly_ratio']:.2f} findings/person/month)."
|
| 1095 |
+
)
|
| 1096 |
+
recommendation_text = (
|
| 1097 |
+
f"For '{top_company_ratio['nama_perusahaan']}': Investigate the underlying reasons for the high ratio. Is it due to active reporting, higher risk, or more personnel? "
|
| 1098 |
+
f"For '{low_company_ratio['nama_perusahaan']}': Verify if the low ratio reflects effective risk management or potential under-reporting."
|
| 1099 |
+
)
|
| 1100 |
+
insight_recommendations.append({"insight": insight_text, "recommendation": recommendation_text})
|
| 1101 |
+
|
| 1102 |
+
# --- 2. Insight & Recommendation: Finding Distribution (General) ---
|
| 1103 |
+
if 'temuan_kategori' in df.columns:
|
| 1104 |
+
cat_counts = df['temuan_kategori'].value_counts()
|
| 1105 |
+
top_cat = cat_counts.index[0] if not cat_counts.empty else "N/A"
|
| 1106 |
+
top_cat_count = cat_counts.iloc[0] if not cat_counts.empty else 0
|
| 1107 |
+
if top_cat != "N/A":
|
| 1108 |
+
perc = (top_cat_count / total_findings) * 100
|
| 1109 |
+
if top_cat == "Positive":
|
| 1110 |
+
insight_text = (
|
| 1111 |
+
f"The majority of findings ({top_cat_count} or {perc:.1f}%) are categorized as 'Positive'. "
|
| 1112 |
+
f"This indicates a strong culture of recognizing and reporting good practices and safety compliance."
|
| 1113 |
+
)
|
| 1114 |
+
recommendation_text = (
|
| 1115 |
+
f"Maintain and reinforce the positive reporting culture. "
|
| 1116 |
+
f"Consider using these 'Positive' examples as best practice case studies for training and awareness programs."
|
| 1117 |
+
)
|
| 1118 |
+
else:
|
| 1119 |
+
insight_text = (
|
| 1120 |
+
f"The most frequent finding category is '{top_cat}' ({top_cat_count} instances, {perc:.1f}% of total). "
|
| 1121 |
+
f"This highlights a specific area requiring focused attention."
|
| 1122 |
+
)
|
| 1123 |
+
recommendation_text = (
|
| 1124 |
+
f"Conduct a root-cause analysis for the '{top_cat}' category. "
|
| 1125 |
+
f"Develop targeted corrective actions and preventive measures to address the underlying issues."
|
| 1126 |
+
)
|
| 1127 |
+
insight_recommendations.append({"insight": insight_text, "recommendation": recommendation_text})
|
| 1128 |
+
|
| 1129 |
+
# --- 3. Insight & Recommendation: Location Activity (General) ---
|
| 1130 |
+
if 'nama_lokasi_full' in df.columns and total_locations > 0:
|
| 1131 |
+
loc_counts = df['nama_lokasi_full'].value_counts()
|
| 1132 |
+
top_loc = loc_counts.index[0] if not loc_counts.empty else "N/A"
|
| 1133 |
+
top_loc_count = loc_counts.iloc[0] if not loc_counts.empty else 0
|
| 1134 |
+
if top_loc != "N/A":
|
| 1135 |
+
insight_text = (
|
| 1136 |
+
f"Location '{top_loc}' has the highest number of findings ({top_loc_count}). "
|
| 1137 |
+
f"This could indicate higher activity, more scrutiny, or potentially higher risk in this area."
|
| 1138 |
+
)
|
| 1139 |
+
recommendation_text = (
|
| 1140 |
+
f"Perform a detailed review of activities in '{top_loc}'. "
|
| 1141 |
+
f"Determine if the high volume is due to increased activity or specific risk factors. "
|
| 1142 |
+
f"Ensure adequate resources and controls are in place."
|
| 1143 |
+
)
|
| 1144 |
+
insight_recommendations.append({"insight": insight_text, "recommendation": recommendation_text})
|
| 1145 |
+
|
| 1146 |
+
# --- 4. Insight & Recommendation: Resolution Performance (General) ---
|
| 1147 |
+
if 'days_to_close' in df.columns:
|
| 1148 |
+
closed_df = df.dropna(subset=['days_to_close'])
|
| 1149 |
+
if not closed_df.empty:
|
| 1150 |
+
avg_close_time = closed_df['days_to_close'].mean()
|
| 1151 |
+
median_close_time = closed_df['days_to_close'].median()
|
| 1152 |
+
# SLA threshold, e.g. 7 days
|
| 1153 |
+
sla_threshold = 7
|
| 1154 |
+
slow_findings = closed_df[closed_df['days_to_close'] > sla_threshold]
|
| 1155 |
+
slow_count = len(slow_findings)
|
| 1156 |
+
slow_percentage = (slow_count / len(closed_df)) * 100 if len(closed_df) > 0 else 0
|
| 1157 |
+
|
| 1158 |
+
insight_text = (
|
| 1159 |
+
f"The average time to close findings is {avg_close_time:.1f} days (median: {median_close_time:.1f} days). "
|
| 1160 |
+
f"{slow_count} findings ({slow_percentage:.1f}%) exceeded the {sla_threshold}-day SLA."
|
| 1161 |
+
)
|
| 1162 |
+
if slow_percentage > 20:
|
| 1163 |
+
recommendation_text = (
|
| 1164 |
+
f"The resolution performance is below target. Investigate bottlenecks in the closure process. "
|
| 1165 |
+
f"Prioritize findings that are taking longer than {sla_threshold} days. Consider implementing an escalation matrix."
|
| 1166 |
+
)
|
| 1167 |
+
else:
|
| 1168 |
+
recommendation_text = (
|
| 1169 |
+
f"The resolution performance is generally good, but there's room for improvement. "
|
| 1170 |
+
f"Focus on reducing the backlog of findings that exceed the {sla_threshold}-day SLA."
|
| 1171 |
+
)
|
| 1172 |
+
insight_recommendations.append({"insight": insight_text, "recommendation": recommendation_text})
|
| 1173 |
+
|
| 1174 |
+
# --- 5. Insight & Recommendation: Monthly Trend (General) ---
|
| 1175 |
+
if 'created_at' in df.columns:
|
| 1176 |
+
monthly_trend = df.set_index('created_at').resample('ME').size()  # 'ME' (month end) replaces the 'M' alias deprecated in pandas >= 2.2
|
| 1177 |
+
if len(monthly_trend) >= 2:
|
| 1178 |
+
last_month_count = monthly_trend.iloc[-1]
|
| 1179 |
+
prev_month_count = monthly_trend.iloc[-2]
|
| 1180 |
+
if prev_month_count > 0:
|
| 1181 |
+
change_pct = (last_month_count - prev_month_count) / prev_month_count * 100
|
| 1182 |
+
trend_word = "increase" if change_pct > 0 else "decrease"
|
| 1183 |
+
insight_text = (
|
| 1184 |
+
f"There was a {abs(change_pct):.1f}% {trend_word} in finding volume between the last two months "
|
| 1185 |
+
f"({monthly_trend.index[-2].strftime('%b %Y')} and {monthly_trend.index[-1].strftime('%b %Y')})."
|
| 1186 |
+
)
|
| 1187 |
+
if abs(change_pct) > 20: # If the change is large
|
| 1188 |
+
recommendation_text = (
|
| 1189 |
+
f"Investigate the cause of this significant {trend_word} in findings. "
|
| 1190 |
+
f"Review operational changes, contractor activities, or audit focus shifts that occurred recently."
|
| 1191 |
+
)
|
| 1192 |
+
else:
|
| 1193 |
+
recommendation_text = (
|
| 1194 |
+
f"Monitor the trend over the next few weeks to see if this change represents a new pattern or a temporary fluctuation."
|
| 1195 |
+
)
|
| 1196 |
+
insight_recommendations.append({"insight": insight_text, "recommendation": recommendation_text})
|
| 1197 |
+
|
| 1198 |
+
# --- 6. Insight & Recommendation: Reporter Activity (General) ---
|
| 1199 |
+
if 'creator_nid' in df.columns:
|
| 1200 |
+
active_reporters = df['creator_nid'].nunique()
|
| 1201 |
+
total_reports = len(df)
|
| 1202 |
+
avg_reports_per_person = total_reports / active_reporters if active_reporters > 0 else 0
|
| 1203 |
+
# Check whether there is a dominant reporter
|
| 1204 |
+
top_reporter_counts = df['creator_nid'].value_counts()
|
| 1205 |
+
if not top_reporter_counts.empty:
|
| 1206 |
+
top_reporter_id = top_reporter_counts.index[0]
|
| 1207 |
+
top_reporter_count = top_reporter_counts.iloc[0]
|
| 1208 |
+
if top_reporter_count / total_reports > 0.15: # If one person created > 15% of the reports
|
| 1209 |
+
insight_text = (
|
| 1210 |
+
f"Reporter with ID '{top_reporter_id}' has submitted a disproportionately high number of findings ({top_reporter_count}). "
|
| 1211 |
+
f"They account for {top_reporter_count/total_reports*100:.1f}% of the total volume."
|
| 1212 |
+
)
|
| 1213 |
+
recommendation_text = (
|
| 1214 |
+
f"Recognize the active reporter. Also, ensure reporting is distributed across the team "
|
| 1215 |
+
f"to provide a more comprehensive view of risks across all areas and activities."
|
| 1216 |
+
)
|
| 1217 |
+
insight_recommendations.append({"insight": insight_text, "recommendation": recommendation_text})
|
| 1218 |
+
|
| 1219 |
+
return insight_recommendations
|
| 1220 |
+
|
| 1221 |
+
# Call the function to get the insights and recommendations
|
| 1222 |
+
ai_insights_and_recs = compute_ai_insights(df_filtered)
|
| 1223 |
+
|
| 1224 |
+
# Display the results
|
| 1225 |
+
|
| 1226 |
+
if ai_insights_and_recs:
|
| 1227 |
+
for i, item in enumerate(ai_insights_and_recs):
|
| 1228 |
+
insight = item["insight"]
|
| 1229 |
+
recommendation = item["recommendation"]
|
| 1230 |
+
# Display the Insight
|
| 1231 |
+
st.markdown(f'<div class="ai-insight"><strong>Insight {i+1}:</strong> {insight}</div>', unsafe_allow_html=True)
|
| 1232 |
+
# Display the Recommendation
|
| 1233 |
+
st.markdown(f'<div class="ai-recommendation"><strong>Recommendation {i+1}:</strong> {recommendation}</div>', unsafe_allow_html=True)
|
| 1234 |
+
else:
|
| 1235 |
+
# If no insights were generated, the data may be empty or required columns may be missing
|
| 1236 |
+
st.markdown('<div class="ai-insight">No significant AI insights could be generated. This might be due to insufficient data or missing required columns after filtering.</div>', unsafe_allow_html=True)
|
| 1237 |
+
|
| 1238 |
+
# =================== FOOTER ===================
|
| 1239 |
+
st.markdown("---")
|
| 1240 |
+
st.markdown(
|
| 1241 |
+
"""
|
| 1242 |
+
<div style="text-align:center; color:#757575; font-size:0.9em;">
|
| 1243 |
+
<strong> Special Design for PLN </strong> • © 2025 PT Bukit Technology
|
| 1244 |
+
</div>
|
| 1245 |
+
""",
|
| 1246 |
+
unsafe_allow_html=True
|
| 1247 |
+
)
|
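The quadrant logic in section 7 of app.py can be pulled out into small pure functions, which makes the thresholds testable without Streamlit or pandas. What follows is a minimal sketch under stated assumptions: it reuses the thresholds from the diff (`X_LIMIT = 20`, `Y_LIMIT = 3`) and the same label strings, but `classify_quadrant` and `mean_lead_time` are hypothetical names that do not appear in app.py. Skipping open findings when averaging is also an assumption of this sketch; the diff instead fills missing lead times with 0.

```python
from statistics import mean
from typing import Optional, Sequence

X_LIMIT = 20  # finding-count threshold, same value as in the diff
Y_LIMIT = 3   # lead-time threshold in days, same value as in the diff


def mean_lead_time(lead_times: Sequence[Optional[float]]) -> float:
    """Average lead time over closed findings only (None marks an open finding).

    Returns 0.0 when no finding has been closed yet.
    """
    closed = [days for days in lead_times if days is not None]
    return mean(closed) if closed else 0.0


def classify_quadrant(finding_count: float, avg_lead_time: float) -> str:
    """Return the risk-matrix quadrant label for one operator."""
    high_count = finding_count >= X_LIMIT
    high_lead = avg_lead_time >= Y_LIMIT
    if high_count and high_lead:
        return "Quadrant I – High Leadtime & High Count"
    if high_lead:
        return "Quadrant II – High Leadtime but Low Count"
    if high_count:
        return "Quadrant III – Low Leadtime but High Count"
    return "Quadrant IV – Low Leadtime & Low Count"
```

A function like this could be applied row by row exactly as `assign_quadrant` is in the diff; the benefit is only that boundary cases (a value exactly on a threshold) can be checked in isolation.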
btech.png
ADDED
|
data.csv
ADDED
|
requirements.txt
CHANGED
|
@@ -1,3 +1,14 @@
|
|
| 1 |
altair
|
| 2 |
-
|
| 3 |
-
|
| 1 |
altair
|
| 2 |
+
streamlit>=1.38.0
|
| 3 |
+
pandas>=2.2.2
|
| 4 |
+
numpy>=1.26.4
|
| 5 |
+
plotly>=5.24.1
|
| 6 |
+
plotly-express>=0.4.1
|
| 7 |
+
openpyxl>=3.1.5
|
| 8 |
+
python-dateutil>=2.9.0
|
| 9 |
+
# --- Added for WordCloud ---
|
| 10 |
+
wordcloud>=1.9.3
|
| 11 |
+
matplotlib>=3.8.0
|
| 12 |
+
# --- Added for Predictive Analysis (AI Insights) ---
|
| 13 |
+
statsmodels>=0.14.0
|
| 14 |
+
# -------------------------------
|
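The month-over-month comparison that `compute_ai_insights` in app.py derives from `created_at` can also be checked without any of the dependencies listed above. The sketch below is a dependency-free illustration, not the app's implementation: `month_over_month_change` and `describe_change` are hypothetical helper names, and, unlike a pandas resample, months with no findings are not filled in as zero here (an assumption of this sketch).

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Optional


def month_over_month_change(timestamps: Iterable[datetime]) -> Optional[float]:
    """Percent change in finding count between the two most recent months with data.

    Returns None when fewer than two distinct months are present.
    """
    counts = Counter((ts.year, ts.month) for ts in timestamps)
    months = sorted(counts)
    if len(months) < 2:
        return None
    prev, last = counts[months[-2]], counts[months[-1]]
    # prev is at least 1 because only months that contain findings are counted
    return (last - prev) / prev * 100


def describe_change(change_pct: float) -> str:
    """Phrase the change so the sign is carried by the word, not the number."""
    if change_pct == 0:
        return "no change"
    word = "increase" if change_pct > 0 else "decrease"
    return f"a {abs(change_pct):.1f}% {word}"
```

For example, four findings in one month followed by five in the next yields a change of 25.0, which `describe_change` renders as "a 25.0% increase"; a negative change is reported as a decrease with an unsigned percentage.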