import streamlit as st

st.markdown(
    """
# Automatic Site Clustering Documentation

## 1. Objective
Cluster sites by their geographic coordinates, with a configurable maximum number of sites per cluster.

## 2. When to use this tool
Use this page to build operational clusters for planning, field operations, or optimization workloads.

## 3. Input files and accepted formats
- Required: one Excel file in `.xlsx` format
- Sample: `samples/Site_Clustering.xlsx`

## 4. Required columns/fields
You must select:
- latitude column
- longitude column
- region column
- site code column

## 5. Step-by-step usage
1. Open `Apps > Automatic Site Clustering`.
2. Upload the `.xlsx` dataset.
3. Select the columns and set `Max sites per cluster`.
4. Choose a clustering method:
   - uniform cluster size (Hilbert curve)
   - non-uniform clusters, each below the maximum (KMeans)
5. Optionally enable region mixing.
6. Click `Run Clustering` and download the output.

## 6. Outputs generated
- clustered dataset with a `Cluster` column
- cluster size charts
- map visualization by cluster
- downloadable file: `clustered_sites.xlsx`

## 7. Frequent errors and fixes
- Invalid map or missing points.
  - Fix: verify that the latitude and longitude values are numeric.
- Unexpected cluster composition.
  - Fix: tune `Max sites per cluster` and the method choice.
- Empty output.
  - Fix: ensure the uploaded file is not empty and the selected columns are correct.

## 8. Minimal reproducible example
- Input: `samples/Site_Clustering.xlsx`
- Action: run with the default `max_sites=25` and no region mixing.
- Expected result: cluster assignment, charts, map, and a downloadable Excel file.

## 9. Known limitations
- KMeans results can vary with the data distribution.
- The Hilbert strategy relies on coordinate normalization.
- Extreme outliers can reduce cluster interpretability.

## 10. Version and update date
- Documentation version: 1.0
- Last update: 2026-02-23
"""
)
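
st.markdown(
    """
## 11. Illustrative sketch: uniform clusters from a Hilbert curve
The uniform-size method in step 4 of the usage section can be pictured with
the sketch below. It is not the tool's actual implementation, which this page
does not show. The names `hilbert_index` and `cluster_sites`, the `order`
parameter, and the tuple layout `(site_code, latitude, longitude)` are
assumptions made for illustration. Region handling is left out. The idea is to
normalize coordinates onto an integer grid, sort sites by their position along
a Hilbert curve, and cut the sorted list into consecutive chunks of at most
`max_sites`.

```python
def hilbert_index(x, y, order):
    # Distance of grid cell (x, y) along a Hilbert curve covering a
    # 2**order by 2**order grid (standard iterative xy-to-d conversion).
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def cluster_sites(sites, max_sites, order=16):
    # sites: list of (site_code, latitude, longitude) tuples (assumed layout).
    # Returns {site_code: cluster_number}, with cluster numbers starting at 1.
    if not sites:
        return {}
    lats = [site[1] for site in sites]
    lons = [site[2] for site in sites]
    lat_min, lon_min = min(lats), min(lons)
    # Fall back to a span of 1.0 when all sites share one coordinate value.
    lat_span = (max(lats) - lat_min) or 1.0
    lon_span = (max(lons) - lon_min) or 1.0
    cells = (1 << order) - 1

    def curve_position(site):
        # Normalize coordinates to integer grid cells in [0, cells].
        x = int((site[2] - lon_min) / lon_span * cells)
        y = int((site[1] - lat_min) / lat_span * cells)
        return hilbert_index(x, y, order)

    ordered = sorted(sites, key=curve_position)
    # Consecutive chunks of max_sites; only the last chunk may be smaller.
    return {site[0]: i // max_sites + 1 for i, site in enumerate(ordered)}


example_sites = [
    ("A", 0.0, 0.0),
    ("B", 0.1, 0.1),
    ("C", 10.0, 10.0),
    ("D", 10.1, 10.1),
]
example_clusters = cluster_sites(example_sites, max_sites=2)
```

Because sites that are close on the map tend to be close along the curve,
each chunk usually covers a compact area. In this sketch the min/max
normalization means a single extreme outlier compresses all other sites into
a small part of the grid, which is consistent with the limitations listed in
section 9.
"""
)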