---
title: KPI Score Correlation Analysis
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "5.33.0"
app_file: kpi_correlation_app.py
pinned: false
---

# KPI Score Correlation Analysis

This directory contains tools for analyzing correlations between IPM scores and axiia scores.

## Architecture

The code has been refactored to share common functionality:

- **`correlation_analysis_core.py`**: Core analysis module with shared functions
- **`csv_utils.py`**: Utilities for loading CSV/Excel files
- **`analyze_correlations_v2.py`**: Command-line interface (CLI)
- **`kpi_correlation_app.py`**: Gradio web interface

## Installation

Ensure you have the required dependencies:

```bash
pip install pandas numpy scipy matplotlib seaborn gradio pyyaml openpyxl xlrd
```

## Usage

### Command-Line Interface

The CLI tool is best for batch processing and automation:

```bash
# Basic usage
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv

# With custom output file
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -o results.yaml

# With plots
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -p

# Full example
python3 analyze_correlations_v2.py \
    -k ../../data/lenovo_kpi.csv \
    -s ../../data/lenovo-scores-0603.csv \
    -o score_corr.yaml \
    -p
```

Options:

- `-k, --kpi`: Path to KPI file (CSV or Excel)
- `-s, --scores`: Path to scores file (CSV)
- `-o, --output`: Output YAML file (default: score_corr.yaml)
- `-p, --plot`: Generate correlation plots

### Gradio Web Interface

The Gradio app provides an interactive UI with parameterized analysis:

```bash
# Basic usage (uses default scores file in same directory)
python3 kpi_correlation_app.py

# With custom scores file
python3 kpi_correlation_app.py --scores-file path/to/scores.csv

# Share publicly
python3 kpi_correlation_app.py --share

# Custom port
python3 kpi_correlation_app.py --port 8080

# Full example
python3 kpi_correlation_app.py \
    --scores-file ../../data/lenovo-scores-0603.csv \
    --port 7860
```

Options:

- `--scores-file`: Path to scores CSV file (default: lenovo-scores-0603.csv)
- `--share`: Create a public link
- `--port`: Port to run on (default: 7860)

The Gradio interface provides the following parameterized features:

1. **Data Selection**:
   - Choose between different KPI files (CSV/Excel)
   - Select specific score columns for analysis
   - Filter data by manager status and other criteria

2. **Analysis Parameters**:
   - Correlation method selection (Pearson/Spearman)
   - Confidence level adjustment
   - Sample size requirements
   - Outlier detection thresholds

3. **Visualization Options**:
   - Plot type selection (scatter, regression, etc.)
   - Color scheme customization
   - Figure size and DPI settings
   - Trend line display options

4. **Output Configuration**:
   - Export format selection (YAML/CSV/Excel)
   - Custom output file naming
   - Detailed vs. summary report options
   - Plot export settings

All parameters can be adjusted in real-time through the web interface, with immediate updates to the analysis results and visualizations.
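
To make the analysis parameters more concrete, here is a minimal sketch of how a correlation method choice and a minimum sample size might be applied to paired values. It is not the code in `correlation_analysis_core.py`: the function name `correlate` and the `min_samples` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import stats


def correlate(x, y, method="pearson", min_samples=3):
    """Return (coefficient, p_value, n), or None if too few valid pairs.

    Hypothetical helper: the name, signature, and defaults are assumptions,
    not the project's actual API.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Keep only the pairs where both values are present.
    mask = ~(np.isnan(x) | np.isnan(y))
    x, y = x[mask], y[mask]
    n = int(mask.sum())

    # Enforce the sample size requirement before computing anything.
    if n < min_samples:
        return None

    if method == "pearson":
        r, p = stats.pearsonr(x, y)
    elif method == "spearman":
        r, p = stats.spearmanr(x, y)
    else:
        raise ValueError(f"unknown method: {method!r}")
    return float(r), float(p), n
```

Returning `None` for undersized samples keeps the caller from reporting a coefficient that rests on too little data; a real implementation might prefer to raise or to log a data quality warning instead.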
## Input File Requirements

### KPI File

- Must contain an email column (case-insensitive)
- Must contain IPM columns for FY23/24 and FY24/25
- Supports CSV and Excel formats

### Scores File

- Must contain columns: `email`, `problem_score`, `ability_score`
- CSV format

## Output

### CLI Output

- Console output with data quality report and correlation analysis
- YAML file with detailed results
- Optional PNG plots (individual and combined)

### Gradio Output

- Interactive web interface
- Real-time analysis results
- Interactive scatter plots with trend lines
- Data quality statistics

## Correlation Pairs Analyzed

- **AC**: problem_score vs FY23/24 IPM
- **AD**: problem_score vs FY24/25 IPM
- **BC**: ability_score vs FY23/24 IPM
- **BD**: ability_score vs FY24/25 IPM

Each pair shows:

- Pearson correlation coefficient
- Spearman correlation coefficient
- Number of valid samples
- Data quality metrics

# HF deployment

`git subtree push --prefix=data-analysis/kpi_score_analysis hfspace main`
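
# Example: Correlation Pairs Sketch

For reference, the four pairs listed under "Correlation Pairs Analyzed" could be computed along these lines once the KPI and scores files are loaded into DataFrames. This is a rough sketch under stated assumptions, not the project's implementation: the IPM column names `ipm_fy2324` and `ipm_fy2425`, the function `analyze_pairs`, and the three-sample minimum are all hypothetical. Only `email`, `problem_score`, and `ability_score` come from the file requirements above.

```python
import pandas as pd
from scipy import stats

# Pair label -> (score column, IPM column).
# The IPM column names are assumptions; real KPI headers may differ.
PAIRS = {
    "AC": ("problem_score", "ipm_fy2324"),
    "AD": ("problem_score", "ipm_fy2425"),
    "BC": ("ability_score", "ipm_fy2324"),
    "BD": ("ability_score", "ipm_fy2425"),
}


def analyze_pairs(kpi: pd.DataFrame, scores: pd.DataFrame, min_samples: int = 3):
    """Join the two tables on email and correlate each pair (hypothetical helper)."""
    # Match the KPI email header regardless of case, e.g. "Email" or "EMAIL".
    kpi = kpi.rename(
        columns={c: "email" for c in kpi.columns if str(c).strip().lower() == "email"}
    )
    merged = scores.merge(kpi, on="email", how="inner")

    results = {}
    for label, (score_col, ipm_col) in PAIRS.items():
        # Drop rows where either side of the pair is missing.
        valid = merged[[score_col, ipm_col]].dropna()
        entry = {"n": len(valid)}
        if len(valid) >= min_samples:
            pearson_r, _ = stats.pearsonr(valid[score_col], valid[ipm_col])
            spearman_r, _ = stats.spearmanr(valid[score_col], valid[ipm_col])
            entry["pearson"] = float(pearson_r)
            entry["spearman"] = float(spearman_r)
        results[label] = entry
    return results


# Tiny made-up dataset, for illustration only.
scores_df = pd.DataFrame({
    "email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "problem_score": [1, 2, 3, 4],
    "ability_score": [4, 3, 2, 1],
})
kpi_df = pd.DataFrame({
    "Email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "ipm_fy2324": [10.0, 20.0, 30.0, 40.0],
    "ipm_fy2425": [10.0, 20.0, 30.0, float("nan")],
})
pair_results = analyze_pairs(kpi_df, scores_df)
```

An inner join keeps only people present in both files, so the per-pair sample count `n` doubles as a simple data quality indicator: it drops whenever an email is unmatched or an IPM value is missing.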