---
title: KPI Score Correlation Analysis
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "5.33.0"
app_file: kpi_correlation_app.py
pinned: false
---
# KPI Score Correlation Analysis
This directory contains tools for analyzing correlations between IPM scores and axiia scores.
## Architecture
The code has been refactored to share common functionality:
- **`correlation_analysis_core.py`**: Core analysis module with shared functions
- **`csv_utils.py`**: Utilities for loading CSV/Excel files
- **`analyze_correlations_v2.py`**: Command-line interface (CLI)
- **`kpi_correlation_app.py`**: Gradio web interface
## Installation
Ensure you have the required dependencies:
```bash
pip install pandas numpy scipy matplotlib seaborn gradio pyyaml openpyxl xlrd
```
## Usage
### Command-Line Interface
The CLI tool is best for batch processing and automation:
```bash
# Basic usage
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv
# With custom output file
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -o results.yaml
# With plots
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -p
# Full example
python3 analyze_correlations_v2.py \
    -k ../../data/lenovo_kpi.csv \
    -s ../../data/lenovo-scores-0603.csv \
    -o score_corr.yaml \
    -p
```
Options:
- `-k, --kpi`: Path to KPI file (CSV or Excel)
- `-s, --scores`: Path to scores file (CSV)
- `-o, --output`: Output YAML file (default: score_corr.yaml)
- `-p, --plot`: Generate correlation plots
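The options above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `analyze_correlations_v2.py`: the `build_parser` name is hypothetical, and treating `-k` and `-s` as required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented CLI options."""
    parser = argparse.ArgumentParser(
        description="Analyze correlations between IPM scores and axiia scores."
    )
    # Assumption: both input files are mandatory.
    parser.add_argument("-k", "--kpi", required=True,
                        help="Path to KPI file (CSV or Excel)")
    parser.add_argument("-s", "--scores", required=True,
                        help="Path to scores file (CSV)")
    parser.add_argument("-o", "--output", default="score_corr.yaml",
                        help="Output YAML file")
    parser.add_argument("-p", "--plot", action="store_true",
                        help="Generate correlation plots")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-k", "kpi_file.csv", "-s", "scores_file.csv", "-p"])
print(args.output, args.plot)  # prints: score_corr.yaml True
```

Leaving out `-o` falls back to the documented default, `score_corr.yaml`.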
### Gradio Web Interface
The Gradio app provides an interactive UI with parameterized analysis:
```bash
# Basic usage (uses default scores file in same directory)
python3 kpi_correlation_app.py
# With custom scores file
python3 kpi_correlation_app.py --scores-file path/to/scores.csv
# Share publicly
python3 kpi_correlation_app.py --share
# Custom port
python3 kpi_correlation_app.py --port 8080
# Full example
python3 kpi_correlation_app.py \
    --scores-file ../../data/lenovo-scores-0603.csv \
    --port 7860
```
Options:
- `--scores-file`: Path to scores CSV file (default: lenovo-scores-0603.csv)
- `--share`: Create a public link
- `--port`: Port to run on (default: 7860)
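As an illustration of how these flags might reach Gradio, the sketch below parses them and maps `--share` and `--port` onto the `share` and `server_port` keyword arguments of Gradio's `launch()`. The helper names `parse_app_args` and `launch_kwargs` are hypothetical, and the real `kpi_correlation_app.py` may wire this up differently.

```python
import argparse


def parse_app_args(argv=None) -> argparse.Namespace:
    """Hypothetical parser mirroring the documented app options."""
    parser = argparse.ArgumentParser(description="KPI correlation Gradio app")
    parser.add_argument("--scores-file", default="lenovo-scores-0603.csv",
                        help="Path to scores CSV file")
    parser.add_argument("--share", action="store_true",
                        help="Create a public link")
    parser.add_argument("--port", type=int, default=7860,
                        help="Port to run on")
    return parser.parse_args(argv)


def launch_kwargs(args: argparse.Namespace) -> dict:
    """Map the parsed options onto keyword arguments for Gradio's launch()."""
    return {"share": args.share, "server_port": args.port}


args = parse_app_args(["--port", "8080"])
print(launch_kwargs(args))  # prints: {'share': False, 'server_port': 8080}
# In the app, with `demo` being the Gradio interface object (assumed name):
#     demo.launch(**launch_kwargs(args))
```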
The Gradio interface provides the following parameterized features:
1. **Data Selection**:
   - Choose between different KPI files (CSV/Excel)
   - Select specific score columns for analysis
   - Filter data by manager status and other criteria
2. **Analysis Parameters**:
   - Correlation method selection (Pearson/Spearman)
   - Confidence level adjustment
   - Sample size requirements
   - Outlier detection thresholds
3. **Visualization Options**:
   - Plot type selection (scatter, regression, etc.)
   - Color scheme customization
   - Figure size and DPI settings
   - Trend line display options
4. **Output Configuration**:
   - Export format selection (YAML/CSV/Excel)
   - Custom output file naming
   - Detailed vs. summary report options
   - Plot export settings
All parameters can be adjusted in real time through the web interface, and the analysis results and visualizations update immediately.
## Input File Requirements
### KPI File
- Must contain an email column (case-insensitive)
- Must contain IPM columns for FY23/24 and FY24/25
- Supports CSV and Excel formats
### Scores File
- Must contain columns: `email`, `problem_score`, `ability_score`
- CSV format
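The requirements above can be checked before running an analysis. The following is a dependency-free sketch using the standard `csv` module; the project itself loads files through `csv_utils.py`, whose API is not shown here. The function names, the sample rows, and the IPM header names (`FY23/24 IPM`, `FY24/25 IPM`) are made up for illustration, and "case-insensitive" is interpreted as matching the email header regardless of case.

```python
import csv
import io

REQUIRED_SCORE_COLUMNS = {"email", "problem_score", "ability_score"}


def find_email_column(fieldnames):
    """Return the header that equals 'email', ignoring case and whitespace."""
    for name in fieldnames:
        if name.strip().lower() == "email":
            return name
    raise ValueError("no email column found")


def check_scores_header(fieldnames):
    """Raise ValueError if any required scores column is missing."""
    missing = REQUIRED_SCORE_COLUMNS - set(fieldnames)
    if missing:
        raise ValueError(f"scores file is missing columns: {sorted(missing)}")


# Tiny in-memory stand-ins for the two files (made-up rows and IPM headers).
kpi_csv = io.StringIO("Email,FY23/24 IPM,FY24/25 IPM\nA@x.com,3,4\n")
scores_csv = io.StringIO("email,problem_score,ability_score\na@x.com,80,75\n")

kpi_reader = csv.DictReader(kpi_csv)
scores_reader = csv.DictReader(scores_csv)

email_col = find_email_column(kpi_reader.fieldnames)
check_scores_header(scores_reader.fieldnames)
print(email_col)  # prints: Email
```

Failing early with a message that names the missing columns tends to be easier to debug than a `KeyError` raised later during the correlation step.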
## Output
### CLI Output
- Console output with data quality report and correlation analysis
- YAML file with detailed results
- Optional PNG plots (individual and combined)
### Gradio Output
- Interactive web interface
- Real-time analysis results
- Interactive scatter plots with trend lines
- Data quality statistics
## Correlation Pairs Analyzed
- **AC**: problem_score vs FY23/24 IPM
- **AD**: problem_score vs FY24/25 IPM
- **BC**: ability_score vs FY23/24 IPM
- **BD**: ability_score vs FY24/25 IPM
Each pair shows:
- Pearson correlation coefficient
- Spearman correlation coefficient
- Number of valid samples
- Data quality metrics
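To make the two coefficients concrete, here is a small pure-Python sketch that computes both for each of the four pairs. It is not the code in `correlation_analysis_core.py`; in practice SciPy (listed in the dependencies) provides `scipy.stats.pearsonr` and `scipy.stats.spearmanr`. The IPM column keys and the sample data below are placeholders.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Pair labels as listed above; the IPM keys are placeholder names.
PAIRS = {
    "AC": ("problem_score", "ipm_fy23_24"),
    "AD": ("problem_score", "ipm_fy24_25"),
    "BC": ("ability_score", "ipm_fy23_24"),
    "BD": ("ability_score", "ipm_fy24_25"),
}

# Made-up data, already restricted to rows valid in every column.
data = {
    "problem_score": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ability_score": [2.0, 1.0, 4.0, 3.0, 5.0],
    "ipm_fy23_24": [1.0, 4.0, 9.0, 16.0, 25.0],
    "ipm_fy24_25": [5.0, 4.0, 3.0, 2.0, 1.0],
}

for label, (score_col, ipm_col) in PAIRS.items():
    x, y = data[score_col], data[ipm_col]
    print(label, round(pearson(x, y), 3), round(spearman(x, y), 3), len(x))
```

Pair AC in this toy data is monotonic but not linear, so its Spearman coefficient is 1 while its Pearson coefficient is slightly lower. Differences of that kind are the reason to report both.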
## HF Deployment
To deploy, push this directory as a subtree to the `hfspace` remote: `git subtree push --prefix=data-analysis/kpi_score_analysis hfspace main`