---
title: KPI Score Correlation Analysis
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "5.33.0"
app_file: kpi_correlation_app.py
pinned: false
---
# KPI Score Correlation Analysis
This directory contains tools for analyzing correlations between IPM scores and axiia scores.
## Architecture
The code has been refactored to share common functionality:
- **`correlation_analysis_core.py`**: Core analysis module with shared functions
- **`csv_utils.py`**: Utilities for loading CSV/Excel files
- **`analyze_correlations_v2.py`**: Command-line interface (CLI)
- **`kpi_correlation_app.py`**: Gradio web interface
## Installation
Install the required dependencies:
```bash
pip install pandas numpy scipy matplotlib seaborn gradio pyyaml openpyxl xlrd
```
## Usage
### Command-Line Interface
The CLI tool is best for batch processing and automation:
```bash
# Basic usage
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv
# With custom output file
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -o results.yaml
# With plots
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -p
# Full example
python3 analyze_correlations_v2.py \
  -k ../../data/lenovo_kpi.csv \
  -s ../../data/lenovo-scores-0603.csv \
  -o score_corr.yaml \
  -p
```
Options:
- `-k, --kpi`: Path to KPI file (CSV or Excel)
- `-s, --scores`: Path to scores file (CSV)
- `-o, --output`: Output YAML file (default: score_corr.yaml)
- `-p, --plot`: Generate correlation plots
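For batch runs, the CLI can be driven from a small Python wrapper. The following is a minimal sketch, not part of this repository: `build_command` and `run_batch` are hypothetical helper names, the per-file output naming (`<kpi stem>_corr.yaml`) is an assumption, and only the flags documented above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(kpi, scores, output="score_corr.yaml", plot=False):
    """Build the argv list for one CLI run using the documented flags."""
    cmd = [
        sys.executable,  # the current Python interpreter instead of a hard-coded python3
        "analyze_correlations_v2.py",
        "-k", str(kpi),
        "-s", str(scores),
        "-o", str(output),
    ]
    if plot:
        cmd.append("-p")
    return cmd


def run_batch(kpi_files, scores, out_dir):
    """Run the CLI once per KPI file, writing one YAML file per input."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for kpi in kpi_files:
        # Hypothetical naming scheme: <kpi file stem>_corr.yaml
        output = out_dir / f"{Path(kpi).stem}_corr.yaml"
        # check=True raises CalledProcessError if the CLI exits with an error
        subprocess.run(build_command(kpi, scores, output, plot=True), check=True)


# Inspect the command without running it
print(build_command("kpi_file.csv", "scores_file.csv"))
```

Call `run_batch([...], "scores_file.csv", "results")` from the directory that contains `analyze_correlations_v2.py`.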
### Gradio Web Interface
The Gradio app provides an interactive UI with parameterized analysis:
```bash
# Basic usage (uses default scores file in same directory)
python3 kpi_correlation_app.py
# With custom scores file
python3 kpi_correlation_app.py --scores-file path/to/scores.csv
# Share publicly
python3 kpi_correlation_app.py --share
# Custom port
python3 kpi_correlation_app.py --port 8080
# Full example
python3 kpi_correlation_app.py \
  --scores-file ../../data/lenovo-scores-0603.csv \
  --port 7860
```
Options:
- `--scores-file`: Path to scores CSV file (default: lenovo-scores-0603.csv)
- `--share`: Create a public link
- `--port`: Port to run on (default: 7860)
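As a rough illustration of how these flags could map onto Gradio, here is a sketch using `argparse`. It assumes the app ends with a standard `launch()` call, whose `share` and `server_port` keyword arguments exist in Gradio; `parse_app_args` and `launch_kwargs` are hypothetical names, and the real wiring in `kpi_correlation_app.py` may differ.

```python
import argparse


def parse_app_args(argv=None):
    """Parse the documented app options (defaults taken from the list above)."""
    parser = argparse.ArgumentParser(description="KPI correlation Gradio app")
    parser.add_argument("--scores-file", default="lenovo-scores-0603.csv",
                        help="Path to scores CSV file")
    parser.add_argument("--share", action="store_true",
                        help="Create a public link")
    parser.add_argument("--port", type=int, default=7860,
                        help="Port to run on")
    return parser.parse_args(argv)


def launch_kwargs(args):
    """Translate parsed options into keyword arguments for Gradio's launch()."""
    return {"share": args.share, "server_port": args.port}


args = parse_app_args(["--port", "8080"])
print(args.scores_file, launch_kwargs(args))
# In the app itself this would end with something like:
#   demo.launch(**launch_kwargs(args))   # `demo` being the gr.Blocks / gr.Interface object
```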
The Gradio interface provides the following parameterized features:
1. **Data Selection**:
   - Choose between different KPI files (CSV/Excel)
   - Select specific score columns for analysis
   - Filter data by manager status and other criteria
2. **Analysis Parameters**:
   - Correlation method selection (Pearson/Spearman)
   - Confidence level adjustment
   - Sample size requirements
   - Outlier detection thresholds
3. **Visualization Options**:
   - Plot type selection (scatter, regression, etc.)
   - Color scheme customization
   - Figure size and DPI settings
   - Trend line display options
4. **Output Configuration**:
   - Export format selection (YAML/CSV/Excel)
   - Custom output file naming
   - Detailed vs. summary report options
   - Plot export settings
All parameters can be adjusted in real time through the web interface; the analysis results and visualizations update immediately.
## Input File Requirements
### KPI File
- Must contain an email column (case-insensitive)
- Must contain IPM columns for FY23/24 and FY24/25
- Supports CSV and Excel formats
### Scores File
- Must contain columns: `email`, `problem_score`, `ability_score`
- CSV format
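The column requirements above can be checked before running an analysis. This is a minimal sketch, not the logic in `csv_utils.py`: `find_email_column` and `missing_score_columns` are hypothetical helpers, and it assumes "case-insensitive" means the header equals `email` once case and surrounding whitespace are ignored.

```python
REQUIRED_SCORE_COLUMNS = ("email", "problem_score", "ability_score")


def find_email_column(columns):
    """Return the first header that equals 'email' ignoring case, else None."""
    for name in columns:
        # Assumption: exact match after lowercasing and stripping whitespace
        if str(name).strip().lower() == "email":
            return name
    return None


def missing_score_columns(columns):
    """Return the required scores-file columns that are absent, in order."""
    present = set(columns)
    return [col for col in REQUIRED_SCORE_COLUMNS if col not in present]


# Example headers (made up for illustration)
print(find_email_column(["Name", "Email", "FY23/24 IPM"]))
print(missing_score_columns(["email", "problem_score"]))
```

Both helpers accept any iterable of header names, so they work with `df.columns` from a pandas DataFrame.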
## Output
### CLI Output
- Console output with data quality report and correlation analysis
- YAML file with detailed results
- Optional PNG plots (individual and combined)
### Gradio Output
- Interactive web interface
- Real-time analysis results
- Interactive scatter plots with trend lines
- Data quality statistics
## Correlation Pairs Analyzed
- **AC**: problem_score vs FY23/24 IPM
- **AD**: problem_score vs FY24/25 IPM
- **BC**: ability_score vs FY23/24 IPM
- **BD**: ability_score vs FY24/25 IPM
Each pair shows:
- Pearson correlation coefficient
- Spearman correlation coefficient
- Number of valid samples
- Data quality metrics
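The four pairs can be computed with `scipy.stats` along the following lines. This is a sketch under stated assumptions rather than the code in `correlation_analysis_core.py`: the IPM column names (`ipm_fy23_24`, `ipm_fy24_25`), the `correlate_pairs` helper, and the minimum of 3 valid samples are all hypothetical, and the sample data is made up.

```python
import pandas as pd
from scipy import stats

# Hypothetical IPM column names; real KPI file headers may differ.
PAIRS = {
    "AC": ("problem_score", "ipm_fy23_24"),
    "AD": ("problem_score", "ipm_fy24_25"),
    "BC": ("ability_score", "ipm_fy23_24"),
    "BD": ("ability_score", "ipm_fy24_25"),
}

MIN_SAMPLES = 3  # assumed threshold, not taken from the project


def correlate_pairs(df):
    """Compute Pearson and Spearman coefficients for each documented pair."""
    results = {}
    for label, (score_col, ipm_col) in PAIRS.items():
        # Keep only rows where both values are present
        valid = df[[score_col, ipm_col]].dropna()
        entry = {"n": len(valid), "pearson": None, "spearman": None}
        if len(valid) >= MIN_SAMPLES:
            pearson_r, _ = stats.pearsonr(valid[score_col], valid[ipm_col])
            spearman_r, _ = stats.spearmanr(valid[score_col], valid[ipm_col])
            entry["pearson"] = float(pearson_r)
            entry["spearman"] = float(spearman_r)
        results[label] = entry
    return results


# Made-up data, already merged on email
df = pd.DataFrame({
    "problem_score": [1, 2, 3, 4, None],
    "ability_score": [2, 1, 4, 3, 5],
    "ipm_fy23_24": [10, 20, 30, 40, 50],
    "ipm_fy24_25": [40, 30, 20, 10, None],
})
for label, entry in correlate_pairs(df).items():
    print(label, entry)
```

Rows with a missing value in either column are dropped per pair, so the number of valid samples can differ between pairs.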
## HF Deployment
To deploy to the Hugging Face Space, push this subdirectory to the `hfspace` remote: `git subtree push --prefix=data-analysis/kpi_score_analysis hfspace main`