Deploy KPI snapshot 2025-06-12
4e67a93

metadata
title: KPI Score Correlation Analysis
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.0
app_file: kpi_correlation_app.py
pinned: false

KPI Score Correlation Analysis

This directory contains tools for analyzing correlations between IPM scores and axiia scores.

Architecture

The code has been refactored to share common functionality:

  • correlation_analysis_core.py: Core analysis module with shared functions
  • csv_utils.py: Utilities for loading CSV/Excel files
  • analyze_correlations_v2.py: Command-line interface (CLI)
  • kpi_correlation_app.py: Gradio web interface

Installation

Ensure you have the required dependencies:

pip install pandas numpy scipy matplotlib seaborn gradio pyyaml openpyxl xlrd
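
To confirm the install worked, a quick check along these lines can help. It is only a sketch: the helper name `missing_packages` is hypothetical and not part of this repository. Note that the pip package `pyyaml` is imported as `yaml`.

```python
from importlib.util import find_spec

# pip package name -> import name (they differ only for pyyaml)
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "gradio": "gradio",
    "pyyaml": "yaml",
    "openpyxl": "openpyxl",
    "xlrd": "xlrd",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


# An empty list means every dependency is importable.
print(missing_packages())
```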

Usage

Command-Line Interface

The CLI tool is best for batch processing and automation:

# Basic usage
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv

# With custom output file
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -o results.yaml

# With plots
python3 analyze_correlations_v2.py -k kpi_file.csv -s scores_file.csv -p

# Full example
python3 analyze_correlations_v2.py \
  -k ../../data/lenovo_kpi.csv \
  -s ../../data/lenovo-scores-0603.csv \
  -o score_corr.yaml \
  -p

Options:

  • -k, --kpi: Path to KPI file (CSV or Excel)
  • -s, --scores: Path to scores file (CSV)
  • -o, --output: Output YAML file (default: score_corr.yaml)
  • -p, --plot: Generate correlation plots
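
The options above map naturally onto `argparse`. The following is a minimal sketch of how such a parser could be declared, not the actual code in analyze_correlations_v2.py; in particular, treating `-k` and `-s` as required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI options."""
    parser = argparse.ArgumentParser(
        description="Correlate IPM values with problem/ability scores"
    )
    # Assumption: both input files are mandatory.
    parser.add_argument("-k", "--kpi", required=True,
                        help="Path to KPI file (CSV or Excel)")
    parser.add_argument("-s", "--scores", required=True,
                        help="Path to scores file (CSV)")
    parser.add_argument("-o", "--output", default="score_corr.yaml",
                        help="Output YAML file")
    parser.add_argument("-p", "--plot", action="store_true",
                        help="Generate correlation plots")
    return parser


# Parse an explicit argument list, equivalent to the "With plots" example.
args = build_parser().parse_args(["-k", "kpi_file.csv", "-s", "scores_file.csv", "-p"])
print(args.kpi, args.scores, args.output, args.plot)
```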

Gradio Web Interface

The Gradio app provides an interactive UI with parameterized analysis:

# Basic usage (uses default scores file in same directory)
python3 kpi_correlation_app.py

# With custom scores file
python3 kpi_correlation_app.py --scores-file path/to/scores.csv

# Share publicly
python3 kpi_correlation_app.py --share

# Custom port
python3 kpi_correlation_app.py --port 8080

# Full example
python3 kpi_correlation_app.py \
  --scores-file ../../data/lenovo-scores-0603.csv \
  --port 7860

Options:

  • --scores-file: Path to scores CSV file (default: lenovo-scores-0603.csv)
  • --share: Create a public link
  • --port: Port to run on (default: 7860)
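
As a rough illustration of how these flags could reach Gradio, the sketch below parses them and turns them into keyword arguments for `launch()`, which accepts `share` and `server_port`. The helper names `build_app_parser` and `launch_kwargs` are hypothetical; the real kpi_correlation_app.py may wire this differently.

```python
import argparse


def build_app_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented app options."""
    parser = argparse.ArgumentParser(description="KPI correlation Gradio app")
    parser.add_argument("--scores-file", default="lenovo-scores-0603.csv",
                        help="Path to scores CSV file")
    parser.add_argument("--share", action="store_true",
                        help="Create a public link")
    parser.add_argument("--port", type=int, default=7860,
                        help="Port to run on")
    return parser


def launch_kwargs(args: argparse.Namespace) -> dict:
    """Translate parsed flags into keyword arguments for Gradio's launch()."""
    return {"share": args.share, "server_port": args.port}


args = build_app_parser().parse_args(["--port", "8080"])
kwargs = launch_kwargs(args)
# With a Gradio app object named `demo`, this would be: demo.launch(**kwargs)
print(args.scores_file, kwargs)
```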

The Gradio interface provides the following parameterized features:

  1. Data Selection:

    • Choose between different KPI files (CSV/Excel)
    • Select specific score columns for analysis
    • Filter data by manager status and other criteria
  2. Analysis Parameters:

    • Correlation method selection (Pearson/Spearman)
    • Confidence level adjustment
    • Sample size requirements
    • Outlier detection thresholds
  3. Visualization Options:

    • Plot type selection (scatter, regression, etc.)
    • Color scheme customization
    • Figure size and DPI settings
    • Trend line display options
  4. Output Configuration:

    • Export format selection (YAML/CSV/Excel)
    • Custom output file naming
    • Detailed vs. summary report options
    • Plot export settings

All parameters can be adjusted in real time through the web interface; the analysis results and visualizations update immediately.

Input File Requirements

KPI File

  • Must contain an email column (case-insensitive)
  • Must contain IPM columns for FY23/24 and FY24/25
  • Supports CSV and Excel formats
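
One way to locate the email column regardless of case is sketched below. It assumes "case-insensitive" means the header equals `email` once whitespace and case are ignored; the function name `find_email_column` is hypothetical, and the loader in csv_utils.py may match differently.

```python
def find_email_column(columns):
    """Return the first column whose name is 'email', ignoring case and padding."""
    for name in columns:
        if str(name).strip().lower() == "email":
            return name
    raise ValueError("KPI file has no email column")


# Works on any iterable of header names, e.g. a pandas DataFrame's .columns.
print(find_email_column(["Name", "EMAIL", "FY23/24 IPM", "FY24/25 IPM"]))
```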

Scores File

  • Must contain columns: email, problem_score, ability_score
  • CSV format
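
Before running an analysis, the scores file header can be checked against these required columns. The sketch below uses only the standard library; the helper name `missing_score_columns` is hypothetical, and exact (case-sensitive) header matching is an assumption.

```python
import csv
import io

REQUIRED_SCORE_COLUMNS = ("email", "problem_score", "ability_score")


def missing_score_columns(stream):
    """Read the CSV header from a text stream and list required columns it lacks."""
    header = next(csv.reader(stream), [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_SCORE_COLUMNS if col not in present]


# In practice the stream would be open("scores_file.csv", newline="").
sample = io.StringIO("email,problem_score\na@example.com,3.5\n")
print(missing_score_columns(sample))
```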

Output

CLI Output

  • Console output with data quality report and correlation analysis
  • YAML file with detailed results
  • Optional PNG plots (individual and combined)

Gradio Output

  • Interactive web interface
  • Real-time analysis results
  • Interactive scatter plots with trend lines
  • Data quality statistics

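Correlation Pairs Analyzed

Each pair label combines a score (A = problem_score, B = ability_score) with an IPM year (C = FY23/24, D = FY24/25):
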
  • AC: problem_score vs FY23/24 IPM
  • AD: problem_score vs FY24/25 IPM
  • BC: ability_score vs FY23/24 IPM
  • BD: ability_score vs FY24/25 IPM

Each pair shows:

  • Pearson correlation coefficient
  • Spearman correlation coefficient
  • Number of valid samples
  • Data quality metrics
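
To make the reported numbers concrete, here is a dependency-free sketch of the two coefficients and the valid-sample count for the four pairs. It is an illustration only: the project lists scipy among its dependencies and presumably relies on it, the IPM keys `ipm_fy2324` and `ipm_fy2425` are hypothetical names, and treating a row as valid when neither value is `None` is an assumption.

```python
from math import sqrt

# Pair labels as listed above; the IPM keys are hypothetical column names.
PAIRS = {
    "AC": ("problem_score", "ipm_fy2324"),
    "AD": ("problem_score", "ipm_fy2425"),
    "BC": ("ability_score", "ipm_fy2324"),
    "BD": ("ability_score", "ipm_fy2425"),
}


def pearson(x, y):
    """Pearson r: covariance divided by the product of standard deviations."""
    n = len(x)
    if n == 0:
        return float("nan")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom else float("nan")


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def analyze(rows):
    """Compute both coefficients and the valid-sample count for each pair."""
    results = {}
    for label, (score_col, ipm_col) in PAIRS.items():
        valid = [(r[score_col], r[ipm_col]) for r in rows
                 if r.get(score_col) is not None and r.get(ipm_col) is not None]
        xs = [a for a, _ in valid]
        ys = [b for _, b in valid]
        results[label] = {"pearson": pearson(xs, ys),
                          "spearman": spearman(xs, ys),
                          "n": len(valid)}
    return results


# Made-up rows purely to exercise the functions; not real data.
rows = [
    {"problem_score": 1, "ability_score": 4, "ipm_fy2324": 10, "ipm_fy2425": None},
    {"problem_score": 2, "ability_score": 3, "ipm_fy2324": 20, "ipm_fy2425": 5},
    {"problem_score": 3, "ability_score": 2, "ipm_fy2324": 30, "ipm_fy2425": 7},
    {"problem_score": 4, "ability_score": 1, "ipm_fy2324": 40, "ipm_fy2425": 6},
]
results = analyze(rows)
```

Pearson measures linear association, while Spearman depends only on rank order, so the two diverge when the relationship is monotonic but not linear.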

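HF deployment

To deploy to the Hugging Face Space, push this subdirectory to the hfspace remote with git subtree (run from the repository root):

git subtree push --prefix=data-analysis/kpi_score_analysis hfspace main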