
metadata
title: PCA Variance Puzzle Explorer
emoji: 🧩
colorFrom: blue
colorTo: yellow
sdk: streamlit
pinned: false
app_file: app.py

PCA Variance Puzzle Explorer

An interactive educational dashboard designed for students and researchers to intuitively understand Principal Component Analysis (PCA) and the mechanics of dimensionality reduction.

This application serves as a gamified hands-on exercise following a data science lecture on business analysis and footwear design optimization.

🌟 Concept & Learning Objectives

Instead of only studying abstract formulas, users search for the first principal component (PC1) by manually rotating an axis line over a scatter plot of a correlated footwear dataset (foot length vs. foot width).

Through this hands-on exercise, users can grasp the core concepts of PCA:

  • Maximizing Variance: Discovering that the angle capturing the widest spread of data yields the highest "Explained Variance Ratio".
  • Dimensionality Reduction: Compressing 2D spatial metrics into a single 1D principal score while minimizing information loss.
  • Reconstruction Error: Seeing that projecting data points onto the axis leaves perpendicular residuals, and that the angle which maximizes the explained variance also minimizes the total squared residual.
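
The three ideas above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code in app.py: the synthetic data, the function names `make_foot_data` and `variance_along_angle`, and the chosen means and spreads are all assumptions made for the example.

```python
import numpy as np


def make_foot_data(n=200, seed=0):
    """Synthetic, hypothetical foot length/width data in cm (not the app's dataset)."""
    rng = np.random.default_rng(seed)
    length = rng.normal(25.0, 1.5, n)
    width = 0.4 * length + rng.normal(0.0, 0.3, n)  # width correlated with length
    return np.column_stack([length, width])


def variance_along_angle(X, angle_deg):
    """Return (explained variance ratio, mean squared reconstruction error)
    for an axis through the data mean at the given angle."""
    Xc = X - X.mean(axis=0)                      # center the data
    theta = np.radians(angle_deg)
    u = np.array([np.cos(theta), np.sin(theta)])  # unit vector along the axis
    scores = Xc @ u                              # 1D scores (dimensionality reduction)
    recon = np.outer(scores, u)                  # projected points back in 2D
    total = np.sum(Xc ** 2) / len(Xc)            # total variance of both features
    explained = np.sum(scores ** 2) / len(Xc)    # variance captured by the axis
    mse = np.sum((Xc - recon) ** 2) / len(Xc)    # mean squared perpendicular residual
    return explained / total, mse


X = make_foot_data()
for angle in (-45.0, 0.0, 20.0, 45.0):
    ratio, mse = variance_along_angle(X, angle)
    print(f"angle={angle:6.1f}  explained={ratio:.3f}  reconstruction MSE={mse:.3f}")
```

For every angle, the captured variance and the reconstruction error add up to the total variance, which is why maximizing one is the same as minimizing the other.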

🚀 How to Play / Use

  1. Adjust the Angle: Use the slider in the sidebar to rotate the principal axis line (from -90.0 to 90.0 degrees).
  2. Maximize the Score: Watch the "Explained Variance Ratio (%)" update in real time. Try to find the angle that captures the highest percentage of the total variance.
  3. Visualize Projection: Toggle the "Show projected data (after compression)" checkbox to see how 2D data points collapse onto the 1D line as compressed data.
  4. Reveal the Answer: Click the "Show Answer" button to compare your empirical guess with the mathematically calculated exact PCA angle and maximum variance ratio.
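
One way the exact angle behind "Show Answer" could be computed is from the eigenvectors of the covariance matrix. The sketch below assumes that approach; the function name `exact_pca_angle` and the synthetic data are hypothetical, and app.py may instead rely on scikit-learn.

```python
import numpy as np


def exact_pca_angle(X):
    """Return (PC1 angle in degrees within [-90, 90], explained variance ratio)."""
    cov = np.cov(X, rowvar=False)            # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    if pc1[0] < 0:                           # the sign of an eigenvector is arbitrary;
        pc1 = -pc1                           # flip it so the angle matches the slider range
    angle = np.degrees(np.arctan2(pc1[1], pc1[0]))
    ratio = eigvals[-1] / eigvals.sum()
    return angle, ratio


# Hypothetical data: width is roughly 0.4 * length plus noise.
rng = np.random.default_rng(0)
length = rng.normal(25.0, 1.5, 200)
width = 0.4 * length + rng.normal(0.0, 0.3, 200)
X = np.column_stack([length, width])

angle, ratio = exact_pca_angle(X)
print(f"exact PC1 angle: {angle:.1f} degrees, explained variance: {ratio:.1%}")
```

A guess made with the slider should approach this angle as the displayed ratio approaches the printed maximum.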

πŸ› οΈ Repository Structure

To deploy this successfully on Hugging Face Spaces, ensure your repository contains the following files:

  • app.py: The main Streamlit application script.
  • requirements.txt: Python dependencies (streamlit, numpy, pandas, matplotlib, scikit-learn).
  • README.md: This configuration and documentation file.
  • NotoSansJP-Regular.ttf: Japanese font file to prevent character rendering warnings on charts.

Developed for Data Science Education and Computational Analysis Workshops.