title: PCA Variance Puzzle Explorer
emoji: 🧩
colorFrom: blue
colorTo: yellow
sdk: streamlit
pinned: false
app_file: app.py
PCA Variance Puzzle Explorer
An interactive educational dashboard designed for students and researchers to intuitively understand Principal Component Analysis (PCA) and the mechanics of dimensionality reduction.
This application serves as a gamified hands-on exercise following a data science lecture on business analysis and footwear design optimization.
Concept & Learning Objectives
Instead of just looking at abstract mathematical formulas, users try to find the "First Principal Component (PC1)" by manually rotating an axis line over a scatter plot of a correlated footwear dataset (foot length vs. foot width).
Through this hands-on exercise, users can grasp the core concepts of PCA:
- Maximizing Variance: Discovering that the angle capturing the widest spread of data yields the highest "Explained Variance Ratio".
- Dimensionality Reduction: Compressing 2D spatial metrics into a single 1D principal score while minimizing information loss.
- Reconstruction Error: Visually understanding how projecting data points onto the axis relates to minimizing residual errors.
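The three ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in app.py: the function names and the `SAMPLE_FEET_CM` values are hypothetical, and the angle is measured in data coordinates from the foot-length axis.

```python
import math

# Hypothetical (foot length, foot width) pairs in cm, for illustration only.
SAMPLE_FEET_CM = [
    (23.0, 9.0), (24.0, 9.5), (24.5, 9.8),
    (25.5, 10.1), (26.5, 10.4), (27.0, 10.9),
]

def center(points):
    """Shift the points so that their mean sits at the origin."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return [(x - mean_x, y - mean_y) for x, y in points]

def explained_variance_ratio(points, angle_deg):
    """Share of the total variance captured along the axis at angle_deg."""
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)   # unit vector along the axis
    centered = center(points)
    scores = [x * ux + y * uy for x, y in centered]  # 1D principal scores
    projected = sum(s * s for s in scores)
    total = sum(x * x + y * y for x, y in centered)
    return projected / total

def reconstruction_error(points, angle_deg):
    """Mean squared distance between each point and its projection."""
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    squared = 0.0
    for x, y in center(points):
        score = x * ux + y * uy
        rx, ry = x - score * ux, y - score * uy  # residual off the axis
        squared += rx * rx + ry * ry
    return squared / len(points)

for angle in (-45.0, 0.0, 20.0, 45.0):
    ratio = explained_variance_ratio(SAMPLE_FEET_CM, angle)
    error = reconstruction_error(SAMPLE_FEET_CM, angle)
    print(f"{angle:6.1f} deg  ratio={ratio:.3f}  error={error:.4f}")
```

Because the squared length of each centered point splits into the squared score plus the squared residual, raising the explained variance ratio lowers the reconstruction error by the same amount. That is why maximizing variance and minimizing residuals select the same angle.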
How to Play / Use
- Adjust the Angle: Use the slider in the sidebar to rotate the principal axis line (from -90.0 to 90.0 degrees).
- Maximize the Score: Watch the "Explained Variance Ratio (%)" update in real time. Try to find the angle that captures the highest percentage of the data's variance.
- Visualize Projection: Toggle the "Show projected data (after compression)" checkbox to see how 2D data points collapse onto the 1D line as compressed data.
- Reveal the Answer: Click the "Show Answer" button to compare your empirical guess with the mathematically calculated exact PCA angle and maximum variance ratio.
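For two variables, the exact answer behind the "Show Answer" button can also be worked out by hand from the 2x2 covariance matrix. The sketch below uses that closed form; it is an assumption for illustration and not necessarily how app.py computes the answer (the dependency list suggests scikit-learn may be used instead). The function name and the `SAMPLE_FEET_CM` values are hypothetical.

```python
import math

# Hypothetical (foot length, foot width) pairs in cm, for illustration only.
SAMPLE_FEET_CM = [
    (23.0, 9.0), (24.0, 9.5), (24.5, 9.8),
    (25.5, 10.1), (26.5, 10.4), (27.0, 10.9),
]

def exact_pca_angle_and_ratio(points):
    """Return (PC1 angle in degrees, maximum explained variance ratio)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Direction of the major axis: tan(2 * theta) = 2 * sxy / (sxx - syy).
    # Halving atan2 keeps theta within (-90, 90] degrees, like the slider.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Largest eigenvalue of the covariance matrix is the variance along PC1.
    largest = (sxx + syy) / 2.0 + math.hypot((sxx - syy) / 2.0, sxy)
    return math.degrees(theta), largest / (sxx + syy)

angle_deg, max_ratio = exact_pca_angle_and_ratio(SAMPLE_FEET_CM)
print(f"PC1 angle: {angle_deg:.2f} deg, max ratio: {max_ratio:.2%}")
```

Note that this angle is in data coordinates. If a chart draws the two axes at different scales, the line will appear at a different angle on screen.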
🛠️ Repository Structure
To deploy this successfully on Hugging Face Spaces, ensure your repository contains the following files:
- app.py: The main Streamlit application script.
- requirements.txt: Python dependencies (streamlit, numpy, pandas, matplotlib, scikit-learn).
- README.md: This configuration and documentation file.
- NotoSansJP-Regular.ttf: Japanese font file to prevent character rendering warnings on charts.
Developed for Data Science Education and Computational Analysis Workshops.