| ---
|
| title: PCA Variance Puzzle Explorer
|
| emoji: 🧩
|
| colorFrom: blue
|
| colorTo: yellow
|
| sdk: streamlit
|
| pinned: false
|
| app_file: app.py
|
| ---
|
|
|
| # PCA Variance Puzzle Explorer
|
|
|
| An interactive educational dashboard designed for students and researchers to intuitively understand **Principal Component Analysis (PCA)** and the mechanics of dimensionality reduction.
|
|
|
| This application serves as a gamified hands-on exercise following a data science lecture on business analysis and footwear design optimization.
|
|
|
| ## 🌟 Concept & Learning Objectives
|
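| Rather than studying abstract formulas alone, users try to find the "First Principal Component (PC1)" by manually rotating a candidate axis line over a scatter plot of a correlated footwear dataset (foot length vs. foot width). Note that this line is a projection axis, not a regression line: PCA measures perpendicular distances to the axis, not vertical ones.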
|
|
|
| Through this hands-on exercise, users can grasp the core concepts of PCA:
|
| - **Maximizing Variance:** Discovering that the angle capturing the widest spread of data yields the highest "Explained Variance Ratio".
|
| - **Dimensionality Reduction:** Compressing two measurements (foot length and foot width) into a single 1D principal component score while minimizing information loss.
|
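| - **Reconstruction Error:** Visually understanding how projecting data points onto the axis relates to minimizing residual errors. For an axis through the data mean, the variance captured along the axis and the mean squared perpendicular residual always sum to the total variance, so maximizing one minimizes the other.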
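
The three ideas above can be checked numerically. The following is a minimal sketch, not the code in `app.py`: the synthetic data, the function names, and the use of population variance are assumptions made for illustration, and only the Python standard library is used so the snippet runs without extra dependencies.

```python
import math
import random


def make_foot_data(n=200, seed=0):
    """Synthetic (length, width) pairs in cm; a stand-in for the app's dataset."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        length = rng.gauss(25.0, 1.5)
        width = 0.4 * length + rng.gauss(0.0, 0.3)  # width correlates with length
        points.append((length, width))
    return points


def variance_split(points, angle_deg):
    """Split the total variance into the part along the axis and the part across it."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector along the axis
    along = across = 0.0
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        along += (dx * ux + dy * uy) ** 2   # squared 1D score on the axis
        across += (dy * ux - dx * uy) ** 2  # squared perpendicular residual
    return along / n, across / n


def explained_variance_ratio(points, angle_deg):
    """Fraction of the total variance captured by the axis at this angle."""
    along, across = variance_split(points, angle_deg)
    return along / (along + across)


points = make_foot_data()
for angle in (-45, 0, 20, 45, 90):
    along, across = variance_split(points, angle)
    ratio = explained_variance_ratio(points, angle)
    print(f"{angle:>4} deg  ratio={ratio:.3f}  residual={across:.3f}  total={along + across:.3f}")
```

The `total` column stays the same for every angle, which is why the angle with the highest ratio is also the one with the smallest reconstruction error.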
|
|
|
| ## 🚀 How to Play / Use
|
| 1. **Adjust the Angle:** Use the slider in the sidebar to rotate the principal axis line (from -90.0 to 90.0 degrees).
|
| 2. **Maximize the Score:** Watch the "Explained Variance Ratio (%)" update in real time. Try to find the angle that captures the highest percentage of the total variance.
|
| 3. **Visualize Projection:** Toggle the "Show projected data (after compression)" checkbox to see how 2D data points collapse onto the 1D line as compressed data.
|
| 4. **Reveal the Answer:** Click the "Show Answer" button to compare your empirical guess with the mathematically calculated exact PCA angle and maximum variance ratio.
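
For two variables, the answer revealed in step 4 has a closed form, so it can be compared against a brute-force sweep of the slider range. The sketch below is an illustration under assumptions: the synthetic data, the function names, and the 0.1 degree slider step are hypothetical, and the app itself may compute the answer differently (for example with scikit-learn, which is listed in its requirements).

```python
import math
import random


def make_foot_data(n=200, seed=0):
    """Synthetic (length, width) pairs in cm; a stand-in for the app's dataset."""
    rng = random.Random(seed)
    return [(length, 0.4 * length + rng.gauss(0.0, 0.3))
            for length in (rng.gauss(25.0, 1.5) for _ in range(n))]


def covariance_2d(points):
    """Return (var_x, var_y, cov_xy) using population (divide-by-n) statistics."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return sxx, syy, sxy


def exact_pc1(cov):
    """Closed-form PC1 angle (degrees) and its explained variance ratio."""
    sxx, syy, sxy = cov
    # Variance along angle t is (sxx+syy)/2 + (sxx-syy)/2*cos(2t) + sxy*sin(2t),
    # which peaks where 2t = atan2(2*sxy, sxx - syy).
    angle = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy))
    largest = 0.5 * (sxx + syy) + math.hypot(0.5 * (sxx - syy), sxy)
    return angle, largest / (sxx + syy)


def ratio_at(cov, angle_deg):
    """Explained variance ratio for an arbitrary axis angle."""
    sxx, syy, sxy = cov
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return (sxx * c * c + 2.0 * sxy * s * c + syy * s * s) / (sxx + syy)


def sweep(cov):
    """Try every slider position from -90.0 to 90.0 in 0.1 degree steps."""
    return max((ratio_at(cov, a / 10), a / 10) for a in range(-900, 901))


cov = covariance_2d(make_foot_data())
angle, ratio = exact_pc1(cov)
best_ratio, best_angle = sweep(cov)
print(f"exact: {angle:.2f} deg, {100 * ratio:.2f}%")
print(f"sweep: {best_angle:.1f} deg, {100 * best_ratio:.2f}%")
```

The sweep can only land on the slider position nearest the exact angle, so a manual guess within one slider step of the revealed answer is as good as the control allows.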
|
|
|
| ## 🛠️ Repository Structure
|
| To deploy this successfully on Hugging Face Spaces, ensure your repository contains the following files:
|
| - `app.py`: The main Streamlit application script.
|
| - `requirements.txt`: Python dependencies (`streamlit`, `numpy`, `pandas`, `matplotlib`, `scikit-learn`).
|
| - `README.md`: This configuration and documentation file.
|
| - `NotoSansJP-Regular.ttf`: Japanese font file to prevent character rendering warnings on charts.
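
This README does not say how `app.py` loads the font file. One common approach, sketched below with a hypothetical helper name, is to register the file with matplotlib's font manager and make it the default family before any chart is drawn.

```python
from pathlib import Path


def register_chart_font(font_path="NotoSansJP-Regular.ttf"):
    """Make matplotlib use the bundled font; return False if the file is missing."""
    path = Path(font_path)
    if not path.is_file():
        return False
    # Imported here so the missing-file check works even without matplotlib.
    import matplotlib
    from matplotlib import font_manager

    font_manager.fontManager.addfont(str(path))  # register the .ttf file
    family = font_manager.FontProperties(fname=str(path)).get_name()
    matplotlib.rcParams["font.family"] = family  # use it for all chart text
    return True


if not register_chart_font():
    print("Font file not found; charts will use matplotlib's default font.")
```

Returning `False` instead of raising lets the app still start when the font file is absent, at the cost of the glyph warnings the file is meant to prevent.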
|
|
|
| ---
|
| Developed for Data Science Education and Computational Analysis Workshops. |