Spaces:
Sleeping
Sleeping
Update pages/9_KNN.py
Browse files- pages/9_KNN.py +79 -102
pages/9_KNN.py
CHANGED
|
@@ -1,132 +1,109 @@
|
|
| 1 |
import streamlit as st
|
| 2 |
|
| 3 |
-
st.set_page_config(page_title="KNN
|
| 4 |
|
| 5 |
-
st.markdown(""
|
| 6 |
-
<style>
|
| 7 |
-
.stApp {
|
| 8 |
-
background: linear-gradient(to right, #141E30, #243B55);
|
| 9 |
-
color: white;
|
| 10 |
-
font-family: 'Segoe UI', sans-serif;
|
| 11 |
-
}
|
| 12 |
-
h1, h2, h3 {
|
| 13 |
-
color: #00CED1;
|
| 14 |
-
}
|
| 15 |
-
.sidebar .sidebar-content {
|
| 16 |
-
background-color: #1e1e1e;
|
| 17 |
-
}
|
| 18 |
-
.block-container {
|
| 19 |
-
padding-top: 2rem;
|
| 20 |
-
padding-bottom: 2rem;
|
| 21 |
-
}
|
| 22 |
-
a {
|
| 23 |
-
color: #00BFFF;
|
| 24 |
-
text-decoration: none;
|
| 25 |
-
}
|
| 26 |
-
a:hover {
|
| 27 |
-
color: #1E90FF;
|
| 28 |
-
}
|
| 29 |
-
</style>
|
| 30 |
-
""", unsafe_allow_html=True)
|
| 31 |
|
| 32 |
-
st.sidebar.title("
|
| 33 |
-
st.sidebar.markdown("
|
| 34 |
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
"๐ Introduction to KNN",
|
| 41 |
-
"โ๏ธ How KNN Works",
|
| 42 |
"๐ฏ Underfitting vs Overfitting",
|
| 43 |
-
"๐ Cross-Validation",
|
| 44 |
"๐ ๏ธ Hyperparameter Tuning",
|
| 45 |
"โ๏ธ Feature Scaling",
|
| 46 |
"๐งฎ Weighted KNN",
|
| 47 |
-
"๐บ๏ธ Decision
|
| 48 |
-
|
|
|
|
| 49 |
)
|
| 50 |
|
| 51 |
-
if
|
| 52 |
-
st.
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
|
| 57 |
-
โ
|
| 58 |
-
|
| 59 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 60 |
""")
|
| 61 |
|
| 62 |
-
elif
|
| 63 |
-
st.
|
| 64 |
-
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
2. Measure distance (Euclidean, Manhattan, etc.) to all training points
|
| 68 |
-
3. Pick `K` nearest ones
|
| 69 |
-
4. ๐ Classification โ Majority vote
|
| 70 |
-
๐ Regression โ Average/weighted average
|
| 71 |
""")
|
| 72 |
|
| 73 |
-
elif
|
| 74 |
-
st.
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
|
| 78 |
-
โ
|
|
|
|
| 79 |
""")
|
| 80 |
|
| 81 |
-
elif
|
| 82 |
-
st.
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
""")
|
| 88 |
|
| 89 |
-
elif
|
| 90 |
-
st.
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
-
|
| 94 |
-
|
| 95 |
-
- `metric`: Distance type (e.g., Euclidean)
|
| 96 |
-
๐ง Use Grid Search, Random Search, or Optuna for optimization
|
| 97 |
""")
|
| 98 |
|
| 99 |
-
elif
|
| 100 |
-
st.
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
-
|
| 104 |
-
- ๐ป **Normalization**: Rescales between 0 and 1
|
| 105 |
-
โ Always scale after train-test split to avoid data leakage
|
| 106 |
""")
|
| 107 |
|
| 108 |
-
elif
|
| 109 |
-
st.
|
| 110 |
-
|
| 111 |
-
|
| 112 |
-
-
|
| 113 |
-
- Lower influence from distant points
|
| 114 |
-
๐ Improves performance when neighbor relevance varies
|
| 115 |
""")
|
| 116 |
|
| 117 |
-
elif
|
| 118 |
-
st.
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
-
|
| 122 |
-
- `k > 1` โ Smoother, more general
|
| 123 |
-
๐ Helps interpret model behavior in 2D/3D space
|
| 124 |
""")
|
| 125 |
|
|
|
|
| 126 |
st.markdown("""
|
| 127 |
-
<
|
| 128 |
-
|
| 129 |
-
|
| 130 |
""", unsafe_allow_html=True)
|
| 131 |
|
| 132 |
-
st.success("
|
|
|
|
"""Streamlit page: interactive K-Nearest Neighbors (KNN) tutorial.

A single linear script (Streamlit pages are plain top-to-bottom scripts):
configures the page, styles a header, offers a sidebar intro, then lets the
user pick one KNN concept via a radio control and renders the matching
markdown explanation. Ends with a Colab link and a closing success banner.

NOTE(review): every emoji in the original was mojibake'd by the page scrape
this file was recovered from (e.g. "๐ค" is a mis-decoded UTF-8 emoji).
The garbled strings are preserved byte-for-byte so the radio labels and the
if/elif comparisons stay consistent — TODO: restore the intended emoji from
the upstream repository.
"""

import streamlit as st

# Must be the first Streamlit call on the page.
st.set_page_config(page_title="KNN", page_icon="๐ค", layout="wide")

st.markdown(
    "<h1 style='text-align: center; color: #FF4C60;'>๐ K-Nearest Neighbors (KNN) Algorithm</h1>",
    unsafe_allow_html=True,
)

st.sidebar.title("๐ค KNN App")
st.sidebar.markdown("Explore KNN concepts step-by-step using the sections below.")

# Section selector. The labels below are compared verbatim in the
# if/elif chain further down, so they must match character-for-character.
option = st.radio(
    "Select a concept to learn:",
    (
        "๐ What is KNN?",
        "โ๏ธ How Does KNN Work?",
        "๐ฏ Underfitting vs Overfitting",
        "๐ Training vs Cross-Validation Error",
        "๐ ๏ธ Hyperparameter Tuning",
        "โ๏ธ Feature Scaling",
        "๐งฎ Weighted KNN",
        "๐บ๏ธ Decision Regions",
        "๐ Cross-Validation Explained"
    )
)

# Render the explanation that matches the selected label.
if option == "๐ What is KNN?":
    st.write("""
    K-Nearest Neighbors (KNN) is a **non-parametric**, **lazy learning** algorithm used for both classification and regression.

    โ It stores all training data instead of learning a function.
    โ It uses distance metrics (e.g., Euclidean, Manhattan) to make predictions.
    โ Suitable for small to moderately sized datasets.
    """)

elif option == "โ๏ธ How Does KNN Work?":
    st.write("""
    **Training Phase:**
    - No actual training occurs. KNN memorizes the training dataset.

    **Prediction Phase (Classification):**
    1. Choose a value of **K**
    2. Calculate distances from the new point to all others
    3. Pick **K closest** points
    4. Use majority vote to classify

    **Prediction Phase (Regression):**
    - Average the values of the K nearest neighbors.
    """)

elif option == "๐ฏ Underfitting vs Overfitting":
    st.write("""
    - **Overfitting**: The model is too specific to the training data. Poor on unseen data.
    - **Underfitting**: The model is too simple. Poor even on training data.
    - **Ideal Model**: A balance that performs well on both seen and unseen data.
    """)

elif option == "๐ Training vs Cross-Validation Error":
    st.write("""
    - **Training Error** is the error on the known training data.
    - **Cross-Validation Error** is from unseen validation data.

    โ Use cross-validation to pick the best value of `K`.
    ๐ Big gap = Overfitting; Both high = Underfitting.
    """)

elif option == "๐ ๏ธ Hyperparameter Tuning":
    st.write("""
    - **K**: Number of neighbors โ test multiple values.
    - **Weights**: Equal (`uniform`) or based on distance (`distance`).
    - **Metric**: How distance is measured (Euclidean, Manhattan).
    - Use **Grid Search**, **Random Search**, or **Optuna** for best tuning.
    """)

elif option == "โ๏ธ Feature Scaling":
    st.write("""
    KNN uses distances โ so features must be on the same scale.
    - **Normalization** scales data between 0 and 1.
    - **Standardization** centers data around mean 0.
    โ ๏ธ Always scale data after splitting to avoid leakage.
    """)

elif option == "๐งฎ Weighted KNN":
    st.write("""
    Weighted KNN assigns higher importance to closer neighbors.
    - Use `weights='distance'` to apply this logic in libraries like scikit-learn.
    - Helps in noisy datasets or when closer points are more meaningful.
    """)

elif option == "๐บ๏ธ Decision Regions":
    st.write("""
    - Small `k` values create complex, wiggly decision boundaries (overfitting).
    - Larger `k` smooths the boundary (better generalization).
    - Visualizing decision regions helps understand the algorithmโs behavior.
    """)

elif option == "๐ Cross-Validation Explained":
    st.write("""
    - **K-Fold Cross-Validation** splits data into `K` parts.
    - The model trains on K-1 parts and tests on the remaining part.
    - Helps evaluate model stability and avoid overfitting.
    """)

# Footer: external notebook link (rendered as raw HTML) and a closing banner.
st.markdown("<h2 style='color: #58a6ff;'>๐ Try KNN in Colab:</h2>", unsafe_allow_html=True)
st.markdown("""
<a href='https://colab.research.google.com/drive/11wk6wt7sZImXhTqzYrre3ic4oj3KFC4M?usp=sharing' target='_blank'>
๐ Open Jupyter Notebook on Colab
</a>
""", unsafe_allow_html=True)

st.success("KNN is easy to understand and surprisingly powerful! Tune it well, scale your data, and validate your model to get the best results.")