Harika22 committed on
Commit 80e18d9 · verified · 1 Parent(s): 8b1a429

Update pages/9_KNN.py

Files changed (1)
  1. pages/9_KNN.py +91 -82
pages/9_KNN.py CHANGED
@@ -1,123 +1,132 @@
  import streamlit as st
 
- st.set_page_config(page_title="KNN Explained", page_icon="🤖", layout="wide")
 
  st.markdown("""
  <style>
  .stApp {
-     background-image: linear-gradient(120deg, #232526, #414345);
-     color: #f8f8f2;
  }
- h1, h2, h3, h4 {
-     color: #ff79c6;
  }
  .sidebar .sidebar-content {
-     background-color: #2c3e50;
  }
  .block-container {
      padding-top: 2rem;
      padding-bottom: 2rem;
  }
  a {
-     color: #8be9fd;
      text-decoration: none;
  }
  a:hover {
-     color: #50fa7b;
  }
  </style>
  """, unsafe_allow_html=True)
 
- st.sidebar.title("🤖 KNN Explorer")
- st.sidebar.markdown("Discover the K-Nearest Neighbors algorithm step-by-step.")
-
- st.markdown("""
- <h1 style='text-align: center;'>🔍 K-Nearest Neighbors (KNN) Simplified</h1>
- """, unsafe_allow_html=True)
-
- with st.expander("📘 What is KNN?"):
-     st.write("""
-     K-Nearest Neighbors (KNN) is a **simple**, **intuitive**, and **non-parametric** algorithm used in classification and regression.
-     It makes predictions based on the majority class or average of the `K` closest training samples.
-
-     ✅ No training phase required — just store the data.
-     ✅ Uses distance-based similarity (e.g., Euclidean).
-     ✅ Effective for well-separated and small datasets.
      """)
 
- with st.expander("⚙️ How Does KNN Work?"):
-     st.write("""
-     **Training Phase:**
-     - KNN does not train in the traditional sense. It memorizes the training set.
-
-     **Prediction Phase:**
-     1. Choose a value of `K`
-     2. Calculate distance between test point and all training points
-     3. Pick `K` nearest neighbors
-     4. Predict class (majority vote) or value (average)
      """)
-
- with st.expander("🎯 Underfitting vs Overfitting"):
-     st.write("""
-     - **Overfitting**: Very low training error but poor generalization. Happens with low `K` (e.g., `k=1`).
-     - **Underfitting**: Model is too simple to learn any patterns (e.g., very high `K`).
-     - **Sweet Spot**: Use **cross-validation** to pick the best `K` that balances both.
-     """)
-
- with st.expander("📉 Training vs Cross-Validation Error"):
-     st.write("""
-     - **Training Error**: How well the model does on the data it has seen.
-     - **CV Error**: Performance on unseen data using validation.
 
-     ⚖️ Aim for **low CV error** for best generalization.
      """)
 
- with st.expander("🛠️ Hyperparameter Tuning"):
-     st.write("""
-     - `k`: Number of neighbors (e.g., 3, 5, 7)
-     - `weights`: 'uniform' or 'distance' (weight closer neighbors more)
-     - `metric`: Distance function (Euclidean, Manhattan, etc.)
-
-     🔍 Use **Grid Search**, **Randomized Search**, or **Bayesian Optimization** to tune these.
      """)
 
- with st.expander("⚖️ Why Scaling Matters"):
-     st.write("""
-     KNN is based on distance — so features on different scales can skew results.
-
-     ✅ Use **StandardScaler** (Z-score) or **MinMaxScaler** for preprocessing.
-     ⚠️ Always scale **after splitting** the data.
      """)
 
- with st.expander("📏 Weighted KNN"):
-     st.write("""
-     - Weighted KNN assigns more importance to closer neighbors.
-     - It’s useful when closer points are more likely to belong to the same class.
-     - Just use `weights='distance'` in most libraries like scikit-learn.
      """)
 
-
- with st.expander("🗺️ Decision Boundaries"):
-     st.write("""
-     - `k=1`: Sharp, complex boundaries — can lead to overfitting.
-     - Larger `k`: Smoother boundaries — better generalization.
-     - Visualize using 2D plots to understand how `K` affects predictions.
      """)
 
- with st.expander("🔁 What is Cross-Validation?"):
-     st.write("""
-     - **K-Fold Cross-Validation** splits data into `K` parts (folds).
-     - Train on `K-1` folds, test on the remaining.
-     - Repeat `K` times and average results.
-
-     ✅ Helps prevent overfitting and guides hyperparameter selection.
      """)
 
  st.markdown("""
- <h2 style='color: #8be9fd;'>📓 Try KNN in Action:</h2>
- <a href='https://colab.research.google.com/drive/11wk6wt7sZImXhTqzYrre3ic4oj3KFC4M?usp=sharing' target='_blank'>
- 🚀 Open Colab Notebook
- </a>
  """, unsafe_allow_html=True)
 
- st.success("KNN is simple yet powerful. Use scaling, choose the right K, and always validate your results!")
  import streamlit as st
 
+ st.set_page_config(page_title="KNN Visual Guide", page_icon="📊", layout="wide")
 
  st.markdown("""
  <style>
  .stApp {
+     background: linear-gradient(to right, #141E30, #243B55);
+     color: white;
+     font-family: 'Segoe UI', sans-serif;
  }
+ h1, h2, h3 {
+     color: #00CED1;
  }
  .sidebar .sidebar-content {
+     background-color: #1e1e1e;
  }
  .block-container {
      padding-top: 2rem;
      padding-bottom: 2rem;
  }
  a {
+     color: #00BFFF;
      text-decoration: none;
  }
  a:hover {
+     color: #1E90FF;
  }
  </style>
  """, unsafe_allow_html=True)
 
+ st.sidebar.title("📊 KNN Visual Guide")
+ st.sidebar.markdown("Dive into KNN concepts interactively!")
+
+ st.markdown("<h1 style='text-align: center;'>🧭 K-Nearest Neighbors (KNN) Explorer</h1>", unsafe_allow_html=True)
+
+ section = st.radio(
+     "Choose a KNN Concept to Explore:",
+     [
+         "📘 Introduction to KNN",
+         "⚙️ How KNN Works",
+         "🎯 Underfitting vs Overfitting",
+         "📉 Cross-Validation",
+         "🛠️ Hyperparameter Tuning",
+         "⚖️ Feature Scaling",
+         "🧮 Weighted KNN",
+         "🗺️ Decision Boundaries"
+     ]
+ )
+
+ if section == "📘 Introduction to KNN":
+     st.subheader("📘 What is KNN?")
+     st.markdown("""
+     KNN stands for **K-Nearest Neighbors**, a simple and powerful algorithm used for:
+     - 📍 **Classification**: Predicting a category
+     - 🔢 **Regression**: Predicting a continuous value
+     ✅ Lazy Learning → No training phase, just memorization
+     ✅ Based on **distance** to nearest neighbors
+     ✅ Works well with clean and scaled data
      """)
 
+ elif section == "⚙️ How KNN Works":
+     st.subheader("⚙️ How Does KNN Work?")
+     st.markdown("""
+     🚀 **Step-by-step** process:
+     1. Pick a value for `K`
+     2. Measure distance (Euclidean, Manhattan, etc.) to all training points
+     3. Pick `K` nearest ones
+     4. 📊 Classification → Majority vote
+        📈 Regression → Average/weighted average
 
      """)
 
+ elif section == "🎯 Underfitting vs Overfitting":
+     st.subheader("🎯 Underfitting vs Overfitting")
+     st.markdown("""
+     🔍 **Overfitting**: Model memorizes data — poor on new data
+     🔍 **Underfitting**: Model too simple — misses patterns
+     ✅ **Best Fit**: Balance both using cross-validation
      """)
 
+ elif section == "📉 Cross-Validation":
+     st.subheader("📉 Training vs Cross-Validation")
+     st.markdown("""
+     🧪 **Training Error**: Error on known data
+     🔄 **Cross-Validation Error**: Error on unseen data
+     🎯 Choose K where both errors are low → best generalization
 
      """)
 
+ elif section == "🛠️ Hyperparameter Tuning":
+     st.subheader("🛠️ Tuning KNN")
+     st.markdown("""
+     🔧 Main Parameters:
+     - `k`: Number of neighbors
+     - `weights`: All equal or weighted by distance
+     - `metric`: Distance type (e.g., Euclidean)
+     🧠 Use Grid Search, Random Search, or Optuna for optimization
      """)
 
+ elif section == "⚖️ Feature Scaling":
+     st.subheader("⚖️ Why Scale Your Features?")
+     st.markdown("""
+     KNN relies on distance — so features must be on the same scale:
+     - 🔢 **Standardization**: Mean = 0, SD = 1
+     - 🔻 **Normalization**: Rescales between 0 and 1
+     ❗ Always scale after train-test split to avoid data leakage
      """)
 
+ elif section == "🧮 Weighted KNN":
+     st.subheader("🧮 Weighted KNN")
+     st.markdown("""
+     Instead of equal votes, **Weighted KNN** gives:
+     - Higher weight to nearer neighbors
+     - Lower influence from distant points
+     📌 Improves performance when neighbor relevance varies
      """)
 
+ elif section == "🗺️ Decision Boundaries":
+     st.subheader("🗺️ Decision Regions")
+     st.markdown("""
+     Visuals of how KNN separates classes:
+     - `k=1` → Very sensitive, sharp boundaries → Overfitting
+     - `k > 1` → Smoother, more general
+     📊 Helps interpret model behavior in 2D/3D space
      """)
 
  st.markdown("""
+ <hr style='border: 1px solid #555;'>
+ <h4 style='color: #00CED1;'>🔗 Try it Yourself in Colab:</h4>
+ <a href='https://colab.research.google.com/drive/11wk6wt7sZImXhTqzYrre3ic4oj3KFC4M?usp=sharing' target='_blank'>Open Interactive Notebook</a>
 
  """, unsafe_allow_html=True)
 
+ st.success("Explore, visualize, and understand how KNN works like never before! 🚀")
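
The prediction steps the page describes (pick `K`, measure distances, take the `K` nearest points, vote) can be sketched in a few lines of plain Python. This example is not part of the commit or the linked Colab notebook; the `knn_predict` helper, its `weighted` flag (mimicking scikit-learn's `weights='distance'` behavior), and the toy data are all hypothetical illustrations:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3, weighted=False):
    """Classify `query` by majority (or distance-weighted) vote of its
    k nearest training points, using Euclidean distance."""
    # Step 2: distance from the query to every training point
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Step 3: keep the k nearest neighbors
    neighbors = sorted(dists, key=lambda d: d[0])[:k]
    # Step 4: vote; when weighted, closer neighbors count for more
    votes = Counter()
    for dist, label in neighbors:
        votes[label] += 1.0 / (dist + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]

# Two well-separated 2D clusters, as in the "small, well-separated data" case
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, (1.5, 1.5), k=3))                 # A
print(knn_predict(train_X, train_y, (8.5, 8.5), k=3, weighted=True))  # B
```

Note that there is no training step at all, which is exactly the "lazy learning" property both versions of the page emphasize; all work happens at prediction time.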