Zero_To_Hero_ML / pages /6.ML_Algorithms.py
UmaKumpatla's picture
Update pages/6.ML_Algorithms.py
fc3a9da verified
import streamlit as st
st.subheader("🔍 1. K-Nearest Neighbors (KNN)")
st.write("KNN is a simple, instance-based learning algorithm. It doesn't learn during training—just stores the data. When it needs to make a prediction, it finds the K closest neighbors (based on distance) and votes to decide the output.")
st.subheader("What is it?")
st.write("KNN is a supervised machine learning algorithm that’s mainly used for classification, but it can handle regression tasks too. Think of it as a decision-making system that looks at its closest “friends” before choosing what to do.")
st.write("**🧠 How Does KNN Work?**")
st.write("No Training Required")
st.write("KNN doesn’t actually `learn` anything during training—it just stores the data. That’s why it’s called a **lazy learner**.")
st.write("**Making Predictions**")
st.write("**When you ask KNN to predict something, it:**")
st.write("• Measures how close each point in the dataset is to the new input")
st.write("• Finds the **k closest neighbors**")
st.write("• `For classification`: It checks which class appears most frequently among those neighbors")
st.write("• `For regression`: It takes the average of their values")
st.write("**📏 Measuring Closeness**")
st.write("• KNN uses distance metrics to figure out who’s “close.” Popular ones include:")
st.write("• `Euclidean Distance` (like measuring with a ruler)")
st.write("•`Manhattan Distance` (grid-style, like walking city blocks)")
st.write("• `Minkowski Distance` (a general form that includes both above)")
st.write("• `Cosine Similarity` (great for text and direction-based comparison)")
st.write("**⚙️ Key Settings in KNN")
st.write("• `k` (Number of Neighbors)")
st.write("• `Small k` → More sensitive, possibly overfits")
st.write("• `Large k` → Smoother results, risk of underfitting")
st.write("**✅ Why Use KNN?**")
st.write("• It’s super intuitive—no complex math during training")
st.write("• You can start using it quickly")
st.write("• It works great on small, well-prepared datasets")
st.write("**❌ Downsides of KNN**")
st.write("• Slow predictions on large datasets")
st.write("• Needs feature scaling (like normalization) to perform well")
st.write("• Doesn’t handle high-dimensional data very well")
st.write("**📦 Real-World Use Cases**")
st.write("• Product recommendation systems")
st.write("• Image classification (e.g., handwritten digits)")
st.write("• Anomaly detection (like spotting fraud)")
st.write("Text classification (emails, reviews, etc.)")
st.markdown("📌 Best for small datasets and easy interpretability.")
st.markdown("⚠️ Needs scaling & can be slow on large datasets.")
# Highlighted page link styled as a button
st.markdown("""
<a href="https://huggingface.co/spaces/UmaKumpatla/KNN" target="_blank">
<div style="display:inline-block; background-color:#4CAF50; color:white; padding:10px 20px; border-radius:8px; text-decoration:none; font-weight:bold;">
👉 Lets get into the KNN
</div>
</a>
""", unsafe_allow_html=True)
st.subheader("🌳 2. Decision Tree")
st.write("A Decision Tree splits data into branches based on feature values, like a flowchart. Each decision node asks a question, and the branches represent the outcomes.")
st.markdown("📌 Great for interpretability and visual understanding.")
st.markdown("⚠️ Can easily overfit if not controlled (pruning, max depth, etc.).")
# Highlighted page link styled as a button
st.markdown("""
<a href="https://huggingface.co/spaces/UmaKumpatla/DecisionTree" target="_blank">
<div style="display:inline-block; background-color:#4CAF50; color:white; padding:10px 20px; border-radius:8px; text-decoration:none; font-weight:bold;">
👉 Lets get into the DecisionTree
</div>
</a>
""", unsafe_allow_html=True)
st.subheader("🧠 3. Ensemble Methods (Random Forest, Gradient Boosting, etc.)")
st.markdown("Ensembles combine the predictions of multiple models (often trees) to make better decisions.")
st.markdown("Random Forest averages results of many decision trees.")
st.markdown("Gradient Boosting builds trees one by one to fix previous errors.")
st.markdown("📌 High accuracy, robust to overfitting.")
st.markdown("⚠️ Slower training, harder to interpret.")
# Highlighted page link styled as a button
st.markdown("""
<a href="https://huggingface.co/spaces/UmaKumpatla/Ensemble_Techniques" target="_blank">
<div style="display:inline-block; background-color:#4CAF50; color:white; padding:10px 20px; border-radius:8px; text-decoration:none; font-weight:bold;">
👉 Lets get into the Ensemble Techniques
</div>
</a>
""", unsafe_allow_html=True)
st.subheader("➕ 4. Logistic Regression")
st.markdown("Despite the name, Logistic Regression is used for classification. It uses a sigmoid function to predict the probability of a class label (binary or multiclass).")
st.markdown("📌 Fast, works well when the relationship is linear between features and outcome.")
st.markdown("⚠️ Assumes linear boundaries, may underperform on complex data.")
# Highlighted page link styled as a button
st.markdown("""
<a href="https://huggingface.co/spaces/UmaKumpatla/Logistic_Regression" target="_blank">
<div style="display:inline-block; background-color:#4CAF50; color:white; padding:10px 20px; border-radius:8px; text-decoration:none; font-weight:bold;">
👉 Lets get into the Logistic Regression
</div>
</a>
""", unsafe_allow_html=True)
st.subheader("📈 5. Linear Regression")
st.markdown("Linear Regression models the relationship between features and a continuous target variable by fitting a straight line (or hyperplane) that minimizes error.")
st.markdown("📌 Simple and fast; great for trend analysis and predictions.")
st.markdown("⚠️ Assumes linearity, sensitive to outliers.")
# Highlighted page link styled as a button
st.markdown("""
<a href="https://huggingface.co/spaces/UmaKumpatla/Linear_Regression" target="_blank">
<div style="display:inline-block; background-color:#4CAF50; color:white; padding:10px 20px; border-radius:8px; text-decoration:none; font-weight:bold;">
👉 Lets get into the Linear Regression
</div>
</a>
""", unsafe_allow_html=True)
st.subheader("🧱 6. Support Vector Machine (SVM)")
st.markdown("SVM finds the best boundary (hyperplane) that separates classes by maximizing the margin between them. It can also use kernels to handle non-linear separation.")
st.markdown("📌 Very powerful on small to medium datasets with clear margins.")
st.markdown("⚠️ Not ideal for large datasets, sensitive to parameter tuning.")
# Highlighted page link styled as a button
st.markdown("""
<a href="https://huggingface.co/spaces/UmaKumpatla/Support_Vector_Machine-SVM" target="_blank">
<div style="display:inline-block; background-color:#4CAF50; color:white; padding:10px 20px; border-radius:8px; text-decoration:none; font-weight:bold;">
👉 Lets get into the SVM
</div>
</a>
""", unsafe_allow_html=True)