| import streamlit as st | |
| st.set_page_config(page_title="Random Forest", page_icon="🌲", layout="wide") | |
| # Title | |
| st.markdown("<h1 style='text-align:center;'>🌲 Random Forest Algorithm</h1>", unsafe_allow_html=True) | |
| # Introduction | |
| st.header("What is Random Forest?") | |
| st.markdown(""" | |
| Random Forest is a **supervised learning algorithm** used for both **classification** and **regression** tasks. | |
| It works by building a large number of **decision trees** and combining their results to make a final prediction. | |
| Think of it like a group of people (trees) voting on an answer: combining many varied opinions **reduces overfitting** and usually improves accuracy. | |
| """) | |
| # Real-world Uses | |
| st.header("🎯 Where is Random Forest Used?") | |
| st.markdown(""" | |
| - 🌾 Crop disease detection | |
| - Stock market predictions | |
| - 🏥 Medical diagnosis | |
| - 💳 Fraud detection | |
| - 👥 Customer churn prediction | |
| """) | |
| # How it works | |
| st.header("How Random Forest Works") | |
| with st.expander("Step 1: Bootstrapping (Sampling with Replacement)"): | |
|     st.markdown(""" | |
| - Draw multiple **bootstrap samples** from the training set: random samples taken with replacement, so a row can appear more than once in a sample. | |
| - Each sample is used to train a **separate decision tree**. | |
|     """) | |
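To make the bootstrapping step concrete, here is a minimal standard-library sketch. The helper name `bootstrap_sample`, the ten-row stand-in dataset, and the choice of three trees are illustrative assumptions, not part of the app.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement (hypothetical helper)."""
    return rng.choices(rows, k=len(rows))


rng = random.Random(0)   # fixed seed so the sketch is repeatable
rows = list(range(10))   # stand-in for the indices of 10 training rows

# One bootstrap sample per tree (3 trees assumed for illustration).
samples = [bootstrap_sample(rows, rng) for _ in range(3)]

for i, sample in enumerate(samples):
    # Rows never drawn for a tree are "out of bag" for that tree.
    out_of_bag = sorted(set(rows) - set(sample))
    print(f"tree {i}: sample={sample} out_of_bag={out_of_bag}")
```

Because sampling is with replacement, a sample usually repeats some rows and omits others; the omitted rows are what is commonly called the out-of-bag set for that tree.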
| with st.expander("Step 2: Tree Building"): | |
|     st.markdown(""" | |
| - While growing each decision tree, consider only a **random subset of features** at every split. | |
| - This randomness produces **diverse trees** whose errors are less correlated. | |
|     """) | |
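The per-split feature sampling can be sketched as follows. The helper `candidate_features`, the feature names `f0`..`f8`, and the square-root rule for the subset size are assumptions chosen for illustration; a real implementation would then search only these candidates for the best split.

```python
import math
import random


def candidate_features(feature_names, rng):
    """Pick roughly sqrt(n) distinct features to consider at one split."""
    k = max(1, int(math.sqrt(len(feature_names))))
    return rng.sample(feature_names, k)  # sampling without replacement


rng = random.Random(42)
features = [f"f{i}" for i in range(9)]  # hypothetical feature names

# Every split draws its own subset, so different splits see different features.
for split in range(3):
    print(f"split {split}: {candidate_features(features, rng)}")
```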
| with st.expander("Step 3: Aggregating Results"): | |
|     st.markdown(""" | |
| - **Classification**: Majority voting (the most common class wins) | |
| - **Regression**: Average of all tree predictions | |
|     """) | |
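The two aggregation rules can be sketched with the standard library alone. The function names and the sample predictions below are made up for illustration.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Return the most common class label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


def average_prediction(predictions):
    """Return the mean of the per-tree numeric predictions."""
    return mean(predictions)


# Classification: five hypothetical trees vote on a label.
class_votes = ["spam", "ham", "spam", "spam", "ham"]
print(majority_vote(class_votes))        # spam

# Regression: three hypothetical trees each output a number.
tree_outputs = [2.0, 3.0, 4.0]
print(average_prediction(tree_outputs))  # 3.0
```

Hard voting is the textbook rule. Some libraries, scikit-learn among them, instead average the predicted class probabilities across trees and pick the class with the highest mean.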
| # Visual Illustration | |
| st.header("🌳 Visual Intuition") | |
| st.image("https://upload.wikimedia.org/wikipedia/commons/7/76/Random_forest_diagram_complete.png", caption="Random Forest Structure", use_container_width=True) | |
| # Advantages | |
| st.header("Why Use Random Forest?") | |
| st.markdown(""" | |
| - Handles both classification and regression | |
| - Reduces overfitting compared to a single decision tree | |
| - Works well with large datasets and high-dimensional data | |
| - Fairly robust to outliers and noisy features; missing values may still need imputation, depending on the implementation | |
| """) | |
| # Hyperparameters | |
| st.header("🛠️ Key Hyperparameters") | |
| with st.expander("🌲 n_estimators"): | |
|     st.markdown("Number of trees in the forest. More trees generally give more stable predictions, with diminishing returns and a higher computation cost.") | |
| with st.expander("🧮 max_depth"): | |
|     st.markdown("Maximum depth of each tree. Shallower trees are less prone to overfitting.") | |
| with st.expander("max_features"): | |
|     st.markdown("Number of features to consider at each split (for example `sqrt`, `log2`, an integer, or a fraction). The older `auto` option has been removed from recent scikit-learn releases.") | |
| with st.expander("min_samples_split"): | |
|     st.markdown("Minimum number of samples required to split a node.") | |
| with st.expander("🎯 criterion"): | |
|     st.markdown("Function to measure the quality of a split (`gini` or `entropy` for classification).") | |
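As a sketch of how these hyperparameters are passed in practice, the example below assumes scikit-learn is installed and uses its `RandomForestClassifier`. The synthetic dataset and the specific values are illustrative assumptions, not tuned recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only so the sketch is self-contained.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(
    n_estimators=100,      # number of trees in the forest
    max_depth=5,           # cap on the depth of each tree
    max_features="sqrt",   # features considered at each split
    min_samples_split=2,   # minimum samples needed to split a node
    criterion="gini",      # split-quality measure
    random_state=0,        # fixed seed for repeatable results
)
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

A common way to choose these values is cross-validated search over a small grid, rather than fixing them by hand as done here.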
| # Evaluation Metrics | |
| st.header("Evaluation Metrics") | |
| with st.expander("Accuracy"): | |
|     st.latex(r"Accuracy = \frac{TP + TN}{TP + TN + FP + FN}") | |
|     st.markdown("How often the model was correct overall.") | |
| with st.expander("🎯 Precision"): | |
|     st.latex(r"Precision = \frac{TP}{TP + FP}") | |
|     st.markdown("Out of all predicted positives, how many were actually positive?") | |
| with st.expander("Recall"): | |
|     st.latex(r"Recall = \frac{TP}{TP + FN}") | |
|     st.markdown("Out of all actual positives, how many did we correctly identify?") | |
| with st.expander("F1 Score"): | |
|     st.latex(r"F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}") | |
|     st.markdown("Harmonic mean of precision and recall.") | |
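The four formulas above can be checked with a small worked example. The labels and predictions below are made up, and the helper names are illustrative; the zero-denominator fallback of 0.0 is one common convention, not the only one.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Apply the accuracy, precision, recall, and F1 formulas."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 3 1 1
metrics = classification_metrics(tp, tn, fp, fn)
print(metrics)         # every metric equals 0.75 for this data
```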
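| with st.expander("ROC-AUC"): | |
|     st.markdown("Area under the ROC curve, which plots the true positive rate against the false positive rate across all decision thresholds. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.") | |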
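One way to build intuition for ROC-AUC is its ranking interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by brute force over all pairs; the function name and the sample scores are assumptions for illustration, and this quadratic loop is not how libraries compute it at scale.

```python
def roc_auc(y_true, scores):
    """Pairwise ROC-AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical predicted probabilities
auc = roc_auc(y_true, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```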
| # Summary | |
| st.header("Summary") | |
| st.markdown(""" | |
| - Random Forest is an **ensemble** of decision trees | |
| - Uses **bagging** and **feature randomness** to create a robust model | |
| - Great for **accuracy**, **stability**, and **generalization** | |
| - Reduces overfitting compared with a single tree; missing-data handling depends on the implementation | |
| - Best when you need strong baseline performance with minimal tuning | |
| """) | |
| st.success("Now you've got a solid grip on Random Forest!") | |