| {% extends "layout.html" %} | |
| {% block content %} | |
| <script src="https://cdn.tailwindcss.com"></script> | |
| <div class="container mx-auto px-4 py-8"> | |
| <h1 class="text-3xl font-bold text-center text-gray-800 mb-6">🌲 Random Forest Regressor 🌲</h1> | |
| <p class="text-gray-600 mb-4 max-w-xl mx-auto text-center"> | |
| The Random Forest Regressor is an ensemble learning method: a forest of decision trees working together. | |
| Instead of relying on a single tree, which can easily overfit its training data, it averages the predictions of many trees. This lets it capture complex, non-linear patterns | |
| in your data that a simple linear model would miss. | |
| </p> | |
| <div class="mt-8 p-6 bg-blue-50 rounded-lg shadow-inner border border-blue-200 max-w-xl mx-auto text-center"> | |
| <h2 class="text-2xl font-bold text-blue-800 mb-4">✨ Try It Yourself! ✨</h2> | |
| <p class="text-blue-700 mb-6"> | |
| Ready to see the Random Forest Regressor in action? Click the button below to predict an exam score! | |
| </p> | |
| <a href="/prediction_flow" | |
| class="inline-block bg-blue-600 text-white px-8 py-3 hover:bg-blue-700 transition font-semibold text-lg shadow-md no-underline"> | |
| Predict a Score Now! | |
| </a> | |
| {% if prediction is defined %} | |
| <div class="mt-6 bg-blue-100 border border-blue-300 rounded-lg p-4 text-center"> | |
| <h3 class="text-xl font-bold text-blue-800">Predicted Score:</h3> | |
| <p class="text-4xl font-extrabold text-blue-900 mt-2">{{ prediction }}</p> | |
| <p class="text-sm text-gray-700 mt-1">Based on {{ hours }} hours of study</p> | |
| </div> | |
| {% elif error %} | |
| <p class="text-red-600 mt-4 text-center font-semibold">{{ error }}</p> | |
| {% endif %} | |
| </div> | |
| <div class="mt-10 grid gap-8 md:grid-cols-2"> | |
| <div class="bg-white p-6 rounded-lg shadow-md border border-gray-200"> | |
| <h2 class="text-xl font-semibold text-gray-700 mb-4 flex items-center"> | |
| Sample Training Data | |
| </h2> | |
| <p class="text-gray-600 text-sm mb-4"> | |
| Here's a simple dataset we could use to train our Random Forest Regressor, predicting exam scores based on hours studied: | |
| </p> | |
| <div class="overflow-x-auto"> | |
| <table class="min-w-full divide-y divide-gray-200"> | |
| <thead class="bg-gray-100"> | |
| <tr> | |
| <th scope="col" class="px-4 py-2 text-left text-xs font-medium text-gray-700 uppercase tracking-wider"> | |
| Hours Studied (X) | |
| </th> | |
| <th scope="col" class="px-4 py-2 text-left text-xs font-medium text-gray-700 uppercase tracking-wider"> | |
| Exam Score (y) | |
| </th> | |
| </tr> | |
| </thead> | |
| <tbody class="bg-white divide-y divide-gray-200"> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">1</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">35</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">2</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">45</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">3</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">55</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">4</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">65</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">5</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">75</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">6</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">80</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">7</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">82</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">8</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">88</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">9</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">92</td></tr> | |
| <tr><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">10</td><td class="px-4 py-2 whitespace-nowrap text-sm text-gray-900">95</td></tr> | |
| </tbody> | |
| </table> | |
| </div> | |
| </div> | |
| <div class="bg-white p-6 rounded-lg shadow-md border border-gray-200"> | |
| <h2 class="text-xl font-semibold text-gray-700 mb-4 flex items-center"> | |
| <span class="mr-2 text-blue-500">🧠</span> How Random Forest Regression Works | |
| </h2> | |
| <p class="text-gray-600 text-sm mb-4"> | |
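| A Random Forest is built from many individual decision trees. For regression, it predicts by averaging the outputs of these trees. | |
| Because each tree sees slightly different data, their individual errors tend to cancel out when averaged, which makes the forest less prone to overfitting than a single deep tree. | |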
| </p> | |
| <ul class="list-disc pl-6 text-sm mt-2 text-gray-600"> | |
| <li> | |
| Building Trees with Bagging (Bootstrap Aggregating): | |
| Each tree is trained on its own bootstrap sample: rows drawn at random from your original data with replacement, so some rows repeat and others are left out. This gives every tree a slightly different training set. | |
| </li> | |
| <li> | |
| Feature Randomness (Random Subspace): | |
| When a tree makes a split, it only considers a random subset of the available features. This ensures no single feature dominates all trees. | |
| </li> | |
| </ul> | |
| </div> | |
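<div class="bg-white p-6 rounded-lg shadow-md border border-gray-200 md:col-span-2">
<p class="text-gray-600 text-sm mb-4">
The two sources of randomness described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code behind this page: the helper names <code>bootstrap_sample</code> and <code>random_feature_subset</code> are made up for the sketch, and the extra feature names are hypothetical, since the sample data has a single feature.
</p>

```python
import random

# Sample training data from the table on this page: (hours studied, exam score).
DATA = [(1, 35), (2, 45), (3, 55), (4, 65), (5, 75),
        (6, 80), (7, 82), (8, 88), (9, 92), (10, 95)]


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random, with replacement (one bag per tree)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(features, k, rng):
    """Pick k distinct features to consider at a single split."""
    return rng.sample(features, k)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
bags = [bootstrap_sample(DATA, rng) for _ in range(3)]  # one bag per tree
for i, bag in enumerate(bags, start=1):
    print(f"tree {i} trains on hours:", sorted(h for h, _ in bag))

# Hypothetical feature names, used only to illustrate feature randomness.
FEATURES = ["hours_studied", "hours_slept", "classes_attended"]
print("features considered at one split:", random_feature_subset(FEATURES, 2, rng))
```

<p class="text-gray-600 text-sm mt-4">
Each bag has the same size as the original data, but repeated and missing rows mean no two trees are likely to see exactly the same training set.
</p>
</div>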
| <div class="bg-white p-6 rounded-lg shadow-md border border-gray-200 md:col-span-2"> | |
| <h2 class="text-xl font-semibold text-gray-700 mb-4 flex items-center"> | |
| The Splitting Process in Each Tree | |
| </h2> | |
| <p class="text-gray-600 text-sm mb-4"> | |
| Each decision tree in the forest grows by repeatedly splitting its data. The goal is to create "pure" child nodes. For regression, "purity" means minimizing the Mean Squared Error (MSE) within a node. | |
| </p> | |
| <ul class="list-disc pl-6 text-sm mt-2 text-gray-600"> | |
| <li class="mb-2"> | |
| <strong>Finding the Best Split:</strong><br> | |
| At each node, the tree evaluates all possible split points for its random subset of features. It picks the split that results in the greatest reduction in Mean Squared Error (MSE).<br> | |
| The MSE formula is: <strong>MSE = (1/n) × Σ(<em>y</em><sub>i</sub> − <em>ŷ</em><sub>i</sub>)<sup>2</sup></strong>, where <em>y<sub>i</sub></em> is the actual value and <em>ŷ<sub>i</sub></em> is the predicted value (the average of y in that node). | |
| </li> | |
| <li class="mb-2"> | |
| <strong>Recursive Partitioning:</strong><br> | |
| This splitting process repeats for the newly created child nodes. It continues until stopping conditions are met (e.g., maximum depth, minimum samples in a leaf). | |
| </li> | |
| <li> | |
| <strong>Leaf Node Prediction:</strong><br> | |
| Once a tree is fully grown, its final prediction at any leaf node is simply the average of all the training data's target values (y values) that ended up in that leaf. | |
| </li> | |
| </ul> | |
| </div> | |
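<div class="bg-white p-6 rounded-lg shadow-md border border-gray-200 md:col-span-2">
<p class="text-gray-600 text-sm mb-4">
To make the split search concrete, here is a minimal sketch that looks for the best first split on the sample data from this page. It assumes candidate thresholds are the midpoints between consecutive hour values and scores each split by the size-weighted MSE of the two child nodes; the function names <code>mse</code> and <code>best_split</code> are illustrative, and real libraries use faster implementations of the same idea.
</p>

```python
# Sample training data from the table on this page: (hours studied, exam score).
DATA = [(1, 35), (2, 45), (3, 55), (4, 65), (5, 75),
        (6, 80), (7, 82), (8, 88), (9, 92), (10, 95)]


def mse(ys):
    """Mean squared error of ys around their own mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)


def best_split(rows):
    """Return (threshold, weighted child MSE) for the best 'hours <= threshold' split."""
    xs = sorted({x for x, _ in rows})
    best = None
    # Candidate thresholds: midpoints between consecutive distinct hour values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        # Weight each child's MSE by the share of rows it receives.
        score = (len(left) * mse(left) + len(right) * mse(right)) / len(rows)
        if best is None or score < best[1]:
            best = (t, score)
    return best


parent_mse = mse([y for _, y in DATA])
threshold, child_mse = best_split(DATA)
print("parent MSE:", round(parent_mse, 2))
print("best split: hours <=", threshold)
print("weighted child MSE:", round(child_mse, 2))
```

<p class="text-gray-600 text-sm mt-4">
The same search would then run again inside each child node, which is the recursive partitioning step described above.
</p>
</div>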
| <div class="bg-white p-6 rounded-lg shadow-md border border-gray-200 md:col-span-2"> | |
| <h2 class="text-xl font-semibold text-gray-700 mb-4 flex items-center"> | |
| Predicting with New Input (e.g., 5.5 Hours) | |
| </h2> | |
| <p class="text-gray-600 text-sm mb-4"> | |
| When you enter a new value, like 5.5 hours studied, here's the journey it takes through the Random Forest: | |
| </p> | |
| <ol class="list-decimal pl-6 text-sm mt-2 text-gray-600"> | |
| <li> | |
| Sent to Every Tree: | |
| Your input 5.5 hours goes to each and every decision tree in the Random Forest. | |
| </li> | |
| <li> | |
| Tree by Tree Journey: | |
| In each tree, 5.5 travels down the branches according to the split conditions (e.g., if a node asks "Hours Studied &lt;= 5?", then 5.5 takes the "No" branch). | |
| </li> | |
| <li> | |
| Individual Tree Predictions: | |
| Eventually, 5.5 lands in a leaf node for each tree. That tree's prediction is the average of the exam scores (y values) of the training data points that settled in that same leaf node during training. | |
| <br><br> | |
| Example: If one tree's leaf node for "Hours Studied &gt; 5.0 and &lt;= 7.0" contained the training points (6 hours, score 80) and (7 hours, score 82), that tree would predict (80 + 82) / 2 = 81 for 5.5 hours. | |
| </li> | |
| <li> | |
| Averaging for the Final Answer: | |
| Once all individual trees have made their predictions, the Random Forest Regressor simply calculates the average of all these individual tree predictions. | |
| <br><br> | |
| For instance, if Tree 1 predicted 81, Tree 2 predicted 78, Tree 3 predicted 80.5, etc., the final output would be the average of these numbers. | |
| </li> | |
| <li> | |
| Your Predicted Score: | |
| This final averaged value is what you see as the predicted exam score, offering a robust and well-rounded estimate! | |
| </li> | |
| </ol> | |
| </div> | |
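<div class="bg-white p-6 rounded-lg shadow-md border border-gray-200 md:col-span-2">
<p class="text-gray-600 text-sm mb-4">
The journey above can be traced with a small sketch. The tree below is hypothetical and hand-built to match the example leaf (hours above 5.0 and up to 7.0 holding scores 80 and 82); the predictions for Tree 2 and Tree 3 are the illustrative values from the text, not outputs of a trained model.
</p>

```python
from statistics import mean

# Hypothetical hand-built tree: inner nodes are dicts, leaves are lists of
# the training scores that landed there.
TREE_1 = {
    "threshold": 5.0,
    "left": [35, 45, 55, 65, 75],   # hours <= 5.0
    "right": {
        "threshold": 7.0,
        "left": [80, 82],           # 5.0 < hours <= 7.0
        "right": [88, 92, 95],      # hours > 7.0
    },
}


def predict_tree(node, hours):
    """Follow the split conditions down to a leaf, then average its scores."""
    while isinstance(node, dict):
        node = node["left"] if hours <= node["threshold"] else node["right"]
    return mean(node)


def predict_forest(tree_predictions):
    """The forest's output is the plain average of its trees' predictions."""
    return mean(tree_predictions)


tree_1 = predict_tree(TREE_1, 5.5)
# Tree 2 and Tree 3 reuse the illustrative values from the text above.
predictions = [tree_1, 78, 80.5]
print("tree predictions:", predictions)
print("forest prediction:", round(predict_forest(predictions), 2))
```

<p class="text-gray-600 text-sm mt-4">
With more trees, the list of predictions simply grows; the final step is still a single average.
</p>
</div>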
| </div> | |
| </div> | |
| {% endblock %} | |