{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# ML Practice Series: Module 06 - Support Vector Machines (SVM)\n",
                "\n",
                "Welcome to Module 06! We're exploring **Support Vector Machines**, a powerful algorithm for both linear and non-linear classification.\n",
                "\n",
                "### Resources:\n",
                "Visit the **[Machine Learning Guide - SVM Section](https://aashishgarg13.github.io/DataScience/ml_complete-all-topics/)** on your hub to see interactive demos of how the margin changes and how kernels project data into higher dimensions.\n",
                "\n",
                "### Objectives:\n",
                "1. **Maximum Margin**: How support vectors define the decision boundary.\n",
                "2. **The Kernel Trick**: Handling non-linearly separable data.\n",
                "3. **Regularization (C Parameter)**: Hard vs. soft margins.\n",
                "\n",
                "---"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 1. Environment Setup"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "import pandas as pd\n",
                "import numpy as np\n",
                "import matplotlib.pyplot as plt\n",
                "import seaborn as sns\n",
                "from sklearn.svm import SVC\n",
                "from sklearn.model_selection import train_test_split\n",
                "from sklearn.metrics import accuracy_score, confusion_matrix\n",
                "from sklearn.datasets import make_moons\n",
                "\n",
                "# Generate non-linear data (Moons)\n",
                "X, y = make_moons(n_samples=200, noise=0.15, random_state=42)\n",
                "plt.scatter(X[:,0], X[:,1], c=y, cmap='viridis')\n",
                "plt.title(\"Non-Linearly Separable Data\")\n",
                "plt.show()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 2. Linear SVM\n",
                "\n",
                "### Task 1: Training a Linear SVM\n",
                "Try fitting a linear SVM to this non-linear data and check the accuracy."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# YOUR CODE HERE\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<details>\n",
                "<summary><b>Click to see Solution</b></summary>\n",
                "\n",
                "```python\n",
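                "# Fit a linear-kernel SVM on the full dataset\n",
                "svm_linear = SVC(kernel='linear')\n",
                "svm_linear.fit(X, y)\n",
                "\n",
                "# Note: this is training accuracy, since we predict on the data used for fitting\n",
                "y_pred = svm_linear.predict(X)\n",
                "print(f\"Linear SVM Training Accuracy: {accuracy_score(y, y_pred):.4f}\")\n",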
                "```\n",
                "</details>"
            ]
        },
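        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "The Task 1 solution scores the model on the same points it was trained on. As a minimal sketch, assuming a stratified 75/25 split (an illustrative choice, not part of the original exercise), the snippet below uses `train_test_split` to report held-out accuracy and reads `n_support_` to see how many support vectors each class contributes.\n",
                "\n",
                "```python\n",
                "from sklearn.datasets import make_moons\n",
                "from sklearn.metrics import accuracy_score\n",
                "from sklearn.model_selection import train_test_split\n",
                "from sklearn.svm import SVC\n",
                "\n",
                "# Same data as in the setup cell\n",
                "X, y = make_moons(n_samples=200, noise=0.15, random_state=42)\n",
                "\n",
                "# Hold out 25% of the points (test_size and stratify are assumptions)\n",
                "X_train, X_test, y_train, y_test = train_test_split(\n",
                "    X, y, test_size=0.25, random_state=42, stratify=y\n",
                ")\n",
                "\n",
                "svm_linear = SVC(kernel='linear')\n",
                "svm_linear.fit(X_train, y_train)\n",
                "\n",
                "# Compare accuracy on seen vs. unseen points\n",
                "train_acc = accuracy_score(y_train, svm_linear.predict(X_train))\n",
                "test_acc = accuracy_score(y_test, svm_linear.predict(X_test))\n",
                "print(f'Train accuracy: {train_acc:.4f}')\n",
                "print(f'Test accuracy:  {test_acc:.4f}')\n",
                "\n",
                "# Support vectors are the training points that determine the margin\n",
                "print(f'Support vectors per class: {svm_linear.n_support_}')\n",
                "```\n",
                "\n",
                "A large number of support vectors relative to the training set size is one sign that a straight line is a poor fit for this data."
            ]
        },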
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 3. The Kernel Trick\n",
                "\n",
                "### Task 2: Polynomial and RBF Kernels\n",
                "Train SVMs with the `poly` and `rbf` kernels. Which one performs better on this data?\n",
                "\n",
                "*Web Reference: Check the [SVM Kernel Demo](https://aashishgarg13.github.io/DataScience/ml_complete-all-topics/) to see how kernels transform data.*"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# YOUR CODE HERE\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<details>\n",
                "<summary><b>Click to see Solution</b></summary>\n",
                "\n",
                "```python\n",
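                "# Polynomial kernel (degree=3 is the scikit-learn default)\n",
                "svm_poly = SVC(kernel='poly', degree=3).fit(X, y)\n",
                "print(f\"Poly SVM Accuracy: {accuracy_score(y, svm_poly.predict(X)):.4f}\")\n",
                "# RBF kernel\n",
                "svm_rbf = SVC(kernel='rbf', gamma=1)\n",
                "svm_rbf.fit(X, y)\n",
                "y_pred_rbf = svm_rbf.predict(X)\n",
                "print(f\"RBF SVM Accuracy: {accuracy_score(y, y_pred_rbf):.4f}\")\n",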
                "```\n",
                "</details>"
            ]
        },
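        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "One way to compare kernels without relying on training accuracy is k-fold cross-validation. The snippet below is a minimal sketch, assuming 5 folds and untuned hyperparameters (`degree=3`, `coef0=1`, and `gamma=1` are illustrative choices, and the `models` dictionary is a hypothetical helper, not part of the original exercise).\n",
                "\n",
                "```python\n",
                "from sklearn.datasets import make_moons\n",
                "from sklearn.model_selection import cross_val_score\n",
                "from sklearn.svm import SVC\n",
                "\n",
                "# Same data as in the setup cell\n",
                "X, y = make_moons(n_samples=200, noise=0.15, random_state=42)\n",
                "\n",
                "# Candidate kernels; hyperparameters are illustrative, not tuned\n",
                "models = {\n",
                "    'linear': SVC(kernel='linear'),\n",
                "    'poly (degree=3)': SVC(kernel='poly', degree=3, coef0=1),\n",
                "    'rbf (gamma=1)': SVC(kernel='rbf', gamma=1),\n",
                "}\n",
                "\n",
                "# Mean accuracy over 5 stratified folds for each kernel\n",
                "cv_scores = {}\n",
                "for name, model in models.items():\n",
                "    scores = cross_val_score(model, X, y, cv=5)\n",
                "    cv_scores[name] = scores.mean()\n",
                "    print(f'{name:<16} mean CV accuracy: {scores.mean():.4f}')\n",
                "```\n",
                "\n",
                "The ranking can shift with `gamma`, `degree`, and `C`, so treat a single run as a starting point rather than a verdict."
            ]
        },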
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 4. Tuning the C Parameter\n",
                "\n",
                "### Task 3: Impact of C\n",
                "Experiment with a very small C (e.g., 0.01) and a very large C (e.g., 1000). Observe how the decision boundary changes: a small C tolerates more margin violations (softer margin), while a large C penalizes them heavily (harder margin).\n",
                "\n",
                "*Hint: Use the [C-Parameter Visualization](https://aashishgarg13.github.io/DataScience/ml_complete-all-topics/) on your site to compare hard and soft margins.*"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# YOUR CODE HERE\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<details>\n",
                "<summary><b>Click to see Solution</b></summary>\n",
                "\n",
                "```python\n",
                "def plot_svm_boundary(C_val):\n",
                "    model = SVC(kernel='rbf', C=C_val)\n",
                "    model.fit(X, y)\n",
                "    # (Standard boundary plotting code would go here)\n",
                "    print(f\"SVM trained with C={C_val}\")\n",
                "\n",
                "plot_svm_boundary(0.01)\n",
                "plot_svm_boundary(1000)\n",
                "```\n",
                "</details>"
            ]
        },
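        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "The plotting placeholder in the Task 3 solution can be filled in several ways. One possible sketch, assuming a `contourf` plot over a 200x200 grid (the grid size, the 0.5 padding, and the extra `ax` argument are illustrative choices, not the author's code):\n",
                "\n",
                "```python\n",
                "import matplotlib.pyplot as plt\n",
                "import numpy as np\n",
                "from sklearn.datasets import make_moons\n",
                "from sklearn.svm import SVC\n",
                "\n",
                "# Same data as in the setup cell\n",
                "X, y = make_moons(n_samples=200, noise=0.15, random_state=42)\n",
                "\n",
                "def plot_svm_boundary(C_val, ax):\n",
                "    # Fit an RBF SVM with the given C and draw its decision regions on ax\n",
                "    model = SVC(kernel='rbf', C=C_val).fit(X, y)\n",
                "\n",
                "    # Predict the class on a dense grid covering the data\n",
                "    xx, yy = np.meshgrid(\n",
                "        np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 200),\n",
                "        np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 200),\n",
                "    )\n",
                "    grid = np.c_[xx.ravel(), yy.ravel()]\n",
                "    Z = model.predict(grid).reshape(xx.shape)\n",
                "\n",
                "    # Shaded regions show the predicted class; points show the data\n",
                "    ax.contourf(xx, yy, Z, alpha=0.3, cmap='viridis')\n",
                "    ax.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis', edgecolors='k')\n",
                "    ax.set_title(f'C={C_val}, support vectors={model.n_support_.sum()}')\n",
                "    return model\n",
                "\n",
                "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n",
                "model_small_c = plot_svm_boundary(0.01, axes[0])\n",
                "model_large_c = plot_svm_boundary(1000, axes[1])\n",
                "plt.show()\n",
                "```\n",
                "\n",
                "The support vector count in each title gives a rough numeric handle on the margin: with a small C, many more points end up inside or on the margin than with a large C."
            ]
        },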
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "---\n",
                "### Great work!\n",
                "SVMs are a classic example of how implicitly mapping data into a higher-dimensional feature space can make a non-linear problem linearly separable.\n",
                "Next module: **Advanced Ensemble Methods (XGBoost & Boosting)**."
            ]
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3",
            "language": "python",
            "name": "python3"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.8.0"
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4
}