Bachstelze committed on Feb 11

Commit

9457a9d

1 Parent(s): d0ae2a0

recommit changes in the report notebook

Files changed (1)

A3/A3_Report.ipynb +31 -90

@@ -19,7 +19,7 @@
    "source": [
     "## 1. Problem statement\n",
     "\n",
-    "The goal is to classify the weakest link in a squat movement. Given 40 movement features, the model predicts which body region is limiting the person's squat. The input is 40 movement features and the output is a body region classification which is upper and lower body. Due to massive class imbalance in the original 14 class problem with some classes with only 1-2 datapoints, we changed to a 2-class approach that gives predictions that actually helps."
    ]
   },
   {
@@ -47,7 +47,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 306,
    "id": "edbe3fbd",
    "metadata": {},
    "outputs": [],
@@ -77,7 +77,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 307,
    "id": "23f1b38b",
    "metadata": {},
    "outputs": [
@@ -128,7 +128,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 308,
    "id": "080ab472",
    "metadata": {},
    "outputs": [
@@ -200,7 +200,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 309,
    "id": "438e27ae",
    "metadata": {},
    "outputs": [
@@ -298,7 +298,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 310,
    "id": "7560ae66",
    "metadata": {},
    "outputs": [
@@ -335,7 +335,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 311,
    "id": "9f17a88e",
    "metadata": {},
    "outputs": [
@@ -381,7 +381,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 312,
    "id": "d4c02996",
    "metadata": {},
    "outputs": [],
@@ -404,7 +404,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 313,
    "id": "c8292b2b",
    "metadata": {},
    "outputs": [],
@@ -442,7 +442,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 314,
    "id": "b598aef7",
    "metadata": {},
    "outputs": [
@@ -475,7 +475,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 315,
    "id": "962743cc",
    "metadata": {},
    "outputs": [
@@ -603,7 +603,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 316,
    "id": "5c9efd5b",
    "metadata": {},
    "outputs": [
@@ -636,7 +636,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 317,
    "id": "ce01a75f",
    "metadata": {},
    "outputs": [
@@ -814,7 +814,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 318,
    "id": "3e5e5e9b",
    "metadata": {},
    "outputs": [
@@ -849,7 +849,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 319,
    "id": "4de69063",
    "metadata": {},
    "outputs": [
@@ -885,7 +885,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 320,
    "id": "a994b1af",
    "metadata": {},
    "outputs": [
@@ -1035,7 +1035,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 321,
    "id": "00f3eda4",
    "metadata": {},
    "outputs": [
@@ -1062,12 +1062,12 @@
    "id": "bfc7b8c0",
    "metadata": {},
    "source": [
-    "## 7. Model training step 4: hyperparameter tuning of body regions"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 322,
    "id": "6b03902f",
    "metadata": {},
    "outputs": [
@@ -1262,7 +1262,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 323,
    "id": "0b3e066a",
    "metadata": {},
    "outputs": [
@@ -1398,7 +1398,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 324,
    "id": "d21c037d",
    "metadata": {},
    "outputs": [
@@ -1476,7 +1476,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 325,
    "id": "4f01e27a",
    "metadata": {},
    "outputs": [
@@ -1484,7 +1484,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Champion model saved: final_champion_model_A3.pkl\n",
       "\n",
       "Note: The deployable Pipeline model is saved separately as models/classification_champion.pkl\n"
      ]
@@ -1511,7 +1511,7 @@
    "source": [
     "## 10. Deployment\n",
     "\n",
-    "The classification endpoint is added to the existing Gradio app as a second tab. Tab 1 has Movement Scoring from A2. Tab 2 has Body Region Classification which takes 40 deviation features as input and outputs the predicted body region.\n",
     "\n",
     "Deployment URL: https://huggingface.co/spaces/Bachstelze/github_sync"
    ]
@@ -1540,69 +1540,6 @@
     "GitHub Actions automatically syncs the repository to HuggingFace Spaces when pushed to main. The workflow file is found at .github/workflows/push_to_hf_space.yml."
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": 326,
-   "id": "9c52b59b",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "=== A3 CLASSIFICATION SCOREBOARD ===\n",
-      "\n",
-      "14-Class Approach:\n",
-      "              Model  Accuracy  F1-Score\n",
-      "                LDA  0.565632  0.566721\n",
-      "Logistic Regression  0.546539  0.562310\n",
-      "          KNN (k=7)  0.553699  0.545808\n",
-      "         KNN (k=10)  0.556086  0.540911\n",
-      "          KNN (k=5)  0.539379  0.530905\n",
-      "        Naive Bayes  0.434368  0.449053\n",
-      "\n",
-      "Best 14-class: LDA (F1: 0.5667)\n",
-      "\n",
-      "--------------------------------------------------\n",
-      "\n",
-      "Body Region Approach:\n",
-      "              Model  Accuracy  F1-Score\n",
-      "          KNN (k=7)  0.837709  0.827772\n",
-      "                LDA  0.830549  0.824773\n",
-      "         KNN (k=10)  0.832936  0.824195\n",
-      "          KNN (k=5)  0.830549  0.823450\n",
-      "Logistic Regression  0.811456  0.817984\n",
-      "        Naive Bayes  0.770883  0.766729\n",
-      "                QDA  0.663484  0.681889\n",
-      "\n",
-      "Best Body Region: KNN (k=7) (F1: 0.8278)\n",
-      "\n",
-      "==================================================\n",
-      "\n",
-      "FINAL CHAMPION: KNN (k=7)\n",
-      "Approach: Body Regions\n",
-      "F1-Score: 0.8278\n"
-     ]
-    }
-   ],
-   "source": [
-    "# Final scoreboard summary\n",
-    "print(\"=== A3 CLASSIFICATION SCOREBOARD ===\\n\")\n",
-    "print(\"14-Class Approach:\")\n",
-    "print(results_14class_baseline[['Model', 'Accuracy', 'F1-Score']].to_string(index=False))\n",
-    "print(f\"\\nBest 14-class: {champion_14class_baseline} (F1: {champion_14class_f1_baseline:.4f})\")\n",
-    "\n",
-    "print(\"\\n\" + \"-\"*50)\n",
-    "print(\"\\nBody Region Approach:\")\n",
-    "print(results_region_baseline[['Model', 'Accuracy', 'F1-Score']].to_string(index=False))\n",
-    "print(f\"\\nBest Body Region: {champion_region_baseline} (F1: {champion_region_f1_baseline:.4f})\")\n",
-    "\n",
-    "print(\"\\n\" + \"=\"*50)\n",
-    "print(f\"\\nFINAL CHAMPION: {final_champion_row['Model']}\")\n",
-    "print(f\"Approach: {final_champion_row['Approach']}\")\n",
-    "print(f\"F1-Score: {final_champion_row['F1-Score']:.4f}\")"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "7a142abd",
@@ -1632,13 +1569,17 @@
     "| 3 | Baseline | Body Regions | Grouped classes (Upper/Lower) |\n",
     "| 4 | Tuned | Body Regions | GridSearchCV (5-fold CV) |\n",
     "\n",
-    "Note: Polynomial interaction features were tested but not included in final iterations due to minimal improvement and increased complexity (820 features vs 40)."
    ]
   }
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -1652,7 +1593,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.12.3"
   }
  },
  "nbformat": 4,

    "source": [
     "## 1. Problem statement\n",
     "\n",
+    "The goal is to classify the weakest link in a squat movement. Given 38 movement features, the model predicts which body region is limiting the person's squat. The input is 38 movement features and the output is a body region classification which is upper and lower body. Due to massive class imbalance in the original 14 class problem with some classes with only 1-2 datapoints, we changed to a 2-class approach that gives predictions that actually helps."
    ]
   },
   {
   },
   {
    "cell_type": "code",
+   "execution_count": 21,
    "id": "edbe3fbd",
    "metadata": {},
    "outputs": [],
   },
   {
    "cell_type": "code",
+   "execution_count": 22,
    "id": "23f1b38b",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 23,
    "id": "080ab472",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 24,
    "id": "438e27ae",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 25,
    "id": "7560ae66",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 26,
    "id": "9f17a88e",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 27,
    "id": "d4c02996",
    "metadata": {},
    "outputs": [],
   },
   {
    "cell_type": "code",
+   "execution_count": 28,
    "id": "c8292b2b",
    "metadata": {},
    "outputs": [],
   },
   {
    "cell_type": "code",
+   "execution_count": 29,
    "id": "b598aef7",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 30,
    "id": "962743cc",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 31,
    "id": "5c9efd5b",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 32,
    "id": "ce01a75f",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 33,
    "id": "3e5e5e9b",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 34,
    "id": "4de69063",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 35,
    "id": "a994b1af",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 36,
    "id": "00f3eda4",
    "metadata": {},
    "outputs": [
    "id": "bfc7b8c0",
    "metadata": {},
    "source": [
+    "## 7. Model training step 3: hyperparameter tuning of body regions"
    ]
   },
   {
    "cell_type": "code",
+   "execution_count": 37,
    "id": "6b03902f",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 38,
    "id": "0b3e066a",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 39,
    "id": "d21c037d",
    "metadata": {},
    "outputs": [
   },
   {
    "cell_type": "code",
+   "execution_count": 40,
    "id": "4f01e27a",
    "metadata": {},
    "outputs": [
      "name": "stdout",
      "output_type": "stream",
      "text": [
+      "Champion model dictionary saved: final_champion_model_A3.pkl\n",
       "\n",
       "Note: The deployable Pipeline model is saved separately as models/classification_champion.pkl\n"
      ]
    "source": [
     "## 10. Deployment\n",
     "\n",
+    "The classification endpoint is added to the existing Gradio app as a second tab. Tab 1 has Movement Scoring from A2. Tab 2 has Body Region Classification which takes 38 deviation features as input and outputs the predicted body region (Upper Body or Lower Body). The deployed model is KNN (k=7) with StandardScaler preprocessing, achieving 82.8% F1-score.\n",
     "\n",
     "Deployment URL: https://huggingface.co/spaces/Bachstelze/github_sync"
    ]
     "GitHub Actions automatically syncs the repository to HuggingFace Spaces when pushed to main. The workflow file is found at .github/workflows/push_to_hf_space.yml."
    ]
   },
   {
    "cell_type": "markdown",
    "id": "7a142abd",
     "| 3 | Baseline | Body Regions | Grouped classes (Upper/Lower) |\n",
     "| 4 | Tuned | Body Regions | GridSearchCV (5-fold CV) |\n",
     "\n",
+    "Note: Polynomial interaction features were tested but not included in final iterations due to minimal improvement and increased complexity (820 features vs 38).\n",
+    "\n",
+    "### Deployed Model\n",
+    "\n",
+    "The deployed model uses body region classification with KNN (k=7) and StandardScaler preprocessing. It takes 38 input features and achieves 82.8% F1-weighted and 84% accuracy on the test set."
    ]
   }
  ],
  "metadata": {
   "kernelspec": {
+   "display_name": "Python 3",
    "language": "python",
    "name": "python3"
   },
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
+   "version": "3.14.0"
   }
  },
  "nbformat": 4,