Bachstelze committed on
Commit
9457a9d
·
1 Parent(s): d0ae2a0

recommit changes in the report notebook

Files changed (1)
  1. A3/A3_Report.ipynb +31 -90
A3/A3_Report.ipynb CHANGED
@@ -19,7 +19,7 @@
19
  "source": [
20
  "## 1. Problem statement\n",
21
  "\n",
22
- "The goal is to classify the weakest link in a squat movement. Given 40 movement features, the model predicts which body region is limiting the person's squat. The input is 40 movement features and the output is a body region classification which is upper and lower body. Due to massive class imbalance in the original 14 class problem with some classes with only 1-2 datapoints, we changed to a 2-class approach that gives predictions that actually helps."
23
  ]
24
  },
25
  {
@@ -47,7 +47,7 @@
47
  },
48
  {
49
  "cell_type": "code",
50
- "execution_count": 306,
51
  "id": "edbe3fbd",
52
  "metadata": {},
53
  "outputs": [],
@@ -77,7 +77,7 @@
77
  },
78
  {
79
  "cell_type": "code",
80
- "execution_count": 307,
81
  "id": "23f1b38b",
82
  "metadata": {},
83
  "outputs": [
@@ -128,7 +128,7 @@
128
  },
129
  {
130
  "cell_type": "code",
131
- "execution_count": 308,
132
  "id": "080ab472",
133
  "metadata": {},
134
  "outputs": [
@@ -200,7 +200,7 @@
200
  },
201
  {
202
  "cell_type": "code",
203
- "execution_count": 309,
204
  "id": "438e27ae",
205
  "metadata": {},
206
  "outputs": [
@@ -298,7 +298,7 @@
298
  },
299
  {
300
  "cell_type": "code",
301
- "execution_count": 310,
302
  "id": "7560ae66",
303
  "metadata": {},
304
  "outputs": [
@@ -335,7 +335,7 @@
335
  },
336
  {
337
  "cell_type": "code",
338
- "execution_count": 311,
339
  "id": "9f17a88e",
340
  "metadata": {},
341
  "outputs": [
@@ -381,7 +381,7 @@
381
  },
382
  {
383
  "cell_type": "code",
384
- "execution_count": 312,
385
  "id": "d4c02996",
386
  "metadata": {},
387
  "outputs": [],
@@ -404,7 +404,7 @@
404
  },
405
  {
406
  "cell_type": "code",
407
- "execution_count": 313,
408
  "id": "c8292b2b",
409
  "metadata": {},
410
  "outputs": [],
@@ -442,7 +442,7 @@
442
  },
443
  {
444
  "cell_type": "code",
445
- "execution_count": 314,
446
  "id": "b598aef7",
447
  "metadata": {},
448
  "outputs": [
@@ -475,7 +475,7 @@
475
  },
476
  {
477
  "cell_type": "code",
478
- "execution_count": 315,
479
  "id": "962743cc",
480
  "metadata": {},
481
  "outputs": [
@@ -603,7 +603,7 @@
603
  },
604
  {
605
  "cell_type": "code",
606
- "execution_count": 316,
607
  "id": "5c9efd5b",
608
  "metadata": {},
609
  "outputs": [
@@ -636,7 +636,7 @@
636
  },
637
  {
638
  "cell_type": "code",
639
- "execution_count": 317,
640
  "id": "ce01a75f",
641
  "metadata": {},
642
  "outputs": [
@@ -814,7 +814,7 @@
814
  },
815
  {
816
  "cell_type": "code",
817
- "execution_count": 318,
818
  "id": "3e5e5e9b",
819
  "metadata": {},
820
  "outputs": [
@@ -849,7 +849,7 @@
849
  },
850
  {
851
  "cell_type": "code",
852
- "execution_count": 319,
853
  "id": "4de69063",
854
  "metadata": {},
855
  "outputs": [
@@ -885,7 +885,7 @@
885
  },
886
  {
887
  "cell_type": "code",
888
- "execution_count": 320,
889
  "id": "a994b1af",
890
  "metadata": {},
891
  "outputs": [
@@ -1035,7 +1035,7 @@
1035
  },
1036
  {
1037
  "cell_type": "code",
1038
- "execution_count": 321,
1039
  "id": "00f3eda4",
1040
  "metadata": {},
1041
  "outputs": [
@@ -1062,12 +1062,12 @@
1062
  "id": "bfc7b8c0",
1063
  "metadata": {},
1064
  "source": [
1065
- "## 7. Model training step 4: hyperparameter tuning of body regions"
1066
  ]
1067
  },
1068
  {
1069
  "cell_type": "code",
1070
- "execution_count": 322,
1071
  "id": "6b03902f",
1072
  "metadata": {},
1073
  "outputs": [
@@ -1262,7 +1262,7 @@
1262
  },
1263
  {
1264
  "cell_type": "code",
1265
- "execution_count": 323,
1266
  "id": "0b3e066a",
1267
  "metadata": {},
1268
  "outputs": [
@@ -1398,7 +1398,7 @@
1398
  },
1399
  {
1400
  "cell_type": "code",
1401
- "execution_count": 324,
1402
  "id": "d21c037d",
1403
  "metadata": {},
1404
  "outputs": [
@@ -1476,7 +1476,7 @@
1476
  },
1477
  {
1478
  "cell_type": "code",
1479
- "execution_count": 325,
1480
  "id": "4f01e27a",
1481
  "metadata": {},
1482
  "outputs": [
@@ -1484,7 +1484,7 @@
1484
  "name": "stdout",
1485
  "output_type": "stream",
1486
  "text": [
1487
- "Champion model saved: final_champion_model_A3.pkl\n",
1488
  "\n",
1489
  "Note: The deployable Pipeline model is saved separately as models/classification_champion.pkl\n"
1490
  ]
@@ -1511,7 +1511,7 @@
1511
  "source": [
1512
  "## 10. Deployment\n",
1513
  "\n",
1514
- "The classification endpoint is added to the existing Gradio app as a second tab. Tab 1 has Movement Scoring from A2. Tab 2 has Body Region Classification which takes 40 deviation features as input and outputs the predicted body region.\n",
1515
  "\n",
1516
  "Deployment URL: https://huggingface.co/spaces/Bachstelze/github_sync"
1517
  ]
@@ -1540,69 +1540,6 @@
1540
  "GitHub Actions automatically syncs the repository to HuggingFace Spaces when pushed to main. The workflow file is found at .github/workflows/push_to_hf_space.yml."
1541
  ]
1542
  },
1543
- {
1544
- "cell_type": "code",
1545
- "execution_count": 326,
1546
- "id": "9c52b59b",
1547
- "metadata": {},
1548
- "outputs": [
1549
- {
1550
- "name": "stdout",
1551
- "output_type": "stream",
1552
- "text": [
1553
- "=== A3 CLASSIFICATION SCOREBOARD ===\n",
1554
- "\n",
1555
- "14-Class Approach:\n",
1556
- " Model Accuracy F1-Score\n",
1557
- " LDA 0.565632 0.566721\n",
1558
- "Logistic Regression 0.546539 0.562310\n",
1559
- " KNN (k=7) 0.553699 0.545808\n",
1560
- " KNN (k=10) 0.556086 0.540911\n",
1561
- " KNN (k=5) 0.539379 0.530905\n",
1562
- " Naive Bayes 0.434368 0.449053\n",
1563
- "\n",
1564
- "Best 14-class: LDA (F1: 0.5667)\n",
1565
- "\n",
1566
- "--------------------------------------------------\n",
1567
- "\n",
1568
- "Body Region Approach:\n",
1569
- " Model Accuracy F1-Score\n",
1570
- " KNN (k=7) 0.837709 0.827772\n",
1571
- " LDA 0.830549 0.824773\n",
1572
- " KNN (k=10) 0.832936 0.824195\n",
1573
- " KNN (k=5) 0.830549 0.823450\n",
1574
- "Logistic Regression 0.811456 0.817984\n",
1575
- " Naive Bayes 0.770883 0.766729\n",
1576
- " QDA 0.663484 0.681889\n",
1577
- "\n",
1578
- "Best Body Region: KNN (k=7) (F1: 0.8278)\n",
1579
- "\n",
1580
- "==================================================\n",
1581
- "\n",
1582
- "FINAL CHAMPION: KNN (k=7)\n",
1583
- "Approach: Body Regions\n",
1584
- "F1-Score: 0.8278\n"
1585
- ]
1586
- }
1587
- ],
1588
- "source": [
1589
- "# Final scoreboard summary\n",
1590
- "print(\"=== A3 CLASSIFICATION SCOREBOARD ===\\n\")\n",
1591
- "print(\"14-Class Approach:\")\n",
1592
- "print(results_14class_baseline[['Model', 'Accuracy', 'F1-Score']].to_string(index=False))\n",
1593
- "print(f\"\\nBest 14-class: {champion_14class_baseline} (F1: {champion_14class_f1_baseline:.4f})\")\n",
1594
- "\n",
1595
- "print(\"\\n\" + \"-\"*50)\n",
1596
- "print(\"\\nBody Region Approach:\")\n",
1597
- "print(results_region_baseline[['Model', 'Accuracy', 'F1-Score']].to_string(index=False))\n",
1598
- "print(f\"\\nBest Body Region: {champion_region_baseline} (F1: {champion_region_f1_baseline:.4f})\")\n",
1599
- "\n",
1600
- "print(\"\\n\" + \"=\"*50)\n",
1601
- "print(f\"\\nFINAL CHAMPION: {final_champion_row['Model']}\")\n",
1602
- "print(f\"Approach: {final_champion_row['Approach']}\")\n",
1603
- "print(f\"F1-Score: {final_champion_row['F1-Score']:.4f}\")"
1604
- ]
1605
- },
1606
  {
1607
  "cell_type": "markdown",
1608
  "id": "7a142abd",
@@ -1632,13 +1569,17 @@
1632
  "| 3 | Baseline | Body Regions | Grouped classes (Upper/Lower) |\n",
1633
  "| 4 | Tuned | Body Regions | GridSearchCV (5-fold CV) |\n",
1634
  "\n",
1635
- "Note: Polynomial interaction features were tested but not included in final iterations due to minimal improvement and increased complexity (820 features vs 40)."
1636
  ]
1637
  }
1638
  ],
1639
  "metadata": {
1640
  "kernelspec": {
1641
- "display_name": "Python 3 (ipykernel)",
1642
  "language": "python",
1643
  "name": "python3"
1644
  },
@@ -1652,7 +1593,7 @@
1652
  "name": "python",
1653
  "nbconvert_exporter": "python",
1654
  "pygments_lexer": "ipython3",
1655
- "version": "3.12.3"
1656
  }
1657
  },
1658
  "nbformat": 4,
 
19
  "source": [
20
  "## 1. Problem statement\n",
21
  "\n",
22
+ "The goal is to classify the weakest link in a squat movement. Given 38 movement features, the model predicts which body region is limiting the person's squat. The input is 38 movement features and the output is a body region classification which is upper and lower body. Due to massive class imbalance in the original 14 class problem with some classes with only 1-2 datapoints, we changed to a 2-class approach that gives predictions that actually helps."
23
  ]
24
  },
25
  {
 
47
  },
48
  {
49
  "cell_type": "code",
50
+ "execution_count": 21,
51
  "id": "edbe3fbd",
52
  "metadata": {},
53
  "outputs": [],
 
77
  },
78
  {
79
  "cell_type": "code",
80
+ "execution_count": 22,
81
  "id": "23f1b38b",
82
  "metadata": {},
83
  "outputs": [
 
128
  },
129
  {
130
  "cell_type": "code",
131
+ "execution_count": 23,
132
  "id": "080ab472",
133
  "metadata": {},
134
  "outputs": [
 
200
  },
201
  {
202
  "cell_type": "code",
203
+ "execution_count": 24,
204
  "id": "438e27ae",
205
  "metadata": {},
206
  "outputs": [
 
298
  },
299
  {
300
  "cell_type": "code",
301
+ "execution_count": 25,
302
  "id": "7560ae66",
303
  "metadata": {},
304
  "outputs": [
 
335
  },
336
  {
337
  "cell_type": "code",
338
+ "execution_count": 26,
339
  "id": "9f17a88e",
340
  "metadata": {},
341
  "outputs": [
 
381
  },
382
  {
383
  "cell_type": "code",
384
+ "execution_count": 27,
385
  "id": "d4c02996",
386
  "metadata": {},
387
  "outputs": [],
 
404
  },
405
  {
406
  "cell_type": "code",
407
+ "execution_count": 28,
408
  "id": "c8292b2b",
409
  "metadata": {},
410
  "outputs": [],
 
442
  },
443
  {
444
  "cell_type": "code",
445
+ "execution_count": 29,
446
  "id": "b598aef7",
447
  "metadata": {},
448
  "outputs": [
 
475
  },
476
  {
477
  "cell_type": "code",
478
+ "execution_count": 30,
479
  "id": "962743cc",
480
  "metadata": {},
481
  "outputs": [
 
603
  },
604
  {
605
  "cell_type": "code",
606
+ "execution_count": 31,
607
  "id": "5c9efd5b",
608
  "metadata": {},
609
  "outputs": [
 
636
  },
637
  {
638
  "cell_type": "code",
639
+ "execution_count": 32,
640
  "id": "ce01a75f",
641
  "metadata": {},
642
  "outputs": [
 
814
  },
815
  {
816
  "cell_type": "code",
817
+ "execution_count": 33,
818
  "id": "3e5e5e9b",
819
  "metadata": {},
820
  "outputs": [
 
849
  },
850
  {
851
  "cell_type": "code",
852
+ "execution_count": 34,
853
  "id": "4de69063",
854
  "metadata": {},
855
  "outputs": [
 
885
  },
886
  {
887
  "cell_type": "code",
888
+ "execution_count": 35,
889
  "id": "a994b1af",
890
  "metadata": {},
891
  "outputs": [
 
1035
  },
1036
  {
1037
  "cell_type": "code",
1038
+ "execution_count": 36,
1039
  "id": "00f3eda4",
1040
  "metadata": {},
1041
  "outputs": [
 
1062
  "id": "bfc7b8c0",
1063
  "metadata": {},
1064
  "source": [
1065
+ "## 7. Model training step 3: hyperparameter tuning of body regions"
1066
  ]
1067
  },
1068
  {
1069
  "cell_type": "code",
1070
+ "execution_count": 37,
1071
  "id": "6b03902f",
1072
  "metadata": {},
1073
  "outputs": [
 
1262
  },
1263
  {
1264
  "cell_type": "code",
1265
+ "execution_count": 38,
1266
  "id": "0b3e066a",
1267
  "metadata": {},
1268
  "outputs": [
 
1398
  },
1399
  {
1400
  "cell_type": "code",
1401
+ "execution_count": 39,
1402
  "id": "d21c037d",
1403
  "metadata": {},
1404
  "outputs": [
 
1476
  },
1477
  {
1478
  "cell_type": "code",
1479
+ "execution_count": 40,
1480
  "id": "4f01e27a",
1481
  "metadata": {},
1482
  "outputs": [
 
1484
  "name": "stdout",
1485
  "output_type": "stream",
1486
  "text": [
1487
+ "Champion model dictionary saved: final_champion_model_A3.pkl\n",
1488
  "\n",
1489
  "Note: The deployable Pipeline model is saved separately as models/classification_champion.pkl\n"
1490
  ]
 
1511
  "source": [
1512
  "## 10. Deployment\n",
1513
  "\n",
1514
+ "The classification endpoint is added to the existing Gradio app as a second tab. Tab 1 has Movement Scoring from A2. Tab 2 has Body Region Classification which takes 38 deviation features as input and outputs the predicted body region (Upper Body or Lower Body). The deployed model is KNN (k=7) with StandardScaler preprocessing, achieving 82.8% F1-score.\n",
1515
  "\n",
1516
  "Deployment URL: https://huggingface.co/spaces/Bachstelze/github_sync"
1517
  ]
 
1540
  "GitHub Actions automatically syncs the repository to HuggingFace Spaces when pushed to main. The workflow file is found at .github/workflows/push_to_hf_space.yml."
1541
  ]
1542
  },
 
1543
  {
1544
  "cell_type": "markdown",
1545
  "id": "7a142abd",
 
1569
  "| 3 | Baseline | Body Regions | Grouped classes (Upper/Lower) |\n",
1570
  "| 4 | Tuned | Body Regions | GridSearchCV (5-fold CV) |\n",
1571
  "\n",
1572
+ "Note: Polynomial interaction features were tested but not included in final iterations due to minimal improvement and increased complexity (820 features vs 38).\n",
1573
+ "\n",
1574
+ "### Deployed Model\n",
1575
+ "\n",
1576
+ "The deployed model uses body region classification with KNN (k=7) and StandardScaler preprocessing. It takes 38 input features and achieves 82.8% F1-weighted and 84% accuracy on the test set."
1577
  ]
1578
  }
1579
  ],
1580
  "metadata": {
1581
  "kernelspec": {
1582
+ "display_name": "Python 3",
1583
  "language": "python",
1584
  "name": "python3"
1585
  },
 
1593
  "name": "python",
1594
  "nbconvert_exporter": "python",
1595
  "pygments_lexer": "ipython3",
1596
+ "version": "3.14.0"
1597
  }
1598
  },
1599
  "nbformat": 4,