Bachstelze committed
Commit 9457a9d · 1 Parent(s): d0ae2a0
recommit changes in the report notebook
Files changed: A3/A3_Report.ipynb (+31 -90)
|
@@ -19,7 +19,7 @@
|
|
| 19 |
"source": [
|
| 20 |
"## 1. Problem statement\n",
|
| 21 |
"\n",
|
| 22 |
-
"The goal is to classify the weakest link in a squat movement. Given
|
| 23 |
]
|
| 24 |
},
|
| 25 |
{
|
|
@@ -47,7 +47,7 @@
|
|
| 47 |
},
|
| 48 |
{
|
| 49 |
"cell_type": "code",
|
| 50 |
-
"execution_count":
|
| 51 |
"id": "edbe3fbd",
|
| 52 |
"metadata": {},
|
| 53 |
"outputs": [],
|
|
@@ -77,7 +77,7 @@
|
|
| 77 |
},
|
| 78 |
{
|
| 79 |
"cell_type": "code",
|
| 80 |
-
"execution_count":
|
| 81 |
"id": "23f1b38b",
|
| 82 |
"metadata": {},
|
| 83 |
"outputs": [
|
|
@@ -128,7 +128,7 @@
|
|
| 128 |
},
|
| 129 |
{
|
| 130 |
"cell_type": "code",
|
| 131 |
-
"execution_count":
|
| 132 |
"id": "080ab472",
|
| 133 |
"metadata": {},
|
| 134 |
"outputs": [
|
|
@@ -200,7 +200,7 @@
|
|
| 200 |
},
|
| 201 |
{
|
| 202 |
"cell_type": "code",
|
| 203 |
-
"execution_count":
|
| 204 |
"id": "438e27ae",
|
| 205 |
"metadata": {},
|
| 206 |
"outputs": [
|
|
@@ -298,7 +298,7 @@
|
|
| 298 |
},
|
| 299 |
{
|
| 300 |
"cell_type": "code",
|
| 301 |
-
"execution_count":
|
| 302 |
"id": "7560ae66",
|
| 303 |
"metadata": {},
|
| 304 |
"outputs": [
|
|
@@ -335,7 +335,7 @@
|
|
| 335 |
},
|
| 336 |
{
|
| 337 |
"cell_type": "code",
|
| 338 |
-
"execution_count":
|
| 339 |
"id": "9f17a88e",
|
| 340 |
"metadata": {},
|
| 341 |
"outputs": [
|
|
@@ -381,7 +381,7 @@
|
|
| 381 |
},
|
| 382 |
{
|
| 383 |
"cell_type": "code",
|
| 384 |
-
"execution_count":
|
| 385 |
"id": "d4c02996",
|
| 386 |
"metadata": {},
|
| 387 |
"outputs": [],
|
|
@@ -404,7 +404,7 @@
|
|
| 404 |
},
|
| 405 |
{
|
| 406 |
"cell_type": "code",
|
| 407 |
-
"execution_count":
|
| 408 |
"id": "c8292b2b",
|
| 409 |
"metadata": {},
|
| 410 |
"outputs": [],
|
|
@@ -442,7 +442,7 @@
|
|
| 442 |
},
|
| 443 |
{
|
| 444 |
"cell_type": "code",
|
| 445 |
-
"execution_count":
|
| 446 |
"id": "b598aef7",
|
| 447 |
"metadata": {},
|
| 448 |
"outputs": [
|
|
@@ -475,7 +475,7 @@
|
|
| 475 |
},
|
| 476 |
{
|
| 477 |
"cell_type": "code",
|
| 478 |
-
"execution_count":
|
| 479 |
"id": "962743cc",
|
| 480 |
"metadata": {},
|
| 481 |
"outputs": [
|
|
@@ -603,7 +603,7 @@
|
|
| 603 |
},
|
| 604 |
{
|
| 605 |
"cell_type": "code",
|
| 606 |
-
"execution_count":
|
| 607 |
"id": "5c9efd5b",
|
| 608 |
"metadata": {},
|
| 609 |
"outputs": [
|
|
@@ -636,7 +636,7 @@
|
|
| 636 |
},
|
| 637 |
{
|
| 638 |
"cell_type": "code",
|
| 639 |
-
"execution_count":
|
| 640 |
"id": "ce01a75f",
|
| 641 |
"metadata": {},
|
| 642 |
"outputs": [
|
|
@@ -814,7 +814,7 @@
|
|
| 814 |
},
|
| 815 |
{
|
| 816 |
"cell_type": "code",
|
| 817 |
-
"execution_count":
|
| 818 |
"id": "3e5e5e9b",
|
| 819 |
"metadata": {},
|
| 820 |
"outputs": [
|
|
@@ -849,7 +849,7 @@
|
|
| 849 |
},
|
| 850 |
{
|
| 851 |
"cell_type": "code",
|
| 852 |
-
"execution_count":
|
| 853 |
"id": "4de69063",
|
| 854 |
"metadata": {},
|
| 855 |
"outputs": [
|
|
@@ -885,7 +885,7 @@
|
|
| 885 |
},
|
| 886 |
{
|
| 887 |
"cell_type": "code",
|
| 888 |
-
"execution_count":
|
| 889 |
"id": "a994b1af",
|
| 890 |
"metadata": {},
|
| 891 |
"outputs": [
|
|
@@ -1035,7 +1035,7 @@
|
|
| 1035 |
},
|
| 1036 |
{
|
| 1037 |
"cell_type": "code",
|
| 1038 |
-
"execution_count":
|
| 1039 |
"id": "00f3eda4",
|
| 1040 |
"metadata": {},
|
| 1041 |
"outputs": [
|
|
@@ -1062,12 +1062,12 @@
|
|
| 1062 |
"id": "bfc7b8c0",
|
| 1063 |
"metadata": {},
|
| 1064 |
"source": [
|
| 1065 |
-
"## 7. Model training step
|
| 1066 |
]
|
| 1067 |
},
|
| 1068 |
{
|
| 1069 |
"cell_type": "code",
|
| 1070 |
-
"execution_count":
|
| 1071 |
"id": "6b03902f",
|
| 1072 |
"metadata": {},
|
| 1073 |
"outputs": [
|
|
@@ -1262,7 +1262,7 @@
|
|
| 1262 |
},
|
| 1263 |
{
|
| 1264 |
"cell_type": "code",
|
| 1265 |
-
"execution_count":
|
| 1266 |
"id": "0b3e066a",
|
| 1267 |
"metadata": {},
|
| 1268 |
"outputs": [
|
|
@@ -1398,7 +1398,7 @@
|
|
| 1398 |
},
|
| 1399 |
{
|
| 1400 |
"cell_type": "code",
|
| 1401 |
-
"execution_count":
|
| 1402 |
"id": "d21c037d",
|
| 1403 |
"metadata": {},
|
| 1404 |
"outputs": [
|
|
@@ -1476,7 +1476,7 @@
|
|
| 1476 |
},
|
| 1477 |
{
|
| 1478 |
"cell_type": "code",
|
| 1479 |
-
"execution_count":
|
| 1480 |
"id": "4f01e27a",
|
| 1481 |
"metadata": {},
|
| 1482 |
"outputs": [
|
|
@@ -1484,7 +1484,7 @@
|
|
| 1484 |
"name": "stdout",
|
| 1485 |
"output_type": "stream",
|
| 1486 |
"text": [
|
| 1487 |
-
"Champion model saved: final_champion_model_A3.pkl\n",
|
| 1488 |
"\n",
|
| 1489 |
"Note: The deployable Pipeline model is saved separately as models/classification_champion.pkl\n"
|
| 1490 |
]
|
|
@@ -1511,7 +1511,7 @@
|
|
| 1511 |
"source": [
|
| 1512 |
"## 10. Deployment\n",
|
| 1513 |
"\n",
|
| 1514 |
-
"The classification endpoint is added to the existing Gradio app as a second tab. Tab 1 has Movement Scoring from A2. Tab 2 has Body Region Classification which takes
|
| 1515 |
"\n",
|
| 1516 |
"Deployment URL: https://huggingface.co/spaces/Bachstelze/github_sync"
|
| 1517 |
]
|
|
@@ -1540,69 +1540,6 @@
|
|
| 1540 |
"GitHub Actions automatically syncs the repository to HuggingFace Spaces when pushed to main. The workflow file is found at .github/workflows/push_to_hf_space.yml."
|
| 1541 |
]
|
| 1542 |
},
|
| 1543 |
-
{
|
| 1544 |
-
"cell_type": "code",
|
| 1545 |
-
"execution_count": 326,
|
| 1546 |
-
"id": "9c52b59b",
|
| 1547 |
-
"metadata": {},
|
| 1548 |
-
"outputs": [
|
| 1549 |
-
{
|
| 1550 |
-
"name": "stdout",
|
| 1551 |
-
"output_type": "stream",
|
| 1552 |
-
"text": [
|
| 1553 |
-
"=== A3 CLASSIFICATION SCOREBOARD ===\n",
|
| 1554 |
-
"\n",
|
| 1555 |
-
"14-Class Approach:\n",
|
| 1556 |
-
" Model Accuracy F1-Score\n",
|
| 1557 |
-
" LDA 0.565632 0.566721\n",
|
| 1558 |
-
"Logistic Regression 0.546539 0.562310\n",
|
| 1559 |
-
" KNN (k=7) 0.553699 0.545808\n",
|
| 1560 |
-
" KNN (k=10) 0.556086 0.540911\n",
|
| 1561 |
-
" KNN (k=5) 0.539379 0.530905\n",
|
| 1562 |
-
" Naive Bayes 0.434368 0.449053\n",
|
| 1563 |
-
"\n",
|
| 1564 |
-
"Best 14-class: LDA (F1: 0.5667)\n",
|
| 1565 |
-
"\n",
|
| 1566 |
-
"--------------------------------------------------\n",
|
| 1567 |
-
"\n",
|
| 1568 |
-
"Body Region Approach:\n",
|
| 1569 |
-
" Model Accuracy F1-Score\n",
|
| 1570 |
-
" KNN (k=7) 0.837709 0.827772\n",
|
| 1571 |
-
" LDA 0.830549 0.824773\n",
|
| 1572 |
-
" KNN (k=10) 0.832936 0.824195\n",
|
| 1573 |
-
" KNN (k=5) 0.830549 0.823450\n",
|
| 1574 |
-
"Logistic Regression 0.811456 0.817984\n",
|
| 1575 |
-
" Naive Bayes 0.770883 0.766729\n",
|
| 1576 |
-
" QDA 0.663484 0.681889\n",
|
| 1577 |
-
"\n",
|
| 1578 |
-
"Best Body Region: KNN (k=7) (F1: 0.8278)\n",
|
| 1579 |
-
"\n",
|
| 1580 |
-
"==================================================\n",
|
| 1581 |
-
"\n",
|
| 1582 |
-
"FINAL CHAMPION: KNN (k=7)\n",
|
| 1583 |
-
"Approach: Body Regions\n",
|
| 1584 |
-
"F1-Score: 0.8278\n"
|
| 1585 |
-
]
|
| 1586 |
-
}
|
| 1587 |
-
],
|
| 1588 |
-
"source": [
|
| 1589 |
-
"# Final scoreboard summary\n",
|
| 1590 |
-
"print(\"=== A3 CLASSIFICATION SCOREBOARD ===\\n\")\n",
|
| 1591 |
-
"print(\"14-Class Approach:\")\n",
|
| 1592 |
-
"print(results_14class_baseline[['Model', 'Accuracy', 'F1-Score']].to_string(index=False))\n",
|
| 1593 |
-
"print(f\"\\nBest 14-class: {champion_14class_baseline} (F1: {champion_14class_f1_baseline:.4f})\")\n",
|
| 1594 |
-
"\n",
|
| 1595 |
-
"print(\"\\n\" + \"-\"*50)\n",
|
| 1596 |
-
"print(\"\\nBody Region Approach:\")\n",
|
| 1597 |
-
"print(results_region_baseline[['Model', 'Accuracy', 'F1-Score']].to_string(index=False))\n",
|
| 1598 |
-
"print(f\"\\nBest Body Region: {champion_region_baseline} (F1: {champion_region_f1_baseline:.4f})\")\n",
|
| 1599 |
-
"\n",
|
| 1600 |
-
"print(\"\\n\" + \"=\"*50)\n",
|
| 1601 |
-
"print(f\"\\nFINAL CHAMPION: {final_champion_row['Model']}\")\n",
|
| 1602 |
-
"print(f\"Approach: {final_champion_row['Approach']}\")\n",
|
| 1603 |
-
"print(f\"F1-Score: {final_champion_row['F1-Score']:.4f}\")"
|
| 1604 |
-
]
|
| 1605 |
-
},
|
| 1606 |
{
|
| 1607 |
"cell_type": "markdown",
|
| 1608 |
"id": "7a142abd",
|
|
@@ -1632,13 +1569,17 @@
|
|
| 1632 |
"| 3 | Baseline | Body Regions | Grouped classes (Upper/Lower) |\n",
|
| 1633 |
"| 4 | Tuned | Body Regions | GridSearchCV (5-fold CV) |\n",
|
| 1634 |
"\n",
|
| 1635 |
-
"Note: Polynomial interaction features were tested but not included in final iterations due to minimal improvement and increased complexity (820 features vs
|
| 1636 |
]
|
| 1637 |
}
|
| 1638 |
],
|
| 1639 |
"metadata": {
|
| 1640 |
"kernelspec": {
|
| 1641 |
-
"display_name": "Python 3
|
| 1642 |
"language": "python",
|
| 1643 |
"name": "python3"
|
| 1644 |
},
|
|
@@ -1652,7 +1593,7 @@
|
|
| 1652 |
"name": "python",
|
| 1653 |
"nbconvert_exporter": "python",
|
| 1654 |
"pygments_lexer": "ipython3",
|
| 1655 |
-
"version": "3.
|
| 1656 |
}
|
| 1657 |
},
|
| 1658 |
"nbformat": 4,
|
|
|
|
| 19 |
"source": [
|
| 20 |
"## 1. Problem statement\n",
|
| 21 |
"\n",
|
| 22 |
+
"The goal is to classify the weakest link in a squat movement. Given 38 movement features as input, the model predicts which body region (upper body or lower body) is limiting the person's squat. The original 14-class problem is severely imbalanced, with some classes having only 1-2 data points, so we switched to a 2-class approach that gives more useful predictions."
|
| 23 |
]
|
| 24 |
},
|
| 25 |
{
|
|
|
|
| 47 |
},
|
| 48 |
{
|
| 49 |
"cell_type": "code",
|
| 50 |
+
"execution_count": 21,
|
| 51 |
"id": "edbe3fbd",
|
| 52 |
"metadata": {},
|
| 53 |
"outputs": [],
|
|
|
|
| 77 |
},
|
| 78 |
{
|
| 79 |
"cell_type": "code",
|
| 80 |
+
"execution_count": 22,
|
| 81 |
"id": "23f1b38b",
|
| 82 |
"metadata": {},
|
| 83 |
"outputs": [
|
|
|
|
| 128 |
},
|
| 129 |
{
|
| 130 |
"cell_type": "code",
|
| 131 |
+
"execution_count": 23,
|
| 132 |
"id": "080ab472",
|
| 133 |
"metadata": {},
|
| 134 |
"outputs": [
|
|
|
|
| 200 |
},
|
| 201 |
{
|
| 202 |
"cell_type": "code",
|
| 203 |
+
"execution_count": 24,
|
| 204 |
"id": "438e27ae",
|
| 205 |
"metadata": {},
|
| 206 |
"outputs": [
|
|
|
|
| 298 |
},
|
| 299 |
{
|
| 300 |
"cell_type": "code",
|
| 301 |
+
"execution_count": 25,
|
| 302 |
"id": "7560ae66",
|
| 303 |
"metadata": {},
|
| 304 |
"outputs": [
|
|
|
|
| 335 |
},
|
| 336 |
{
|
| 337 |
"cell_type": "code",
|
| 338 |
+
"execution_count": 26,
|
| 339 |
"id": "9f17a88e",
|
| 340 |
"metadata": {},
|
| 341 |
"outputs": [
|
|
|
|
| 381 |
},
|
| 382 |
{
|
| 383 |
"cell_type": "code",
|
| 384 |
+
"execution_count": 27,
|
| 385 |
"id": "d4c02996",
|
| 386 |
"metadata": {},
|
| 387 |
"outputs": [],
|
|
|
|
| 404 |
},
|
| 405 |
{
|
| 406 |
"cell_type": "code",
|
| 407 |
+
"execution_count": 28,
|
| 408 |
"id": "c8292b2b",
|
| 409 |
"metadata": {},
|
| 410 |
"outputs": [],
|
|
|
|
| 442 |
},
|
| 443 |
{
|
| 444 |
"cell_type": "code",
|
| 445 |
+
"execution_count": 29,
|
| 446 |
"id": "b598aef7",
|
| 447 |
"metadata": {},
|
| 448 |
"outputs": [
|
|
|
|
| 475 |
},
|
| 476 |
{
|
| 477 |
"cell_type": "code",
|
| 478 |
+
"execution_count": 30,
|
| 479 |
"id": "962743cc",
|
| 480 |
"metadata": {},
|
| 481 |
"outputs": [
|
|
|
|
| 603 |
},
|
| 604 |
{
|
| 605 |
"cell_type": "code",
|
| 606 |
+
"execution_count": 31,
|
| 607 |
"id": "5c9efd5b",
|
| 608 |
"metadata": {},
|
| 609 |
"outputs": [
|
|
|
|
| 636 |
},
|
| 637 |
{
|
| 638 |
"cell_type": "code",
|
| 639 |
+
"execution_count": 32,
|
| 640 |
"id": "ce01a75f",
|
| 641 |
"metadata": {},
|
| 642 |
"outputs": [
|
|
|
|
| 814 |
},
|
| 815 |
{
|
| 816 |
"cell_type": "code",
|
| 817 |
+
"execution_count": 33,
|
| 818 |
"id": "3e5e5e9b",
|
| 819 |
"metadata": {},
|
| 820 |
"outputs": [
|
|
|
|
| 849 |
},
|
| 850 |
{
|
| 851 |
"cell_type": "code",
|
| 852 |
+
"execution_count": 34,
|
| 853 |
"id": "4de69063",
|
| 854 |
"metadata": {},
|
| 855 |
"outputs": [
|
|
|
|
| 885 |
},
|
| 886 |
{
|
| 887 |
"cell_type": "code",
|
| 888 |
+
"execution_count": 35,
|
| 889 |
"id": "a994b1af",
|
| 890 |
"metadata": {},
|
| 891 |
"outputs": [
|
|
|
|
| 1035 |
},
|
| 1036 |
{
|
| 1037 |
"cell_type": "code",
|
| 1038 |
+
"execution_count": 36,
|
| 1039 |
"id": "00f3eda4",
|
| 1040 |
"metadata": {},
|
| 1041 |
"outputs": [
|
|
|
|
| 1062 |
"id": "bfc7b8c0",
|
| 1063 |
"metadata": {},
|
| 1064 |
"source": [
|
| 1065 |
+
"## 7. Model training step 3: hyperparameter tuning of body regions"
|
| 1066 |
]
|
| 1067 |
},
|
| 1068 |
{
|
| 1069 |
"cell_type": "code",
|
| 1070 |
+
"execution_count": 37,
|
| 1071 |
"id": "6b03902f",
|
| 1072 |
"metadata": {},
|
| 1073 |
"outputs": [
|
|
|
|
| 1262 |
},
|
| 1263 |
{
|
| 1264 |
"cell_type": "code",
|
| 1265 |
+
"execution_count": 38,
|
| 1266 |
"id": "0b3e066a",
|
| 1267 |
"metadata": {},
|
| 1268 |
"outputs": [
|
|
|
|
| 1398 |
},
|
| 1399 |
{
|
| 1400 |
"cell_type": "code",
|
| 1401 |
+
"execution_count": 39,
|
| 1402 |
"id": "d21c037d",
|
| 1403 |
"metadata": {},
|
| 1404 |
"outputs": [
|
|
|
|
| 1476 |
},
|
| 1477 |
{
|
| 1478 |
"cell_type": "code",
|
| 1479 |
+
"execution_count": 40,
|
| 1480 |
"id": "4f01e27a",
|
| 1481 |
"metadata": {},
|
| 1482 |
"outputs": [
|
|
|
|
| 1484 |
"name": "stdout",
|
| 1485 |
"output_type": "stream",
|
| 1486 |
"text": [
|
| 1487 |
+
"Champion model dictionary saved: final_champion_model_A3.pkl\n",
|
| 1488 |
"\n",
|
| 1489 |
"Note: The deployable Pipeline model is saved separately as models/classification_champion.pkl\n"
|
| 1490 |
]
|
|
|
|
| 1511 |
"source": [
|
| 1512 |
"## 10. Deployment\n",
|
| 1513 |
"\n",
|
| 1514 |
+
"The classification endpoint is added to the existing Gradio app as a second tab. Tab 1 has Movement Scoring from A2. Tab 2 has Body Region Classification, which takes 38 deviation features as input and outputs the predicted body region (Upper Body or Lower Body). The deployed model is KNN (k=7) with StandardScaler preprocessing and achieves an F1-score of 82.8%.\n",
|
| 1515 |
"\n",
|
| 1516 |
"Deployment URL: https://huggingface.co/spaces/Bachstelze/github_sync"
|
| 1517 |
]
|
|
|
|
| 1540 |
"GitHub Actions automatically syncs the repository to HuggingFace Spaces when pushed to main. The workflow file is found at .github/workflows/push_to_hf_space.yml."
|
| 1541 |
]
|
| 1542 |
},
|
| 1543 |
{
|
| 1544 |
"cell_type": "markdown",
|
| 1545 |
"id": "7a142abd",
|
|
|
|
| 1569 |
"| 3 | Baseline | Body Regions | Grouped classes (Upper/Lower) |\n",
|
| 1570 |
"| 4 | Tuned | Body Regions | GridSearchCV (5-fold CV) |\n",
|
| 1571 |
"\n",
|
| 1572 |
+
"Note: Polynomial interaction features were tested but not included in final iterations due to minimal improvement and increased complexity (820 features vs 38).\n",
|
| 1573 |
+
"\n",
|
| 1574 |
+
"### Deployed Model\n",
|
| 1575 |
+
"\n",
|
| 1576 |
+
"The deployed model uses body region classification with KNN (k=7) and StandardScaler preprocessing. It takes 38 input features and achieves a weighted F1-score of 82.8% and an accuracy of 84% on the test set."
|
| 1577 |
]
|
| 1578 |
}
|
| 1579 |
],
|
| 1580 |
"metadata": {
|
| 1581 |
"kernelspec": {
|
| 1582 |
+
"display_name": "Python 3",
|
| 1583 |
"language": "python",
|
| 1584 |
"name": "python3"
|
| 1585 |
},
|
|
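Review note on the polynomial-features remark in this hunk: the diff does not show how the expansion was configured, so the following is only a sketch of how a degree-2 expansion grows the column count. `degree2_feature_count` is a hypothetical helper, not something from the notebook, and its parameters simply mirror the usual "interaction only" and "bias column" options.

```python
from math import comb


def degree2_feature_count(n_features, interaction_only=False, include_bias=True):
    """Number of output columns of a degree-2 polynomial expansion.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    # Original (linear) columns plus all pairwise products x_i * x_j, i < j.
    count = n_features + comb(n_features, 2)
    if not interaction_only:
        # Squared terms x_i ** 2.
        count += n_features
    if include_bias:
        # Constant column of ones.
        count += 1
    return count


if __name__ == "__main__":
    for n in (38, 39, 40):
        full = degree2_feature_count(n)
        pairs_only = degree2_feature_count(n, interaction_only=True, include_bias=False)
        print(f"{n} inputs -> full degree-2: {full}, interactions only: {pairs_only}")
```

The exact count depends on the number of input columns and on the expansion settings, which is worth stating next to the 820 figure so readers can reproduce it.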
|
|
| 1593 |
"name": "python",
|
| 1594 |
"nbconvert_exporter": "python",
|
| 1595 |
"pygments_lexer": "ipython3",
|
| 1596 |
+
"version": "3.14.0"
|
| 1597 |
}
|
| 1598 |
},
|
| 1599 |
"nbformat": 4,
|
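Review note on the reported metrics: the Deployed Model text quotes a weighted F1 next to accuracy, and the two can differ when the classes are imbalanced. The sketch below shows one common definition of weighted F1 (the per-class F1 averaged by class support) in plain Python. The labels are toy data invented for illustration, not the A3 test set, and the helper names are assumptions.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += f1 * count / total
    return score


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, imbalanced labels (hypothetical; not the notebook's data).
y_true = ["Lower"] * 8 + ["Upper"] * 2
y_pred = ["Lower"] * 8 + ["Lower", "Upper"]

if __name__ == "__main__":
    print(f"accuracy:    {accuracy(y_true, y_pred):.4f}")
    print(f"weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

On this toy example the weighted F1 comes out slightly below accuracy because the minority class is recalled poorly, which is the same direction of gap the report shows for the body region model.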