# AutoML Regression Model for Shoe Dataset

## Model Summary

This model was trained using **AutoGluon Tabular (v1.4.0)** on the dataset [maryzhang/hw1-24679-tabular-dataset](https://huggingface.co/datasets/maryzhang/hw1-24679-tabular-dataset).
The task is **regression**, predicting the **actual measured shoe length (mm)** from shoe attributes.

- **Best Model**: `CatBoost_r177_BAG_L1` (bagged ensemble of CatBoost models), ranked first by test R² in the leaderboard
- **Test R² Score**: **0.8904** (≈ 89% of variance explained; the leaderboard lists this value for `WeightedEnsemble_L3` and 0.8994 for `CatBoost_r177_BAG_L1`)
- **Validation R² Score**: 0.8049
- **Pearson correlation**: 0.9473
- **RMSE**: 1.80 mm
- **MAE**: 1.10 mm
- **Median AE**: 0.68 mm

On average, predictions fall within about 1–2 mm of the measured shoe length (MAE 1.10 mm, RMSE 1.80 mm).

---
## Leaderboard (Top 5 Models)

| Rank | Model                | Test R² | Val R² | Pred Time (s) | Fit Time (s) |
|------|----------------------|---------|--------|---------------|--------------|
| 1    | CatBoost_r177_BAG_L1 | 0.8994  | 0.8049 | 0.0293        | 27.14        |
| 2    | LightGBMLarge_BAG_L2 | 0.8971  | 0.7995 | 0.7011        | 238.93       |
| 3    | CatBoost_BAG_L2      | 0.8939  | 0.8405 | 0.6155        | 276.40       |
| 4    | CatBoost_r9_BAG_L1   | 0.8917  | 0.7889 | 0.0606        | 53.87        |
| 5    | WeightedEnsemble_L3  | 0.8904  | 0.8500 | 0.9871        | 333.68       |

---
## Dataset

- **Source**: [maryzhang/hw1-24679-tabular-dataset](https://huggingface.co/datasets/maryzhang/hw1-24679-tabular-dataset)
- **Size**: 338 samples (30 original, 308 augmented)
- **Features**:
  - US size (numeric)
  - Shoe size (mm) (numeric)
  - Type of shoe (categorical)
  - Shoe color (categorical)
  - Shoe brand (categorical)
- **Target**: *Actual measured shoe length (mm)*
- **Splits**: 80% training, 20% testing (random_state=42)
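The 80/20 split described above can be sketched as follows. This is a minimal illustration, not the author's code: the card gives only the ratio and `random_state=42`, so the stdlib shuffle used here is an assumption and will not select the same rows as the original split. `split_indices`, `load_and_split`, and the `split_name` argument are hypothetical names. The loader needs the third-party `datasets` and `pandas` packages, which are imported only when it is called.

```python
import random

DATASET_REPO = "maryzhang/hw1-24679-tabular-dataset"


def split_indices(n_rows, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) for a seeded random hold-out split."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(n_rows * test_fraction)
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return train_idx, test_idx


def load_and_split(split_name):
    """Load one split of the Hub dataset as a DataFrame and divide it 80/20.

    `split_name` is hypothetical: check the dataset page for the real split names.
    """
    from datasets import load_dataset  # third-party, imported lazily

    df = load_dataset(DATASET_REPO, split=split_name).to_pandas()
    train_idx, test_idx = split_indices(len(df))
    return df.iloc[train_idx], df.iloc[test_idx]


# With the 338 rows reported above, the hold-out set has 68 rows.
train_idx, test_idx = split_indices(338)
print(len(train_idx), len(test_idx))  # 270 68
```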

---

## Preprocessing

- Converted Hugging Face dataset to Pandas DataFrame
- Train/test split with a fixed random seed (random_state=42)
- AutoGluon handled feature preprocessing automatically, including categorical encoding and model-specific transformations such as normalization

---

## Training Setup

- **Framework**: AutoGluon Tabular v1.4.0
- **Search Strategy**: Bagged/stacked ensembles with model selection (`presets="best"`)
- **Time Budget**: 1200 seconds (20 minutes)
- **Evaluation Metric**: R²
- **Hyperparameter Search**: Automated by AutoGluon (CatBoost, LightGBM, ensemble stacking)
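The setup above corresponds roughly to the following AutoGluon calls. This is a sketch under stated assumptions rather than the original training script: `train_predictor`, `leaderboard_on_test`, the `label` argument, and the `output_dir` default are hypothetical, and `label` must match the dataset's actual target column name. `TabularPredictor` is imported lazily, so the settings can be inspected without AutoGluon installed.

```python
# Settings taken from the list above.
PREDICTOR_SETTINGS = {"problem_type": "regression", "eval_metric": "r2"}
FIT_SETTINGS = {"presets": "best", "time_limit": 1200}  # time_limit is in seconds


def train_predictor(train_df, label, output_dir="shoe_length_predictor"):
    """Fit an AutoGluon TabularPredictor on `train_df` with the card's settings.

    `label` is the name of the target column; `output_dir` is a hypothetical
    location for the saved models.
    """
    from autogluon.tabular import TabularPredictor  # third-party, imported lazily

    predictor = TabularPredictor(label=label, path=output_dir, **PREDICTOR_SETTINGS)
    predictor.fit(train_data=train_df, **FIT_SETTINGS)
    return predictor


def leaderboard_on_test(predictor, test_df):
    """Score every trained model on held-out data and return the leaderboard."""
    return predictor.leaderboard(test_df)
```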

---

## Metrics

- **R²**: 0.8904 (test)
- **RMSE**: 1.80 mm
- **MAE**: 1.10 mm
- **Median AE**: 0.68 mm
- **Uncertainty**: Assessed informally through the spread of scores across the base models in the ensemble. Bagging reduces variance, and most predictions are expected to fall within about ±2 mm of the measured length.
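The metrics listed above have standard definitions, which the following stdlib-only sketch spells out. The function name `regression_metrics` is illustrative, and the sample lengths at the bottom are made-up values, not the model's actual predictions.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Compute R², RMSE, MAE, median absolute error, and Pearson correlation."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]

    mean_true = statistics.fmean(y_true)
    mean_pred = statistics.fmean(y_pred)

    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    covariance = sum(
        (t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred)
    )

    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / len(errors)),
        "mae": statistics.fmean(abs_errors),
        "median_ae": statistics.median(abs_errors),
        "pearson": covariance / math.sqrt(ss_tot * ss_pred),
    }


# Illustrative shoe lengths in mm (not real model output).
measured = [250, 260, 270, 280]
predicted = [251, 259, 272, 280]
for name, value in regression_metrics(measured, predicted).items():
    print(f"{name}: {value:.4f}")
```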

---

## Intended Use

- **Educational**: Demonstrates AutoML regression in CMU course 24-679
- **Limitations**:
  - Small dataset size (338 samples) → not robust for production use
  - Augmented data may not reflect real-world variability
  - Not suitable for medical or industrial applications

---

## Ethical Considerations

- Predictions should **not** be used to recommend or prescribe footwear sizes in clinical or consumer contexts.
- Dataset augmentation could introduce biases not present in real measurements.

---

## License

- **Dataset**: MIT License
- **Model**: MIT License

---

## Hardware / Compute

- **Training**: Google Colab (CPU runtime)
- **Time**: ~20 minutes wall-clock time
- **RAM**: <8 GB used

---

## AI Usage Disclosure

- Model training and hyperparameter search used **AutoML (AutoGluon)**.
- Model card text and documentation partially generated with **AI assistance (ChatGPT)**.

---

## Acknowledgments

- Dataset by **Mary Zhang (CMU 24-679)**
- Model training and documentation by **Yash Sakhale**