# AutoML Regression Model for Shoe Dataset
## Model Summary
This model was trained using **AutoGluon Tabular (v1.4.0)** on the dataset [maryzhang/hw1-24679-tabular-dataset](https://huggingface.co/datasets/maryzhang/hw1-24679-tabular-dataset).
The task is **regression**, predicting the **actual measured shoe length (mm)** from shoe attributes.
- **Best Model**: `CatBoost_r177_BAG_L1` (bagged ensemble of CatBoost models)
- **Test R² Score**: **0.8904** (≈ 89% variance explained)
- **Validation R² Score**: 0.8049
- **Pearson correlation**: 0.9473
- **RMSE**: 1.80 mm
- **MAE**: 1.10 mm
- **Median AE**: 0.68 mm
On the test set, predictions are typically within about 1–2 mm of the measured shoe length (MAE 1.10 mm, RMSE 1.80 mm).
---
## Leaderboard (Top 5 Models)
| Rank | Model | Test R² | Val R² | Pred Time (s) | Fit Time (s) |
|------|------------------------|---------|---------|---------------|--------------|
| 1 | CatBoost_r177_BAG_L1 | 0.8994 | 0.8049 | 0.0293 | 27.14 |
| 2 | LightGBMLarge_BAG_L2 | 0.8971 | 0.7995 | 0.7011 | 238.93 |
| 3 | CatBoost_BAG_L2 | 0.8939 | 0.8405 | 0.6155 | 276.40 |
| 4 | CatBoost_r9_BAG_L1 | 0.8917 | 0.7889 | 0.0606 | 53.87 |
| 5 | WeightedEnsemble_L3 | 0.8904 | 0.8500 | 0.9871 | 333.68 |
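
The table above is ordered by test R², and ordering by validation R² gives a different winner. The sketch below re-ranks the five rows, using values copied from the table. The helper `best_by` is a hypothetical name introduced for illustration only. As far as the AutoGluon documentation describes, the predictor's default model is chosen by validation score, so the model used for default predictions need not be the top row here.

```python
# Rows copied from the leaderboard table: (model, test R², validation R²).
LEADERBOARD = [
    ("CatBoost_r177_BAG_L1", 0.8994, 0.8049),
    ("LightGBMLarge_BAG_L2", 0.8971, 0.7995),
    ("CatBoost_BAG_L2", 0.8939, 0.8405),
    ("CatBoost_r9_BAG_L1", 0.8917, 0.7889),
    ("WeightedEnsemble_L3", 0.8904, 0.8500),
]


def best_by(rows, column):
    """Return the model name with the highest score in the given column."""
    index = {"test": 1, "val": 2}[column]
    return max(rows, key=lambda row: row[index])[0]


best_test = best_by(LEADERBOARD, "test")  # highest test R²
best_val = best_by(LEADERBOARD, "val")    # highest validation R²
print(best_test, best_val)
```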
---
## Dataset
- **Source**: [maryzhang/hw1-24679-tabular-dataset](https://huggingface.co/datasets/maryzhang/hw1-24679-tabular-dataset)
- **Size**: 338 samples (30 original, 308 augmented)
- **Features**:
  - US size (numeric)
  - Shoe size (mm) (numeric)
  - Type of shoe (categorical)
  - Shoe color (categorical)
  - Shoe brand (categorical)
- **Target**: *Actual measured shoe length (mm)*
- **Splits**: 80% training, 20% testing (random_state=42)
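
A minimal sketch of the 80/20 split follows. It assumes scikit-learn's `train_test_split` was used (the card only states `random_state=42`) and that all 338 rows were split together. `split_sizes` and `split_frame` are hypothetical helper names, not part of the original notebook.

```python
import math


def split_sizes(n_rows, test_fraction=0.2):
    """Return (n_train, n_test), rounding the test share up as scikit-learn does."""
    n_test = math.ceil(n_rows * test_fraction)
    return n_rows - n_test, n_test


def split_frame(df, test_fraction=0.2, seed=42):
    """Shuffle and split a pandas DataFrame (assumed tool: scikit-learn, imported lazily)."""
    from sklearn.model_selection import train_test_split

    return train_test_split(df, test_size=test_fraction, random_state=seed)


n_train, n_test = split_sizes(338)
print(n_train, n_test)
```

Under these assumptions the 338 rows divide into 270 training rows and 68 test rows.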
---
## Preprocessing
- Converted the Hugging Face dataset to a pandas DataFrame
- Random 80/20 train/test split with a fixed seed (`random_state=42`)
- AutoGluon handled categorical encoding, normalization, and feature selection automatically
---
## Training Setup
- **Framework**: AutoGluon Tabular v1.4.0
- **Search Strategy**: Bagged/stacked ensembles with model selection (`presets="best"`)
- **Time Budget**: 1200 seconds (20 minutes)
- **Evaluation Metric**: R²
- **Hyperparameter Search**: Automated by AutoGluon (CatBoost, LightGBM, ensemble stacking)
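
The setup above could be reproduced roughly as follows. This is a sketch, not the original training script: the label string is taken from the target name in this card and the real column name may differ, and `train_predictor` and `FIT_SETTINGS` are hypothetical names. The preset, time limit, and metric are the values listed above.

```python
# Target name as written in this card; the actual column name is an assumption.
LABEL = "Actual measured shoe length (mm)"

# Fit settings stated in the Training Setup list.
FIT_SETTINGS = {"presets": "best", "time_limit": 1200}


def train_predictor(train_df, label=LABEL, fit_settings=FIT_SETTINGS):
    """Fit an AutoGluon regressor scored by R² (AutoGluon is imported lazily)."""
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(
        label=label,
        problem_type="regression",
        eval_metric="r2",
    )
    # fit() trains the bagged and stacked models within the time limit
    # and returns the predictor itself.
    return predictor.fit(train_df, **fit_settings)
```

After training, `predictor.leaderboard(test_df)` returns a per-model table of test and validation scores like the leaderboard in this card.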
---
## Metrics
- **R²**: 0.8904 (test)
- **RMSE**: 1.80 mm
- **MAE**: 1.10 mm
- **Median AE**: 0.68 mm
- **Uncertainty**: Variability was assessed across the base models in the ensemble. Bagging reduces variance; most predictions are expected to fall within ±2 mm of the measured length.
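
For reference, the metrics listed above follow the standard definitions sketched below in plain Python. The function name `regression_report` and the four sample lengths are made up for illustration; they are not rows from the dataset.

```python
import math
import statistics


def regression_report(y_true, y_pred):
    """Compute R², RMSE, MAE, median absolute error, and Pearson correlation."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs_errors) / n,
        "median_ae": statistics.median(abs_errors),
        "pearson": cov / math.sqrt(ss_tot * var_pred),
    }


# Toy lengths in mm (illustrative only): three exact predictions, one off by 4 mm.
report = regression_report([250.0, 260.0, 270.0, 280.0], [250.0, 260.0, 270.0, 284.0])
print(report)
```

A single large miss raises RMSE more than MAE, and the median absolute error ignores it entirely, which matches the ordering RMSE > MAE > Median AE reported above.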
---
## Intended Use
- **Educational**: Demonstrates AutoML regression in CMU course 24-679
- **Limitations**:
  - Small dataset size (338 samples) → not robust for production use
  - Augmented data may not reflect real-world variability
  - Not suitable for medical or industrial applications
---
## Ethical Considerations
- Predictions should **not** be used to recommend or prescribe footwear sizes in clinical or consumer contexts.
- Dataset augmentation could introduce biases not present in real measurements.
---
## License
- **Dataset**: MIT License
- **Model**: MIT License
---
## Hardware / Compute
- **Training**: Google Colab (CPU runtime)
- **Time**: ~20 minutes wall-clock time
- **RAM**: <8 GB used
---
## AI Usage Disclosure
- Model training and hyperparameter search used **AutoML (AutoGluon)**.
- Model card text and documentation partially generated with **AI assistance (ChatGPT)**.
---
## Acknowledgments
- Dataset by **Mary Zhang (CMU 24-679)**
- Model training and documentation by **Yash Sakhale**