AutoML Regression Model for Shoe Dataset

Model Summary

This model was trained using AutoGluon Tabular (v1.4.0) on the dataset maryzhang/hw1-24679-tabular-dataset.
The task is regression, predicting the actual measured shoe length (mm) from shoe attributes.

Best Model: CatBoost_r177_BAG_L1 (bagged ensemble of CatBoost models)
Test R² Score: 0.8904 (≈ 89% variance explained)
Validation R² Score: 0.8049
Pearson correlation: 0.9473
RMSE: 1.80 mm
MAE: 1.10 mm
Median AE: 0.68 mm

These values indicate a typical prediction error of about 1–2 mm on the test set (MAE 1.10 mm, RMSE 1.80 mm).
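The error metrics listed in this summary can be computed from two arrays of measured and predicted lengths. The sketch below uses NumPy only; the function name `regression_metrics` and the sample lengths are illustrative assumptions, not taken from the original training notebook.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R², Pearson r, RMSE, MAE and median absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    abs_err = np.abs(residuals)

    # R² compares the residual sum of squares with the variance of the target.
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "pearson": np.corrcoef(y_true, y_pred)[0, 1],
        "rmse": np.sqrt(np.mean(residuals ** 2)),
        "mae": abs_err.mean(),
        "median_ae": np.median(abs_err),
    }


# Made-up lengths in mm, for illustration only.
measured = [250.0, 262.0, 271.0, 280.0, 295.0]
predicted = [251.0, 261.0, 273.0, 279.0, 294.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

RMSE weights large errors more heavily than MAE, and the median ignores them almost entirely, so an ordering of median AE < MAE < RMSE (0.68 < 1.10 < 1.80 mm here) suggests that a minority of predictions carry most of the error.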

Leaderboard (Top 5 Models)

Rank	Model	Test R²	Val R²	Pred Time (s)	Fit Time (s)
1	CatBoost_r177_BAG_L1	0.8994	0.8049	0.0293	27.14
2	LightGBMLarge_BAG_L2	0.8971	0.7995	0.7011	238.93
3	CatBoost_BAG_L2	0.8939	0.8405	0.6155	276.40
4	CatBoost_r9_BAG_L1	0.8917	0.7889	0.0606	53.87
5	WeightedEnsemble_L3	0.8904	0.8500	0.9871	333.68
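
The rows above are ordered by test R². As a small illustration, the sketch below loads the same five rows into a pandas DataFrame and compares that ordering with a ranking by validation R². The column names are shorthand chosen for this example, not AutoGluon's own leaderboard column names.

```python
import pandas as pd

# Values copied from the leaderboard table in this card.
leaderboard = pd.DataFrame(
    [
        ("CatBoost_r177_BAG_L1", 0.8994, 0.8049, 0.0293, 27.14),
        ("LightGBMLarge_BAG_L2", 0.8971, 0.7995, 0.7011, 238.93),
        ("CatBoost_BAG_L2", 0.8939, 0.8405, 0.6155, 276.40),
        ("CatBoost_r9_BAG_L1", 0.8917, 0.7889, 0.0606, 53.87),
        ("WeightedEnsemble_L3", 0.8904, 0.8500, 0.9871, 333.68),
    ],
    columns=["model", "test_r2", "val_r2", "pred_time_s", "fit_time_s"],
)

# Pick the top row under each ranking criterion.
best_by_test = leaderboard.sort_values("test_r2", ascending=False).iloc[0]
best_by_val = leaderboard.sort_values("val_r2", ascending=False).iloc[0]

print("Best by test R²:", best_by_test["model"])
print("Best by validation R²:", best_by_val["model"])
```

On these five rows the two rankings disagree: the single bagged CatBoost model leads on test R², while WeightedEnsemble_L3 leads on validation R² at roughly 34 times the prediction time (0.9871 s vs 0.0293 s).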

Dataset

Source: maryzhang/hw1-24679-tabular-dataset
Size: 338 samples (30 original, 308 augmented)
Features:
- US size (numeric)
- Shoe size (mm) (numeric)
- Type of shoe (categorical)
- Shoe color (categorical)
- Shoe brand (categorical)
Target: Actual measured shoe length (mm)
Splits: 80% training, 20% testing (random_state=42)

Preprocessing

Converted Hugging Face dataset to Pandas DataFrame
Train/test split (80/20) with a fixed random seed (random_state=42)
AutoGluon handled feature preprocessing automatically (type inference, categorical encoding, and model-specific transformations)
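
The loading and splitting steps above might look like the following sketch. It assumes the Hugging Face `datasets` library and scikit-learn's `train_test_split`; the card does not say which splitting utility was used, and pooling every split of the dataset repository into one frame is likewise an assumption. The helper names `load_frame` and `split_frame` are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

DATASET_ID = "maryzhang/hw1-24679-tabular-dataset"


def load_frame(dataset_id: str = DATASET_ID) -> pd.DataFrame:
    """Download the dataset and pool all of its splits into one DataFrame."""
    # Imported lazily so the splitting helper works without `datasets` installed.
    from datasets import load_dataset

    dataset_dict = load_dataset(dataset_id)
    frames = [dataset_dict[name].to_pandas() for name in dataset_dict]
    return pd.concat(frames, ignore_index=True)


def split_frame(df: pd.DataFrame, test_size: float = 0.2, seed: int = 42):
    """Random train/test split with a fixed seed."""
    return train_test_split(df, test_size=test_size, random_state=seed)


# Offline check with a placeholder frame of the same size as the dataset.
demo = pd.DataFrame({"row": range(338)})
train_df, test_df = split_frame(demo)
print(len(train_df), len(test_df))
```

With 338 rows, an 80/20 split leaves 270 rows for training and 68 for testing. On the real data the call would be `split_frame(load_frame())`.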

Training Setup

Framework: AutoGluon Tabular v1.4.0
Search Strategy: Bagged/stacked ensembles with model selection (presets="best")
Time Budget: 1200 seconds (20 minutes)
Evaluation Metric: R²
Hyperparameter Search: Automated by AutoGluon (CatBoost, LightGBM, ensemble stacking)
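
A training call matching the setup above could be sketched as follows. It uses AutoGluon's `TabularPredictor` with `eval_metric="r2"` and a 1200-second limit. The label column name is assumed to match the target name listed in this card, the preset is written as `best_quality` on the assumption that this is what the card refers to as `best`, and `train_predictor` is a hypothetical helper.

```python
# Assumed column name, taken from the target listed in this card.
LABEL = "Actual measured shoe length (mm)"

# 20-minute budget with the highest-quality preset (assumed spelling).
FIT_SETTINGS = {"presets": "best_quality", "time_limit": 1200}


def train_predictor(train_df, label: str = LABEL, **overrides):
    """Fit an AutoGluon regressor and return the fitted TabularPredictor."""
    # Imported lazily because AutoGluon is a heavy dependency.
    from autogluon.tabular import TabularPredictor

    settings = {**FIT_SETTINGS, **overrides}
    predictor = TabularPredictor(
        label=label,
        problem_type="regression",
        eval_metric="r2",
    )
    return predictor.fit(train_data=train_df, **settings)
```

After fitting, `predictor.leaderboard(test_df)` scores each trained model on the held-out rows, and `predictor.evaluate(test_df)` reports metrics for the model AutoGluon selects by validation score.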

Metrics

R²: 0.8904 (test)
RMSE: 1.80 mm
MAE: 1.10 mm
Median AE: 0.68 mm
Uncertainty: Variability was assessed across the base models in the ensemble. Bagging reduces variance; most predictions are expected to fall within about ±2 mm of the measured length, consistent with the RMSE of 1.80 mm.
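
One simple way to look at variability across models, in the spirit of the uncertainty note above, is to compare the predictions that several trained models give for the same rows. The sketch below is NumPy-only and assumes a dictionary mapping model names to prediction arrays; how those arrays are obtained is left open, and the model names and numbers are made up. It gives a rough measure of disagreement, not a calibrated prediction interval.

```python
import numpy as np


def model_spread(predictions_by_model):
    """Per-row mean and sample standard deviation across models' predictions."""
    stacked = np.vstack(
        [np.asarray(p, dtype=float) for p in predictions_by_model.values()]
    )
    return stacked.mean(axis=0), stacked.std(axis=0, ddof=1)


# Illustrative predictions (mm) from three hypothetical models for four shoes.
preds = {
    "model_a": [250.0, 262.0, 271.0, 280.0],
    "model_b": [251.0, 262.0, 273.0, 279.0],
    "model_c": [252.0, 262.0, 272.0, 281.0],
}
mean_pred, spread = model_spread(preds)
print(mean_pred)
print(spread)
```

Rows where the models agree closely have a spread near zero; rows with a large spread are candidates for closer inspection.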

Intended Use

Educational: Demonstrates AutoML regression in CMU course 24-679
Limitations:
- Small dataset size (338 samples) → not robust for production use
- Augmented data may not reflect real-world variability
- Not suitable for medical or industrial applications

Ethical Considerations

Predictions should not be used to recommend or prescribe footwear sizes in clinical or consumer contexts.
Dataset augmentation could introduce biases not present in real measurements.

License

Dataset: MIT License
Model: MIT License

Hardware / Compute

Training: Google Colab (CPU runtime)
Time: ~20 minutes wall-clock
RAM: <8 GB used

AI Usage Disclosure

Model training and hyperparameter search used AutoML (AutoGluon).
Model card text and documentation partially generated with AI assistance (ChatGPT).

Acknowledgments

Dataset by Mary Zhang (CMU 24-679)
Model training and documentation by Yash Sakhale
