---
datasets:
- FaiyazAzam/hw1-image-ds-groot-224
---

# groot_autogluon_predictor_w_hpo

## Model Description

This model is an image classifier trained with [AutoGluon Multimodal](https://auto.gluon.ai/) using hyperparameter optimization (HPO). The best-performing architecture was **ResNet18 (timm_image)**.

It is a binary classifier trained on a Groot image set: it labels each image as "has_groot" or "doesn't have groot", that is, whether or not a Groot figure appears in the image.

## Hyperparameters

- 'model.names': ['timm_image']
- 'model.timm_image.checkpoint_name': ['resnet18']
- 'optim.lr': 2.96e-4
- 'env.per_gpu_batch_size': 16
- 'optim.weight_decay': 1.6e-6
- 'optim.max_epochs': 50

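As a rough sketch of how the values above could be passed back to AutoGluon for a retraining run, the snippet below collects them in a dictionary and hands it to `MultiModalPredictor.fit`. The `retrain` helper, the `train_df` argument, and the `label` column name are hypothetical. The checkpoint name is written as a plain string rather than a one-element list, which is an assumption about how a fixed (non-HPO) run would be configured.

```python
# Best configuration reported on this card, written as fixed values.
BEST_HYPERPARAMETERS = {
    "model.names": ["timm_image"],
    "model.timm_image.checkpoint_name": "resnet18",  # plain string: assumption
    "optim.lr": 2.96e-4,
    "env.per_gpu_batch_size": 16,
    "optim.weight_decay": 1.6e-6,
    "optim.max_epochs": 50,
}


def retrain(train_df, label="label"):
    """Hypothetical helper: refit a predictor with the fixed configuration.

    `train_df` is assumed to be a pandas DataFrame holding an image-path
    column and a binary label column named by `label`.
    """
    # Imported lazily so the dictionary can be inspected without AutoGluon.
    from autogluon.multimodal import MultiModalPredictor

    predictor = MultiModalPredictor(label=label, problem_type="binary")
    predictor.fit(train_data=train_df, hyperparameters=BEST_HYPERPARAMETERS)
    return predictor
```
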
## Training & Early Stopping

Hyperparameter optimization used the ASHA early-stopping scheduler with an HPO timeout of 900 seconds.

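The setup described here might look roughly like the following. Only the ASHA scheduler and the 900-second limit come from this card; the searcher, the number of trials, the `run_hpo` helper, and the mapping of the "HPO timeout" onto `fit`'s `time_limit` argument are assumptions made for illustration. The search space itself is not documented, so it is left as a parameter.

```python
HPO_TIME_LIMIT_SECONDS = 900  # HPO timeout stated on this card

HYPERPARAMETER_TUNE_KWARGS = {
    "scheduler": "ASHA",   # early-stopping scheduler stated on this card
    "searcher": "random",  # assumption: the searcher is not stated
    "num_trials": 8,       # assumption: the trial count is not stated
}


def run_hpo(train_df, search_space, label="label"):
    """Hypothetical helper: run AutoGluon HPO with the settings above.

    `search_space` is a hyperparameter dictionary whose tunable entries are
    Ray Tune search spaces; its contents are not given on this card.
    """
    # Imported lazily so the settings can be inspected without AutoGluon.
    from autogluon.multimodal import MultiModalPredictor

    predictor = MultiModalPredictor(label=label, problem_type="binary")
    predictor.fit(
        train_data=train_df,
        hyperparameters=search_space,
        hyperparameter_tune_kwargs=HYPERPARAMETER_TUNE_KWARGS,
        time_limit=HPO_TIME_LIMIT_SECONDS,
    )
    return predictor
```
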
## Evaluation

**Test set metrics:**

- 'accuracy': 0.9
- 'f1': 0.899

**Confusion Matrix**

Rows are true classes and columns are predicted classes, with class 0 first.

$$
\begin{bmatrix}
15 & 0 \\
3 & 12
\end{bmatrix}
$$

**Per Class Metrics**

| Class | Precision | Recall | f1-score |
|----------|----------|----------|----------|
| 0 | 0.833 | 1.0 | 0.909 |
| 1 | 1.0 | 0.8 | 0.889 |

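The evaluation figures can be checked against the confusion matrix with a few lines of plain Python. The sketch below assumes that rows are true classes and columns are predictions, which is the orientation that reproduces the reported precision and recall. The macro-averaged F1 it computes rounds to 0.899, which is consistent with the reported 'f1' value, though the card does not say which averaging was used.

```python
# Confusion matrix from this card (assumed: rows = true, columns = predicted).
CONFUSION = [
    [15, 0],
    [3, 12],
]


def per_class_metrics(cm, k):
    """Return (precision, recall, f1) for class index k."""
    tp = cm[k][k]
    fp = sum(cm[r][k] for r in range(len(cm))) - tp  # predicted k, true other
    fn = sum(cm[k]) - tp                             # true k, predicted other
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    scores = [per_class_metrics(cm, k)[2] for k in range(len(cm))]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for k in range(len(CONFUSION)):
        p, r, f = per_class_metrics(CONFUSION, k)
        print(k, round(p, 3), round(r, 3), round(f, 3))
    print("accuracy", round(accuracy(CONFUSION), 3))
    print("macro f1", round(macro_f1(CONFUSION), 3))
```
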
## Data Augmentation

The dataset is available at https://huggingface.co/datasets/FaiyazAzam/hw1-image-ds-groot-224

Augmentations:

- RandomResizedCrop(224)
- RandomHorizontalFlip(p=0.5)
- ColorJitter
- Normalize(mean=[...], std=[...])

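For readers who want to reproduce a comparable pipeline outside AutoGluon, the listed transforms correspond to standard `torchvision.transforms` classes. The sketch below is an approximation, not the pipeline AutoGluon ran internally: the `ColorJitter` strengths are hypothetical because the card gives none, the normalization statistics are the standard ImageNet values that the Input & Preprocessing section refers to, and `ToTensor` is added because `Normalize` operates on tensors.

```python
# Standard ImageNet channel statistics (RGB order).
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def build_train_transforms():
    """Hypothetical helper: torchvision version of the listed augmentations."""
    # Imported lazily so the constants can be used without torchvision.
    from torchvision import transforms

    return transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(p=0.5),
        # Jitter strengths are illustrative; the card does not state them.
        transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
        transforms.ToTensor(),  # PIL image -> float tensor in [0, 1]
        transforms.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
    ])
```
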
## Input & Preprocessing

- Input resolution: 224x224 RGB images
- Preprocessing: Resize to 224x224 and normalize with ImageNet mean/std.

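To make the normalization step concrete, here is a small worked example on a single pixel. It assumes the common convention of scaling 8-bit values to the range 0 to 1 before subtracting the ImageNet mean and dividing by the ImageNet standard deviation; the `normalize_pixel` helper is hypothetical and only illustrates the arithmetic.

```python
# Standard ImageNet channel statistics (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


if __name__ == "__main__":
    # A black pixel maps to -mean/std per channel, a white pixel to (1-mean)/std.
    print(normalize_pixel((0, 0, 0)))
    print(normalize_pixel((255, 255, 255)))
```
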
## Known Failure Modes

- Struggles with extreme lighting variations
- Confuses the two classes when the object is partially occluded

## Usage

```python
from autogluon.multimodal import MultiModalPredictor

# Load the saved predictor from its directory
predictor = MultiModalPredictor.load('groot_autogluon_predictor_w_hpo')

# Predict the class of a single image
pred = predictor.predict("example.jpg")
```