---
license: apache-2.0
tags:
- computer-vision
- image-classification
- food101
- cnn-vit
- hybrid
datasets:
- food101
metrics:
- accuracy
library_name: pytorch
---
# 🍕 Hybrid Food Image Classifier (CNN + ViT)
This model combines a ResNet50 (CNN) branch and a DeiT-Base (ViT) branch through an adaptive fusion module to classify food images into the 101 categories of Food-101.
## Model Architecture
- **CNN Branch**: ResNet50 (pretrained on ImageNet)
- **ViT Branch**: DeiT-Base Distilled (pretrained)
- **Fusion Module**: Adaptive attention-based fusion with multi-head cross-attention
- **Classes**: 101 food categories from Food-101 dataset
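
The fusion code itself is not part of this card, so the following is only a minimal sketch of how a cross-attention fusion head over the two branches could look. The class name `CrossAttentionFusion`, the 512-dimensional shared embedding, the 8 heads, and the sigmoid gate are all assumptions made for illustration; only the input shapes follow from the backbones (a 2048x7x7 ResNet50 feature map and 198 DeiT-Base distilled tokens of width 768 at 224x224 input). Random tensors stand in for the backbone outputs.

```python
import torch
from torch import nn


class CrossAttentionFusion(nn.Module):
    """Hypothetical fusion head: ViT tokens attend over CNN spatial features."""

    def __init__(self, cnn_dim=2048, vit_dim=768, embed_dim=512,
                 num_heads=8, num_classes=101):
        super().__init__()
        # Project both branches into a shared embedding space
        self.cnn_proj = nn.Linear(cnn_dim, embed_dim)
        self.vit_proj = nn.Linear(vit_dim, embed_dim)
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads,
                                                batch_first=True)
        # Learned per-feature gate that weights the two pooled vectors
        self.gate = nn.Sequential(nn.Linear(2 * embed_dim, embed_dim),
                                  nn.Sigmoid())
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, cnn_feat_map, vit_tokens):
        # (B, C, H, W) -> (B, H*W, C): treat each spatial location as a token
        cnn_tokens = self.cnn_proj(cnn_feat_map.flatten(2).transpose(1, 2))
        vit_tokens = self.vit_proj(vit_tokens)
        # ViT tokens are the queries; CNN tokens are the keys and values
        attended, _ = self.cross_attn(vit_tokens, cnn_tokens, cnn_tokens)
        cnn_vec = cnn_tokens.mean(dim=1)
        vit_vec = attended.mean(dim=1)
        g = self.gate(torch.cat([cnn_vec, vit_vec], dim=-1))
        fused = g * cnn_vec + (1 - g) * vit_vec
        return self.classifier(fused)


fusion = CrossAttentionFusion()
cnn_feat = torch.randn(2, 2048, 7, 7)   # stand-in for ResNet50 output
vit_tok = torch.randn(2, 198, 768)      # stand-in for DeiT-Base distilled tokens
logits = fusion(cnn_feat, vit_tok)
print(logits.shape)
```

The output has one logit per Food-101 class for each image in the batch. The checkpoint in this repository will not load into this sketch unless the layer names and shapes happen to match the original implementation.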
## Performance
- **Validation Accuracy**: ~82.5% (top-1)
- **Top-5 Accuracy**: >95%
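
For readers who want to reproduce these metrics, here is a generic sketch of computing top-1 and top-5 accuracy from logits. The evaluation script behind the numbers above is not included in this card, so the helper `topk_accuracy` and the toy batch are illustrative assumptions.

```python
import torch


def topk_accuracy(logits, targets, ks=(1, 5)):
    """Fraction of samples whose target is among the k highest-scoring classes."""
    max_k = max(ks)
    _, pred = logits.topk(max_k, dim=1)        # (N, max_k) class indices
    hits = pred.eq(targets.unsqueeze(1))       # (N, max_k) booleans
    return {k: hits[:, :k].any(dim=1).float().mean().item() for k in ks}


# Toy batch: 3 samples, 6 classes
toy_logits = torch.tensor([
    [0.1, 0.9, 0.2, 0.3, 0.4, 0.5],   # target 1 is ranked first
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],   # target 5 is ranked last (outside top 5)
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],   # target 1 is ranked second
])
toy_targets = torch.tensor([1, 5, 1])
acc = topk_accuracy(toy_logits, toy_targets)
print(acc)  # top-1 is 1/3 and top-5 is 2/3 for this toy batch
```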
## Files
- `best_model.pth`: Trained PyTorch checkpoint
- `real_class_mapping.json`: Human-readable class names
- `config.yaml`: Training configuration
- `food101_class_names.json`: Original class names
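
The schema of the two JSON files is not documented here. As a hedged sketch, the helper below assumes a class-name file is either a JSON list ordered by class index or a JSON object keyed by stringified indices, and normalizes both into an `int -> name` dictionary. The function names are hypothetical.

```python
import json


def normalize_class_names(data):
    """Turn a parsed JSON list or {"0": "name", ...} object into {index: name}."""
    if isinstance(data, list):
        return dict(enumerate(data))
    return {int(idx): name for idx, name in data.items()}


def load_class_names(path):
    """Read a class-name JSON file (assumed schema, see above)."""
    with open(path, encoding="utf-8") as f:
        return normalize_class_names(json.load(f))


# Both assumed layouts give the same mapping
from_list = normalize_class_names(["apple_pie", "baby_back_ribs"])
from_dict = normalize_class_names({"0": "apple_pie", "1": "baby_back_ribs"})
print(from_list == from_dict)
```

If the files use a different layout (for example, name-to-name pairs), the helper needs adjusting after inspecting the parsed JSON.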
## Quick Usage
```python
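from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint from the Hub (cached locally after the first call)
ckpt_path = hf_hub_download(
    repo_id="codealchemist01/food-image-classifier-hybrid",
    filename="best_model.pth",
)

# Load the checkpoint on CPU. Recent PyTorch releases default to
# weights_only=True; if the file stores more than tensors, loading may need
# weights_only=False, which should only be used with files you trust.
checkpoint = torch.load(ckpt_path, map_location="cpu")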
```
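
The internal layout of `best_model.pth` is not described in this card. A small helper along these lines can locate the weights under common naming conventions; the key names it checks (`model_state_dict`, `state_dict`, `model`) are assumptions, not confirmed contents of the file. A fake checkpoint is used so the snippet runs on its own.

```python
def extract_state_dict(checkpoint):
    """Return the weights from a checkpoint whose layout is unknown."""
    if isinstance(checkpoint, dict):
        # Common wrapper keys used by training scripts (assumed, not confirmed)
        for key in ("model_state_dict", "state_dict", "model"):
            if isinstance(checkpoint.get(key), dict):
                return checkpoint[key]
    # Otherwise assume the object already is the state dict
    return checkpoint


# Stand-in for the object returned by torch.load(...)
fake_checkpoint = {"epoch": 3, "state_dict": {"classifier.weight": "tensor"}}
state_dict = extract_state_dict(fake_checkpoint)
print(list(state_dict)[:5])  # inspect the first few parameter names
```

With the real file, pass the `checkpoint` loaded above and compare the printed parameter names against the model definition before calling `load_state_dict`.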
## Demo
Try the live demo: [Food Classifier Space](https://huggingface.co/spaces/codealchemist01/food-classifier-space)
## Training Details
- **Dataset**: Food-101 (101,000 images across 101 categories)
- **Framework**: PyTorch 2.0+
- **Image Size**: 224x224
- **Optimizer**: AdamW with a cosine annealing warm restarts learning-rate schedule
- **Augmentations**: Albumentations (flip, rotation, color jitter)
- **Mixed Precision**: FP16 training
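
The optimizer and schedule named above could be wired up roughly as follows. This is a sketch only: the learning rate, weight decay, `T_0`, and `eta_min` values are placeholders rather than the settings in `config.yaml`, and a small linear layer stands in for the hybrid model. Augmentation and FP16 autocast are omitted to keep the example short.

```python
import torch
from torch import nn

model = nn.Linear(8, 101)  # stand-in for the hybrid model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
# Restart the cosine cycle every 5 epochs (placeholder value)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=5, T_mult=1, eta_min=1e-6
)

lrs = []
for epoch in range(10):
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... one epoch of training would run here ...
    optimizer.step()   # no gradients in this sketch, so weights are unchanged
    scheduler.step()   # advance the schedule by one epoch

print([f"{lr:.2e}" for lr in lrs])
```

The recorded learning rate decays along a cosine curve for five epochs and then jumps back to the initial value at epoch 5, which is the "warm restart".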
## Citation
```bibtex
@misc{food-classifier-hybrid,
author = {codealchemist01},
title = {Hybrid Food Image Classifier},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/codealchemist01/food-image-classifier-hybrid}}
}
```