---
license: apache-2.0
tags:
  - computer-vision
  - image-classification
  - food101
  - cnn-vit
  - hybrid
datasets:
  - food101
metrics:
  - accuracy
library_name: pytorch
---

# 🍕 Hybrid Food Image Classifier (CNN + ViT)

This model combines a ResNet50 (CNN) branch and a DeiT-Base (ViT) branch through an adaptive fusion module to classify food images into the 101 categories of Food-101.

## Model Architecture

- **CNN Branch**: ResNet50 (pretrained on ImageNet)
- **ViT Branch**: DeiT-Base Distilled (pretrained)
- **Fusion Module**: Adaptive attention-based fusion with multi-head cross-attention
- **Classes**: 101 food categories from Food-101 dataset
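
The fusion code itself is not included in this card, so the following is only a minimal sketch of how attention-based fusion of the two branches could look. The class name `CrossAttentionFusion`, the 512-dimensional embedding, the 8 heads, and the sigmoid gate are assumptions made for illustration. Only the input widths (2048 for pooled ResNet50 features, 768 for DeiT-Base tokens) and the 101 output classes follow from the components listed above.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Hypothetical fusion head: CNN features attend over ViT tokens."""

    def __init__(self, cnn_dim=2048, vit_dim=768, embed_dim=512,
                 num_heads=8, num_classes=101):
        super().__init__()
        # Project both branches into a shared embedding space
        self.cnn_proj = nn.Linear(cnn_dim, embed_dim)
        self.vit_proj = nn.Linear(vit_dim, embed_dim)
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Gate that decides, per feature, how much of each branch to keep
        self.gate = nn.Sequential(nn.Linear(2 * embed_dim, embed_dim), nn.Sigmoid())
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, cnn_feat, vit_tokens):
        # cnn_feat: (B, cnn_dim) pooled CNN features
        # vit_tokens: (B, N, vit_dim) transformer token embeddings
        q = self.cnn_proj(cnn_feat).unsqueeze(1)    # (B, 1, E) query
        kv = self.vit_proj(vit_tokens)              # (B, N, E) keys and values
        attended, _ = self.cross_attn(q, kv, kv)    # (B, 1, E)
        q, attended = q.squeeze(1), attended.squeeze(1)
        g = self.gate(torch.cat([q, attended], dim=-1))
        fused = g * q + (1 - g) * attended          # adaptive mix of the two views
        return self.classifier(fused)


# Random stand-in features: batch of 2, 198 tokens (196 patches + 2 special tokens)
logits = CrossAttentionFusion()(torch.randn(2, 2048), torch.randn(2, 198, 768))
print(logits.shape)  # torch.Size([2, 101])
```

In a full model, `cnn_feat` and `vit_tokens` would come from the two pretrained backbones rather than from random tensors.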

## Performance

- **Validation Accuracy**: ~82.5%
- **Top-5 Accuracy**: >95%
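
For readers unfamiliar with the two metrics, here is a small sketch of how top-1 and top-5 accuracy are usually computed from per-class scores. The function name `topk_accuracy` and the toy scores are illustrative only and are not taken from the training code.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy example with 3 classes and 3 samples
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
top1 = topk_accuracy(scores, labels, k=1)  # only the first sample is a hit
top2 = topk_accuracy(scores, labels, k=2)  # the first two samples are hits
```

Top-5 accuracy is the same computation with `k=5`, which is why it is always at least as high as top-1.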

## Files

- `best_model.pth`: Trained PyTorch checkpoint
- `real_class_mapping.json`: Human-readable class names
- `config.yaml`: Training configuration
- `food101_class_names.json`: Original class names
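
The JSON schema of the two class-name files is not documented here. The sketch below assumes each file holds either a plain list of names or an object that maps class indices to names; `load_class_names` is a hypothetical helper, so check the actual file contents before relying on it.

```python
import json


def load_class_names(path):
    """Load class names stored as a JSON list or as an {index: name} object (assumed schemas)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        # JSON object keys are strings; sort numerically to recover class order
        return [data[key] for key in sorted(data, key=int)]
    return list(data)
```

With either layout, `load_class_names(path)[i]` would give the name for predicted class index `i`.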

## Quick Usage

```python
from huggingface_hub import hf_hub_download
import torch

# Download model
ckpt_path = hf_hub_download(
    repo_id="codealchemist01/food-image-classifier-hybrid",
    filename="best_model.pth"
)

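# Load the checkpoint onto the CPU. Its internal layout (a raw state dict
# or a dict that wraps one) is not documented here, so inspect it before use.
checkpoint = torch.load(ckpt_path, map_location="cpu")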
```
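
Because the checkpoint layout is not stated, a defensive way to pull out the weights is sketched below. The wrapper keys it tries (`model_state_dict`, `state_dict`, `model`) are common conventions, not confirmed names from this repository, and `extract_state_dict` is a hypothetical helper.

```python
def extract_state_dict(checkpoint):
    """Return the weights from a checkpoint that may or may not wrap them in a dict."""
    if isinstance(checkpoint, dict):
        # Try common wrapper keys (assumed, not confirmed for this repo)
        for key in ("model_state_dict", "state_dict", "model"):
            if isinstance(checkpoint.get(key), dict):
                return checkpoint[key]
    # Otherwise assume the checkpoint already is the state dict
    return checkpoint
```

Calling `model.load_state_dict(extract_state_dict(checkpoint))` would then apply the weights, provided `model` is rebuilt with the same architecture that was used for training.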

## Demo

Try the live demo: [Food Classifier Space](https://huggingface.co/spaces/codealchemist01/food-classifier-space)

## Training Details

- **Dataset**: Food-101 (101,000 images across 101 categories)
- **Framework**: PyTorch 2.0+
- **Image Size**: 224x224
- **Optimizer**: AdamW with cosine annealing warm restarts
- **Augmentations**: Albumentations (flip, rotation, color jitter)
- **Mixed Precision**: FP16 training
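
To illustrate the learning-rate schedule named above, the following sketch evaluates cosine annealing with warm restarts in plain Python. The values `base_lr`, `eta_min`, `t_0`, and `t_mult` are placeholders, since the actual settings live in `config.yaml` and are not listed in this card. PyTorch provides this schedule as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`; whether training used that class is not stated here.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=1e-3, eta_min=1e-6, t_0=10, t_mult=2):
    """Learning rate at a given epoch under cosine annealing with warm restarts."""
    t_i, t_cur = t_0, epoch
    # Skip over completed cycles; each cycle is t_mult times longer than the last
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Cosine decay from base_lr to eta_min within the current cycle
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))


# With the placeholder settings, restarts fall at epochs 0, 10, and 30
schedule = [cosine_warm_restart_lr(e) for e in range(31)]
```

The rate falls along a cosine curve within each cycle and jumps back to `base_lr` at every restart, with cycles growing longer over time when `t_mult` is greater than 1.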

## Citation

```bibtex
@misc{food-classifier-hybrid,
  author = {codealchemist01},
  title = {Hybrid Food Image Classifier},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/codealchemist01/food-image-classifier-hybrid}}
}
```