Space: mahmoudalrefaey/FoodClassifier-ViT

title: FoodClassifier-ViT
emoji: 🍕
colorFrom: indigo
colorTo: pink
sdk: gradio
app_file: app.py
pinned: false

FoodViT - Food Classification Application

A production-ready food classification application that uses a Vision Transformer (ViT) to classify images into three categories: pizza, steak, and sushi.

🍕 Features

Web Interface: Beautiful Gradio web interface for easy image upload and classification
Command Line Tool: Batch prediction capabilities for processing multiple images
High Accuracy: Trained Vision Transformer model with excellent performance
Production Ready: Modular, well-structured codebase with proper error handling
Dynamic Example Images: Example images are randomly selected from assets/samples/ at each app launch
Easy Deployment: Simple setup and configuration
Model weights hosted on Hugging Face Hub: The model file is not included in this repository; it is automatically downloaded from the Hugging Face Model Hub at runtime.

📁 Project Structure

FoodViT/
├── app.py                  # Main application entry point
├── predict.py              # Command-line prediction script
├── config.py               # Configuration settings
├── requirements.txt        # Python dependencies
├── README.md               # This file
├── INSTALLATION.md         # Installation and troubleshooting guide
├── utils/
│   ├── model_loader.py     # Model loading utilities
│   ├── image_processor.py  # Image preprocessing
│   └── predictor.py        # Prediction logic
├── interface/
│   └── gradio_app.py       # Gradio web interface
└── assets/
    └── samples/            # Example images for Gradio interface

🚀 Quick Start

1. Installation

# Clone the repository
git clone <repository-url>
cd FoodViT

# Install dependencies
pip install -r requirements.txt

2. Run the Web Interface

# Start the Gradio web interface
python app.py

The interface will be available at http://localhost:7860

3. Command Line Usage

# Predict a single image
python predict.py path/to/image.jpg

# Predict all images in a directory
python predict.py path/to/image/directory

# Get detailed prediction information
python predict.py path/to/image.jpg --detailed

# Save results to JSON file
python predict.py path/to/image/directory --output results.json
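
The invocations above imply one positional path plus two optional flags. As a rough illustration, such a command line could be declared with argparse as follows. The function name build_parser and the help strings are assumptions made for this sketch; the real predict.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(
        prog="predict.py",
        description="Classify food images as pizza, steak, or sushi.",
    )
    # Positional argument: a single image file or a directory of images.
    parser.add_argument("path", help="image file or directory of images")
    # --detailed is a boolean switch that is off unless given.
    parser.add_argument("--detailed", action="store_true",
                        help="print detailed prediction information")
    # --output optionally names a JSON file that receives the results.
    parser.add_argument("--output", metavar="FILE", default=None,
                        help="save results to a JSON file")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["test_images/", "--detailed", "--output", "results.json"]
)
print(args.path, args.detailed, args.output)
```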

🎯 Usage Examples

Web Interface

Open your browser and go to http://localhost:7860
Upload an image of pizza, steak, or sushi
View the prediction results with confidence scores
Try the example images provided (randomly selected from assets/samples/)
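
The random selection of example images mentioned above can be sketched with the standard library alone. The function name pick_examples, the default of three examples, and the accepted file extensions are assumptions for illustration, not details taken from interface/gradio_app.py.

```python
import random
from pathlib import Path

# Assumed set of image extensions; the real app may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def pick_examples(samples_dir="assets/samples", k=3, rng=None):
    """Return up to k randomly chosen image paths from samples_dir."""
    rng = rng or random.Random()
    folder = Path(samples_dir)
    if not folder.is_dir():
        return []  # no samples folder: nothing to offer as examples
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # random.sample raises ValueError if k exceeds the population size,
    # so the request is capped at the number of available images.
    return [str(p) for p in rng.sample(images, min(k, len(images)))]
```

A list like this could be handed to the `examples` argument of Gradio's `gr.Interface`, so that each launch shows a different subset.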

Command Line

# Single image prediction
python predict.py pizza.jpg
# Output: ✅ pizza.jpg: Pizza (95.23%)

# Batch prediction with details
python predict.py test_images/ --detailed --output results.json
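
For reference, a status line in the format shown above can be produced from a result dictionary with the 'class' and 'confidence' keys used in the Direct Prediction example. The helper name format_result and the capitalization of the label are assumptions for this sketch.

```python
def format_result(filename: str, result: dict) -> str:
    """Render one prediction as a single status line."""
    label = result["class"].capitalize()
    # The ".2%" format multiplies the fraction by 100 and appends "%".
    return f"✅ {filename}: {label} ({result['confidence']:.2%})"


line = format_result("pizza.jpg", {"class": "pizza", "confidence": 0.9523})
# line == "✅ pizza.jpg: Pizza (95.23%)"
```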

⚙️ Configuration

Edit config.py to customize:

Model settings: Model path, device, image size
Class configuration: Class names and mappings
Gradio interface: Title, description, theme
Application settings: Host, port, debug mode
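
To make the four groups of settings concrete, here is one possible shape for such a configuration, written as a frozen dataclass. Every field name and default below is hypothetical except the values this README states elsewhere (224x224 input, three classes, port 7860); the actual config.py may use plain dictionaries or different names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSettings:
    """Hypothetical settings container; field names are assumptions."""
    # Model settings
    model_path: str = "model.pth"   # placeholder file name
    device: str = "cpu"             # e.g. "cuda" when a GPU is available
    image_size: int = 224           # matches the 224x224 input size
    # Class configuration
    class_names: tuple = ("pizza", "steak", "sushi")
    # Gradio interface
    title: str = "FoodViT"
    # Application settings
    host: str = "127.0.0.1"
    port: int = 7860                # port used in the Quick Start
    debug: bool = False


settings = AppSettings()
# Build a variant for local debugging without mutating the defaults.
local = AppSettings(debug=True, port=7861)
```

Freezing the dataclass means a stray assignment such as `settings.port = 80` raises an error instead of silently changing shared state.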

🔧 Advanced Usage

Custom Model Loading

from utils.model_loader import ModelLoader

# Load the model configured in config.py
loader = ModelLoader()
loader.load_model()
model = loader.get_model()

Image Preprocessing

from utils.image_processor import ImageProcessor

# Preprocess custom image
processor = ImageProcessor()
tensor = processor.preprocess_image("path/to/image.jpg")

Direct Prediction

from utils.predictor import FoodPredictor

# Initialize and predict
predictor = FoodPredictor()
predictor.initialize()
result = predictor.predict("path/to/image.jpg")
print(f"Predicted: {result['class']} ({result['confidence']:.2%})")
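
The confidence printed above is a fraction between 0 and 1. A common way to obtain such a value from a classifier is to apply softmax to the raw output scores and take the largest probability. The sketch below shows that step in plain Python; whether FoodPredictor computes confidence exactly this way is an assumption, and the class order and helper names are hypothetical.

```python
import math

CLASS_NAMES = ["pizza", "steak", "sushi"]  # order is an assumption


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_result(logits, class_names=CLASS_NAMES):
    """Map raw scores to a {'class', 'confidence'} dictionary."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"class": class_names[best], "confidence": probs[best]}


result = to_result([2.0, 0.5, -1.0])
print(f"Predicted: {result['class']} ({result['confidence']:.2%})")
```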

📊 Model Information

Architecture: Vision Transformer (ViT-Base)
Input Size: 224x224 pixels
Classes: 3 (pizza, steak, sushi)
Training Data: Pizza-Steak-Sushi dataset
Framework: PyTorch with Transformers
Model weights: Downloaded automatically from the Hugging Face Model Hub (see model repo)
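
Downloading weights at runtime is typically done with `hf_hub_download` from the huggingface_hub package, which caches the file and returns its local path. The sketch below prefers a local copy when one exists and falls back to the Hub otherwise. The function name resolve_weights is hypothetical, and the repository ID and file name in the usage comment are placeholders, since this README does not state them.

```python
from pathlib import Path


def resolve_weights(local_path, repo_id, filename):
    """Return a path to the weights, downloading only when needed."""
    path = Path(local_path)
    if path.is_file():
        return str(path)  # already on disk: no network access
    # Imported lazily so the local-file branch works without the package.
    from huggingface_hub import hf_hub_download
    # hf_hub_download stores the file in the local cache and returns its path.
    return hf_hub_download(repo_id=repo_id, filename=filename)


# Usage (placeholders, not the real repository or file name):
# weights = resolve_weights("model.pth", "<user>/<model-repo>", "model.pth")
```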

🛠️ Development

Project Structure

utils/: Core utilities for model loading, image processing, and prediction
interface/: Web interface components
assets/samples/: Example images and static assets

Adding New Features

New Model: Update config.py and utils/model_loader.py
New Classes: Modify config.py CLASS_CONFIG
New Interface: Create new files in interface/
New Utilities: Add to utils/ directory
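
As an illustration of the "New Classes" step, the index and label lookups can be derived from a single ordered list of names so that they stay consistent when a class is added. The helper name build_class_config, the dictionary keys, and the extra class "ramen" are hypothetical; only the name CLASS_CONFIG comes from this README, and its real layout may differ.

```python
def build_class_config(class_names):
    """Derive index<->label lookups from an ordered list of names."""
    if len(set(class_names)) != len(class_names):
        raise ValueError("class names must be unique")
    id2label = dict(enumerate(class_names))
    label2id = {name: idx for idx, name in id2label.items()}
    return {
        "num_classes": len(class_names),
        "id2label": id2label,
        "label2id": label2id,
    }


# Hypothetical extension from three classes to four.
CLASS_CONFIG = build_class_config(["pizza", "steak", "sushi", "ramen"])
```

Changing the class list alone is not sufficient: the model weights must also come from a model trained with a matching number of output classes.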

🧹 Project Cleanliness & GitHub Readiness

All unnecessary files and caches have been removed
Example images are dynamically loaded
No test or debug files in the repo
Model weights are not included in the repo (downloaded from the Hub)
Ready for production and version control

🐛 Troubleshooting

See INSTALLATION.md for detailed troubleshooting, dependency, and environment tips.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

📞 Support

For questions and support:

Open an issue on GitHub
Check the troubleshooting section
Review the configuration options

Enjoy classifying your food images! 🍕🥩🍣