Spaces: mahmoudalrefaey/FoodClassifier-ViT

mahmoudalrefaey committed on Jul 1, 2025

Commit 2417fa0 (verified) · 1 parent: e535ad1

Upload 7 files

Files changed (7)

.gitignore +79 -0
INSTALLATION.md +185 -0
README.md +194 -0
app.py +125 -0
config.py +50 -0
predict.py +210 -0
requirements.txt +12 -0

.gitignore ADDED

	@@ -0,0 +1,79 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+# Virtual environments
+.venv/
+venv/
+ENV/
+env.bak/
+foodvit_env/
+# PyInstaller
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+# Jupyter Notebook
+.ipynb_checkpoints
+# pyenv
+.python-version
+# mypy
+.mypy_cache/
+.dmypy.json
+# VS Code
+.vscode/
+# Mac
+.DS_Store
+# Windows
+Thumbs.db
+Desktop.ini
+# Model weights (optional: comment out if you want to track model files)
+model/*.pth
+# Sample images (optional: comment out if you want to track sample images)
+assets/samples/*

INSTALLATION.md ADDED

	@@ -0,0 +1,185 @@

+# Installation Guide for FoodViT
+## Prerequisites
+- Python 3.8 or higher
+- pip package manager
+- At least 4GB RAM (8GB recommended)
+- GPU support optional but recommended for faster inference
+## Installation Steps
+### 1. Clone or Download the Project
+Make sure you have all the project files in your directory:
+- `app.py` - Main application
+- `predict.py` - Command line tool
+- `config.py` - Configuration
+- `requirements.txt` - Dependencies
+- `model/bestViT_PT.pth` - Trained model
+- All utility and interface files
+### 2. Create a Virtual Environment (Recommended)
+```bash
+# Create virtual environment
+python -m venv foodvit_env
+# Activate virtual environment
+# On Windows:
+foodvit_env\Scripts\activate
+# On macOS/Linux:
+source foodvit_env/bin/activate
+```
+### 3. Install Dependencies
+```bash
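+# Install PyTorch first (choose appropriate version for your system)
+# requirements.txt pins torch<2.0.0 and torchvision<0.15.0, so use the same bounds here
+# For CPU only:
+pip install "torch>=1.9.0,<2.0.0" "torchvision>=0.10.0,<0.15.0" --index-url https://download.pytorch.org/whl/cpu
+# For CUDA (if you have NVIDIA GPU; the cu118 index only carries torch 2.x builds):
+# pip install "torch>=1.9.0,<2.0.0" "torchvision>=0.10.0,<0.15.0" --index-url https://download.pytorch.org/whl/cu117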
+# Install other dependencies
+pip install -r requirements.txt
+```
+### 4. Troubleshooting Dependency Issues
+If you encounter dependency conflicts, try this step-by-step approach:
+```bash
+# 1. Install core dependencies first
+pip install "torch>=1.9.0,<2.0.0" "torchvision>=0.10.0,<0.15.0"
+pip install transformers==4.28.0
+pip install huggingface-hub==0.15.1
+pip install accelerate==0.20.3
+# 2. Install image processing libraries
+pip install Pillow opencv-python albumentations
+# 3. Install Gradio
+pip install "gradio>=3.50.0,<4.0.0"
+# 4. Install other utilities
+pip install numpy scikit-learn datasets
+```
+### 5. Alternative: Use Conda
+If you prefer conda:
+```bash
+# Create conda environment
+conda create -n foodvit python=3.9
+conda activate foodvit
+# Install PyTorch
+conda install "pytorch<2.0" "torchvision<0.15" -c pytorch
+# Install other packages
+pip install transformers==4.28.0 huggingface-hub==0.15.1
+pip install "gradio>=3.50.0,<4.0.0"
+pip install -r requirements.txt
+```
+## Testing the Installation
+### 1. Run Basic Tests
+```bash
+python simple_test.py
+```
+This should show all tests passing.
+### 2. Test the Web Interface
+```bash
+python app.py
+```
+Then open your browser to `http://localhost:7860`
+### 3. Test Command Line Tool
+```bash
+# Test help
+python predict.py --help
+# Test with a sample image (if you have one)
+python predict.py path/to/your/image.jpg
+```
+## Common Issues and Solutions
+### Issue: "cannot import name 'split_torch_state_dict_into_shards'"
+**Solution**: This is a version compatibility issue. Try:
+```bash
+pip uninstall huggingface-hub transformers accelerate
+pip install huggingface-hub==0.15.1 transformers==4.28.0 accelerate==0.20.3
+```
+### Issue: CUDA/GPU not working
+**Solution**:
+1. Check if you have NVIDIA GPU
+2. Install appropriate CUDA version
+3. Install PyTorch with CUDA support
+4. Or set `MODEL_CONFIG["device"]` to `"cpu"` in `config.py`
+### Issue: Model file not found
+**Solution**: Ensure `model/bestViT_PT.pth` exists in the project directory.
+### Issue: Memory errors
+**Solution**:
+1. Close other applications
+2. Use CPU instead of GPU
+3. Reduce batch size in configuration
+## System Requirements
+### Minimum Requirements
+- Python 3.8+
+- 4GB RAM
+- 500MB disk space
+### Recommended Requirements
+- Python 3.9+
+- 8GB RAM
+- NVIDIA GPU with CUDA support
+- 1GB disk space
+## Verification
+After successful installation, you should be able to:
+1. ✅ Run `python simple_test.py` without errors
+2. ✅ Start the web interface with `python app.py`
+3. ✅ Use command line tool with `python predict.py --help`
+4. ✅ Upload images and get predictions in the web interface
+## Getting Help
+If you encounter issues:
+1. Check the error messages carefully
+2. Ensure all dependencies are installed correctly
+3. Try the troubleshooting steps above
+4. Check if your Python version is compatible
+5. Verify the model file exists and is not corrupted
+## Next Steps
+Once installation is complete:
+1. **Web Interface**: Run `python app.py` and visit `http://localhost:7860`
+2. **Command Line**: Use `python predict.py` for batch processing
+3. **Customization**: Edit `config.py` to modify settings
+4. **Development**: Use the modular structure for extending functionality
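
The troubleshooting steps in INSTALLATION.md depend on which versions of the pinned packages are actually installed. A minimal sketch for listing them, assuming only the standard-library `importlib.metadata` module (Python 3.8+); the `PINNED` list and the `installed_versions` helper are illustrative names and are not part of this commit:

```python
from importlib import metadata

# Distribution names as written in requirements.txt (pip names, not import names).
PINNED = ["torch", "transformers", "gradio", "huggingface-hub", "accelerate"]


def installed_versions(names):
    """Return {name: version string, or None when the distribution is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PINNED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Comparing this output with the ranges in `requirements.txt` is usually enough to spot the mismatch behind an import error such as the `split_torch_state_dict_into_shards` one.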

README.md ADDED

	@@ -0,0 +1,194 @@

+# FoodViT - Food Classification Application
+A production-ready food classification application using Vision Transformer (ViT) that can classify images into three categories: **pizza**, **steak**, and **sushi**.
+## 🍕 Features
+- **Web Interface**: Beautiful Gradio web interface for easy image upload and classification
+- **Command Line Tool**: Batch prediction capabilities for processing multiple images
+- **High Accuracy**: Trained Vision Transformer model with excellent performance
+- **Production Ready**: Modular, well-structured codebase with proper error handling
+- **Dynamic Example Images**: Example images are randomly selected from `assets/samples/` at each app launch
+- **Easy Deployment**: Simple setup and configuration
+## 📁 Project Structure
+```
+FoodViT/
+├── app.py                 # Main application entry point
+├── predict.py            # Command-line prediction script
+├── config.py             # Configuration settings
+├── requirements.txt      # Python dependencies
+├── README.md            # This file
+├── INSTALLATION.md      # Installation and troubleshooting guide
+├── model/
+│   └── bestViT_PT.pth   # Trained PyTorch model
+├── utils/
+│   ├── model_loader.py  # Model loading utilities
+│   ├── image_processor.py # Image preprocessing
+│   └── predictor.py     # Prediction logic
+├── interface/
+│   └── gradio_app.py    # Gradio web interface
+└── assets/
+    └── samples/         # Example images for Gradio interface
+```
+## 🚀 Quick Start
+### 1. Installation
+```bash
+# Clone the repository
+git clone <repository-url>
+cd FoodViT
+# Install dependencies
+pip install -r requirements.txt
+```
+### 2. Run the Web Interface
+```bash
+# Start the Gradio web interface
+python app.py
+```
+The interface will be available at `http://localhost:7860`
+### 3. Command Line Usage
+```bash
+# Predict a single image
+python predict.py path/to/image.jpg
+# Predict all images in a directory
+python predict.py path/to/image/directory
+# Get detailed prediction information
+python predict.py path/to/image.jpg --detailed
+# Save results to JSON file
+python predict.py path/to/image/directory --output results.json
+```
+## 🎯 Usage Examples
+### Web Interface
+1. Open your browser and go to `http://localhost:7860`
+2. Upload an image of pizza, steak, or sushi
+3. View the prediction results with confidence scores
+4. Try the example images provided (randomly selected from `assets/samples/`)
+### Command Line
+```bash
+# Single image prediction
+python predict.py pizza.jpg
+# Output: ✅ pizza.jpg: Pizza (95.23%)
+# Batch prediction with details
+python predict.py test_images/ --detailed --output results.json
+```
+## ⚙️ Configuration
+Edit `config.py` to customize:
+- **Model settings**: Model path, device, image size
+- **Class configuration**: Class names and mappings
+- **Gradio interface**: Title, description, theme
+- **Application settings**: Host, port, debug mode
+## 🔧 Advanced Usage
+### Custom Model Loading
+```python
+from utils.model_loader import ModelLoader
+# Load custom model
+loader = ModelLoader()
+loader.load_model()
+model = loader.get_model()
+```
+### Image Preprocessing
+```python
+from utils.image_processor import ImageProcessor
+# Preprocess custom image
+processor = ImageProcessor()
+tensor = processor.preprocess_image("path/to/image.jpg")
+```
+### Direct Prediction
+```python
+from utils.predictor import FoodPredictor
+# Initialize and predict
+predictor = FoodPredictor()
+predictor.initialize()
+result = predictor.predict("path/to/image.jpg")
+print(f"Predicted: {result['class']} ({result['confidence']:.2%})")
+```
+## 📊 Model Information
+- **Architecture**: Vision Transformer (ViT-Base)
+- **Input Size**: 224x224 pixels
+- **Classes**: 3 (pizza, steak, sushi)
+- **Training Data**: Pizza-Steak-Sushi dataset
+- **Framework**: PyTorch with Transformers
+## 🛠️ Development
+### Project Structure
+- **`utils/`**: Core utilities for model loading, image processing, and prediction
+- **`interface/`**: Web interface components
+- **`model/`**: Trained model files
+- **`assets/samples/`**: Example images and static assets
+### Adding New Features
+1. **New Model**: Update `config.py` and `utils/model_loader.py`
+2. **New Classes**: Modify `CLASS_CONFIG` in `config.py` and keep `MODEL_CONFIG["num_labels"]` in sync
+3. **New Interface**: Create new files in `interface/`
+4. **New Utilities**: Add to `utils/` directory
+## 🧹 Project Cleanliness & GitHub Readiness
+- All unnecessary files and caches have been removed
+- Example images are dynamically loaded
+- No test or debug files in the repo
+- Ready for production and version control
+## 🐛 Troubleshooting
+See `INSTALLATION.md` for detailed troubleshooting, dependency, and environment tips.
+## 📝 License
+This project is licensed under the MIT License - see the LICENSE file for details.
+## 🤝 Contributing
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Add tests if applicable
+5. Submit a pull request
+## 📞 Support
+For questions and support:
+- Open an issue on GitHub
+- Check the troubleshooting section
+- Review the configuration options
+---
+**Enjoy classifying your food images! 🍕🥩🍣**
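
The README's "Direct Prediction" example reads `result['class']` and `result['confidence']`, and `predict.py` additionally reads `success`, `class_id` and `probabilities`. `utils/predictor.py` is not part of this commit, so the following is only a sketch of how such a result dict could be built from three raw class scores, assuming a softmax over the logits; `build_result` is a hypothetical helper, not the project's API:

```python
import math

# Same order as CLASS_CONFIG["class_names"] in config.py.
CLASS_NAMES = ["pizza", "steak", "sushi"]


def build_result(logits):
    """Turn raw per-class scores into a result dict (hypothetical helper)."""
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    class_id = probs.index(max(probs))
    return {
        "success": True,
        "class": CLASS_NAMES[class_id],
        "class_id": class_id,
        "confidence": probs[class_id],
        "probabilities": dict(zip(CLASS_NAMES, probs)),
    }


result = build_result([2.0, 0.5, 0.1])
print(f"Predicted: {result['class']} ({result['confidence']:.2%})")
```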

app.py ADDED

	@@ -0,0 +1,125 @@

+"""
+Main application file for FoodViT
+Entry point for the food classification application
+"""
+import os
+import sys
+import argparse
+from pathlib import Path
+# Add current directory to path for imports
+sys.path.append(os.path.dirname(os.path.abspath(__file__)))
+from config import APP_CONFIG
+from interface.gradio_app import launch_interface
+from utils.predictor import predictor
+def check_dependencies():
+    """Check if all required dependencies are available"""
+    required_packages = [
+        'torch',
+        'transformers',
+        'gradio',
+        'PIL',
+        'cv2',
+        'albumentations',
+        'numpy'
+    ]
+    missing_packages = []
+    for package in required_packages:
+        try:
+            __import__(package)
+        except ImportError:
+            missing_packages.append(package)
+    if missing_packages:
+        print(f"Missing required packages: {', '.join(missing_packages)}")
+        print("Please install them using: pip install -r requirements.txt")
+        return False
+    return True
+def check_model_file():
+    """Check if the model file exists"""
+    model_path = Path("model/bestViT_PT.pth")
+    if not model_path.exists():
+        print(f"Model file not found: {model_path}")
+        print("Please ensure the trained model file is in the model/ directory")
+        return False
+    return True
+def main():
+    """Main function to run the application"""
+    # Parse command line arguments
+    parser = argparse.ArgumentParser(description="FoodViT - Food Classification Application")
+    parser.add_argument(
+        "--port",
+        type=int,
+        default=APP_CONFIG["port"],
+        help="Port to run the server on"
+    )
+    parser.add_argument(
+        "--host",
+        type=str,
+        default=APP_CONFIG["host"],
+        help="Host to run the server on"
+    )
+    parser.add_argument(
+        "--share",
+        action="store_true",
+        help="Create a public link for the interface"
+    )
+    parser.add_argument(
+        "--debug",
+        action="store_true",
+        help="Enable debug mode"
+    )
+    args = parser.parse_args()
+    print("=" * 50)
+    print("FoodViT - Food Classification Application")
+    print("=" * 50)
+    # Check dependencies
+    print("Checking dependencies...")
+    if not check_dependencies():
+        sys.exit(1)
+    print("✓ All dependencies available")
+    # Check model file
+    print("Checking model file...")
+    if not check_model_file():
+        sys.exit(1)
+    print("✓ Model file found")
+    # Initialize predictor
+    print("Initializing model...")
+    if not predictor.initialize():
+        print("✗ Failed to initialize model")
+        sys.exit(1)
+    print("✓ Model initialized successfully")
+    # Get model info
+    model_info = predictor.get_model_info()
+    if "error" not in model_info:
+        print(f"✓ Model loaded on {model_info['device']}")
+        print(f"✓ Total parameters: {model_info['total_parameters']:,}")
+    print("\nStarting Gradio interface...")
+    print(f"Server will be available at: http://{args.host}:{args.port}")
+    try:
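+        # Launch the interface
+        # NOTE: args.host, args.port, args.share and args.debug are parsed above but are
+        # not forwarded here; launch_interface() is called without arguments.
+        launch_interface()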
+    except KeyboardInterrupt:
+        print("\nApplication stopped by user")
+    except Exception as e:
+        print(f"Error running application: {e}")
+        sys.exit(1)
+if __name__ == "__main__":
+    main()
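
`check_dependencies()` in `app.py` reports missing packages by import name, so a missing `PIL` or `cv2` is printed under a name that `pip install` does not accept. A minimal sketch of a variant that maps import names to pip distribution names; it uses `importlib.util.find_spec`, which looks a module up without importing it, rather than the `__import__` call in the original, and the `REQUIRED` mapping and `missing_distributions` helper are illustrative names:

```python
from importlib.util import find_spec

# import name -> pip distribution name
REQUIRED = {
    "torch": "torch",
    "transformers": "transformers",
    "gradio": "gradio",
    "PIL": "Pillow",
    "cv2": "opencv-python",
    "albumentations": "albumentations",
    "numpy": "numpy",
}


def missing_distributions(required):
    """Return pip names for top-level modules that cannot be found."""
    return [pip_name for module, pip_name in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_distributions(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
        print("Install them with: pip install " + " ".join(missing))
    else:
        print("All dependencies available")
```

Because `find_spec` does not execute the module, this check is faster than importing `torch`, but it will not catch a package that is present and fails at import time.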

config.py ADDED

	@@ -0,0 +1,50 @@

+"""
+Configuration file for FoodViT project
+Contains all model and application settings
+"""
+import os
+import torch
+# Model Configuration
+MODEL_CONFIG = {
+    "model_path": "model/bestViT_PT.pth",
+    "feature_extractor_name": "google/vit-base-patch16-224",
+    "num_labels": 3,
+    "image_size": 224,
+    "device": "cuda" if torch.cuda.is_available() else "cpu"
+}
+# Class Configuration
+CLASS_CONFIG = {
+    "class_names": ["pizza", "steak", "sushi"],
+    "id2label": {0: "pizza", 1: "steak", 2: "sushi"},
+    "label2id": {"pizza": 0, "steak": 1, "sushi": 2}
+}
+# Image Processing Configuration
+IMAGE_CONFIG = {
+    "target_size": (224, 224),
+    "normalize_mean": [0.5, 0.5, 0.5],
+    "normalize_std": [0.5, 0.5, 0.5]
+}
+# Gradio Interface Configuration
+GRADIO_CONFIG = {
+    "title": "FoodViT - Food Classification",
+    "description": "Upload an image to classify it as pizza, steak, or sushi",
+    "examples": [
+        ["assets/example_pizza.jpg"],
+        ["assets/example_steak.jpg"],
+        ["assets/example_sushi.jpg"]
+    ],
+    "theme": "default"
+}
+# Application Configuration
+APP_CONFIG = {
+    "debug": False,
+    "host": "127.0.0.1",
+    "port": 7860,
+    "share": False
+}
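
`CLASS_CONFIG` in `config.py` spells out `class_names`, `id2label` and `label2id` by hand, and `MODEL_CONFIG["num_labels"]` repeats the count. One way to keep the four from drifting apart when classes change is to derive them from a single list. This is a sketch only; `build_class_config` is a hypothetical helper that does not exist in the commit:

```python
CLASS_NAMES = ["pizza", "steak", "sushi"]


def build_class_config(class_names):
    """Build class_names/id2label/label2id from one ordered list (hypothetical helper)."""
    if len(set(class_names)) != len(class_names):
        raise ValueError("class names must be unique")
    return {
        "class_names": list(class_names),
        "id2label": dict(enumerate(class_names)),
        "label2id": {name: index for index, name in enumerate(class_names)},
    }


CLASS_CONFIG = build_class_config(CLASS_NAMES)
NUM_LABELS = len(CLASS_CONFIG["class_names"])  # value for MODEL_CONFIG["num_labels"]
```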

predict.py ADDED

	@@ -0,0 +1,210 @@

+"""
+Command-line prediction script for FoodViT
+Allows batch prediction and testing of the model
+"""
+import os
+import sys
+import argparse
+from pathlib import Path
+from PIL import Image
+# Add current directory to path for imports
+sys.path.append(os.path.dirname(os.path.abspath(__file__)))
+from utils.predictor import predictor
+from config import CLASS_CONFIG
+def predict_single_image(image_path):
+    """
+    Predict food class for a single image
+    Args:
+        image_path: Path to the image file
+    Returns:
+        dict: Prediction results
+    """
+    try:
+        # Check if file exists
+        if not os.path.exists(image_path):
+            return {"error": f"Image file not found: {image_path}"}
+        # Load image and force three channels so RGBA or grayscale files are handled too
+        image = Image.open(image_path).convert("RGB")
+        # Make prediction
+        result = predictor.predict(image)
+        return result
+    except Exception as e:
+        return {"error": f"Error processing {image_path}: {str(e)}"}
+def predict_batch_images(image_dir):
+    """
+    Predict food classes for all images in a directory
+    Args:
+        image_dir: Directory containing images
+    Returns:
+        list: List of prediction results
+    """
+    results = []
+    # Supported image extensions
+    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.tif'}
+    try:
+        # Get all image files in directory (sorted for a stable processing order)
+        image_files = sorted(
+            f for f in os.listdir(image_dir)
+            if Path(f).suffix.lower() in image_extensions
+        )
+        if not image_files:
+            print(f"No image files found in {image_dir}")
+            return results
+        print(f"Found {len(image_files)} image files")
+        # Process each image
+        for i, filename in enumerate(image_files, 1):
+            image_path = os.path.join(image_dir, filename)
+            print(f"Processing {i}/{len(image_files)}: {filename}")
+            result = predict_single_image(image_path)
+            result['filename'] = filename
+            results.append(result)
+        return results
+    except Exception as e:
+        print(f"Error processing directory {image_dir}: {str(e)}")
+        return results
+def print_results(results, detailed=False):
+    """
+    Print prediction results in a formatted way
+    Args:
+        results: Single result dict or list of results
+        detailed: Whether to print detailed information
+    """
+    if isinstance(results, dict):
+        results = [results]
+    for result in results:
+        if "error" in result:
+            filename = result.get('filename', 'Unknown')
+            print(f"❌ {filename}: {result['error']}")
+            continue
+        if not result.get("success", False):
+            filename = result.get('filename', 'Unknown')
+            print(f"❌ {filename}: Prediction failed")
+            continue
+        # Extract information
+        filename = result.get('filename', 'Image')
+        predicted_class = result["class"]
+        confidence = result["confidence"]
+        # Print basic result
+        print(f"✅ {filename}: {predicted_class.title()} ({confidence:.2%})")
+        # Print detailed information if requested
+        if detailed:
+            print(f"   Class ID: {result['class_id']}")
+            print("   All probabilities:")
+            for class_name, prob in result["probabilities"].items():
+                print(f"     - {class_name.title()}: {prob:.2%}")
+            print()
+def main():
+    """Main function for command-line prediction"""
+    parser = argparse.ArgumentParser(description="FoodViT - Command Line Prediction")
+    parser.add_argument(
+        "input",
+        help="Image file path or directory containing images"
+    )
+    parser.add_argument(
+        "--detailed",
+        action="store_true",
+        help="Show detailed prediction information"
+    )
+    parser.add_argument(
+        "--output",
+        type=str,
+        help="Output file to save results (JSON format)"
+    )
+    args = parser.parse_args()
+    print("FoodViT - Command Line Prediction")
+    print("=" * 40)
+    # Initialize predictor
+    print("Initializing model...")
+    if not predictor.initialize():
+        print("Failed to initialize model")
+        sys.exit(1)
+    print("✓ Model initialized successfully")
+    # Check if input is file or directory
+    input_path = Path(args.input)
+    if input_path.is_file():
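+        # Single image prediction
+        print(f"Predicting single image: {args.input}")
+        result = predict_single_image(args.input)
+        # Record the file name so the output reads "pizza.jpg: ..." instead of "Image: ..."
+        result.setdefault("filename", input_path.name)
+        results = [result]
+        print_results(results, args.detailed)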
+    elif input_path.is_dir():
+        # Batch prediction
+        print(f"Predicting images in directory: {args.input}")
+        results = predict_batch_images(args.input)
+        print_results(results, args.detailed)
+    else:
+        print(f"Error: {args.input} is not a valid file or directory")
+        sys.exit(1)
+    # Save results if output file specified
+    if args.output and results:
+        try:
+            import json
+            # Convert numpy types to native Python types for JSON serialization
+            json_results = []
+            for result in results:
+                json_result = {}
+                for key, value in result.items():
+                    if key == 'probabilities':
+                        json_result[key] = {k: float(v) for k, v in value.items()}
+                    elif isinstance(value, (int, float, str, bool)):
+                        json_result[key] = value
+                    else:
+                        json_result[key] = str(value)
+                json_results.append(json_result)
+            with open(args.output, 'w') as f:
+                json.dump(json_results, f, indent=2)
+            print(f"Results saved to: {args.output}")
+        except Exception as e:
+            print(f"Error saving results: {e}")
+    # Print summary
+    successful_predictions = [r for r in results if r.get("success", False)]
+    failed_predictions = len(results) - len(successful_predictions)
+    print("\nSummary:")
+    print(f"Total images: {len(results)}")
+    print(f"Successful predictions: {len(successful_predictions)}")
+    print(f"Failed predictions: {failed_predictions}")
+if __name__ == "__main__":
+    main()
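
The `--output` branch of `predict.py` converts values by hand and falls back to `str(value)`, which would turn a non-native numeric scalar (for example a NumPy `float32`) into a string in the JSON file. A possible alternative is a `default=` hook for `json.dump`, sketched below. It assumes such scalars expose an `.item()` method, as NumPy scalars and zero-dimensional tensors do; `to_native` and the `FakeScalar` stand-in are illustrative names used so the example runs without NumPy installed:

```python
import json


def to_native(value):
    """Fallback for json.dump: called only for objects json cannot serialize itself."""
    # Assumption: scalar-like objects expose .item(), which returns a Python scalar.
    if hasattr(value, "item"):
        return value.item()
    return str(value)


class FakeScalar:
    """Stand-in for a NumPy scalar so the sketch has no third-party dependency."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


result = {
    "filename": "pizza.jpg",
    "class": "pizza",
    "confidence": FakeScalar(0.95),
    "probabilities": {"pizza": FakeScalar(0.95), "steak": FakeScalar(0.03)},
}
print(json.dumps(result, default=to_native, indent=2))
```

The hook is applied to nested values as well, so the `probabilities` dict needs no special case; dictionary keys are not passed through it and must already be strings or numbers.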

requirements.txt ADDED

	@@ -0,0 +1,12 @@

+torch>=1.9.0,<2.0.0
+torchvision>=0.10.0,<0.15.0
+transformers>=4.20.0,<4.30.0
+gradio>=3.50.0,<4.0.0
+Pillow>=8.0.0,<10.0.0
+opencv-python>=4.5.0,<4.8.0
+albumentations>=1.3.0,<1.4.0
+numpy>=1.21.0,<1.25.0
+scikit-learn>=1.0.0,<1.3.0
+datasets>=2.0.0,<2.14.0
+accelerate>=0.20.0,<0.21.0
+huggingface-hub>=0.15.0,<0.16.0