---
license: mit
tags:
- vision
- food-recognition
- ingredients
- utensils
- portion-size
- computer-vision
- mobile
- ug-food-dataset
---

# UG Food Detection Model

This model identifies food ingredients and kitchen utensils, and estimates portion sizes, from images.

## Model Description

This Vision Transformer (ViT) model is trained on the UG Food Dataset to recognize:

- Food ingredients: Various food items and ingredients
- Kitchen utensils: Cooking tools and equipment
- Portion sizes: Measurement estimates

## Classes

The model can identify 40 classes.

## Usage

```python
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = ViTImageProcessor.from_pretrained("ssevan/ug-food-detector")
model = ViTForImageClassification.from_pretrained("ssevan/ug-food-detector")
model.eval()

# Process image (convert to RGB so grayscale or RGBA files are handled too)
image = Image.open('food_image.jpg').convert('RGB')
inputs = processor(image, return_tensors='pt')

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class_idx = torch.argmax(probabilities, dim=-1).item()
print(f'Predicted class index: {predicted_class_idx}')
```

## Mobile Usage

This model is optimized for mobile deployment.
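
## Interpreting Predictions

The usage snippet prints only a class index. The sketch below shows one possible way to turn raw logits into ranked, human-readable predictions using only the Python standard library. The helper name `top_k_labels`, the example logits, and the label names are hypothetical and chosen for illustration; the sketch assumes the checkpoint's config carries a meaningful `id2label` mapping, which is not confirmed here.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs for one image."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Fall back to a generic name when an index has no label entry.
    return [(id2label.get(i, f"class_{i}"), probs[i]) for i in ranked[:k]]


# Hypothetical logits and label names, for illustration only.
example_logits = [0.2, 2.5, -1.0, 1.1]
example_labels = {0: "label_a", 1: "label_b", 2: "label_c", 3: "label_d"}

for label, prob in top_k_labels(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as in the Usage section, the same helper could be called as `top_k_labels(outputs.logits[0].tolist(), model.config.id2label)`, assuming a single image was passed to the processor.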