--- license: mit tags: - object-detection - computer-vision - feet - pytorch - faster-rcnn library_name: torchvision pipeline_tag: object-detection model-index: - name: Faster R-CNN Foot Detector results: [] --- # 🦶 Faster R-CNN Foot Detection Model ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/648a824a8ca6cf9857d1349c/5yqaFK6aNuC_suQE2o_BA.jpeg) by [Tony Assi](https://www.tonyassi.com/) This model detects **feet or shoes** in an image using a fine-tuned [Faster R-CNN](https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn.html) model from Torchvision. It was trained on a small custom dataset of foot annotations and is intended as a starting point for foot/shoe detection in street, fashion, or movement-based applications. Web demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/tonyassi/foot-detection) --- ## 🧠 Model Details - Base model: `fasterrcnn_resnet50_fpn` (pretrained on COCO) - Fine-tuned on 40 images of feet/shoes - Class labels: - `1`: foot/shoe - Bounding box outputs with confidence scores - Optimized for CPU (but works with MPS and CUDA) --- ## ⚡️ Quick Start To download this repository: ```bash git clone https://github.com/tonyassi/FootDetection.git cd FootDetection ``` Install: ```bash pip install -r requirements.txt ``` Usage: ```python from FootDetection import FootDetection from PIL import Image # Initialize model (first run will auto-download weights) foot_detection = FootDetection("cpu") # "cuda" for GPU or "mps" for Apple Silicon # Load image img = Image.open("image.jpg").convert("RGB") # Run detection results = foot_detection.detect(img, threshold=0.1) print(results) # Draw boxes img_with_boxes = foot_detection.draw_boxes(img) img_with_boxes.show() img_with_boxes.save("annotated_image.jpg") ``` --- ## 📦 Usage ```bash pip install torch torchvision pillow huggingface_hub ``` ```python import os import torch import torchvision from torchvision.models.detection.faster_rcnn import FastRCNNPredictor from PIL import Image, ImageDraw from torchvision.transforms import functional as F from huggingface_hub import hf_hub_download # ===== CONFIG ===== device = torch.device("cpu") # or "mps" if stable checkpoint_dir = "checkpoints" checkpoint_file = "fasterrcnn_foot.pth" local_path = os.path.join(checkpoint_dir, checkpoint_file) # ===== Ensure Checkpoint Exists ===== if not os.path.exists(local_path): os.makedirs(checkpoint_dir, exist_ok=True) print("Downloading model from Hugging Face...") local_path = hf_hub_download( repo_id="tonyassi/foot-detection", filename=checkpoint_file, local_dir=checkpoint_dir ) # ===== Load Model ===== model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT") in_features = model.roi_heads.box_predictor.cls_score.in_features model.roi_heads.box_predictor = FastRCNNPredictor(in_features, 2) model.load_state_dict(torch.load(local_path, map_location=device)) model.to(device) model.eval() # ===== Function: Foot Detection ===== def foot_detection(image, threshold=0.1): """Takes a PIL image, returns bounding boxes + scores above threshold""" image_tensor = F.to_tensor(image).unsqueeze(0).to(device) with torch.no_grad(): outputs = model(image_tensor)[0] boxes = [] scores = [] for box, score in zip(outputs["boxes"], outputs["scores"]): if score >= threshold: boxes.append(box.tolist()) scores.append(score.item()) return { "boxes": boxes, "scores": scores } # ===== Function: Draw Bounding Boxes ===== def draw_bounding_box(image, detection): """Draws boxes and scores on a copy of the image""" image_copy = image.copy() draw = ImageDraw.Draw(image_copy) for box, score in zip(detection["boxes"], detection["scores"]): x0, y0, x1, y1 = box draw.rectangle([x0, y0, x1, y1], outline="red", width=3) draw.text((x0, y0), f"{score:.2f}", fill="red") return image_copy from PIL import Image # ==== Load and prepare image ==== image_path = "test.jpg" # replace with your image path image = Image.open(image_path).convert("RGB") # ==== Run detection ==== detections = foot_detection(image, threshold=0.3) # ==== Draw results ==== result_image = draw_bounding_box(image, detections) result_image.show() # or result_image.save("output.jpg") ```