How to use prithivMLmods/rf-detr-mobile-gui-detection with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("object-detection", model="prithivMLmods/rf-detr-mobile-gui-detection")

# Load model directly
from transformers import AutoImageProcessor, AutoModelForObjectDetection

processor = AutoImageProcessor.from_pretrained("prithivMLmods/rf-detr-mobile-gui-detection")
model = AutoModelForObjectDetection.from_pretrained("prithivMLmods/rf-detr-mobile-gui-detection")
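The object-detection pipeline in Transformers typically returns one dict per detection, with a `score`, a `label`, and a `box` dict holding `xmin`, `ymin`, `xmax`, and `ymax`. The following is a minimal sketch, assuming that output format and assuming this checkpoint loads through the pipeline helper. `to_records` is a hypothetical helper name, the sample detections are invented values for illustration, and the default `min_score` simply mirrors the 0.35 threshold used in the Quick Start.

```python
def to_records(pipeline_output, min_score=0.35):
    """Flatten pipeline detections into label / confidence / box records.

    Assumes each item has the shape:
    {"score": float, "label": str,
     "box": {"xmin": int, "ymin": int, "xmax": int, "ymax": int}}
    """
    records = []
    for det in pipeline_output:
        # Skip low-confidence detections
        if det["score"] < min_score:
            continue
        b = det["box"]
        records.append(
            {
                "Label": det["label"],
                "Confidence": round(det["score"], 3),
                "Bounding Box": [b["xmin"], b["ymin"], b["xmax"], b["ymax"]],
            }
        )
    return records


# Invented sample in the assumed pipeline output format (not real model output)
sample = [
    {"score": 0.91, "label": "text", "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 60}},
    {"score": 0.20, "label": "image", "box": {"xmin": 0, "ymin": 0, "xmax": 50, "ymax": 50}},
]
records = to_records(sample)
```

With a real pipeline object, the same helper could be applied to the result of calling `pipe` on a screenshot file or a PIL image.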
Metrics Loss Map
Per Class Metrics
Quick Start with Transformers
pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128
pip install torchvision==0.23.0 transformers==5.9.0 accelerate gradio==6.19.0
import gradio as gr
import torch
from PIL import Image, ImageDraw
from transformers import AutoImageProcessor, RfDetrForObjectDetection

# Load model and processor
model_name = "prithivMLmods/rf-detr-mobile-gui-detection"
processor = AutoImageProcessor.from_pretrained(model_name)
model = RfDetrForObjectDetection.from_pretrained(model_name)

# Detection threshold
THRESHOLD = 0.35


def detect_gui(image):
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width)
    target_sizes = torch.tensor([image.size[::-1]])
    results = processor.post_process_object_detection(
        outputs,
        target_sizes=target_sizes,
        threshold=THRESHOLD,
    )[0]

    draw = ImageDraw.Draw(image)
    detections = []

    for score, label, box in zip(
        results["scores"],
        results["labels"],
        results["boxes"],
    ):
        box = [round(x, 2) for x in box.tolist()]
        label_name = model.config.id2label[label.item()]
        confidence = round(score.item(), 3)

        # Draw bounding box
        draw.rectangle(box, outline="red", width=3)

        # Draw label
        draw.text(
            (box[0] + 4, max(0, box[1] - 16)),
            f"{label_name} {confidence:.2f}",
            fill="red",
        )

        detections.append(
            {
                "Label": label_name,
                "Confidence": confidence,
                "Bounding Box": box,
            }
        )

    return image, detections


demo = gr.Interface(
    fn=detect_gui,
    inputs=gr.Image(type="numpy", label="Upload Mobile UI Screenshot"),
    outputs=[
        gr.Image(type="pil", label="Detected GUI Elements"),
        gr.JSON(label="Detections"),
    ],
    title="RF-DETR Mobile GUI Detection",
    description="Upload a mobile UI screenshot to detect GUI elements using RF-DETR.",
)

if __name__ == "__main__":
    demo.launch()
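The `detections` list returned by `detect_gui` holds one dict per element, with `Label`, `Confidence`, and `Bounding Box` keys. As a minimal sketch of how that list might be summarized per class, the helper below counts detections, tracks the best confidence, and sums box areas. `summarize_detections` is a hypothetical name, and the sample values are invented for illustration rather than taken from the model.

```python
from collections import defaultdict


def summarize_detections(detections):
    """Group detections by label: count, highest confidence, total box area in px^2."""
    summary = defaultdict(
        lambda: {"count": 0, "max_confidence": 0.0, "total_area": 0.0}
    )
    for det in detections:
        x0, y0, x1, y1 = det["Bounding Box"]
        entry = summary[det["Label"]]
        entry["count"] += 1
        entry["max_confidence"] = max(entry["max_confidence"], det["Confidence"])
        # Clamp to zero so a degenerate box never contributes negative area
        entry["total_area"] += max(0.0, x1 - x0) * max(0.0, y1 - y0)
    return dict(summary)


# Invented sample in the same shape as the `detections` list above
sample_detections = [
    {"Label": "text", "Confidence": 0.9, "Bounding Box": [0.0, 0.0, 10.0, 10.0]},
    {"Label": "text", "Confidence": 0.5, "Bounding Box": [10.0, 10.0, 30.0, 20.0]},
    {"Label": "image", "Confidence": 0.7, "Bounding Box": [0.0, 0.0, 100.0, 50.0]},
]
summary = summarize_detections(sample_detections)
```

A summary like this can be returned alongside the raw list, for example as a second `gr.JSON` output, if a per-class overview is more useful than individual boxes.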
e.g., demo screenshot
Acknowledgements
Roboflow/rf-detr-medium: RF-DETR is an end-to-end object detection model that combines ideas from LW-DETR and Deformable DETR. It uses a DINOv2-with-registers-style ViT backbone (with an RF-DETR windowing pattern for efficient attention), a multi-scale projector between the encoder and decoder, and a multi-scale Deformable DETR decoder for fast convergence and a strong accuracy-latency tradeoff.
Mobile UI Design Detection [dataset] by mrtoy: This dataset is designed for object detection tasks that focus on elements in mobile UI designs. The target objects include text, images, and groups. The dataset contains mobile UI images annotated with bounding boxes, class labels, and localization information.
Model tree for prithivMLmods/rf-detr-mobile-gui-detection
Base model
Roboflow/rf-detr-medium

