YOLO11m Widget Detector
YOLO11m Widget Detector is a 20.1 million parameter object detector trained on the dataset from the paper CommonForms: A Large, Diverse Dataset for Form Field Detection. The model detects form widgets across three classes: text boxes (text_input), choice buttons (choice_button, i.e. checkboxes), and signature fields (signature).
Live Demo
Try it directly in your browser, no installation needed:
Open Live Demo on Hugging Face Spaces
Results
| Model | Text AP@50 | Choice AP@50 | Signature AP@50 | mAP@50 (higher is better) |
|---|---|---|---|---|
| YOLO11m (1024px) | 81.4 | 70.9 | 83.8 | 78.7 |
Installation
Install the psynx-widget-detector package with either uv or pip. With uv:

```
uv pip install psynx-widget-detector
```

With pip:

```
pip install psynx-widget-detector
```
Once installed, you can run inference on most PDFs.
Python API
The simplest usage will run inference using the default suggested settings. The model weights will automatically download from Hugging Face on your first run.
```python
from widget_detector import WidgetDetector

# Initialize the detector
# (downloads PSynx/widget-detector-yolo automatically)
detector = WidgetDetector(
    conf=0.25,    # confidence threshold
    iou=0.45,     # NMS IoU threshold
    imgsz=1024,   # inference resolution
    device="cpu"  # "cuda" for GPU, "cpu" for CPU
)

# Process a PDF or image
result = detector.detect_path("input.pdf")

# Print results
for page in result.pages:
    print(f"Page {page.page}: found {len(page.widgets)} widgets")
    for w in page.widgets:
        print(f"  - {w.class_name} ({w.confidence:.2f})")

# Save output to JSON
result.save("output.json")
```
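The saved JSON can be post-processed without the detector installed. The sketch below filters widgets by class and confidence from a plain result dict; the dict layout is an assumption that mirrors the attributes used in the API example above (pages, page, widgets, class_name, confidence), not the package's documented schema.

```python
# Assumed sketch of the saved result layout, mirroring the
# attributes printed in the Python API example above.
sample_result = {
    "pages": [
        {
            "page": 1,
            "widgets": [
                {"class_name": "text_input", "confidence": 0.91},
                {"class_name": "choice_button", "confidence": 0.34},
                {"class_name": "signature", "confidence": 0.88},
            ],
        }
    ]
}

def filter_widgets(result, class_name=None, min_conf=0.5):
    """Return (page_number, widget) pairs matching the filters."""
    hits = []
    for page in result["pages"]:
        for widget in page["widgets"]:
            if class_name is not None and widget["class_name"] != class_name:
                continue
            if widget["confidence"] < min_conf:
                continue
            hits.append((page["page"], widget))
    return hits

# Keep only confident signature detections
print(filter_widgets(sample_result, class_name="signature"))
```

A helper like this is useful for routing pages to downstream steps, e.g. sending only pages with signature fields to an e-signing workflow.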
Example Output
Here is an example of the model's output on a sample document:
References
CommonForms: A Large, Diverse Dataset for Form Field Detection
