---
language:
- en
license: mit
library_name: ultralytics
tags:
- yolo11
- object-detection
- document-ai
- form-understanding
- vision
pipeline_tag: object-detection
---
## YOLO11m Widget Detector
YOLO11m Widget Detector is a lightweight, high-performance document widget detector designed for scanned forms and PDFs.
The model detects three common form widget types:
- text_input
- choice_button
- signature
It is optimized for:
- scanned forms
- enterprise PDFs
- OCR pipelines
- intelligent document processing (IDP)
- form digitization workflows
The detector supports both CPU and GPU inference and can process PDFs or images directly.
## Features
- Detects form fields from scanned PDFs and images
- Supports text boxes, checkboxes/radio buttons, and signatures
- Works directly on PDFs
- Optimized for document layouts
- JSON export support
- CPU and GPU compatible
- Hugging Face auto-download support
## Results
| Model | Text | Choice | Signature | mAP@50 (↑) |
|---|---|---|---|---|
| YOLO11m v3 (1024px) | 81.4 | 70.9 | 83.8 | 78.7 |
| **YOLO11m v4 (1024px)** | **83.9** | **72.1** | **86.6** | **80.9** |
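The mAP@50 column is consistent with an unweighted mean of the three per-class scores. The card does not state how the column was computed, so the snippet below is only a sanity check on the reported numbers, not a description of the evaluation code. The dictionary name `per_class_scores` is illustrative.

```python
# Per-class scores copied from the results table above.
per_class_scores = {
    "YOLO11m v3 (1024px)": {"text": 81.4, "choice": 70.9, "signature": 83.8},
    "YOLO11m v4 (1024px)": {"text": 83.9, "choice": 72.1, "signature": 86.6},
}


def mean_score(scores: dict) -> float:
    """Unweighted mean over classes, rounded to one decimal like the table."""
    return round(sum(scores.values()) / len(scores), 1)


for model_name, scores in per_class_scores.items():
    print(f"{model_name}: mean of class scores = {mean_score(scores)}")
```

Both rows reproduce the value in the last column, which suggests each class contributes equally to the headline figure regardless of how often it appears in the evaluation set.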
## Installation
Install the `psynx-widget-detector` package with either `uv` or `pip`, whichever package manager you prefer. The `uv` command:
```bash
uv pip install psynx-widget-detector
```
The `pip` command:
```bash
pip install psynx-widget-detector
```
Once it is installed, you should be able to run inference on most PDFs.
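A quick post-install check can confirm that the package is importable and suggest a value for the `device` argument. This is a minimal sketch using only the standard library plus an optional PyTorch probe. The helpers `is_installed` and `pick_device` are hypothetical and not part of the package, and the import name `widget_detector` is taken from the usage example in this card.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if not is_installed("torch"):
        return "cpu"
    import torch  # imported lazily so the check also works without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


print("widget_detector importable:", is_installed("widget_detector"))
print("suggested device:", pick_device())
```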
## Python API
The simplest usage will run inference using the default suggested settings. The model weights will automatically download from Hugging Face on your first run.
```python
from widget_detector import WidgetDetector

# Initialize the detector
# (Downloads PSynx/widget-detector-yolo automatically)
detector = WidgetDetector(
    conf=0.25,     # Confidence threshold
    iou=0.45,      # NMS IoU threshold
    imgsz=1024,    # Inference resolution
    device="cpu",  # "cuda" for GPU, "cpu" for CPU
)

# Process a PDF or Image
result = detector.detect_path("input.pdf")

# Print results
for page in result.pages:
    print(f"Page {page.page}: Found {len(page.widgets)} widgets")
    for w in page.widgets:
        print(f"  - {w.class_name} ({w.confidence:.2f})")

# Save output to JSON
result.save("output.json")
```
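To give some intuition for the `conf` and `iou` settings, here is a minimal, dependency-free sketch of confidence filtering followed by greedy non-maximum suppression (NMS). It is not the detector's actual implementation: the `Box` class and `filter_and_suppress` function are hypothetical stand-ins, and the sketch ignores class labels, which a real pipeline may take into account.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for a detection: corner coordinates plus a score."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_suppress(boxes, conf=0.25, iou_thresh=0.45):
    """Drop low-confidence boxes, then greedily suppress overlapping ones."""
    candidates = sorted(
        (b for b in boxes if b.confidence >= conf),
        key=lambda b: b.confidence,
        reverse=True,
    )
    kept = []
    for box in candidates:
        # Keep a box only if it does not overlap too much with a stronger one.
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept


boxes = [
    Box(10, 10, 110, 40, 0.90),     # strong detection
    Box(12, 11, 112, 41, 0.60),     # near-duplicate of the first box
    Box(10, 200, 110, 230, 0.80),   # separate field further down the page
    Box(300, 300, 320, 320, 0.10),  # below the confidence threshold
]
kept = filter_and_suppress(boxes)
print([b.confidence for b in kept])
```

Raising `conf` trades recall for fewer false positives, while lowering `iou` makes suppression more aggressive, which can merge tightly packed fields such as adjacent checkboxes.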
## Example Input and Output
Here is an example of a document before and after widget detection:
**Input Document:**

**Detection Output:**

## References
*CommonForms: A Large, Diverse Dataset for Form Field Detection*