---
language:
- en
license: mit
library_name: ultralytics
tags:
- yolo11
- object-detection
- document-ai
- form-understanding
- vision
pipeline_tag: object-detection
---
# YOLO11m Widget Detector

YOLO11m Widget Detector is a 20.1-million-parameter object detector trained on the dataset from the paper *CommonForms: A Large, Diverse Dataset for Form Field Detection*. It detects three classes of form widgets: text boxes (`text_input`), choice buttons such as checkboxes (`choice_button`), and signature fields (`signature`).
## Results

| Model | Text | Choice | Signature | mAP@50 (↑) |
|---|---|---|---|---|
| YOLO11m v3 (1024px) | 81.4 | 70.9 | 83.8 | 78.7 |
| **YOLO11m v4 (1024px)** | **83.9** | **72.1** | **86.6** | **80.9** |
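
As a sanity check on the table, the mAP@50 column appears to be the unweighted mean of the three per-class columns. The short sketch below assumes those columns are per-class AP@50 values (the table does not label them explicitly) and recomputes the mean from the reported numbers.

```python
# Per-class scores copied from the results table.
# Assumption: each value is a per-class AP@50.
per_class_scores = {
    "YOLO11m v3 (1024px)": {"text": 81.4, "choice": 70.9, "signature": 83.8},
    "YOLO11m v4 (1024px)": {"text": 83.9, "choice": 72.1, "signature": 86.6},
}


def mean_ap(scores: dict[str, float]) -> float:
    """Unweighted mean over classes, rounded to one decimal like the table."""
    return round(sum(scores.values()) / len(scores), 1)


recomputed = {model: mean_ap(scores) for model, scores in per_class_scores.items()}

for model, value in recomputed.items():
    print(f"{model}: mAP@50 = {value}")
```

Both recomputed values match the reported mAP@50 column, which is consistent with that reading of the table.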
## Installation

Install the `psynx-widget-detector` package with either `uv` or `pip`, whichever package manager you prefer. With `uv`:

```bash
uv pip install psynx-widget-detector
```

With `pip`:

```bash
pip install psynx-widget-detector
```

Once the package is installed, you should be able to run inference on nearly any PDF.
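
To confirm the install worked before running inference, a quick check like the following can help. It is a minimal sketch that assumes only the import name `widget_detector`, as used in this card's Python API example; `is_installed` is a hypothetical helper, not part of the package.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("widget_detector"):
    print("widget_detector is available")
else:
    print("widget_detector not found; try reinstalling psynx-widget-detector")
```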
## Python API

The simplest usage runs inference with the suggested default settings. The model weights are downloaded automatically from Hugging Face on the first run.

```python
from widget_detector import WidgetDetector

# Initialize the detector
# (downloads PSynx/widget-detector-yolo automatically)
detector = WidgetDetector(
    conf=0.25,     # Confidence threshold
    iou=0.45,      # NMS IoU threshold
    imgsz=1024,    # Inference resolution
    device="cpu",  # "cuda" for GPU, "cpu" for CPU
)

# Process a PDF or image
result = detector.detect_path("input.pdf")

# Print results
for page in result.pages:
    print(f"Page {page.page}: Found {len(page.widgets)} widgets")
    for w in page.widgets:
        print(f" - {w.class_name} ({w.confidence:.2f})")

# Save output to JSON
result.save("output.json")
```
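
Once detection has run, a common next step is to summarize what was found. The sketch below counts widgets per class above a confidence cutoff. It relies only on the attributes used in the example above (`pages`, `widgets`, `class_name`, `confidence`); the `Widget` and `Page` dataclasses are hypothetical stand-ins so the snippet runs without the package, and `count_by_class` is an illustrative helper rather than part of the library.

```python
from collections import Counter
from dataclasses import dataclass, field


# Hypothetical stand-ins mirroring only the attributes shown in the example above.
@dataclass
class Widget:
    class_name: str
    confidence: float


@dataclass
class Page:
    page: int
    widgets: list[Widget] = field(default_factory=list)


def count_by_class(pages, min_confidence: float = 0.5) -> Counter:
    """Count widgets per class name, keeping only confident detections."""
    counts = Counter()
    for page in pages:
        for w in page.widgets:
            if w.confidence >= min_confidence:
                counts[w.class_name] += 1
    return counts


# Made-up detections for illustration only.
sample_pages = [
    Page(page=1, widgets=[Widget("text_input", 0.91), Widget("choice_button", 0.42)]),
    Page(page=2, widgets=[Widget("text_input", 0.77), Widget("signature", 0.88)]),
]

summary = count_by_class(sample_pages)
print(dict(summary))
```

With a real result, `count_by_class(result.pages)` should behave the same way, provided the returned objects expose those attributes.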
## Example Output

Here is an example of the model's output on a sample document:

![Example Output](./output.jpg)

## References

*CommonForms: A Large, Diverse Dataset for Form Field Detection*