How to use google/pix2struct-widget-captioning-large with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("visual-question-answering", model="google/pix2struct-widget-captioning-large")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForImageTextToText

    processor = AutoProcessor.from_pretrained("google/pix2struct-widget-captioning-large")
    model = AutoModelForImageTextToText.from_pretrained("google/pix2struct-widget-captioning-large")
Upload processor
preprocessor_config.json CHANGED (+6 -4)

@@ -2,9 +2,11 @@
   "do_convert_rgb": true,
   "do_normalize": true,
   "image_processor_type": "Pix2StructImageProcessor",
-  "
-
-
-
+  "is_vqa": true,
+  "max_patches": 4096,
+  "patch_size": {
+    "height": 16,
+    "width": 16
+  },
   "processor_class": "Pix2StructProcessor"
 }
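As a rough sanity check on the post-commit values, the sketch below parses the configuration as it reads after this change and derives the per-patch and per-image sizes implied by `patch_size` and `max_patches`. Several things here are assumptions rather than part of the commit: line 1 of the file lies outside the hunk and is taken to be the opening brace, the helper name `patch_budget` is hypothetical, and three colour channels are assumed because `do_convert_rgb` is true.

```python
import json

# Post-commit contents reconstructed from the context and "+" lines of the diff.
# Assumption: line 1 of the file (not shown in the hunk) is the opening brace.
CONFIG_TEXT = """
{
  "do_convert_rgb": true,
  "do_normalize": true,
  "image_processor_type": "Pix2StructImageProcessor",
  "is_vqa": true,
  "max_patches": 4096,
  "patch_size": {
    "height": 16,
    "width": 16
  },
  "processor_class": "Pix2StructProcessor"
}
"""


def patch_budget(config: dict, channels: int = 3) -> dict:
    """Hypothetical helper: derive sizes implied by patch_size and max_patches.

    `channels=3` is an assumption based on "do_convert_rgb": true.
    """
    height = config["patch_size"]["height"]
    width = config["patch_size"]["width"]
    values_per_patch = height * width * channels
    return {
        "pixels_per_patch": height * width,
        "values_per_patch": values_per_patch,
        "max_values_per_image": values_per_patch * config["max_patches"],
    }


config = json.loads(CONFIG_TEXT)
budget = patch_budget(config)

# Report the three keys this commit adds, then the derived sizes.
for key in ("is_vqa", "max_patches", "patch_size"):
    print(f"{key}: {config[key]}")
for name, value in budget.items():
    print(f"{name}: {value}")
```

The figures count raw pixel values only; any extra per-patch bookkeeping the image processor may attach (such as position indices) is deliberately left out of this estimate.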