Sagemaker deployment template.
#1
by fprolog - opened
Is the SageMaker snippet ready for deployment? If so, what input format does the serializer expect?
I would also like an example of the predictor's input
In case you still need it, I got it working by using AWS's PyTorch serving libraries directly: https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#deploy-pytorch-models
Thanks. I am familiar with the SageMaker model pipeline. My problem comes from the image + text nature of this model. When I invoke "response = predictor.predict(data)", I would like an example of the data object containing an image and a prompt for the model.
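One possible shape for such a data object, as a minimal sketch only: the thread does not confirm any payload format, so the "image" and "question" keys and the build_payload helper below are hypothetical. The sketch assumes the endpoint runs a custom inference script whose input_fn decodes the same JSON structure, and that the image travels as a base64 string so it fits in a JSON body.

```python
import base64
import json


def build_payload(image_bytes: bytes, question: str) -> dict:
    """Client side: pack an image and a prompt into a JSON-safe dict.

    The "image"/"question" schema is an assumption, not a documented format.
    """
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "question": question,
    }


def input_fn(request_body: str, content_type: str = "application/json") -> dict:
    """Server side: decode the same hypothetical schema in the inference script."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    body = json.loads(request_body)
    return {
        "image_bytes": base64.b64decode(body["image"]),
        "question": body["question"],
    }


# Placeholder bytes stand in for the contents of a real document image file.
data = build_payload(b"placeholder image bytes", "What is the invoice total?")

# With a deployed endpoint, the call would then look roughly like this
# (left commented out because it needs the sagemaker SDK and a live endpoint):
# from sagemaker.serializers import JSONSerializer
# predictor.serializer = JSONSerializer()
# response = predictor.predict(data)
```

The decoded image bytes and the question would still need to go through the model's processor inside the inference script before generation; that part depends on how the endpoint was packaged.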
fprolog changed discussion status to closed