How to use UiPath/pix2struct-vision-base with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("UiPath/pix2struct-vision-base")
model = AutoModel.from_pretrained("UiPath/pix2struct-vision-base")