How to use openbmb/MiniCPM-V-2 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("visual-question-answering", model="openbmb/MiniCPM-V-2", trust_remote_code=True)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("openbmb/MiniCPM-V-2", trust_remote_code=True, dtype="auto")
I was wondering if it's possible to get the model to return real coordinates, for example, the position of a button or the location of a window?
I appreciate your help.