How to create a generation for a multi-page document using a multi-page in-context example

#7
by shresht8 - opened

In one of my use cases, the input is a multi-page document from which I need to extract a schema, and the in-context example I want to use is also multi-page. How can I send such a request to my vLLM endpoint using the OpenAI client? The examples demonstrated on the model card use a single-page in-context example and a single-page test example. If we were to modify this example to a two-page in-context example and two-page test data, how would we do it?

import base64
import json

from openai import OpenAI

# Assumes a local vLLM server running with the OpenAI-compatible API
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def encode_image(image_path):
    """
    Encode the image file to a base64 string.
    """
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

base64_image = encode_image("0.jpg")
base64_image2 = encode_image("1.jpg")

chat_response = client.chat.completions.create(
    model="numind/NuExtract-2.0-8B",
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}},   # first ICL example image
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image2}"}},  # real input image
            ],
        },
    ],
    extra_body={
        "chat_template_kwargs": {
            "template": json.dumps(json.loads("""{"store": "verbatim-string"}"""), indent=4),
            "examples": [
                {
                    "input": "",
                    "output": """{"store": "Walmart"}"""
                }
            ]
        },
    }
)
print("Chat response:", chat_response)
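For what it's worth, the most direct extension of the snippet above would be to append all four images to the same content list in order: the two ICL example pages first, then the two real input pages. Whether the NuExtract-2.0 chat template actually associates the first two images with the ICL example and the last two with the real input this way is an assumption that would need to be verified against the template; the sketch below only builds the request payload (no server call) under that assumption, with placeholder base64 strings standing in for real encoded pages:

```python
import json

# Hypothetical placeholders; real code would use encode_image(...) on four page files.
icl_page1, icl_page2 = "<b64-icl-page-1>", "<b64-icl-page-2>"
input_page1, input_page2 = "<b64-input-page-1>", "<b64-input-page-2>"

def image_part(b64):
    """Wrap a base64 JPEG string in the OpenAI image_url content format."""
    return {"type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}

messages = [{
    "role": "user",
    "content": [
        image_part(icl_page1),    # ICL example, page 1
        image_part(icl_page2),    # ICL example, page 2
        image_part(input_page1),  # real input, page 1
        image_part(input_page2),  # real input, page 2
    ],
}]

extra_body = {
    "chat_template_kwargs": {
        "template": json.dumps({"store": "verbatim-string"}, indent=4),
        # Assumption: one example entry still covers both ICL pages.
        "examples": [{"input": "", "output": '{"store": "Walmart"}'}],
    },
}

print(len(messages[0]["content"]))  # number of image parts in the request
```

These `messages` and `extra_body` dicts would then be passed to `client.chat.completions.create(...)` exactly as in the original snippet.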
