Inference API Body Structure
Hello,
I am trying to pass the body in this format to the Inference API:
{
  "inputs": "[INST]{{Prompt}}.\n{{Information}}\nQuestion:{{Question}}\nAssistant:\n[/INST]",
  "parameters": {
    "return_full_text": false,
    "temperature": 0.9,
    "max_new_tokens": 2048,
    "top_p": 0.9,
    "do_sample": true
  }
}
but I am getting the error below:
Failed to deserialize the JSON body into the target type: missing field `messages` at line 13 column 1
The model card gives the request format for the Inference API as:
{
"model": "Qwen/Qwen2.5-72B-Instruct",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
],
"max_tokens": 500,
"stream": false
}
Is there a way, or a change I can make, so that I can provide my input in the "inputs" key instead of defining the role and content?
Thanks
You could theoretically do this using the endpoint https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct, but DO NOT do this in 99.9% of cases. The chat template you used in the example you gave is not the chat template used by Qwen2.5. Using an instruction-tuned LLM without the correct chat template, even if the error is just a single token, can massively affect its performance. I see no reason not to use the Chat Completions API, which will apply the correct chat template.
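To make the suggestion concrete, here is a minimal sketch of moving the original "inputs" template over to the Chat Completions route, which applies Qwen2.5's own chat template server-side. The `build_payload` helper and the split of the template into a system and a user message are assumptions about how to map the original [INST] prompt onto chat messages, not something the model card prescribes; the endpoint path is the OpenAI-compatible chat-completions route exposed by the Inference API.

```python
import json
import os
import urllib.request

# OpenAI-compatible chat-completions route of the Serverless Inference API.
API_URL = ("https://api-inference.huggingface.co/models/"
           "Qwen/Qwen2.5-72B-Instruct/v1/chat/completions")

def build_payload(prompt, information, question, max_tokens=2048):
    # Hypothetical mapping of the original [INST] template onto messages:
    # the instruction becomes the system message, the context and question
    # become the user message. Sampling parameters mirror the original body.
    return {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": f"{information}\nQuestion: {question}"},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.9,
        "top_p": 0.9,
        "stream": False,
    }

payload = build_payload(
    "You are a helpful assistant.",
    "Paris is in France.",
    "What is the capital of France?",
)

# Only send the request when a token is available.
token = os.environ.get("HF_TOKEN")
if token:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The server then renders the messages with the correct Qwen2.5 chat template, so no special tokens need to be written by hand.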
Okay, Thanks @OptimusePrime
Is there any other suggestion you can give so that it doesn't affect the model's performance?
Thanks
For those asking about API access — I've been using Crazyrouter as a unified gateway. One API key, OpenAI SDK compatible. Works well for testing different models without managing multiple accounts.
If you are looking for a simpler way to call Qwen models via an API, you can use any OpenAI-compatible gateway. For example, with Crazyrouter:
from openai import OpenAI

client = OpenAI(base_url="https://crazyrouter.com/v1", api_key="your-key")
response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Hello!"}],
)
Standard OpenAI request body, no special formatting needed. Works with all Qwen models.