Inference API Body Structure
Hello,
I am trying to pass the body in this format to the Inference API:
{
  "inputs": "[INST]{{Prompt}}.\n{{Information}}\nQuestion:{{Question}}\nAssistant:\n[/INST]",
  "parameters": {
    "return_full_text": false,
    "temperature": 0.9,
    "max_new_tokens": 2048,
    "top_p": 0.9,
    "do_sample": true
  }
}
but I am getting the error below:
Failed to deserialize the JSON body into the target type: missing field `messages` at line 13 column 1
The model card gives the request format for the Inference API as:
{
"model": "Qwen/Qwen2.5-72B-Instruct",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
],
"max_tokens": 500,
"stream": false
}
Is there a way, or a change I can make, so that I can provide my input in the "inputs" key instead of defining the role and content?
Thanks
You could theoretically do this using the endpoint https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct, but DO NOT do this in 99.9% of cases. The chat template you used in the example you gave is not the chat template used by Qwen2.5. Using an instruction-tuned LLM without the correct chat template, even if the error is just a single token, can massively affect its performance. I see no reason not to use the Chat Completions API, which will apply the correct chat template.
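To make the suggestion concrete, here is a minimal sketch of moving the original "inputs" template over to the Chat Completions route, which applies Qwen2.5's own chat template server-side. The `build_payload` helper and the split of the template into a system and a user message are assumptions about how to map the original [INST] prompt onto chat messages, not something the model card prescribes; the endpoint path is the OpenAI-compatible chat-completions route exposed by the Inference API.

```python
import json
import os
import urllib.request

# OpenAI-compatible chat-completions route of the Serverless Inference API.
API_URL = ("https://api-inference.huggingface.co/models/"
           "Qwen/Qwen2.5-72B-Instruct/v1/chat/completions")

def build_payload(prompt, information, question, max_tokens=2048):
    # Hypothetical mapping of the original [INST] template onto messages:
    # the instruction becomes the system message, the context and question
    # become the user message. Sampling parameters mirror the original body.
    return {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": f"{information}\nQuestion: {question}"},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.9,
        "top_p": 0.9,
        "stream": False,
    }

payload = build_payload(
    "You are a helpful assistant.",
    "Paris is in France.",
    "What is the capital of France?",
)

# Only send the request when a token is available.
token = os.environ.get("HF_TOKEN")
if token:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The server then renders the messages with the correct Qwen2.5 chat template, so no special tokens need to be written by hand.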
Okay, Thanks @OptimusePrime
Is there any other suggestion you can give so that it doesn't affect the model's performance?
Thanks
For those asking about API access — I've been using Crazyrouter as a unified gateway. One API key, OpenAI SDK compatible. Works well for testing different models without managing multiple accounts.
If you are looking for a simpler way to call Qwen models via an API, you can use any OpenAI-compatible gateway. For example, with Crazyrouter:
from openai import OpenAI

client = OpenAI(base_url="https://crazyrouter.com/v1", api_key="your-key")
response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Hello!"}],
)
Standard OpenAI request body, no special formatting needed. Works with all Qwen models.