Commit · 9566d4f
1 Parent(s): 6393558

Add structured outputs comparison: vLLM vs PydanticAI

Documents the incompatibility:
- vLLM uses extra_body.structured_outputs (JSON in content)
- PydanticAI uses tools + tool_choice (JSON in tool_calls)
Explains why PydanticAI works with HF Space but not vLLM.
docs/STRUCTURED_OUTPUTS_COMPARISON.md
ADDED
@@ -0,0 +1,132 @@
# Structured Outputs: vLLM vs PydanticAI Comparison

## Overview

This document compares how vLLM and PydanticAI handle structured outputs, and why they may not be fully compatible.

## vLLM Structured Outputs

### Method
vLLM takes its schema through the **`extra_body`** parameter, under a `structured_outputs` key (NOT the standard OpenAI `response_format`):

```python
# `client` is an OpenAI-compatible client pointed at the vLLM server.
completion = client.chat.completions.create(
    model="DragonLLM/Qwen-Open-Finance-R-8B",
    messages=[{"role": "user", "content": "Generate JSON..."}],
    extra_body={
        "structured_outputs": {
            "json": json_schema  # Pydantic model.model_json_schema()
        }
    },
)
```

### Supported Formats
1. **JSON Schema**: `{"json": json_schema}`
2. **Regex**: `{"regex": r"pattern"}`
3. **Choice**: `{"choice": ["option1", "option2"]}`
4. **Grammar**: `{"grammar": "CFG definition"}`
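
As a rough illustration of the four formats above, the sketch below builds the `extra_body` dictionary for a given constraint kind. The helper name `structured_outputs_body` and the `ALLOWED_KINDS` tuple are hypothetical and exist only for this example; the key names come from the list above.

```python
# Constraint kinds listed in this document; anything else is rejected.
ALLOWED_KINDS = ("json", "regex", "choice", "grammar")


def structured_outputs_body(kind: str, constraint) -> dict:
    """Build the extra_body payload for one structured-output constraint.

    Hypothetical helper: it only assembles the dictionary and performs no
    validation of the constraint itself.
    """
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported structured output kind: {kind}")
    return {"structured_outputs": {kind: constraint}}


# Example: restrict the model to one of two labels.
choice_body = structured_outputs_body("choice", ["option1", "option2"])
```

The resulting dictionary would be passed as `extra_body=choice_body` in the `client.chat.completions.create(...)` call shown in the Method section.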

### Response Format
- Returns the JSON as a string directly in `message.content`
- No tool calls are involved

## PydanticAI Structured Outputs

### Method
PydanticAI uses **tool calling** with `tool_choice="required"`:

```python
agent = Agent(model, system_prompt="...")
result = await agent.run(prompt, output_type=Portfolio)
```

### How It Works
1. PydanticAI converts `output_type` (a Pydantic model) to a tool definition
2. Sends a request with:
   - `tools`: [function definition matching the schema]
   - `tool_choice`: `"required"` (forces a tool call)
3. Expects a response with a `tool_calls` array
4. Extracts JSON from `tool_calls[0].function.arguments`
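
The four steps above can be sketched at the wire level with plain dictionaries. This is not PydanticAI's actual implementation: the schema, the tool name `final_result`, and the sample response are assumptions made for illustration.

```python
import json

# Hypothetical schema standing in for Portfolio.model_json_schema().
portfolio_schema = {
    "type": "object",
    "properties": {"ticker": {"type": "string"}, "weight": {"type": "number"}},
    "required": ["ticker", "weight"],
}


def build_tool_request(schema: dict, tool_name: str = "final_result") -> dict:
    """Steps 1-2: wrap the schema in a function tool and force a tool call."""
    tool = {"type": "function", "function": {"name": tool_name, "parameters": schema}}
    return {"tools": [tool], "tool_choice": "required"}


def extract_tool_arguments(response: dict) -> dict:
    """Steps 3-4: decode the JSON string in tool_calls[0].function.arguments."""
    message = response["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        raise ValueError("response contains no tool_calls")
    return json.loads(tool_calls[0]["function"]["arguments"])


# Hypothetical server response in the tool-call shape.
sample_response = {
    "choices": [{
        "message": {
            "tool_calls": [{
                "function": {
                    "name": "final_result",
                    "arguments": '{"ticker": "ACME", "weight": 0.25}',
                }
            }]
        }
    }]
}

request_params = build_tool_request(portfolio_schema)
portfolio = extract_tool_arguments(sample_response)
```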

### Expected Response Format
```json
{
  "choices": [{
    "message": {
      "tool_calls": [{
        "function": {
          "name": "...",
          "arguments": "{\"field\": \"value\"}"  // JSON string
        }
      }]
    }
  }]
}
```

## Compatibility Issue

### Problem
- **vLLM**: Uses `extra_body.structured_outputs` → Returns JSON in `message.content`
- **PydanticAI**: Uses `tools` + `tool_choice="required"` → Expects JSON in `tool_calls[].function.arguments`

### Current Status
- ✅ **HF Space**: Works because it implements tool calling support
- ❌ **vLLM**: Fails because vLLM's structured outputs return JSON in `content`, not `tool_calls`

## Solutions

### Option 1: Use vLLM's `extra_body` (Recommended)
Modify PydanticAI's OpenAI provider to detect vLLM and use `extra_body` instead of tools:

```python
# In PydanticAI OpenAI provider
if output_type:
    json_schema = output_type.model_json_schema()
    # Use vLLM structured_outputs instead of tools
    extra_body = {
        "structured_outputs": {"json": json_schema}
    }
```

### Option 2: Add Tool Call Support to vLLM Response
When vLLM receives `tools` + `tool_choice="required"`, wrap the structured output in a tool call format.
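
One possible shape for that wrapping step is sketched below. It assumes the response is available as a plain dictionary and that the caller knows which tool name the client requested; the function name `wrap_content_as_tool_call` and the generated call id format are hypothetical.

```python
import copy
import json
import uuid


def wrap_content_as_tool_call(response: dict, tool_name: str) -> dict:
    """Return a copy of the response with content JSON moved into tool_calls."""
    wrapped = copy.deepcopy(response)
    message = wrapped["choices"][0]["message"]
    content = message.get("content") or ""
    json.loads(content)  # raises ValueError if the content is not valid JSON
    message["tool_calls"] = [{
        "id": f"call_{uuid.uuid4().hex}",  # hypothetical id format
        "type": "function",
        "function": {"name": tool_name, "arguments": content},
    }]
    message["content"] = None
    wrapped["choices"][0]["finish_reason"] = "tool_calls"
    return wrapped


# Hypothetical vLLM-style response with JSON in message.content.
content_response = {
    "choices": [{
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": '{"field": "value"}'},
    }]
}
tool_response = wrap_content_as_tool_call(content_response, "final_result")
```

The input is deep-copied so the original response stays available for logging or fallback handling.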

### Option 3: Use `response_format` (Limited)
Standard OpenAI `response_format={"type": "json_object"}` works but:
- Only enforces valid JSON, not conformance to a schema
- PydanticAI would need to parse and validate manually
- Less reliable than schema-based approaches
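
To make the "parse and validate manually" cost concrete, here is a minimal sketch of hand-rolled checking for JSON-mode output, using only the standard library. The `EXPECTED_FIELDS` mapping and its field names are assumptions for illustration; a real integration would more likely validate against the Pydantic model itself.

```python
import json

# Hypothetical expectations: field name -> accepted Python type(s).
EXPECTED_FIELDS = {"ticker": str, "weight": (int, float)}


def validate_json_object(content: str) -> dict:
    """Parse JSON-mode output and check it by hand against EXPECTED_FIELDS."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, accepted in EXPECTED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, accepted):
            raise ValueError(f"wrong type for field: {name}")
    return data


valid = validate_json_object('{"ticker": "ACME", "weight": 0.25}')
```

Output that is valid JSON but is missing a field or has the wrong type only fails at this stage, after generation has already finished, which is the reliability gap the bullets describe.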

## Current Implementation Status

### HF Space (Transformers)
- ✅ Supports tool calling (text-based parsing)
- ✅ Supports `response_format`
- ✅ Works with PydanticAI's tool-based approach

### vLLM
- ✅ Supports `extra_body.structured_outputs` (JSON schema)
- ❌ Does NOT support tool calling for structured outputs
- ✅ Supports `response_format` (basic JSON mode only)

## Recommendation

For full compatibility with PydanticAI, we need to:

1. **Detect vLLM endpoint** in PydanticAI provider
2. **Use `extra_body.structured_outputs`** instead of tools when using vLLM
3. **Parse `message.content`** instead of `tool_calls` for vLLM responses
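
A minimal sketch of how these three steps might fit together follows. How the provider detects vLLM is not specified here, so the example takes an explicit `is_vllm` flag supplied by the caller; that flag, both function names, and the `final_result` tool name are assumptions.

```python
import json


def build_output_params(schema: dict, is_vllm: bool, tool_name: str = "final_result") -> dict:
    """Choose structured-output request parameters based on the backend."""
    if is_vllm:
        # Step 2: hand the schema to vLLM through extra_body.structured_outputs.
        return {"extra_body": {"structured_outputs": {"json": schema}}}
    # Default path: a single function tool plus a forced tool call.
    tool = {"type": "function", "function": {"name": tool_name, "parameters": schema}}
    return {"tools": [tool], "tool_choice": "required"}


def extract_output(response: dict, is_vllm: bool) -> dict:
    """Read the structured output from wherever the backend placed it."""
    message = response["choices"][0]["message"]
    if is_vllm:
        # Step 3: vLLM structured outputs arrive as a JSON string in content.
        return json.loads(message["content"])
    return json.loads(message["tool_calls"][0]["function"]["arguments"])


schema = {"type": "object", "properties": {"field": {"type": "string"}}}
vllm_params = build_output_params(schema, is_vllm=True)
vllm_output = extract_output(
    {"choices": [{"message": {"content": '{"field": "value"}'}}]}, is_vllm=True
)
```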

Alternatively, implement a middleware in the HF Space API that:
- Detects `tools` + `tool_choice="required"` requests
- Converts to `extra_body.structured_outputs` for vLLM
- Wraps response in tool call format for PydanticAI compatibility
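
The request side of such a middleware could look roughly like the sketch below. It assumes the middleware sees the raw JSON request body, where the contents of the OpenAI client's `extra_body` appear as top-level keys, and that exactly one tool is present. The function name `convert_tool_request` is hypothetical; the returned tool name is what the response-wrapping step would need afterwards.

```python
import copy


def convert_tool_request(request: dict):
    """Rewrite a forced tool-call request into a structured_outputs request.

    Returns (rewritten_request, tool_name). tool_name is None when the request
    does not match and is passed through unchanged.
    """
    tools = request.get("tools") or []
    if request.get("tool_choice") != "required" or len(tools) != 1:
        return request, None
    rewritten = copy.deepcopy(request)
    function = rewritten.pop("tools")[0]["function"]
    rewritten.pop("tool_choice")
    # Top-level key: this is where extra_body contents land in the raw body.
    rewritten["structured_outputs"] = {"json": function.get("parameters", {})}
    return rewritten, function["name"]


# Hypothetical incoming request in the tool-call shape.
incoming = {
    "model": "DragonLLM/Qwen-Open-Finance-R-8B",
    "messages": [{"role": "user", "content": "Generate JSON..."}],
    "tools": [{
        "type": "function",
        "function": {"name": "final_result", "parameters": {"type": "object"}},
    }],
    "tool_choice": "required",
}
outgoing, tool_name = convert_tool_request(incoming)
```

Requests with several tools or a different `tool_choice` are passed through untouched in this sketch, since it is unclear from the text how they should map onto a single schema.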

## References

- [vLLM Structured Outputs Docs](https://docs.vllm.ai/en/stable/features/structured_outputs/)
- [PydanticAI Documentation](https://ai.pydantic.dev/)