jeanbaptdzd committed on
Commit 9566d4f · 1 Parent(s): 6393558

Add structured outputs comparison: vLLM vs PydanticAI

Documents the incompatibility:
- vLLM uses extra_body.structured_outputs (JSON in content)
- PydanticAI uses tools + tool_choice (JSON in tool_calls)

Explains why PydanticAI works with HF Space but not vLLM.

Files changed (1)
  1. docs/STRUCTURED_OUTPUTS_COMPARISON.md +132 -0
docs/STRUCTURED_OUTPUTS_COMPARISON.md ADDED
@@ -0,0 +1,132 @@
+ # Structured Outputs: vLLM vs PydanticAI Comparison
+
+ ## Overview
+
+ This document compares how vLLM and PydanticAI handle structured outputs, and explains why the two approaches may not be fully compatible.
+
+ ## vLLM Structured Outputs
+
+ ### Method
+ vLLM takes the constraint through the **`extra_body`** parameter, under a `structured_outputs` key (NOT the standard OpenAI `response_format`):
+
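+ ```python
+ from openai import OpenAI
+ from pydantic import BaseModel
+
+ class Portfolio(BaseModel):  # illustrative schema; field names are assumptions
+     name: str
+     total_value: float
+
+ # Assumes a vLLM OpenAI-compatible server; this base_url is a placeholder.
+ client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
+ json_schema = Portfolio.model_json_schema()
+
+ completion = client.chat.completions.create(
+     model="DragonLLM/Qwen-Open-Finance-R-8B",
+     messages=[{"role": "user", "content": "Generate JSON..."}],
+     # vLLM-specific field, merged into the request body by the OpenAI client
+     extra_body={"structured_outputs": {"json": json_schema}},
+ )
+ ```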
+
+ ### Supported Formats
+ 1. **JSON Schema**: `{"json": json_schema}`
+ 2. **Regex**: `{"regex": r"pattern"}`
+ 3. **Choice**: `{"choice": ["option1", "option2"]}`
+ 4. **Grammar**: `{"grammar": "CFG definition"}`
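
The four constraint kinds above all travel in the same `structured_outputs` wrapper. A minimal sketch of a helper that builds that payload might look like this. The name `structured_outputs_body` is hypothetical (it is not part of vLLM or the OpenAI client), and the key names are taken from the list above.

```python
from typing import Any

# Constraint kinds listed in this document.
SUPPORTED_KINDS = ("json", "regex", "choice", "grammar")


def structured_outputs_body(kind: str, value: Any) -> dict:
    """Build the `extra_body` dict for one constraint kind (hypothetical helper)."""
    if kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported structured output kind: {kind!r}")
    return {"structured_outputs": {kind: value}}


# Example payloads; the regex and the choices are made-up values.
regex_body = structured_outputs_body("regex", r"\d{4}-\d{2}-\d{2}")
choice_body = structured_outputs_body("choice", ["buy", "hold", "sell"])
```

The resulting dict would be passed as `extra_body=...` to `client.chat.completions.create`.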
+
+ ### Response Format
+ - Returns JSON string in `message.content`
+ - No tool calls involved
+ - Direct JSON in content field
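
To make the response shape concrete, here is a small sketch of reading the result on the client side. It uses a plain dict in place of the OpenAI client's response object so that it runs without a server; the field values and the helper name `parse_content_json` are assumptions for illustration.

```python
import json

# Trimmed response in the shape described above (values are made up).
vllm_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": json.dumps({"name": "Growth", "total_value": 1250.5}),
            "tool_calls": [],
        }
    }]
}


def parse_content_json(response: dict) -> dict:
    """Decode the JSON string carried in message.content."""
    message = response["choices"][0]["message"]
    return json.loads(message["content"])


portfolio = parse_content_json(vllm_response)
```

With a Pydantic model available, the decoded string could instead be handed to `Model.model_validate_json(...)` to get schema validation as well.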
+
+ ## PydanticAI Structured Outputs
+
+ ### Method
+ PydanticAI uses **tool calling** with `tool_choice="required"`:
+
+ ```python
+ from pydantic_ai import Agent
+ # Must run inside an async function; `model`, `prompt`, and `Portfolio` are defined elsewhere.
+ agent = Agent(model, system_prompt="...")
+ result = await agent.run(prompt, output_type=Portfolio)
+ ```
+
+ ### How It Works
+ 1. PydanticAI converts `output_type` (Pydantic model) to a tool definition
+ 2. Sends request with:
+    - `tools`: [function definition matching the schema]
+    - `tool_choice`: `"required"` (forces tool call)
+ 3. Expects response with `tool_calls` array
+ 4. Extracts JSON from `tool_calls[0].function.arguments`
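
Steps 1 and 2 above can be sketched as a request payload in the OpenAI chat-completions format. This is an approximation under stated assumptions, not PydanticAI's actual code: the helper `build_tool_request`, the tool name `final_result`, the description text, and the example schema are all illustrative, and the payload PydanticAI really sends may differ in detail.

```python
def build_tool_request(model: str, prompt: str, schema: dict,
                       tool_name: str = "final_result") -> dict:
    """Assemble a chat-completions payload that forces exactly one tool call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": tool_name,
                "description": "Return the final structured result.",
                "parameters": schema,  # JSON schema derived from the output model
            },
        }],
        "tool_choice": "required",  # the model must answer with a tool call
    }


# Hand-written schema standing in for Portfolio.model_json_schema().
portfolio_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "total_value": {"type": "number"}},
    "required": ["name", "total_value"],
}
request = build_tool_request(
    "DragonLLM/Qwen-Open-Finance-R-8B", "Summarize the portfolio", portfolio_schema
)
```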
+
+ ### Expected Response Format
+ The `arguments` field holds a JSON-encoded string, not a nested object:
+ ```json
+ {
+   "choices": [{
+     "message": {
+       "tool_calls": [{
+         "function": {
+           "name": "...",
+           "arguments": "{\"field\": \"value\"}"
+         }
+       }]
+     }
+   }]
+ }
+ ```
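
As a sketch of the extraction step, the function below reads the first tool call from a response of this shape and decodes its arguments. It works on plain dicts so it runs without a server; `parse_tool_arguments` and the sample values are hypothetical.

```python
import json

# Response in the tool-call shape shown above (values are made up).
tool_response = {
    "choices": [{
        "message": {
            "tool_calls": [{
                "function": {
                    "name": "final_result",
                    "arguments": json.dumps({"field": "value"}),
                }
            }]
        }
    }]
}


def parse_tool_arguments(response: dict) -> dict:
    """Decode the arguments of the first tool call, or fail if there is none."""
    message = response["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        raise ValueError("response contains no tool calls")
    return json.loads(tool_calls[0]["function"]["arguments"])


arguments = parse_tool_arguments(tool_response)
```

A response that carries its JSON only in `message.content` makes this function raise, which is the failure mode this document describes.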
+
+ ## Compatibility Issue
+
+ ### Problem
+ - **vLLM**: Uses `extra_body.structured_outputs` → Returns JSON in `message.content`
+ - **PydanticAI**: Uses `tools` + `tool_choice="required"` → Expects JSON in `tool_calls[].function.arguments`
+
+ ### Current Status
+ - ✅ **HF Space**: Works because it implements tool calling support
+ - ❌ **vLLM**: Fails because vLLM's structured outputs return JSON in `content`, not `tool_calls`
+
+ ## Solutions
+
+ ### Option 1: Use vLLM's `extra_body` (Recommended)
+ Modify PydanticAI's OpenAI provider to detect vLLM and use `extra_body` instead of tools:
+
+ ```python
+ # In PydanticAI OpenAI provider (illustrative pseudocode, not actual PydanticAI source)
+ if output_type:
+     json_schema = output_type.model_json_schema()
+     # Use vLLM structured_outputs instead of tools
+     extra_body = {
+         "structured_outputs": {"json": json_schema}
+     }
+ ```
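+
+ ### Option 2: Add Tool Call Support to vLLM Response
+ When a request to vLLM carries `tools` + `tool_choice="required"`, wrap the structured output in the tool call format, so that the JSON appears in `tool_calls[0].function.arguments` rather than in `message.content`.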
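
One way to picture the wrapping is a post-processing step on the response dict. The sketch below is an assumption about how such a step could look, not an existing vLLM feature: `wrap_content_as_tool_call`, the generated call id format, and the sample response are all hypothetical.

```python
import copy
import json
import uuid


def wrap_content_as_tool_call(response: dict, tool_name: str) -> dict:
    """Return a copy of `response` with JSON content moved into a tool call."""
    wrapped = copy.deepcopy(response)  # leave the caller's dict untouched
    for choice in wrapped["choices"]:
        message = choice["message"]
        content = message.get("content")
        if not content or message.get("tool_calls"):
            continue  # nothing to wrap, or already a tool-call response
        json.loads(content)  # raises ValueError if the content is not valid JSON
        message["tool_calls"] = [{
            "id": f"call_{uuid.uuid4().hex[:24]}",
            "type": "function",
            "function": {"name": tool_name, "arguments": content},
        }]
        message["content"] = None
        choice["finish_reason"] = "tool_calls"
    return wrapped


content_response = {
    "choices": [{
        "message": {"role": "assistant", "content": "{\"field\": \"value\"}"},
        "finish_reason": "stop",
    }]
}
wrapped_response = wrap_content_as_tool_call(content_response, "final_result")
```

The `arguments` value stays a JSON string, matching the expected response format described earlier in this document.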
+
+ ### Option 3: Use `response_format` (Limited)
+ Standard OpenAI `response_format={"type": "json_object"}` works, but:
+ - It only enforces valid JSON, not conformance to a schema
+ - PydanticAI would need to parse and validate the result manually
+ - It is less reliable than the schema-based approaches
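
The manual validation mentioned above could look roughly like the following. This is a deliberately shallow, standard-library-only sketch: it checks required keys and top-level types and nothing else. The helper name and the example schema are hypothetical; in practice a Pydantic model's `model_validate_json` would do this far more thoroughly.

```python
import json

# Mapping from JSON Schema type names to Python types (top level only).
_JSON_TYPES = {
    "string": str, "number": (int, float), "integer": int,
    "boolean": bool, "object": dict, "array": list,
}


def validate_against_schema(content: str, schema: dict) -> dict:
    """Parse `content` and run a shallow check against `schema`."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    for key, spec in schema.get("properties", {}).items():
        expected = _JSON_TYPES.get(spec.get("type"))
        # Note: Python treats bool as an int, so this check is only approximate.
        if key in data and expected and not isinstance(data[key], expected):
            raise ValueError(f"field {key!r} has the wrong type")
    return data


example_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "total_value": {"type": "number"}},
    "required": ["name", "total_value"],
}
checked = validate_against_schema('{"name": "Growth", "total_value": 10}', example_schema)
```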
+
+ ## Current Implementation Status
+
+ ### HF Space (Transformers)
+ - ✅ Supports tool calling (text-based parsing)
+ - ✅ Supports `response_format`
+ - ✅ Works with PydanticAI's tool-based approach
+
+ ### vLLM
+ - ✅ Supports `extra_body.structured_outputs` (JSON schema)
+ - ❌ Does NOT support tool calling for structured outputs
+ - ✅ Supports `response_format` (basic JSON mode only)
+
+ ## Recommendation
+
+ For full compatibility with PydanticAI, we need to:
+
+ 1. **Detect the vLLM endpoint** in the PydanticAI provider
+ 2. **Use `extra_body.structured_outputs`** instead of tools when using vLLM
+ 3. **Parse `message.content`** instead of `tool_calls` for vLLM responses
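
The three steps above amount to a dispatch on the backend. The sketch below assumes detection is reduced to an explicit `backend` string supplied by configuration, since this document does not say how a vLLM endpoint would be detected. The function names and the tool name `final_result` are hypothetical.

```python
import json


def request_kwargs(backend: str, schema: dict, tool_name: str = "final_result") -> dict:
    """Pick the structured-output request arguments for the given backend."""
    if backend == "vllm":
        return {"extra_body": {"structured_outputs": {"json": schema}}}
    return {
        "tools": [{
            "type": "function",
            "function": {"name": tool_name, "parameters": schema},
        }],
        "tool_choice": "required",
    }


def read_result(backend: str, message: dict) -> dict:
    """Read the structured result from wherever the backend places it."""
    if backend == "vllm":
        return json.loads(message["content"])
    return json.loads(message["tool_calls"][0]["function"]["arguments"])


schema = {"type": "object", "properties": {"field": {"type": "string"}}}
vllm_kwargs = request_kwargs("vllm", schema)
tool_kwargs = request_kwargs("openai", schema)
```

The returned dict is meant to be spread into `client.chat.completions.create(model=..., messages=..., **kwargs)`.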
+
+ Alternatively, implement a middleware in the HF Space API that:
+ - Detects `tools` + `tool_choice="required"` requests
+ - Converts them to `extra_body.structured_outputs` for vLLM
+ - Wraps the response in tool call format for PydanticAI compatibility
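
The request-side half of such a middleware might be sketched as follows. Several assumptions are baked in: the function `convert_tool_request` is hypothetical, only requests with exactly one tool are converted (several tools would need a combined schema), and `structured_outputs` is placed at the top level of the forwarded JSON body, on the understanding that the OpenAI client merges `extra_body` into the request body.

```python
import copy
from typing import Optional, Tuple


def convert_tool_request(payload: dict) -> Tuple[dict, Optional[str]]:
    """Rewrite a forced single-tool request into a structured_outputs request.

    Returns the payload to forward and the tool name to use when wrapping
    the response, or (payload, None) when the request is left unchanged.
    """
    tools = payload.get("tools") or []
    if payload.get("tool_choice") != "required" or len(tools) != 1:
        return payload, None
    function = tools[0]["function"]
    converted = copy.deepcopy(payload)
    converted.pop("tools")
    converted.pop("tool_choice")
    converted["structured_outputs"] = {"json": copy.deepcopy(function["parameters"])}
    return converted, function["name"]


incoming = {
    "model": "DragonLLM/Qwen-Open-Finance-R-8B",
    "messages": [{"role": "user", "content": "Generate JSON..."}],
    "tools": [{
        "type": "function",
        "function": {"name": "final_result", "parameters": {"type": "object"}},
    }],
    "tool_choice": "required",
}
forwarded, tool_name = convert_tool_request(incoming)
```

The returned tool name is what the response-side step would need in order to rebuild a `tool_calls` entry from `message.content`.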
+
+ ## References
+
+ - [vLLM Structured Outputs Docs](https://docs.vllm.ai/en/stable/features/structured_outputs/)
+ - [PydanticAI Documentation](https://ai.pydantic.dev/)