open-finance-llm-8b / docs /openai_api_verification.md
jeanbaptdzd's picture
Reorganize tests and clean up documentation
6d3bf74
|
raw
history blame
6.96 kB

OpenAI API Compatibility Verification

Overview

This document verifies that our OpenAI API wrapper implementation correctly follows the OpenAI API specification and properly connects to the Qwen fine-tuned model.

Connection Flow

OpenAI-compatible Client
    ↓ (OpenAI API requests)
Hugging Face Space API (simple-llm-pro-finance)
    ↓ (FastAPI router)
TransformersProvider
    ↓ (Hugging Face Transformers)
Qwen-Open-Finance-R-8B Model

OpenAI API Specification Compliance

1. Chat Completions Endpoint: /v1/chat/completions

βœ… Request Parameters (All Supported)

Parameter Type Status Notes
model string βœ… Required, defaults to configured model
messages array βœ… Required, validated
temperature number βœ… Optional, default 0.7, validated (0-2)
max_tokens integer βœ… Optional, validated (β‰₯1)
stream boolean βœ… Optional, default false
top_p number βœ… Optional, default 1.0
tools array βœ… Optional, tool definitions
tool_choice string/object βœ… Optional, supports "none", "auto", "required"
response_format object βœ… Optional, supports {"type": "json_object"}

βœ… Response Format

Field Type Status Notes
id string βœ… Generated chat completion ID
object string βœ… "chat.completion"
created integer βœ… Unix timestamp
model string βœ… Model name
choices array βœ… Array of Choice objects
usage object βœ… Token usage statistics

βœ… Choice Object

Field Type Status Notes
index integer βœ… Choice index
message object βœ… Message object
finish_reason string βœ… "stop", "length", "tool_calls"

βœ… Message Object

Field Type Status Notes
role string βœ… "assistant"
content string/null βœ… Message content
tool_calls array/null βœ… Array of ToolCall objects

βœ… ToolCall Object

Field Type Status Notes
id string βœ… Tool call ID
type string βœ… "function"
function object βœ… FunctionCall object

βœ… FunctionCall Object

Field Type Status Notes
name string βœ… Function name
arguments string βœ… JSON string of arguments

2. Tool Choice Handling

βœ… Supported Values

  • "none": Model will not call any tools
  • "auto": Model can choose to call tools (default)
  • "required": Model must call a tool (converted to "auto" for text-based models)
  • {"type": "function", "function": {"name": "..."}}: Force specific tool

Implementation Note: Since Qwen is a text-based model (not native function calling), we convert "required" to "auto" and handle tool calls via text parsing.

3. Response Format Handling

βœ… JSON Object Mode

When response_format={"type": "json_object"} is provided:

  • βœ… System prompt is enhanced with JSON output instructions
  • βœ… Response is parsed to extract JSON from markdown code blocks
  • βœ… Clean JSON is returned for validation

Implementation: Since Qwen doesn't have native JSON mode, we enforce it via prompt engineering and post-processing.

Client Integration

βœ… Supported Parameters

The API accepts standard OpenAI API parameters:

{
    "model": "dragon-llm-open-finance",
    "messages": [...],
    "temperature": 0.7,
    "max_tokens": 3000,
    "response_format": {"type": "json_object"},  # βœ… Supported
    "tool_choice": "required",  # βœ… Accepted (converted to "auto")
    "tools": [...]  # βœ… Tool definitions supported
}

βœ… Implementation Details

  1. βœ… tool_choice="required" β†’ Accepted and converted to "auto"
  2. βœ… response_format={"type": "json_object"} β†’ JSON instructions added to prompt
  3. βœ… tools array β†’ Formatted and added to system prompt
  4. βœ… Tool calls in response β†’ Parsed from text and returned in OpenAI format

Qwen Model Integration

βœ… Model Connection

  1. Model Loading: βœ… Uses Hugging Face Transformers

    • Model: DragonLLM/Qwen-Open-Finance-R-8B
    • Tokenizer: Auto-loaded with model
    • Device: Auto (CUDA if available)
  2. Prompt Formatting: βœ… Uses Qwen chat template

    • System prompts properly formatted
    • Tools added to system prompt
    • JSON instructions added when needed
  3. Response Processing: βœ…

    • Text generation via Transformers
    • Tool call parsing from text
    • JSON extraction from markdown

βœ… Qwen-Specific Considerations

  1. Text-Based Tool Calls: Qwen doesn't have native function calling, so we:

    • Format tools in system prompt
    • Parse <tool_call>...</tool_call> blocks from response
    • Convert to OpenAI-compatible format
  2. JSON Output: Qwen doesn't have native JSON mode, so we:

    • Add JSON instructions to system prompt
    • Extract JSON from markdown code blocks
    • Validate and return clean JSON

Verification Checklist

API Compatibility

  • All required OpenAI API parameters supported
  • Response format matches OpenAI specification
  • Error handling follows OpenAI error format
  • Streaming support implemented
  • Tool calls properly formatted

Client Compatibility

  • tool_choice="required" accepted
  • response_format supported
  • Structured output requests handled correctly
  • Tool definitions passed through
  • Structured outputs extracted

Qwen Model Integration

  • Model loads correctly from Hugging Face
  • Chat template applied correctly
  • Tools formatted for Qwen prompt style
  • Tool calls parsed from Qwen text format
  • JSON extracted from Qwen responses

Testing Recommendations

  1. Basic Chat: Verify simple chat completions work
  2. Tool Calls: Test with tools defined, verify parsing
  3. Structured Outputs: Test with response_format, verify JSON extraction
  4. Error Handling: Test invalid requests return proper errors
  5. Streaming: Test streaming responses work correctly

Known Limitations

  1. Native Function Calling: Qwen doesn't support native function calling, so we use text-based parsing
  2. JSON Mode: Qwen doesn't have native JSON mode, so we enforce via prompts
  3. Tool Choice "required": Converted to "auto" since we can't force tool calls in text-based models

Conclusion

βœ… Our OpenAI API wrapper is correctly implemented and properly connected to the Qwen fine-tuned model.

The implementation:

  • Follows OpenAI API specification
  • Handles OpenAI-compatible parameters correctly
  • Properly integrates with Qwen model via Transformers
  • Provides fallbacks for features not natively supported by Qwen