# Reframr OpenAI-Compatible Runtime
The Reframr v3 runtime includes an OpenAI-style adapter so applications can plug Reframr into existing chat, support, and tool-orchestration systems without writing custom prompt glue.
## Chat Completion
```python
from pathlib import Path

from reframr import ReframrModel, build_chat_completion_response

model = ReframrModel.load(Path("model.safetensors"))

response = build_chat_completion_response(
    model,
    {
        "model": "reframr-v3",
        "messages": [
            {"role": "system", "content": "Be concise and cite sources when tool results are provided."},
            {"role": "user", "content": "Summarize this customer support issue."},
        ],
        "max_tokens": 160,
        "temperature": 0.58,
    },
)

print(response["choices"][0]["message"]["content"])
```
## Streaming
```python
from reframr.openai_compat import iter_sse_chat_completion

# `model` is the loaded ReframrModel from above; `request` is an
# OpenAI-style request dict with "stream": True.
for event in iter_sse_chat_completion(model, request):
    send_to_browser(event)
```
The stream emits OpenAI-style `chat.completion.chunk` SSE events and ends with:

```
data: [DONE]
```
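On the consuming side, these chunks can be handled like any OpenAI-style SSE stream. A sketch, assuming the standard OpenAI `choices[0].delta.content` chunk layout (adjust if Reframr's chunks differ):

```python
import json

def accumulate_stream(lines: list[str]) -> str:
    """Collect assistant text from OpenAI-style chat.completion.chunk SSE lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Synthetic events in the assumed chunk shape.
events = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(accumulate_stream(events))  # Hello
```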
## Tool Loop
Register real tools in the host application. Reframr can request a tool with `<tool_call>`, the host executes the function, and the result is fed back as `<tool_result>` / `<source>` evidence.
```python
from reframr.openai_compat import run_tool_loop

def web_search(arguments: dict[str, object]) -> dict[str, object]:
    query = str(arguments["query"])
    result = your_search_client.search(query)
    return {
        "ok": True,
        "source": {
            "title": result.title,
            "url": result.url,
            "snippet": result.snippet,
        },
    }

response = run_tool_loop(
    model,
    {
        "model": "reframr-v3",
        "messages": [
            {"role": "user", "content": "What changed in the latest official release notes?"}
        ],
    },
    tools={"web.search": web_search},
    max_rounds=3,
)
```
If a tool is missing or fails, the adapter sends the failure back as a tool result instead of crashing. That lets Reframr answer honestly, retry with a different tool if the model requests one, or ask the user for source evidence.
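The same failure-as-data contract can be applied defensively on the host side before results ever reach the adapter. A hypothetical wrapper; the `{"ok": False, "error": ...}` shape is an assumption mirroring the success payload above:

```python
from typing import Callable

def safe_tool(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
    """Wrap a tool so exceptions come back as tool-result data instead of raising."""
    def wrapped(arguments: dict) -> dict:
        try:
            return fn(arguments)
        except Exception as exc:
            # The model sees the failure as an honest tool result and can
            # retry another tool or report uncertainty.
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return wrapped

# Hypothetical failing tool, for illustration.
def flaky(arguments: dict) -> dict:
    raise TimeoutError("search backend unreachable")

print(safe_tool(flaky)({"query": "release notes"}))
# {'ok': False, 'error': 'TimeoutError: search backend unreachable'}
```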
## CLI
```shell
python -m reframr chat-completion --model model.safetensors < request.json
```
For SSE output, send a request with `"stream": true`:

```json
{
  "model": "reframr-v3",
  "stream": true,
  "messages": [
    {"role": "user", "content": "Write a short support reply."}
  ]
}
```
## Deployment Notes
- Keep real tools outside the model runtime and pass their outputs back as data.
- Treat source quality as part of the product: validate URLs, timestamps, permissions, and user access.
- Do not let the model fabricate tool results. If no tool result exists for a fresh fact, the app should ask for retrieval or return an uncertainty-aware answer.
- Use `session_id` with `python -m reframr serve` when you want conversation memory in the JSONL server.
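A sketch of the kind of source check the second note suggests. The field names match the `source` payload from the tool example; the host allowlist and 30-day freshness window are arbitrary illustrative choices:

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Hypothetical allowlist; in practice this comes from your product's policy.
ALLOWED_HOSTS = {"docs.example.com", "releases.example.com"}

def source_is_acceptable(source: dict, fetched_at: datetime,
                         max_age: timedelta = timedelta(days=30)) -> bool:
    """Reject sources with non-HTTPS URLs, unknown hosts, or stale timestamps."""
    parsed = urlparse(str(source.get("url", "")))
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_HOSTS:
        return False
    return datetime.now(timezone.utc) - fetched_at <= max_age

good = {"url": "https://docs.example.com/notes", "title": "Release notes"}
print(source_is_acceptable(good, datetime.now(timezone.utc)))   # True
print(source_is_acceptable({"url": "http://evil.test/x"},
                           datetime.now(timezone.utc)))         # False
```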