Apodex

Apodex-1.0 — A Verification-Centric Agent for Deep Research


📰Tech Blog | 📄Tech Report

1. Model Introduction

Apodex-1.0 is a verification-centric model for deep research. The trained model alone, Apodex-1.0, runs as a standard tool-using ReAct agent. Deployed in our heavy-duty mode — an asynchronous agent team in which sub-agents specialize in retrieval and verification, route their reports through a shared evidence pool, and feed a global verifier that reasons over the assembled evidence graph to produce the final answer — it becomes Apodex-1.0-H.

Under the hood, a high-quality data pipeline and a three-stage post-training recipe (SFT, agentic DPO, RL on long agentic rollouts) substantially raise the deep-research capability of the Qwen3.5 base while preserving its general knowledge, coding, reasoning, and instruction-following capabilities — the recipe is additive on the deep-research axis rather than a trade across axes.

Apodex-1.0-H sets a new state of the art across both open- and closed-source models on deep-research benchmarks. Every claim in the final report it produces is backed by an explicit evidence chain and independently audited by a verification team before delivery.

Online Service

Try Apodex at apodex.ai.

Key Features

  • Verification-centric agent team. Instead of one agent carrying the full cognitive load, an orchestrator dispatches a heavy-duty agent team whose specialized sub-agents explore in parallel, and a global verifier audits the assembled evidence before any answer is committed. In deployment, the team coordinates up to 150 sub-agents over 15,000 steps in a single task.
  • Auditable by construction. Every claim in the final answer traces back to a node in the evidence graph and is independently checked before delivery; the report pool records every finding, verdict, and intervention, so conclusions are auditable, retractable, and forkable.
  • Preserving general capabilities. The deep-research focus does not come at the expense of the base model. Our post-training is designed to preserve rather than override: across general knowledge, mathematics, instruction-following, coding and long-context, Apodex-1.0-mini and Apodex-1.0 track their matched-size Qwen3.5 bases within roughly a point.
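
To illustrate the idea of tracing claims back to evidence, here is a toy data-structure sketch. It is not Apodex's implementation: the `EvidenceNode`, `Claim`, and `audit` names, their fields, and the acceptance rule (a claim passes only if it cites at least one node and every cited node is marked verified) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceNode:
    node_id: str
    source: str
    verified: bool = False  # hypothetical flag set by a verifier sub-agent


@dataclass
class Claim:
    text: str
    evidence_ids: list[str] = field(default_factory=list)


def audit(claims: list[Claim], graph: dict[str, EvidenceNode]) -> dict[str, bool]:
    """Map each claim to True only if it cites evidence and all cited nodes are verified."""
    verdicts = {}
    for claim in claims:
        nodes = [graph.get(i) for i in claim.evidence_ids]
        verdicts[claim.text] = bool(nodes) and all(
            n is not None and n.verified for n in nodes
        )
    return verdicts


# Placeholder evidence graph and claims (the .example hosts are not real sources).
graph = {
    "e1": EvidenceNode("e1", "https://site-a.example/report", verified=True),
    "e2": EvidenceNode("e2", "https://site-b.example/post", verified=False),
}
claims = [
    Claim("Claim backed by verified evidence", ["e1"]),
    Claim("Claim resting on unverified evidence", ["e1", "e2"]),
    Claim("Claim with no evidence", []),
]
verdicts = audit(claims, graph)
for text, ok in verdicts.items():
    print("PASS" if ok else "FAIL", text)
```

In this sketch a failed verdict would be the point at which a claim is retracted or sent back for more retrieval; how the real system handles that is not specified here.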

2. Evaluation Results

To prevent potential information leakage (e.g., retrieving benchmark answers from public repositories), we block access to relevant benchmark-hosting websites during evaluation.
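
One simple way to enforce such a block is to filter URLs by hostname before a retrieval tool fetches them. The sketch below is an assumption about how this could be done, not a description of the actual evaluation harness; the domain list and the `is_blocked` helper are placeholders.

```python
from urllib.parse import urlparse

# Placeholder blocklist; the domains actually blocked during evaluation are not listed here.
BLOCKED_DOMAINS = {"benchmark-host.example", "answers-mirror.example"}


def is_blocked(url: str) -> bool:
    """Return True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


urls = [
    "https://benchmark-host.example/datasets/qa",
    "https://data.answers-mirror.example/dump.json",
    "https://en.wikipedia.org/wiki/Web_search_engine",
]
allowed = [u for u in urls if not is_blocked(u)]
print(allowed)
```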

Apodex-1.0-H sets a new state of the art across open- and closed-source frontier systems on the public deep-research suite, achieving 90.3 on BrowseComp, 84.1 on BrowseComp-ZH, 94.4 on DeepSearchQA, 60.8 on text-only HLE, 46.7 on FrontierScience-Research, 87.4 on FrontierScience-Olympiad, and 74.2 on SuperChem.

[Figure: Apodex Benchmarks]

Per-model breakdown across the open-weight checkpoints:

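| Model               | BrowseComp | BrowseComp-ZH | HLE-Text | DeepSearchQA |
|---------------------|------------|---------------|----------|--------------|
| Apodex-1.0-mini     | 71.5       | 80.6          | 46.8     | 82.2         |
| Apodex-1.0-4B-SFT   | 48.8       | 63.5          | 32.9     | 69.9         |
| Apodex-1.0-2B-SFT   | 27.9       | 35.0          | 18.2     | 49.9         |
| Apodex-1.0-0.8B-SFT | 13.9       | 10.7          | 11.2     | 25.8         |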

3. Quick Start

Apodex follows the Qwen3.5 chat template: tool calls are emitted as <tool_call><function=...><parameter=...> and reasoning as <think>...</think>. Launch the server with the matching parsers so that it returns the standard OpenAI-style tool_calls and reasoning_content fields.
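
To make the raw format concrete, the sketch below parses a hand-written sample of the tag-based markup into an OpenAI-style structure. It is only an illustration: the sample text and the `parse_tool_calls` helper are hypothetical, the `<think>` tag is assumed from the Qwen3 family convention, and the exact whitespace the model emits may differ. In practice the server-side parsers do this work for you.

```python
import json
import re

# Hand-written sample of the raw markup; real model output may differ in whitespace.
RAW = """<think>
The user wants the weather in London.
</think>

<tool_call>
<function=get_weather>
<parameter=location>
London
</parameter>
<parameter=unit>
fahrenheit
</parameter>
</function>
</tool_call>"""

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
FUNC_RE = re.compile(r"<function=([^>]+)>(.*?)</function>", re.DOTALL)
PARAM_RE = re.compile(r"<parameter=([^>]+)>(.*?)</parameter>", re.DOTALL)


def parse_tool_calls(text: str) -> dict:
    """Split raw text into reasoning and a list of OpenAI-style tool calls (toy parser)."""
    think = THINK_RE.search(text)
    calls = []
    for name, body in FUNC_RE.findall(text):
        # Every parameter value is kept as a stripped string in this sketch.
        args = {key.strip(): value.strip() for key, value in PARAM_RE.findall(body)}
        calls.append({
            "type": "function",
            "function": {"name": name.strip(), "arguments": json.dumps(args)},
        })
    return {
        "reasoning_content": think.group(1).strip() if think else None,
        "tool_calls": calls,
    }


parsed = parse_tool_calls(RAW)
print(parsed["reasoning_content"])
print(parsed["tool_calls"][0]["function"])
```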

3.1 Deployment

We recommend deploying Apodex with the latest SGLang or vLLM for an OpenAI-compatible endpoint.

# SGLang
python3 -m sglang.launch_server --model-path apodex/Apodex-1.0-35B-A3B --tp 8 --host 0.0.0.0 --port 1234 --context-length 262144 --tool-call-parser qwen3_coder --reasoning-parser qwen3

# vLLM
vllm serve apodex/Apodex-1.0-35B-A3B --tensor-parallel-size 8 --max-model-len 262144 --enable-auto-tool-choice --tool-call-parser qwen3_coder --reasoning-parser qwen3

3.2 Best Practices

For optimal performance in agentic tasks, we recommend:

temperature: 1.0
top_p: 0.95
repetition_penalty: 1.05
max_context_length: 262144
max_tokens: 32768
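
As a rough sketch, these settings can be collected into keyword arguments for an OpenAI-compatible chat-completions call. `repetition_penalty` is not a standard OpenAI field, so it is assumed here to travel through `extra_body`. The `sampling_kwargs` helper and its rule of shrinking `max_tokens` when the prompt leaves less room are hypothetical conveniences, not part of Apodex.

```python
MAX_CONTEXT_LENGTH = 262144  # recommended context window
MAX_TOKENS = 32768           # recommended generation budget per call


def sampling_kwargs(prompt_tokens: int = 0) -> dict:
    """Build request kwargs, shrinking max_tokens if the prompt leaves less room."""
    remaining = MAX_CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return {
        "temperature": 1.0,
        "top_p": 0.95,
        "max_tokens": min(MAX_TOKENS, remaining),
        "extra_body": {"repetition_penalty": 1.05},
    }


print(sampling_kwargs())
print(sampling_kwargs(prompt_tokens=250_000)["max_tokens"])  # 262144 - 250000 = 12144
```

The result can be unpacked into a request, for example `client.chat.completions.create(model=..., messages=..., **sampling_kwargs())`.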

3.3 Agentic Usage

Apodex is trained for native function calling — tool schemas are passed via the tools= parameter of the chat-completions API and rendered into the prompt by the chat template, so the system prompt itself only needs to set the role and the high-level objective. We recommend the prompt below (this is the prompt used in our internal evaluation runs):

You are Apodex, an AI assistant developed by Apodex AI.

Apodex is the flagship agent of Apodex AI. Rather than a conventional conversational LLM, it is a general-purpose solver designed for mission-critical tasks.

Current time: {today_date}. In this environment you have access to a set of tools you can use to answer the user's question.

You only have access to the tools provided. You can use multiple tools per message, and will receive the results of those tools in the user's next response. You use tools step-by-step to accomplish a given task.

# General Objective

You accomplish a given task iteratively, breaking it down into clear steps and working through them methodically.

Substitute {today_date} with the current date (e.g. 2026-06-01). Do not inline tool descriptions in the system prompt — pass them via tools= so the Qwen3.5 chat template can emit the correct <tool_call><function=...> format and the server-side qwen3_coder parser can recover structured tool_calls for you.

The example below runs Apodex as a tool-using agent against an OpenAI-compatible endpoint (the SGLang / vLLM server launched above). The agent loops — executing the requested tools and feeding results back as role="tool" messages — until the model produces a final answer with no tool calls.

Before running, set the endpoint:

export OPENAI_API_KEY="EMPTY"           # any non-empty string for local servers
export BASE_URL="http://localhost:1234/v1"

Python code example:

import json
import os
from datetime import date
from openai import OpenAI


# -------- 1. Tool implementations --------
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get current weather information for a city (simulated)."""
    table = {
        "London":   {"temperature": 15, "condition": "sunny",  "humidity": 45},
        "New York": {"temperature": 20, "condition": "cloudy", "humidity": 60},
        "Tokyo":    {"temperature": 25, "condition": "rainy",  "humidity": 75},
    }
    w = dict(table.get(location, {"temperature": 18, "condition": "unknown", "humidity": 50}))
    if unit == "fahrenheit":
        w["temperature"] = w["temperature"] * 9 / 5 + 32
        w["unit"] = "°F"
    else:
        w["unit"] = "°C"
    return json.dumps(w, ensure_ascii=False)


def calculate(expression: str) -> str:
    """Evaluate a Python-style arithmetic expression."""
    # Demo only: eval() executes arbitrary code, so sandbox or replace it before real use.
    try:
        return json.dumps({"expression": expression, "result": eval(expression)}, ensure_ascii=False)
    except Exception as e:
        return json.dumps({"expression": expression, "error": str(e)}, ensure_ascii=False)


available_tools = {"get_weather": get_weather, "calculate": calculate}


# -------- 2. Tool schemas (OpenAI function-calling format) --------
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather information for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name, e.g. 'London'."},
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit (default: celsius).",
                    },
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a Python-style arithmetic expression.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Expression to evaluate, e.g. '(25 + 15) * 3 - 10'.",
                    },
                },
                "required": ["expression"],
            },
        },
    },
]


# -------- 3. System prompt --------
SYSTEM_PROMPT = f"""You are Apodex, an AI assistant developed by Apodex AI.

Apodex is the flagship agent of Apodex AI. Rather than a conventional conversational LLM, it is a general-purpose solver designed for mission-critical tasks.

Current time: {date.today()}. In this environment you have access to a set of tools you can use to answer the user's question.

You only have access to the tools provided. You can use multiple tools per message, and will receive the results of those tools in the user's next response. You use tools step-by-step to accomplish a given task.

# General Objective

You accomplish a given task iteratively, breaking it down into clear steps and working through them methodically."""


# -------- 4. Agentic loop --------
def run_agent(user_query: str, model: str = "apodex/Apodex-1.0-35B-A3B", max_turns: int = 20):
    client = OpenAI(
        api_key=os.environ.get("OPENAI_API_KEY", "EMPTY"),
        base_url=os.environ.get("BASE_URL", "http://localhost:1234/v1"),
    )

    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",   "content": user_query},
    ]
    print(f"\n{'=' * 60}\nUser: {user_query}\n{'=' * 60}\n")

    for turn in range(max_turns):
        resp = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=tools,
            parallel_tool_calls=True,
            temperature=1.0,
            top_p=0.95,
            max_tokens=16384,
            extra_body={"repetition_penalty": 1.05},
        )
        msg = resp.choices[0].message

        # Optional: print reasoning if the server exposes it (qwen3 reasoning parser)
        reasoning = getattr(msg, "reasoning_content", None)
        if reasoning:
            print(f"[think] {reasoning.strip()}\n")
        if msg.content:
            print(f"[assistant] {msg.content.strip()}\n")

        messages.append(msg)

        # No more tool calls -> final answer
        if not msg.tool_calls:
            print(f"💬 Final answer:\n{msg.content}\n")
            return msg.content

        # Execute every tool call requested in this turn
        for call in msg.tool_calls:
            name = call.function.name
            args = json.loads(call.function.arguments or "{}")
            print(f"🔧 call {name}({args})")
            try:
                # Look up the tool by name and call it with the parsed arguments.
                result = available_tools[name](**args)
            except Exception as e:
                result = json.dumps({"error": f"{type(e).__name__}: {e}"}, ensure_ascii=False)
            print(f"   ↳ {result}\n")
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result,
            })

    print("⚠️  Reached max_turns without a final answer.")
    return None


if __name__ == "__main__":
    run_agent("What's the weather in London in Fahrenheit, and what's (25 + 15) * 3 - 10?")

4. License

Apodex-1.0 is released under Apache 2.0.

5. Citation

If you find this project useful in your research, please consider citing:

@article{apodex2026,
  title={Apodex-1.0: A Verification-Centric Agent Team for Discoverative Intelligence},
  author={Apodex Team},
  year={2026}
}

Contact

Reach the Apodex Team via our website.
