Getting Started with OpenAI Agents SDK and Hugging Face

Community Article Published January 12, 2026

👉 Here is the video version of this post: https://youtu.be/kKc0FwiuRg8

In the previous tutorials, we built an agent from scratch and then explored smolagents, Hugging Face's minimalist framework. Now let's look at another popular option: OpenAI's Agents SDK.

The OpenAI Agents SDK is a lightweight framework for building agentic applications. Despite the name, it works with any LLM provider through LiteLLM integration. It offers a clean API for tools, multi-agent orchestration, and built-in features like streaming and sessions.

By the end of this tutorial, you'll be able to:

  • Create agents with custom tools
  • Use non-OpenAI models (including Hugging Face models)
  • Get structured outputs with Pydantic
  • Build multi-agent systems with handoffs
  • Persist conversations with sessions
  • Stream responses in real time

Let's get started.

Setup

Install the SDK:

pip install openai-agents

For non-OpenAI models, install the LiteLLM extension:

pip install "openai-agents[litellm]"

Set up your API key:

import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

Your First Agent

Creating an agent is straightforward. (The snippets in this post use top-level await, as you would in a Jupyter notebook; in a plain script, wrap them in asyncio.run.)

from agents import Agent, function_tool, Runner

@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

agent = Agent(
    name="Haiku agent",
    instructions="Always respond in haiku form",
    model="gpt-4o-mini",
    tools=[get_weather],
)

result = await Runner.run(agent, "What's the weather in New York?")
print(result.final_output)

Output:

Sun warms glass and stone,
Blue sky folds the city bright—
Sunny streets hum life.

The @function_tool decorator works similarly to smolagents' @tool. It parses the docstring and type hints to generate the tool schema automatically.

Using Non-OpenAI Models

One of the SDK's strengths is model flexibility. You can use Hugging Face models through the LiteLLM integration:

import os
import getpass

os.environ["HF_TOKEN"] = getpass.getpass("Enter your Hugging Face token: ")
from agents import Agent, Runner, ModelSettings
from agents.extensions.models.litellm_model import LitellmModel

model = LitellmModel(
    model="huggingface/novita/MiniMaxAI/MiniMax-M2.1",
    api_key=os.environ["HF_TOKEN"],
)

agent = Agent(
    name="HF agent",
    instructions="Always respond in haiku form",
    tools=[get_weather],
    model=model,
    model_settings=ModelSettings(include_usage=True),
)

result = await Runner.run(agent, "What's the weather in New York?")
print(result.final_output)

The model string follows the pattern: huggingface/<provider>/<org>/<model>. This lets you use any model available through Hugging Face's Inference Providers.
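If you switch models often, a tiny helper keeps the pattern in one place. This is a hypothetical convenience function, not part of the SDK; it just assembles the huggingface/&lt;provider&gt;/&lt;org&gt;/&lt;model&gt; string described above:

```python
def hf_model_id(provider: str, org: str, model: str) -> str:
    """Build a LiteLLM model string for Hugging Face Inference Providers."""
    return f"huggingface/{provider}/{org}/{model}"

print(hf_model_id("novita", "MiniMaxAI", "MiniMax-M2.1"))
# huggingface/novita/MiniMaxAI/MiniMax-M2.1
```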

Structured Output

Need typed responses? Use Pydantic models with output_type:

from pydantic import BaseModel
from agents import Agent, Runner

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

model = LitellmModel(
    model="huggingface/novita/zai-org/GLM-4.7",
    api_key=os.environ["HF_TOKEN"],
)

agent = Agent(
    name="Calendar extractor",
    instructions="Extract calendar events from text",
    output_type=CalendarEvent,
    model=model,
)

result = await Runner.run(
    agent,
    "Extract the event: 'Meeting with Alice and Bob on July 5th.'",
)

print(result.final_output)

Output:

name='Meeting with Alice and Bob' date='July 5th' participants=['Alice', 'Bob']

When using non-OpenAI models, make sure they support both structured output AND tool calling.
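Conceptually, structured output boils down to validating the model's JSON against your schema. Here is a rough offline sketch with plain Pydantic (no agent or model involved; the raw JSON string stands in for a model response):

```python
from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

# Simulated model output: validated against the schema, just as the SDK
# validates responses against output_type.
raw = '{"name": "Meeting", "date": "July 5th", "participants": ["Alice", "Bob"]}'
event = CalendarEvent.model_validate_json(raw)
print(event.participants)
```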

Multi-Agent Systems

The SDK supports two main multi-agent architectures:

Handoffs (Decentralized)

Agents hand off control to specialized peers:

from agents import Agent

history_tutor = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries.",
    model=model,
)

math_tutor = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You help with math problems. Explain your reasoning step by step.",
    model=model,
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent to use based on the user's question",
    handoffs=[history_tutor, math_tutor],
    model=model,
)

result = await Runner.run(
    triage_agent,
    "Can you explain the causes of World War II?",
)

print(result.final_output)

The triage agent analyzes the query and hands off to the history tutor, which then takes over the conversation.

Agents as Tools (Centralized)

A manager orchestrates sub-agents as tools:

manager = Agent(
    name="Manager Agent",
    instructions="Manage a team of agents to answer questions effectively.",
    tools=[
        history_tutor.as_tool(
            tool_name="history_tutor",
            tool_description="Handles historical queries",
        ),
        math_tutor.as_tool(
            tool_name="math_tutor",
            tool_description="Handles math questions",
        ),
    ],
    model=model,
)

result = await Runner.run(
    manager,
    "Can you explain the causes of World War II?",
)

print(result.final_output)

The difference? With handoffs, the specialist agent takes over completely. With tools, the manager stays in control and integrates the sub-agent's response.

Built-in Tools

The SDK includes several pre-built tools. Note that these are hosted tools and only work when using OpenAI models:

from agents import Agent, Runner, WebSearchTool

agent = Agent(
    name="Assistant",
    tools=[WebSearchTool()],
)

result = await Runner.run(
    agent,
    "Who is the current president of the United States?"
)
print(result.final_output)

Available built-in tools:

  • WebSearchTool - Search the web
  • FileSearchTool - Search OpenAI Vector Stores
  • ComputerTool - Automate computer use tasks
  • CodeInterpreterTool - Execute code in a sandbox
  • ImageGenerationTool - Generate images from prompts
  • LocalShellTool - Run shell commands locally

Custom Tools

Creating custom tools is simple with the @function_tool decorator:

from typing import Any
from typing_extensions import TypedDict
from agents import Agent, RunContextWrapper, function_tool


class Location(TypedDict):
    lat: float
    long: float

@function_tool
async def fetch_weather(location: Location) -> str:
    """Fetch the weather for a given location.

    Args:
        location: The location to fetch the weather for.
    """
    return f"The weather at {location['lat']}, {location['long']} is sunny"


@function_tool(name_override="fetch_data")
def read_file(ctx: RunContextWrapper[Any], path: str, directory: str | None = None) -> str:
    """Read the contents of a file.

    Args:
        path: The path to the file to read.
        directory: The directory to read the file from.
    """
    return "Hello, World!"


agent = Agent(
    name="Assistant",
    tools=[fetch_weather, read_file],
    model=model,
)

Key features:

  • Type hints define the parameter schema
  • Docstrings become the tool description
  • name_override lets you customize the tool name
  • Tools can be sync or async
  • Access runtime context via RunContextWrapper

Sessions

Persist conversations across multiple interactions:

from agents import Agent, Runner, SQLiteSession

agent = Agent(
    name="Assistant",
    instructions="Reply very concisely.",
    model=model,
)

# Create a session
session = SQLiteSession(session_id="conv_123")

# First message
result = await Runner.run(
    agent,
    "What city is the Golden Gate Bridge in?",
    session=session
)
print(result.final_output)  # "San Francisco"

# Continue the conversation
result = await Runner.run(
    agent,
    "What state is it in?",
    session=session
)
print(result.final_output)  # "California"

The session stores the conversation history, so the agent remembers context between calls.

Streaming

For real-time responses, use run_streamed:

import asyncio
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner

async def main():
    agent = Agent(
        name="Joker",
        instructions="You are a helpful assistant.",
        model=model,
    )

    result = Runner.run_streamed(agent, input="Tell me 5 jokes.")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

asyncio.run(main())

This prints each token as it arrives, giving users immediate feedback.

Comparison with smolagents

Both frameworks do similar things, but have different philosophies:

Feature           OpenAI Agents SDK   smolagents
Tool decorator    @function_tool      @tool
Multi-agent       Handoffs + Tools    Manager patterns
Sessions          Built-in SQLite     Manual
Hub integration   None                Hugging Face Hub
UI                None                Built-in Gradio

Choose smolagents if you want tight Hugging Face integration and instant UIs. Choose OpenAI Agents SDK if you want a clean async API with built-in session management.

Recap

The OpenAI Agents SDK provides:

  • Clean API: Simple decorators and async/await patterns
  • Model flexibility: Works with any LLM through LiteLLM
  • Structured output: Native Pydantic support
  • Multi-agent: Both handoff and tool-based orchestration
  • Sessions: Built-in conversation persistence
  • Streaming: Real-time response streaming

Combined with what you learned in the previous tutorials, you now have a solid toolkit for building AI agents with different frameworks.

Full Code

import os
import getpass
from agents import Agent, Runner, function_tool, ModelSettings, SQLiteSession
from agents.extensions.models.litellm_model import LitellmModel

# Setup
os.environ["HF_TOKEN"] = getpass.getpass("Enter your Hugging Face token: ")

# Initialize model
model = LitellmModel(
    model="huggingface/novita/zai-org/GLM-4.7",
    api_key=os.environ["HF_TOKEN"],
)

# Custom tool
@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

# Multi-agent setup
history_tutor = Agent(
    name="History Tutor",
    handoff_description="Specialist for historical questions",
    instructions="Provide assistance with historical queries.",
    model=model,
)

math_tutor = Agent(
    name="Math Tutor",
    handoff_description="Specialist for math questions",
    instructions="Help with math problems. Explain reasoning step by step.",
    model=model,
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent to use based on the question",
    handoffs=[history_tutor, math_tutor],
    model=model,
)

# Run with session (wrapped in asyncio.run so this also works as a plain script)
import asyncio

session = SQLiteSession(session_id="demo_session")

async def main():
    result = await Runner.run(
        triage_agent,
        "What were the main causes of World War II?",
        session=session,
    )
    print(result.final_output)

asyncio.run(main())
