Getting Started with OpenAI Agents SDK and Hugging Face
👉 Here is the video version of this post: https://youtu.be/kKc0FwiuRg8
In the previous tutorials, we built an agent from scratch and then explored smolagents, Hugging Face's minimalist framework. Now let's look at another popular option: OpenAI's Agents SDK.
The OpenAI Agents SDK is a lightweight framework for building agentic applications. Despite the name, it works with any LLM provider through LiteLLM integration. It offers a clean API for tools, multi-agent orchestration, and built-in features like streaming and sessions.
By the end of this tutorial, you'll be able to:
- Create agents with custom tools
- Use non-OpenAI models (including Hugging Face models)
- Get structured outputs with Pydantic
- Build multi-agent systems with handoffs
- Persist conversations with sessions
- Stream responses in real-time
Let's get started.
Setup
Install the SDK:
```bash
pip install openai-agents
```
For non-OpenAI models, install the LiteLLM extension:
```bash
pip install "openai-agents[litellm]"
```
Set up your API key:
```python
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
```
Your First Agent
Creating an agent is straightforward:
```python
from agents import Agent, function_tool, Runner

@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

agent = Agent(
    name="Haiku agent",
    instructions="Always respond in haiku form",
    model="gpt-4o-mini",
    tools=[get_weather],
)

# Top-level await works in notebooks; in a script, wrap this in asyncio.run()
result = await Runner.run(agent, "What's the weather in New York?")
print(result.final_output)
```
Output:
```text
Sun warms glass and stone,
Blue sky folds the city bright—
Sunny streets hum life.
```
The `@function_tool` decorator works similarly to smolagents' `@tool`. It parses the docstring and type hints to generate the tool schema automatically.
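To make that concrete, here is a rough, self-contained sketch of how a decorator can derive a JSON-Schema-style tool definition from a function's signature and docstring. This is illustrative only, not the SDK's actual implementation, and `build_tool_schema` is a made-up helper:

```python
import inspect

# Map a few Python annotations to JSON Schema types (illustration only)
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def build_tool_schema(fn):
    """Derive a tool schema from a function's signature and docstring."""
    sig = inspect.signature(fn)
    properties = {
        name: {"type": TYPE_MAP.get(param.annotation, "string")}
        for name, param in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(sig.parameters),
        },
    }

def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

schema = build_tool_schema(get_weather)
print(schema["name"])                              # get_weather
print(schema["parameters"]["properties"]["city"])  # {'type': 'string'}
```

The schema (name, description, parameter types) is exactly what gets sent to the model so it knows when and how to call the tool.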
Using Non-OpenAI Models
One of the SDK's strengths is model flexibility. You can use Hugging Face models through the LiteLLM integration:
```python
import os
import getpass

os.environ["HF_TOKEN"] = getpass.getpass("Enter your Hugging Face token: ")

from agents import Agent, Runner, ModelSettings
from agents.extensions.models.litellm_model import LitellmModel

model = LitellmModel(
    model="huggingface/novita/MiniMaxAI/MiniMax-M2.1",
    api_key=os.environ["HF_TOKEN"],
)

agent = Agent(
    name="HF agent",
    instructions="Always respond in haiku form",
    tools=[get_weather],
    model=model,
    model_settings=ModelSettings(include_usage=True),
)

result = await Runner.run(agent, "What's the weather in New York?")
print(result.final_output)
```
The model string follows the pattern `huggingface/<provider>/<org>/<model>`. This lets you use any model available through Hugging Face's Inference Providers.
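As a quick sanity check, you can split such a string yourself. This is a throwaway helper for illustration; `parse_hf_model_string` is not part of the SDK or LiteLLM:

```python
def parse_hf_model_string(model: str) -> dict:
    """Split a LiteLLM-style Hugging Face model string of the form
    huggingface/<provider>/<org>/<model> into its parts."""
    prefix, provider, org, name = model.split("/", 3)
    if prefix != "huggingface":
        raise ValueError(f"expected a huggingface/ prefix, got {prefix!r}")
    return {"provider": provider, "org": org, "model": name}

print(parse_hf_model_string("huggingface/novita/MiniMaxAI/MiniMax-M2.1"))
# {'provider': 'novita', 'org': 'MiniMaxAI', 'model': 'MiniMax-M2.1'}
```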
Structured Output
Need typed responses? Use Pydantic models with `output_type`:
```python
from pydantic import BaseModel
from agents import Agent, Runner

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

model = LitellmModel(
    model="huggingface/novita/zai-org/GLM-4.7",
    api_key=os.environ["HF_TOKEN"],
)

agent = Agent(
    name="Calendar extractor",
    instructions="Extract calendar events from text",
    output_type=CalendarEvent,
    model=model,
)

result = await Runner.run(
    agent,
    "Extract the event: 'Meeting with Alice and Bob on July 5th.'",
)
print(result.final_output)
```
Output:
```text
name='Meeting with Alice and Bob' date='July 5th' participants=['Alice', 'Bob']
```
When using non-OpenAI models, make sure the model supports both structured output and tool calling; not every model does.
Multi-Agent Systems
The SDK supports two main multi-agent architectures:
Handoffs (Decentralized)
Agents hand off control to specialized peers:
```python
from agents import Agent

history_tutor = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries.",
    model=model,
)

math_tutor = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You help with math problems. Explain your reasoning step by step.",
    model=model,
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent to use based on the user's question",
    handoffs=[history_tutor, math_tutor],
    model=model,
)

result = await Runner.run(
    triage_agent,
    "Can you explain the causes of World War II?",
)
print(result.final_output)
```
The triage agent analyzes the query and hands off to the history tutor, which then takes over the conversation.
Agents as Tools (Centralized)
A manager orchestrates sub-agents as tools:
```python
manager = Agent(
    name="Manager Agent",
    instructions="Manage a team of agents to answer questions effectively.",
    tools=[
        history_tutor.as_tool(
            tool_name="history_tutor",
            tool_description="Handles historical queries",
        ),
        math_tutor.as_tool(
            tool_name="math_tutor",
            tool_description="Handles math questions",
        ),
    ],
    model=model,
)

result = await Runner.run(
    manager,
    "Can you explain the causes of World War II?",
)
print(result.final_output)
```
The difference? With handoffs, the specialist agent takes over completely. With tools, the manager stays in control and integrates the sub-agent's response.
Built-in Tools
The SDK includes several pre-built tools when using OpenAI models:
```python
from agents import Agent, Runner, WebSearchTool

agent = Agent(
    name="Assistant",
    tools=[WebSearchTool()],
)

result = await Runner.run(
    agent,
    "Who is the current president of the United States?",
)
print(result.final_output)
```
Available built-in tools:
- `WebSearchTool` - Search the web
- `FileSearchTool` - Search OpenAI Vector Stores
- `ComputerTool` - Automate computer use tasks
- `CodeInterpreterTool` - Execute code in a sandbox
- `ImageGenerationTool` - Generate images from prompts
- `LocalShellTool` - Run shell commands locally
Custom Tools
Creating custom tools is simple with the `@function_tool` decorator:
```python
from typing_extensions import TypedDict, Any
from agents import Agent, RunContextWrapper, function_tool

class Location(TypedDict):
    lat: float
    long: float

@function_tool
async def fetch_weather(location: Location) -> str:
    """Fetch the weather for a given location.

    Args:
        location: The location to fetch the weather for.
    """
    return f"The weather at {location['lat']}, {location['long']} is sunny"

@function_tool(name_override="fetch_data")
def read_file(ctx: RunContextWrapper[Any], path: str, directory: str | None = None) -> str:
    """Read the contents of a file.

    Args:
        path: The path to the file to read.
        directory: The directory to read the file from.
    """
    return "Hello, World!"

agent = Agent(
    name="Assistant",
    tools=[fetch_weather, read_file],
    model=model,
)
```
Key features:
- Type hints define the parameter schema
- Docstrings become the tool description
- `name_override` lets you customize the tool name
- Tools can be sync or async
- Access runtime context via `RunContextWrapper`
Sessions
Persist conversations across multiple interactions:
```python
from agents import Agent, Runner, SQLiteSession

agent = Agent(
    name="Assistant",
    instructions="Reply very concisely.",
    model=model,
)

# Create a session
session = SQLiteSession(session_id="conv_123")

# First message
result = await Runner.run(
    agent,
    "What city is the Golden Gate Bridge in?",
    session=session,
)
print(result.final_output)  # "San Francisco"

# Continue the conversation
result = await Runner.run(
    agent,
    "What state is it in?",
    session=session,
)
print(result.final_output)  # "California"
```
The session stores the conversation history, so the agent remembers context between calls.
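Conceptually, a session is just a persistent list of message items that gets fed back into each run. Here is a toy in-memory stand-in that illustrates the idea; `InMemorySession` is made up for this sketch and is not the SDK's `SQLiteSession`:

```python
class InMemorySession:
    """Toy session store: keeps the full message history so each
    new run sees all prior turns (illustration only)."""
    def __init__(self, session_id: str):
        self.session_id = session_id
        self._items = []

    def add_items(self, items):
        self._items.extend(items)

    def get_items(self):
        return list(self._items)

session = InMemorySession("conv_123")
session.add_items([
    {"role": "user", "content": "What city is the Golden Gate Bridge in?"},
    {"role": "assistant", "content": "San Francisco."},
])
session.add_items([{"role": "user", "content": "What state is it in?"}])

# The next model call would receive all three items, so "it" resolves
# to San Francisco even though this turn never mentions the bridge.
print(len(session.get_items()))  # 3
```

`SQLiteSession` does the same thing but writes the items to a SQLite database, so the history also survives process restarts.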
Streaming
For real-time responses, use `run_streamed`:
```python
import asyncio
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner

async def main():
    agent = Agent(
        name="Joker",
        instructions="You are a helpful assistant.",
        model=model,
    )

    result = Runner.run_streamed(agent, input="Tell me 5 jokes.")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

asyncio.run(main())
```
This prints each token as it arrives, giving users immediate feedback.
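The consumption pattern is just an `async for` loop over incremental deltas. Here is a dependency-free sketch that uses a fake token stream in place of `result.stream_events()`; `fake_token_stream` and `consume` are made-up names for illustration:

```python
import asyncio

async def fake_token_stream(text: str):
    """Simulated token-by-token stream, standing in for a network response."""
    for token in text.split(" "):
        await asyncio.sleep(0)  # yield to the event loop, like a socket read would
        yield token + " "

async def consume() -> str:
    chunks = []
    async for delta in fake_token_stream("Why did the chicken cross the road?"):
        chunks.append(delta)  # a real UI would print(delta, end="", flush=True)
    return "".join(chunks).rstrip()

print(asyncio.run(consume()))  # Why did the chicken cross the road?
```

Because the loop body runs as each delta arrives, the UI can render partial output instead of blocking until the full response is ready.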
Comparison with smolagents
Both frameworks do similar things, but have different philosophies:
| Feature | OpenAI Agents SDK | smolagents |
|---|---|---|
| Tool decorator | `@function_tool` | `@tool` |
| Multi-agent | Handoffs + Tools | Manager patterns |
| Sessions | Built-in SQLite | Manual |
| Hub integration | None | Hugging Face Hub |
| UI | None | Built-in Gradio |
Choose smolagents if you want tight Hugging Face integration and instant UIs. Choose OpenAI Agents SDK if you want a clean async API with built-in session management.
Recap
The OpenAI Agents SDK provides:
- Clean API: Simple decorators and async/await patterns
- Model flexibility: Works with any LLM through LiteLLM
- Structured output: Native Pydantic support
- Multi-agent: Both handoff and tool-based orchestration
- Sessions: Built-in conversation persistence
- Streaming: Real-time response streaming
Combined with what you learned in the previous tutorials, you now have a solid toolkit for building AI agents with different frameworks.
Full Code
```python
import asyncio
import os
import getpass

from agents import Agent, Runner, function_tool, SQLiteSession
from agents.extensions.models.litellm_model import LitellmModel

# Setup
os.environ["HF_TOKEN"] = getpass.getpass("Enter your Hugging Face token: ")

# Initialize model
model = LitellmModel(
    model="huggingface/novita/zai-org/GLM-4.7",
    api_key=os.environ["HF_TOKEN"],
)

# Custom tool
@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

# Multi-agent setup
history_tutor = Agent(
    name="History Tutor",
    handoff_description="Specialist for historical questions",
    instructions="Provide assistance with historical queries.",
    model=model,
)

math_tutor = Agent(
    name="Math Tutor",
    handoff_description="Specialist for math questions",
    instructions="Help with math problems. Explain reasoning step by step.",
    model=model,
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent to use based on the question",
    handoffs=[history_tutor, math_tutor],
    model=model,
)

async def main():
    # Run with session
    session = SQLiteSession(session_id="demo_session")
    result = await Runner.run(
        triage_agent,
        "What were the main causes of World War II?",
        session=session,
    )
    print(result.final_output)

asyncio.run(main())
```