title: Codex As API
emoji: 🤖
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
Codex-as-API
An OpenAI-compatible HTTP API backed by the OpenAI Codex CLI,
authenticated with your ChatGPT login (no API key). Runs on a Hugging Face
Docker Space; auth and sessions persist in the mounted /data bucket so they
survive restarts and rebuilds.
⚠️ Personal use only.
auth.json contains your ChatGPT access tokens; treat it like a password. The API is protected by a bearer token; keep your Space's API_TOKEN secret.
How it works
    client (OpenAI SDK)
       │  Authorization: Bearer $API_TOKEN   (stream=true -> live SSE tokens)
       ▼
    FastAPI /v1/chat/completions
       │  JSON-RPC over stdio (one short-lived process per turn):
       ▼
    codex app-server
         initialize -> thread/start | thread/resume -> turn/start
         <- item/agentMessage/delta {delta}   ← streamed token-by-token
         <- item/completed / turn/completed / thread/tokenUsage/updated
         (cwd = /data/sessions/<id>/workspace, sandbox = workspace-write, approvals never)
       │
       ▼
    /data (bucket)
       ├─ .codex/auth.json    ← your ChatGPT login (you upload this once)
       ├─ .codex/AGENTS.md    ← global safety rules (no delete, etc.)
       └─ sessions/<id>/      ← per-session workspace + Codex thread id
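The per-session layout in the diagram can be sketched as a small path helper. This is only an illustration: the function name session_workspace and the allowed character set for session ids are assumptions, not the Space's actual code. Only the /data/sessions/<id>/workspace layout comes from the diagram.

```python
import re
from pathlib import Path

DATA_ROOT = Path("/data")  # bucket mount point, as shown in the diagram

# Assumption: session ids are limited to a conservative character set so that
# a client-supplied id can never contain a path separator.
_SAFE_ID = re.compile(r"[A-Za-z0-9._-]{1,64}")


def session_workspace(session_id: str, root: Path = DATA_ROOT) -> Path:
    """Return <root>/sessions/<id>/workspace, rejecting ids that could escape it."""
    if session_id in {".", ".."} or not _SAFE_ID.fullmatch(session_id):
        raise ValueError(f"invalid session id: {session_id!r}")
    return root / "sessions" / session_id / "workspace"
```

Validating the id before building the path keeps a header value such as ../../.codex from resolving outside the sessions directory.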
Streaming is real, not simulated. The App Server emits
item/agentMessage/delta events as the model generates, which the API forwards as OpenAI SSE chunks. (codex exec cannot do this; it only returns the whole message at once.)
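To make the forwarding step concrete, here is a minimal sketch of how one delta notification could be wrapped as an OpenAI-style SSE chunk. The helper names (sse_chunk, sse_done, forward) are hypothetical, and the notification envelope (a dict with method and params keys) is an assumption; only the method name item/agentMessage/delta and its delta field come from the description above.

```python
import json
import time
from typing import Optional


def sse_chunk(delta_text: str, completion_id: str, model: str = "codex") -> str:
    """Wrap one text delta as a chat.completion.chunk server-sent event."""
    payload = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": delta_text}, "finish_reason": None}
        ],
    }
    # Each SSE event is a "data:" line followed by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


def sse_done() -> str:
    """Terminator that OpenAI-compatible clients expect at the end of a stream."""
    return "data: [DONE]\n\n"


def forward(event: dict, completion_id: str) -> Optional[str]:
    """Translate an App Server notification into an SSE chunk, or None to skip it.

    Assumed envelope: {"method": "...", "params": {"delta": "..."}}.
    """
    if event.get("method") == "item/agentMessage/delta":
        return sse_chunk(event.get("params", {}).get("delta", ""), completion_id)
    return None
```

A server would write each non-None result to the response as it arrives and finish with sse_done() once the turn completes.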
One-time setup
1. Mount the bucket at /data
Already done in your Space settings (sarveshpatel/cli-storage → /data, Read & Write).
2. Set the Space secret
In Settings → Variables and secrets, add a secret:
| Name | Value |
|---|---|
| API_TOKEN | a long random string (your API key for this service) |
Optional variables:
| Name | Default | Meaning |
|---|---|---|
| CODEX_SANDBOX | workspace-write | read-only for chat-only, workspace-write to let Codex edit files |
| CODEX_MODEL | (unset) | pin a Codex model, e.g. gpt-5-codex |
| CODEX_TIMEOUT | 180 | max seconds between Codex output events |
| CODEX_MAX_CONCURRENCY | 4 | max Codex turns running at once (resource cap) |
| CODEX_QUEUE_TIMEOUT | 90 | seconds a request waits in queue before 429 |
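As an illustration of how these variables and their defaults might be read at startup, here is a small sketch. The Settings class and load_settings function are hypothetical names; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    sandbox: str
    model: Optional[str]
    timeout: float          # max seconds between Codex output events
    max_concurrency: int    # max Codex turns running at once
    queue_timeout: float    # seconds a request may wait for a slot


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the optional CODEX_* variables, falling back to the documented defaults."""
    return Settings(
        sandbox=env.get("CODEX_SANDBOX", "workspace-write"),
        model=env.get("CODEX_MODEL") or None,  # unset or empty means no pinned model
        timeout=float(env.get("CODEX_TIMEOUT", "180")),
        max_concurrency=int(env.get("CODEX_MAX_CONCURRENCY", "4")),
        queue_timeout=float(env.get("CODEX_QUEUE_TIMEOUT", "90")),
    )
```

Passing a plain dict instead of os.environ makes the defaults easy to check without touching the real environment.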
Concurrency
- Requests for different sessions run in parallel, up to CODEX_MAX_CONCURRENCY.
- Requests for the same session are serialized: two calls never resume the same Codex thread or write the same workspace at once (prevents corruption).
- When all slots are busy and the queue wait exceeds CODEX_QUEUE_TIMEOUT, the API returns HTTP 429 so clients can back off and retry.
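One way to get this behaviour is a global semaphore combined with one lock per session. The following asyncio sketch is an assumption about how such a gate could look, not the Space's actual implementation; TurnGate and QueueTimeout are hypothetical names.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable


class QueueTimeout(Exception):
    """No slot became free in time; a server would map this to HTTP 429."""


class TurnGate:
    """Create inside a running event loop (required on Python < 3.10)."""

    def __init__(self, max_concurrency: int = 4, queue_timeout: float = 90.0):
        self._slots = asyncio.Semaphore(max_concurrency)  # global cap on turns
        self._queue_timeout = queue_timeout
        self._session_locks = defaultdict(asyncio.Lock)   # one lock per session id

    async def run(self, session_id: str, turn: Callable[[], Awaitable[Any]]) -> Any:
        # Same-session calls queue on the session lock first, so a waiting
        # duplicate does not occupy one of the global slots.
        async with self._session_locks[session_id]:
            try:
                await asyncio.wait_for(self._slots.acquire(), self._queue_timeout)
            except asyncio.TimeoutError:
                raise QueueTimeout(session_id) from None
            try:
                return await turn()
            finally:
                self._slots.release()
```

In this sketch the timeout only covers the wait for a global slot; time spent waiting behind an earlier call on the same session is not counted.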
3. Upload your login (auth.json)
On your local machine (with a browser):
    npm install -g @openai/codex
    codex login               # completes the ChatGPT OAuth in a browser
    cat ~/.codex/auth.json    # confirm it exists
Then upload ~/.codex/auth.json into the bucket at /data/.codex/auth.json
(via the HF bucket UI or the CLI). The Space auto-refreshes the tokens from there
on, so you only do this once (until you explicitly log out).
GET /health reports "logged_in": true once it's in place.
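A quick way to confirm the upload worked is to read the logged_in flag from /health. The sketch below assumes the endpoint can be called without the bearer token and returns a JSON object; the function name is_logged_in is hypothetical.

```python
import json
from urllib.request import urlopen


def is_logged_in(base_url: str, fetch=urlopen) -> bool:
    """GET <base_url>/health and report whether the Space sees a login.

    `fetch` defaults to urllib's urlopen and can be swapped out for testing.
    """
    with fetch(f"{base_url.rstrip('/')}/health", timeout=10) as resp:
        payload = json.load(resp)
    return payload.get("logged_in") is True
```

For example, is_logged_in("https://<your-space>.hf.space") should return True once auth.json is in place.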
Usage
    curl https://<your-space>.hf.space/v1/chat/completions \
      -H "Authorization: Bearer $API_TOKEN" \
      -H "Content-Type: application/json" \
      -H "X-Session-Id: my-project-1" \
      -d '{
        "model": "codex",
        "messages": [{"role": "user", "content": "Write a Python function to reverse a linked list."}]
      }'
With the OpenAI Python SDK:
    from openai import OpenAI

    client = OpenAI(
        base_url="https://<your-space>.hf.space/v1",
        api_key="<your API_TOKEN>",
    )
    resp = client.chat.completions.create(
        model="codex",
        messages=[{"role": "user", "content": "Refactor app.py for readability."}],
        extra_headers={"X-Session-Id": "my-project-1"},  # persistent session
    )
    print(resp.choices[0].message.content)
- Sessions: pass X-Session-Id (or the OpenAI user field) to keep a persistent workspace and resume the Codex thread across calls. Omit it for a clean one-shot.
- Streaming: stream=true gives real token-by-token SSE (set stream_options={"include_usage": true} to get a final usage chunk).
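A streaming client could look roughly like the sketch below. The helper collect_stream is a hypothetical name; it relies on the OpenAI SDK's chunk shape, where each chunk carries choices[].delta.content and, with include_usage, a final chunk with empty choices and a usage object. The placeholders in main are the same ones used in the example above.

```python
from typing import Any, Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[Any]) -> Tuple[str, Optional[Any]]:
    """Concatenate streamed content deltas and pick up the usage chunk, if any."""
    parts = []
    usage = None
    for chunk in chunks:
        # With include_usage, the last chunk has no choices but carries usage.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        for choice in chunk.choices:
            text = choice.delta.content
            if text:  # deltas may be None or empty (e.g. role-only chunks)
                parts.append(text)
    return "".join(parts), usage


def main() -> None:
    # Imported here so collect_stream itself has no SDK dependency.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://<your-space>.hf.space/v1",
        api_key="<your API_TOKEN>",
    )
    stream = client.chat.completions.create(
        model="codex",
        messages=[{"role": "user", "content": "Explain what app.py does."}],
        stream=True,
        stream_options={"include_usage": True},
        extra_headers={"X-Session-Id": "my-project-1"},
    )
    text, usage = collect_stream(stream)
    print(text)
    print(usage)
```

Call main() after filling in the Space URL and token; collect_stream can be exercised on its own with any iterable of chunk-like objects.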
Endpoints
- GET /health: liveness + login status
- GET /v1/models
- POST /v1/chat/completions
Custom domain (Nginx reverse proxy)
ai.antaram.org fronts the Space via Nginx (config in
deploy/nginx/ai.antaram.org.conf):
- DNS: point an A record ai.antaram.org → your server's IP.
- Install the config, then get TLS: sudo certbot --nginx -d ai.antaram.org
- Validate and reload: sudo nginx -t && sudo systemctl reload nginx
The config sets the upstream Host/SNI to sarveshpatel-codex.hf.space (required
for HF routing) and turns buffering off so SSE streaming stays live. Clients
then use base_url=https://ai.antaram.org/v1.
Safety
A global AGENTS.md (installed into CODEX_HOME on boot) forbids file deletion,
destructive git, escaping the working directory, and printing credentials. Codex
also runs sandboxed (workspace-write) and confined to the session's workspace.