title: Codex As API
emoji: 🤖
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: false

Codex-as-API

An OpenAI-compatible HTTP API backed by the OpenAI Codex CLI, authenticated with your ChatGPT login (no API key). Runs on a Hugging Face Docker Space; auth and sessions persist in the mounted /data bucket so they survive restarts and rebuilds.

⚠️ Personal use only. auth.json contains your ChatGPT access tokens; treat it like a password. The API is protected by a bearer token; keep your Space's API_TOKEN secret.

How it works

client (OpenAI SDK)
   │  Authorization: Bearer $API_TOKEN   (stream=true -> live SSE tokens)
   ▼
FastAPI  /v1/chat/completions
   │  JSON-RPC over stdio (one short-lived process per turn):
   ▼
codex app-server
   initialize -> thread/start | thread/resume -> turn/start
   <- item/agentMessage/delta {delta}   ← streamed token-by-token
   <- item/completed / turn/completed / thread/tokenUsage/updated
   (cwd = /data/sessions/<id>/workspace, sandbox = workspace-write, approvals never)
   │
   ▼
/data  (bucket)
  ├─ .codex/auth.json      ← your ChatGPT login (you upload this once)
  ├─ .codex/AGENTS.md      ← global safety rules (no delete, etc.)
  └─ sessions/<id>/        ← per-session workspace + Codex thread id

Streaming is real, not simulated. The App Server emits item/agentMessage/delta events as the model generates, which the API forwards as OpenAI SSE chunks. (codex exec cannot do this; it only returns the whole message at once.)
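
The forwarding step can be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: the function names `sse_chunk` and `forward` are hypothetical, and the notification shape (text fragment at `params["delta"]`) is an assumption based on the `{delta}` shown in the diagram above.

```python
import json
import time
from typing import Optional


def sse_chunk(completion_id: str, model: str, delta: dict,
              finish_reason: Optional[str] = None) -> str:
    """Format one OpenAI-style chat.completion.chunk as an SSE event."""
    payload = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
    }
    return f"data: {json.dumps(payload)}\n\n"


def forward(notification: dict, completion_id: str, model: str) -> Optional[str]:
    """Map one app-server notification to an SSE event, or None to skip it."""
    method = notification.get("method")
    if method == "item/agentMessage/delta":
        # Assumed shape: the text fragment lives at params["delta"].
        text = notification.get("params", {}).get("delta", "")
        return sse_chunk(completion_id, model, {"content": text})
    if method == "turn/completed":
        # Close the stream the way OpenAI does: a final chunk, then [DONE].
        return sse_chunk(completion_id, model, {}, "stop") + "data: [DONE]\n\n"
    return None  # other notifications are not forwarded in this sketch
```

Because each delta becomes its own SSE event as soon as it arrives, the client sees text while the model is still generating.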

One-time setup

1. Mount the bucket at /data

Already done in your Space settings (sarveshpatel/cli-storage → /data, Read & Write).

2. Set the Space secret

In Settings → Variables and secrets, add a secret:

| Name | Value |
| --- | --- |
| API_TOKEN | a long random string (your API key for this service) |
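
One way to produce a suitably long random value is Python's standard `secrets` module. The helper name `new_api_token` is only illustrative; any cryptographically secure generator works just as well.

```python
import secrets


def new_api_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from `num_bytes` of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into the Space secret named API_TOKEN.
    print(new_api_token())
```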

Optional variables:

| Name | Default | Meaning |
| --- | --- | --- |
| CODEX_SANDBOX | workspace-write | read-only for chat-only, workspace-write to let Codex edit files |
| CODEX_MODEL | (unset) | pin a Codex model, e.g. gpt-5-codex |
| CODEX_TIMEOUT | 180 | max seconds between Codex output events |
| CODEX_MAX_CONCURRENCY | 4 | max Codex turns running at once (resource cap) |
| CODEX_QUEUE_TIMEOUT | 90 | seconds a request waits in queue before 429 |
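
As a rough sketch of how a service might read these variables with the defaults listed above (the `Settings` class and `load_settings` function are hypothetical names, not taken from the Space's code):

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    sandbox: str
    model: Optional[str]
    timeout: float
    max_concurrency: int
    queue_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the optional variables, falling back to the documented defaults."""
    return Settings(
        sandbox=env.get("CODEX_SANDBOX", "workspace-write"),
        model=env.get("CODEX_MODEL") or None,  # unset or empty means no pin
        timeout=float(env.get("CODEX_TIMEOUT", "180")),
        max_concurrency=int(env.get("CODEX_MAX_CONCURRENCY", "4")),
        queue_timeout=float(env.get("CODEX_QUEUE_TIMEOUT", "90")),
    )
```

Passing the environment in as a mapping keeps the loader easy to test: `load_settings({})` yields the defaults without touching the real process environment.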

Concurrency

  • Requests for different sessions run in parallel, up to CODEX_MAX_CONCURRENCY.
  • Requests for the same session are serialized: two calls never resume the same Codex thread or write the same workspace at once (prevents corruption).
  • When all slots are busy and the queue wait exceeds CODEX_QUEUE_TIMEOUT, the API returns HTTP 429 so clients can back off and retry.
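
The three rules above can be approximated with a global semaphore plus one lock per session. This is a sketch under assumptions, not the Space's implementation: `TurnScheduler` and `QueueFull` are hypothetical names, and the queue timeout here covers only the wait for a free slot, not the wait behind another call in the same session.

```python
import asyncio
from collections import defaultdict


class QueueFull(Exception):
    """Raised when no slot frees up in time; an API layer would map it to HTTP 429."""


class TurnScheduler:
    def __init__(self, max_concurrency: int = 4, queue_timeout: float = 90.0):
        self._slots = asyncio.Semaphore(max_concurrency)  # global cap
        self._session_locks = defaultdict(asyncio.Lock)   # one lock per session id
        self._queue_timeout = queue_timeout

    async def run(self, session_id: str, turn):
        """Run `turn()` (a coroutine function) under both limits."""
        # Same-session calls queue here, so a thread is never resumed twice at once.
        async with self._session_locks[session_id]:
            try:
                await asyncio.wait_for(self._slots.acquire(), self._queue_timeout)
            except asyncio.TimeoutError:
                raise QueueFull(session_id) from None
            try:
                return await turn()
            finally:
                self._slots.release()
```

Create the scheduler inside the running event loop (for example in an application startup hook) so the asyncio primitives are bound to the right loop on older Python versions.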

3. Upload your login (auth.json)

On your local machine (with a browser):

npm install -g @openai/codex
codex login                 # completes the ChatGPT OAuth in a browser
cat ~/.codex/auth.json      # confirm it exists

Then upload ~/.codex/auth.json into the bucket at /data/.codex/auth.json (via the HF bucket UI or the CLI). From then on the Space refreshes the tokens automatically, so you only need to do this once unless you explicitly log out.

GET /health reports "logged_in": true once it's in place.
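
A small check like the one below can confirm the upload worked. It is a sketch that assumes /health is reachable without the bearer token and returns a JSON object containing `logged_in`; the helper names are hypothetical, and if your deployment protects /health you would need to add an Authorization header.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True when the /health JSON body reports a usable login."""
    return json.loads(body).get("logged_in") is True


def is_logged_in(base_url: str, timeout: float = 10.0) -> bool:
    """Fetch GET {base_url}/health and check the logged_in flag."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

Call it as `is_logged_in("https://<your-space>.hf.space")` with your own Space URL substituted in.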

Usage

curl https://<your-space>.hf.space/v1/chat/completions \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-Session-Id: my-project-1" \
  -d '{
        "model": "codex",
        "messages": [{"role": "user", "content": "Write a Python function to reverse a linked list."}]
      }'

With the OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    base_url="https://<your-space>.hf.space/v1",
    api_key="<your API_TOKEN>",
)
resp = client.chat.completions.create(
    model="codex",
    messages=[{"role": "user", "content": "Refactor app.py for readability."}],
    extra_headers={"X-Session-Id": "my-project-1"},  # persistent session
)
print(resp.choices[0].message.content)
  • Sessions: pass X-Session-Id (or the OpenAI user field) to keep a persistent workspace and resume the Codex thread across calls. Omit it for a clean one-shot.
  • Streaming: stream=true gives real token-by-token SSE. Set stream_options={"include_usage": True} in the Python SDK (or "stream_options": {"include_usage": true} in raw JSON) to get a final usage chunk.
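
For clients that read the SSE stream without the SDK, the chunks can be reassembled along these lines. The sketch assumes the service emits standard OpenAI chunk objects (content under `choices[].delta.content`, usage on a final chunk with an empty `choices` list, then `[DONE]`); `collect_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate content deltas and the optional usage from SSE lines."""
    parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]  # final chunk when include_usage is set
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts), usage
```

Feed it the decoded lines of the HTTP response body; it returns the full message text and, if requested, the usage dictionary.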

Endpoints

  • GET /health: liveness and login status
  • GET /v1/models
  • POST /v1/chat/completions

Custom domain (Nginx reverse proxy)

ai.antaram.org fronts the Space via Nginx (config in deploy/nginx/ai.antaram.org.conf):

  1. DNS: point an A record ai.antaram.org → your server's IP.
  2. Install the config, then get TLS: sudo certbot --nginx -d ai.antaram.org.
  3. sudo nginx -t && sudo systemctl reload nginx.

The config sets the upstream Host/SNI to sarveshpatel-codex.hf.space (required for HF routing) and turns buffering off so SSE streaming stays live. Clients then use base_url=https://ai.antaram.org/v1.

Safety

A global AGENTS.md (installed into CODEX_HOME on boot) forbids file deletion, destructive git, escaping the working directory, and printing credentials. Codex also runs sandboxed (workspace-write) and confined to the session's workspace.