Architecture
Overview
AIRelays is a thin compatibility layer between OpenAI-shaped client requests and the ChatGPT Codex subscription backend. The route envelopes are intentionally OpenAI-shaped, but some parameter surfaces are narrower because the subscription backend accepts a slightly different contract.
Request flow:
- FastAPI receives an OpenAI-compatible request.
- Endpoint middleware enforces bearer auth and local abuse controls for protected routes.
- The compatibility layer validates and translates the request into the subscription backend format.
- Upstream auth is loaded from AIRelays-owned storage under the AIRelays data directory or AIRelays keyring namespace.
- Inference requests are sent to `chatgpt.com/backend-api/codex`, while subscription-status requests are sent to `chatgpt.com/backend-api/wham/usage`.
- Upstream SSE events are either streamed through directly or aggregated into a final JSON response.
- Every ingress and egress step is logged to hourly JSONL files.
Components
airelays.config
- resolves config from CLI flags, env, config file, and defaults
- owns local paths and relay-security defaults
- resolves the relay bearer token from explicit override or token file
- supports explicit token generation through `airelays init` and optional startup auto-generation when configured
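The layered resolution described above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the listed order (CLI flags, env, config file, defaults) is also the precedence order, and the `AIRELAYS_` prefix, the `resolve_setting` helper, and the default values are all hypothetical.

```python
from typing import Any, Dict, Mapping

# Hypothetical defaults; the real AIRelays setting names and values may differ.
DEFAULTS: Dict[str, Any] = {"host": "127.0.0.1", "port": 8080}


def resolve_setting(
    name: str,
    cli: Mapping[str, Any],
    env: Mapping[str, str],
    file_cfg: Mapping[str, Any],
    env_prefix: str = "AIRELAYS_",  # assumed prefix, for illustration only
) -> Any:
    """Return the first value found: CLI flag, env var, config file, default."""
    if cli.get(name) is not None:
        return cli[name]
    env_key = env_prefix + name.upper()
    if env_key in env:
        # Environment values arrive as strings; type coercion is left out here.
        return env[env_key]
    if name in file_cfg:
        return file_cfg[name]
    return DEFAULTS[name]
```

Passing the environment and file contents in as plain mappings keeps the lookup easy to test without touching `os.environ` or the filesystem.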
airelays.security
- enforces route protection on `/v1/*` and `/no-tools/v1/*`
- validates the relay bearer token
- applies per-IP rate limits and temporary blocks after repeated bad tokens
- emits security events to the normal traffic log
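One way to picture the token check combined with temporary blocks is the sketch below. It is an assumption-laden illustration: the `BadTokenGuard` class, its thresholds, and the sliding-window policy are hypothetical, and the real module may count or expire failures differently. State is kept in memory in a single process.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class BadTokenGuard:
    """Tracks failed bearer-token attempts per IP and blocks repeat offenders."""

    def __init__(
        self,
        max_failures: int = 5,      # hypothetical threshold
        window_s: float = 60.0,     # hypothetical sliding window
        block_s: float = 300.0,     # hypothetical block duration
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_failures = max_failures
        self._window_s = window_s
        self._block_s = block_s
        self._clock = clock
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)
        self._blocked_until: Dict[str, float] = {}

    def is_blocked(self, ip: str) -> bool:
        until = self._blocked_until.get(ip)
        if until is None:
            return False
        if self._clock() >= until:
            del self._blocked_until[ip]  # block expired
            return False
        return True

    def check(self, ip: str, presented: str, expected: str) -> bool:
        """Return True only if the IP is not blocked and the token matches."""
        if self.is_blocked(ip):
            return False
        # Constant-time comparison avoids leaking match length through timing.
        if hmac.compare_digest(presented.encode(), expected.encode()):
            return True
        now = self._clock()
        recent = self._failures[ip]
        recent.append(now)
        while recent and now - recent[0] > self._window_s:
            recent.popleft()  # drop failures that fell out of the window
        if len(recent) >= self._max_failures:
            self._blocked_until[ip] = now + self._block_s
            recent.clear()
        return False
```

Injecting the clock makes the blocking behaviour testable without sleeping.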
airelays.auth
- loads upstream ChatGPT subscription auth from AIRelays-owned file, keyring, or auto mode
- refreshes tokens
- supports browser and device login
- keeps login protocol compatibility without sharing Codex-owned storage
airelays.backend
- calls the verified subscription backend routes for inference, model listing, and usage introspection
- normalizes streamed event handling
- reconstructs non-stream responses from SSE output items
- logs usage summaries from `response.completed`
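Rebuilding a non-stream response from SSE output might look roughly like the sketch below. Only the `response.completed` event name comes from this document. The `response.output_item.done` event name and the payload shapes (`item`, `response.usage`) are assumptions borrowed from the OpenAI Responses streaming format, and `iter_sse_events` and `aggregate` are hypothetical helper names.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, json_payload) pairs from raw SSE lines."""
    event, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # SSE comment or keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        yield event, json.loads("\n".join(data_parts))


def aggregate(lines: Iterable[str]) -> Dict[str, Any]:
    """Collect finished output items and the final usage summary."""
    items: List[Dict[str, Any]] = []
    usage: Optional[Dict[str, Any]] = None
    for event, payload in iter_sse_events(lines):
        if event == "response.output_item.done":  # assumed event name
            items.append(payload["item"])
        elif event == "response.completed":
            usage = payload.get("response", {}).get("usage")
    return {"output": items, "usage": usage}
```

In streaming mode the same parsed events would be forwarded to the client instead of being collected.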
airelays.transforms
- maps OpenAI-compatible requests into the upstream request shape
- maps response payloads back into `chat.completions`
- expands local uploaded images and text files
- rejects unverified fields explicitly
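Explicit rejection of unverified fields can be illustrated with an allowlist check like the one below. The field set, the `UnsupportedFieldError` type, and the `validate_fields` helper are hypothetical; the actual set of verified fields is defined by the project, not by this sketch.

```python
from typing import Any, Dict, Mapping

# Hypothetical allowlist; the real verified field set is not stated here.
ALLOWED_FIELDS = frozenset({"model", "messages", "stream"})


class UnsupportedFieldError(ValueError):
    """Raised when a request carries a field that has not been verified upstream."""


def validate_fields(body: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the body, or fail loudly if it has unknown fields."""
    unknown = sorted(set(body) - ALLOWED_FIELDS)
    if unknown:
        raise UnsupportedFieldError(
            "unsupported request fields: " + ", ".join(unknown)
        )
    return dict(body)
```

Raising instead of silently dropping the field means a caller learns immediately that a parameter had no effect.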
airelays.store
- stores uploaded files locally with explicit per-file and total-byte ceilings enforced at ingress
- stores local conversation metadata and latest upstream response ids
- provides the opt-in stateful session layer
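The per-file and total-byte ceilings could be enforced at ingress along these lines. This is a sketch under stated assumptions: the `UploadLimiter` class, its method names, and the limits used in the example are hypothetical, and the real store may track usage on disk instead of in memory.

```python
class UploadTooLarge(Exception):
    """Raised when an upload would exceed a configured ceiling."""


class UploadLimiter:
    """Admits uploads only while per-file and total-byte ceilings hold."""

    def __init__(self, max_file_bytes: int, max_total_bytes: int) -> None:
        self.max_file_bytes = max_file_bytes
        self.max_total_bytes = max_total_bytes
        self.total_bytes = 0

    def admit(self, size: int) -> None:
        """Reserve space for an upload, or raise before anything is stored."""
        if size > self.max_file_bytes:
            raise UploadTooLarge(f"file is {size} bytes, limit is {self.max_file_bytes}")
        if self.total_bytes + size > self.max_total_bytes:
            raise UploadTooLarge("total stored bytes would exceed the ceiling")
        self.total_bytes += size

    def release(self, size: int) -> None:
        """Return space to the budget after a file is deleted."""
        self.total_bytes = max(0, self.total_bytes - size)
```

Checking both limits before writing keeps a rejected upload from leaving partial data behind.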
airelays.traffic
- writes redacted JSONL logs
- stores text bodies directly
- stores binary payload summaries explicitly with SHA-256 digests
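A rough sketch of such a log writer is shown below. Several details are assumptions for illustration: the file naming pattern, the record keys, the set of redacted headers, and the rule that a body counts as text when it decodes as UTF-8. None of these are confirmed by this document.

```python
import hashlib
import json
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, Mapping

# Hypothetical redaction list; the real logger may redact more or differently.
REDACTED_HEADERS = {"authorization", "cookie"}


def log_path(log_dir: Path, now: datetime) -> Path:
    """One file per hour, named by date and hour (assumed naming scheme)."""
    return log_dir / now.strftime("traffic-%Y%m%dT%H.jsonl")


def summarize_body(body: bytes) -> Dict[str, Any]:
    """Keep text bodies as-is; replace binary bodies with size and SHA-256."""
    try:
        return {"kind": "text", "text": body.decode("utf-8")}
    except UnicodeDecodeError:
        return {
            "kind": "binary",
            "bytes": len(body),
            "sha256": hashlib.sha256(body).hexdigest(),
        }


def build_record(
    direction: str, headers: Mapping[str, str], body: bytes, now: datetime
) -> Dict[str, Any]:
    safe = {
        k: ("[REDACTED]" if k.lower() in REDACTED_HEADERS else v)
        for k, v in headers.items()
    }
    return {
        "ts": now.isoformat(),
        "direction": direction,  # e.g. "ingress" or "egress"
        "headers": safe,
        "body": summarize_body(body),
    }


def append_record(log_dir: Path, record: Mapping[str, Any], now: datetime) -> Path:
    """Append one JSON object per line to the current hourly file."""
    path = log_path(log_dir, now)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return path
```

Storing a digest instead of the raw bytes keeps the log readable while still letting two binary payloads be compared for equality.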
Session Model
Stateless requests omit `conversation`.
Stateful requests create a local conversation and pass that local id back on later `responses` or `chat.completions` requests. The server reuses that id as the upstream `session_id` header and tracks the latest response id locally.
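The stateful path can be pictured with the small in-memory sketch below. The `ConversationStore` class and its methods are hypothetical, the id format is an arbitrary choice, and sending no `session_id` header for stateless requests is an assumption; only the reuse of the local id as `session_id` and the tracking of the latest response id come from the description above.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Conversation:
    id: str
    latest_response_id: Optional[str] = None


class ConversationStore:
    """In-memory stand-in for the local conversation metadata store."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Conversation] = {}

    def create(self) -> Conversation:
        conv = Conversation(id="conv_" + uuid.uuid4().hex)  # assumed id format
        self._by_id[conv.id] = conv
        return conv

    def upstream_headers(self, conversation_id: Optional[str]) -> Dict[str, str]:
        """Stateless requests add nothing; stateful ones reuse the local id."""
        if conversation_id is None:
            return {}
        conv = self._by_id[conversation_id]  # KeyError for unknown ids
        return {"session_id": conv.id}

    def record_response(self, conversation_id: str, response_id: str) -> None:
        self._by_id[conversation_id].latest_response_id = response_id

    def get(self, conversation_id: str) -> Conversation:
        return self._by_id[conversation_id]
```

A caller would create a conversation once, send its id with each later request, and the relay would update the latest response id after every completed upstream call.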
Security Model
- upstream provider login is separate from relay-client authorization
- the relay bearer token is local-only and is used by callers of AIRelays
- route protection is middleware-level so local-only routes such as files and conversations are covered too
- current rate limiting is in-memory and single-process by design
- public HTTP is limited to the landing page and a minimal `/healthz`; detailed relay diagnostics live behind relay auth at `/v1/relay/status`
Intentional Boundaries
- no silent truncation
- no fake token budgets
- no silent fallback for unsupported endpoints
- no claim of parity beyond routes verified against the subscription backend
- no reuse of upstream subscription auth as relay-client auth