Chat Completions API
AiHummer exposes a single OpenAI-compatible HTTP endpoint for text turns:
POST /v1/chat/completions. Any client or SDK that already speaks the OpenAI
Chat Completions format can talk to AiHummer by changing the base URL and the
API key — no AiHummer-specific code is required.
Authentication
Requests are authenticated with a personal API key as a Bearer token. AiHummer
keys are prefixed with ah- and are issued from the web admin UI.
POST /v1/chat/completions HTTP/1.1
Host: your-aihummer.example
Authorization: Bearer ah-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Content-Type: application/json
[!TIP] The base URL is your gateway address. In a default install the gateway listens on
:8765, so a local call goes to http://localhost:8765/v1/chat/completions. In production you terminate TLS at a reverse proxy in front of the gateway.
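As a minimal sketch of the authentication scheme described above, the following Python builds (but does not send) a request that carries the Bearer header. The helper name `build_chat_request` is hypothetical, and the key shown is the same placeholder used in this page, not a real credential. The base URL assumes the default local install on port 8765.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for /v1/chat/completions with Bearer authentication."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder key and default local address (assumptions); replace with your own values.
req = build_chat_request(
    "http://localhost:8765",
    "ah-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    {"model": "default", "messages": [{"role": "user", "content": "Hello"}]},
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Any HTTP client works the same way as long as it sets the two headers shown.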
A basic request
Send a JSON body with messages, exactly as you would to OpenAI. The model
field selects the model (or agent) configured on your instance.
curl https://your-aihummer.example/v1/chat/completions \
  -H "Authorization: Bearer ah-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Summarise our refund policy in two sentences." }
    ],
    "temperature": 0.3
  }'
A non-streaming response follows the familiar Chat Completions shape:
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1750000000,
  "model": "default",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Refunds are issued within 14 days..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0 }
}
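To illustrate how a client might read this response shape, here is a small Python sketch that pulls the assistant text out of the first choice. The function name `extract_reply` is hypothetical, and the sample body is a trimmed copy of the example response above rather than real server output.

```python
import json


def extract_reply(response_body: str) -> str:
    """Return the assistant text from a non-streaming Chat Completions response."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Trimmed copy of the example response shown in this section.
sample_body = """
{
  "object": "chat.completion",
  "model": "default",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Refunds are issued within 14 days..."},
      "finish_reason": "stop"
    }
  ]
}
"""

print(extract_reply(sample_body))
```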
Streaming responses (SSE)
Set "stream": true to receive the answer incrementally as
Server-Sent Events. Each event carries a chat.completion.chunk delta, and
the stream terminates with a final data: [DONE] line.
curl -N https://your-aihummer.example/v1/chat/completions \
  -H "Authorization: Bearer ah-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a one-line greeting." }
    ]
  }'
The response is a text/event-stream:
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"}}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}
data: [DONE]
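The stream format above can be consumed with a few lines of parsing. The following Python is a sketch, not a full SSE implementation: it assumes each event arrives as a single `data:` line, as in the example, and the function name `iter_content_deltas` is hypothetical. It runs against the sample lines from this section rather than a live connection.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from chat.completion.chunk lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


# The example stream from this section, one event per line.
sample_stream = [
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"}}]}',
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

print("".join(iter_content_deltas(sample_stream)))  # prints: Hello!
```

Against a live gateway, the same generator could be fed the decoded lines of the HTTP response body as they arrive.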
[!WARNING]
/v1/chat/completions is the only endpoint on the OpenAI-compatible surface. AiHummer does not expose /v1/models and does not expose /v1/embeddings; embeddings are an internal subsystem and are not reachable over HTTP. Do not rely on those routes; they return 404.
Discovery & schema surfaces
While there is no /v1/models listing, AiHummer ships several discovery surfaces
so humans and tools can explore the API:
| Path | What it serves |
|---|---|
| GET /docs | Human-readable documentation entry point |
| GET /docs/api | Interactive API Explorer |
| GET /docs/openapi.json | OpenAPI 3.x specification |
| GET /openapi.json | OpenAPI 3.x specification (root alias) |
| GET /docs/llm.json | Machine-readable API summary for LLM tooling |
| GET /llms.txt | llms.txt index for LLM agents |
# Fetch the OpenAPI spec
curl https://your-aihummer.example/openapi.json
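Once fetched, the specification can be explored programmatically. The sketch below lists the operations in an OpenAPI 3.x document by walking its standard `paths` object. The helper name `list_operations` is hypothetical, and `sample_spec` is a hand-written stand-in built from routes mentioned on this page; the contents of the real spec are not shown here and may differ.

```python
def list_operations(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings for each operation in an OpenAPI 3.x document."""
    http_methods = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-operation keys such as "parameters".
            if key.lower() in http_methods:
                operations.append(f"{key.upper()} {path}")
    return sorted(operations)


# Hand-written stand-in (assumption), not the actual AiHummer specification.
sample_spec = {
    "openapi": "3.0.3",
    "paths": {
        "/v1/chat/completions": {"post": {}},
        "/v1/ping": {"get": {}},
        "/v1/time": {"get": {}, "parameters": []},
    },
}

for operation in list_operations(sample_spec):
    print(operation)
```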
System endpoints
Two lightweight, unauthenticated system endpoints help with liveness and clock checks:
| Method & path | Purpose |
|---|---|
| GET /v1/ping | Returns a simple liveness response |
| GET /v1/time | Returns the gateway’s current server time |
curl https://your-aihummer.example/v1/ping
curl https://your-aihummer.example/v1/time
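A health check built on the ping endpoint might look like the sketch below. Because the response body is not specified here, the code assumes only that a healthy gateway answers with HTTP 200, and it treats any connection error as "not alive". The function name `is_alive` and its `opener` parameter (which allows the HTTP call to be swapped out in tests) are hypothetical.

```python
import urllib.error
import urllib.request


def is_alive(base_url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> bool:
    """Return True if GET /v1/ping answers with HTTP 200, False on any error."""
    url = base_url.rstrip("/") + "/v1/ping"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP error statuses.
        return False
```

For example, `is_alive("http://localhost:8765")` would probe a default local install. No Authorization header is sent, since these endpoints are unauthenticated.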
Where to next
- Drive AiHummer from your own apps and automations: Inbound & integration triggers.
- Pairing, SSO federation and protocol surfaces: Webhooks, SCIM & pairing.
- Tune what the agent can do: Tools catalog.