Introduction — AiHummer Docs

AiHummer is a self-hosted AI-agent platform for business. It ingests messages from employee and customer channels, routes each one to the right agent, runs a function-calling turn with tools and long-term memory, and delivers the answer back to the originating channel — all managed from a web admin UI and an OpenAI-compatible API.

It is a finished product, not a framework or an SDK. A single self-contained service acts as both the control-plane and the turn engine, so a typical deployment is one service plus PostgreSQL.

What you get

Multi-agent orchestration with sub-agents and optional reasoning strategies (plan-steps, reflect, debate, best-of-N, self-heal).
First-class agents with personas, a structured prompt and a per-agent model.
Long-term memory (Einstein) — facts extracted as reviewable claims, then promoted to memory; recall is wrapped in a data-fence against prompt injection.
Knowledge / RAG with citations, plus deep_research for multi-step reports.
Channels — Telegram, MAX (a Russian messenger, new), Bitrix24 (internal employee IM), SIP telephony, an embeddable web widget and a mobile client (iOS).
A plugin marketplace with one-click, host-native install.
An encrypted credential vault, enterprise SSO (SAML / LDAP / SCIM / OIDC) and Postgres Row-Level Security for multitenant isolation.

How a turn flows

A channel delivers an inbound message to the gateway.
The router resolves the target agent via bindings, @-mentions or a fallback.
The orchestrator assembles a layered, cache-friendly system prompt and drives a function-calling loop over built-in tools, sub-agents and plugins.
Memory and knowledge ground the answer; both arrive as tool results, never as injected instructions.
A guaranteed-delivery outbox returns the reply to the originating channel.

channel ─▶ router ─▶ orchestrator (tools · sub-agents · memory · RAG) ─▶ outbox ─▶ channel

Foundational principles

These are not marketing lines — every page in this guide stays inside them.

Host-native, not Docker. AiHummer deploys as a release tarball running under systemd from /home/.aihummer. There are no containers or orchestrators.
No mandatory paid models. It runs on free/local models and a Codex/ChatGPT-subscription transport. Per-tenant BYOK keys are optional.
Security as a core property. Envelope-encrypted vault, human-in-the-loop approval gates, idempotent side-effects, prompt-injection data-fencing, IP allowlist and audit.
Stability. SemVer with forward-safe, auto-applied database migrations.

[!NOTE] AiHummer exposes an OpenAI-compatible POST /v1/chat/completions endpoint (with SSE streaming). It does not expose /v1/models or /v1/embeddings, and observability is OTLP-push — there is no Prometheus /metrics endpoint.

Who it is for

Companies that want an internal AI employee answering staff over Telegram or Bitrix24, with memory, knowledge and approval gates.
Customer-facing teams putting an agent on a website, a phone line or a mobile app.
Security-, compliance- and sovereignty-conscious deployments that need self-hosting, no Docker, no mandatory external paid model, encrypted secrets, RLS isolation and SSO.
Platform and IT teams that want a finished product — admin UI, multitenancy, marketplace — instead of assembling a framework.

Where to next

New here? Continue to Requirements and Installation.
Want the mental model first? Read Gateway & turn engine.
Building against the API? Jump to chat/completions.