AiHummer is a self-hosted AI-agent platform for business. It ingests messages
from employee and customer channels, routes each one to the right agent, runs a
function-calling turn with tools and long-term memory, and delivers the answer
back to the originating channel — all managed from a web admin UI and an
OpenAI-compatible API.
It is a finished product, not a framework or an SDK. A single self-contained
service acts as both the control-plane and the turn engine, so a typical
deployment is one service plus PostgreSQL.
What you get
Multi-agent orchestration with sub-agents and optional reasoning
strategies (plan-steps, reflect, debate, best-of-N, self-heal).
First-class agents with personas, a structured prompt and a per-agent model.
Long-term memory (Einstein) — facts extracted as reviewable claims, then
promoted to memory; recall is wrapped in a data-fence against prompt injection.
Knowledge / RAG with citations, plus deep_research for multi-step reports.
Channels — Telegram, MAX (a Russian messenger, new), Bitrix24 (internal
employee IM), SIP telephony, an embeddable web widget and a mobile client (iOS).
A plugin marketplace with one-click, host-native install.
An encrypted credential vault, enterprise SSO (SAML / LDAP / SCIM / OIDC)
and Postgres Row-Level Security for multitenant isolation.
How a turn flows
A channel delivers an inbound message to the gateway.
The router resolves the target agent via bindings, @-mentions or a fallback.
The orchestrator assembles a layered, cache-friendly system prompt and
drives a function-calling loop over built-in tools, sub-agents and plugins.
Memory and knowledge ground the answer; both arrive as tool results, never
as injected instructions.
A guaranteed-delivery outbox returns the reply to the originating channel.
These are not marketing lines — every page in this guide stays inside them.
Host-native, not Docker. AiHummer deploys as a release tarball running under
systemd from /home/.aihummer. There are no containers or orchestrators.
No mandatory paid models. It runs on free/local models and a
Codex/ChatGPT-subscription transport. Per-tenant BYOK keys are optional.
Security as a core property. Envelope-encrypted vault, human-in-the-loop
approval gates, idempotent side-effects, prompt-injection data-fencing, IP
allowlist and audit.
Stability. SemVer with forward-safe, auto-applied database migrations.
[!NOTE]
AiHummer exposes an OpenAI-compatible POST /v1/chat/completions endpoint
(with SSE streaming). It does not expose /v1/models or /v1/embeddings,
and observability is OTLP-push — there is no Prometheus /metrics endpoint.
Who it is for
Companies that want an internal AI employee answering staff over
Telegram or Bitrix24, with memory, knowledge and approval gates.
Customer-facing teams putting an agent on a website, a phone line or a
mobile app.
Security-, compliance- and sovereignty-conscious deployments that need
self-hosting, no Docker, no mandatory external paid model, encrypted secrets,
RLS isolation and SSO.
Platform and IT teams that want a finished product — admin UI,
multitenancy, marketplace — instead of assembling a framework.
**AiHummer is a self-hosted AI-agent platform for business.** It ingests messages
from employee and customer channels, routes each one to the right agent, runs a
function-calling turn with tools and long-term memory, and delivers the answer
back to the originating channel — all managed from a web admin UI and an
OpenAI-compatible API.
It is a **finished product**, not a framework or an SDK. A single self-contained
service acts as both the control-plane and the turn engine, so a typical
deployment is one service plus PostgreSQL.
## What you get
- **Multi-agent orchestration** with sub-agents and optional reasoning
strategies (plan-steps, reflect, debate, best-of-N, self-heal).
- **First-class agents** with personas, a structured prompt and a per-agent model.
- **Long-term memory (Einstein)** — facts extracted as reviewable claims, then
promoted to memory; recall is wrapped in a data-fence against prompt injection.
- **Knowledge / RAG** with citations, plus `deep_research` for multi-step reports.
- **Channels** — Telegram, MAX (a Russian messenger, new), Bitrix24 (internal
employee IM), SIP telephony, an embeddable web widget and a mobile client (iOS).
- **A plugin marketplace** with one-click, host-native install.
- **An encrypted credential vault**, enterprise SSO (SAML / LDAP / SCIM / OIDC)
and Postgres Row-Level Security for multitenant isolation.
## How a turn flows
1. A **channel** delivers an inbound message to the gateway.
2. The **router** resolves the target agent via bindings, `@`-mentions or a fallback.
3. The **orchestrator** assembles a layered, cache-friendly system prompt and
drives a function-calling loop over built-in tools, sub-agents and plugins.
4. **Memory and knowledge** ground the answer; both arrive as tool results, never
as injected instructions.
5. A guaranteed-delivery **outbox** returns the reply to the originating channel.
```text
channel ─▶ router ─▶ orchestrator (tools · sub-agents · memory · RAG) ─▶ outbox ─▶ channel
```
## Foundational principles
These are not marketing lines — every page in this guide stays inside them.
- **Host-native, not Docker.** AiHummer deploys as a release tarball running under
systemd from `/home/.aihummer`. There are no containers or orchestrators.
- **No mandatory paid models.** It runs on free/local models and a
Codex/ChatGPT-subscription transport. Per-tenant BYOK keys are optional.
- **Security as a core property.** Envelope-encrypted vault, human-in-the-loop
approval gates, idempotent side-effects, prompt-injection data-fencing, IP
allowlist and audit.
- **Stability.** SemVer with forward-safe, auto-applied database migrations.
> [!NOTE]
> AiHummer exposes an OpenAI-compatible `POST /v1/chat/completions` endpoint
> (with SSE streaming). It does **not** expose `/v1/models` or `/v1/embeddings`,
> and observability is OTLP-push — there is no Prometheus `/metrics` endpoint.
## Who it is for
- **Companies that want an internal AI employee** answering staff over
Telegram or Bitrix24, with memory, knowledge and approval gates.
- **Customer-facing teams** putting an agent on a website, a phone line or a
mobile app.
- **Security-, compliance- and sovereignty-conscious deployments** that need
self-hosting, no Docker, no mandatory external paid model, encrypted secrets,
RLS isolation and SSO.
- **Platform and IT teams** that want a finished product — admin UI,
multitenancy, marketplace — instead of assembling a framework.
## Where to next
- New here? Continue to [Requirements](/en/v1.0/getting-started/requirements)
and [Installation](/en/v1.0/getting-started/installation).
- Want the mental model first? Read
[Gateway & turn engine](/en/v1.0/architecture/gateway-turn-engine).
- Building against the API? Jump to
[chat/completions](/en/v1.0/api/chat-completions).