Orchestration & sub-agents

The orchestrator is what turns an inbound message into an answer. It assembles a layered system prompt and drives a function-calling loop: the model thinks, calls tools, reads the results, and repeats until it produces a final reply. On top of this loop, an agent can spawn sub-agents and apply optional reasoning strategies — all of which are admin-configurable.

The turn loop

A turn is a bounded function-calling loop. The model is given the system prompt, the conversation and the available tools; it may call one or more tools, the orchestrator executes them and feeds the results back, and the loop continues until the model returns a final message. Tools, memory and knowledge always arrive as tool results, never as injected instructions — this is what keeps the loop resistant to prompt injection.

prompt + tools ─▶ model ─▶ tool calls ─▶ results ─▶ model ─▶ ... ─▶ final reply

Sub-agents

An agent can spawn sub-agents to handle a sub-task and return a result to the caller. This lets a coordinating agent delegate focused work — research, drafting, a specialised lookup — without overloading a single context.

Recursion is bounded: the maximum nesting depth is controlled by AIHUMMER_SUBAGENT_MAX_DEPTH. This prevents runaway spawning and keeps a turn’s cost and latency predictable.

[!WARNING] Sub-agents multiply work. Keep AIHUMMER_SUBAGENT_MAX_DEPTH conservative on shared hosts — each extra level of depth can multiply tool calls and model usage.

Reasoning strategies

Beyond the plain loop, the orchestrator offers optional reasoning strategies. They are off unless enabled, and every one is configurable from the admin UI under the “Agent · Reasoning” settings group.

Strategy	What it does
Plan-steps	The agent drafts a plan before acting, then works the steps.
Reflect	The agent critiques and revises its own draft before answering.
Debate	`N` agents argue across a number of rounds, then a judge picks the winner.
Best-of-N	Generate several candidate answers and select the best.
Self-heal	On a failure, the agent diagnoses and retries instead of giving up.

[!TIP] Reasoning strategies trade latency and token cost for quality. Turn them on selectively for the agents and tasks that need them rather than globally.

Agent graphs

For multi-agent workflows that are more structured than ad-hoc sub-agent calls, AiHummer supports agent graphs — explicit wiring of agents into a workflow. Graphs are managed from the admin UI and the admin API under /v1/admin/graphs/*.

Configuration summary

Knob	Where
Sub-agent max depth	`AIHUMMER_SUBAGENT_MAX_DEPTH` (env; hot-reloadable)
Reasoning strategies	Settings group “Agent · Reasoning”
Agent graphs	`/v1/admin/graphs/*`

[!NOTE] Sub-agent depth and tool enablement are among the settings that apply hot (no restart). Other settings may require a restart — see the configuration model in the getting-started guide.

Where to next

See what the loop can call in Tools.
Learn how a message picks its starting agent in Routing.
Configure the agents themselves in Agents & personas.