Orchestration & sub-agents
The orchestrator is what turns an inbound message into an answer. It assembles a layered system prompt and drives a function-calling loop: the model thinks, calls tools, reads the results, and repeats until it produces a final reply. On top of this loop, an agent can spawn sub-agents and apply optional reasoning strategies — all of which are admin-configurable.
The turn loop
A turn is a bounded function-calling loop. The model is given the system prompt, the conversation and the available tools; it may call one or more tools, the orchestrator executes them and feeds the results back, and the loop continues until the model returns a final message. Tools, memory and knowledge always arrive as tool results, never as injected instructions — this is what keeps the loop resistant to prompt injection.
prompt + tools ─▶ model ─▶ tool calls ─▶ results ─▶ model ─▶ ... ─▶ final reply
Sub-agents
An agent can spawn sub-agents to handle a sub-task and return a result to the caller. This lets a coordinating agent delegate focused work — research, drafting, a specialised lookup — without overloading a single context.
Recursion is bounded: the maximum nesting depth is controlled by
AIHUMMER_SUBAGENT_MAX_DEPTH. This prevents runaway spawning and keeps a turn’s
cost and latency predictable.
[!WARNING] Sub-agents multiply work. Keep
AIHUMMER_SUBAGENT_MAX_DEPTHconservative on shared hosts — each extra level of depth can multiply tool calls and model usage.
Reasoning strategies
Beyond the plain loop, the orchestrator offers optional reasoning strategies. They are off unless enabled, and every one is configurable from the admin UI under the “Agent · Reasoning” settings group.
| Strategy | What it does |
|---|---|
| Plan-steps | The agent drafts a plan before acting, then works the steps. |
| Reflect | The agent critiques and revises its own draft before answering. |
| Debate | N agents argue across a number of rounds, then a judge picks the winner. |
| Best-of-N | Generate several candidate answers and select the best. |
| Self-heal | On a failure, the agent diagnoses and retries instead of giving up. |
[!TIP] Reasoning strategies trade latency and token cost for quality. Turn them on selectively for the agents and tasks that need them rather than globally.
Agent graphs
For multi-agent workflows that are more structured than ad-hoc sub-agent calls,
AiHummer supports agent graphs — explicit wiring of agents into a workflow.
Graphs are managed from the admin UI and the admin API under
/v1/admin/graphs/*.
Configuration summary
| Knob | Where |
|---|---|
| Sub-agent max depth | AIHUMMER_SUBAGENT_MAX_DEPTH (env; hot-reloadable) |
| Reasoning strategies | Settings group “Agent · Reasoning” |
| Agent graphs | /v1/admin/graphs/* |
[!NOTE] Sub-agent depth and tool enablement are among the settings that apply hot (no restart). Other settings may require a restart — see the configuration model in the getting-started guide.
Where to next
- See what the loop can call in Tools.
- Learn how a message picks its starting agent in Routing.
- Configure the agents themselves in Agents & personas.