Sidecars
Some capabilities — speech, video, web search, a headless browser — are not part of the gateway itself. They run as sidecars: small, independent HTTP services that the gateway reaches by URL. Every sidecar is free and local: there is no mandatory paid API behind any of them.
What a sidecar is
A sidecar is a separate process exposing an HTTP API. The gateway never links it in or runs it in-process — it simply calls the URL you configure. Each sidecar
- runs under its own systemd unit (host-native, no containers),
- is wired to the gateway through an environment variable, and
- can be installed natively next to the gateway, or pointed at an existing instance you already run elsewhere.
Because the contract is just HTTP, the same configuration works whether the
sidecar is on localhost or on another host.
[!NOTE] Sidecars are optional. A capability is only active when its URL is configured; until then the gateway runs without it. This keeps a minimal install to just the gateway plus PostgreSQL.
The sidecars and their ports
| Port | Sidecar | Backed by | Powers | Wired with |
|---|---|---|---|---|
| 8001 | STT | faster-whisper | speech-to-text | AIHUMMER_STT_URL |
| 8002 | TTS | edge-tts | text-to-speech | AIHUMMER_TTS_URL |
| 8003 | diarize | pyannote | speaker diarization | AIHUMMER_DIARIZE_URL |
| 8004 | voiceclone | OpenVoice V2 | voice cloning | AIHUMMER_VOICECLONE_URL |
| 8005 | video | ffmpeg | video understanding | AIHUMMER_VIDEO_URL |
| 8888 | SearXNG | SearXNG | the web_search tool | SEARXNG_URL |
| 9222 | Chrome / CDP | Chrome (CDP) | the browser tool | CLOAKBROWSER_CDP_URL |
The voice sidecars are the out-of-the-box path: STT (:8001) feeds the agent
turn and TTS (:8002) speaks the answer, both on free/local engines. Diarization
(:8003), voice cloning (:8004) and video understanding (:8005) are
additional, opt-in sidecars. SearXNG (:8888) backs web_search, and a
Chrome/CDP endpoint (:9222) backs the browser tool.
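The table and the opt-in rule can be sketched as a small data structure. This is an illustration only, not the gateway's own code: the `Sidecar` class and the `active_capabilities` helper are hypothetical names, and treating an empty or whitespace-only value as "not configured" is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Sidecar:
    name: str
    default_port: int
    env_var: str
    powers: str


# One entry per row of the table above, in the same order.
SIDECARS = (
    Sidecar("STT", 8001, "AIHUMMER_STT_URL", "speech-to-text"),
    Sidecar("TTS", 8002, "AIHUMMER_TTS_URL", "text-to-speech"),
    Sidecar("diarize", 8003, "AIHUMMER_DIARIZE_URL", "speaker diarization"),
    Sidecar("voiceclone", 8004, "AIHUMMER_VOICECLONE_URL", "voice cloning"),
    Sidecar("video", 8005, "AIHUMMER_VIDEO_URL", "video understanding"),
    Sidecar("SearXNG", 8888, "SEARXNG_URL", "the web_search tool"),
    Sidecar("Chrome / CDP", 9222, "CLOAKBROWSER_CDP_URL", "the browser tool"),
)


def active_capabilities(env: Mapping[str, str]) -> list[str]:
    """Return what each configured sidecar powers, in table order.

    Assumption: an unset, empty, or whitespace-only variable means the
    capability is inactive.
    """
    return [s.powers for s in SIDECARS if env.get(s.env_var, "").strip()]
```

Calling `active_capabilities(os.environ)` (after `import os`) would list the capabilities wired in the current environment; with no sidecar variables set, it returns an empty list, which corresponds to the minimal install.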
Wiring sidecars
You point the gateway at each sidecar with the matching environment variable. In the host-native install, the installer sets the STT and TTS URLs for you; enable the rest as needed.
# gateway.env — voice in/out (auto-set by the installer for STT/TTS)
AIHUMMER_STT_URL=http://127.0.0.1:8001
AIHUMMER_TTS_URL=http://127.0.0.1:8002
# Optional voice/video sidecars
AIHUMMER_DIARIZE_URL=http://127.0.0.1:8003
AIHUMMER_VOICECLONE_URL=http://127.0.0.1:8004
AIHUMMER_VIDEO_URL=http://127.0.0.1:8005
# Tool sidecars
SEARXNG_URL=http://127.0.0.1:8888
CLOAKBROWSER_CDP_URL=http://127.0.0.1:9222
[!TIP] Because each sidecar is just a URL, you can centralise heavy services. Point several gateways at one shared STT/TTS or SearXNG instance instead of running a copy per host.
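Before starting the gateway, it can help to confirm that something is listening at each configured URL. The following is a minimal sketch, assuming `gateway.env` contains plain `KEY=VALUE` lines with `#` comments, as in the example above. The helper names (`parse_env`, `endpoint`, `is_reachable`, `check_wiring`) are hypothetical and not part of the gateway. A successful TCP connection only shows that a port is open, not that the expected sidecar is behind it.

```python
import socket
from urllib.parse import urlsplit


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def endpoint(url: str) -> tuple[str, int]:
    """Extract (host, port) from a URL, defaulting to 80 or 443."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_wiring(text: str, probe=is_reachable) -> dict[str, bool]:
    """Probe every *_URL entry found in the env file text."""
    result: dict[str, bool] = {}
    for key, url in parse_env(text).items():
        if key.endswith("_URL"):
            host, port = endpoint(url)
            result[key] = probe(host, port)
    return result
```

For example, `check_wiring(open("gateway.env").read())` returns a mapping from each variable name to whether its port accepted a connection. The same check works unchanged for a shared instance on another host, since it only looks at the URL.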
Running them natively or shared
Each sidecar runs under its own systemd unit on the host, so you operate it like any other service: start it, stop it, and read its logs independently of the gateway. If you already run a SearXNG instance or a Chrome/CDP endpoint, for example, set the URL to that existing instance and skip the local install entirely. Either way, the gateway's view is identical: a configured URL it can call.
[!WARNING] The browser (CDP) and web_search tools reach out to the network. On hardened or air-gapped deployments, review egress and the air-gapped setting before enabling them. The gateway only calls a sidecar that you have wired.
Where to next
- How voice and the browser tool are used in a turn: Gateway & turn engine.
- The minimal hard dependency (PostgreSQL) and opt-in everything else: Gateway & turn engine.