# MQTT Broker Setup: PoC → Production

The tmux-agent-orchestrate-delegate-job scripts read **all** broker settings from environment variables (or a job record's `broker.*` block) through a single helper, `broker_config_from_env()` in [`./scripts/mqtt_common.py`](./scripts/mqtt_common.py).

The design goal: **switch from the public PoC broker to your own broker with config only, no code change.**

| Env var | Meaning | PoC default | Production |
|---------|---------|-------------|------------|
| `MQTT_BROKER` | host | `broker.hivemq.com` | internal hostname/IP |
| `MQTT_PORT` | port | `1883` | `8883` (TLS) |
| `MQTT_TLS` | TLS on/off (`1`/`0`) | `0` | `1` |
| `MQTT_USERNAME` / `MQTT_PASSWORD` | auth | (none) | broker-issued |
| `MQTT_CA_CERTS` | CA bundle path | (none) | private CA path |
| `MQTT_CERTFILE` / `MQTT_KEYFILE` | client cert (optional mTLS) | (none) | per-client |
| `MQTT_CLIENT_ID_PREFIX` | client id prefix | `hermes` | per-environment |

---

## 1. PoC: public broker (`broker.hivemq.com`)

**Pros**: zero setup, reachable from anywhere, perfect for wiring up the publish/subscribe loop and the timeout/state-machine logic.

**Cons / accepted assumptions** (no auth, no integrity, shared with the world):

- no secrets in payloads;
- `started`/`completed`/`error` are advisory signals only;
- non-retained messages are **not queued** for absent subscribers, so the subscriber must start before the agent;
- a re-subscribing client cannot recover past (non-retained) events.

Use it only to validate the protocol, never for real decisions.

---

## 2. Production: self-hosted Mosquitto (or EMQX)

Both support MQTT 5 + ACL + TLS. Mosquitto shown below; EMQX is a drop-in for the same env vars.

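One way to picture the config-only switch is a helper that maps the `MQTT_*` variables onto a single config object and falls back to the PoC defaults. The sketch below is not the actual `broker_config_from_env()` from `scripts/mqtt_common.py`. `BrokerConfig` and `sketch_broker_config` are hypothetical names, and the parsing rules (for example, treating only `1` as "TLS on") are assumptions drawn from the env-var table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BrokerConfig:
    """Hypothetical container for the settings listed in the env-var table."""
    host: str
    port: int
    tls: bool
    username: Optional[str]
    password: Optional[str]
    ca_certs: Optional[str]
    certfile: Optional[str]
    keyfile: Optional[str]
    client_id_prefix: str


def sketch_broker_config(env: Mapping[str, str] = os.environ) -> BrokerConfig:
    """Illustrative stand-in for broker_config_from_env(); the real helper may differ."""

    def optional(name: str) -> Optional[str]:
        # Assumption: an unset or empty variable means "not configured".
        return env.get(name) or None

    return BrokerConfig(
        host=env.get("MQTT_BROKER", "broker.hivemq.com"),  # PoC default
        port=int(env.get("MQTT_PORT", "1883")),            # PoC default
        tls=env.get("MQTT_TLS", "0") == "1",               # assumption: only "1" enables TLS
        username=optional("MQTT_USERNAME"),
        password=optional("MQTT_PASSWORD"),
        ca_certs=optional("MQTT_CA_CERTS"),
        certfile=optional("MQTT_CERTFILE"),
        keyfile=optional("MQTT_KEYFILE"),
        client_id_prefix=env.get("MQTT_CLIENT_ID_PREFIX", "hermes"),
    )
```

Passing a plain dict instead of `os.environ` (for example `sketch_broker_config({"MQTT_TLS": "1", "MQTT_PORT": "8883"})`) makes the PoC and production profiles easy to compare in a test without touching the process environment.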
### 2.1 Install

```bash
# macOS
brew install mosquitto

# Debian/Ubuntu
sudo apt-get update && sudo apt-get install -y mosquitto mosquitto-clients

# Docker
docker run -d --name mosquitto -p 8883:8883 \
  -v "$PWD/mosquitto.conf:/mosquitto/config/mosquitto.conf" \
  -v "$PWD/certs:/mosquitto/certs" \
  -v "$PWD/auth:/mosquitto/auth" \
  eclipse-mosquitto:2
```

### 2.2 `mosquitto.conf` (key lines)

```conf
persistence true
persistence_location /mosquitto/data/

password_file /mosquitto/auth/passwd
acl_file /mosquitto/auth/acl
allow_anonymous false

listener 8883
cafile /mosquitto/certs/ca.crt
certfile /mosquitto/certs/server.crt
keyfile /mosquitto/certs/server.key
```

With `persistence true`, QoS 1, and retained terminal events, a subscriber that joins after a job has finished still sees the final `completed`/`error`, even across a broker restart.

### 2.3 Users (username/password)

```bash
# create the file with the first user (-c), then append further users
mosquitto_passwd -c /mosquitto/auth/passwd hermes          # subscriber/delegator
mosquitto_passwd /mosquitto/auth/passwd claude-worker      # publisher/agent
# (omit -c after the first; -c truncates the file)
```

### 2.4 ACL: least privilege

The worker only **publishes** events; Hermes only **subscribes**:

```conf
# /mosquitto/auth/acl

# claude-worker: may publish job events, may not read others' streams
user claude-worker
topic write python/mqtt/jobs/+/events

# hermes: observes every job's events
user hermes
topic read python/mqtt/jobs/+/events

# keep the legacy demo topic usable for both, if desired
pattern readwrite python/mqtt/sample
```

### 2.5 TLS certificates

**Quick self-signed (single host, internal only):**

```bash
mkdir -p certs && cd certs
openssl req -x509 -newkey rsa:2048 -nodes -days 825 \
  -keyout server.key -out server.crt \
  -subj "/CN=mqtt.internal"
cp server.crt ca.crt   # clients trust this as the CA bundle
```

**Private CA (recommended: keeps the CA separate from the server cert):**

```bash
# 1) CA
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -days 3650 -out ca.crt -subj "/CN=Hermes-CA"

# 2) server cert signed by the CA
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr -subj "/CN=mqtt.internal"
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out server.crt -days 825
```

Clients trust `ca.crt` via `MQTT_CA_CERTS=/path/to/ca.crt`.

---

## 3. Cut-over verification (config-only, no code change)

Goal: prove the **same scripts** talk to your broker by changing only env/registry.

```bash
# 1) point the env at the new broker
export MQTT_BROKER=mqtt.internal
export MQTT_PORT=8883
export MQTT_TLS=1
export MQTT_CA_CERTS=$PWD/certs/ca.crt
export MQTT_USERNAME=hermes
export MQTT_PASSWORD=…   # subscriber side
# (publisher side uses claude-worker creds via the job record's broker block)

# 2) sanity-check with the mosquitto CLI first
mosquitto_sub -h "$MQTT_BROKER" -p 8883 --cafile "$MQTT_CA_CERTS" \
  -u hermes -P "$MQTT_PASSWORD" -t 'python/mqtt/jobs/+/events' -v &

# 3) run the unchanged tmux-agent-orchestrate-delegate-job loop
PY=.venv/bin/python
JID=$($PY scripts/registry.py register --prompt "broker cutover smoke")
$PY scripts/job_subscriber.py --job "$JID" --timeout 30 &
sleep 3
$PY scripts/publish_event.py --job "$JID" --event started
$PY scripts/publish_event.py --job "$JID" --event completed   # auto-retained
```

Expected:

- subscriber prints the `started` and `completed` lines and exits 0;
- `mosquitto_sub` shows the same events (ACL allows `hermes` to read);
- publishing as a credential **without** write ACL is rejected by the broker;
- a subscriber started *after* `completed` still receives it (retained).

If all four hold, the migration is config-only. Persist the broker block into each job record so `publish_event.py` connects from the registry alone:

```json
"broker": {
  "host": "mqtt.internal",
  "port": 8883,
  "tls": true,
  "username": "claude-worker",
  "password": "…"
}
```
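
To illustrate how a persisted `broker` block could let the publisher connect "from the registry alone", here is a minimal sketch that overlays a job record's block on the env-derived settings. The function name `resolve_broker` and the precedence rule (job record wins over environment) are assumptions for illustration; the actual resolution logic lives in `scripts/mqtt_common.py` and may behave differently.

```python
import os
from typing import Any, Dict, Mapping


def resolve_broker(job_record: Mapping[str, Any],
                   env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Hypothetical merge: env vars give the baseline, the job record overrides it."""
    # Baseline from the environment, using the PoC defaults from the table.
    resolved: Dict[str, Any] = {
        "host": env.get("MQTT_BROKER", "broker.hivemq.com"),
        "port": int(env.get("MQTT_PORT", "1883")),
        "tls": env.get("MQTT_TLS", "0") == "1",
        "username": env.get("MQTT_USERNAME") or None,
        "password": env.get("MQTT_PASSWORD") or None,
    }
    # Assumption: any key present (and non-null) in the record's broker block wins.
    block = job_record.get("broker") or {}
    for key in resolved:
        if block.get(key) is not None:
            resolved[key] = block[key]
    return resolved
```

Under this assumed precedence, a subscriber shell that exports `MQTT_USERNAME=hermes` can still run a publisher whose job record carries the `claude-worker` credentials, which matches the split of roles in the cut-over steps.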