From 155c6e8d5ccd11b233041a6ff3dd3f6bffb4fd98 Mon Sep 17 00:00:00 2001
From: Godopu
Date: Sun, 21 Jun 2026 06:32:29 +0000
Subject: [PATCH] docs: fix delete->stop in REPORT + add session/job state glossary (FW-03, FW-10, FW-16)

FW-03: replace 'delete' with 'stop' in skill reference (line 299).
'terminated' retained as valid YAML status value (hard kill mode).

FW-10/FW-16: add Glossary section distinguishing session states
(running/stopped/terminated/archived in agent-sessions.yaml) from
job states (pending/running/completed/error/cancelled in registry).
Documents which skill/function sets each state.
---
 Messaging_System_REPORT.md | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/Messaging_System_REPORT.md b/Messaging_System_REPORT.md
index 8d4b707..86c35fb 100644
--- a/Messaging_System_REPORT.md
+++ b/Messaging_System_REPORT.md
@@ -296,7 +296,7 @@ graph LR
 2. **`tmux-agent-orchestrate-monitor` (`reconcile.sh` & `watchdog.sh`)**:
    * **Watchdog Integration**: Starts a subscriber monitoring loop (`watchdog.sh`) to detect orphaned agent panes or locked workspaces.
    * **Reconciliation loop**: Subscribes to the global job topic. On terminal events, it invokes `lib.sh::atomic_dump_yaml` to sync status drifts (e.g. setting tmux sessions to `terminated` in `agent-sessions.yaml` once the agent exits).
-3. **`tmux-agent-orchestrate-create / delete / resume`**:
+3. **`tmux-agent-orchestrate-create / stop / resume`**:
    * Integrates the job life status into session metadata updates, ensuring standard tmux cleanup triggers state updates in the registry and audit logs.
 
 ---
@@ -326,3 +326,36 @@ graph LR
    De-prioritize plaintext support. Enforce connection over port `8883` with verified TLS certificates. Implement client certificates (mTLS) for agent authentication.
 4. **Build Auto-Reconnecting Subscriber Loops**: Upgrade `job_subscriber.py` to handle disconnect callbacks.
    Maintain a persistent queue in memory and allow the client to reconnect with exponential backoff, preventing socket dropout from terminating the orchestration flow.
+
+---
+
+## Glossary: Session States vs Job States
+
+This project manages **two distinct state domains** that are often confused:
+
+### Session States (YAML — `.hermes/agent-sessions.yaml`)
+Managed by `skills/lib.sh` and the 6 `tmux-agent-orchestrate-*` skills.
+Valid values (see `lib.sh` valid-status set):
+
+| State | Meaning | Set by |
+|---|---|---|
+| `running` | tmux session active, agent running | `create`, `resume` |
+| `stopped` | deliberately stopped via `--capture-id`/`--reason`/`--graceful`; conversation preserved for resume | `stop` (STOP mode) |
+| `terminated` | hard-killed via `--mode hard`; tmux session destroyed | `stop` (hard mode), `monitor` reconcile |
+| `archived` | soft-stopped via `--mode soft`; tmux left alive, YAML-only update | `stop` (soft mode) |
+
+### Job States (Registry — `.hermes/jobs/.json`)
+Managed by `skills/tmux-agent-orchestrate-delegate-job/scripts/registry.py`.
+Valid values:
+
+| State | Meaning | Set by |
+|---|---|---|
+| `pending` | job registered, agent not yet started | `registry.py register` |
+| `running` | agent picked up the job, publishing events | `publish_event.py --event started` |
+| `completed` | terminal event — agent finished successfully | `publish_event.py --event completed` |
+| `error` | terminal event — agent failed | `publish_event.py --event error` |
+| `cancelled` | job cancelled by orchestrator | `registry.py cancel` |
+
+**Key distinction**: Session states track the **tmux container lifecycle** (create→stop→resume).
+Job states track the **delegated work lifecycle** (submit→run→complete/error).
+A single session can host multiple sequential jobs; a job runs within exactly one session.
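
Reviewer note (not part of the patch): the two state domains listed in the glossary can be sketched in Python as below. This is a minimal illustration under stated assumptions, not code from the repository. The names `SessionState`, `JobState`, and `domains_of` are hypothetical; only the state values themselves are taken from the glossary tables.

```python
from enum import Enum


class SessionState(str, Enum):
    """Session states as listed in the glossary (agent-sessions.yaml)."""
    RUNNING = "running"
    STOPPED = "stopped"
    TERMINATED = "terminated"
    ARCHIVED = "archived"


class JobState(str, Enum):
    """Job states as listed in the glossary (job registry)."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    ERROR = "error"
    CANCELLED = "cancelled"


def domains_of(value: str) -> set[str]:
    """Return the state domains ("session", "job") in which `value` is valid.

    Hypothetical helper: it only checks membership in the two enums above.
    """
    domains = set()
    if value in {s.value for s in SessionState}:
        domains.add("session")
    if value in {s.value for s in JobState}:
        domains.add("job")
    return domains


if __name__ == "__main__":
    # "deleted" is included as an example of a value valid in neither domain.
    for value in ("running", "stopped", "completed", "deleted"):
        print(value, sorted(domains_of(value)))
```

In this sketch `running` is the only value shared by both domains, so a bare `running` is ambiguous unless the domain (session or job) is stated alongside it.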