docs: fix delete->stop in REPORT + add session/job state glossary (FW-03, FW-10, FW-16)

FW-03: replace 'delete' with 'stop' in skill reference (line 299).
  'terminated' retained as valid YAML status value (hard kill mode).

FW-10/FW-16: add Glossary section distinguishing session states
  (running/stopped/terminated/archived in agent-sessions.yaml) from
  job states (pending/running/completed/error/cancelled in registry).
  Documents which skill/function sets each state.
2026-06-21 06:32:29 +00:00
parent 3677e4aace
commit 155c6e8d5c
+34 -1
@@ -296,7 +296,7 @@ graph LR
2. **`tmux-agent-orchestrate-monitor` (`reconcile.sh` & `watchdog.sh`)**:
* **Watchdog Integration**: Starts a subscriber monitoring loop (`watchdog.sh`) to detect orphaned agent panes or locked workspaces.
* **Reconciliation loop**: Subscribes to the global job topic. On terminal events, it invokes `lib.sh::atomic_dump_yaml` to sync status drifts (e.g. setting tmux sessions to `terminated` in `agent-sessions.yaml` once the agent exits).
- 3. **`tmux-agent-orchestrate-create / delete / resume`**:
+ 3. **`tmux-agent-orchestrate-create / stop / resume`**:
* Integrates the job life status into session metadata updates, ensuring standard tmux cleanup triggers state updates in the registry and audit logs.
---
@@ -326,3 +326,36 @@ graph LR
De-prioritize plaintext support. Enforce connection over port `8883` with verified TLS certificates. Implement client certificates (mTLS) for agent authentication.
4. **Build Auto-Reconnecting Subscriber Loops**:
Upgrade `job_subscriber.py` to handle disconnect callbacks. Maintain a persistent queue in memory and allow the client to reconnect with exponential backoff, preventing socket dropout from terminating the orchestration flow.
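The reconnect strategy in item 4 can be sketched roughly as follows. This is only an illustration and not the contents of `job_subscriber.py`: the names `backoff_delays` and `reconnect_with_backoff`, the delay parameters, and the `connect` callable are all hypothetical, and the real subscriber's client library is not shown.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield an endless sequence of delays: base, base*factor, ... capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect_with_backoff(
    connect: Callable[[], None],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `connect` until it succeeds and return the number of attempts used.

    `connect` is a hypothetical callable that raises OSError on failure.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except OSError as exc:
            if attempt == max_attempts:
                raise ConnectionError(f"gave up after {attempt} attempts") from exc
            # Wait progressively longer before the next attempt.
            sleep(next(delays))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; a production loop would likely also add jitter so that many agents do not reconnect at the same instant.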
---
## Glossary: Session States vs Job States
This project manages **two distinct state domains** that are often confused:
### Session States (YAML — `.hermes/agent-sessions.yaml`)
Managed by `skills/lib.sh` and the six `tmux-agent-orchestrate-*` skills.
Valid values (see `lib.sh` valid-status set):
| State | Meaning | Set by |
|---|---|---|
| `running` | tmux session active, agent running | `create`, `resume` |
| `stopped` | deliberately stopped via `--capture-id`/`--reason`/`--graceful`; conversation preserved for resume | `stop` (STOP mode) |
| `terminated` | hard-killed via `--mode hard`; tmux session destroyed | `stop` (hard mode), `monitor` reconcile |
| `archived` | soft-stopped via `--mode soft`; tmux left alive, YAML-only update | `stop` (soft mode) |
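
As a rough illustration of the table above, the mapping from stop mode to recorded session state could be written like this. The sketch is in Python for readability, whereas the real validation lives in `lib.sh`; the mode keys (`"stop"`, `"hard"`, `"soft"`) and both function names are assumptions made for the example.

```python
# Session states listed in the glossary table.
VALID_SESSION_STATES = frozenset({"running", "stopped", "terminated", "archived"})

# Hypothetical mode keys: the table only names STOP mode, `--mode hard`, and `--mode soft`.
STOP_MODE_TO_STATE = {
    "stop": "stopped",     # conversation preserved for resume
    "hard": "terminated",  # tmux session destroyed
    "soft": "archived",    # tmux left alive, YAML-only update
}


def state_after_stop(mode: str = "stop") -> str:
    """Return the session state that a stop in the given mode would record."""
    if mode not in STOP_MODE_TO_STATE:
        raise ValueError(f"unknown stop mode: {mode!r}")
    return STOP_MODE_TO_STATE[mode]


def check_session_state(value: str) -> str:
    """Reject any status outside the four documented session states."""
    if value not in VALID_SESSION_STATES:
        raise ValueError(f"invalid session state: {value!r}")
    return value
```

Note that a job state such as `completed` fails `check_session_state`, which is exactly the confusion the glossary is meant to prevent.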
### Job States (Registry — `.hermes/jobs/<id>.json`)
Managed by `skills/tmux-agent-orchestrate-delegate-job/scripts/registry.py`.
Valid values:
| State | Meaning | Set by |
|---|---|---|
| `pending` | job registered, agent not yet started | `registry.py register` |
| `running` | agent picked up the job, publishing events | `publish_event.py --event started` |
| `completed` | terminal event — agent finished successfully | `publish_event.py --event completed` |
| `error` | terminal event — agent failed | `publish_event.py --event error` |
| `cancelled` | job cancelled by orchestrator | `registry.py cancel` |
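
The job-state table can be read as a small state machine driven by published events. The following is a minimal sketch under stated assumptions, not the logic of `registry.py`: `JobState`, `EVENT_TO_STATE`, `FINAL_STATES`, and `apply_event` are hypothetical names, and treating `cancelled` as final alongside `completed` and `error` is an assumption the table does not state.

```python
from enum import Enum


class JobState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    ERROR = "error"
    CANCELLED = "cancelled"


# Event names taken from the `publish_event.py --event ...` column of the table.
EVENT_TO_STATE = {
    "started": JobState.RUNNING,
    "completed": JobState.COMPLETED,
    "error": JobState.ERROR,
}

# Assumption: no further event may change a job once it reaches one of these.
FINAL_STATES = frozenset({JobState.COMPLETED, JobState.ERROR, JobState.CANCELLED})


def apply_event(current: JobState, event: str) -> JobState:
    """Return the job state that results from applying `event` to `current`."""
    if current in FINAL_STATES:
        raise ValueError(f"job already finished as {current.value!r}")
    if event not in EVENT_TO_STATE:
        raise ValueError(f"unknown event: {event!r}")
    return EVENT_TO_STATE[event]
```

For example, `apply_event(JobState.PENDING, "started")` yields `JobState.RUNNING`, while applying any event to a `completed` job raises `ValueError`.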
**Key distinction**: Session states track the **tmux container lifecycle** (create→stop→resume).
Job states track the **delegated work lifecycle** (submit→run→complete/error).
A single session can host multiple sequential jobs; a job runs within exactly one session.
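
The one-session-to-many-jobs relationship described above might be modelled as below. This is a hypothetical in-memory sketch: `Session`, `attach_job`, and `job_owners` are illustrative names, and reading "sequential" as "a new job may attach only after the previous one has finished" is an assumption.

```python
from dataclasses import dataclass, field

# Assumption: these job states mean the job no longer occupies the session.
FINISHED_JOB_STATES = frozenset({"completed", "error", "cancelled"})


@dataclass
class Session:
    name: str
    # job id -> job state, kept in attachment order
    job_states: dict[str, str] = field(default_factory=dict)

    def attach_job(self, job_id: str) -> None:
        """Register a new pending job, refusing to overlap with an unfinished one."""
        if job_id in self.job_states:
            raise ValueError(f"job {job_id!r} is already attached to {self.name!r}")
        unfinished = [j for j, s in self.job_states.items() if s not in FINISHED_JOB_STATES]
        if unfinished:
            raise RuntimeError(f"session {self.name!r} still hosts job {unfinished[0]!r}")
        self.job_states[job_id] = "pending"


def job_owners(sessions: list[Session]) -> dict[str, str]:
    """Map each job id to its session name, rejecting a job claimed by two sessions."""
    owners: dict[str, str] = {}
    for session in sessions:
        for job_id in session.job_states:
            if job_id in owners:
                raise ValueError(f"job {job_id!r} appears in more than one session")
            owners[job_id] = session.name
    return owners
```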