docs: fix delete->stop in REPORT + add session/job state glossary (FW-03, FW-10, FW-16)
FW-03: replace 'delete' with 'stop' in skill reference (line 299). 'terminated' retained as valid YAML status value (hard kill mode).

FW-10/FW-16: add Glossary section distinguishing session states (running/stopped/terminated/archived in agent-sessions.yaml) from job states (pending/running/completed/error/cancelled in registry). Documents which skill/function sets each state.
@@ -296,7 +296,7 @@ graph LR
 2. **`tmux-agent-orchestrate-monitor` (`reconcile.sh` & `watchdog.sh`)**:
 * **Watchdog Integration**: Starts a subscriber monitoring loop (`watchdog.sh`) to detect orphaned agent panes or locked workspaces.
 * **Reconciliation loop**: Subscribes to the global job topic. On terminal events, it invokes `lib.sh::atomic_dump_yaml` to sync status drifts (e.g. setting tmux sessions to `terminated` in `agent-sessions.yaml` once the agent exits).
-3. **`tmux-agent-orchestrate-create / delete / resume`**:
+3. **`tmux-agent-orchestrate-create / stop / resume`**:
 * Integrates the job life status into session metadata updates, ensuring standard tmux cleanup triggers state updates in the registry and audit logs.
 
 ---
@@ -326,3 +326,36 @@ graph LR
 De-prioritize plaintext support. Enforce connection over port `8883` with verified TLS certificates. Implement client certificates (mTLS) for agent authentication.
 4. **Build Auto-Reconnecting Subscriber Loops**:
 Upgrade `job_subscriber.py` to handle disconnect callbacks. Maintain a persistent queue in memory and allow the client to reconnect with exponential backoff, preventing socket dropout from terminating the orchestration flow.
+
+---
+
+## Glossary: Session States vs Job States
+
+This project manages **two distinct state domains** that are often confused:
+
+### Session States (YAML — `.hermes/agent-sessions.yaml`)
+Managed by `skills/lib.sh` and the 6 `tmux-agent-orchestrate-*` skills.
+Valid values (see `lib.sh` valid-status set):
+
+| State | Meaning | Set by |
+|---|---|---|
+| `running` | tmux session active, agent running | `create`, `resume` |
+| `stopped` | deliberately stopped via `--capture-id`/`--reason`/`--graceful`; conversation preserved for resume | `stop` (STOP mode) |
+| `terminated` | hard-killed via `--mode hard`; tmux session destroyed | `stop` (hard mode), `monitor` reconcile |
+| `archived` | soft-stopped via `--mode soft`; tmux left alive, YAML-only update | `stop` (soft mode) |
+
+### Job States (Registry — `.hermes/jobs/<id>.json`)
+Managed by `skills/tmux-agent-orchestrate-delegate-job/scripts/registry.py`.
+Valid values:
+
+| State | Meaning | Set by |
+|---|---|---|
+| `pending` | job registered, agent not yet started | `registry.py register` |
+| `running` | agent picked up the job, publishing events | `publish_event.py --event started` |
+| `completed` | terminal event — agent finished successfully | `publish_event.py --event completed` |
+| `error` | terminal event — agent failed | `publish_event.py --event error` |
+| `cancelled` | job cancelled by orchestrator | `registry.py cancel` |
+
+**Key distinction**: Session states track the **tmux container lifecycle** (create→stop→resume).
+Job states track the **delegated work lifecycle** (submit→run→complete/error).
+A single session can host multiple sequential jobs; a job runs within exactly one session.
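
The two state domains in the glossary can be sketched as separate enumerations, so that a job state cannot be passed silently where a session state is expected. This is a minimal illustration, not the project's code: `SessionState`, `JobState`, `parse_session_state`, and `parse_job_state` are hypothetical names, and only the state values themselves are taken from the glossary tables.

```python
from enum import Enum


class SessionState(str, Enum):
    """Values the glossary lists for .hermes/agent-sessions.yaml."""
    RUNNING = "running"
    STOPPED = "stopped"
    TERMINATED = "terminated"
    ARCHIVED = "archived"


class JobState(str, Enum):
    """Values the glossary lists for .hermes/jobs/<id>.json."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    ERROR = "error"
    CANCELLED = "cancelled"


def parse_session_state(value: str) -> SessionState:
    """Hypothetical helper: accept only session-domain values."""
    try:
        return SessionState(value)
    except ValueError:
        raise ValueError(f"{value!r} is not a session state") from None


def parse_job_state(value: str) -> JobState:
    """Hypothetical helper: accept only job-domain values."""
    try:
        return JobState(value)
    except ValueError:
        raise ValueError(f"{value!r} is not a job state") from None


# Values that appear in both domains and are therefore easy to confuse.
SHARED_VALUES = {s.value for s in SessionState} & {j.value for j in JobState}
```

In this sketch `running` is the only value shared by both domains, so a bare string `"running"` is ambiguous until it is parsed into one of the two types.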
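
The reconnect recommendation in item 4 of the second hunk (reconnect with exponential backoff after a dropped socket) could look roughly like the sketch below. It is library-agnostic and does not reflect the actual contents of `job_subscriber.py`: `backoff_delays`, `reconnect_with_backoff`, and the `connect` callable are hypothetical, and the base, factor, and cap values are arbitrary defaults chosen for illustration.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Yield base, base*factor, base*factor**2, ... capped at `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect_with_backoff(connect: Callable[[], None],
                           max_attempts: int = 8,
                           sleep: Callable[[float], None] = time.sleep) -> int:
    """Call connect() until it succeeds; return the number of attempts used.

    `connect` stands in for whatever the subscriber uses to open its
    connection (an assumption). It is expected to raise OSError, or a
    subclass such as ConnectionError, on failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise  # give up and surface the last failure
            sleep(next(delays))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting. A production loop would usually also add random jitter to the delays so that several agents do not reconnect in lockstep; that is left out here for brevity.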