feat(lib): SQLite DB normalization (FW-L3) & stop semantics simplification (FW-L2)
@@ -1,6 +1,6 @@
 ---
 name: tmux-agent-orchestrate-stop
-description: "Stop an agent tmux session (claude, antigravity/agy) and update .hermes/agent-sessions.yaml. Hard mode marks status=terminated; stop options (--capture-id/--reason/--graceful) mark status=stopped with conversation preserved for resume. Does NOT delete on-disk conversation artifacts (jsonl/db) — those are preserved unless --purge-conversation is passed. Use when ending a work session, switching to a different one, or cleaning up before a fresh start."
+description: "Stop an agent tmux session (claude, antigravity/agy) and update .hermes/agent-sessions.yaml. Default stops gracefully and marks status=stopped with conversation preserved for resume. Does NOT delete on-disk conversation artifacts (jsonl/db) — those are preserved unless --purge-conversation is passed. Use when ending a work session, switching to a different one, or cleaning up before a fresh start."
 version: 1.0.0
 author: godopu
 license: MIT
@@ -21,16 +21,17 @@ metadata:

 ## What this skill does

-Stop an agent's tmux session and **mark the YAML entry (terminated or stopped)**. Preserves:
+Stop an agent's tmux session gracefully, resolve and store the conversation ID, and **mark the YAML entry (status=stopped)**. Preserves:

 - The tmux session's recorded `pane.pid / cmd / cwd / mcp_attachments` for audit
 - The agent's on-disk conversation (claude `*.jsonl`, agy `conversations/*.db`) — so the user can `tmux-agent-orchestrate-resume` later
 - The `start_command` so a future `tmux-agent-orchestrate-create --session <name>` reproduces the same tmux spec

-The user explicitly chooses:
-
-- **soft stop** (default): update YAML only; leave tmux running. Useful when "stop" really means "I'm done with this card".
-- **hard stop**: `tmux kill-session` + update YAML. The default when the user says "kill it" or "end the session".
+The stop command is always **graceful by default**:
+1. Sends exit keys to the agent TUI (`/exit` for Claude, `Exit` for Agy) and waits 3 seconds.
+2. If still alive, issues `tmux kill-session` (SIGTERM) and waits 5 seconds.
+3. If still alive, kills the pane PID via SIGKILL (`kill -9`) as a last resort.
+4. Auto-captures the conversation ID into the row (`claude_session_id_own`/`agy_conversation_id_own`) before killing, ensuring the next resume uses a race-free tier-1 lookup.

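The escalation ladder added in this hunk (exit keys, then `tmux kill-session`, then SIGKILL) can be sketched roughly as follows. This is an illustrative Python sketch, not code from the commit: `escalate_stop`, `tmux_steps`, and `tmux_alive` are hypothetical names, the real logic lives in `stop_session.sh`, and the 0.5 s wait after SIGKILL is an assumption. The conversation-ID capture from step 4 would have to run before any of these steps.

```python
import os
import signal
import subprocess
import time
from typing import Callable, Iterable, List, Optional, Tuple

# (label, action, seconds to wait before re-checking liveness)
Step = Tuple[str, Callable[[], None], float]


def escalate_stop(steps: Iterable[Step],
                  is_alive: Callable[[], bool],
                  sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Run the steps in order until is_alive() returns False.

    Returns the label of the step after which the session was gone,
    "already_dead" if nothing had to be done, or None if every step failed.
    """
    if not is_alive():
        return "already_dead"
    for label, action, wait_seconds in steps:
        action()
        sleep(wait_seconds)
        if not is_alive():
            return label
    return None


def tmux_alive(session: str) -> bool:
    """True if `tmux has-session` reports that the session exists."""
    result = subprocess.run(["tmux", "has-session", "-t", session],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def tmux_steps(session: str, exit_keys: str, pane_pid: int) -> List[Step]:
    """Build the three-rung ladder for a real tmux session (hypothetical helper)."""
    def send_exit() -> None:
        subprocess.run(["tmux", "send-keys", "-t", session, exit_keys, "Enter"], check=False)

    def kill_session() -> None:
        subprocess.run(["tmux", "kill-session", "-t", session], check=False)

    def kill_pane() -> None:
        try:
            os.kill(pane_pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the pane process is already gone

    return [("exit_keys", send_exit, 3.0),
            ("kill_session", kill_session, 5.0),
            ("sigkill", kill_pane, 0.5)]  # 0.5 s is an assumed settle time
```

A caller would run something like `escalate_stop(tmux_steps(name, "/exit", pid), lambda: tmux_alive(name))` and could record the returned label for audit. Passing the liveness check and sleep function as parameters keeps the ladder testable without a running tmux server.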
 ## Pre-flight

@@ -48,99 +49,64 @@ if '$SESSION_NAME' not in names:
     raise SystemExit(1)
 "

-# 2) Already terminated?
+# 2) Already stopped?
 ALREADY=$(python3 -c "
 import yaml
 d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
 s = [x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0]
 print(s.get('status', 'unknown'))
 ")
-if [ "$ALREADY" = "terminated" ]; then
-  echo "Already terminated at $(python3 -c "import yaml; d=yaml.safe_load(open('$AGENT_SESSIONS_YAML')); print([x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0].get('terminated_at',''))")"
-  echo "Re-running will just refresh the timestamp. Continue? (--yes to skip)"
+if [ "$ALREADY" = "stopped" ]; then
+  echo "Already stopped."
 fi
 ```

 ## Workflow

 ```bash
-# 1. soft stop (YAML only — tmux left running)
+# 1. Stop gracefully (default — captures ID, shuts down safely, status=stopped)
 bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
-  --session "$SESSION_NAME" --mode soft
+  --session "$SESSION_NAME"

-# 2. hard stop (default — kill tmux + update YAML)
+# 2. Stop gracefully + record a custom stop reason
 bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
-  --session "$SESSION_NAME" --mode hard
+  --session "$SESSION_NAME" --reason api_error

-# 3. hard stop + clean up on-disk conversation (DANGEROUS)
-#    — this prevents any future resume. Use only when user is certain.
+# 3. Stop gracefully + clean up on-disk conversation (DANGEROUS)
+#    — this prevents any future resume (status=terminated, resumable=false).
 bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
-  --session "$SESSION_NAME" --mode hard --purge-conversation
+  --session "$SESSION_NAME" --purge-conversation
 ```

-## Stop extension (Option A — `stop` semantics without a 6th skill)
-
-Rather than a separate `tmux-agent-orchestrate-stop` route, the base stop command absorbs the
-"stop" intent via three opt-in options. Passing **any** of them switches the YAML
-transition from `terminated` to **`stopped`** (`running → stopped`), signalling
-"deliberately stopped, conversation preserved, ready to resume":
-
-```bash
-# Stop: capture the conversation id into the row, record a reason, exit gracefully.
-bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
-  --session "$SESSION_NAME" --capture-id --reason api_error --graceful
-```
-
-| Option | Effect |
-|---|---|
-| `--capture-id` | Before kill, resolve THIS workspace's conversation id via `find_workspace_uuid` (per-row → workspace-scoped disk scan → cache) and record it to `claude_session_id_own` / `agy_conversation_id_own`, plus `resumable: true`. Guarantees the next resume hits **tier-1** (race-free) instead of the mtime-based disk-scan fallback. |
-| `--reason <reason>` | Records `stop_reason` (default `manual_stop`). Convention: `user_request` / `api_error` / `timeout` / `crash` / `manual_stop`. |
-| `--graceful` | `tmux send-keys` exit (`/exit` for claude, `Exit` for agy) → 3 s wait → if alive `tmux kill-session` (SIGTERM) → 5 s → `kill -9` pane pid as last resort. Avoids hard-killing a TUI mid-write. |
-
-**Idempotency**: in STOP mode, if the row is already `status: stopped`, the script
-prints `already stopped (...)` and exits 0 — re-running is a safe no-op.
-
-**Backward compatibility**: with none of these options, the base stop command behaves exactly as
-before (`hard`→`terminated`, `soft`→`archived`).
+**Idempotency**: if the row is already `status: stopped`, the script prints `already stopped (...)` and exits 0 — re-running is a safe no-op.

 ### State machine

 ```
-running ──(stop --mode hard)────────────────► terminated
-running ──(stop --capture-id/--reason/--graceful)► stopped (resumable, conv preserved)
-running ──(stop --mode soft)────────────────► archived (tmux left alive)
-stopped ──(stop --capture-id … again)───────► stopped (idempotent no-op)
-any ──(stop --purge-conversation --yes)─► (conv deleted, resumable:false)
+running ──(stop default / --reason)────────► stopped (resumable:true, conv preserved)
+running ──(stop --purge-conversation --yes)► terminated (resumable:false, conv deleted)
+stopped ──(stop default … again)───────────► stopped (idempotent no-op)
 ```

-Fields written in STOP mode: `status: stopped`, `stopped_at`, `stopped_at_epoch`,
-`stop_reason`, `termination_mode: stop|graceful`, and (with `--capture-id`)
-`claude_session_id_own`/`agy_conversation_id_own` + `resumable: true`.
+Fields written in STOP mode: `status: stopped`, `stopped_at`, `stopped_at_epoch`, `stop_reason`, `termination_mode: graceful`, `claude_session_id_own`/`agy_conversation_id_own` and `resumable: true`.
+
+If `--purge-conversation` is used: `status: terminated`, `terminated_at`, `terminated_at_epoch`, `termination_mode: purge` and `resumable: false`.

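The new state machine and the two field lists can be read as a small pure function over a YAML row. The sketch below is illustrative only and is not taken from the commit: `apply_stop` and its parameters are hypothetical, the `manual_stop` default comes from the option table this commit removes, and letting `--purge-conversation` apply from any state is an assumption (the new diagram only shows it from `running`).

```python
from datetime import datetime, timezone
from typing import Optional


def apply_stop(row: dict, *, reason: str = "manual_stop", purge: bool = False,
               conversation_id: Optional[str] = None,
               id_field: str = "claude_session_id_own",
               now: Optional[datetime] = None) -> dict:
    """Return a copy of `row` with the fields a stop would write."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    epoch = int(now.timestamp())
    out = dict(row)  # every original field is preserved

    if purge:
        # --purge-conversation: conversation deleted, no future resume
        out.update(status="terminated", terminated_at=stamp,
                   terminated_at_epoch=epoch, termination_mode="purge",
                   resumable=False)
        return out

    if row.get("status") == "stopped":
        return out  # idempotent no-op: already stopped

    out.update(status="stopped", stopped_at=stamp, stopped_at_epoch=epoch,
               stop_reason=reason, termination_mode="graceful", resumable=True)
    if conversation_id is not None:
        out[id_field] = conversation_id
    else:
        out.setdefault(id_field, None)  # keep the null placeholder
    return out
```

For an agy session the caller would pass `id_field="agy_conversation_id_own"`. Returning a copy instead of mutating the row makes the idempotent case easy to assert: stopping an already stopped row yields an equal dictionary.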
 The script:
 1. Verifies the session is in agent-sessions.yaml
 2. If `delegate_job_id` is set, automatically publishes a `progress --detail "terminating"` event to the tmux-agent-orchestrate-delegate-job registry
 3. Captures the `last_visible_status` from `tmux capture-pane` (so we have a final TUI snapshot for audit)
-4. For `hard` mode: `tmux kill-session -t <name>` (which auto-SIGTERMs children including the agent)
+4. Attempts graceful exit keys → SIGTERM kill-session → SIGKILL fallback
 5. For `purge-conversation`: deletes `~/.claude/projects/.../jsonl` (claude) or `~/.gemini/antigravity-cli/conversations/...db` + `brain/...` (agy)
-6. Updates the YAML entry
+6. Updates the YAML entry and SQLite database atomically
 7. If `delegate_job_id` is set, publishes a `completed` event to the tmux-agent-orchestrate-delegate-job registry
-8. Updates the YAML entry:
-```yaml
-- name: <SESSION_NAME>
-  status: terminated
-  terminated_at: 2026-06-17T...Z
-  terminated_at_epoch: ...
-  # all original fields preserved
-```

 ## Pitfalls

-- **`tmux kill-session` doesn't just kill the session — it sends SIGHUP to the pane's child processes too.** This is usually what you want (the agent process dies, no zombie reparenting to init). But if you wanted to keep the agent running outside tmux for some reason, use `soft` mode.
 - **Don't delete on-disk artifacts by default** — the agent's `*.jsonl` / `conversations/*.db` is the data that `tmux-agent-orchestrate-resume` needs. `--purge-conversation` is for when the user is genuinely done with the conversation and wants zero recovery chance.
-- **YAML is append-only until you write a stop** — if a previous run left the entry as `running` but tmux is actually dead (crash, host reboot), the YAML is stale. Running `tmux-agent-orchestrate-stop --mode hard` will detect "tmux already dead, just update YAML" and proceed.
-- **Don't delete the `claude_session_id_own: null` placeholder** — when the user creates a fresh session with `tmux-agent-orchestrate-create` and never sent a message, the entry has `claude_session_id_own: null`. Stopping must preserve that field (it's the audit trail showing "this tmux session never produced a session id of its own").
-- **Monitor skill may still be tracking** — if `tmux-agent-orchestrate-monitor` is running a heartbeat loop, stopping a session while it watches will trigger its `tmux ls != yaml` reconciliation. That's expected — let the monitor run, it will mark the entry as `terminated` on its own. Don't fight it.
+- **YAML is append-only until you write a stop** — if a previous run left the entry as `running` but tmux is actually dead (crash, host reboot), the YAML is stale. Running `tmux-agent-orchestrate-stop` will detect "tmux already dead, just update YAML" and proceed.
+- **Don't delete the `claude_session_id_own: null` placeholder** — when the user creates a fresh session with `tmux-agent-orchestrate-create` and never sent a message, the entry has `claude_session_id_own: null`. Stopping must preserve that field.
+- **Monitor skill may still be tracking** — if `tmux-agent-orchestrate-monitor` is running a heartbeat loop, stopping a session while it watches will trigger its `tmux ls != yaml` reconciliation. That's expected — let the monitor run, it will mark the entry as `terminated` on its own.

 ## Verification

@@ -148,23 +114,23 @@ The script:
 # 1. tmux gone
 tmux has-session -t "$SESSION_NAME" 2>/dev/null && echo "STILL ALIVE" || echo "OK: tmux gone"

-# 2. YAML has terminated entry
+# 2. YAML has stopped entry
 python3 -c "
 import yaml
 d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
 s = [x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0]
-assert s['status'] == 'terminated', f'expected terminated, got {s[\"status\"]}'
-assert s.get('terminated_at'), 'missing terminated_at'
-print(f'OK: terminated at {s[\"terminated_at\"]}')
+assert s['status'] == 'stopped', f'expected stopped, got {s[\"status\"]}'
+assert s.get('stopped_at'), 'missing stopped_at'
+print(f'OK: stopped at {s[\"stopped_at\"]}')
 print(f'  preserved: pane.pid={s[\"pane\"][\"pid\"]}, cmd={s[\"pane\"][\"cmd\"]}, cwd={s[\"pane\"][\"cwd\"]}')
 "

-# 3. (if --purge-conversation) disk artifacts gone (CLAUDE_PROJECT_DIR env var overrides default $HOME/.claude/projects)
+# 3. (if --purge-conversation) disk artifacts gone
 [ -f "${CLAUDE_PROJECT_DIR:-$HOME/.claude/projects}/<projkey>/<uuid>.jsonl" ] && echo "WARN: jsonl still exists" || echo "OK: jsonl purged"
 ```

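The inline YAML check above stops at the first failed assertion and raises `IndexError` or `KeyError` when the entry or its `pane` block is missing. A possible alternative, sketched here as an assumption rather than as part of the commit, collects every problem in one pass. `check_stop_entry` is a hypothetical helper that takes the already-parsed YAML document, so it does not itself depend on PyYAML; the `resumable` expectation follows the field lists in the state-machine section.

```python
from typing import List


def check_stop_entry(doc: dict, session_name: str, purged: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the entry looks right."""
    rows = [x for x in doc.get("tmux_sessions", []) if x.get("name") == session_name]
    if not rows:
        return [f"no entry named {session_name!r}"]
    row = rows[0]

    # A purge ends in status=terminated, a normal stop in status=stopped.
    expected = "terminated" if purged else "stopped"
    problems = []
    if row.get("status") != expected:
        problems.append(f"expected status {expected}, got {row.get('status')}")
    if not row.get(f"{expected}_at"):
        problems.append(f"missing {expected}_at")
    if row.get("resumable") != (not purged):
        problems.append(f"resumable should be {not purged}")

    # The recorded pane fields must survive the stop for audit.
    pane = row.get("pane") or {}
    for key in ("pid", "cmd", "cwd"):
        if key not in pane:
            problems.append(f"pane.{key} was not preserved")
    return problems
```

With PyYAML installed, a caller could pass `yaml.safe_load(open(path))` as `doc` and print each returned problem on its own line.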
 ## When NOT to use this skill

 - **Just detaching** → `tmux detach` (Ctrl-B d) or just close the terminal. The tmux session keeps running.
-- **Stopping the agent inside but keeping tmux** → send `Ctrl-C` or `/exit` (claude) / `Ctrl-D` (agy) via `tmux send-keys`. The tmux session stays but the agent process is gone; you can then `tmux-agent-orchestrate-create` again to spawn a fresh agent in the same tmux session.
-- **Replacing an existing session with a new one** → `tmux-agent-orchestrate-stop --mode hard` first, then `tmux-agent-orchestrate-create`.
+- **Stopping the agent inside but keeping tmux** → send `Ctrl-C` or `/exit` (claude) / `Ctrl-D` (agy) via `tmux send-keys`. The tmux session stays but the agent process is gone.
+- **Replacing an existing session with a new one** → `tmux-agent-orchestrate-stop` first, then `tmux-agent-orchestrate-create`.