feat(lib): SQLite DB normalization (FW-L3) & stop semantics simplification (FW-L2)

This commit is contained in:
2026-06-21 09:05:15 +00:00
parent 478be56679
commit 8097df0cbe
11 changed files with 324 additions and 200 deletions
+36 -70
View File
@@ -1,6 +1,6 @@
---
name: tmux-agent-orchestrate-stop
description: "Stop an agent tmux session (claude, antigravity/agy) and update .hermes/agent-sessions.yaml. Hard mode marks status=terminated; stop options (--capture-id/--reason/--graceful) mark status=stopped with conversation preserved for resume. Does NOT delete on-disk conversation artifacts (jsonl/db) — those are preserved unless --purge-conversation is passed. Use when ending a work session, switching to a different one, or cleaning up before a fresh start."
description: "Stop an agent tmux session (claude, antigravity/agy) and update .hermes/agent-sessions.yaml. Default stops gracefully and marks status=stopped with conversation preserved for resume. Does NOT delete on-disk conversation artifacts (jsonl/db) — those are preserved unless --purge-conversation is passed. Use when ending a work session, switching to a different one, or cleaning up before a fresh start."
version: 1.0.0
author: godopu
license: MIT
@@ -21,16 +21,17 @@ metadata:
## What this skill does
Stop an agent's tmux session and **mark the YAML entry (terminated or stopped)**. Preserves:
Stop an agent's tmux session gracefully, resolve and store the conversation ID, and **mark the YAML entry (status=stopped)**. Preserves:
- The tmux session's recorded `pane.pid / cmd / cwd / mcp_attachments` for audit
- The agent's on-disk conversation (claude `*.jsonl`, agy `conversations/*.db`) — so the user can `tmux-agent-orchestrate-resume` later
- The `start_command` so a future `tmux-agent-orchestrate-create --session <name>` reproduces the same tmux spec
The user explicitly chooses:
- **soft stop** (default): update YAML only; leave tmux running. Useful when "stop" really means "I'm done with this card".
- **hard stop**: `tmux kill-session` + update YAML. The default when the user says "kill it" or "end the session".
The stop command is always **graceful by default**:
1. Sends exit keys to the agent TUI (`/exit` for Claude, `Exit` for Agy) and waits 3 seconds.
2. If still alive, issues `tmux kill-session` (SIGTERM) and waits 5 seconds.
3. If still alive, kills the pane PID via SIGKILL (`kill -9`) as a last resort.
4. Auto-captures the conversation ID into the row (`claude_session_id_own`/`agy_conversation_id_own`) before killing, ensuring the next resume uses a race-free tier-1 lookup.
## Pre-flight
@@ -48,99 +49,64 @@ if '$SESSION_NAME' not in names:
raise SystemExit(1)
"
# 2) Already terminated?
# 2) Already stopped?
ALREADY=$(python3 -c "
import yaml
d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
s = [x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0]
print(s.get('status', 'unknown'))
")
if [ "$ALREADY" = "terminated" ]; then
echo "Already terminated at $(python3 -c "import yaml; d=yaml.safe_load(open('$AGENT_SESSIONS_YAML')); print([x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0].get('terminated_at',''))")"
echo "Re-running will just refresh the timestamp. Continue? (--yes to skip)"
if [ "$ALREADY" = "stopped" ]; then
echo "Already stopped."
fi
```
## Workflow
```bash
# 1. soft stop (YAML only — tmux left running)
# 1. Stop gracefully (default — captures ID, shuts down safely, status=stopped)
bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
--session "$SESSION_NAME" --mode soft
--session "$SESSION_NAME"
# 2. hard stop (default — kill tmux + update YAML)
# 2. Stop gracefully + record a custom stop reason
bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
--session "$SESSION_NAME" --mode hard
--session "$SESSION_NAME" --reason api_error
# 3. hard stop + clean up on-disk conversation (DANGEROUS)
# — this prevents any future resume. Use only when user is certain.
# 3. Stop gracefully + clean up on-disk conversation (DANGEROUS)
# — this prevents any future resume (status=terminated, resumable=false).
bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
--session "$SESSION_NAME" --mode hard --purge-conversation
--session "$SESSION_NAME" --purge-conversation
```
## Stop extension (Option A — `stop` semantics without a 6th skill)
Rather than a separate `tmux-agent-orchestrate-stop` route, the base stop command absorbs the
"stop" intent via three opt-in options. Passing **any** of them switches the YAML
transition from `terminated` to **`stopped`** (`running → stopped`), signalling
"deliberately stopped, conversation preserved, ready to resume":
```bash
# Stop: capture the conversation id into the row, record a reason, exit gracefully.
bash skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh \
--session "$SESSION_NAME" --capture-id --reason api_error --graceful
```
| Option | Effect |
|---|---|
| `--capture-id` | Before kill, resolve THIS workspace's conversation id via `find_workspace_uuid` (per-row → workspace-scoped disk scan → cache) and record it to `claude_session_id_own` / `agy_conversation_id_own`, plus `resumable: true`. Guarantees the next resume hits **tier-1** (race-free) instead of the mtime-based disk-scan fallback. |
| `--reason <reason>` | Records `stop_reason` (default `manual_stop`). Convention: `user_request` / `api_error` / `timeout` / `crash` / `manual_stop`. |
| `--graceful` | `tmux send-keys` exit (`/exit` for claude, `Exit` for agy) → 3 s wait → if alive `tmux kill-session` (SIGTERM) → 5 s → `kill -9` pane pid as last resort. Avoids hard-killing a TUI mid-write. |
**Idempotency**: in STOP mode, if the row is already `status: stopped`, the script
prints `already stopped (...)` and exits 0 — re-running is a safe no-op.
**Backward compatibility**: with none of these options, the base stop command behaves exactly as
before (`hard``terminated`, `soft``archived`).
**Idempotency**: if the row is already `status: stopped`, the script prints `already stopped (...)` and exits 0 — re-running is a safe no-op.
### State machine
```
running ──(stop --mode hard)────────────────► terminated
running ──(stop --capture-id/--reason/--graceful)► stopped (resumable, conv preserved)
running ──(stop --mode soft)───────────────archived (tmux left alive)
stopped ──(stop --capture-id … again)───────► stopped (idempotent no-op)
any ──(stop --purge-conversation --yes)─► (conv deleted, resumable:false)
running ──(stop default / --reason)────────► stopped (resumable:true, conv preserved)
running ──(stop --purge-conversation --yes)► terminated (resumable:false, conv deleted)
stopped ──(stop default … again)───────────► stopped (idempotent no-op)
```
Fields written in STOP mode: `status: stopped`, `stopped_at`, `stopped_at_epoch`,
`stop_reason`, `termination_mode: stop|graceful`, and (with `--capture-id`)
`claude_session_id_own`/`agy_conversation_id_own` + `resumable: true`.
Fields written in STOP mode: `status: stopped`, `stopped_at`, `stopped_at_epoch`, `stop_reason`, `termination_mode: graceful`, `claude_session_id_own`/`agy_conversation_id_own` and `resumable: true`.
If `--purge-conversation` is used: `status: terminated`, `terminated_at`, `terminated_at_epoch`, `termination_mode: purge` and `resumable: false`.
The script:
1. Verifies the session is in agent-sessions.yaml
2. If `delegate_job_id` is set, automatically publishes a `progress --detail "terminating"` event to the tmux-agent-orchestrate-delegate-job registry
3. Captures the `last_visible_status` from `tmux capture-pane` (so we have a final TUI snapshot for audit)
4. For `hard` mode: `tmux kill-session -t <name>` (which auto-SIGTERMs children including the agent)
4. Attempts graceful exit keys → SIGTERM kill-session → SIGKILL fallback
5. For `purge-conversation`: deletes `~/.claude/projects/.../jsonl` (claude) or `~/.gemini/antigravity-cli/conversations/...db` + `brain/...` (agy)
6. Updates the YAML entry
6. Updates the YAML entry and SQLite database atomically
7. If `delegate_job_id` is set, publishes a `completed` event to the tmux-agent-orchestrate-delegate-job registry
8. Updates the YAML entry:
```yaml
- name: <SESSION_NAME>
status: terminated
terminated_at: 2026-06-17T...Z
terminated_at_epoch: ...
# all original fields preserved
```
## Pitfalls
- **`tmux kill-session` doesn't just kill the session — it sends SIGHUP to the pane's child processes too.** This is usually what you want (the agent process dies, no zombie reparenting to init). But if you wanted to keep the agent running outside tmux for some reason, use `soft` mode.
- **Don't delete on-disk artifacts by default** — the agent's `*.jsonl` / `conversations/*.db` is the data that `tmux-agent-orchestrate-resume` needs. `--purge-conversation` is for when the user is genuinely done with the conversation and wants zero recovery chance.
- **YAML is append-only until you write a stop** — if a previous run left the entry as `running` but tmux is actually dead (crash, host reboot), the YAML is stale. Running `tmux-agent-orchestrate-stop --mode hard` will detect "tmux already dead, just update YAML" and proceed.
- **Don't delete the `claude_session_id_own: null` placeholder** — when the user creates a fresh session with `tmux-agent-orchestrate-create` and never sent a message, the entry has `claude_session_id_own: null`. Stopping must preserve that field (it's the audit trail showing "this tmux session never produced a session id of its own").
- **Monitor skill may still be tracking** — if `tmux-agent-orchestrate-monitor` is running a heartbeat loop, stopping a session while it watches will trigger its `tmux ls != yaml` reconciliation. That's expected — let the monitor run, it will mark the entry as `terminated` on its own. Don't fight it.
- **YAML is append-only until you write a stop** — if a previous run left the entry as `running` but tmux is actually dead (crash, host reboot), the YAML is stale. Running `tmux-agent-orchestrate-stop` will detect "tmux already dead, just update YAML" and proceed.
- **Don't delete the `claude_session_id_own: null` placeholder** — when the user creates a fresh session with `tmux-agent-orchestrate-create` and never sent a message, the entry has `claude_session_id_own: null`. Stopping must preserve that field.
- **Monitor skill may still be tracking** — if `tmux-agent-orchestrate-monitor` is running a heartbeat loop, stopping a session while it watches will trigger its `tmux ls != yaml` reconciliation. That's expected — let the monitor run, it will mark the entry as `terminated` on its own.
## Verification
@@ -148,23 +114,23 @@ The script:
# 1. tmux gone
tmux has-session -t "$SESSION_NAME" 2>/dev/null && echo "STILL ALIVE" || echo "OK: tmux gone"
# 2. YAML has terminated entry
# 2. YAML has stopped entry
python3 -c "
import yaml
d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
s = [x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0]
assert s['status'] == 'terminated', f'expected terminated, got {s[\"status\"]}'
assert s.get('terminated_at'), 'missing terminated_at'
print(f'OK: terminated at {s[\"terminated_at\"]}')
assert s['status'] == 'stopped', f'expected stopped, got {s[\"status\"]}'
assert s.get('stopped_at'), 'missing stopped_at'
print(f'OK: stopped at {s[\"stopped_at\"]}')
print(f' preserved: pane.pid={s[\"pane\"][\"pid\"]}, cmd={s[\"pane\"][\"cmd\"]}, cwd={s[\"pane\"][\"cwd\"]}')
"
# 3. (if --purge-conversation) disk artifacts gone (CLAUDE_PROJECT_DIR env var overrides default $HOME/.claude/projects)
# 3. (if --purge-conversation) disk artifacts gone
[ -f "${CLAUDE_PROJECT_DIR:-$HOME/.claude/projects}/<projkey>/<uuid>.jsonl" ] && echo "WARN: jsonl still exists" || echo "OK: jsonl purged"
```
## When NOT to use this skill
- **Just detaching** → `tmux detach` (Ctrl-B d) or just close the terminal. The tmux session keeps running.
- **Stopping the agent inside but keeping tmux** → send `Ctrl-C` or `/exit` (claude) / `Ctrl-D` (agy) via `tmux send-keys`. The tmux session stays but the agent process is gone; you can then `tmux-agent-orchestrate-create` again to spawn a fresh agent in the same tmux session.
- **Replacing an existing session with a new one** → `tmux-agent-orchestrate-stop --mode hard` first, then `tmux-agent-orchestrate-create`.
- **Stopping the agent inside but keeping tmux** → send `Ctrl-C` or `/exit` (claude) / `Ctrl-D` (agy) via `tmux send-keys`. The tmux session stays but the agent process is gone.
- **Replacing an existing session with a new one** → `tmux-agent-orchestrate-stop` first, then `tmux-agent-orchestrate-create`.
@@ -33,54 +33,41 @@ source "$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)/lib.sh"
usage() {
cat <<EOF
Usage: $0 --session <name> [--agent claude|agy] [--mode soft|hard] [--purge-conversation] [--yes]
[--capture-id] [--reason <reason>] [--graceful]
Usage: $0 --session <name> [--agent claude|agy] [--purge-conversation] [--yes] [--reason <reason>]
Modes:
soft — update YAML to status=archived, leave tmux running
hard (default) — tmux kill-session + update YAML to status=terminated
Stop extension (any of these → STOP mode, status=stopped instead of terminated):
--capture-id — record this workspace's conversation id to the row before kill
Stop arguments:
--reason <reason> — stop_reason field (default: manual_stop)
--graceful — send-keys exit → 3s → kill-session → 5s → SIGKILL fallback
(idempotent: stopping an already-stopped session is a no-op with exit 0)
EOF
}
SESSION_NAME=""
AGENT=""
MODE="hard" # "stop" 의 자연스러운 의미 = tmux 까지 종료
PURGE=0
YES=0
CAPTURE_ID=0
GRACEFUL=0
REASON=""
STOP_MODE=0
CAPTURE_ID=1
GRACEFUL=1
REASON="manual_stop"
STOP_MODE=1
while [ $# -gt 0 ]; do
case "$1" in
--session) SESSION_NAME="$2"; shift 2 ;;
--agent) AGENT="$2"; shift 2 ;;
--mode) MODE="$2"; shift 2 ;;
--purge-conversation) PURGE=1; shift ;;
--yes) YES=1; shift ;;
--capture-id) CAPTURE_ID=1; STOP_MODE=1; shift ;;
--reason) REASON="$2"; STOP_MODE=1; shift 2 ;;
--graceful) GRACEFUL=1; STOP_MODE=1; shift ;;
--reason) REASON="$2"; shift 2 ;;
--mode|--capture-id|--graceful)
echo "ERROR: $1 option is deprecated. Stop now always stops gracefully and captures IDs." >&2
exit 2
;;
-h|--help) usage; exit 0 ;;
*) echo "ERROR: unknown arg: $1" >&2; usage; exit 2 ;;
esac
done
[ -n "$SESSION_NAME" ] || { echo "ERROR: --session required" >&2; usage; exit 2; }
[ "$MODE" = "soft" ] || [ "$MODE" = "hard" ] || { echo "ERROR: --mode must be soft or hard" >&2; exit 2; }
[ -f "$AGENT_SESSIONS_YAML" ] || { echo "ERROR: $AGENT_SESSIONS_YAML not found" >&2; exit 1; }
# STOP 모드 기본 사유
if [ "$STOP_MODE" = "1" ] && [ -z "$REASON" ]; then
REASON="manual_stop"
fi
export TMUX_SERVER_NAME="$(resolve_tmux_server "$SESSION_NAME")"
# --agent 미지정 시 이름 suffix 로 fallback (P1-F)
@@ -95,10 +82,34 @@ fi
# 세션이 YAML 에 있는지 + 해당 row 의 워크스페이스 cwd 및 delegate_job_id 추출.
# JSON 으로 emit — cwd 에 '|' 가 들어가도 안전 (review item 7; 기존 cwd|jid 파서 대체).
MAPPED_DATA=$(env_python "$AGENT_SESSIONS_YAML" SESSION_NAME="$SESSION_NAME" <<'PYEOF'
import os, json, yaml
import os, sys, json, yaml, sqlite3
name = os.environ['SESSION_NAME']
with open(os.environ['YAML_PATH']) as f:
d = yaml.safe_load(f) or {}
yaml_path = os.environ['YAML_PATH']
db_path = os.path.splitext(yaml_path)[0] + '.db'
d = {}
try:
if os.path.exists(db_path):
conn = sqlite3.connect(db_path, timeout=10.0)
try:
row = conn.execute('SELECT data FROM sessions WHERE name=?', (name,)).fetchone()
if row:
s = json.loads(row[0])
cwd = (s.get('pane') or {}).get('cwd', '')
jid = s.get('delegate_job_id', '') or ''
print(json.dumps({"cwd": cwd, "job_id": jid}))
raise SystemExit(0)
except sqlite3.OperationalError:
pass
row = conn.execute('SELECT data FROM state WHERE id=1').fetchone()
if row:
d = json.loads(row[0])
conn.close()
elif os.path.exists(yaml_path):
with open(yaml_path) as f:
d = yaml.safe_load(f) or {}
except Exception:
pass
for s in d.get('tmux_sessions', []):
if s.get('name') == name:
cwd = (s.get('pane') or {}).get('cwd', '')
@@ -194,31 +205,27 @@ graceful_stop() {
# tmux 종료: graceful 이면 폴백 체인, 아니면 기존 hard kill.
if [ "$GRACEFUL" = "1" ] && [ "$TMUX_ALIVE" = "1" ]; then
graceful_stop
elif [ "$MODE" = "hard" ] && [ "$TMUX_ALIVE" = "1" ]; then
elif [ "$TMUX_ALIVE" = "1" ]; then
tmux kill-session -t "$SESSION_NAME"
echo "killed tmux: $SESSION_NAME"
elif [ "$MODE" = "hard" ]; then
else
echo "tmux already dead, just updating YAML"
fi
atomic_dump_yaml "$AGENT_SESSIONS_YAML" \
SESSION_NAME="$SESSION_NAME" AGENT="$AGENT" MODE="$MODE" PURGE="$PURGE" \
SESSION_NAME="$SESSION_NAME" AGENT="$AGENT" PURGE="$PURGE" \
NOW_ISO="$NOW_ISO" NOW_EPOCH="$NOW_EPOCH" LAST_STATUS="$LAST_STATUS" \
PURGE_UUID="$PURGE_UUID" TARGET_CWD="$TARGET_CWD" \
STOP_MODE="$STOP_MODE" REASON="$REASON" GRACEFUL="$GRACEFUL" \
CAPTURED_UUID="$CAPTURED_UUID" <<'PYEOF'
REASON="$REASON" CAPTURED_UUID="$CAPTURED_UUID" <<'PYEOF'
import shutil
name = os.environ['SESSION_NAME']
agent = os.environ['AGENT']
mode = os.environ['MODE']
purge = os.environ['PURGE'] == '1'
now = os.environ['NOW_ISO']
home = os.environ['HOME_DIR']
last_status = os.environ.get('LAST_STATUS', '')
purge_uuid = os.environ.get('PURGE_UUID', '').strip()
ws = os.environ.get('TARGET_CWD', '')
stop_mode = os.environ.get('STOP_MODE') == '1'
graceful = os.environ.get('GRACEFUL') == '1'
reason = os.environ.get('REASON', '') or 'manual_stop'
captured = os.environ.get('CAPTURED_UUID', '').strip()
@@ -231,29 +238,22 @@ if target is None:
print(f"ERROR: disappeared during script: {name}", flush=True)
raise SystemExit(1)
if mode == 'soft':
# P1-A: soft 는 tmux 가 살아있으니 archived. terminated 아님.
target['status'] = 'archived'
target['archived_at'] = now
target['termination_mode'] = 'soft'
elif stop_mode:
# STOP 모드: running -> stopped (terminated 와 의도 구분). conversation 보존.
if purge:
target['status'] = 'terminated'
target['terminated_at'] = now
target['terminated_at_epoch'] = int(os.environ['NOW_EPOCH'])
target['termination_mode'] = 'purge'
else:
target['status'] = 'stopped'
target['stopped_at'] = now
target['stopped_at_epoch'] = int(os.environ['NOW_EPOCH'])
target['stop_reason'] = reason
target['termination_mode'] = 'graceful' if graceful else 'stop'
else:
target['status'] = 'terminated'
target['terminated_at'] = now
target['terminated_at_epoch'] = int(os.environ['NOW_EPOCH'])
target['termination_mode'] = 'hard'
target['termination_mode'] = 'graceful'
if last_status:
target['last_visible_status_at_termination'] = last_status
# --capture-id: 해결된 conversation id 를 per-row own id 에 확정 기록 (tier-1 보장).
# purge 와 함께면 어차피 아래에서 지워지므로 기록하지 않는다.
# --capture-id: 항상 captured UUID 기록 (purge가 아닐 때만)
if captured and not purge:
if agent == 'claude':
target['claude_session_id_own'] = captured
@@ -305,16 +305,11 @@ PYEOF
delegate_publish_event "$DELEGATE_JOB_ID" completed "session terminated"
echo
if [ "$STOP_MODE" = "1" ]; then
echo "=== stop complete ==="
else
echo "=== stop complete ==="
fi
echo "=== stop complete ==="
echo " session: $SESSION_NAME"
echo " agent: $AGENT"
echo " mode: $MODE${STOP_MODE:+ (stop)}${GRACEFUL:+ +graceful}"
[ "$STOP_MODE" = "1" ] && echo " reason: $REASON"
[ "$CAPTURE_ID" = "1" ] && echo " captured: ${CAPTURED_UUID:-<none>}"
echo " reason: $REASON"
echo " captured: ${CAPTURED_UUID:-<none>}"
echo " purge: $PURGE${PURGE_UUID:+ (uuid $PURGE_UUID)}"
echo " time: $NOW_ISO"
echo