From a2d4f8060858328e30bb8552b93fd50cc480e26b Mon Sep 17 00:00:00 2001
From: Godopu
Date: Sat, 20 Jun 2026 15:32:02 +0000
Subject: [PATCH] fix(monitor,resume): honor stopped status + clear stop metadata on resume

Implements user choice Option B: the two follow-ups to 0de0f23, in one patch.

Changes:
- skills/tmux-agent-orchestrate-monitor/scripts/reconcile.sh:
  - drift-A skip-set extended: ('terminated', 'archived', 'stopped')
  - prevents the monitor from overwriting a tmux-dead 'stopped' row with
    'terminated (auto-detected)', which would lose resumable + captured id
- skills/tmux-agent-orchestrate-resume/scripts/update_yaml_resumed.sh:
  - pop stopped_at, stopped_at_epoch, stop_reason, resumable on resume
    (alongside the existing terminated_at*/termination_mode/archived_at)
    so a resumed row has no stale end-of-session metadata
- skills/tmux-agent-orchestrate-monitor/SKILL.md: documented 'stopped' in the
  drift class list + a skip-set note on drift class A
- skills/tmux-agent-orchestrate-resume/SKILL.md: documented stopped -> running
  transition + tier-1 race-free resume path

5-route surface preserved (no new directory). delete_session.sh untouched.
Verified on isolated server -L claude-followup-test (kill-server after):
- syntax PASS
- E2E A: stop -> tmux dead -> reconcile --once -> status stays 'stopped'
- E2E B: resume -> stopped_at/stopped_at_epoch/stop_reason/resumable all gone
- E2E C: plain delete -> terminated, reconcile leaves it (no regression)
- Real YAML + main canary untouched

Co-Authored-By: Claude Opus 4.8
---
 skills/tmux-agent-orchestrate-monitor/SKILL.md   |  9 ++++++++-
 .../scripts/reconcile.sh                         |  4 +++-
 skills/tmux-agent-orchestrate-resume/SKILL.md    | 16 ++++++++++++++++
 .../scripts/update_yaml_resumed.sh               |  5 +++++
 4 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/skills/tmux-agent-orchestrate-monitor/SKILL.md b/skills/tmux-agent-orchestrate-monitor/SKILL.md
index 215778e..d076e29 100644
--- a/skills/tmux-agent-orchestrate-monitor/SKILL.md
+++ b/skills/tmux-agent-orchestrate-monitor/SKILL.md
@@ -30,7 +30,7 @@ Dispatch a **Kanban worker** (in `goal_mode`) that:
    - `~/.gemini/antigravity-cli/conversations/.db` mtime (agy)
 2. Compares the live state to `agent-sessions.yaml`
 3. Detects 4 classes of drift:
-   - **yaml-only terminated**: tmux dead, YAML says `terminated` → OK
+   - **yaml-only terminated/archived/stopped**: tmux dead, YAML says `terminated`, `archived`, or `stopped` → OK, left untouched (deliberate end states)
    - **yaml-only running, tmux dead**: YAML says `running`, tmux is gone → mark `terminated` with timestamp
    - **tmux-only running, not in YAML**: tmux session exists with `-creator-*` naming but YAML doesn't know about it → register as a new entry
    - **stale UUID**: YAML has a UUID, but the on-disk artifact is gone → flag in comment
@@ -120,6 +120,13 @@ tmux: no session
   → comment: "lab-landing-page-creator-claude: tmux gone (was pane 201132, cmd claude). Marked terminated."
 ```
 
+**Skip-set**: the auto-terminate only fires for sessions whose status is `running`.
+Rows already in a deliberate end state — `terminated`, `archived`, or **`stopped`**
+(set by `tmux-agent-orchestrate-delete --capture-id/--reason/--graceful`) — are
+left untouched. This is critical: a `stopped` row keeps its `resumable: true` and
+captured `*_session_id_own`, so the monitor must **not** overwrite it with
+`terminated ("auto-detected")` when its tmux is (expectedly) gone.
+
 ### B. tmux alive, not in YAML → auto-register
 
 ```
diff --git a/skills/tmux-agent-orchestrate-monitor/scripts/reconcile.sh b/skills/tmux-agent-orchestrate-monitor/scripts/reconcile.sh
index b161bd9..d2b829a 100755
--- a/skills/tmux-agent-orchestrate-monitor/scripts/reconcile.sh
+++ b/skills/tmux-agent-orchestrate-monitor/scripts/reconcile.sh
@@ -302,7 +302,9 @@ if tmux_confirmed:
         name = s.get('name')
         if not name:
             continue
-        if s.get('status') in ('terminated', 'archived'):
+        # 'stopped' is also a deliberate end state: do not treat it as drift, leave it as is.
+        # (Without this, a tmux-dead stopped session is overwritten as 'terminated' and the resumable flag is lost.)
+        if s.get('status') in ('terminated', 'archived', 'stopped'):
             continue
         srv = s.get('tmux_server') or 'default'
         if (name, srv) not in alive_set:
diff --git a/skills/tmux-agent-orchestrate-resume/SKILL.md b/skills/tmux-agent-orchestrate-resume/SKILL.md
index f5539d1..063ac29 100644
--- a/skills/tmux-agent-orchestrate-resume/SKILL.md
+++ b/skills/tmux-agent-orchestrate-resume/SKILL.md
@@ -29,6 +29,22 @@ Three cases this skill handles:
 2. **tmux is alive but empty** — You started a session with `tmux-agent-orchestrate-create` but haven't sent a message yet (so no session id was assigned). The user can either send their first message (and the id is auto-assigned), or you can read the *workspace's* most recent conversation from `$HOME_DIR/.gemini/antigravity-cli/cache/last_conversations.json` (defaults to `~/.gemini/...`) for agy, or the latest `*.jsonl` in `$CLAUDE_PROJECT_DIR//` (defaults to `~/.claude/projects/`) for claude.
 3. **tmux is alive AND the agent inside is already running** — Just attach. No re-spawn needed.
 
+### Resuming a `stopped` session (`stopped → running`)
+
+When a session was ended via `tmux-agent-orchestrate-delete --capture-id` (STOP
+mode), its row is `status: stopped` with `resumable: true` and the conversation id
+already recorded in `claude_session_id_own` / `agy_conversation_id_own`. This is the
+ideal resume path:
+
+- **tier-1, race-free**: because `--capture-id` wrote the id into the row at stop
+  time, `resolve_session_id.sh` resolves it via `find_workspace_uuid` tier-1 (the
+  per-row own id) — no reliance on the mtime-based disk scan, so a concurrent
+  session in another workspace can never shadow it.
+- On resume, `update_yaml_resumed.sh` transitions `stopped → running` and **clears
+  the stop metadata** (`stopped_at`, `stopped_at_epoch`, `stop_reason`, `resumable`)
+  along with the usual `terminated_at*` / `termination_mode` / `archived_at`, so the
+  row reflects a clean running state with no stale end-of-session fields.
+
 ## UUID resolution order
 
 `agent-sessions.yaml` is the *primary* source. The skill reads in this order:
diff --git a/skills/tmux-agent-orchestrate-resume/scripts/update_yaml_resumed.sh b/skills/tmux-agent-orchestrate-resume/scripts/update_yaml_resumed.sh
index efa76df..c7621a9 100755
--- a/skills/tmux-agent-orchestrate-resume/scripts/update_yaml_resumed.sh
+++ b/skills/tmux-agent-orchestrate-resume/scripts/update_yaml_resumed.sh
@@ -92,6 +92,11 @@ target.pop('terminated_at', None)
 target.pop('terminated_at_epoch', None)
 target.pop('termination_mode', None)
 target.pop('archived_at', None)
+# Also clear the stop metadata: once resumed, the row is no longer stopped, so drop the leftover fields.
+target.pop('stopped_at', None)
+target.pop('stopped_at_epoch', None)
+target.pop('stop_reason', None)
+target.pop('resumable', None)
 target['last_visible_status'] = f'resumed conversation {uuid} at {now}'
 target.setdefault('pane', {})
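
The two behaviours this patch changes can be sketched in isolation as plain Python. This is a minimal illustration, not the scripts themselves: the function names `rows_to_auto_terminate` and `mark_resumed` and the two constant tuples are hypothetical, and the row layout is assumed to be a list of dicts. Only the status values and field names are taken from the hunks above.

```python
# Hypothetical stand-alone sketch; helper names are illustrative only.
DELIBERATE_END_STATES = ('terminated', 'archived', 'stopped')
STALE_END_FIELDS = (
    'terminated_at', 'terminated_at_epoch', 'termination_mode', 'archived_at',
    'stopped_at', 'stopped_at_epoch', 'stop_reason', 'resumable',
)


def rows_to_auto_terminate(sessions, alive_set):
    """Return the names of rows that drift class A would mark terminated."""
    drifted = []
    for s in sessions:
        name = s.get('name')
        if not name:
            continue
        # Rows already in a deliberate end state are never touched.
        if s.get('status') in DELIBERATE_END_STATES:
            continue
        srv = s.get('tmux_server') or 'default'
        if (name, srv) not in alive_set:
            drifted.append(name)
    return drifted


def mark_resumed(row):
    """Move a row back to running and drop every end-of-session field."""
    for key in STALE_END_FIELDS:
        row.pop(key, None)
    row['status'] = 'running'
    return row


sessions = [
    {'name': 'a', 'status': 'stopped', 'resumable': True,
     'stopped_at': 't0', 'claude_session_id_own': 'id-1'},
    {'name': 'b', 'status': 'running'},
]
# Neither tmux session is alive; only the running row counts as drift.
drifted = rows_to_auto_terminate(sessions, alive_set=set())
resumed = mark_resumed(dict(sessions[0]))
```

In this sketch the stopped row `a` survives reconciliation with its captured id intact, and resuming it removes `stopped_at` and `resumable` while leaving `claude_session_id_own` in place.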