refactor: rename skills from tmux-agent-orchestrate-* to multi-agent-mux-* in backplane scripts and documents

2026-06-22 15:58:48 +09:00
parent ee48d77d0a
commit c721d1cd86
32 changed files with 215 additions and 215 deletions
@@ -0,0 +1,136 @@
---
name: multi-agent-mux-stop
description: "Stop an agent tmux session (claude, antigravity/agy) and update .mam/agent-sessions.yaml. Default stops gracefully and marks status=stopped with conversation preserved for resume. Does NOT delete on-disk conversation artifacts (jsonl/db) — those are preserved unless --purge-conversation is passed. Use when ending a work session, switching to a different one, or cleaning up before a fresh start."
version: 1.0.0
author: godopu
license: MIT
platforms: [linux, macos]
environments: [terminal, tmux]
metadata:
hermes:
tags: [agent, tmux, claude, antigravity, agy, multi-agent, stop, terminate, cleanup]
related_skills: [multi-agent-mux-create, multi-agent-mux-resume, multi-agent-mux-monitor]
prereq_skills: [multi-agent-mux-create, multi-agent-mux-resume]
---
# Multi-Agent Stop — Stop an Agent tmux Session
> **Companion skills**: `multi-agent-mux-create` (start), `multi-agent-mux-resume` (re-attach), `multi-agent-mux-monitor` (live status).
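> **Tmux Isolation**: The `stop` command automatically parses the `tmux_server` field from the YAML and safely terminates (kills) the session on that isolated server, so there is no need to set the `TMUX_SERVER_NAME` environment variable manually.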
> **Single source of truth**: `./.mam/agent-sessions.yaml`.
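As a rough illustration of the isolation lookup described above, the sketch below reads a session's `tmux_server` field from an already-parsed sessions document and builds a tmux argument list for that server. The Python helper names, the sample values, and the use of tmux's `-L` socket-name flag to address the isolated server are assumptions made for illustration; the real `stop` command may resolve and address the server differently.

```python
from typing import List, Optional


def resolve_tmux_server(doc: dict, session_name: str) -> Optional[str]:
    """Return the tmux_server recorded for session_name, or None if absent."""
    for row in doc.get("tmux_sessions") or []:
        if row.get("name") == session_name:
            return row.get("tmux_server") or None
    return None


def tmux_argv(server: Optional[str], *args: str) -> List[str]:
    # Assumption: an isolated server is addressed with `tmux -L <socket-name>`.
    prefix = ["tmux", "-L", server] if server else ["tmux"]
    return prefix + list(args)


# Hypothetical document shape, loosely following .mam/agent-sessions.yaml.
doc = {"tmux_sessions": [{"name": "demo-creator-claude", "tmux_server": "mam-demo"}]}
server = resolve_tmux_server(doc, "demo-creator-claude")
print(tmux_argv(server, "kill-session", "-t", "demo-creator-claude"))
```

A session without a recorded `tmux_server` falls back to the default server in this sketch.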
## What this skill does
Stop an agent's tmux session gracefully, resolve and store the conversation ID, and **mark the YAML entry (status=stopped)**. Preserves:
- The tmux session's recorded `pane.pid / cmd / cwd / mcp_attachments` for audit
- The agent's on-disk conversation (claude `*.jsonl`, agy `conversations/*.db`) — so the user can `multi-agent-mux-resume` later
- The `start_command` so a future `multi-agent-mux-create --session <name>` reproduces the same tmux spec
The stop command is always **graceful by default**. Before killing anything, it auto-captures the conversation ID into the row (`claude_session_id_own`/`agy_conversation_id_own`), so the next resume uses a race-free tier-1 lookup. It then escalates:
1. Sends exit keys to the agent TUI (`/exit` for Claude, `Exit` for Agy) and waits 3 seconds.
2. If still alive, issues `tmux kill-session` (SIGTERM) and waits 5 seconds.
3. If still alive, kills the pane PID via SIGKILL (`kill -9`) as a last resort.
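The escalation chain above can be sketched in a few lines. This is a minimal model, not the script itself: the four callables stand in for the tmux and `kill` commands and are hypothetical names, and the waits are injectable so the demo does not really sleep.

```python
import time
from typing import Callable


def graceful_stop(
    is_alive: Callable[[], bool],
    send_exit: Callable[[], None],
    kill_session: Callable[[], None],
    kill_pid: Callable[[], None],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Escalate exit keys -> kill-session -> SIGKILL; return the step that ended it."""
    send_exit()
    sleep(3)  # give the TUI time to exit on its own
    if not is_alive():
        return "exit_keys"
    kill_session()
    sleep(5)  # give tmux time to tear the session down
    if not is_alive():
        return "kill_session"
    kill_pid()  # last resort
    return "sigkill"


# Demo with fakes: the TUI ignores the exit keys, kill-session succeeds.
state = {"alive": True}
result = graceful_stop(
    is_alive=lambda: state["alive"],
    send_exit=lambda: None,
    kill_session=lambda: state.update(alive=False),
    kill_pid=lambda: None,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(result)  # kill_session
```

Passing the side effects in as callables keeps the ordering logic testable without a running tmux server.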
## Pre-flight
```bash
SESSION_NAME=<workspace>-creator-<agent> # convention
AGENT_SESSIONS_YAML=.mam/agent-sessions.yaml
# 1) Session is registered?
python3 -c "
import yaml
d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
names = [s['name'] for s in d.get('tmux_sessions', [])]
if '$SESSION_NAME' not in names:
    print('NOT in YAML: refusing to stop (no audit trail). Use multi-agent-mux-create first.')
    raise SystemExit(1)
"
# 2) Already stopped?
ALREADY=$(python3 -c "
import yaml
d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
s = [x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0]
print(s.get('status', 'unknown'))
")
if [ "$ALREADY" = "stopped" ]; then
echo "Already stopped."
fi
```
## Workflow
```bash
# 1. Stop gracefully (default — captures ID, shuts down safely, status=stopped)
bash .agents/skills/multi-agent-mux-stop/scripts/stop_session.sh \
--session "$SESSION_NAME"
# 2. Stop gracefully + record a custom stop reason
bash .agents/skills/multi-agent-mux-stop/scripts/stop_session.sh \
--session "$SESSION_NAME" --reason api_error
# 3. Stop gracefully + clean up on-disk conversation (DANGEROUS)
# — this prevents any future resume (status=terminated, resumable=false).
bash .agents/skills/multi-agent-mux-stop/scripts/stop_session.sh \
--session "$SESSION_NAME" --purge-conversation
```
**Idempotency**: if the row is already `status: stopped`, the script prints `already stopped (...)` and exits 0 — re-running is a safe no-op.
### State machine
```
running ──(stop default / --reason)────────► stopped (resumable:true, conv preserved)
running ──(stop --purge-conversation --yes)► terminated (resumable:false, conv deleted)
stopped ──(stop default … again)───────────► stopped (idempotent no-op)
```
Fields written in STOP mode: `status: stopped`, `stopped_at`, `stopped_at_epoch`, `stop_reason`, `termination_mode: graceful`, `claude_session_id_own`/`agy_conversation_id_own` and `resumable: true`.
If `--purge-conversation` is used: `status: terminated`, `terminated_at`, `terminated_at_epoch`, `termination_mode: purge` and `resumable: false`.
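The transitions and field sets above can be modeled as a pure function over a row dict. This is a sketch under stated assumptions: `apply_stop` is a hypothetical name, the row is a plain dict, and the conversation ID fields are left out for brevity. The early return mirrors the script, which checks for an already-stopped row before it looks at the purge flag.

```python
from datetime import datetime, timezone


def apply_stop(row: dict, *, now: datetime, purge: bool = False,
               reason: str = "manual_stop") -> dict:
    """Return a copy of row with the stop or purge fields written."""
    updated = dict(row)
    if row.get("status") == "stopped":
        return updated  # idempotent no-op
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    epoch = int(now.timestamp())
    if purge:
        updated.update(status="terminated", terminated_at=stamp,
                       terminated_at_epoch=epoch, termination_mode="purge",
                       resumable=False)
    else:
        updated.update(status="stopped", stopped_at=stamp,
                       stopped_at_epoch=epoch, stop_reason=reason,
                       termination_mode="graceful", resumable=True)
    return updated


now = datetime(2026, 6, 22, 6, 58, 48, tzinfo=timezone.utc)
running = {"name": "demo-creator-claude", "status": "running"}
stopped = apply_stop(running, now=now, reason="api_error")
purged = apply_stop(running, now=now, purge=True)
print(stopped["status"], stopped["stop_reason"], stopped["resumable"])
print(purged["status"], purged["termination_mode"], purged["resumable"])
```

Calling `apply_stop` again on `stopped` returns it unchanged, which corresponds to the third arrow in the diagram.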
The script:
1. Verifies the session is in agent-sessions.yaml
2. If `delegate_job_id` is set, automatically publishes a `progress --detail "terminating"` event to the multi-agent-mux-delegate-job registry
3. Captures the `last_visible_status` from `tmux capture-pane` (so we have a final TUI snapshot for audit)
4. Attempts graceful exit keys → SIGTERM kill-session → SIGKILL fallback
5. For `--purge-conversation`: deletes `~/.claude/projects/.../jsonl` (claude) or `~/.gemini/antigravity-cli/conversations/...db` + `brain/...` (agy)
6. Updates the YAML entry and SQLite database atomically
7. If `delegate_job_id` is set, publishes a `completed` event to the multi-agent-mux-delegate-job registry
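For the purge step, the sketch below computes which on-disk paths would be targeted for a given workspace and conversation ID, without deleting anything. It follows the path construction used in `stop_session.sh` (slashes and underscores in the workspace path become hyphens for the Claude project key). The function name and the sample values are hypothetical, and whether these paths match each agent's actual on-disk layout is an assumption carried over from the script.

```python
from typing import List, Optional


def artifact_paths(agent: str, home: str, workspace_cwd: str, conv_id: str,
                   claude_project_dir: Optional[str] = None) -> List[str]:
    """Return the conversation artifact paths a purge would target."""
    if agent == "claude":
        # Project key: '/' and '_' in the workspace path become '-'.
        key = workspace_cwd.replace("/", "-").replace("_", "-")
        base = claude_project_dir or f"{home}/.claude/projects"
        return [f"{base}/{key}/{conv_id}.jsonl"]
    if agent == "agy":
        root = f"{home}/.gemini/antigravity-cli"
        return [f"{root}/conversations/{conv_id}.db", f"{root}/brain/{conv_id}"]
    raise ValueError(f"unsupported agent: {agent!r}")


# Hypothetical home directory, workspace, and conversation ID.
for path in artifact_paths("claude", "/home/u", "/work/my_app", "abc"):
    print(path)
for path in artifact_paths("agy", "/home/u", "/work/my_app", "abc"):
    print(path)
```

Listing the paths first makes it easy to show the user exactly what a purge would remove before asking for `--yes`.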
## Pitfalls
- **Don't delete on-disk artifacts by default** — the agent's `*.jsonl` / `conversations/*.db` is the data that `multi-agent-mux-resume` needs. `--purge-conversation` is for when the user is genuinely done with the conversation and wants zero recovery chance.
- **YAML is append-only until you write a stop** — if a previous run left the entry as `running` but tmux is actually dead (crash, host reboot), the YAML is stale. Running `multi-agent-mux-stop` will detect "tmux already dead, just update YAML" and proceed.
- **Don't delete the `claude_session_id_own: null` placeholder** — when the user creates a fresh session with `multi-agent-mux-create` and never sent a message, the entry has `claude_session_id_own: null`. Stopping must preserve that field.
- **Monitor skill may still be tracking** — if `multi-agent-mux-monitor` is running a heartbeat loop, stopping a session while it watches will trigger its `tmux ls != yaml` reconciliation. That's expected — let the monitor run, it will mark the entry as `terminated` on its own.
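The stale-entry and reconciliation pitfalls above both come down to comparing the YAML rows with the sessions tmux actually reports. A minimal sketch of that comparison, assuming the live session names have already been collected (for example from `tmux ls`); `find_stale_rows` is a hypothetical helper, not part of the skill:

```python
from typing import Iterable, List


def find_stale_rows(doc: dict, live_sessions: Iterable[str]) -> List[str]:
    """Names of rows still marked running whose tmux session no longer exists."""
    live = set(live_sessions)
    return [
        row.get("name")
        for row in doc.get("tmux_sessions") or []
        if row.get("status") == "running" and row.get("name") not in live
    ]


# Hypothetical rows: one healthy, one crashed, one already stopped.
doc = {
    "tmux_sessions": [
        {"name": "a-creator-claude", "status": "running"},
        {"name": "b-creator-agy", "status": "running"},
        {"name": "c-creator-claude", "status": "stopped"},
    ]
}
print(find_stale_rows(doc, ["a-creator-claude"]))  # ['b-creator-agy']
```

Rows returned here are the ones where a stop only needs to update the YAML, because there is no tmux session left to kill.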
## Verification
```bash
# 1. tmux gone
tmux has-session -t "$SESSION_NAME" 2>/dev/null && echo "STILL ALIVE" || echo "OK: tmux gone"
# 2. YAML has stopped entry
python3 -c "
import yaml
d = yaml.safe_load(open('$AGENT_SESSIONS_YAML'))
s = [x for x in d['tmux_sessions'] if x['name']=='$SESSION_NAME'][0]
assert s['status'] == 'stopped', f'expected stopped, got {s[\"status\"]}'
assert s.get('stopped_at'), 'missing stopped_at'
print(f'OK: stopped at {s[\"stopped_at\"]}')
print(f' preserved: pane.pid={s[\"pane\"][\"pid\"]}, cmd={s[\"pane\"][\"cmd\"]}, cwd={s[\"pane\"][\"cwd\"]}')
"
# 3. (if --purge-conversation) disk artifacts gone
[ -f "${CLAUDE_PROJECT_DIR:-$HOME/.claude/projects}/<projkey>/<uuid>.jsonl" ] && echo "WARN: jsonl still exists" || echo "OK: jsonl purged"
```
## When NOT to use this skill
- **Just detaching** → `tmux detach` (Ctrl-B d) or just close the terminal. The tmux session keeps running.
- **Stopping the agent inside but keeping tmux** → send `Ctrl-C` or `/exit` (claude) / `Ctrl-D` (agy) via `tmux send-keys`. The tmux session stays but the agent process is gone.
- **Replacing an existing session with a new one** → `multi-agent-mux-stop` first, then `multi-agent-mux-create`.
@@ -0,0 +1,341 @@
#!/usr/bin/env bash
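# stop_session.sh: helper script for multi-agent-mux-stop
# Usage:
# bash stop_session.sh --session <name> [--agent claude|agy] \
# [--purge-conversation] [--yes] [--reason <reason>]
#
# Stop now always runs gracefully and captures the conversation id. The legacy
# flags --mode, --capture-id, and --graceful are deprecated and rejected with
# exit code 2; the notes below record their former meaning.
#
# mode (legacy):
# soft: mark the YAML row status=archived and leave the tmux session alone
# (P1-A: terminated is used only when tmux is actually dead)
# hard: tmux kill-session + YAML status=terminated
# --purge-conversation: originally valid only with --mode hard. Deletes only the
# conversation artifacts *isolated to the workspace* of the target session
# (P0-C). Does not consult the global agent_identities. Resume becomes impossible.
#
# Stop extension (Option A: extend stop to absorb the stop semantics instead of
# adding a sixth skill):
# --capture-id: just before the kill, record this workspace's conversation id in
# the row (claude_session_id_own / agy_conversation_id_own) so the
# next resume restores via tier-1 (race-free). Reuses
# find_workspace_uuid (per-row -> workspace-scoped disk scan -> cache).
# --reason R: reason for the state transition (stop_reason). Default: manual_stop.
# --graceful: instead of an immediate kill-session, request a clean exit with
# send-keys, wait 3 s, kill-session (SIGTERM) if still alive, wait 5 s,
# then SIGKILL.
# Originally, passing any of these three options selected STOP mode: status moves
# to stopped rather than terminated (running -> stopped). Idempotent: if already
# stopped, no-op + exit 0. Without them, the legacy hard/soft behavior applied
# (backward compatible). STOP mode is now always on.
#
# Exit codes:
# 0 = success (or already-stopped no-op) | 1 = YAML not found / not registered
# 2 = invalid args | 3 = interactive confirmation required (--yes missing)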
set -euo pipefail
source "$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)/lib.sh"
usage() {
cat <<EOF
Usage: $0 --session <name> [--agent claude|agy] [--purge-conversation] [--yes] [--reason <reason>]
Stop arguments:
--reason <reason> — stop_reason field (default: manual_stop)
(idempotent: stopping an already-stopped session is a no-op with exit 0)
EOF
}
SESSION_NAME=""
AGENT=""
PURGE=0
YES=0
CAPTURE_ID=1
GRACEFUL=1
REASON="manual_stop"
STOP_MODE=1
while [ $# -gt 0 ]; do
case "$1" in
--session) SESSION_NAME="$2"; shift 2 ;;
--agent) AGENT="$2"; shift 2 ;;
--purge-conversation) PURGE=1; shift ;;
--yes) YES=1; shift ;;
--reason) REASON="$2"; shift 2 ;;
--mode|--capture-id|--graceful)
echo "ERROR: $1 option is deprecated. Stop now always stops gracefully and captures IDs." >&2
exit 2
;;
-h|--help) usage; exit 0 ;;
*) echo "ERROR: unknown arg: $1" >&2; usage; exit 2 ;;
esac
done
[ -n "$SESSION_NAME" ] || { echo "ERROR: --session required" >&2; usage; exit 2; }
[ -f "$AGENT_SESSIONS_YAML" ] || { echo "ERROR: $AGENT_SESSIONS_YAML not found" >&2; exit 1; }
export TMUX_SERVER_NAME="$(resolve_tmux_server "$SESSION_NAME")"
# If --agent is not given, fall back to the session-name suffix (P1-F)
if [ -z "$AGENT" ]; then
case "$SESSION_NAME" in
*-creator-claude) AGENT=claude ;;
*-creator-agy) AGENT=agy ;;
*-creator-hermes) AGENT=hermes ;;
*) echo "ERROR: cannot infer agent from '$SESSION_NAME'; pass --agent" >&2; exit 2 ;;
esac
fi
# Check that the session is in the YAML, and extract that row's workspace cwd and delegate_job_id.
# Emitted as JSON, so a '|' inside cwd is safe (review item 7; replaces the old cwd|jid parser).
MAPPED_DATA=$(env_python "$AGENT_SESSIONS_YAML" SESSION_NAME="$SESSION_NAME" <<'PYEOF'
import os, sys, json, yaml, sqlite3
name = os.environ['SESSION_NAME']
yaml_path = os.environ['YAML_PATH']
db_path = os.path.splitext(yaml_path)[0] + '.db'
d = {}
try:
    if os.path.exists(db_path):
        conn = sqlite3.connect(db_path, timeout=10.0)
        try:
            row = conn.execute('SELECT data FROM sessions WHERE name=?', (name,)).fetchone()
            if row:
                s = json.loads(row[0])
                cwd = (s.get('pane') or {}).get('cwd', '')
                jid = s.get('delegate_job_id', '') or ''
                print(json.dumps({"cwd": cwd, "job_id": jid}))
                raise SystemExit(0)
        except sqlite3.OperationalError:
            pass
        row = conn.execute('SELECT data FROM state WHERE id=1').fetchone()
        if row:
            d = json.loads(row[0])
        conn.close()
    elif os.path.exists(yaml_path):
        with open(yaml_path) as f:
            d = yaml.safe_load(f) or {}
except Exception:
    pass
for s in d.get('tmux_sessions', []):
    if s.get('name') == name:
        cwd = (s.get('pane') or {}).get('cwd', '')
        jid = s.get('delegate_job_id', '') or ''
        print(json.dumps({"cwd": cwd, "job_id": jid}))
        raise SystemExit(0)
raise SystemExit(7)
PYEOF
) || {
echo "ERROR: session '$SESSION_NAME' not in $AGENT_SESSIONS_YAML" >&2
exit 1
}
TARGET_CWD=$(printf '%s' "$MAPPED_DATA" | python3 -c 'import sys,json; print(json.load(sys.stdin).get("cwd",""))')
DELEGATE_JOB_ID=$(printf '%s' "$MAPPED_DATA" | python3 -c 'import sys,json; print(json.load(sys.stdin).get("job_id",""))')
# Idempotency: in STOP mode, an already-stopped session is a no-op + exit 0
if [ "$STOP_MODE" = "1" ]; then
if STOPPED_INFO=$(is_already_stopped "$SESSION_NAME"); then
echo "already stopped (status=stopped, $STOPPED_INFO) — no-op"
exit 0
fi
fi
# Confirm purge
if [ "$PURGE" = "1" ] && [ "$YES" != "1" ]; then
echo "DANGER: --purge-conversation will DELETE this workspace's on-disk conversation."
echo " workspace: ${TARGET_CWD:-<unknown>}"
echo " This means: no future multi-agent-mux-resume for this session."
echo " Re-run with --yes to confirm."
exit 3
fi
# Resolve the UUID to purge with workspace isolation (P0-C: no global lookups)
PURGE_UUID=""
if [ "$PURGE" = "1" ] && [ -n "$TARGET_CWD" ]; then
PURGE_UUID=$(find_workspace_uuid "$TARGET_CWD" "$AGENT" || true)
fi
NOW_ISO=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
NOW_EPOCH=$(date +%s)
# tmux state + last TUI snapshot (only while alive; capture-pane content is passed via env only)
TMUX_ALIVE=0
LAST_STATUS=""
if tmux has-session -t "$SESSION_NAME" 2>/dev/null; then
TMUX_ALIVE=1
LAST_STATUS=$(tmux capture-pane -t "$SESSION_NAME" -p -S -10 2>/dev/null | tr '\n' ' ' | head -c 500 || true)
fi
# --capture-id: resolve the conversation id just before the kill (while the process/jsonl is still alive).
# find_workspace_uuid tries tier-1 (row) -> tier-2 (workspace-scoped disk scan) -> tier-3 (cache)
# on its own, so this works whether or not tmux is alive.
CAPTURED_UUID=""
if [ "$CAPTURE_ID" = "1" ] && [ -n "$TARGET_CWD" ]; then
CAPTURED_UUID=$(capture_conversation_id "$AGENT" "$TARGET_CWD" || true)
if [ -n "$CAPTURED_UUID" ]; then
echo "captured conversation id: $CAPTURED_UUID"
else
echo "WARN: --capture-id requested but no conversation id resolved (nothing on disk yet)"
fi
fi
delegate_publish_event "$DELEGATE_JOB_ID" progress "terminating"
# --graceful: request a clean exit with send-keys, then the fallback chain (SIGTERM -> SIGKILL).
graceful_stop() {
local pane_pid exitkey
pane_pid=$(tmux list-panes -t "$SESSION_NAME" -F '#{pane_pid}' 2>/dev/null | head -1 || true)
case "$AGENT" in
claude) exitkey="/exit" ;;
agy) exitkey="Exit" ;;
hermes) exitkey="/exit" ;;
*) exitkey="/exit" ;;
esac
echo "graceful: send-keys '$exitkey' to $SESSION_NAME"
tmux send-keys -t "$SESSION_NAME" "$exitkey" Enter 2>/dev/null || true
sleep 3
if ! tmux has-session -t "$SESSION_NAME" 2>/dev/null; then
echo "graceful: exited cleanly"
return 0
fi
echo "graceful: still alive → kill-session (SIGTERM)"
tmux kill-session -t "$SESSION_NAME" 2>/dev/null || true
sleep 5
if ! tmux has-session -t "$SESSION_NAME" 2>/dev/null; then
echo "graceful: terminated after kill-session"
return 0
fi
echo "graceful: STILL alive → SIGKILL fallback (pane pid $pane_pid)"
[ -n "$pane_pid" ] && kill -9 "$pane_pid" 2>/dev/null || true
}
# Stop tmux: fallback chain when graceful, otherwise the legacy hard kill.
if [ "$GRACEFUL" = "1" ] && [ "$TMUX_ALIVE" = "1" ]; then
graceful_stop
elif [ "$TMUX_ALIVE" = "1" ]; then
tmux kill-session -t "$SESSION_NAME"
echo "killed tmux: $SESSION_NAME"
else
echo "tmux already dead, just updating YAML"
fi
atomic_dump_yaml "$AGENT_SESSIONS_YAML" \
SESSION_NAME="$SESSION_NAME" AGENT="$AGENT" PURGE="$PURGE" \
NOW_ISO="$NOW_ISO" NOW_EPOCH="$NOW_EPOCH" LAST_STATUS="$LAST_STATUS" \
PURGE_UUID="$PURGE_UUID" TARGET_CWD="$TARGET_CWD" \
REASON="$REASON" CAPTURED_UUID="$CAPTURED_UUID" <<'PYEOF'
import shutil
name = os.environ['SESSION_NAME']
agent = os.environ['AGENT']
purge = os.environ['PURGE'] == '1'
now = os.environ['NOW_ISO']
home = os.environ['HOME_DIR']
last_status = os.environ.get('LAST_STATUS', '')
purge_uuid = os.environ.get('PURGE_UUID', '').strip()
ws = os.environ.get('TARGET_CWD', '')
reason = os.environ.get('REASON', '') or 'manual_stop'
captured = os.environ.get('CAPTURED_UUID', '').strip()
target = None
for s in d.get('tmux_sessions', []):
    if s.get('name') == name:
        target = s
        break
if target is None:
    print(f"ERROR: disappeared during script: {name}", flush=True)
    raise SystemExit(1)
if purge:
    target['status'] = 'terminated'
    target['terminated_at'] = now
    target['terminated_at_epoch'] = int(os.environ['NOW_EPOCH'])
    target['termination_mode'] = 'purge'
else:
    target['status'] = 'stopped'
    target['stopped_at'] = now
    target['stopped_at_epoch'] = int(os.environ['NOW_EPOCH'])
    target['stop_reason'] = reason
    target['termination_mode'] = 'graceful'
if last_status:
    target['last_visible_status_at_termination'] = last_status
# --capture-id: always record the captured UUID (only when not purging)
if captured and not purge:
    if agent == 'claude':
        target['claude_session_id_own'] = captured
    elif agent == 'agy':
        target['agy_conversation_id_own'] = captured
    elif agent == 'hermes':
        target['hermes_conversation_id_own'] = captured
target['resumable'] = True
# --purge-conversation: delete only the on-disk artifacts of the workspace-isolated UUID (P0-C)
if purge and purge_uuid:
    if agent == 'claude':
        key = ws.replace('/', '-').replace('_', '-')
        claude_project_dir = os.environ.get('CLAUDE_PROJECT_DIR', f"{home}/.claude/projects")
        jsonl = f"{claude_project_dir}/{key}/{purge_uuid}.jsonl"
        if os.path.exists(jsonl):
            os.remove(jsonl)
            print(f"purged: {jsonl}", flush=True)
        target['claude_session_id_own'] = None
    elif agent == 'agy':
        db = f"{home}/.gemini/antigravity-cli/conversations/{purge_uuid}.db"
        if os.path.exists(db):
            os.remove(db)
            print(f"purged: {db}", flush=True)
        brain = f"{home}/.gemini/antigravity-cli/brain/{purge_uuid}"
        if os.path.isdir(brain):
            shutil.rmtree(brain)
            print(f"purged: {brain}", flush=True)
        target['agy_conversation_id_own'] = None
    elif agent == 'hermes':
        json_file = f"{home}/.mam/sessions/session_{purge_uuid}.json"
        if os.path.exists(json_file):
            os.remove(json_file)
            print(f"purged: {json_file}", flush=True)
        hdb = f"{home}/.mam/state.db"
        if os.path.exists(hdb):
            try:
                import sqlite3
                conn = sqlite3.connect(hdb)
                conn.execute("DELETE FROM sessions WHERE id=?", (purge_uuid,))
                conn.execute("DELETE FROM messages WHERE session_id=?", (purge_uuid,))
                conn.commit()
                conn.close()
                print(f"purged db records for session: {purge_uuid}", flush=True)
            except Exception as e:
                print(f"WARN: purge hermes db records failed: {e}", flush=True)
        target['hermes_conversation_id_own'] = None
    # agent_identities is a cache; clear it only when it belongs to this workspace
    ai = (d.get('agent_identities') or {}).get(agent) or {}
    if ai.get('project_cwd') == ws:
        if agent == 'claude' and ai.get('session_id') == purge_uuid:
            ai['session_id'] = None
            ai['session_jsonl'] = None
            ai.pop('session_size_bytes', None)
            ai.pop('session_lines', None)
        elif agent == 'agy' and ai.get('conversation_id') == purge_uuid:
            ai['conversation_id'] = None
            ai['conversation_db'] = None
            ai['conversation_brain_dir'] = None
        elif agent == 'hermes' and ai.get('session_id') == purge_uuid:
            ai['session_id'] = None
elif purge and not purge_uuid:
    print("WARN: --purge-conversation requested but no workspace-scoped UUID resolved; nothing purged", flush=True)
if purge:
    target['resumable'] = False
print(f"updated: {name} status={target['status']}", flush=True)
PYEOF
delegate_publish_event "$DELEGATE_JOB_ID" completed "session terminated"
echo
echo "=== stop complete ==="
echo " session: $SESSION_NAME"
echo " agent: $AGENT"
echo " reason: $REASON"
echo " captured: ${CAPTURED_UUID:-<none>}"
echo " purge: $PURGE${PURGE_UUID:+ (uuid $PURGE_UUID)}"
echo " time: $NOW_ISO"
echo
echo "Recovery: use multi-agent-mux-create + multi-agent-mux-resume to restore the same context"
echo " (not possible if --purge-conversation was used)"