Files
multi-agent-mux/review-brief-FW-L2-L3-v1.md
T

3.9 KiB

Review Brief: FW-L3 & FW-L2 Improvements (v2)

We have implemented two long-term tasks from FUTURE_WORKS.md: FW-L3 (SQLite Database Normalization) and FW-L2 (Stop Semantics Simplification), including the migration safety improvements identified in the first review round.

1. FW-L3: SQLite Database Normalization

  • Goal: Transition from storing the entire JSON state as a single blob in state (id=1) table to a normalized table structure (sessions table) to support O(1) status queries, while maintaining compatibility with the existing YAML synchronization workflow.
  • Implementation:
    • In skills/lib.sh:
      • Updated atomic_dump_yaml to create and maintain:
        • state (id=1, data TEXT) table (holds global metadata such as agent_identities, with the tmux_sessions key removed).
        • sessions (name TEXT PRIMARY KEY, status TEXT, pane_cwd TEXT, data JSON) table (each row holds a single session entry).
        • Added index idx_sessions_pane_cwd on sessions(pane_cwd) for faster lookups.
      • Inside atomic_dump_yaml, before executing caller mutations, the complete dictionary d is seamlessly reconstructed from both state and sessions tables to guarantee that existing mutations still run perfectly without any modification.
      • Updated resolve_tmux_server, find_workspace_uuid, and is_already_stopped to run optimized O(1) SELECT queries directly on the normalized database table when it exists.
      • Migration Fallback: Added comprehensive safety fallbacks: if sessions table does not exist yet (OperationalError) or returns no results, the reader functions fall back to querying the old state table's JSON blob. This guarantees zero degradation during the migration window when readers execute before the first write.
    • In status.sh and reconcile.sh:
      • Adjusted the read-only DB loading logic to pull and reconstruct the d['tmux_sessions'] list from the sessions table.

2. FW-L2: Stop Semantics Simplification

  • Goal: Deprecate confusing --mode soft|hard, --capture-id, and --graceful flags. Make graceful shutdown and metadata capture the standard default behavior. Clarify the destructive --purge-conversation option.
  • Implementation:
    • In skills/tmux-agent-orchestrate-stop/scripts/stop_session.sh:
      • Deprecated --mode, --capture-id, and --graceful arguments. Passing these flags now raises an error informing the user that they are deprecated.
      • Default behavior is now equivalent to the previous stop mode: it gracefully exits the agent TUI, shuts down tmux, captures conversation IDs, and updates status to stopped (instead of terminated).
      • Added custom reasons via --reason (still defaults to manual_stop).
      • --purge-conversation is retained as a destructive option to purge conversation databases and JSONLs from disk. When purged, status transitions to terminated and resumable is set to False.
    • In skills/tmux-agent-orchestrate-stop/SKILL.md:
      • Re-wrote the stop documentation, removed deprecated options, and aligned with the new semantics.
    • Stale Documentation Cleanup:
      • Cleaned up outdated references to --capture-id/--graceful in resume/SKILL.md and monitor/SKILL.md.

Verification Checklist for Reviewers

  1. Does the SQLite schema creation/modification in lib.sh preserve concurrency safety (e.g. WAL mode, BEGIN IMMEDIATE, commit/rollback)?
  2. Do the O(1) optimizations in lib.sh (resolve_tmux_server, find_workspace_uuid, is_already_stopped) fallback safely to YAML/state-blob if the SQLite DB is missing or in old schema format?
  3. Are the stop options properly simplified in stop_session.sh, and does the default behavior work cleanly with the database/YAML update flow?
  4. Are there any edge cases where reconcile.sh or status.sh might fail when DB is newly initialized?

Please perform a code review on these changes and reply with either a detailed feedback/corrections or a PASS.