docs: clean up stale create_session usage instructions in comments and markdown examples

feat: enforce required agent roles at creation and role immutability in registry
refactor: adapt multi-agent-mux skills and agent guidelines for the Team Leader scenario
2026-06-28 10:31:58 +09:00 · 2026-06-28 10:27:36 +09:00 · 2026-06-28 10:21:24 +09:00 · 2026-06-28 09:34:52 +09:00 · 2026-06-28 09:17:11 +09:00 · 2026-06-27 08:28:47 +09:00
24 changed files with 1093 additions and 577 deletions
@@ -10,21 +10,28 @@

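 We clearly separate responsibilities and permissions between roles to reduce bottlenecks and raise the quality of the work.

-### 👤 Project Manager (PM / Orchestrator)
- **Core Responsibility**: Receive user requirements, establish detailed task plans, assign and instruct workers, control the overall workflow, and report final results.
+### 👑 General Manager
+- **Core Responsibility**: Communicate directly with the user to receive requirements, establish detailed task plans, assign Team Leader agents and delegate work, control the overall workflow, and report final completion.
 - **Ambiguity Resolution**: If any part of the user's requirements is ambiguous, do not proceed on guesswork; ask the user immediately for clarification (the `/grill-me` slash command is recommended).
- **Feedback Loop Adjustment**: Analyze the Reviewers' verification feedback and decide on the direction for improvement. For technical problems that are hard to decide, have Workers and Reviewers investigate, add the PM's own judgment, and present a final report to the user to settle the project's direction.
- **Self-Healing (Hermes Fallback Fix)**: If a defect raised by a Reviewer is very minor or is a simple typo or missing configuration, the PM fixes the source code directly instead of reassigning it to the Worker, minimizing the round-trip cost.

-### 🛠️ Worker (Implementation Agent)
- **Core Responsibility**: Design the specific business logic and implement the source code delegated by the PM.
- **Collaboration & Communication**: If the implementation direction is ambiguous or an interface design change is needed within the assigned scope, ask the PM and reach agreement before applying surgical changes.
- **Contract Adherence**: Follow the single task brief and the unique Job ID convention provided by the PM, and publish a `started` event to the backplane when work begins and a `completed`/`error` event when it ends.
+### 👥 Team Leaders
+Newly created agents (`antigravity`, `claude`, `cline`, `hermes`, etc.) each act as the **Team Leader** of their team. They receive delegated work from the General Manager and lead the development or review workflow.
+- **Developer Team Leader**:
+  - Receives delegated work from the General Manager.
+  - **Task Analysis & Planning**: Thoroughly analyzes the given task, breaks the problem into small units, and draws up a detailed plan.
+  - **Internal Parallelism**: May use subagents internally to process the delegated work in parallel.
+  - **Review Validation & Refusal**: Carefully examines the feedback raised by reviewers. Valid suggestions are accepted and the code is modified; items judged invalid are not applied, and **a clear explanation of the reason is written and returned to the reviewer**.
+  - **Completion Signal**: Once every reviewer has issued a `PASS` and the changes are verified, the Developer Team Leader that originally received the task sends the final completion signal to the General Manager.
+- **Reviewer Team Leader**:
+  - Receives review requests from the Developer Team Leader.
+  - **Include Reasons and an Improvement Direction**: A bare rejection (`NOT PASS`) is forbidden. When raising an issue, the reviewer **must state the specific reason the problem occurs and a definite improvement direction (including a code alternative)**.
+  - **Consensus Loop**: Stays in the review loop until every issue raised is resolved and a final `PASS` is issued.

-### 🔍 Reviewer (Verification Agent)
- **Core Responsibility**: Verify the source code changes (Diff) and implementation spec submitted by the Worker; a supporting role that detects security defects, proposes performance improvements, and examines design consistency.
- **Provide Concrete Alternatives**: A bare rejection (`NOT PASS`) is forbidden; when raising an issue, the Reviewer **must also present a stable, verified, concrete code alternative (Alternative Code) or solution**.
- **Complementary Cross-Verification**: Reviews run in parallel and complement each other by drawing on each agent model's strengths (e.g., Flash-class models are good at catching semantic shell defects, while Opus/Sonnet-class models are good at API signatures and logical regression analysis).
+### 🛡️ Role Scope Compliance Principle (Role Suitability Check)
+- Every agent must perform only tasks that fit its assigned role. (e.g., the Developer Team Leader does not decide the final PASS, and the Reviewer Team Leader does not write project source code directly.)
+- **When instructed to do a task that does not fit its role**, the agent must:
+  1. Recommend the agent session best suited to the task so that it can be delegated there, or
+  2. Perform the task directly when it is strictly necessary for project continuity.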

 ---

@@ -59,34 +66,42 @@
 sequenceDiagram
    autonumber
    actor User as User
-    participant PM as Project Manager
-    participant W as Worker
-    participant R as Reviewers
+    participant GM as General Manager
+    participant DTL as Developer Team Leader
+    participant RTL as Reviewer Team Leaders
    participant M as MQTT Backplane

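-    User->>PM: Hand over requirements
-    Note over PM: grill-me and planning
-    PM->>M: Register Job and start Subscriber
-    PM->>W: Delegate task (pass Job ID & Brief)
-    W->>M: Publish 'started' event
-    Note over W: Modify code and implement
-    W->>M: Publish 'completed' (or 'error')
-    PM->>R: Request parallel review (pass Diff)
-    Note over R: Cross-analysis & verification
-    alt Defect found
-        R->>PM: NOT PASS (feedback with alternatives)
-        Note over PM: PM fixes minor defects directly
-        PM->>W: Apply feedback and reassign
+    User->>GM: Hand over requirements
+    GM->>DTL: Delegate task (e.g., build a landing page)
+    Note over DTL: Analyze and break down the task, run subagents in parallel
+    DTL->>M: Publish 'started' event
+    Note over DTL: Modify code and implement
+    DTL->>M: Publish 'completed'
+    DTL->>RTL: Request review (The landing page is built. Please review it)
+    Note over RTL: Cross-analysis & verification
+    alt Defect found (reviewer feedback)
+        RTL->>DTL: NOT PASS / feedback (must include the reason and a definite improvement direction)
+        Note over DTL: DTL checks the validity of the feedback
+        alt Valid feedback
+            Note over DTL: DTL accepts it and modifies the code
+        else Invalid feedback
+            DTL->>RTL: Send rebuttal and reasons for refusal (inappropriate items not applied)
+        end
+        DTL->>RTL: Request re-review (review items addressed)
    else Verification passed
-        R->>PM: PASS
+        RTL->>DTL: PASS
    end
-    PM->>User: Report final verification pass & commit
+    DTL->>GM: Send final completion signal
+    GM->>User: Notify the user that the task is complete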
 ```

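-1. **Planning and Allocation**: The PM refines the user request and defines jobs whose scopes have no overlapping dependencies.
-2. **Start and Notification**: The PM launches a subscriber and then assigns the job to a Worker session; the Worker carries out the logic and sends a terminal event, which closes the session automatically.
-3. **Cross-Verification Iteration (Review Loop)**: After the work is done, the PM circulates the changes to the Reviewer agents in parallel. The modify-reject cycle repeats indefinitely until every reviewer gives a `PASS`, which assures the quality of the code.
-4. **Release and Cleanup**: Verified code is committed to Git, and temporary session resources are reclaimed.
+1. **Planning and Allocation**: The General Manager assigns the task to the Developer Team Leader.
+2. **Analysis and Internal Execution**: The Developer Team Leader analyzes the task, breaks it down, and makes a plan, then runs internal subagents to complete the implementation. It publishes `started` and then `completed`, and asks the reviewer for a review.
+3. **Objection & Refinement Loop**:
+   - When giving detailed feedback, the Reviewer Team Leader must state the reason and a direction for improvement.
+   - The Developer Team Leader examines the feedback, fixes what is valid, and replies with a rebuttal and its grounds for what is not.
+   - This process repeats until every reviewer issues a `PASS`.
+4. **Final Report**: When the Developer Team Leader sends the completion signal to the General Manager, the General Manager notifies the user.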

 ---

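@@ -119,8 +134,8 @@ Because of screen-scroll limits, agents running in a TMUX environment
 - [ ] **Directory Convention**: Are the registry path (`.mam/jobs/`) and the logging path (`.mam/delegate_job_logs/`) registered in `.gitignore`?
 - [ ] **Scripts in Place**: Are the core modules such as `mqtt_common.py`, `publish_event.py`, `job_subscriber.py`, and `registry.py` deployed?
 - [ ] **HMAC Enablement**: When a new registry job is issued, is a random `auth_token` injected correctly, and is signature-based mutual authentication enabled?
- [ ] **Charter Placement**: Is this protocol file (`AGENT.md`) placed in the **top-level root directory** of the new project? (Root placement is essential so that collaborating agents see the rules first during onboarding.)
+- [ ] **Charter Placement**: Is this protocol file (`AGENT.md`) placed in the **.agents/ directory** of the new project? (Placing it under `.agents/` is recommended so that the project root stays clean while onboarding agents can still understand the rules.)

 ---

-*This guide is a norm for keeping a strict balance between collaboration efficiency and code security. If changes are needed, this document must be updated with the unanimous agreement of the PM and the Reviewers.*
+*This guide is a norm for keeping a strict balance between collaboration efficiency and code security. If changes are needed, this document must be updated with the agreement of the General Manager and all Team Leaders.*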
@@ -10,21 +10,28 @@ All agents working on a new project must read this document thoroughly and compl

 We clearly separate responsibilities and permissions between roles to reduce bottlenecks and enhance the quality of execution.

-### 👤 Project Manager (PM / Orchestrator)
- **Core Responsibility**: Receive user requirements, establish detailed task plans, assign and instruct workers, control the overall workflow, and report final results.
+### 👑 General Manager (Orchestrator)
+- **Core Responsibility**: Interact directly with the user, receive high-level requirements, establish task plans, delegate tasks to Team Leaders, control the overall workflow, and report completion back to the user.
 - **Ambiguity Resolution**: If a user's requirements contain ambiguous details, do not guess. Immediately ask the user for clarification (we recommend using the `/grill-me` slash command).
- **Feedback Loop Adjustment**: Analyze verification feedback from Reviewers to decide on improvement paths. For complex technical challenges, direct Workers and Reviewers to research options, add the PM's own assessment, and present a final report to the user to decide the project's direction.
- **Self-Healing (Hermes Fallback Fix)**: If a defect pointed out by a Reviewer is extremely minor or is a simple typo/configuration omission, the PM should directly fix the source code instead of reassigning it to the Worker, thereby minimizing the round-trip cost.

-### 🛠️ Worker (Implementation Agent)
- **Core Responsibility**: Design business logic and implement source code as delegated by the PM.
- **Collaboration & Communication**: If the implementation path is ambiguous or interface design changes are required within the assigned scope, ask the PM for consensus before applying surgical changes.
- **Contract Adherence**: Comply with the single task instructions (Brief) and the unique Job ID convention provided by the PM. Workers must publish a `started` event when starting work, and a `completed` or `error` event to the backplane upon termination.
+### 👥 Team Leaders
+Newly spawned agents (e.g., `antigravity`, `claude`, `cline`, `hermes`) act as **Team Leaders** of their respective groups. They receive delegated tasks from the General Manager and manage implementation or review workflows.
+- **Developer Team Leader**:
+  - Receives tasks from the General Manager.
+  - **Task Breakdown & Planning**: Thoroughly analyzes the task, breaks it down into small units, and creates a plan.
+  - **Internal Parallelism**: Can run subagents in parallel internally to handle the delegated work.
+  - **Review Integrity & Refusal**: Thoroughly reviews feedback from Reviewers. Adopts/implements recommendations if valid. If any recommendation is judged invalid, the Developer Team Leader must **not** implement it, but instead return the refutation along with detailed reasons to the Reviewer.
+  - **Completion Signal**: Once all reviewers yield a `PASS` and changes are verified, the Developer Team Leader who first received the task sends a completion signal back to the General Manager.
+- **Reviewer Team Leader**:
+  - Receives review requests from the Developer Team Leader.
+  - **Detailed Feedback with Directions**: Simply rejecting changes (`NOT PASS`) is forbidden. Reviewers **must** specify the exact reason for the issue and provide a concrete, stable, and verified alternative direction for improvement.
+  - **Consensus Loop**: Engages in the review cycle until all objections are resolved and a final `PASS` is issued.

-### 🔍 Reviewer (Verification Agent)
- **Core Responsibility**: Verify source code changes (Diff) and implementation specifications submitted by Workers. Reviewers act as facilitators by detecting security vulnerabilities, proposing performance improvements, and examining design consistency.
- **Provide Concrete Alternatives**: Simply rejecting changes (`NOT PASS`) is forbidden. When raising an issue, Reviewers must propose a **concrete, stable, and verified alternative code block or solution**.
- **Complementary Cross-Verification**: Leverage the unique characteristics of different agent models (e.g., Flash-class models are skilled at capturing semantic shell bugs, while Opus/Sonnet-class models excel at API signatures and logical regression analysis) to perform parallel and mutually-supportive reviews.
+### 🛡️ Role Suitability Check Principle
+- Every agent must perform only tasks suited to its designated role (e.g., Developer Team Leaders do not decide the final `PASS`, and Reviewer Team Leaders do not write project source code).
+- **If an agent receives a task that does not fit its role**, it must either:
+  1. Recommend the optimal agent session to delegate the task to, or
+  2. Perform the task directly if strictly necessary for project continuity.

 ---

@@ -59,34 +66,42 @@ Asynchronous communication and state management between agents are controlled vi
 sequenceDiagram
    autonumber
    actor User as User
-    participant PM as Project Manager
-    participant W as Worker
-    participant R as Reviewers
+    participant GM as General Manager
+    participant DTL as Developer Team Leader
+    participant RTL as Reviewer Team Leaders
    participant M as MQTT Backplane

-    User->>PM: Hand over requirements
-    Note over PM: Run grill-me & plan tasks
-    PM->>M: Register Job & start Subscriber
-    PM->>W: Delegate task (Provide Job ID & Brief)
-    W->>M: Publish 'started' event
-    Note over W: Modify code & implement
-    W->>M: Publish 'completed' (or 'error')
-    PM->>R: Request parallel review (Provide Diff)
-    Note over R: Cross-analysis & verification
-    alt Defect Found
-        R->>PM: NOT PASS (Feedback with alternatives)
-        Note over PM: PM directly fixes minor defects
-        PM->>W: Apply feedback & re-delegate
+    User->>GM: Hand over requirements
+    GM->>DTL: Delegate task (e.g., create landing page)
+    Note over DTL: Analyze, break down & spawn parallel subagents
+    DTL->>M: Publish 'started' event
+    Note over DTL: Modify code & implement
+    DTL->>M: Publish 'completed'
+    DTL->>RTL: Request review (I created the landing page. Please review it)
+    Note over RTL: Cross-analysis & verification
+    alt Defect Found (Reviewer feedback)
+        RTL->>DTL: NOT PASS / Feedback (Must include reason & improvement direction)
+        Note over DTL: DTL checks validity of suggestions
+        alt Valid feedback
+            Note over DTL: DTL adopts and modifies code
+        else Invalid feedback
+            DTL->>RTL: Send refutation & reasons (inappropriate items not applied)
+        end
+        DTL->>RTL: Request review again (review items addressed)
    else Verification Pass
-        R->>PM: PASS
+        RTL->>DTL: PASS
    end
-    PM->>User: Report final pass & commit changes
+    DTL->>GM: Send completion signal
+    GM->>User: Notify task completion
 ```

-1. **Planning and Allocation**: The PM defines requirements and outlines independent jobs to avoid conflicting dependencies.
-2. **Execution and Notification**: The PM launches a subscriber, then assigns the job to a Worker session. The Worker performs the logic and sends a terminal event, automatically closing the session.
-3. **Cross-Verification Iteration (Review Loop)**: Once the task is complete, the PM circulates the changes to the Reviewer agents in parallel. The modify-reject cycle repeats until all reviewers yield a `PASS`, ensuring high-quality code.
-4. **Release and Cleanup**: Code that passes verification is committed to Git, and temporary session resources are reclaimed.
+1. **Planning and Allocation**: The General Manager delegates the task to the Developer Team Leader.
+2. **Analysis and Internal Execution**: The Developer Team Leader analyzes the task, breaks it down, plans execution, and optionally spawns parallel subagents. It publishes `started`, implements the change, publishes `completed`, and then requests review from the Reviewer Team Leader.
+3. **Objection & Refinement Loop**:
+   - The Reviewer Team Leader must provide clear reasons and improvement directions for any issues.
+   - The Developer Team Leader validates the feedback. Valid suggestions are implemented; invalid ones are refuted with reasons and returned to the reviewer.
+   - This cycle repeats until all reviewers issue a `PASS`.
+4. **Completion and Report**: The Developer Team Leader sends the final completion signal to the General Manager, who notifies the user.

 ---

@@ -119,8 +134,8 @@ Use this checklist when deploying this agent orchestration model to a new projec
 - [ ] **Directory Convention**: Are the registry path (`.mam/jobs/`) and logging path (`.mam/delegate_job_logs/`) added to `.gitignore`?
 - [ ] **Core Scripts**: Are the core scripts (`mqtt_common.py`, `publish_event.py`, `job_subscriber.py`, and `registry.py`) in place?
 - [ ] **HMAC Enablement**: When a new registry job is created, is a random `auth_token` correctly injected, and is signature-based mutual authentication active?
- [ ] **Charter Placement**: Is this protocol file (`AGENT.md`) placed in the **top-level root directory** of the new project? (Placing it at the root is essential so that onboarding agents can recognize the rules immediately.)
+- [ ] **Charter Placement**: Is this protocol file (`AGENT.md`) placed in the **.agents/ directory** of the new project? (Placing it in `.agents/` is recommended: it keeps the project root clean while still letting onboarding agents find and follow the rules.)

 ---

-*This guide balances collaboration efficiency with strict code security. Any required changes must be discussed and agreed upon by the PM and all Reviewers before updating this document.*
+*This guide balances collaboration efficiency with strict code security. Any required changes must be discussed and agreed upon by the General Manager and all Team Leaders before updating this document.*
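
The review contract in the charter above can be sketched in Python: a `NOT PASS` must carry a reason and an improvement direction, the Developer Team Leader either applies or refutes each item, and completion is signalled only after every reviewer issues `PASS`. This is a minimal illustration and not part of the commit; `ReviewFeedback`, `triage`, and `ready_to_signal` are hypothetical names, and the Developer Team Leader's validity judgement is reduced to a boolean.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReviewFeedback:
    """One verdict from a Reviewer Team Leader (hypothetical structure)."""
    verdict: str                     # "PASS" or "NOT PASS"
    reason: Optional[str] = None     # why the issue occurs
    direction: Optional[str] = None  # concrete improvement direction

    def __post_init__(self) -> None:
        if self.verdict not in ("PASS", "NOT PASS"):
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        has_reason = bool((self.reason or "").strip())
        has_direction = bool((self.direction or "").strip())
        # A bare rejection is forbidden: NOT PASS needs both fields.
        if self.verdict == "NOT PASS" and not (has_reason and has_direction):
            raise ValueError("NOT PASS requires a reason and an improvement direction")


def triage(feedback: ReviewFeedback, judged_valid: bool) -> str:
    """Developer Team Leader's action for one item: 'none', 'fix', or 'refute'."""
    if feedback.verdict == "PASS":
        return "none"
    return "fix" if judged_valid else "refute"


def ready_to_signal(feedbacks: Iterable[ReviewFeedback]) -> bool:
    """True only when at least one verdict exists and all of them are PASS."""
    items = list(feedbacks)
    return bool(items) and all(f.verdict == "PASS" for f in items)
```

Rejecting a malformed `NOT PASS` at construction time means the Developer Team Leader never has to triage feedback that lacks a reason or a direction.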
@@ -19,7 +19,7 @@ WORKSPACE_ROOT="$(cd "$SKILL_DIR/../.." && pwd)"
 AGENT_SESSIONS_YAML="${AGENT_SESSIONS_YAML:-$WORKSPACE_ROOT/.mam/agent-sessions.yaml}"

 # Workspace-relative defaults with environment overrides (Phase Z)
-HOME_DIR="${HOME_DIR:-$WORKSPACE_ROOT}"
+HOME_DIR="${HOME_DIR:-$HOME}"
 CLAUDE_PROJECT_DIR="${CLAUDE_PROJECT_DIR:-$HOME/.claude/projects}"
 LOCAL_BIN="${LOCAL_BIN:-$HOME/.local/bin}"

@@ -305,6 +305,8 @@ def _validate(d):
            raise SystemExit(f"VALIDATE: tmux_sessions[{i}] not a mapping")
        if not s.get('name') or not s.get('status'):
            raise SystemExit(f"VALIDATE: tmux_sessions[{i}] missing name/status")
+        if s.get('role') is not None and (not isinstance(s['role'], str) or not s['role'].strip()):
+            raise SystemExit(f"VALIDATE: tmux_sessions[{i}] {s.get('name')!r} role must be a non-empty string")
        if s['status'] not in valid:
            raise SystemExit(f"VALIDATE: tmux_sessions[{i}] {s.get('name')!r} bad status {s['status']!r}")
        if not isinstance(s.get('pane'), dict):
@@ -366,10 +368,17 @@ try:
        d['tmux_sessions'] = []

    old_terminals = get_terminal_set(d)
+    old_roles = {s.get('name'): s.get('role') for s in db_sessions if s.get('role')}

    # --- caller mutation (module scope: sees d, yaml, os, glob, subprocess) ---
    exec(compile(os.environ['AGENT_SESSIONS_MUTATION'], '<mutation>', 'exec'), globals())

+    # Role immutability check
+    for s in d.get('tmux_sessions', []):
+        name = s.get('name')
+        if name in old_roles and s.get('role') != old_roles[name]:
+            raise SystemExit(f"VALIDATE: role of session {name!r} cannot be modified from {old_roles[name]!r} to {s.get('role')!r}")
+
    _validate(d)

    # Separate globals and sessions for normalization
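
The two role checks in the hunks above can be exercised in isolation with a sketch like the following. It mirrors the diff's logic but is not the script itself: the function names are hypothetical, and `ValueError` stands in for the `SystemExit` the real code raises.

```python
def validate_role(session):
    """Reject a role that is present but is not a non-empty string."""
    role = session.get("role")
    if role is not None and (not isinstance(role, str) or not role.strip()):
        raise ValueError(f"session {session.get('name')!r}: role must be a non-empty string")


def check_role_immutability(before, after):
    """Reject any change to a role that was already set before the mutation."""
    # Only sessions that already had a role are protected.
    old_roles = {s.get("name"): s.get("role") for s in before if s.get("role")}
    for s in after:
        name = s.get("name")
        if name in old_roles and s.get("role") != old_roles[name]:
            raise ValueError(
                f"role of session {name!r} cannot be modified "
                f"from {old_roles[name]!r} to {s.get('role')!r}"
            )


before = [{"name": "dev-claude", "role": "developer"}, {"name": "legacy"}]
after = [{"name": "dev-claude", "role": "developer"}, {"name": "legacy", "role": "reviewer"}]
check_role_immutability(before, after)  # accepted: "legacy" had no role yet
```

Two consequences follow from the comparison: a session that had no role may still receive one later, and clearing an existing role counts as a modification and is rejected.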
@@ -475,7 +484,7 @@ def db_exists(uuid):


 def hermes_exists(uuid):
-    hdb = f"{home}/.mam/state.db"
+    hdb = f"{home}/.hermes/state.db"
    if not os.path.exists(hdb):
        return False
    try:
@@ -487,6 +496,10 @@ def hermes_exists(uuid):
        return False


+def cline_exists(uuid):
+    return os.path.exists(f"{home}/.cline/data/sessions/{uuid}/{uuid}.json")
+
+
 def emit(u):
    print(u)
    raise SystemExit(0)
@@ -536,6 +549,10 @@ for s in sessions:
        cand = s.get('hermes_conversation_id_own')
        if cand and hermes_exists(cand):
            emit(cand)
+    if agent == 'cline' and name.endswith('-creator-cline'):
+        cand = s.get('cline_conversation_id_own')
+        if cand and cline_exists(cand):
+            emit(cand)

 # 2) disk scan scoped to THIS workspace
 if agent == 'claude':
@@ -565,7 +582,7 @@ elif agent == 'agy':
        if cand and db_exists(cand):
            emit(cand)
 elif agent == 'hermes':
-    hdb = f"{home}/.mam/state.db"
+    hdb = f"{home}/.hermes/state.db"
    if os.path.exists(hdb):
        cand = None
        try:
@@ -578,6 +595,27 @@ elif agent == 'hermes':
            cand = None
        if cand:
            emit(cand)
+elif agent == 'cline':
+    sessions_dir = f"{home}/.cline/data/sessions"
+    if os.path.isdir(sessions_dir):
+        candidates = []
+        for session_folder in glob.glob(f"{sessions_dir}/*"):
+            if os.path.isdir(session_folder):
+                folder_name = os.path.basename(session_folder)
+                json_file = f"{session_folder}/{folder_name}.json"
+                if os.path.exists(json_file):
+                    candidates.append(json_file)
+        candidates.sort(key=os.path.getmtime, reverse=True)
+        for j in candidates:
+            try:
+                with open(j) as f:
+                    sdata = json.load(f)
+                if sdata.get('cwd') == ws or sdata.get('workspace_root') == ws:
+                    sid = sdata.get('session_id')
+                    if sid:
+                        emit(sid)
+            except Exception:
+                pass

 # 3) agent_identities cache, ONLY when its project_cwd == this workspace
 ai = {}
@@ -609,6 +647,10 @@ if ai_agent.get('project_cwd') == ws:
        cand = ai_agent.get('session_id') or ai.get('conversation_id')
        if cand and hermes_exists(cand):
            emit(cand)
+    elif agent == 'cline':
+        cand = ai_agent.get('session_id') or ai.get('conversation_id')
+        if cand and cline_exists(cand):
+            emit(cand)

 print('')
 PYEOF
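
The cline disk scan added above (newest session file first, matched against this workspace) can be restated as a standalone function for testing. This is a sketch under the diff's own assumptions: the `~/.cline/data/sessions/<id>/<id>.json` layout and the `cwd`, `workspace_root`, and `session_id` keys are taken from the hunk, not from cline documentation. `find_cline_session` is a hypothetical name, and it returns the id instead of printing it and exiting.

```python
import glob
import json
import os


def find_cline_session(home, ws):
    """Return the newest cline session id whose workspace is `ws`, else None."""
    sessions_dir = os.path.join(home, ".cline", "data", "sessions")
    candidates = []
    for folder in glob.glob(os.path.join(sessions_dir, "*")):
        if os.path.isdir(folder):
            # Each session folder is expected to hold <folder-name>.json.
            json_file = os.path.join(folder, os.path.basename(folder) + ".json")
            if os.path.exists(json_file):
                candidates.append(json_file)
    # Most recently modified session file first.
    candidates.sort(key=os.path.getmtime, reverse=True)
    for path in candidates:
        try:
            with open(path) as f:
                data = json.load(f)
        except (OSError, ValueError):
            continue  # unreadable or malformed file: try the next one
        if not isinstance(data, dict):
            continue
        if data.get("cwd") == ws or data.get("workspace_root") == ws:
            sid = data.get("session_id")
            if sid:
                return sid
    return None
```

Sorting by modification time before filtering means that when several sessions belong to the same workspace, the most recently touched one wins.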
@@ -74,12 +74,12 @@ To prevent this, you can run this skill inside an **isolated tmux server** using
   ```
 2. **Via Option Flag**:
   ```bash
-   bash scripts/create_session.sh --workspace /path/to/project --agent claude --tmux-server multi-agent-canary
+   bash scripts/create_session.sh --workspace /path/to/project --agent claude --role developer --tmux-server multi-agent-canary
   ```
 3. **Submit Job Integration**:
   You can automatically register a delegated job with a prompt when creating a session:
   ```bash
-   bash scripts/create_session.sh --workspace /path/to/project --agent claude --submit-job "Task prompt here"
+   bash scripts/create_session.sh --workspace /path/to/project --agent claude --role developer --submit-job "Task prompt here"
   ```

 ### Recommended Alias
@@ -173,7 +173,7 @@ Use the `agent-sessions-yaml-edit` script in `scripts/` to safely append (preser

 ```bash
 bash .agents/skills/multi-agent-mux-create/scripts/create_session.sh \
-  --workspace "$WORKSPACE" --agent "$AGENT" --session "$SESSION_NAME"
+  --workspace "$WORKSPACE" --agent "$AGENT" --role "$ROLE" --session "$SESSION_NAME"
 ```

 The script handles the YAML append, pane capture, and the `last_visible_status` placeholder.
@@ -1,7 +1,7 @@
 #!/usr/bin/env bash
 # create_session.sh: helper script for multi-agent-mux-create
 # Usage:
-#   bash create_session.sh --workspace <path> --agent <claude|agy> [--session <name>] [--wrapper]
+#   bash create_session.sh --workspace <path> --agent <claude|agy|hermes|cline> --role <role> [--session <name>] [--wrapper]
 #
 # Behavior:
 #   1) preflight: availability of tmux/claude/agy, workspace exists
@@ -23,11 +23,12 @@ source "$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)/lib.sh"

 usage() {
  cat <<EOF
-Usage: $0 --workspace <path> --agent <claude|agy|hermes> [options]
+Usage: $0 --workspace <path> --agent <claude|agy|hermes|cline> --role <role> [options]

 Options:
  --workspace PATH    project directory (required)
-  --agent AGENT       claude | agy | hermes (required)
+  --agent AGENT       claude | agy | hermes | cline (required)
+  --role ROLE         assigned role (required)
  --session NAME      tmux session name (default: derived from workspace)
  --wrapper           force use of ~/.local/bin/<session> wrapper even if not present
  --dry-run           print commands without executing
@@ -39,6 +40,7 @@ EOF

 WORKSPACE=""
 AGENT=""
+ROLE=""
 SESSION_NAME=""
 USE_WRAPPER=0
 DRY_RUN=0
@@ -49,6 +51,7 @@ while [ $# -gt 0 ]; do
  case "$1" in
    --workspace) WORKSPACE="$2"; shift 2 ;;
    --agent)     AGENT="$2";     shift 2 ;;
+    --role)      ROLE="$2";      shift 2 ;;
    --session)   SESSION_NAME="$2"; shift 2 ;;
    --wrapper)   USE_WRAPPER=1; shift ;;
    --dry-run)   DRY_RUN=1; shift ;;
@@ -66,6 +69,7 @@ fi
 # Preflight
 [ -n "$WORKSPACE" ] || { echo "ERROR: --workspace required" >&2; usage; exit 2; }
 [ -n "$AGENT" ]    || { echo "ERROR: --agent required" >&2; usage; exit 2; }
+[ -n "$ROLE" ]     || { echo "ERROR: --role required" >&2; usage; exit 2; }
 [ -d "$WORKSPACE" ] || { echo "ERROR: workspace $WORKSPACE not a directory" >&2; exit 1; }
 command -v tmux >/dev/null || { echo "ERROR: tmux not installed" >&2; exit 1; }
 command -v "$AGENT" >/dev/null || { echo "ERROR: $AGENT CLI not in PATH" >&2; exit 1; }
@@ -86,6 +90,11 @@ elif [ "$AGENT" = "hermes" ]; then
    echo "ERROR: hermes is not functional. Run 'hermes setup' first." >&2
    exit 1
  fi
+elif [ "$AGENT" = "cline" ]; then
+  if ! cline history --json >/dev/null 2>&1; then
+    echo "ERROR: cline is not functional or configured." >&2
+    exit 1
+  fi
 fi

 # Session name: lib.sh::derive_session_name is the single source of truth (P0-A)
@@ -119,7 +128,10 @@ spawn() {
    hermes)
      _tmux new-session -d -s "$SESSION_NAME" -x 140 -y 40 -c "$WORKSPACE" "hermes"
      ;;
-    *) echo "ERROR: --agent must be claude, agy or hermes, got: $AGENT" >&2; exit 2 ;;
+    cline)
+      _tmux new-session -d -s "$SESSION_NAME" -x 140 -y 40 -c "$WORKSPACE" "cline -i"
+      ;;
+    *) echo "ERROR: --agent must be claude, agy, hermes or cline, got: $AGENT" >&2; exit 2 ;;
  esac
 }

@@ -145,6 +157,7 @@ case "$AGENT" in
  claude) CMD_FULL='claude --dangerously-skip-permissions' ;;
  agy)    CMD_FULL='agy --dangerously-skip-permissions' ;;
  hermes) CMD_FULL='hermes' ;;
+  cline)  CMD_FULL='cline -i' ;;
 esac

 # Start command
@@ -161,7 +174,7 @@ case "$AGENT" in
      START_CMD="$local_tmux new-session -d -s \"$SESSION_NAME\" -x 140 -y 40 -c \"$WORKSPACE\" \"claude --dangerously-skip-permissions\""
    fi
    ;;
-  agy|hermes)
+  agy|hermes|cline)
    START_CMD="$local_tmux new-session -d -s \"$SESSION_NAME\" -x 140 -y 40 -c \"$WORKSPACE\" \"$CMD_FULL\""
    ;;
 esac
@@ -174,6 +187,8 @@ if [ -n "$SUBMIT_JOB_PROMPT" ]; then
    delegate_agent="claude-code"
  elif [ "$AGENT" = "hermes" ]; then
    delegate_agent="hermes-agent"
+  elif [ "$AGENT" = "cline" ]; then
+    delegate_agent="cline-agent"
  else
    delegate_agent="antigravity-cli"
  fi
@@ -191,7 +206,7 @@ fi
 # All values are passed via environment variables; no heredoc interpolation (P1-B).
 # The child pid is resolved up front in bash via pgrep (P2: filter by tool name).
 CHILD_PID=0
-if { [ "$AGENT" = "agy" ] || [ "$AGENT" = "hermes" ]; } && [ -n "$PANE_PID" ]; then
+if { [ "$AGENT" = "agy" ] || [ "$AGENT" = "hermes" ] || [ "$AGENT" = "cline" ]; } && [ -n "$PANE_PID" ]; then
  CHILD_PID=$(pgrep -P "$PANE_PID" -x "$AGENT" 2>/dev/null | head -1 || true)
  CHILD_PID="${CHILD_PID:-0}"
 fi
@@ -201,9 +216,10 @@ atomic_dump_yaml "$AGENT_SESSIONS_YAML" \
  TMUX_EPOCH="$TMUX_EPOCH" PANE_PID="$PANE_PID" PANE_CWD="$PANE_CWD" \
  CMD_FULL="$CMD_FULL" START_CMD="$START_CMD" CHILD_PID="$CHILD_PID" \
  TMUX_SERVER_NAME="${TMUX_SERVER_NAME:-default}" \
-  DELEGATE_JOB_ID="$DELEGATE_JOB_ID" <<'PYEOF'
+  DELEGATE_JOB_ID="$DELEGATE_JOB_ID" ROLE="$ROLE" <<'PYEOF'
 name = os.environ['SESSION_NAME']
 agent = os.environ['AGENT']
+role = os.environ['ROLE']
 pid = os.environ.get('PANE_PID', '')
 epoch = os.environ.get('TMUX_EPOCH', '')
 server_name = os.environ.get('TMUX_SERVER_NAME', 'default')
@@ -222,6 +238,7 @@ sessions[:] = [s for s in sessions if s.get('name') != name]
 entry = {
    'name': name,
    'status': 'running',
+    'role': role,
    'tmux_session_created_at': os.environ['NOW_ISO'],
    'tmux_session_epoch': int(epoch) if epoch.isdigit() else 0,
    'tmux_server': server_name,
@@ -265,6 +282,11 @@ elif agent == 'hermes':
    entry['child_pid'] = int(cp) if cp.isdigit() else 0
    entry['hermes_conversation_id_own'] = None
    entry['last_visible_status'] = "TUI started; awaiting first user message"
+elif agent == 'cline':
+    cp = os.environ.get('CHILD_PID', '0')
+    entry['child_pid'] = int(cp) if cp.isdigit() else 0
+    entry['cline_conversation_id_own'] = None
+    entry['last_visible_status'] = "TUI started; awaiting first user message"

 sessions.append(entry)

@@ -0,0 +1,116 @@
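+# Task Delegation Types Design Specification
+
+This document is a design specification that defines **Task Delegation Types** for the `multi-agent-mux-delegate-job` skill so that agent collaboration structures (loops, discussions, and so on), not only single-agent runs, can be orchestrated systematically.
+
+---
+
+## 1. Overview and Motivation
+
+The existing job delegation system used a **one-way direct delegation (Direct)** structure: it passed a prompt to a single agent and ended the job when it received a `completed` or `error` event.
+
+Real collaboration, however, needs more advanced flows such as the following:
+1. **Research & Discussion**: Several agents exchange arguments until they reach agreement, for planning or concept review.
+2. **Worker-Reviewer Loop**: When the Worker modifies code, the Reviewer examines it, and feedback and fixes repeat until the Reviewer gives a `PASS`.
+
+Task types are introduced so that these collaboration workflows can be **controlled at the orchestrator (delegation script) layer** without modifying the internal code of individual agents.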
+
+---
+
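+## 2. Task Delegation Type Definitions
+
+| Type (`--type`) | Description | Workflow |
+|------------------|------|----------------|
+| `direct` (default) | Direct delegation to a single agent | Instruct → agent runs → end after receiving completed/error |
+| `loop` | Worker-reviewer feedback loop | Worker runs → reviewer is invoked automatically on completion → end if the review passes; on failure the worker is re-invoked with the feedback (repeat) |
+| `discuss` | Research and mutual discussion | Agent A (writes draft) → Agent B (reviews and comments) → Agent A (incorporates and revises) → end when agreement is reached |
+
+---
+
+## 3. CLI Specification Extension
+
+The following options are added to the `multi-agent-mux-delegate-job submit` command.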
+
+```bash
+multi-agent-mux-delegate-job submit \
+  --prompt <text> \
+  --agent <worker_agent> \
+  --agent-session <worker_session> \
+  [--type <direct|loop|discuss>] \
+  [--reviewer <reviewer_agent>] \
+  [--reviewer-session <reviewer_session>] \
+  [--max-iterations <count>] \
+  [--validate <script>]
+```
+
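+### New option details:
+*   `--type`: Specifies the task delegation type (`direct`, `loop`, `discuss`).
+*   `--reviewer`: Name of the agent responsible for review (default: `hermes`).
+*   `--reviewer-session`: Name of the tmux session in which the reviewer agent is running (default: `tmux:hermes`).
+*   `--max-iterations`: Maximum number of iterations for a loop or discussion (default: `5`).
+
+---
+
+## 4. Worker-Reviewer Loop (`loop`) State Machine
+
+The orchestrator delegates work sequentially and subscribes to events according to the following state diagram.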
+
+```mermaid
+stateDiagram-v2
+    [*] --> Worker_Pending : Submit Job
+    Worker_Pending --> Worker_Running : Worker picks job
+    Worker_Running --> Reviewer_Pending : Worker emits "completed"
+    Worker_Running --> Error_Terminal : Worker emits "error"
+    
+    Reviewer_Pending --> Reviewer_Running : Reviewer picks job
+    Reviewer_Running --> Success_Terminal : Reviewer emits "completed" (PASS)
+    Reviewer_Running --> Worker_Pending : Reviewer emits "error" / "completed" (Feedback) / Increment Iteration
+    Reviewer_Running --> Error_Terminal : Reviewer emits feedback & Iteration > Max
+    
+    Success_Terminal --> [*]
+    Error_Terminal --> [*]
+```
+
+### Step-by-step protocol:
+
+1.  **Run the Worker**:
+    *   The orchestrator registers the job as `pending` and sets `agent_session` to the worker session (e.g., `tmux:claude`) before handing it off.
+    *   When the worker finishes and publishes a `completed` event, the orchestrator intercepts it.
+2.  **Switch to the Reviewer**:
+    *   Instead of ending the whole job, the orchestrator changes the job record's `agent_session` to the reviewer session (e.g., `tmux:hermes`).
+    *   It automatically assembles the prompt to send to the reviewer:
+        > *"Review the changes/artifacts generated for job $JOB_ID. Check if they meet the requirements. If correct, publish completed event with 'PASS'. If there are issues, publish error event with detailed feedback/nits."*
+    *   It resets the status to `pending` so that the reviewer session can pick up the job.
+3.  **Judge the Review Result**:
+    *   **On PASS**: If the reviewer sends a `completed` event with a `"PASS"` message, the job is marked finally `completed` and the orchestrator exits.
+    *   **On feedback**: If the reviewer publishes feedback with an `error` event or an ordinary `completed` event, the orchestrator increments the iteration count (`iteration`) by 1.
+        *   If the maximum number of iterations (`max-iterations`) is exceeded, the job ends with a final `error`.
+        *   Otherwise, it points `agent_session` back at the worker session, assembles a prompt, and returns the job to `pending`:
+            > *"The reviewer provided the following feedback for job $JOB_ID: <reviewer feedback>. Please modify the code/artifacts to address these comments."*
+
+---
+
+## 5. 토론 및 협의 (`discuss`) 상태 머신
+
+토론 타입은 작업자와 리뷰어가 동등한 관계(예: PM/기획 에이전트와 설계 에이전트)에서 상호 계획안을 다듬어 나가는 구조입니다.
+
+1.  **초안 작성 (PM/Researcher)**:
+    *   PM 세션에 최초 프롬프트를 보냅니다. PM은 요구사항 분석 및 초안을 파일(예: `draft_plan.md`)로 작성하고 `completed`를 발행합니다.
+2.  **의견 검토 (Designer/Developer)**:
+    *   작업을 설계자 세션으로 전환하고 다음 프롬프트를 줍니다:
+        > *"Read draft_plan.md. Review its technical feasibility. Write your feedback/objections to draft_plan.md or comments.md. If you agree with the plan, reply with 'AGREE'."*
+3.  **합의 도달 여부 검토**:
+    *   상대방이 `"AGREE"`를 보내면 토론 합의가 성립되어 최종 완료됩니다.
+    *   반대 의견이 있으면 PM 세션으로 다시 넘겨 계획을 개정하도록 유도합니다.
+
+---
+
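+## 6. Implementation Plan
+
+1.  **Extend `registry.py`**:
+    *   Add the `job_type` (default `"direct"`), `reviewer`, `reviewer_session`, `max_iterations`, and `iteration` fields to the job record model.
+    *   Register the new parameters in the `register_job` function and the CLI parser.
+2.  **Modify the `multi-agent-mux-delegate-job` wrapper script**:
+    *   In `cmd_submit`, accept the delegation type (`--type`) and implement the looping state machine as a shell script.
+    *   At the end of each episode, update the state so that ownership passes back and forth between worker and reviewer.
+3.  **Verification**:
+    *   Build a mock worker/reviewer scenario, or run the cross-review loop directly in claude/hermes sessions, and test that it converges.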
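As a rough illustration of the `registry.py` extension in step 1, the sketch below wires the new fields into an `argparse` parser and a record dictionary. The flag names match those the wrapper passes (`--job-type`, `--reviewer`, `--reviewer-session`, `--max-iterations`); the helper names `build_register_parser` and `new_record`, the default of 5 iterations (mirroring the wrapper's `MAX_ITERATIONS`), and the initial `iteration` of 1 are assumptions, and the real `register_job` signature is not shown in this diff.

```python
import argparse


def build_register_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the `register` subcommand's new options."""
    parser = argparse.ArgumentParser(prog="registry.py register")
    parser.add_argument("--prompt", required=True)
    parser.add_argument(
        "--job-type", choices=["direct", "loop", "discuss"], default="direct"
    )
    parser.add_argument("--reviewer", default=None)
    parser.add_argument("--reviewer-session", default=None)
    parser.add_argument("--max-iterations", type=int, default=5)
    return parser


def new_record(args: argparse.Namespace) -> dict:
    """Build the job record fields named in the plan above."""
    return {
        "prompt": args.prompt,
        "status": "pending",
        "job_type": args.job_type,
        "reviewer": args.reviewer,
        "reviewer_session": args.reviewer_session,
        "max_iterations": args.max_iterations,
        "iteration": 1,  # assumption: the first round is counted as 1
    }
```

Restricting `--job-type` with `choices` makes a mistyped type fail at registration time instead of silently falling through to the loop branch later.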
@@ -1,7 +1,7 @@
 # multi-agent-mux-delegate-job skill

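-A Hermes skill that delegates a job to an autonomous agent (claude-code/codex/opencode/human) and observes it
-asynchronously over an MQTT event channel. **Start at [`SKILL.md`](./SKILL.md).**
+A general-purpose agent collaboration skill that delegates a job to an autonomous agent (claude-code/hermes/agy/cline/codex/opencode/human) and observes it
+asynchronously over an MQTT event channel. **Start at [`SKILL.md`](./SKILL.md).**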

 - 프로토콜/스키마: [`job-protocol.md`](./job-protocol.md)
 - 브로커 PoC→운영 전환: [`mqtt-broker-setup.md`](./mqtt-broker-setup.md)
@@ -1,385 +1,94 @@
 ---
 name: multi-agent-mux-delegate-job
-description: "Delegate a unit of work to any autonomous agent (claude-code, codex, opencode, or a human) and observe it asynchronously over an MQTT event channel. Each job gets a unique id, a registry record (prompt, broker, status, timeouts), and a single per-job topic that carries started/permission_required/progress/completed/error events as schema-versioned JSON. The delegator starts a subscriber first, runs the agent, and treats a completed/error event or a timeout as the job's terminal state. Ships a working reference implementation (publish_event.py, job_subscriber.py, registry.py, mqtt_common.py, multi-agent-mux-delegate-job wrapper) plus a PoC-to-production path: validate on a public broker, then move to an authenticated TLS broker by changing config only — no code change. Use when you need fire-and-observe delegation, multi-job fan-out across tmux sessions, or a uniform completion-signal protocol shared by several agent types."
-version: 1.0.0
-author: Hermes Agent
+description: "Delegate a unit of work to any autonomous agent (claude-code, hermes, agy, cline, codex, or a human) and observe it asynchronously over an MQTT event channel. Supported roles include orchestrator, worker, and reviewer."
+version: 1.1.0
+author: Multi-Agent System
 license: MIT
 platforms: [linux, macos, windows]
-metadata:
-  hermes:
-    tags: [agent-delegation, mqtt, jobs, orchestration, async-completion]
-    related_skills: [claude-code, codex, opencode, hermes-agent-skill-authoring]
 ---

 # multi-agent-mux-delegate-job — Async Job Delegation over MQTT

-Delegate a unit of work to an autonomous agent, then **observe** it instead of
-blocking on it. Every job gets a unique id and a registry record; the agent
-publishes lifecycle events (`started`, `permission_required`, `progress`,
-`completed`, `error`) to a per-job MQTT topic; the delegator subscribes and
-treats `completed`/`error` — or a timeout — as the terminal state.
+Delegate a unit of work to any autonomous agent, then **observe** it asynchronously instead of blocking. Every job gets a unique ID and a registry record. The worker agent publishes lifecycle events (`started`, `permission_required`, `progress`, `completed`, `error`) to a per-job MQTT topic, and the delegator/orchestrator subscribes to verify the final state.

-This skill is a **reference implementation**: copy the files in this directory
-into your project and customise. The `communication_over_mqtt` project is the
-canonical concrete instance.
+This skill allows any agent (`claude-code`, `hermes`, `agy`, `cline`, etc.) to play any role: **Orchestrator/Delegator**, **Worker/Implementer**, or **Reviewer**.

-## Overview
+---

-The model is deliberately small. A **job** is one delegated task. An **agent**
-is a worker (a claude-code tmux session, a codex run, a human). The **registry**
-(`.mam/jobs/<id>.json`) holds everything about a job so nothing important
-lives in environment variables — which means one tmux session can process many
-jobs sequentially, and many sessions can fan out in parallel, with no env
-collisions. The **event channel** is one MQTT topic per job carrying JSON
-payloads; `event` discriminates the type.
+## Roles in Multi-Agent Mux

-Responsibility is split into exactly one entry point each:
-[`publish_event.py`](./scripts/publish_event.py) emits events (registry lookup,
-monotonic `seq`, retry+backoff) and [`job_subscriber.py`](./scripts/job_subscriber.py)
-observes them (timeouts, terminal state machine, defensive parsing). Shared
-logic lives in [`mqtt_common.py`](./scripts/mqtt_common.py); registry I/O in
-[`registry.py`](./scripts/registry.py). The demo `publisher.py`/`subscriber.py`
-in the host project stay frozen.
+- **Orchestrator (Delegator)**: Initiates the job, coordinates other agents, handles loops and reviews, and commits final changes.
+- **Worker (Implementer)**: Receives the brief file or task prompt, performs the implementation, and emits started/completed/error events.
+- **Reviewer**: Evaluates git diffs or artifacts produced by the worker, and responds with a `completed` event containing `"PASS"` or feedback.

-Two stages, same code. **PoC** runs on the public `broker.hivemq.com` to wire up
-the protocol. **Production** moves to your own authenticated TLS broker — the
-switch is **config only** (env vars + the registry `broker.*` block), never a
-code change. See [`mqtt-broker-setup.md`](./mqtt-broker-setup.md).
+---

-## When to Use / When NOT to Use
+## Core Commands (CLI)

-**Use when:**
- you want **fire-and-observe** delegation — kick off work and get a completion
-  signal rather than blocking a terminal;
- several agent types (claude-code, codex, opencode, human) must follow **one**
-  completion protocol;
- you need **multi-job fan-out** across tmux sessions with safe job claiming;
- you want a clean PoC → authenticated-broker upgrade path.
-
-**Do NOT use when:**
- a one-shot `claude -p '…'` that returns inline is enough (no async signal
-  needed) — just use the [claude-code](../claude-code/SKILL.md) skill directly;
- you need request/response RPC or large artifact transfer (this is a
-  one-direction event stream, not a data bus);
- the payload would carry secrets and you're still on the public broker — move
-  to the own-broker stage first.
-
-## Quick Start
-
-The one-line wrapper handles register + subscriber-first + agent launch. If
-you're new, **start here** and only fall back to the manual 5-step flow when
-you need finer control.
+The `multi-agent-mux-delegate-job` bash wrapper handles job registration, subscriber management, agent session targeting, and validation hooks:

 ```bash
-# 1) one line: register → start subscriber → launch agent in tmux
-#    (uses public broker by default; last stdout line is the audit-log dir)
+# 1) Submit a new job to a targeted agent session (e.g. tmux session name 'demo')
 multi-agent-mux-delegate-job submit \
-  --agent claude-code \
-  --prompt "Create 10 sorting problems and save them to sort_problems.md" \
-  --workdir /path/to/project \
-  --agent-session tmux:demo \
+  --agent <claude-code|hermes-agent|agy-agent|cline-agent|human> \
+  --agent-session tmux:<session_name> \
+  --prompt "Task description or instructions here" \
  --timeout 3600 --idle-timeout 120
-# → stdout: registered job: <JID>
-#          subscriber pid: …
-#          agent launched in tmux session: demo
-#          subscriber output: <one line per event>
-#          /path/to/project/.mam/delegate_job_logs/<JID>     ← audit log dir

-# 2) at any time, query the job or its audit log
-multi-agent-mux-delegate-job status --job <JID>
-multi-agent-mux-delegate-job logs   <JID>            # pretty timeline
-multi-agent-mux-delegate-job logs   --list           # every job, live status
+# 2) Submit a job with a feedback loop (Worker-Reviewer Loop)
+multi-agent-mux-delegate-job submit \
+  --agent <worker_agent> --agent-session tmux:<worker_session> \
+  --type loop --reviewer <reviewer_agent> --reviewer-session tmux:<reviewer_session> \
+  --prompt "Task description"

-# 3) run a user-supplied validator against the job's artifacts
-multi-agent-mux-delegate-job verify --job <JID> --validate ./validate.sh
+# 3) Check job status and audit logs
+multi-agent-mux-delegate-job status --job <JOB_ID>
+multi-agent-mux-delegate-job logs   <JOB_ID>            # Chronological log of events
+multi-agent-mux-delegate-job list                       # Summary of all registered jobs
+
+# 4) Verify job artifacts with a validation script
+multi-agent-mux-delegate-job verify --job <JOB_ID> --validate ./validate.sh
 ```

-The wrapper enforces the **subscribe-before-publish** ordering and **forwards
-the freshly-minted `JOB_ID` into the agent's prompt** (so the agent calls
-`publish_event.py --job <JID>` with the right id — see Pitfall §"Wrong job_id
-propagated to the agent"). When you need finer control, the manual flow is:
+---

-```bash
-# Manual 5-step (same outcome, more knobs)
-PY=.venv/bin/python
-SKILL=./.agents/skills/multi-agent-mux-delegate-job/scripts
+## Task Delegation Types

-# 1) register
-JID=$($PY "$SKILL/registry.py" register \
-        --prompt "…" --agent claude-code --agent-session tmux:demo \
-        --timeout 3600 --idle-timeout 120)
+Supported job types include:
+- `direct` (default): Single agent execution (direct tasking).
+- `loop` (Worker-Reviewer Loop): Alternates worker execution and reviewer evaluation until the reviewer approves (`PASS`) or the iteration limit (`--max-iterations`) is exceeded.
+- `discuss` (Research & Discussion): Collaboration between two agents to reach a consensus (e.g., agreeing on a design or plan).

-# 2) START THE SUBSCRIBER FIRST (MQTT does not queue non-retained msgs)
-$PY "$SKILL/job_subscriber.py" --job "$JID" --timeout 3600 --idle-timeout 120 &
+For detailed state machine diagrams and configurations, see [DELEGATION_TYPES.md](./DELEGATION_TYPES.md).

-# 3) pass JID to the agent and instruct it to publish events with --job "$JID"
-#    (don't hard-code a job id you saw earlier — see Pitfall §"Wrong job_id")
+---

-# 4) on completion the subscriber prints events and exits 0/1/2
+## The Event Protocol Contract

-# 5) inspect any time
-$PY "$SKILL/registry.py" get       --job "$JID"
-$PY "$SKILL/registry.py" logs      "$JID"        # positional job id
-$PY "$SKILL/registry.py" logs --list
-```
+Every agent participating in the delegation contract must follow the same lifecycle publishing protocol using `publish_event.py`:

-## Job Protocol
+1. **On Start**: Publish `started` event.
+   `python3 .agents/skills/multi-agent-mux-delegate-job/scripts/publish_event.py --job "$JOB_ID" --event started`
+2. **On Tool/Permission Prompt**: Publish `permission_required` event.
+   `python3 ... --job "$JOB_ID" --event permission_required --detail "<tool>:<reason>"`
+3. **On Progress Update (Optional)**: Publish `progress` event.
+   `python3 ... --job "$JOB_ID" --event progress --detail "<status_update>"`
+4. **On Success**: Publish `completed` event.
+   `python3 ... --job "$JOB_ID" --event completed --detail "<summary>"` (Reviewer should include `"PASS"` in the detail to approve).
+5. **On Failure/Feedback**: Publish `error` event.
+   `python3 ... --job "$JOB_ID" --event error --detail "<reason_or_feedback>"`

-One topic per job: `python/mqtt/jobs/<job_id>/events`. Payload (JSON, UTF-8,
-`schema_version=1`):
-
-```json
-{ "schema_version": 1, "seq": 7, "job_id": "abc12345",
-  "event": "started|permission_required|progress|completed|error",
-  "timestamp": "2026-06-19T09:32:00Z", "detail": "generalised text",
-  "data": { "optional": "metadata" } }
-```
-
- `seq` is monotonic per job (first = 1); the subscriber uses it to spot
-  reorder/duplication.
- `timestamp` is advisory — timeouts are measured from **receive** time.
- `detail`/`data` carry **no** secrets or absolute paths.
- A `schema_version` or `job_id` mismatch is **dropped** (defensive parsing).
-
-`started` and `completed`/`error` are the mandatory bookends; `completed`→exit 0,
-`error`→exit 1. Full catalogue + production `auth_token` handling:
-[`job-protocol.md`](./job-protocol.md).
-
-## Registry Format
-
-```
-.mam/jobs/<id>.json        # metadata record (single source of truth)
-.mam/jobs/<id>.events.log  # append-only JSON-lines log (debug, optional)
-.mam/jobs/.lock            # fcntl advisory lock for the registry
-```
-
-The record holds `status`, `prompt`, `agent`, `agent_session`, a `broker` block,
-`topic_prefix`, `timeout_sec`/`idle_timeout_sec`, `expected_artifacts`,
-`last_seq`, and (production) `auth_token`. Because the `broker` block lives in
-the record, `publish_event.py` connects from the registry alone. Concurrency,
-the atomic rename trick, and multi-session job claiming are in
-[`registry.md`](./registry.md).
+---

 ## Audit Logs

-Every job's lifecycle is mirrored to a **persistent, append-only audit log**
-under `.mam/delegate_job_logs/` (override with `DELEGATE_JOB_LOGS_DIR`;
-default `<cwd>/.mam/delegate_job_logs`). Unlike the registry — live state
-mutated in place and liable to be cleaned up — the audit log is durable
-history you can replay after the fact. It is git-ignored.
+Each job's lifecycle events are mirrored to a persistent, append-only audit log under `.mam/delegate_job_logs/<job_id>/` (containing `meta.json`, `events.ndjson`, and `status.json`). Use `multi-agent-mux-delegate-job logs <job_id>` to view the timeline.

-```
-.mam/delegate_job_logs/<job_id>/
-  meta.json      # registration snapshot: prompt, agent, broker, timeouts, …
-  events.ndjson  # append-only, one JSON event per line, in time order
-  status.json    # current status only (fast point-query)
-```
+---

-**What is logged, automatically:**
+## Best Practices and Pitfalls

-| When | `events.ndjson` line | Written by |
-|------|----------------------|------------|
-| job registered | `registered` (also seeds meta.json + status.json) | `registry.register_job` |
-| any status change | `status_changed` (`from`/`to`; also rewrites status.json) | `update_job_status`, `pick_pending` |
-| event published | `published` (carries the exact payload — reproducible) | `publish_event.py` |
-| event received | `received` (subscriber's external view) | `job_subscriber.py` |
-
-Both the emitter side (`published`) and the observer side (`received`) are
-recorded, so a dropped publish or a missed receive is still visible from the
-other. Every write is **best-effort and isolated** — an fcntl-locked append
-guarded by `try/except` that only ever emits a `logger.warning`, so a logging
-failure can never break a publish, a subscribe, or a registry write. stdout is
-never touched.
-
-**Reading them:**
-
-```bash
-multi-agent-mux-delegate-job logs <job_id>     # pretty-print one job's timeline
-multi-agent-mux-delegate-job logs --list       # summarise every logged job (with live status)
-# or directly via the registry CLI:
-$PY scripts/registry.py logs <job_id> [--tail N] [--json]
-$PY scripts/registry.py logs --list [--json]
-```
-
-`submit` prints the job's audit-log directory as its last stdout line, so a
-caller can `tail -n1` to locate it.
-
-## Broker Setup
-
-| Stage | Broker | Auth | Transport |
-|-------|--------|------|-----------|
-| PoC | `broker.hivemq.com` | none | 1883 plaintext |
-| Production | self-hosted Mosquitto/EMQX | user/pass + ACL | 8883 TLS |
-
-All connection settings come from env (`MQTT_BROKER`, `MQTT_PORT`, `MQTT_TLS`,
-`MQTT_USERNAME`/`MQTT_PASSWORD`, `MQTT_CA_CERTS`, …) resolved by
-`broker_config_from_env()`, with the registry `broker.*` block overriding per
-job. Moving to your own broker is **config only**: install Mosquitto, set
-`persistence true` + `acl_file` + `password_file` + a TLS `listener 8883`, grant
-the worker `write python/mqtt/jobs/+/events` and Hermes `read`, then flip
-`MQTT_TLS=1` and fill the registry `broker.*`. Step-by-step (conf, ACL,
-`mosquitto_passwd`, self-signed/private-CA certs, cut-over verification):
-[`mqtt-broker-setup.md`](./mqtt-broker-setup.md).
-
-## Agent Adapters
-
-Each agent voluntarily follows the contract: receive a `JOB_ID` (or registry
-path), call `publish_event.py` at lifecycle points, exit 0/1/2. **The contract
-in one line**: every event call uses `--job "$JOB_ID"` where `$JOB_ID` is the
-**freshly-issued id from the registry record for *this* delegation** — never a
-job_id you saw in an earlier session (Pitfall §"Wrong job_id propagated to the
-agent").
-
- **claude-code** — Claude Code calls `publish_event.py` via its Bash tool at
-  lifecycle points. `submit --mode tmux` injects a prompt that already names
-  `$JOB_ID`; if you drive claude manually, hand it the id explicitly. Reference
-  instruction block (the wrapper injects something equivalent):
-
-  ```text
-  Your job_id is "$JOB_ID" (read it from the registry record for this delegation —
-  do not reuse any job_id you saw before).
-
-  On start:        $PY multi-agent-mux-delegate-job/scripts/publish_event.py --job "$JOB_ID" --event started
-  On permission:   $PY … --job "$JOB_ID" --event permission_required --detail "<tool>:<what>"
-  On progress:     $PY … --job "$JOB_ID" --event progress --detail "<short status>"
-  On success:      $PY … --job "$JOB_ID" --event completed --detail "<one-line summary>"
-  On failure:      $PY … --job "$JOB_ID" --event error     --detail "<one-line reason>"
-
-  Task: <the user's prompt>
-
-  The subscriber for "$JOB_ID" is already running; your completed/error event
-  ends the job. Exit codes: 0 completed, 1 error, 2 publish failure.
-  ```
-
-  See [claude-code](../claude-code/SKILL.md) for tmux orchestration patterns.
- **codex** — same contract. Invoke `codex exec "<instruction-block-above>"` or
-  wire `publish_event.py` as an MCP tool so the agent can call it directly.
- **opencode** — wire `publish_event.py` as a tool/command the agent can call;
-  identical event points.
- **human** — a person does the work, reads the registry record, then runs
-  `publish_event.py --job <id> --event completed` (or `error`) by hand.
-
-## User Interface
-
-The [`multi-agent-mux-delegate-job`](./multi-agent-mux-delegate-job) bash wrapper bundles register +
-subscribe-first + run-agent + validate:
-
-```bash
-multi-agent-mux-delegate-job submit  --agent claude-code \
-   --prompt "Create 10 sorting problems and save them to sort_problems.md" \
-    --workdir /path/to/project --timeout 3600 [--validate ./validate.sh]
-multi-agent-mux-delegate-job status  --job <id>          # one record, pretty-printed
-multi-agent-mux-delegate-job list                        # all jobs, one line each
-multi-agent-mux-delegate-job verify  --job <id> --validate ./validate.sh   # runs it, reports exit code
-multi-agent-mux-delegate-job wait    [--job <id>]        # block until terminal (else --wait-any)
-```
-
-`submit` **always starts the subscriber before the agent** (the ordering
-dependency), runs the agent in `--mode print` (one-shot) or `--mode tmux`, and
-calls `--validate` afterward if given. The skill automates job-id generation,
-registry creation, broker resolution, subscriber-first ordering, agent launch,
-and completion detection; it does **not** automate the agent's internals or your
-business-logic validation — those are hooks you fill (`validate.sh` reads
-`$JOB_ID`/`$REGISTRY_DIR`).
-
-## Common Pitfalls
-
- **Publishing before subscribing** — MQTT does not queue non-retained messages
-  for absent subscribers. Start `job_subscriber.py` *before* the agent, or rely
-  on retained terminal events (production). `submit` enforces this.
- **Wrong job_id propagated to the agent** — the wrapper prints a fresh `JOB_ID`
-  on every `submit`. If your agent instruction (or the wrapper's prompt template)
-  hard-codes an old job_id, the agent calls `publish_event.py --job <wrong>`,
-  the subscriber's defensive parser drops it as a `job_id` mismatch, and the
-  delegator waits until idle timeout (exit 2). Fix: instruct the agent to
-  **read the job_id from the registry record for *this* delegation** (or pass it
-  in via env / `--prompt` interpolation), never from prior runs. `submit`'s
-  default prompt template interpolates `$JOB_ID` for you — if you build a custom
-  prompt, do the same.
- **tmux session name collision** — `submit --mode tmux` derives the session
-  name from `--agent-session tmux:<name>` (default `tmux:claude`). If a session
-  with that name is already attached (e.g. you ran the demo and the previous
-  session is still open), `tmux new-session -d -s <name>` fails and the agent
-  never launches. Pick a unique `--agent-session` per concurrent delegation
-  (e.g. `tmux:demo`, `tmux:claude-a`, `tmux:claude-b`) or kill the stale one
-  (`tmux kill-session -t claude`) before re-running.
- **Timeout before `started`** — a cold-starting agent may not emit `started`
-  for a while; the wall-clock timeout starts at subscribe time so a stuck agent
-  still terminates. Don't set `--timeout` so low you false-positive a slow start.
- **No retry on publish** — a dropped `completed` would hang the delegator
-  forever; `publish_event.py` retries with exponential backoff and exits 2 if it
-  still fails, so the delegator is never left waiting silently.
- **QoS-1 duplicates / reorders** — a terminal event can arrive twice, or
-  `error` can trail `completed`; the subscriber's terminal state machine
-  finalises each job once and ignores the rest.
- **Trusting the public broker** — anyone can publish there; never make a real
-  decision on a PoC signal. Add `auth_token` + an authenticated broker first.
- **Secrets in `detail`/`data`** — keep payloads generalised; no paths, keys, or
-  tokens (except the production `auth_token` in `data`).
-
-## Subagent Orchestration Pattern
-
-When using this skill from a Hermes `delegate_task` subagent to dispatch work to
-a coding-agent CLI (agy/claude) running in a tmux session, the following pattern
-has been verified (2026-06-21, 6-batch refactoring sprint):
-
-### Roles
- **Main worker** (implementation): one agent session (e.g. `agy-new`) receives
-  brief files and executes code changes.
- **Reviewers** (spec compliance + code quality): two other agent sessions
-  (e.g. `agy-existing`, `claude-existing`) review the diff in parallel.
- **Hermes** (orchestrator): dispatches subagents, verifies diffs, commits,
-  and falls back to direct fixes when reviewers find issues.
-
-### Key lessons learned
-1. **Brief delivery via file path** — don't paste long briefs inline via
-   `tmux send-keys`; the TUI may swallow them. Instead, send a short instruction
-   like "follow /tmp/batch1-brief.md" and let the agent read the file.
-2. **Polling vs MQTT subscriber** — for short tasks (<5min), pane polling
-   (`capture-pane` + grep for completion markers) is simpler and more reliable
-   than registering a job via `registry.py` + `job_subscriber.py`. Use MQTT
-   subscriber only for long-running jobs (>5min) where push notification matters.
-3. **Reviewers catch different bugs** — in practice, agy (Flash) caught
-   semantic issues (slash matching, export scope), while claude (Opus) caught
-   API signature mismatches (paho v2 5-arg vs 4-arg `on_disconnect`). Two
-   reviewers with different models provide complementary coverage.
-4. **Hermes fallback fix** — when reviewers find a small, well-defined issue
-   (wrong argument count, missing slash), Hermes should fix it directly rather
-   than re-dispatching the implementer. This saves a full round-trip.
-5. **Batch grouping** — group 2-3 FW items per batch when they touch different
-   files (no file overlap). This amortises the dispatch overhead. Items touching
-   the same file must be in separate batches to avoid conflicts.
-6. **Pane Snapshots & Truncation Prevention** — to prevent long agent responses from being scrolled out and truncated due to TUI viewport limitations, enforce the following snapshotting pattern:
-   - Immediately after dispatching a brief, capture the pre-brief pane buffer via `capture-pane -S -200`.
-   - During long execution, run a background loop taking incremental snapshots (e.g. every 30 seconds `>> /tmp/pane-snap.txt`).
-   - Immediately after job termination, capture the entire final pane state to ensure no terminal logs are lost.
-
-## Verification Checklist
-
- [ ] `started` → `completed` over the public broker: subscriber prints the
-      lines and exits **0**.
- [ ] `error` path: subscriber exits **1**.
- [ ] timeout path: no terminal event within `--timeout`/`--idle-timeout` →
-      exit **2**.
- [ ] polluted payload (bad JSON, wrong `schema_version`, wrong `job_id`) is
-      dropped with a warning, not crashed on.
- [ ] one tmux session processes two registry jobs in sequence; a second
-      session with a different `agent_session` claims only its own.
- [ ] broker cut-over: same scripts reach an authenticated TLS broker with env
-      changes only; a credential without write ACL is rejected; a late
-      subscriber still receives the retained terminal event.
- [ ] `publisher.py`/`subscriber.py`/`README.md` demo on `python/mqtt/sample`
-      still works unchanged (regression).
- [ ] **audit log integrity** — for a completed job,
-      `.mam/delegate_job_logs/<JID>/events.ndjson` contains `registered` →
-      `received started` → `published completed` (in that order), and
-      `status.json.status == "completed"` matches the registry record. A
-      logging failure (e.g. read-only log dir) does not break the publish or
-      subscribe path — only a `logger.warning` is emitted.
- [ ] **end-to-end demo smoke** — run
-      `multi-agent-mux-delegate-job submit --agent claude-code --agent-session tmux:demo-smoke
-       --prompt "echo hello and call publish_event.py --job <JID>
-       --event completed" --timeout 120` and confirm
-      (a) registered job id echoed, (b) subscriber pid echoed, (c) tmux session
-      name printed, (d) `events.ndjson` grows as the agent runs, (e) final
-      stdout line is the audit-log dir.
+- **Subscribe-Before-Publish**: The subscriber must be running before the agent starts publishing. The `submit` command handles this automatically by launching the subscriber in the background first.
+- **Fresh job_id Propagation**: Make sure the worker agent receives the correct `JOB_ID` generated for the current run, rather than reusing stale IDs from previous sessions.
+- **Brief delivery via file path**: For long or complex prompts, write the instructions to a file (e.g. `/tmp/task-brief.md`) and send a short prompt that points to the file path, so the agent's TUI does not swallow or truncate long pasted input.
+- **Batch Grouping**: Group tasks that touch different files into one batch to amortize dispatch overhead; keep tasks that touch the same file in separate batches to avoid conflicts.
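The event contract in this file can be made concrete with a small defensive parser on the observing side. This is a sketch under stated assumptions, not the code in `job_subscriber.py`: the field names (`schema_version`, `job_id`, `event`) and the exit-code mapping come from the Job Protocol text removed in this diff, while `parse_event` and `exit_code_for` are hypothetical helper names.

```python
import json
from typing import Optional

VALID_EVENTS = {"started", "permission_required", "progress", "completed", "error"}


def parse_event(raw: bytes, expected_job_id: str, schema_version: int = 1) -> Optional[dict]:
    """Return the decoded payload, or None if the message should be dropped."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    if payload.get("schema_version") != schema_version:
        return None
    if payload.get("job_id") != expected_job_id:
        # A stale or foreign job id is ignored rather than trusted.
        return None
    event = payload.get("event")
    if not isinstance(event, str) or event not in VALID_EVENTS:
        return None
    return payload


def exit_code_for(event: str) -> Optional[int]:
    """Map a terminal event to the subscriber exit code; None if not terminal."""
    return {"completed": 0, "error": 1}.get(event)
```

Dropping a message whose `job_id` does not match is what makes the "Fresh job_id Propagation" practice matter: an agent that publishes under a stale id is never observed, and the delegator only finds out when the timeout fires.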
@@ -16,6 +16,13 @@ set -euo pipefail

 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

+# Load local .env if it exists in current dir or workspace root
+if [[ -f .env ]]; then
+  set -a; source .env; set +a
+elif [[ -f "$SCRIPT_DIR/../../.env" ]]; then
+  set -a; source "$SCRIPT_DIR/../../.env"; set +a
+fi
+
 # Pick an interpreter: prefer a project .venv, else python3.
 pick_python() {
  local py_bin
@@ -46,6 +53,8 @@ multi-agent-mux-delegate-job <command> [options]
  submit  --agent <name> --prompt <text> [--workdir <dir>] [--agent-session <label>]
          [--timeout <sec>] [--idle-timeout <sec>] [--validate <script>]
          [--registry-dir <dir>] [--dry-run]
+          [--type <direct|loop|discuss>] [--reviewer <reviewer_agent>]
+          [--reviewer-session <reviewer_session>] [--max-iterations <count>]
          # The skill is tmux-interactive only; --mode print was removed.
  status  --job <id> [--registry-dir <dir>]
  list    [--registry-dir <dir>]
@@ -59,6 +68,7 @@ EOF
 AGENT="claude-code"; PROMPT=""; WORKDIR="$(pwd)"; AGENT_SESSION="tmux:claude"
 TIMEOUT=3600; IDLE_TIMEOUT=120; VALIDATE=""; DRY_RUN=0
 JOB_ID=""; REGISTRY_DIR="$REGISTRY_DIR_DEFAULT"
+TYPE="direct"; REVIEWER="hermes"; REVIEWER_SESSION="tmux:hermes"; MAX_ITERATIONS=5

 parse_opts() {
  while [[ $# -gt 0 ]]; do
@@ -73,6 +83,10 @@ parse_opts() {
      --job) JOB_ID="$2"; shift 2;;
      --registry-dir) REGISTRY_DIR="$2"; shift 2;;
      --dry-run) DRY_RUN=1; shift;;
+      --type) TYPE="$2"; shift 2;;
+      --reviewer) REVIEWER="$2"; shift 2;;
+      --reviewer-session) REVIEWER_SESSION="$2"; shift 2;;
+      --max-iterations) MAX_ITERATIONS="$2"; shift 2;;
      *) echo "unknown option: $1" >&2; usage; exit 1;;
    esac
  done
@@ -88,26 +102,29 @@ cmd_submit() {
  # 1) register job (prints the new job id)
  JOB_ID="$("$PY" "$SCRIPT_DIR/scripts/registry.py" --registry-dir "$REGISTRY_DIR" register \
      --prompt "$PROMPT" --agent "$AGENT" --agent-session "$AGENT_SESSION" \
-      --timeout "$TIMEOUT" --idle-timeout "$IDLE_TIMEOUT")"
+      --timeout "$TIMEOUT" --idle-timeout "$IDLE_TIMEOUT" \
+      --job-type "$TYPE" --reviewer "$REVIEWER" --reviewer-session "$REVIEWER_SESSION" \
+      --max-iterations "$MAX_ITERATIONS")"
  echo "registered job: $JOB_ID"

-  # 2) START THE SUBSCRIBER FIRST (ordering dependency — MQTT does not queue
-  #    non-retained messages for absent subscribers).
-  local logf="$REGISTRY_DIR/$JOB_ID.subscriber.out"
-  "$PY" "$SCRIPT_DIR/scripts/job_subscriber.py" --registry-dir "$REGISTRY_DIR" \
-      --job "$JOB_ID" --timeout "$TIMEOUT" --idle-timeout "$IDLE_TIMEOUT" \
-      >"$logf" 2>&1 &
-  local sub_pid=$!
-  echo "subscriber pid: $sub_pid (log: $logf)"
-  sleep 1  # give the subscriber time to CONNACK + SUBSCRIBE before the agent runs
+  if [[ "$TYPE" == "direct" ]]; then
+    # 2) START THE SUBSCRIBER FIRST (ordering dependency — MQTT does not queue
+    #    non-retained messages for absent subscribers).
+    local logf="$REGISTRY_DIR/$JOB_ID.subscriber.out"
+    "$PY" "$SCRIPT_DIR/scripts/job_subscriber.py" --registry-dir "$REGISTRY_DIR" \
+        --job "$JOB_ID" --timeout "$TIMEOUT" --idle-timeout "$IDLE_TIMEOUT" \
+        >"$logf" 2>&1 &
+    local sub_pid=$!
+    echo "subscriber pid: $sub_pid (log: $logf)"
+    sleep 1  # give the subscriber time to CONNACK + SUBSCRIBE before the agent runs

-  # 3) run the agent (or print the command for dry-run / missing binary)
-  local pub="$PY $SCRIPT_DIR/scripts/publish_event.py --registry-dir $REGISTRY_DIR --job $JOB_ID"
-  # NOTE: the agent MUST use --job "$JOB_ID" (the one we just minted). Hard-coding
-  # an id from an earlier session is the #1 reason a delegated job sits idle and
-  # times out (see SKILL.md "Wrong job_id propagated to the agent"). We make the
-  # freshness explicit in the instruction header.
-  local instructions="Your job_id is \"$JOB_ID\" (the one just registered for THIS delegation — read it from the registry record, do NOT reuse any job_id you saw in earlier runs).
+    # 3) run the agent (or print the command for dry-run / missing binary)
+    local pub="$PY $SCRIPT_DIR/scripts/publish_event.py --registry-dir $REGISTRY_DIR --job $JOB_ID"
+    # NOTE: the agent MUST use --job "$JOB_ID" (the one we just minted). Hard-coding
+    # an id from an earlier session is the #1 reason a delegated job sits idle and
+    # times out (see SKILL.md "Wrong job_id propagated to the agent"). We make the
+    # freshness explicit in the instruction header.
+    local instructions="Your job_id is \"$JOB_ID\" (the one just registered for THIS delegation — read it from the registry record, do NOT reuse any job_id you saw in earlier runs).

 On start run:        $pub --event started.
 On permission/tool prompt run: $pub --event permission_required --detail '<tool>:<what>'.
@@ -119,40 +136,185 @@ The subscriber for this job_id is already running; your completed/error event en

 Task: $PROMPT"

-  run_agent "$JOB_ID" "$instructions"
+    run_agent "$JOB_ID" "$instructions"

-  # 4) optional validation hook
-  if [[ -n "$VALIDATE" ]]; then
-    echo "running validation: $VALIDATE"
-    if JOB_ID="$JOB_ID" REGISTRY_DIR="$REGISTRY_DIR" bash "$VALIDATE"; then
-      echo "validation: PASS"
-    else
-      local rc=$?
-      echo "validation: FAIL (exit $rc)"
+    # 4) optional validation hook
+    if [[ -n "$VALIDATE" ]]; then
+      echo "running validation: $VALIDATE"
+      if JOB_ID="$JOB_ID" REGISTRY_DIR="$REGISTRY_DIR" bash "$VALIDATE"; then
+        echo "validation: PASS"
+      else
+        local rc=$?
+        echo "validation: FAIL (exit $rc)"
+      fi
    fi
+
+    if [[ "$DRY_RUN" == "1" ]]; then
+      # In dry-run we never started a real subscriber (the wrapper short-circuits
+      # before launching one), but the wait below would still try to join the
+      # background sub_pid from cmd_submit. Skip both the wait and the subscriber
+      # log dump; the user just wants to see the instruction that would have run.
+      local logs_root_dry="${DELEGATE_JOB_LOGS_DIR:-$WORKDIR/delegate_job_logs}"
+      echo "$logs_root_dry/$JOB_ID"
+      return 0
+    fi
+
+    wait "$sub_pid" || true
+    echo "subscriber output:"; cat "$logf" || true
+
+    # Last stdout line: the persistent audit-log dir for this job (see SKILL.md
+    # "Audit Logs"). Callers can scrape `tail -n1` to find it.
+    local logs_root="${DELEGATE_JOB_LOGS_DIR:-$WORKDIR/delegate_job_logs}"
+    echo "$logs_root/$JOB_ID"
+  else
+    # Implement loop/discuss orchestrator
+    local iteration=1
+    local current_prompt="$PROMPT"
+    local current_session="$AGENT_SESSION"
+    local current_role="worker"
+
+    if [[ "$DRY_RUN" == "1" ]]; then
+      echo "[dry-run] orchestrator loop would start for job: $JOB_ID type: $TYPE"
+      echo "worker session: $AGENT_SESSION, reviewer session: $REVIEWER_SESSION"
+      local logs_root_dry="${DELEGATE_JOB_LOGS_DIR:-$WORKDIR/delegate_job_logs}"
+      echo "$logs_root_dry/$JOB_ID"
+      return 0
+    fi
+
+    while true; do
+      echo "=================================================="
+      echo "Iteration $iteration - Role: $current_role"
+      echo "Session: $current_session"
+      echo "=================================================="
+
+      # Update job details in registry
+      "$PY" "$SCRIPT_DIR/scripts/registry.py" --registry-dir "$REGISTRY_DIR" update \
+          --job "$JOB_ID" \
+          --agent-session "$current_session" \
+          --prompt "$current_prompt" \
+          --iteration "$iteration" \
+          --status "pending"
+
+      # Start subscriber
+      local logf="$REGISTRY_DIR/${JOB_ID}.iter_${iteration}_${current_role}.subscriber.out"
+      "$PY" "$SCRIPT_DIR/scripts/job_subscriber.py" --registry-dir "$REGISTRY_DIR" \
+          --job "$JOB_ID" --timeout "$TIMEOUT" --idle-timeout "$IDLE_TIMEOUT" \
+          >"$logf" 2>&1 &
+      local sub_pid=$!
+      echo "subscriber pid: $sub_pid (log: $logf)"
+      sleep 1
+
+      # Format instruction block
+      local pub="$PY $SCRIPT_DIR/scripts/publish_event.py --registry-dir $REGISTRY_DIR --job $JOB_ID"
+      local instructions="Your job_id is \"$JOB_ID\" (the one just registered for THIS delegation — read it from the registry record, do NOT reuse any job_id you saw in earlier runs).
+
+On start run:        $pub --event started.
+On permission/tool prompt run: $pub --event permission_required --detail '<tool>:<what>'.
+On progress (optional): $pub --event progress --detail '<short status>'.
+On success run:      $pub --event completed --detail '<one-line summary>'.
+On failure run:      $pub --event error     --detail '<one-line reason>'.
+
+The subscriber for this job_id is already running; your completed/error event ends the job. Exit codes: 0 completed, 1 error, 2 publish failure.
+
+Task: $current_prompt"
+
+      # Trigger agent
+      run_agent "$JOB_ID" "$instructions" "$current_session"
+
+      # Wait for subscriber
+      local sub_rc=0
+      wait "$sub_pid" || sub_rc=$?
+      echo "subscriber output:"; cat "$logf" || true
+
+      # Check job status based on subscriber exit code
+      local job_status="running"
+      if [[ $sub_rc -eq 0 ]]; then
+        job_status="completed"
+      elif [[ $sub_rc -eq 1 ]]; then
+        job_status="error"
+      else
+        job_status="timeout"
+      fi
+
+      echo "Job role $current_role finished with status: $job_status"
+
+      # Retrieve feedback from the last event
+      local feedback
+      feedback="$("$PY" "$SCRIPT_DIR/scripts/registry.py" --registry-dir "$REGISTRY_DIR" get-feedback --job "$JOB_ID")"
+      echo "Feedback/Detail: $feedback"
+
+      if [[ "$current_role" == "worker" ]]; then
+        if [[ "$job_status" != "completed" ]]; then
+          echo "Worker did not complete successfully (status: $job_status). Terminating workflow."
+          break
+        fi
+
+        # Worker completed successfully, now switch to reviewer
+        current_role="reviewer"
+        current_session="$REVIEWER_SESSION"
+        # Build reviewer prompt based on type
+        if [[ "$TYPE" == "loop" ]]; then
+          current_prompt="Review the changes/artifacts generated for job $JOB_ID. Check if they meet the requirements. If correct, publish completed event with 'PASS'. If there are issues, publish error event with detailed feedback/nits. CRITICAL: When raising issues or giving a review, you MUST include the exact reason for the issue and a clear direction for improvement (문제 제시에 대한 이유와 확실한 개선 방향을 반드시 포함해야 합니다)."
+        elif [[ "$TYPE" == "discuss" ]]; then
+          current_prompt="Read draft/documents generated for job $JOB_ID. Review the feasibility and content. Write your feedback/objections. If you agree with the plan, reply with 'AGREE'."
+        fi
+      else
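+        # The loop reviewer is told to publish an error event when it finds
+        # issues, so "error" here means "changes requested". Only a timeout
+        # ends the workflow outright.
+        if [[ "$job_status" == "timeout" ]]; then
+          echo "Reviewer timed out (status: $job_status). Terminating workflow."
+          break
+        fi
+
+        # Approve only on a completed event whose detail carries the keyword as
+        # a whole word, so "disagree" or "password" cannot count as approval.
+        local success=0 keyword=""
+        case "$TYPE" in
+          loop)    keyword="pass" ;;
+          discuss) keyword="agree" ;;
+        esac
+        local approve_re="(^|[^a-z])${keyword}([^a-z]|\$)"
+        if [[ -n "$keyword" && "$job_status" == "completed" && "${feedback,,}" =~ $approve_re ]]; then
+          success=1
+        fi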
+ 
+        if [[ "$success" == "1" ]]; then
+          echo "Reviewer approved the work. Finalizing job as completed."
+          "$PY" "$SCRIPT_DIR/scripts/registry.py" --registry-dir "$REGISTRY_DIR" status --job "$JOB_ID" --set "completed"
+          break
+        else
+          # Reviewer rejected/provided feedback. Increment & check max iterations
+          if [[ $iteration -ge $MAX_ITERATIONS ]]; then
+            echo "Max iterations ($MAX_ITERATIONS) reached without approval. Terminating workflow."
+            "$PY" "$SCRIPT_DIR/scripts/registry.py" --registry-dir "$REGISTRY_DIR" status --job "$JOB_ID" --set "error"
+            break
+          fi
+
+          iteration=$((iteration + 1))
+          current_role="worker"
+          current_session="$AGENT_SESSION"
+          current_prompt="The reviewer provided the following feedback for job $JOB_ID: $feedback. Please modify the code/artifacts to address these comments. CRITICAL: As the Developer Team Leader, you must thoroughly review the suggested modifications, verify their validity, adopt/implement them if valid, and if you judge any recommendation to be invalid, do NOT implement it but instead explain your reasons clearly in your response and send it back to the reviewer (수정안을 최대한 꼼꼼히 검토하여 타당성을 검증하고, 타당하다면 수렴하여 수정을 진행하되, 타당하지 않다고 판단되는 부분이 있다면 그 이유를 명확히 밝혀 리뷰어에게 전달하십시오)."
+        fi
+      fi
+    done
+
+    # 4) optional validation hook
+    if [[ -n "$VALIDATE" ]]; then
+      echo "running validation: $VALIDATE"
+      if JOB_ID="$JOB_ID" REGISTRY_DIR="$REGISTRY_DIR" bash "$VALIDATE"; then
+        echo "validation: PASS"
+      else
+        local rc=$?
+        echo "validation: FAIL (exit $rc)"
+      fi
+    fi
+
+    # Last stdout line: the persistent audit-log dir
+    local logs_root="${DELEGATE_JOB_LOGS_DIR:-$WORKDIR/delegate_job_logs}"
+    echo "$logs_root/$JOB_ID"
  fi
-
-  if [[ "$DRY_RUN" == "1" ]]; then
-    # In dry-run we never started a real subscriber (the wrapper short-circuits
-    # before launching one), but the wait below would still try to join the
-    # background sub_pid from cmd_submit. Skip both the wait and the subscriber
-    # log dump; the user just wants to see the instruction that would have run.
-    local logs_root_dry="${DELEGATE_JOB_LOGS_DIR:-$WORKDIR/delegate_job_logs}"
-    echo "$logs_root_dry/$JOB_ID"
-    return 0
-  fi
-
-  wait "$sub_pid" || true
-  echo "subscriber output:"; cat "$logf" || true
-
-  # Last stdout line: the persistent audit-log dir for this job (see SKILL.md
-  # "Audit Logs"). Callers can scrape `tail -n1` to find it.
-  local logs_root="${DELEGATE_JOB_LOGS_DIR:-$WORKDIR/delegate_job_logs}"
-  echo "$logs_root/$JOB_ID"
 }

 run_agent() {
-  local job_id="$1"; local instructions="$2"
+  local job_id="$1"; local instructions="$2"; local target_session="${3:-$AGENT_SESSION}"
  # The skill is INTERACTIVE-ONLY. We never invoke `claude -p` or any other
  # one-shot print mode, because:
  #   - claude -p exits the moment stdin is drained, so there's nothing to
@@ -168,7 +330,7 @@ run_agent() {
    echo "[human agent] complete the task, then run publish_event.py --event completed"
    return
  fi
-  local sess="${AGENT_SESSION#tmux:}"
+  local sess="${target_session#tmux:}"

  if [[ "$DRY_RUN" == "1" ]]; then
    echo "[dry-run] would delegate task to running agent '$AGENT' in tmux session '$sess' with instructions:"
@@ -202,6 +364,7 @@ run_agent() {
  echo "살아있는 에이전트 세션 '$sess'에 작업을 위임합니다..."
  $_tmux set-buffer -b "job_buf_$job_id" "$instructions"
  $_tmux paste-buffer -b "job_buf_$job_id" -t "$sess"
+  sleep 0.5
  $_tmux send-keys -t "$sess" C-m
  $_tmux delete-buffer -b "job_buf_$job_id"
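
The worker/reviewer alternation in the loop/discuss branch above can be summarized as a small state machine. The following is a minimal Python sketch of that control flow, not the wrapper's implementation: `run_rounds`, `do_work`, and `do_review` are hypothetical names, turn failures (timeouts, errored turns) are left out, and the approval check is reduced to a lower-cased substring test on the reviewer's detail text.

```python
from typing import Callable, Optional

# Approval keyword the reviewer is asked to reply with, per job type.
APPROVAL_KEYWORD = {"loop": "pass", "discuss": "agree"}


def run_rounds(
    job_type: str,
    do_work: Callable[[int, Optional[str]], None],
    do_review: Callable[[int], str],
    max_iterations: int = 5,
) -> str:
    """Alternate worker and reviewer turns until approval or the iteration cap."""
    feedback: Optional[str] = None
    for iteration in range(1, max_iterations + 1):
        # Worker turn: the first round gets no feedback, later rounds get the
        # reviewer's previous comments.
        do_work(iteration, feedback)
        # Reviewer turn: returns the detail text of its final event.
        feedback = do_review(iteration)
        if APPROVAL_KEYWORD[job_type] in feedback.lower():
            return "completed"
    # Cap reached without approval.
    return "error"
```

With the default cap of 5, a job that is never approved goes through five worker turns and five reviewer turns before it is marked as an error.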
  
@@ -59,11 +59,11 @@ def _format_line(topic: str, payload: Dict[str, Any]) -> str:
 class _Watcher:
    """Holds the shared queue + the set of job_ids we accept events for."""

-    def __init__(self, expected_job_ids: Set[str], expected_tokens: Dict[str, Optional[str]]):
+    def __init__(self, expected_job_ids: Set[str], expected_tokens: Dict[str, Optional[str]], expected_seqs: Dict[str, int]):
        self.events: "queue.Queue[Tuple[str, Dict[str, Any]]]" = queue.Queue()
        self.expected = set(expected_job_ids)
        self.tokens = expected_tokens  # job_id -> expected auth_token (or None)
-        self.last_seq: Dict[str, int] = {jid: 0 for jid in expected_job_ids}
+        self.last_seq = dict(expected_seqs)

    def on_message(self, _client, _userdata, msg) -> None:
        # --- defensive parsing -------------------------------------------
@@ -153,7 +153,8 @@ def main(argv=None) -> int:

    expected_ids: Set[str] = {j["job_id"] for j in jobs}
    tokens = {j["job_id"]: j.get("auth_token") for j in jobs}
-    watcher = _Watcher(expected_ids, tokens)
+    # `or 0` also covers records where last_seq is present but null.
+    seqs = {j["job_id"]: int(j.get("last_seq") or 0) for j in jobs}
+    watcher = _Watcher(expected_ids, tokens, seqs)

    # Resolve timeouts from CLI, falling back to the (first) job's settings.
    base_job = jobs[0]
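
Seeding `last_seq` from the registry matters once a fresh subscriber is started for the same job on every iteration. The sketch below illustrates the idea under an assumption this hunk does not show: that the watcher drops any event whose sequence number is not greater than the last one recorded for that job. `SeqFilter` and its `accept` method are hypothetical stand-ins, not part of `job_subscriber.py`.

```python
from typing import Dict, List


class SeqFilter:
    """Hypothetical stand-in for the per-job sequence check in the watcher."""

    def __init__(self, expected_seqs: Dict[str, int]):
        # Copy so the caller's mapping is not mutated.
        self.last_seq = dict(expected_seqs)

    def accept(self, job_id: str, seq: int) -> bool:
        """Accept only events for known jobs with a strictly newer seq."""
        if job_id not in self.last_seq:
            return False
        if seq <= self.last_seq[job_id]:
            return False
        self.last_seq[job_id] = seq
        return True


# Made-up registry records: j1 already saw three events in an earlier turn.
jobs: List[dict] = [{"job_id": "j1", "last_seq": 3}, {"job_id": "j2"}]
seqs = {j["job_id"]: int(j.get("last_seq", 0)) for j in jobs}
flt = SeqFilter(seqs)
```

Under that assumption, a replayed event for `j1` with `seq` 3 is ignored and the next accepted one must carry `seq` 4 or higher, whereas a filter seeded with zeros would accept the stale event again.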
@@ -59,6 +59,10 @@ def register_job(
    expected_artifacts: Optional[List[str]] = None,
    bits: int = 32,
    auth_token: Optional[str] = None,
+    job_type: str = "direct",
+    reviewer: Optional[str] = None,
+    reviewer_session: Optional[str] = None,
+    max_iterations: int = 5,
 ) -> str:
    """Create a new ``pending`` job record and return its id.

@@ -90,6 +94,11 @@ def register_job(
        "expected_artifacts": expected_artifacts or [],
        "last_seq": 0,
        "auth_token": auth_token,
+        "job_type": job_type,
+        "reviewer": reviewer,
+        "reviewer_session": reviewer_session,
+        "max_iterations": int(max_iterations),
+        "iteration": 1,
    }
    with registry_lock(registry_dir):
        if mqtt_common._job_path(job_id, registry_dir).exists():
@@ -164,7 +173,7 @@ def append_event(job_id: str, registry_dir: str, payload: Dict[str, Any]) -> Non
 # convenience re-export so callers can `from registry import load_job`
 __all__ = [
    "register_job", "pick_pending", "update_status", "load_job",
-    "list_jobs", "append_event", "generate_job_id",
+    "list_jobs", "append_event", "generate_job_id", "get_feedback",
 ]


@@ -180,6 +189,44 @@ def _iter_records(registry_dir: str):
            logger.warning("skipping unreadable record %s: %s", path, exc)


+def get_feedback(job_id: str, registry_dir: str = DEFAULT_REGISTRY_DIR) -> str:
+    """Return the ``detail`` of the job's most recent completed/error event.
+
+    The unified audit log is consulted first; the per-job ``.events.log`` in the
+    registry directory is the fallback. Returns "" when neither has such an event.
+    """
+    terminal = ("completed", "error")
+
+    # 1) Unified audit log (ndjson), written synchronously by the subscriber.
+    try:
+        events = list(mqtt_common.iter_logged_events(job_id, mqtt_common.LOGS_DIR))
+        for e in reversed(events):
+            if e.get("source_event") in terminal or e.get("event") in terminal:
+                return e.get("detail", "")
+    except Exception as exc:  # best effort: fall through to the local log
+        logger.debug("audit log unavailable for %s: %s", job_id, exc)
+
+    # 2) Fallback: the local .events.log; the last terminal event wins.
+    feedback = ""
+    log_path = Path(registry_dir) / f"{job_id}.events.log"
+    try:
+        with open(log_path, "r", encoding="utf-8") as fh:
+            for line in fh:
+                if not line.strip():
+                    continue
+                try:
+                    payload = json.loads(line)
+                except json.JSONDecodeError:
+                    continue
+                if isinstance(payload, dict) and payload.get("event") in terminal:
+                    feedback = payload.get("detail", "")
+    except OSError:
+        pass  # missing or unreadable log: report no feedback
+    return feedback
+
+
 # --------------------------------------------------------------------------
 # CLI (so the bash wrapper can shell out without inline python)
 # --------------------------------------------------------------------------
@@ -197,6 +244,10 @@ def _build_parser() -> argparse.ArgumentParser:
    p_reg.add_argument("--bits", type=int, default=32, help="32 (PoC) or 128 (prod)")
    p_reg.add_argument("--artifact", action="append", default=[], dest="artifacts")
    p_reg.add_argument("--auth-token", default=None, help="HMAC auth token for the job (auto-generated if secure broker is detected)")
+    p_reg.add_argument("--job-type", default="direct", choices=["direct", "loop", "discuss"])
+    p_reg.add_argument("--reviewer", default=None)
+    p_reg.add_argument("--reviewer-session", default=None)
+    p_reg.add_argument("--max-iterations", type=int, default=5)

    p_list = sub.add_parser("list", help="list jobs (optionally by status)")
    p_list.add_argument("--status", default=None)
@@ -209,6 +260,16 @@ def _build_parser() -> argparse.ArgumentParser:
    p_status.add_argument("--job", required=True)
    p_status.add_argument("--set", required=True, dest="status")

+    p_update = sub.add_parser("update", help="update a job record")
+    p_update.add_argument("--job", required=True)
+    p_update.add_argument("--status", default=None)
+    p_update.add_argument("--agent-session", default=None)
+    p_update.add_argument("--prompt", default=None)
+    p_update.add_argument("--iteration", type=int, default=None)
+
+    p_feedback = sub.add_parser("get-feedback", help="get the last feedback detail (completed/error) for a job")
+    p_feedback.add_argument("--job", required=True)
+
    p_pick = sub.add_parser("pick", help="claim a pending job for a session; prints id")
    p_pick.add_argument("--agent-session", default="tmux:claude")

@@ -247,6 +308,10 @@ def main(argv: Optional[List[str]] = None) -> int:
            expected_artifacts=args.artifacts,
            bits=args.bits,
            auth_token=args.auth_token,
+            job_type=args.job_type,
+            reviewer=args.reviewer,
+            reviewer_session=args.reviewer_session,
+            max_iterations=args.max_iterations,
        )
        print(job_id)
        return 0
@@ -279,6 +344,27 @@ def main(argv: Optional[List[str]] = None) -> int:
            return 1
        return 0

+    if args.command == "update":
+        fields = {}
+        if args.status is not None:
+            fields["status"] = args.status
+        if args.agent_session is not None:
+            fields["agent_session"] = args.agent_session
+        if args.prompt is not None:
+            fields["prompt"] = args.prompt
+        if args.iteration is not None:
+            fields["iteration"] = args.iteration
+        try:
+            mqtt_common.update_job_status(args.job, rd, **fields)
+        except FileNotFoundError as exc:
+            print(str(exc), file=sys.stderr)
+            return 1
+        return 0
+
+    if args.command == "get-feedback":
+        print(get_feedback(args.job, rd))
+        return 0
+
    if args.command == "pick":
        job_id = pick_pending(args.agent_session, rd)
        if job_id is None:
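
For reference, the fallback path of `get_feedback` boils down to "scan an ndjson log and keep the detail of the last completed/error record". The sketch below shows that scan on in-memory lines. The helper name `last_terminal_detail` and the sample records are made up for illustration; real log records may carry more fields.

```python
import json
from typing import Iterable

TERMINAL_EVENTS = ("completed", "error")


def last_terminal_detail(lines: Iterable[str]) -> str:
    """Return the detail of the last completed/error record, or ""."""
    detail = ""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip corrupt lines instead of failing the lookup
        if isinstance(payload, dict) and payload.get("event") in TERMINAL_EVENTS:
            detail = payload.get("detail", "")
    return detail


# Made-up log: a worker turn, then a reviewer turn that requests changes.
sample_log = [
    '{"event": "started"}',
    '{"event": "completed", "detail": "implemented parser"}',
    "not json at all",
    '{"event": "started"}',
    '{"event": "error", "detail": "missing unit tests"}',
]
```

Because later records overwrite earlier ones, the reviewer's detail from the most recent turn is what the orchestrator sees, not the worker's earlier summary.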
@@ -282,7 +282,7 @@ mkdir -p "$STATE_DIR"
 # The same source is run through atomic_dump_yaml (flock + temp+rename). In the atomic wrapper,
 # if there are no 'actions', the write is skipped via SystemExit(0) (avoids needless reformatting).
 read -r -d '' RECON_SRC <<'PYEOF' || true
-import os, json, glob, subprocess, time
+import os, json, glob, subprocess, time, sqlite3
 from datetime import datetime, timezone
 import yaml

@@ -403,14 +403,28 @@ if tmux_confirmed:
        name = t['name']
        if name in yaml_session_names:
            continue
-        if not (name.endswith('-creator-claude') or name.endswith('-creator-agy')):
+        if name.endswith('-creator-claude'):
+            agent = 'claude'
+        elif name.endswith('-creator-agy'):
+            agent = 'agy'
+        elif name.endswith('-creator-hermes'):
+            agent = 'hermes'
+        elif name.endswith('-creator-cline'):
+            agent = 'cline'
+        else:
            continue
        srv = t.get('server', 'default')
        pm = pane_meta(name, srv)
        if not pm:
            continue
-        agent = 'claude' if name.endswith('-creator-claude') else 'agy'
-        cmd_full = 'claude --dangerously-skip-permissions' if agent == 'claude' else 'agy --dangerously-skip-permissions'
+        if agent == 'claude':
+            cmd_full = 'claude --dangerously-skip-permissions'
+        elif agent == 'agy':
+            cmd_full = 'agy --dangerously-skip-permissions'
+        elif agent == 'hermes':
+            cmd_full = 'hermes'
+        elif agent == 'cline':
+            cmd_full = 'cline -i'
        server_opt = f"-L {srv} " if srv != 'default' else ""
        entry = {
            'name': name,
@@ -430,7 +444,7 @@ if tmux_confirmed:
            entry['tui'] = {'model': '(unknown — capture after first message)', 'provider': 'anthropic',
                            'plan': '(unknown)', 'account': '(unknown)', 'version': '(unknown)'}
            entry['claude_session_id_own'] = None
-        else:
+        elif agent == 'agy':
            entry['child_pid'] = 0
            entry['agy_conversation_id_own'] = None
            entry['mcp_attachments'] = [
@@ -440,6 +454,12 @@ if tmux_confirmed:
                    'endpoint': 'https://stitch.googleapis.com/mcp'
                }
            ]
+        elif agent == 'hermes':
+            entry['child_pid'] = 0
+            entry['hermes_conversation_id_own'] = None
+        elif agent == 'cline':
+            entry['child_pid'] = 0
+            entry['cline_conversation_id_own'] = None
        d.setdefault('tmux_sessions', []).append(entry)
        yaml_session_names.add(name)
        drifts.append({'class': 'B', 'name': name,
@@ -505,6 +525,66 @@ for s in d.get('tmux_sessions', []):
        except Exception:
            pass

+# === drift C (hermes): materialize the new hermes session id (per-row own id) ===
+for s in d.get('tmux_sessions', []):
+    if not s.get('name', '').endswith('-creator-hermes'):
+        continue
+    if s.get('status') != 'running':
+        continue
+    if s.get('hermes_conversation_id_own'):
+        continue
+    cwd = (s.get('pane') or {}).get('cwd', '')
+    if not cwd:
+        continue
+    hdb = f"{home}/.hermes/state.db"
+    if os.path.exists(hdb):
+        try:
+            conn = sqlite3.connect(hdb)
+            r = conn.execute("SELECT id FROM sessions WHERE cwd=? ORDER BY started_at DESC LIMIT 1", (cwd,)).fetchone()
+            conn.close()
+            if r:
+                cid = r[0]
+                s['hermes_conversation_id_own'] = cid
+                drifts.append({'class': 'C', 'name': s['name'], 'msg': f"{s['name']}: conversation id materialized: {cid}"})
+                actions.append(f"updated conversation id: {cid}")
+        except Exception:
+            pass
+
+# === drift C (cline): materialize the new cline session id (per-row own id) ===
+for s in d.get('tmux_sessions', []):
+    if not s.get('name', '').endswith('-creator-cline'):
+        continue
+    if s.get('status') != 'running':
+        continue
+    if s.get('cline_conversation_id_own'):
+        continue
+    cwd = (s.get('pane') or {}).get('cwd', '')
+    if not cwd:
+        continue
+    sessions_dir = f"{home}/.cline/data/sessions"
+    if os.path.isdir(sessions_dir):
+        candidates = []
+        for session_folder in glob.glob(f"{sessions_dir}/*"):
+            if os.path.isdir(session_folder):
+                folder_name = os.path.basename(session_folder)
+                json_file = f"{session_folder}/{folder_name}.json"
+                if os.path.exists(json_file):
+                    candidates.append(json_file)
+        candidates.sort(key=os.path.getmtime, reverse=True)
+        for j in candidates:
+            try:
+                with open(j) as f:
+                    sdata = json.load(f)
+                if sdata.get('cwd') == cwd or sdata.get('workspace_root') == cwd:
+                    cid = sdata.get('session_id')
+                    if cid:
+                        s['cline_conversation_id_own'] = cid
+                        drifts.append({'class': 'C', 'name': s['name'], 'msg': f"{s['name']}: session id materialized: {cid}"})
+                        actions.append(f"updated session id: {cid}")
+                        break
+            except Exception:
+                pass
+
 # === drift D: stale UUID (the cached artifact has disappeared); report only, no changes ===
 ai = d.get('agent_identities', {}) or {}
 cl = (ai.get('claude') or {})
@@ -519,6 +599,28 @@ if ag.get('conversation_id'):
    if not os.path.exists(f"{home}/.gemini/antigravity-cli/conversations/{cid}.db"):
        drifts.append({'class': 'D', 'name': '(agy identity cache)',
                       'msg': f"stale UUID in agent_identities.agy.conversation_id: {cid} (.db missing)"})
+hr = (ai.get('hermes') or {})
+if hr.get('session_id'):
+    sid = hr['session_id']
+    hdb = f"{home}/.hermes/state.db"
+    has_session = False
+    if os.path.exists(hdb):
+        try:
+            conn = sqlite3.connect(hdb)
+            r = conn.execute("SELECT 1 FROM sessions WHERE id=?", (sid,)).fetchone()
+            conn.close()
+            has_session = r is not None
+        except Exception:
+            pass
+    if not has_session:
+        drifts.append({'class': 'D', 'name': '(hermes identity cache)',
+                       'msg': f"stale UUID in agent_identities.hermes.session_id: {sid} (session missing from db)"})
+cn = (ai.get('cline') or {})
+if cn.get('session_id'):
+    sid = cn['session_id']
+    if not os.path.exists(f"{home}/.cline/data/sessions/{sid}/{sid}.json"):
+        drifts.append({'class': 'D', 'name': '(cline identity cache)',
+                       'msg': f"stale UUID in agent_identities.cline.session_id: {sid} (session file missing)"})

 result = {
    'timestamp': now_iso,
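
Each newly supported agent currently adds one branch to the suffix check and one to the launch-command selection. A table-driven variant is sketched below as one possible way to keep the two in sync. It is an illustration, not the reconciliation script's code: `LAUNCH_CMD` and `infer_agent` are hypothetical names, and the command strings are copied from the branches in this diff.

```python
from typing import Optional

# Launch command per agent, as used when a drift-B session is adopted.
LAUNCH_CMD = {
    "claude": "claude --dangerously-skip-permissions",
    "agy": "agy --dangerously-skip-permissions",
    "hermes": "hermes",
    "cline": "cline -i",
}


def infer_agent(session_name: str) -> Optional[str]:
    """Map a tmux session name ending in '-creator-<agent>' to its agent."""
    for agent in LAUNCH_CMD:
        if session_name.endswith(f"-creator-{agent}"):
            return agent
    return None  # unknown suffix: the caller skips the session
```

With this shape, supporting another agent means adding one dictionary entry, and a session whose suffix is not in the table is skipped the same way the `else: continue` branch does.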
@@ -41,6 +41,7 @@ if [ -z "$AGENT" ]; then
    *-creator-claude) AGENT=claude ;;
    *-creator-agy)    AGENT=agy ;;
    *-creator-hermes) AGENT=hermes ;;
+    *-creator-cline)  AGENT=cline ;;
    *) echo "ERROR: cannot infer agent from '$SESSION_NAME'; pass --agent" >&2; exit 2 ;;
  esac
 fi
@@ -51,7 +52,7 @@ NOW_ISO=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
 PANE_PID=$(tmux list-panes -t "$SESSION_NAME" -F '#{pane_pid}' 2>/dev/null | head -1 || true)
 PANE_PID="${PANE_PID:-}"
 CHILD_PID=0
-if { [ "$AGENT" = "agy" ] || [ "$AGENT" = "hermes" ]; } && [ -n "$PANE_PID" ]; then
+if { [ "$AGENT" = "agy" ] || [ "$AGENT" = "hermes" ] || [ "$AGENT" = "cline" ]; } && [ -n "$PANE_PID" ]; then
  CHILD_PID=$(pgrep -P "$PANE_PID" -x "$AGENT" 2>/dev/null | head -1 || true)
  CHILD_PID="${CHILD_PID:-0}"
 fi
@@ -144,6 +145,13 @@ elif agent == 'hermes':
    cp = os.environ.get('CHILD_PID', '0')
    if cp.isdigit() and int(cp) > 0:
        target['child_pid'] = int(cp)
+elif agent == 'cline':
+    target['pane']['cmd'] = 'cline'
+    target['pane']['cmd_full'] = f'cline -i --id {uuid}'
+    target['cline_conversation_id_own'] = uuid
+    cp = os.environ.get('CHILD_PID', '0')
+    if cp.isdigit() and int(cp) > 0:
+        target['child_pid'] = int(cp)

 snap = d.setdefault('snapshot', {})
 snap['taken_at'] = now
@@ -76,6 +76,7 @@ if [ -z "$AGENT" ]; then
    *-creator-claude) AGENT=claude ;;
    *-creator-agy)    AGENT=agy ;;
    *-creator-hermes) AGENT=hermes ;;
+    *-creator-cline)  AGENT=cline ;;
    *) echo "ERROR: cannot infer agent from '$SESSION_NAME'; pass --agent" >&2; exit 2 ;;
  esac
 fi
@@ -184,6 +185,7 @@ graceful_stop() {
    claude) exitkey="/exit" ;;
    agy)    exitkey="Exit" ;;
    hermes) exitkey="/exit" ;;
+    cline)  exitkey="/exit" ;;
    *)      exitkey="/exit" ;;
  esac
  echo "graceful: send-keys '$exitkey' to $SESSION_NAME"
@@ -263,6 +265,8 @@ if captured and not purge:
        target['agy_conversation_id_own'] = captured
    elif agent == 'hermes':
        target['hermes_conversation_id_own'] = captured
+    elif agent == 'cline':
+        target['cline_conversation_id_own'] = captured
    target['resumable'] = True

 # --purge-conversation: delete only the on-disk artifacts of the workspace-scoped UUID (P0-C)
@@ -286,23 +290,29 @@ if purge and purge_uuid:
            print(f"purged: {brain}", flush=True)
        target['agy_conversation_id_own'] = None
    elif agent == 'hermes':
-        json_file = f"{home}/.mam/sessions/session_{purge_uuid}.json"
+        json_file = f"{home}/.hermes/sessions/session_{purge_uuid}.json"
        if os.path.exists(json_file):
            os.remove(json_file)
            print(f"purged: {json_file}", flush=True)
-        hdb = f"{home}/.mam/state.db"
+        hdb = f"{home}/.hermes/state.db"
        if os.path.exists(hdb):
            try:
                import sqlite3
-                conn = sqlite3.connect(hdb)
-                conn.execute("DELETE FROM sessions WHERE id=?", (purge_uuid,))
-                conn.execute("DELETE FROM messages WHERE session_id=?", (purge_uuid,))
-                conn.commit()
-                conn.close()
+                hconn = sqlite3.connect(hdb)
+                hconn.execute("DELETE FROM sessions WHERE id=?", (purge_uuid,))
+                hconn.execute("DELETE FROM messages WHERE session_id=?", (purge_uuid,))
+                hconn.commit()
+                hconn.close()
                print(f"purged db records for session: {purge_uuid}", flush=True)
            except Exception as e:
                print(f"WARN: purge hermes db records failed: {e}", flush=True)
        target['hermes_conversation_id_own'] = None
+    elif agent == 'cline':
+        sessions_dir = f"{home}/.cline/data/sessions/{purge_uuid}"
+        if os.path.isdir(sessions_dir):
+            shutil.rmtree(sessions_dir)
+            print(f"purged: {sessions_dir}", flush=True)
+        target['cline_conversation_id_own'] = None
    # agent_identities is a cache; clear it only when it belongs to this workspace
    ai = (d.get('agent_identities') or {}).get(agent) or {}
    if ai.get('project_cwd') == ws:
@@ -317,6 +327,8 @@ if purge and purge_uuid:
            ai['conversation_brain_dir'] = None
        elif agent == 'hermes' and ai.get('session_id') == purge_uuid:
            ai['session_id'] = None
+        elif agent == 'cline' and ai.get('session_id') == purge_uuid:
+            ai['session_id'] = None
 elif purge and not purge_uuid:
    print("WARN: --purge-conversation requested but no workspace-scoped UUID resolved; nothing purged", flush=True)

@@ -10,30 +10,50 @@

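 After cloning this project into a new environment, first learn where its core components live and what each one does.

-*   `.agents/skills/`: A collection of shell scripts that run the multi-agent sessions and process asynchronous jobs
-    *   `lib.sh`: Core orchestration shell functions and the library that auto-activates the virtual environment (venv)
-    *   `multi-agent-mux-create/`: Script that starts an isolated tmux agent session
-    *   `multi-agent-mux-stop/`: Script that gracefully stops a session and updates its state
-    *   `multi-agent-mux-resume/`: Script that restores a stopped agent session with its previous conversation state intact
-    *   `multi-agent-mux-status/`: Script that reports the current run state of every agent session
-    *   `multi-agent-mux-monitor/`: Monitor script that synchronizes tmux state with registry state
-    *   `multi-agent-mux-delegate-job/`: Module for splitting and running asynchronous jobs
-        *   `requirements.txt`: Python dependency list (paho-mqtt, pyyaml)
-        *   `scripts/`: Directory of Python scripts that drive the core business logic
-            *   `registry.py`: Job registration, claiming, and atomic file-lock control (CLI supported)
-            *   `job_subscriber.py`: Background event subscriber and audit-log writer
-            *   `publish_event.py`: Event publisher for run status and error traps
-            *   `mqtt_common.py`: Shared MQTT broker connection utilities
-*   `AGENT.md`: Defines the division of roles between agents (PM, Worker, Reviewer) and the event publishing contract
+*   `.agents/`: Directory for orchestration and custom agent skills
+    *   `AGENT.md`: Defines the division of roles between agents (PM, Worker, Reviewer) and the event publishing contract
+    *   `AGENT.ko.md`: Defines the division of roles between agents (PM, Worker, Reviewer) and the event publishing contract (Korean)
+    *   `skills/`: A collection of shell scripts that run the multi-agent sessions and process asynchronous jobs
+        *   `lib.sh`: Core orchestration shell functions and the library that auto-activates the virtual environment (venv)
+        *   `multi-agent-mux-create/`: Script that starts an isolated tmux agent session
+        *   `multi-agent-mux-stop/`: Script that gracefully stops a session and updates its state
+        *   `multi-agent-mux-resume/`: Script that restores a stopped agent session with its previous conversation state intact
+        *   `multi-agent-mux-status/`: Script that reports the current run state of every agent session
+        *   `multi-agent-mux-monitor/`: Monitor script that synchronizes tmux state with registry state
+        *   `multi-agent-mux-delegate-job/`: Module for splitting and running asynchronous jobs
+            *   `requirements.txt`: Python dependency list (paho-mqtt, pyyaml)
+            *   `scripts/`: Directory of Python scripts that drive the core business logic
+                *   `registry.py`: Job registration, claiming, and atomic file-lock control (CLI supported)
+                *   `job_subscriber.py`: Background event subscriber and audit-log writer
+                *   `publish_event.py`: Event publisher for run status and error traps
+                *   `mqtt_common.py`: Shared MQTT broker connection utilities
 *   `MESSAGING.md`: Guidelines for the wire scheme of MQTT messages exchanged between agents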

 ---

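-## 2. Create the environment configuration file (.env)
+## 2. Workspace provisioning (Step 0, for reference)
+
+The workspace must be installed before an agent reads this guide. A (human) developer has already completed this step with the installer. It is recorded here so that an agent verifying the environment knows where the files were copied and installed.
+
+### Step 0.1: Run the install script (already done)
+A human ran the following command to bring up the workspace through the installer: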
+
+```bash
+curl -fsSL https://git.godopu.com/tmpl/multi-agent-mux/raw/branch/main/deploy/install.sh | bash
+```
+
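+The installer validated the system dependencies, fetched the repository, copied the `.agents/` skeleton for orchestration and the related documentation files (including this `BOOTSTRAP.ko.md`), and set up `.env` and the `.venv` virtual environment automatically.
+
+> [!NOTE]
+> **Verification-only warning**: An AI agent reading this guide **must not re-run** the installer script itself. Proceed directly to the environment-variable configuration and verification test steps.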
+
+---
+
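+## 3. Create the environment configuration file (.env)

 Create and edit a local environment file (`.env`) to configure the messaging broker and the execution paths.

-### Step 2.1: Run the auto-generation script
+### Step 3.1: Run the auto-generation script
 From the project root, run the provided script that copies the environment template.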

 ```bash
@@ -44,7 +64,7 @@
 ./scripts/generate-env.sh --force
 ```

-### Step 2.2: Edit and configure the environment variables
+### Step 3.2: Edit and configure the environment variables
 Open the generated `.env` file and adjust the settings as needed.

 > [!NOTE]
@@ -60,15 +80,15 @@
 > [!WARNING]
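 > **Default security mode**:
 > The system defaults to **unauthenticated PoC mode**. If no `auth_token` is explicitly injected when a job is registered (or it is `null`), HMAC signature verification is skipped.
-> When using a public broker or running in production, generate a unique random `auth_token` and inject it at job registration to enable HMAC signing. (For the full security contract, see [MESSAGING.md](./MESSAGING.md) and the `2.3 보안 프로토콜` (Security Protocol) section of [AGENT.ko.md](./AGENT.ko.md). Automatic token generation/injection through the CLI is planned as roadmap item `FW-N6`.)
+> When using a public broker or running in production, generate a unique random `auth_token` and inject it at job registration to enable HMAC signing. (For the full security contract, see [MESSAGING.md](./MESSAGING.md) and the `2.3 보안 프로토콜` (Security Protocol) section of [AGENT.ko.md](.agents/AGENT.ko.md). Automatic token generation/injection through the CLI is planned as roadmap item `FW-N6`.)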

 ---

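-## 3. Dependency and Virtualenv Setup
+## 4. Dependency and Virtualenv Setup

 Set up the Python 3 dependencies needed to run orchestration and MQTT messaging.

-### Step 3.1: Build the Python Virtual Environment
+### Step 4.1: Build the Python Virtual Environment
 Create and activate a `.venv` virtual environment in the project root.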

 ```bash
@@ -79,7 +99,7 @@ python3 -m venv .venv
 source .venv/bin/activate
 ```

-### Step 3.2: Install Dependency Packages
+### Step 4.2: Install Dependency Packages
 Install the dependencies listed in `requirements.txt` under the `multi-agent-mux-delegate-job` directory into the virtual environment.

 ```bash
@@ -89,7 +109,7 @@ pip install -r .agents/skills/multi-agent-mux-delegate-job/requirements.txt

 ---

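-## 4. Directory Preparation and Security Audit Guide
+## 5. Directory Preparation and Security Audit Guide

 Confirm that the local registry directories used for agent control state and job records were created correctly.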

@@ -110,7 +130,7 @@ pip install -r .agents/skills/multi-agent-mux-delegate-job/requirements.txt

 ---

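-## 5. Execution Environment Verification and Bootstrap Tests
+## 6. Execution Environment Verification and Bootstrap Tests

 Run the checklist below to verify that the environment was built safely and without malfunction.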

@@ -158,8 +178,8 @@ rm -f ".mam/jobs/$JID.json" ".mam/jobs/$JID.lock"

 ---

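-## 6. New Agent Onboarding Guide
+## 7. New Agent Onboarding Guide

-Once the environment setup completes successfully, collaborating agents should immediately read **[AGENT.ko.md](./AGENT.ko.md)** in the project root.
+Once the environment setup completes successfully, collaborating agents should immediately read **[AGENT.ko.md](.agents/AGENT.ko.md)** in the `.agents/` directory.

 That document describes what agents must follow when running in each role (PM, Worker, Reviewer), including **surgical change rules, cross-verification pass conventions, and snapshot patterns that prevent Tmux viewport loss**, so that they can contribute to a stable multi-agent workflow right away.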
@@ -10,30 +10,50 @@ A new agent can follow the steps in this guide sequentially to establish a stabl

 Before cloning this project into a new environment, you must first understand the locations and roles of its core components:

-*   `.agents/skills/`: A collection of shell scripts that execute multi-agent coordination and asynchronous job processing.
-    *   `lib.sh`: The core orchestration shell functions and virtual environment (venv) auto-loading library.
-    *   `multi-agent-mux-create/`: Script to launch isolated tmux agent sessions.
-    *   `multi-agent-mux-stop/`: Script to gracefully stop agent sessions and update states.
-    *   `multi-agent-mux-resume/`: Script to restore stopped agent sessions back to their previous conversation state.
-    *   `multi-agent-mux-status/`: Script to query the current running state of all agent sessions.
-    *   `multi-agent-mux-monitor/`: Monitor script to sync tmux states with the registry.
-    *   `multi-agent-mux-delegate-job/`: Asynchronous job splitting and delegation module.
-        *   `requirements.txt`: Python dependency list (`paho-mqtt`, `pyyaml`).
-        *   `scripts/`: Python scripts running the core business logic.
-            *   `registry.py`: Job registration, claiming, and atomic file lock control (CLI supported).
-            *   `job_subscriber.py`: Background event subscriber and audit log generator.
-            *   `publish_event.py`: Event publisher for runtime states and error traps.
-            *   `mqtt_common.py`: Common utility for connecting to the MQTT broker.
-*   `AGENT.md`: Definition of agent roles (PM, Worker, Reviewer) and event publication rules.
+*   `.agents/`: Orchestration and custom agent skills root.
+    *   `AGENT.md`: Definition of agent roles (PM, Worker, Reviewer) and event publication rules.
+    *   `AGENT.ko.md`: Definition of agent roles (PM, Worker, Reviewer) and event publication rules (Korean).
+    *   `skills/`: A collection of shell scripts that execute multi-agent coordination and asynchronous job processing.
+        *   `lib.sh`: The core orchestration shell functions and virtual environment (venv) auto-loading library.
+        *   `multi-agent-mux-create/`: Script to launch isolated tmux agent sessions.
+        *   `multi-agent-mux-stop/`: Script to gracefully stop agent sessions and update states.
+        *   `multi-agent-mux-resume/`: Script to restore stopped agent sessions back to their previous conversation state.
+        *   `multi-agent-mux-status/`: Script to query the current running state of all agent sessions.
+        *   `multi-agent-mux-monitor/`: Monitor script to sync tmux states with the registry.
+        *   `multi-agent-mux-delegate-job/`: Asynchronous job splitting and delegation module.
+            *   `requirements.txt`: Python dependency list (`paho-mqtt`, `pyyaml`).
+            *   `scripts/`: Python scripts running the core business logic.
+                *   `registry.py`: Job registration, claiming, and atomic file lock control (CLI supported).
+                *   `job_subscriber.py`: Background event subscriber and audit log generator.
+                *   `publish_event.py`: Event publisher for runtime states and error traps.
+                *   `mqtt_common.py`: Common utility for connecting to the MQTT broker.
 *   `MESSAGING.md`: Messaging scheme and wire protocol guidelines for MQTT communication between agents.

 ---

-## 2. Environment Configuration (.env)
+## 2. Workspace Provisioning (Step 0 — Reference Only)
+
+Before any agent can read this guide, the workspace must be provisioned. This step has already been performed by the human developer using the installer. It is documented here strictly as a reference for verifying agents to understand how the assets arrived.
+
+### Step 0.1: Run the Installer One-Liner (Pre-Cloned/Provisioned)
+The human developer bootstrapped the workspace by running:
+
+```bash
+curl -fsSL https://git.godopu.com/tmpl/multi-agent-mux/raw/branch/main/deploy/install.sh | bash
+```
+
+The installer verified system dependencies, staged the repository, copied the runtime `.agents/` scaffolding and documentation files (including this `BOOTSTRAP.md`), and set up the default `.env` and `.venv` environments.
+
+> [!NOTE]
+> **Verify Only**: AI agents reading this guide must **not** attempt to run the installation script again. Proceed directly to configuration and verification steps.
+
+---
+
+## 3. Environment Configuration (.env)

 To set up the messaging broker and execution paths, you must create and modify a local environment configuration file (`.env`).

-### Step 2.1: Run the Generation Script
+### Step 3.1: Run the Generation Script
 Run the environment template copy script provided in the project root:

 ```bash
@@ -44,7 +64,7 @@ Run the environment template copy script provided in the project root:
 ./scripts/generate-env.sh --force
 ```

-### Step 2.2: Modify Environment Variables
+### Step 3.2: Modify Environment Variables
 Open the generated `.env` file to configure settings as needed.

 > [!NOTE]
@@ -60,15 +80,15 @@ Open the generated `.env` file to configure settings as needed.
 > [!WARNING]
 > **Security Mode Default Warning**:
 > The system's default setting is the **unauthenticated PoC mode**. If an `auth_token` is not explicitly provided (or is `null`) during job registration, HMAC signature verification is skipped.
-> In a public broker environment or production phase, you must generate and inject a unique random `auth_token` during job registration to enable HMAC signature security. (For detailed security protocols, refer to section `2.3 Security Protocol` in [MESSAGING.md](./MESSAGING.md) and [AGENT.md](./AGENT.md). Automated token generation and injection via CLI is on the roadmap under task `FW-N6`.)
+> In a public broker environment or production phase, you must generate and inject a unique random `auth_token` during job registration to enable HMAC signature security. (For detailed security protocols, refer to section `2.3 Security Protocol` in [MESSAGING.md](./MESSAGING.md) and [AGENT.md](.agents/AGENT.md). Automated token generation and injection via CLI is on the roadmap under task `FW-N6`.)
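The signing rule described in the warning above can be sketched as follows. This is a minimal illustration only, assuming HMAC-SHA256 over a canonical JSON encoding of the payload. The names `sign_event` and `verify_event` and the `sig` field are hypothetical and are not taken from `MESSAGING.md`, which remains the authoritative wire specification.

```python
import hashlib
import hmac
import json
import secrets
from typing import Optional


def sign_event(payload: dict, auth_token: Optional[str]) -> dict:
    """Return a copy of payload, adding a hex HMAC under 'sig' when a token is set."""
    event = dict(payload)
    if auth_token is None:
        # Unauthenticated PoC mode: no signature is attached.
        return event
    # Assumption: canonical form is compact JSON with sorted keys.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    event["sig"] = hmac.new(auth_token.encode(), body, hashlib.sha256).hexdigest()
    return event


def verify_event(event: dict, auth_token: Optional[str]) -> bool:
    """Accept everything in PoC mode; otherwise require a matching signature."""
    if auth_token is None:
        return True
    claimed = event.get("sig")
    if not isinstance(claimed, str):
        return False
    payload = {k: v for k, v in event.items() if k != "sig"}
    expected = sign_event(payload, auth_token)["sig"]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(claimed.encode(), expected.encode())


# A unique random token, generated once at job registration (hypothetical flow).
token = secrets.token_hex(32)
signed = sign_event({"job_id": "demo", "status": "started"}, token)
```

With a token set, a tampered or unsigned event is rejected. With `auth_token=None` every event passes, which is why the default mode is only suitable for a trusted broker.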

 ---

-## 3. Dependency and Virtualenv Setup
+## 4. Dependency and Virtualenv Setup

 Set up the Python 3 dependencies required to run the orchestration and MQTT messaging backplane.

-### Step 3.1: Build Python Virtual Environment
+### Step 4.1: Build Python Virtual Environment
 Create and activate a `.venv` virtual environment in the project root:

 ```bash
@@ -79,7 +99,7 @@ python3 -m venv .venv
 source .venv/bin/activate
 ```

-### Step 3.2: Install Dependency Packages
+### Step 4.2: Install Dependency Packages
 Install the required packages listed in `requirements.txt` under `multi-agent-mux-delegate-job`:

 ```bash
@@ -89,7 +109,7 @@ pip install -r .agents/skills/multi-agent-mux-delegate-job/requirements.txt

 ---

-## 4. Directory Structure and Security Audit Guide
+## 5. Directory Structure and Security Audit Guide

 Ensure that the local registry directories required to track agent states and jobs are successfully created:

@@ -110,7 +130,7 @@ Ensure that the local registry directories required to track agent states and jo

 ---

-## 5. Execution Verification and Bootstrap Tests
+## 6. Execution Verification and Bootstrap Tests

 To verify that the environment has been successfully built without runtime errors, run the following verification checklist.

@@ -159,8 +179,8 @@ rm -f ".mam/jobs/$JID.json" ".mam/jobs/$JID.lock"

 ---

-## 6. Onboarding Collaborating Agents (New Agent Onboarding)
+## 7. Onboarding Collaborating Agents (New Agent Onboarding)

-Once the setup is verified, onboarding agents should immediately read the **[AGENT.md](./AGENT.md)** guidelines in the project root.
+Once the setup is verified, onboarding agents should immediately read the **[AGENT.md](.agents/AGENT.md)** guidelines in the `.agents/` directory.

 The guidelines describe essential workflows—such as **surgical change constraints, cross-verification review loops, and pane snapshotting to prevent viewport truncation**—allowing new agents to quickly and safely integrate with the multi-agent workflow.
@@ -26,7 +26,7 @@
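 | **FW-W5** | Define structured message schema for reviewer verdicts | P2 (Medium) | Medium | **Workflow**: Instead of having the PM agent grep raw terminal scrollback strings, introduce a dedicated review feedback topic (e.g., `reviews/<job_id>/verdicts`) and a structured JSON format (`PASS`/`NOT_PASS` + blocking factors) | None |
 | **FW-W6** | Extend the monitor recovery loop to support the Hermes agent | P2 (Medium) | Medium | **Workflow / Consistency**: Fully integrate `hermes` sessions into the auto-registration (drift-B) and ID synchronization (drift-C) logic in `reconcile.sh` so they get the same level of monitoring and recovery as Claude/Agy sessions | None |
 | **FW-W7** | Resolve directory path slug name collisions in derive_session_name | P2 (Medium) | Small | **Workflow / Collision Avoidance**: Apply a session naming rule that includes a workspace-scoped hash value to resolve session name collisions between same-named nested directories, which occur when only the last two directories are slugified (e.g., `/projectA/src` and `/projectB/src` slugify to the same session name) | None |
-| ~~**FW-D1**~~ | ✅ **Resolved (2026-06-24)**: install script no longer extracts in-place | - | - | **Deploy / Safety**: `deploy/install.sh` now stages the download into a `mktemp -d` temporary directory, verifies that `.agents/skills/lib.sh` exists, then copies only the runtime assets (`.agents/`, `AGENT.md`, `.env.example`) to the target with per-file no-clobber guards (`[ ! -e ]`). Existing target files therefore always win, and repo development docs never enter the workspace. The post-fetch sanity check was also changed to test a file rather than a directory | Done |
+| ~~**FW-D1**~~ | ✅ **Resolved (2026-06-24)**: install script no longer extracts in-place | - | - | **Deploy / Safety**: `deploy/install.sh` now stages the download into a `mktemp -d` temporary directory, verifies that `.agents/skills/lib.sh` exists, then copies only the runtime assets (`.agents/`, `.env.example`) to the target with per-file no-clobber guards (`[ ! -e ]`). Existing target files therefore always win, and repo development docs never enter the workspace. The post-fetch sanity check was also changed to test a file rather than a directory | Done |
 | **FW-D2** | Pin and verify the source the install script downloads before sourcing it | P2 (Medium) | Small | **Deploy / Supply-chain**: The install script clones/extracts the moving `main` branch over the network, and the workspace later `source`s those shell scripts (`lib.sh` and others). *Partially resolved (2026-06-24): the staged tree is verified to contain `.agents/skills/lib.sh` before copying.* **Remaining work:** pin to a release tag or commit SHA and verify a published checksum, guaranteeing content integrity and not just structural presence | None |
 | **FW-D3** | Remove duplicated NFS detection logic between `install.sh` and `lib.sh` | P2 (Medium) | Small | **Deploy / Portability**: `deploy/install.sh` re-implements the GNU-only `df --output=target` + `mount` NFS check that already exists in `lib.sh::_check_is_nfs`. Extract a single shared helper so the FW-P1 portability fix also covers this second copy and both call sites behave correctly on macOS/BSD | FW-P1 |
 | **FW-D4** | Close the CI shellcheck coverage gap | P3 (Low) | Small | **Deploy / Quality**: `deploy/gitea-ci.yml` shellchecks only 5 scripts; `status.sh`, `resolve_session_id.sh`, `update_yaml_resumed.sh`, and `scripts/generate-env.sh` are not checked. Glob all tracked `*.sh` files so new scripts are included automatically | None |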
@@ -27,7 +27,7 @@ Below is the list of pending future work items. These items were proposed based
 | **FW-W5** | Define structured message schema for reviewer verdicts | P2 (Medium) | Medium | **Workflow**: Create a dedicated reviewer topic (e.g., `reviews/<job_id>/verdicts`) emitting structured JSON verdicts (`PASS` / `NOT_PASS` + details) to eliminate raw text grepping by the PM. | None |
 | **FW-W6** | Expand monitor reconciliation support to Hermes agent | P2 (Medium) | Medium | **Workflow / Consistency**: Fully integrate `hermes` sessions into auto-registration (drift-B) and ID materialization (drift-C) under `reconcile.sh` to match Claude/Agy monitoring coverage. | None |
 | **FW-W7** | Resolve path slug collisions in derive_session_name | P2 (Medium) | Small | **Workflow / Collision Avoidance**: Update `derive_session_name` to handle same-name nested directories (e.g. `/projectA/src` and `/projectB/src` both slugify to identical session names) by incorporating workspace-scoped identifiers or hash digests. | None |
-| ~~**FW-D1**~~ | ✅ **RESOLVED (2026-06-24)** — installer no longer extracts in-place | — | — | **Deploy / Safety**: `deploy/install.sh` now stages the download into a `mktemp -d` dir, verifies `.agents/skills/lib.sh` is present, then copies only the runtime assets (`.agents/`, `AGENT.md`, `.env.example`) into the target with per-file no-clobber guards (`[ ! -e ]`), so existing target files always win and repo dev docs never land in the workspace. The post-fetch sanity check now tests a file, not just the directory. | Done |
+| ~~**FW-D1**~~ | ✅ **RESOLVED (2026-06-24)** — installer no longer extracts in-place | — | — | **Deploy / Safety**: `deploy/install.sh` now stages the download into a `mktemp -d` dir, verifies `.agents/skills/lib.sh` is present, then copies only the runtime assets (`.agents/`, `.env.example`) into the target with per-file no-clobber guards (`[ ! -e ]`), so existing target files always win and repo dev docs never land in the workspace. The post-fetch sanity check now tests a file, not just the directory. | Done |
 | **FW-D2** | Pin and verify the source the installer downloads before sourcing it | P2 (Medium) | Small | **Deploy / Supply-chain**: The installer clones/extracts the moving `main` branch over the network, and the workspace later `source`s those shell scripts (`lib.sh` et al.). *Partially addressed (2026-06-24): the staged tree is now verified to contain `.agents/skills/lib.sh` before any file is copied.* **Remaining:** pin to a release tag or commit SHA and/or verify a published checksum so the fetched content is integrity-checked, not merely structurally present. | None |
 | **FW-D3** | De-duplicate NFS detection between `install.sh` and `lib.sh` | P2 (Medium) | Small | **Deploy / Portability**: `deploy/install.sh` re-implements the GNU-specific `df --output=target` + `mount` NFS check already present in `lib.sh::_check_is_nfs`. The FW-P1 portability fix must cover this second copy — extract a single shared helper so both call sites stay correct on macOS/BSD. | FW-P1 |
 | **FW-D4** | Close CI shellcheck coverage gaps | P3 (Low) | Small | **Deploy / Quality**: `deploy/gitea-ci.yml` shellchecks only 5 scripts; `status.sh`, `resolve_session_id.sh`, `update_yaml_resumed.sh`, and `scripts/generate-env.sh` are never linted. Glob all tracked `*.sh` so new scripts are covered automatically. | None |
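FW-W7 proposes adding a workspace-scoped hash to session names. The sketch below illustrates that idea, assuming a slug built from the last two path components plus a short SHA-256 prefix of the full path. The helper name `derive_session_name_hashed` and the 8-character digest length are illustrative choices, not the project's implementation (the real `derive_session_name` is a shell function).

```python
import hashlib
import re
from pathlib import PurePosixPath


def derive_session_name_hashed(workspace: str, digest_len: int = 8) -> str:
    """Slugify the last two path components and append a hash of the full path."""
    parts = PurePosixPath(workspace).parts[-2:]
    # Keep only characters that are safe in a tmux session name.
    slug = re.sub(r"[^a-z0-9]+", "-", "-".join(parts).lower()).strip("-")
    # The digest covers the whole path, so equal slugs in different trees differ.
    digest = hashlib.sha256(workspace.encode()).hexdigest()[:digest_len]
    return f"{slug}-{digest}"


name_a = derive_session_name_hashed("/home/a/app/src")
name_b = derive_session_name_hashed("/home/b/app/src")
```

Both names keep the readable `app-src` prefix, while the suffix separates the two workspaces. A longer digest lowers the already small chance of a suffix collision at the cost of longer session names.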
@@ -64,4 +64,4 @@ Strong success criteria let you loop independently. Weak criteria ("make it work

 **These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.

-Read AGENT.md first before working and follow the instructions for orchestration. 
+Read `.agents/AGENT.md` before starting work and follow its orchestration instructions.
@@ -138,6 +138,8 @@ sequenceDiagram
 ```text
 .
 ├── .agents/
+│   ├── AGENT.md                 # Agent role code of conduct and viewport snapshot rules
+│   ├── AGENT.ko.md              # Agent role code of conduct (Korean copy)
 │   └── skills/                  # Core orchestration shell scripts and libraries
 │       ├── lib.sh               # Shared orchestration shell function library
 │       ├── multi-agent-mux-create/
@@ -154,8 +156,6 @@ sequenceDiagram
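 │   └── jobs/                    # Asynchronous job metadata JSON files
 ├── scripts/
 │   └── generate-env.sh          # Environment file (.env) template copy script
-├── AGENT.ko.md                  # Agent role code of conduct (Korean copy)
-├── AGENT.md                     # Agent role code of conduct and viewport snapshot rules
 ├── BOOTSTRAP.ko.md              # Initial project installation guide (Korean copy)
 ├── BOOTSTRAP.md                 # Detailed initial installation and verification guide
 ├── MESSAGING.md                 # MQTT messaging protocol wire specification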
@@ -188,6 +188,6 @@ sequenceDiagram
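 ## 📝 Guidelines for Collaborating Agents

 Agents newly joining this project must follow these rules:
-1.  Read **[AGENT.md](./AGENT.md)** carefully to understand the roles of the project manager (PM), Worker, and Reviewer and the development constraints.
+1.  Read **[AGENT.md](.agents/AGENT.md)** carefully to understand the roles of the project manager (PM), Worker, and Reviewer and the development constraints.
 2.  When running long commands, always apply the **Pane Snapshotting Rules** described in `AGENT.md` (Chapter 4) to prevent loss of terminal scrollback logs.
 3.  No change to a core file may be merged into the production branch without approval before its diff has been submitted to a reviewer session for verification.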
@@ -14,6 +14,24 @@ Modern agentic workflows often suffer from session timeout, lack of process isol
 3. **Multi-Agent Mux (MAM):** Combining local file-based locks (fcntl) and an ACID-compliant SQLite WAL database (`.mam/agent-sessions.db`) to manage concurrent job claims and track running agent sessions without drift.
 4. **Automated Review & Quality Loop:** Implementing parallel reviewer loops where worker agents must receive a `PASS` rating from various specialized verification agents (e.g., Claude for high-level logic, Hermes for shell syntax/safety) before merging code.

+
+---
+
+## 📦 Installation & Setup
+
+You can bootstrap the Multi-Agent Mux (MAM) framework in any workspace directory with a single command:
+
+```bash
+curl -fsSL https://git.godopu.com/tmpl/multi-agent-mux/raw/branch/main/deploy/install.sh | bash
+```
+
+Alternatively, if you have already cloned the repository locally, run the installer directly:
+```bash
+bash deploy/install.sh
+```
+
+The idempotent installer automatically validates system dependencies (tmux, python3, and PyYAML), creates the Python virtual environment (`.venv`), installs dependencies, copies `.env.example` to `.env`, and initializes the `.agents/` scaffolding.
+
 ---

 ## 🛠️ Core Skills & Scaffolding
@@ -138,6 +156,8 @@ To ensure communication integrity across public MQTT brokers, the backplane inte
 ```text
 .
 ├── .agents/
+│   ├── AGENT.md                 # Agent roles, snapshotting rules, and execution charter
+│   ├── AGENT.ko.md              # Agent roles, snapshotting rules, and execution charter (Korean)
 │   └── skills/                  # Core orchestration shell wrappers & libraries
 │       ├── lib.sh               # Shared orchestration library
 │       ├── multi-agent-mux-create/
@@ -154,7 +174,6 @@ To ensure communication integrity across public MQTT brokers, the backplane inte
 │   └── jobs/                    # Asynchronous job metadata files
 ├── scripts/
 │   └── generate-env.sh          # Environment bootstrap helper
-├── AGENT.md                     # Agent roles, snapshottings, and execution charter
 ├── BOOTSTRAP.md                 # Detailed installation and verification guide
 ├── MESSAGING.md                 # MQTT wire protocol specification
 └── README.md                    # Project introduction and overview (this file)
@@ -186,6 +205,6 @@ For detailed setup instructions, please consult the **[BOOTSTRAP.md](./BOOTSTRAP
 ## 📝 Guidelines for Collaborating Agents

 If you are an AI agent newly onboarded to this project:
-1.  Read **[AGENT.md](./AGENT.md)** to align on development constraints and roles (PM, Worker, Reviewer).
+1.  Read **[AGENT.md](.agents/AGENT.md)** to align on development constraints and roles (PM, Worker, Reviewer).
 2.  Adhere to the **Pane Snapshotting Rules** in `AGENT.md` (Section 4) to prevent scrollback data loss during long execution steps.
 3.  Never modify core logic without submitting a diff to the reviewer sessions for evaluation.
@@ -0,0 +1,122 @@
+# Multi-Agent Mux: Skill Features and Architecture
+
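+This document is a consolidated specification of the core features, state machine, CLI specifications, and interoperation of the six individual skills and the common library implemented in the `multi-agent-mux` workspace. It serves as the baseline reference for skill optimization and refactoring work.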
+
+---
+
+## 1. Architecture Overview
+
+`multi-agent-mux` is a system that manages multiple autonomous agents (Claude, Agy, Cline, Hermes, etc.) in isolated Tmux sessions and lets them communicate with one another.
+*   **Central state registry**: `.mam/agent-sessions.yaml` and the synchronized `.mam/agent-sessions.db` (SQLite3)
+*   **Isolated socket**: Can run on a dedicated tmux server socket (e.g., the `multi-agent-mux` server)
+*   **Event bus**: Asynchronous, real-time observation of job status over MQTT (`multi-agent-mux-delegate-job`)
+
+---
+
+## 2. Common Library: `lib.sh`
+
+The core shared helper library loaded by every skill script.
+
+*   **Atomic state file dump (`atomic_dump_yaml`)**:
+    *   Falls back to SQLite `PRAGMA journal_mode=DELETE` when NFS (network file system) is detected; sets `PRAGMA journal_mode=WAL` in local environments.
+    *   Opens the transaction with `BEGIN IMMEDIATE`, which takes the write lock up front, to prevent read-modify-write data loss (lost-update race condition) in multi-process environments.
+    *   After the transaction commits, creates a `.bak` backup file, then writes a temporary file and atomically swaps it in with `os.replace`.
+*   **Agent session existence checks (`*_exists` function family)**:
+    *   `claude`: existence of `<uuid>.jsonl` under the project directory
+    *   `agy`: existence of `.gemini/antigravity-cli/conversations/<uuid>.db`
+    *   `hermes`: presence in the `sessions` table of `~/.hermes/state.db` (verified by SQLite query)
+    *   `cline`: existence of `.cline/data/sessions/<uuid>/<uuid>.json`
+*   **Session ID resolution engine (`find_workspace_uuid` tiers)**:
+    *   **Tier 1 (direct YAML lookup)**: Looks up the agent-specific field recorded in the YAML (e.g., `claude_session_id_own`).
+    *   **Tier 2 (on-disk artifact scan)**: Sorts the on-disk session logs that match the workspace directory (`cwd` / `workspace_root`) by modification time (`mtime`) and returns the newest UUID.
+    *   **Tier 3 (identity cache)**: Consults the `agent_identities` cache data at the top of the registry.
+
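The two safeguards attributed to `atomic_dump_yaml` can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the code in `lib.sh`: the function names, the `meta` table, and the backup-before-swap ordering are all hypothetical. `BEGIN IMMEDIATE` and `os.replace` themselves are standard SQLite and Python features.

```python
import os
import shutil
import sqlite3
import tempfile


def bump_revision(db_path: str) -> int:
    """Read-modify-write a counter under BEGIN IMMEDIATE so no update is lost."""
    # isolation_level=None puts the connection in autocommit mode,
    # so the explicit BEGIN/COMMIT below are the only transaction control.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS meta (k TEXT PRIMARY KEY, v INTEGER)")
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        row = conn.execute("SELECT v FROM meta WHERE k = 'revision'").fetchone()
        revision = (row[0] if row else 0) + 1
        conn.execute("INSERT OR REPLACE INTO meta (k, v) VALUES ('revision', ?)", (revision,))
        conn.execute("COMMIT")
        return revision
    except BaseException:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()


def atomic_write_text(path: str, text: str) -> None:
    """Swap in new contents so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")  # keep the previous version
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Taking the write lock before the `SELECT` is what closes the lost-update window: a second writer blocks (or fails with a busy error) instead of reading a value that is about to be overwritten.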
+---
+
+## 3. Skill Specifications
+
+### 3.1. `multi-agent-mux-create` (Create Skill)
+*   **Purpose**: Creates an isolated Tmux container for a new agent and registers it in the registry.
+*   **Core features**:
+    *   **Preflight check**:
+        *   `claude`: verifies login status (`"loggedIn": true`) via `claude auth status`
+        *   `agy`: verifies healthy API connectivity via `agy models`
+        *   `hermes`: verifies connectivity via `hermes status`
+        *   `cline`: pre-checks that `cline history --json` runs and that configuration is in place
+    *   **Tmux session creation and initialization**: Creates the session in the background with an agent-optimized screen size (`-x 140 -y 40`) and working directory (`-c`).
+    *   **Initial state YAML registration**: Records the user-required role (`--role`), `status: running`, `pane` details (index, PID, CWD, CMD_FULL), the start command, and `mcp_attachments`.
+    *   **Role immutability**: The role assigned at agent creation cannot be modified afterward; any attempt to change it is rejected with an exception at the data validation stage (`atomic_dump_yaml`).
+
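The role rules above (a role is required at creation and cannot change later) amount to a comparison between the stored registry and the proposed one. The sketch below shows one way such a check could look. The names `validate_roles` and `RoleImmutableError` and the dict layout are assumptions for illustration, not the validation code inside `atomic_dump_yaml`.

```python
class RoleImmutableError(ValueError):
    """Raised when an update tries to change an existing agent's role."""


def validate_roles(previous: dict, updated: dict) -> None:
    """Reject entries without a role and entries whose role differs from before."""
    for name, entry in updated.items():
        role = entry.get("role")
        if not role:
            raise ValueError(f"agent {name!r} has no role; a role is required at creation")
        old_entry = previous.get(name)
        if old_entry is not None and old_entry.get("role") != role:
            raise RoleImmutableError(
                f"agent {name!r} role is immutable: {old_entry.get('role')!r} -> {role!r}"
            )


# A new agent with a role passes; other fields may change freely afterwards.
before = {"dev-1": {"role": "developer", "status": "running"}}
validate_roles({}, before)
validate_roles(before, {"dev-1": {"role": "developer", "status": "stopped"}})
```

Running the check inside the write path, rather than in each caller, means every skill that dumps the registry gets the same guarantee.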
+### 3.2. `multi-agent-mux-resume` (Resume Skill)
+*   **Purpose**: Restores the Tmux session and TUI connection of a stopped or lost agent with its previous context intact.
+*   **Core features**:
+    *   **Session ID resolution delegation**: Runs `lib.sh::find_workspace_uuid` to determine the target workspace's UUID.
+    *   **Session restore launch**:
+        *   `claude`: `claude --dangerously-skip-permissions -r <UUID>`
+        *   `agy`: `agy --dangerously-skip-permissions --conversation <UUID>`
+        *   `hermes`: `hermes --resume <UUID>`
+        *   `cline`: `cline -i --id <UUID>`
+    *   **TUI bypass automation (Claude)**: Immediately after launch, injects `Enter` ➔ `Down` ➔ `Enter` keystrokes in the background to automatically accept the permission-bypass and restore-confirmation dialogs.
+    *   **Synchronization**: Runs `update_yaml_resumed.sh` to transition the status to `running`, refresh the child PID as of launch time, and remove the previous termination metadata.
+
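The tiered lookup that the resume skill delegates to `find_workspace_uuid` can be sketched as a pure function. This is an illustration under assumptions: the inputs are passed in as plain data instead of being read from YAML and disk, the function name `resolve_session_uuid` is hypothetical, and the `<agent>_session_id_own` field pattern is generalized from the single `claude_session_id_own` example in this document.

```python
from typing import Mapping, Optional, Sequence, Tuple


def resolve_session_uuid(
    agent: str,
    registry_entry: Mapping[str, str],
    disk_sessions: Sequence[Tuple[str, float]],
    identity_cache: Mapping[str, str],
) -> Optional[str]:
    """Return the first UUID found by walking Tier 1, Tier 2, then Tier 3."""
    # Tier 1: agent-specific field recorded in the registry entry.
    own = registry_entry.get(f"{agent}_session_id_own")
    if own:
        return own
    # Tier 2: newest on-disk session for this workspace, as (uuid, mtime) pairs.
    if disk_sessions:
        return max(disk_sessions, key=lambda item: item[1])[0]
    # Tier 3: identity cache; None when nothing is known.
    return identity_cache.get(agent)
```

Keeping the tiers in a fixed order makes the result predictable: an explicitly recorded ID always beats a guess based on file modification times.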
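+### 3.3. `multi-agent-mux-stop` (Stop Skill)
+*   **Purpose**: Cleans up a session safely, then saves and synchronizes its status and UUID.
+*   **Core features**:
+    *   **Pre-termination TUI snapshot**: Runs `tmux capture-pane` and preserves the final screen state in the `last_visible_status_at_termination` field.
+    *   **Multi-stage graceful shutdown protocol**:
+        1. Inject the TUI safe-exit keystrokes (`/exit` or `Exit`), then wait 3 seconds.
+        2. If the session is still alive, send `tmux kill-session` and wait 5 seconds.
+        3. As a last resort, send `kill -9` to the detected child PID.
+    *   **Disk purge (--purge-conversation)**:
+        *   Sets `resumable` to `false` and records the status as `terminated`.
+        *   Accesses each agent's data path and destroys the corresponding session files.
+            *   `claude`: deletes `<proj-key>/<uuid>.jsonl`
+            *   `agy`: deletes `conversations/<uuid>.db` and the `brain/<uuid>` folder
+            *   `hermes`: deletes `sessions/session_<uuid>.json` and the history in `state.db` (uses its own internal connection `hconn` to avoid clashing with the parent YAML DB connection)
+            *   `cline`: removes the `~/.cline/data/sessions/<uuid>` folder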
+
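The multi-stage shutdown described above is an escalation ladder: try the gentlest action, wait, and only move on if the session is still alive. The sketch below captures that control flow with injected callables so it can run without tmux. The function name `stop_with_escalation` and the step labels are hypothetical; in the real skill the actions would wrap `tmux send-keys`, `tmux kill-session`, and `kill -9`.

```python
import time
from typing import Callable, Sequence, Tuple

Step = Tuple[str, Callable[[], None], float]  # (label, action, seconds to wait after)


def stop_with_escalation(
    steps: Sequence[Step],
    is_alive: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run steps in order and return the label of the step that ended the session."""
    for label, action, wait_seconds in steps:
        action()
        sleep(wait_seconds)
        if not is_alive():
            return label
    return "still-alive"


# Demo with fake actions: the session ignores the exit keys but dies on kill-session.
log = []
session = {"alive": True}


def fake_kill_session() -> None:
    log.append("kill-session")
    session["alive"] = False


demo_steps = [
    ("exit-keys", lambda: log.append("exit-keys"), 3.0),
    ("kill-session", fake_kill_session, 5.0),
    ("kill-9", lambda: log.append("kill-9"), 0.0),
]
waits = []
result = stop_with_escalation(demo_steps, lambda: session["alive"], sleep=waits.append)
```

Injecting `sleep` and `is_alive` keeps the waits (3 and 5 seconds in the protocol above) out of tests while leaving the ordering logic identical.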
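+### 3.4. `multi-agent-mux-delegate-job` (Delegate Skill)
+*   **Purpose**: Delegates work asynchronously to another agent and observes execution status through MQTT events.
+*   **Core features**:
+    *   **Delegation types**:
+        *   `direct` (default): Launches a single target session, hands over the job, and waits.
+        *   `loop` (collaboration loop): After the implementer (Worker) finishes, the Reviewer inspects the code, and revisions are requested automatically until a `"PASS"` verdict is issued.
+        *   `discuss` (discussion/consensus): Drives a joint discussion between two agents to reach agreement on the final design and plan.
+    *   **MQTT event specification**: Pairs `publish_event.py` with `job_subscriber.py` to track the `started` ➔ `permission_required` ➔ `progress` ➔ `completed`/`error` state transitions and to run an automatic dual timeout check (3600-second total execution budget + 120-second idle timeout).
+    *   **Audit logging**: Persists `meta.json`, `status.json`, and the raw NDJSON `events.ndjson` under `.mam/delegate_job_logs/<job_id>/`.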
+
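The dual timeout mentioned above combines a hard budget for the whole job with a shorter limit on silence between events. A minimal sketch of that check follows, using the 3600-second and 120-second values from this document. The function name `timeout_reason` and the returned labels are hypothetical and are not taken from `job_subscriber.py`.

```python
from typing import Optional

TOTAL_BUDGET_SECONDS = 3600.0  # overall execution budget
IDLE_TIMEOUT_SECONDS = 120.0   # maximum silence between events


def timeout_reason(
    started_at: float,
    last_event_at: float,
    now: float,
    total_budget: float = TOTAL_BUDGET_SECONDS,
    idle_timeout: float = IDLE_TIMEOUT_SECONDS,
) -> Optional[str]:
    """Return why the job timed out, or None while it is still within limits."""
    if now - started_at >= total_budget:
        return "total_budget_exceeded"
    if now - last_event_at >= idle_timeout:
        return "idle_timeout"
    return None
```

A subscriber loop would call this on every tick and update `last_event_at` whenever a `progress` or similar event arrives, so a chatty job is bounded only by the total budget while a silent one is cut off early.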
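+### 3.5. `multi-agent-mux-status` (Status Skill)
+*   **Purpose**: Reads the registry and immediately displays the running sessions of all active agents.
+*   **Core features**:
+    *   **Read-only safety**: Performs pure queries without modifying the DB or triggering state transitions.
+    *   Verifies that live tmux process state and the names in the YAML map consistently, and prints a summary to the console.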
+
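+### 3.6. `multi-agent-mux-monitor` (Reconciliation Skill)
+*   **Purpose**: Detects mismatches between the operating system's Tmux runtime and the YAML registry in a background loop and reconciles them automatically.
+*   **Core features**:
+    *   **Drift detection and recovery playbook**:
+        *   **Drift A (crashed/dead session)**: Detects a session that is `running` in the YAML but whose tmux process has died ➔ downgrades the status to `terminated`.
+        *   **Drift B (new session detected)**: Automatically registers any `*-creator-*` session that is running in tmux but absent from the YAML, and refreshes its child PID information.
+        *   **Drift C (live UUID update)**: When a newly started agent receives its first command and creates a session ID, finds the most recent UUID among the on-disk session logs whose modification time matches and injects it into the `*_conversation_id_own` field.
+        *   **Drift D (cache consistency check)**: Checks whether the session UUIDs in the registry and cache actually exist on disk and reports purged sessions.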
+
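Drift A and Drift B are both set differences between what the registry claims and what tmux reports. The sketch below shows that comparison on plain data, assuming the registry is reduced to a name-to-status mapping and the live session names come from something like `tmux list-sessions`. The function name `classify_drift` is hypothetical and does not describe `reconcile.sh` itself.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List


def classify_drift(registry: Dict[str, str], live_sessions: Iterable[str]) -> Dict[str, List[str]]:
    """Return session names affected by Drift A (dead) and Drift B (unregistered)."""
    live = set(live_sessions)
    # Drift A: registry says running, but tmux no longer has the session.
    drift_a = sorted(
        name for name, status in registry.items() if status == "running" and name not in live
    )
    # Drift B: tmux has a *-creator-* session that the registry does not know.
    drift_b = sorted(
        name for name in live if name not in registry and fnmatchcase(name, "*-creator-*")
    )
    return {"A": drift_a, "B": drift_b}


registry_view = {"proj-creator-claude": "running", "proj-creator-agy": "stopped"}
tmux_view = ["proj-creator-hermes", "scratch"]
drift = classify_drift(registry_view, tmux_view)
```

Here `proj-creator-claude` would be downgraded to `terminated`, `proj-creator-hermes` would be auto-registered, and the unrelated `scratch` session is ignored because it does not match the naming pattern.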
+---
+
+## 4. Agent State Machine
+
+Across the system, agent sessions transition according to the flow below.
+
+```mermaid
+stateDiagram-v2
+    [*] --> running : multi-agent-mux-create / Drift B
+    running --> stopped : multi-agent-mux-stop (default)
+    running --> terminated : multi-agent-mux-stop (--purge-conversation) / Drift A
+    stopped --> running : multi-agent-mux-resume
+    terminated --> [*]
+```
+
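The diagram above can also be expressed as a transition table, which makes illegal moves (such as resuming a terminated session) easy to reject. This is a minimal sketch; the event labels and the `next_state` helper are hypothetical names chosen to mirror the diagram, not identifiers from the skills.

```python
TRANSITIONS = {
    ("running", "stop"): "stopped",
    ("running", "stop --purge-conversation"): "terminated",
    ("running", "drift-a"): "terminated",
    ("stopped", "resume"): "running",
}


def next_state(state: str, event: str) -> str:
    """Return the state after event, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state!r} on {event!r}") from None
```

Because `terminated` has no outgoing entries, any attempt to resume a purged session fails early instead of launching an agent with no conversation to restore.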
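+## 5. Cautions for Optimization and Refactoring Work
+
+1.  **Do not defeat atomic writes**: `atomic_dump_yaml` in `lib.sh` is the backbone that prevents data corruption when multiple agents start in parallel. Do not undermine its DB locking and transaction flow.
+2.  **Preserve the Cline and Claude TUI input bindings**: When resuming or stopping a session, the fine differences between the prompt control commands each agent uses internally (e.g., `/exit`, `--id <session>`) must be preserved for everything to work without exceptions.
+3.  **Beware of database variable collisions**: When running subshells or inline Python scripts, never pollute the namespace of the global SQLite connection (`conn`). (For example, to prevent a recurrence of the `stop_session.sh` bug.)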
@@ -128,7 +128,7 @@ if ! check_assets_present "."; then

  # Copy non-dev documents if they don't already exist.
  # We skip dev-specific docs like README.md, DONE.md, and FUTURE_WORKS.md.
-  for doc in AGENT.md AGENT.ko.md MESSAGING.md BOOTSTRAP.md BOOTSTRAP.ko.md INSTRUCTION.md; do
+  for doc in MESSAGING.md BOOTSTRAP.md BOOTSTRAP.ko.md INSTRUCTION.md; do
    if [ -f "$STAGE_DIR/$doc" ] && [ ! -e "$doc" ]; then
      cp "$STAGE_DIR/$doc" . || { echo "❌ Error: Failed to copy $doc" >&2; exit 1; }
      echo "$doc" >> "$MANIFEST_FILE"
@@ -26,6 +26,12 @@ done

 if [ -z "$TARGET_DIR" ]; then
  TARGET_DIR="$(pwd)"
+else
+  if [ ! -d "$TARGET_DIR" ]; then
+    echo "❌ Error: Target directory '$TARGET_DIR' does not exist." >&2
+    exit 1
+  fi
+  TARGET_DIR="$(cd "$TARGET_DIR" && pwd)"
 fi

 echo "===================================================================="
@@ -33,11 +39,6 @@ echo "⚡ Starting Multi-Agent Mux (MAM) Update"
 echo "📂 Target Workspace: $TARGET_DIR"
 echo "===================================================================="

-if [ ! -d "$TARGET_DIR" ]; then
-  echo "❌ Error: Target directory '$TARGET_DIR' does not exist." >&2
-  exit 1
-fi
-
 cd "$TARGET_DIR"

 # 1. Verification of existing install
@@ -76,12 +77,60 @@ fi
 HAS_MAM=0
 if [ -d ".mam" ]; then
  HAS_MAM=1
-  # Move .mam out of the way of remove.sh
-  mv ".mam" ".mam.update-tmp"
+  # Copy database and jobs to temporary backup outside of .mam.
+  # We do NOT move the .mam folder away so that remove.sh can still read .mam/install_manifest.txt!
+  mkdir -p .mam.update-tmp
+  # Copy SQLite databases and session files
+  for db in .mam/agent-sessions.*; do
+    if [ -f "$db" ]; then
+      cp -f "$db" .mam.update-tmp/
+    fi
+  done
+  # Copy jobs history
+  if [ -d ".mam/jobs" ] && [ "$(ls -A .mam/jobs 2>/dev/null)" ]; then
+    mkdir -p .mam.update-tmp/jobs
+    cp -rf .mam/jobs/* .mam.update-tmp/jobs/
+  fi
+  # Copy delegate logs
+  if [ -d ".mam/delegate_job_logs" ] && [ "$(ls -A .mam/delegate_job_logs 2>/dev/null)" ]; then
+    mkdir -p .mam.update-tmp/delegate_job_logs
+    cp -rf .mam/delegate_job_logs/* .mam.update-tmp/delegate_job_logs/
+  fi
+  # Copy manifest so we have a backup
+  if [ -f ".mam/install_manifest.txt" ]; then
+    cp -f .mam/install_manifest.txt .mam.update-tmp/
+  fi
 fi

+# Define trap to restore backup files on failure
+restore_on_failure() {
+  echo "❌ Update failed. Reverting configuration and database to previous state..." >&2
+  if [ $HAS_ENV -eq 1 ] && [ -f ".env.update-tmp" ]; then
+    mv -f ".env.update-tmp" ".env" 2>/dev/null || true
+  fi
+  if [ $HAS_MAM -eq 1 ] && [ -d ".mam.update-tmp" ]; then
+    # Revert to old database/jobs backup by restoring .mam directory
+    rm -rf .mam 2>/dev/null || true
+    mkdir -p .mam
+    cp -f .mam.update-tmp/agent-sessions.* .mam/ 2>/dev/null || true
+    if [ -d ".mam.update-tmp/jobs" ]; then
+      cp -rf .mam.update-tmp/jobs .mam/ 2>/dev/null || true
+    fi
+    if [ -d ".mam.update-tmp/delegate_job_logs" ]; then
+      cp -rf .mam.update-tmp/delegate_job_logs .mam/ 2>/dev/null || true
+    fi
+    if [ -f ".mam.update-tmp/install_manifest.txt" ]; then
+      cp -f .mam.update-tmp/install_manifest.txt .mam/ 2>/dev/null || true
+    fi
+    rm -rf .mam.update-tmp 2>/dev/null || true
+  fi
+}
+trap restore_on_failure EXIT
+
 # 3. Perform uninstallation of existing files
 echo "🗑️  Removing existing installation..."
+# remove.sh will run in manifest mode because .mam/install_manifest.txt is still present.
+# It will delete .agents/, documents, scripts, .venv, and .mam folder.
 bash remove.sh --force

 # 4. Fetch and run the latest installer from Gitea
@@ -93,13 +142,12 @@ elif command -v wget &>/dev/null; then
  wget -qO- "$INSTALLER_URL" | bash -s -- "$TARGET_DIR"
 else
  echo "❌ Error: Neither 'curl' nor 'wget' is available to fetch the installer." >&2
-  
-  # Restore backups before failing
-  if [ $HAS_ENV -eq 1 ]; then mv ".env.update-tmp" ".env"; fi
-  if [ $HAS_MAM -eq 1 ]; then mv ".mam.update-tmp" ".mam"; fi
  exit 1
 fi

+# Disable failure trap since installation succeeded
+trap - EXIT
+
 # 5. Restore backups of configuration and database
 echo "🔄 Restoring configuration and database..."
 if [ $HAS_ENV -eq 1 ]; then
@@ -108,26 +156,22 @@ if [ $HAS_ENV -eq 1 ]; then
 fi

 if [ $HAS_MAM -eq 1 ]; then
-  # The installer created a new .mam directory with a new manifest.
-  # We want to merge the old .mam database/jobs back while keeping the new manifest.
  if [ -d ".mam.update-tmp" ]; then
-    # Copy SQLite databases
-    for db in .mam.update-tmp/db.sqlite*; do
+    # The installer created a new .mam directory with a new manifest.
+    # We want to merge the old .mam database/jobs back while keeping the new manifest.
+    for db in .mam.update-tmp/agent-sessions.*; do
      if [ -f "$db" ]; then
        cp -f "$db" .mam/
      fi
    done
-    # Copy jobs history
    if [ -d ".mam.update-tmp/jobs" ] && [ "$(ls -A .mam.update-tmp/jobs 2>/dev/null)" ]; then
      mkdir -p .mam/jobs
      cp -rf .mam.update-tmp/jobs/* .mam/jobs/
    fi
-    # Copy delegate logs
    if [ -d ".mam.update-tmp/delegate_job_logs" ] && [ "$(ls -A .mam.update-tmp/delegate_job_logs 2>/dev/null)" ]; then
      mkdir -p .mam/delegate_job_logs
      cp -rf .mam.update-tmp/delegate_job_logs/* .mam/delegate_job_logs/
    fi
-    # Clean up the backup directory
    rm -rf ".mam.update-tmp"
  fi
 fi
Author	SHA1	Message	Date
Godopu	6e3c866461	docs: clean up stale create_session usage instructions in comments and markdown examples	2026-06-28 10:31:58 +09:00
Godopu	7c8267240d	feat: enforce required agent roles at creation and role immutability in registry	2026-06-28 10:27:36 +09:00
Godopu	f457180777	refactor: adapt multi-agent-mux skills and agent guidelines for the Team Leader scenario	2026-06-28 10:21:24 +09:00
Godopu	81474ac3f7	docs: add Step 0 provisioning to BOOTSTRAP.md and update README.md with curl installer	2026-06-28 09:34:52 +09:00
Godopu	dd9500a271	feat(multi-agent-mux): integrate cline agent support, fix sqlite3 naming collision, simplify delegation docs, and add SKILL_FEATURES.md	2026-06-28 09:17:11 +09:00
Godopu	dfd0a9483d	feat: implement loop and discuss task delegation types in multi-agent-mux-delegate-job	2026-06-27 08:28:47 +09:00
Godopu	3b8db1eca2	fix(skills): add 0.5s sleep delay after paste-buffer to prevent key collisions	2026-06-26 23:00:30 +09:00
Godopu	698ea09b27	docs: update AGENT.md references to .agents/AGENT.md	2026-06-26 21:33:26 +09:00
Godopu	57d8f6c2ff	refactor: move AGENT.md and AGENT.ko.md to .agents/ directory	2026-06-26 21:28:41 +09:00
Godopu	e14ee90243	fix(skills): point HOME_DIR to real home directory and fix Hermes database path	2026-06-26 21:17:43 +09:00
Godopu	b47fcbda9b	fix(deploy): resolve TARGET_DIR to absolute path in update.sh	2026-06-24 12:29:00 +09:00
Godopu	5da6e59d2f	fix(deploy): fix update.sh fallback mode, trap rollback, and database names	2026-06-24 12:28:17 +09:00