# AGENT.md

This document defines the common guidelines and protocol for introducing the **MQTT messaging backplane and tmux-based multi-agent orchestration workflow** to a new project. It sets out the rules and architecture that let collaborating agents perform tasks safely, robustly, and consistently.

All agents working on a new project must read this document thoroughly and comply with its protocols before starting any task.

---

## 1. Agent Roles Definition (Agent Roles)

Responsibilities and permissions are clearly separated between roles to reduce bottlenecks and improve the quality of execution.


### 👑 General Manager (Orchestrator)
- **Core Responsibility**: Interact directly with the user, receive high-level requirements, establish task plans, delegate tasks to Team Leaders, control the overall workflow, and report completion back to the user.
- **Ambiguity Resolution**: If a user's requirements contain ambiguous details, do not guess. Immediately ask the user for clarification (we recommend using the `/grill-me` slash command).

### 👥 Team Leaders
Newly spawned agents (e.g., `antigravity`, `claude`, `cline`, `hermes`) act as **Team Leaders** of their respective groups. They receive delegated tasks from the General Manager and manage implementation or review workflows.
- **Developer Team Leader**:
  - Receives tasks from the General Manager.
  - **Task Breakdown & Planning**: Thoroughly analyzes the task, breaks it down into small units, and creates a plan.
  - **Internal Parallelism**: Can run subagents in parallel internally to handle the delegated work.
  - **Review Integrity & Refusal**: Thoroughly reviews feedback from Reviewers and implements every valid recommendation. If a recommendation is judged invalid, the Developer Team Leader must **not** implement it and must instead return a refutation with detailed reasons to the Reviewer.
  - **Completion Signal**: Once all reviewers issue a `PASS` and the changes are verified, the Developer Team Leader who first received the task sends a completion signal back to the General Manager.
- **Reviewer Team Leader**:
  - Receives review requests from the Developer Team Leader.
  - **Detailed Feedback with Directions**: A bare rejection (`NOT PASS` with no explanation) is forbidden. Reviewers **must** specify the exact reason for the issue and provide a concrete, stable, and verified alternative direction for improvement.
  - **Consensus Loop**: Stays in the review cycle until all objections are resolved and a final `PASS` is issued.

### 🛡️ Role Suitability Check Principle (Stay Within Your Own Role)
- Every agent must perform only tasks suitable for its designated role (e.g., Developer Team Leaders do not issue final reviews, and Reviewer Team Leaders do not write project code).
- **If an agent receives a task that does not fit its role**, it must either:
  1. Recommend the optimal agent session to delegate the task to, or
  2. Perform the task directly if strictly necessary for project continuity.

---

## 2. Messaging Backplane & Registry Protocol

Asynchronous communication and state management between agents are handled through distributed event channels and file/DB registries.

### 📡 MQTT Backplane
- **Event Lifecycle**:
  - `started` (job execution starts) ➡️ `progress`/`permission_required` (intermediate progress is shared) ➡️ `completed` (successful termination) or `error` (failed termination)
  - `completed` and `error` are terminal events; a job publishes its terminal event exactly once.
- **Publish/Subscribe Rules**:
  - Because MQTT does not queue messages for clients that have not yet subscribed, the subscriber (`job_subscriber.py`) **must be running in the background before the agent starts** (the Subscribe-before-Publish principle).
  - Publish terminal events with `retain=True` so that the broker keeps them and subscribers joining late can still read the final state.
  - Generalize all transmitted data so that secrets such as passwords and private keys, as well as absolute system paths, are never included.
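The lifecycle and payload rules above can be sketched without a broker. The following is a minimal illustration, not the project's actual code: `JobLifecycle`, `sanitize`, and both regular expressions are hypothetical names and patterns introduced here, and only the event names come from this document.

```python
import re

# Event names are taken from the lifecycle above; everything else is illustrative.
NON_TERMINAL = {"started", "progress", "permission_required"}
TERMINAL = {"completed", "error"}

# Hypothetical patterns (assumptions): keys that look secret, and absolute POSIX paths.
SECRET_KEY = re.compile(r"password|secret|token|private_key", re.IGNORECASE)
ABSOLUTE_PATH = re.compile(r"(?<![\w.:/])/(?:[\w.-]+/)+[\w.-]+")


class JobLifecycle:
    """Tracks one job and rejects events that break the lifecycle order."""

    def __init__(self):
        self.started = False
        self.finished = False

    def accept(self, event):
        """Validate `event` and return the retain flag to publish it with."""
        if event not in NON_TERMINAL | TERMINAL:
            raise ValueError(f"unknown event: {event}")
        if self.finished:
            raise ValueError("a terminal event was already published")
        if event == "started":
            if self.started:
                raise ValueError("'started' was already published")
            self.started = True
        elif not self.started:
            raise ValueError(f"'{event}' published before 'started'")
        if event in TERMINAL:
            self.finished = True
        # Only terminal events are retained for late subscribers.
        return event in TERMINAL


def sanitize(payload):
    """Return a copy of `payload` with secret-looking keys and absolute paths masked."""
    clean = {}
    for key, value in payload.items():
        if SECRET_KEY.search(key):
            clean[key] = "<redacted>"
        elif isinstance(value, str):
            clean[key] = ABSOLUTE_PATH.sub("<path>", value)
        else:
            clean[key] = value
    return clean
```

The boolean returned by `accept` could be passed as the `retain` argument of an MQTT client's publish call (for example paho-mqtt's `publish(topic, payload, qos, retain)`), with `sanitize` applied to the payload first. The masking patterns are deliberately simple and would need tuning for a real project.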

### 🗃️ Registry & State Management
- This architecture maintains two distinct registries, each with its own purpose:
  - **Job Registry**: The metadata and lifecycle of each asynchronous job are recorded in an individual JSON file (`.mam/jobs/<id>.json`). Concurrency conflicts (claiming races) across multiple sessions are prevented with file-based `fcntl` advisory locks (`registry_lock` in `registry.py`).
  - **Session Registry**: tmux monitoring state and running-agent metadata are kept in a SQLite database in WAL mode (`.mam/agent-sessions.db`), which supports reliable concurrent transactions on a single host. Because WAL mode depends on shared memory and does not work reliably over network file systems such as NFS, keep the database on a local file system.
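A minimal sketch of the two registry mechanisms, using only the standard library. This is not the code in `registry.py`: `advisory_lock`, `claim_job`, `open_session_db`, the `claimed_by` field, and the `.lock` sidecar file are all assumptions made for illustration, and `fcntl.flock` is used here as one of the available advisory-lock calls.

```python
import fcntl
import json
import os
import sqlite3
from contextlib import contextmanager


@contextmanager
def advisory_lock(lock_path):
    """Hold an exclusive advisory lock on `lock_path` for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def claim_job(job_path, session_id):
    """Return True if this session claimed the job, False if it was already claimed."""
    with advisory_lock(job_path + ".lock"):
        with open(job_path, encoding="utf-8") as f:
            job = json.load(f)
        if job.get("claimed_by") is not None:  # hypothetical field name
            return False
        job["claimed_by"] = session_id
        tmp_path = job_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(job, f)
        os.replace(tmp_path, job_path)  # atomic rename on POSIX file systems
        return True


def open_session_db(db_path):
    """Open the session database and switch it to WAL journaling."""
    conn = sqlite3.connect(db_path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL mode (got {mode!r})")
    return conn
```

Because the lock is advisory, it only protects against other processes that take the same lock before touching the job file; a writer that skips `advisory_lock` is not blocked.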

### 🛡️ Security Protocol (HMAC-SHA256)
- **Unauthenticated PoC Mode**: If the `auth_token` in the job registry is `null` (the default PoC mode), signature verification is skipped and all events are accepted (`verify_hmac` always returns `True`).
- **Authenticated Production Mode**: In production environments or integrations requiring authentication, a unique cryptographic token (`auth_token`) is issued for each job. The publisher must include in the payload an `hmac_sig` signature keyed by this token. To prevent downgrade attacks, the receiving end (`verify_hmac`) immediately drops any message whose signature is missing or does not match.
- **Rollout Strategy**: When the security scheme changes, mismatches between publishing and receiving nodes can cause events to be dropped. Hybrid transition formats are not an acceptable workaround because they risk leaking plaintext tokens. Instead, adopt a **"Simultaneous Rollout"** in which all nodes are updated at once.
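The two modes can be illustrated with the standard `hmac` module. This is a sketch under stated assumptions, not the project's implementation: the names `verify_hmac`, `hmac_sig`, and `auth_token` come from this document, while `sign`, `canonical_bytes`, and the choice to sign a sorted-key JSON encoding of the payload are assumptions made here.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload):
    """Serialize the payload deterministically, excluding the signature itself (assumed scheme)."""
    body = {k: v for k, v in payload.items() if k != "hmac_sig"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _digest(payload, auth_token):
    return hmac.new(auth_token.encode("utf-8"), canonical_bytes(payload), hashlib.sha256).hexdigest()


def sign(payload, auth_token):
    """Return a copy of `payload` carrying an `hmac_sig` keyed by the job's token."""
    return {**payload, "hmac_sig": _digest(payload, auth_token)}


def verify_hmac(payload, auth_token):
    """PoC mode (token is None) accepts everything; otherwise a valid signature is required."""
    if auth_token is None:
        return True
    sig = payload.get("hmac_sig")
    if not isinstance(sig, str):
        return False  # a missing signature is dropped, never treated as PoC mode
    expected = _digest(payload, auth_token)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

The key point for downgrade resistance is that the decision to skip verification depends only on the registry's `auth_token`, never on whether the incoming message happens to carry a signature.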

---

## 3. Collaborative Workflow Execution Loop (Workflow Loop)

```mermaid
sequenceDiagram
    autonumber
    actor User as User
    participant GM as General Manager
    participant DTL as Developer Team Leader
    participant RTL as Reviewer Team Leaders
    participant M as MQTT Backplane

    User->>GM: Hand over requirements
    GM->>DTL: Delegate task (e.g., create landing page)
    Note over DTL: Analyze, breakdown & spawn parallel subagents
    DTL->>M: Publish 'started' event
    Note over DTL: Modify code & implement
    DTL->>M: Publish 'completed'
    DTL->>RTL: Request review (I created landing page. Please review it)
    Note over RTL: Cross-analysis & verification
    alt Defect Found (Reviewer feedback)
        RTL->>DTL: NOT PASS / Feedback (Must include reason & improvement direction)
        Note over DTL: DTL checks validity of suggestions
        alt Valid feedback
            Note over DTL: DTL adopts and modifies code
        else Invalid feedback
            DTL->>RTL: Send refutation & reasons (Did not reflect inappropriate parts)
        end
        DTL->>RTL: Request review again (Modified review items)
    else Verification Pass
        RTL->>DTL: PASS
    end
    DTL->>GM: Send completion signal
    GM->>User: Notify task completion
```

1. **Planning and Allocation**: The General Manager delegates the task to the Developer Team Leader.
2. **Analysis and Internal Execution**: The Developer Team Leader analyzes the task, breaks it down, plans execution, and optionally spawns parallel subagents. It publishes `started`, completes the task, and requests review from the Reviewer Team Leader.
3. **Objection & Refinement Loop**:
   - The Reviewer Team Leader must provide clear reasons and improvement directions for any issues.
   - The Developer Team Leader validates the feedback. Valid suggestions are implemented; invalid ones are refuted with reasons and returned to the reviewer.
   - This cycle repeats until all reviewers issue a `PASS`.
4. **Completion and Report**: The Developer Team Leader sends the final completion signal to the General Manager, who notifies the user.
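The objection and refinement loop can be modeled as a small function. This is only an illustrative sketch: `Review`, `review_loop`, the callback signatures, and the `max_rounds` safety cap are assumptions introduced here (the protocol itself simply repeats until every reviewer issues `PASS`).

```python
from dataclasses import dataclass


@dataclass
class Review:
    verdict: str          # "PASS" or "NOT PASS"
    reason: str = ""      # required for NOT PASS
    direction: str = ""   # required for NOT PASS

    def validate(self):
        if self.verdict not in ("PASS", "NOT PASS"):
            raise ValueError(f"unknown verdict: {self.verdict}")
        if self.verdict == "NOT PASS" and not (self.reason.strip() and self.direction.strip()):
            raise ValueError("NOT PASS requires a reason and an improvement direction")


def review_loop(reviewers, is_valid, max_rounds=10):
    """Run review rounds until every reviewer passes.

    reviewers: callables taking the round number and returning a Review.
    is_valid:  the Developer Team Leader's judgment, Review -> bool.
    Returns a log of (round, action) tuples.
    """
    log = []
    for round_no in range(1, max_rounds + 1):
        reviews = [reviewer(round_no) for reviewer in reviewers]
        for review in reviews:
            review.validate()  # bare rejections are not allowed
        rejected = [r for r in reviews if r.verdict == "NOT PASS"]
        if not rejected:
            log.append((round_no, "completion_signal"))
            return log
        for review in rejected:
            # Valid feedback is adopted; invalid feedback is refuted with reasons.
            log.append((round_no, "adopt" if is_valid(review) else "refute"))
    raise RuntimeError("no consensus within max_rounds")
```

For example, a single reviewer that rejects the first round with a reason and direction, then passes the second, yields `[(1, "adopt"), (2, "completion_signal")]` when the feedback is judged valid.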

---

## 4. Analysis Infrastructure Patterns & Practical Guide (Infra Patterns)

These instructions are critical for preventing data loss and infrastructure-level failures during long-running agent analyses.

### 📸 Preventing TUI Viewport Truncation (The 3 Pane Snapshotting Rules)
To ensure that agents running in tmux do not lose debug logs or earlier output to scrollback limits, the following **snapshotting pattern must be enforced**:
1. **Pre-brief Capture**: Capture the pane (`capture-pane -S -200`) immediately after sending the task instruction (the brief) to back up the starting point of the input history.
2. **Loop Snapshot**: For long-running agent sessions (5 minutes or more), periodically (e.g., every 30 seconds) capture the viewport and append only the new lines to `/tmp/pane-snap.txt`.
3. **Post-job Capture**: Capture the complete pane state one final time immediately after a job completes or returns an error to preserve the entire execution trajectory.
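A minimal sketch of the loop snapshot, assuming the pane is captured with `tmux capture-pane -p -t <target> -S -200`. The helper names (`capture_command`, `capture_pane`, `new_lines`, `append_snapshot`) and the overlap-based way of finding the incremental lines are assumptions made here, not the project's monitor code.

```python
import subprocess


def capture_command(target, history_lines=200):
    """Build the tmux command that prints the pane plus `history_lines` of scrollback."""
    return ["tmux", "capture-pane", "-p", "-t", target, "-S", f"-{history_lines}"]


def capture_pane(target, history_lines=200):
    """Run the capture and return the pane content as a list of lines (requires tmux)."""
    result = subprocess.run(
        capture_command(target, history_lines),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def new_lines(previous, current):
    """Return the lines of `current` not already covered by the tail of `previous`."""
    for overlap in range(min(len(previous), len(current)), 0, -1):
        if previous[-overlap:] == current[:overlap]:
            return current[overlap:]
    return list(current)  # no overlap found: treat everything as new


def append_snapshot(snapshot_path, previous, current):
    """Append only the incremental lines to the snapshot file; return `current`."""
    fresh = new_lines(previous, current)
    if fresh:
        with open(snapshot_path, "a", encoding="utf-8") as f:
            f.write("\n".join(fresh) + "\n")
    return current
```

A monitor could call `capture_pane` every 30 seconds and feed the result to `append_snapshot`, keeping the returned list as `previous` for the next round. The overlap heuristic assumes append-only output; a TUI that redraws the screen in place will produce duplicated lines in the snapshot.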

### 📄 Handling Long Briefing Instructions
- Sending long instructions or prompts (hundreds of lines) through tmux `send-keys` or input buffers can overwhelm the agent's TUI, leading to lost characters or truncated paragraphs.
- **Resolution**: If the instructions are long, write them to a separate file (e.g., `/tmp/brief-<job_id>.md`) and send the agent a short execution command instead: `"Read /tmp/brief-... and execute"`.
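The file-based handoff might look like the sketch below. The function names are hypothetical, and the pane target passed to `send_keys_command` is whatever tmux target the project uses; only the `/tmp/brief-<job_id>.md` naming and the short command wording come from this document.

```python
import os


def write_brief(job_id, instructions, directory="/tmp"):
    """Write the full instructions to brief-<job_id>.md and return the file path."""
    path = os.path.join(directory, f"brief-{job_id}.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(instructions)
    return path


def short_command(brief_path):
    """The one-line instruction that replaces the long prompt."""
    return f"Read {brief_path} and execute"


def send_keys_command(target, text):
    """Build the tmux command that types `text` into the pane and presses Enter."""
    return ["tmux", "send-keys", "-t", target, text, "Enter"]
```

The list returned by `send_keys_command` can be handed to `subprocess.run`; only one short line ever passes through the TUI input buffer, regardless of how long the brief is.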

### ⏱️ Timeout Configuration & Alignment Rules
- **Job Execution Limits (`timeout_sec` & `idle_timeout_sec`)**: Each job independently manages its overall execution timeout (`timeout_sec`, default 3600s) and its idle timeout, the maximum time without receiving messages (`idle_timeout_sec`, default 120s).
- **Monitor Idle Waiting (`SUB_IDLE_TIMEOUT`)**: The idle timeout of the monitor script (`reconcile.sh`), `SUB_IDLE_TIMEOUT`, must always be set generously, to `3600s` (1 hour) or more, so that it matches the maximum job budget. Otherwise idle detection can terminate the monitor early, and control over background tasks is lost before they finish.
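The alignment rule can be checked mechanically before a job is launched. In this sketch, `JobTimeouts` and `check_alignment` are hypothetical names, and reading "align with the maximum job budget" as "at least the larger of 3600s and the job's `timeout_sec`" is an interpretation, not something the source states explicitly.

```python
from dataclasses import dataclass

MIN_SUB_IDLE_TIMEOUT = 3600  # the 1-hour floor from the rule above


@dataclass
class JobTimeouts:
    timeout_sec: int = 3600       # overall execution budget (documented default)
    idle_timeout_sec: int = 120   # max silence between messages (documented default)


def check_alignment(job, sub_idle_timeout):
    """Return a list of problems; an empty list means the monitor outlives the job."""
    required = max(MIN_SUB_IDLE_TIMEOUT, job.timeout_sec)  # assumed interpretation
    problems = []
    if sub_idle_timeout < required:
        problems.append(
            f"SUB_IDLE_TIMEOUT={sub_idle_timeout}s is below the required {required}s"
        )
    return problems
```

With the defaults, `check_alignment(JobTimeouts(), 3600)` reports no problems, while a monitor configured with a 600-second idle timeout is flagged.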

---

## 5. Setup Checklist for New Projects (Setup Checklist)

Use this checklist when deploying this agent orchestration model to a new project:

- [ ] **Virtualenv Dependencies**: Are required Python packages such as `pyyaml` and `paho-mqtt` installed in `.venv` or listed in `requirements.txt`?
- [ ] **Configuration File**: Are the MQTT broker address and security credentials safely loaded and shared via the `.env` file?
- [ ] **Directory Convention**: Are the registry path (`.mam/jobs/`) and logging path (`.mam/delegate_job_logs/`) added to `.gitignore`?
- [ ] **Core Scripts**: Are the core scripts (`mqtt_common.py`, `publish_event.py`, `job_subscriber.py`, and `registry.py`) in place?
- [ ] **HMAC Enablement**: When a new registry job is created, is a random `auth_token` correctly injected, and is signature-based mutual authentication active?
- [ ] **Charter Placement**: Is this protocol file (`AGENT.md`) placed in the **`.agents/` directory** of the new project? (Placing it in `.agents/` keeps the project root clean while still letting onboarding agents align on the rules.)
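Some of these items can be verified by a script. The sketch below covers only the file-based checks (ignore entries, core scripts, charter location). `check_project` and its `scripts_dir` parameter are hypothetical, since this document does not say where the core scripts live, and the `.gitignore` check is a plain line match that does not understand broader patterns such as `.mam/`.

```python
from pathlib import Path

# File and path names are taken from the checklist above.
CORE_SCRIPTS = ("mqtt_common.py", "publish_event.py", "job_subscriber.py", "registry.py")
IGNORED_PATHS = (".mam/jobs/", ".mam/delegate_job_logs/")


def check_project(root, scripts_dir="."):
    """Return a list of missing checklist items; an empty list means all file checks pass."""
    root = Path(root)
    missing = []

    gitignore = root / ".gitignore"
    entries = set()
    if gitignore.is_file():
        entries = {line.strip() for line in gitignore.read_text(encoding="utf-8").splitlines()}
    for path in IGNORED_PATHS:
        if path not in entries:
            missing.append(f".gitignore entry: {path}")

    # scripts_dir is an assumption: adjust it to wherever the project keeps these files.
    for name in CORE_SCRIPTS:
        if not (root / scripts_dir / name).is_file():
            missing.append(f"core script: {name}")

    if not (root / ".agents" / "AGENT.md").is_file():
        missing.append("charter: .agents/AGENT.md")
    return missing
```

The remaining items (dependencies, `.env` handling, HMAC enablement) depend on runtime behavior and still need to be confirmed by hand.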

---

*This guide balances collaboration efficiency with strict code security. Any required changes must be discussed and agreed upon by the General Manager and all Team Leaders before this document is updated.*