
# BOOTSTRAP.md

This document walks through the setup and initialization steps needed to adopt the `tmux_agent_orchestration` workflow (agent orchestration plus a messaging backplane) in a new project, so that a new developer or agent can get up and running quickly.

A new agent can follow the steps in this guide sequentially to establish a stable and reliable initial environment.

---

## 1. Scaffolding Overview (Project Structure)

Before cloning this project into a new environment, you must first understand the locations and roles of its core components:

*   `.agents/skills/`: A collection of skill scripts (shell, plus Python for job delegation) that handle multi-agent coordination and asynchronous job processing.
    *   `lib.sh`: The core orchestration shell functions and virtual environment (venv) auto-loading library.
    *   `multi-agent-mux-create/`: Script to launch isolated tmux agent sessions.
    *   `multi-agent-mux-stop/`: Script to gracefully stop agent sessions and update states.
    *   `multi-agent-mux-resume/`: Script to restore stopped agent sessions back to their previous conversation state.
    *   `multi-agent-mux-status/`: Script to query the current running state of all agent sessions.
    *   `multi-agent-mux-monitor/`: Monitor script to sync tmux states with the registry.
    *   `multi-agent-mux-delegate-job/`: Asynchronous job splitting and delegation module.
        *   `requirements.txt`: Python dependency list (`paho-mqtt`, `pyyaml`).
        *   `scripts/`: Python scripts running the core business logic.
            *   `registry.py`: Job registration, claiming, and atomic file lock control (CLI supported).
            *   `job_subscriber.py`: Background event subscriber and audit log generator.
            *   `publish_event.py`: Event publisher for runtime states and error traps.
            *   `mqtt_common.py`: Common utility for connecting to the MQTT broker.
*   `AGENT.md`: Definition of agent roles (PM, Worker, Reviewer) and event publication rules.
*   `MESSAGING.md`: Messaging scheme and wire protocol guidelines for MQTT communication between agents.

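The overview describes `registry.py` as providing atomic file lock control. The sketch below shows one common way to get that property, assuming a per-job `.lock` file created with `O_CREAT | O_EXCL`. The function names `try_claim` and `release` are hypothetical, and the actual locking mechanism in `registry.py` may differ.

```python
import os
import tempfile
from pathlib import Path


def try_claim(jobs_dir: Path, job_id: str) -> bool:
    """Return True if this caller created the lock file, False if it already existed."""
    lock_path = jobs_dir / f"{job_id}.lock"
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one caller can succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release(jobs_dir: Path, job_id: str) -> None:
    """Remove the lock file; a missing file is ignored."""
    (jobs_dir / f"{job_id}.lock").unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jobs = Path(tmp)
        print(try_claim(jobs, "deadbeef"))  # True
        print(try_claim(jobs, "deadbeef"))  # False
        release(jobs, "deadbeef")
        print(try_claim(jobs, "deadbeef"))  # True
```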
---

## 2. Environment Configuration (.env)

To set up the messaging broker and execution paths, you must create and modify a local environment configuration file (`.env`).

### Step 2.1: Run the Generation Script
Run the environment template copy script provided in the project root:

```bash
# Automatically copy .env.example to .env (does not overwrite if it already exists)
./scripts/generate-env.sh

# To force overwrite and create a backup of the existing .env:
./scripts/generate-env.sh --force
```

### Step 2.2: Modify Environment Variables
Open the generated `.env` file to configure settings as needed.

> [!NOTE]
> The default `.env` file generated by `generate-env.sh` has every environment variable commented out. If they stay commented out, the system falls back to the default paths (`.mam/`, etc.) under the local project root and to the public MQTT broker. You can use the file as-is without uncommenting anything.

1.  **MQTT Broker Setup (`MQTT_BROKER`)**:
    *   The default broker is HiveMQ's public sandbox broker (`broker.hivemq.com`). However, for production work where security and privacy are critical, we strongly recommend changing this to a private broker address.
2.  **Authentication Credentials (`MQTT_USERNAME`, `MQTT_PASSWORD`)**:
    *   If using a secured broker, change the placeholders marked `replace_me` to your actual broker credentials.
3.  **Path Variables (Optional)**:
    *   Uncomment and specify absolute paths for variables like `AGENT_SESSIONS_YAML` and `DELEGATE_JOB_LOGS_DIR` only if you need to override the default relative paths to align with specific build systems or host directories.

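The fallback behaviour described above can be sketched as follows. This assumes the values from `.env` have already been exported into the process environment. `BrokerConfig` and `load_broker_config` are hypothetical names, and treating the `replace_me` placeholder as "unset" is an assumption, not documented behaviour of the project scripts.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_BROKER = "broker.hivemq.com"  # public sandbox broker named in this guide
PLACEHOLDER = "replace_me"            # placeholder value used in .env.example


@dataclass(frozen=True)
class BrokerConfig:
    host: str
    username: Optional[str]
    password: Optional[str]


def load_broker_config(env: Mapping[str, str] = os.environ) -> BrokerConfig:
    """Read broker settings, falling back to defaults for unset or placeholder values."""

    def clean(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        # Assumption: an empty value or the literal placeholder counts as "not configured".
        return None if value in ("", PLACEHOLDER) else value

    return BrokerConfig(
        host=clean("MQTT_BROKER") or DEFAULT_BROKER,
        username=clean("MQTT_USERNAME"),
        password=clean("MQTT_PASSWORD"),
    )
```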
> [!WARNING]
> **Security Mode Default Warning**:
> By default the system runs in **unauthenticated PoC mode**: if no `auth_token` is explicitly provided during job registration (or it is `null`), HMAC signature verification is skipped.
> On a public broker or in production, you must generate a unique random `auth_token` and inject it during job registration to enable HMAC signature security. (For the detailed security protocol, refer to section `2.3 Security Protocol` in [MESSAGING.md](./MESSAGING.md) and [AGENT.md](./AGENT.md). Automated token generation and injection via CLI is on the roadmap under task `FW-N6`.)

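Until automated token generation lands, a token can be produced by hand. The sketch below illustrates the general idea only: it generates a random token and shows how an HMAC check is skipped when the token is `None`. The choice of SHA-256, the JSON canonicalization, and the function names are assumptions. The authoritative signing rules are those in `MESSAGING.md`.

```python
import hashlib
import hmac
import json
import secrets
from typing import Optional


def new_auth_token(num_bytes: int = 32) -> str:
    """Return a random hex token from a cryptographically secure source."""
    return secrets.token_hex(num_bytes)


def sign(payload: dict, auth_token: str) -> str:
    """HMAC-SHA256 over a deterministic JSON encoding (assumed canonical form)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(auth_token.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify(payload: dict, signature: str, auth_token: Optional[str]) -> bool:
    """Mirror the PoC behaviour: no token means verification is skipped."""
    if auth_token is None:
        return True
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sign(payload, auth_token), signature)
```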
---

## 3. Dependency and Virtualenv Setup

Set up the Python 3 dependencies required to run the orchestration and MQTT messaging backplane.

### Step 3.1: Build Python Virtual Environment
Create and activate a `.venv` virtual environment in the project root:

```bash
# Create virtual environment
python3 -m venv .venv

# Activate virtual environment
source .venv/bin/activate
```

### Step 3.2: Install Dependency Packages
Install the required packages listed in `requirements.txt` under `multi-agent-mux-delegate-job`:

```bash
# Install dependencies (pyyaml, paho-mqtt, etc.)
pip install -r .agents/skills/multi-agent-mux-delegate-job/requirements.txt
```
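To confirm the result of the two steps above, a quick check can be run with the interpreter under test. This is a minimal sketch using only the standard library. The helper names are hypothetical, and `yaml` and `paho` are the import names that `pyyaml` and `paho-mqtt` install.

```python
import importlib.util
import sys
from typing import Iterable, List


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("missing modules:", missing_modules(["yaml", "paho"]))
```

Running it as `.venv/bin/python3 <file>` should report an active virtual environment and an empty list of missing modules once the install has succeeded.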

---

## 4. Directory Structure and Security Audit Guide

Ensure that the local registry directories used to track agent states and jobs exist, and that they are excluded from version control:

1.  **Required Directory Structure**:
    *   `.mam/jobs/`: Holds detailed metadata files for registered asynchronous jobs.
    *   `.mam/delegate_job_logs/`: Holds the audit logs (`events.ndjson`) for all backplane events published by agents.
2.  **Git Ignore Configuration (`.gitignore`)**:
    *   When initializing a new project, verify that the following entries are present in `.gitignore` so that local runtime artifacts and secrets are never committed to the repository. Keep the `!.env.example` exception so that the template stays tracked:
        ```text
        .env
        .env.*
        !.env.example
        .mam/
        .venv/
        __pycache__/
        *.pyc
        ```
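The two checks in this section can be automated. The sketch below is illustrative and uses hypothetical helper names. It matches `.gitignore` entries by exact line, so an equivalent pattern written differently (for example `/.mam`) would be reported as missing.

```python
from pathlib import Path
from typing import List

REQUIRED_DIRS = (".mam/jobs", ".mam/delegate_job_logs")
REQUIRED_IGNORES = (
    ".env", ".env.*", "!.env.example", ".mam/", ".venv/", "__pycache__/", "*.pyc",
)


def missing_dirs(root: Path) -> List[str]:
    """Return the required registry directories that do not exist under root."""
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


def missing_ignores(gitignore_text: str) -> List[str]:
    """Return the required entries that are absent from the .gitignore content."""
    present = {line.strip() for line in gitignore_text.splitlines()}
    return [entry for entry in REQUIRED_IGNORES if entry not in present]


if __name__ == "__main__":
    root = Path.cwd()
    gitignore = root / ".gitignore"
    text = gitignore.read_text() if gitignore.is_file() else ""
    print("missing directories:", missing_dirs(root))
    print("missing .gitignore entries:", missing_ignores(text))
```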

---

## 5. Execution Verification and Bootstrap Tests

To verify that the environment has been successfully built without runtime errors, run the following verification checklist.

> [!IMPORTANT]
> All verification commands below must be executed from the **project root directory** (where the `.mam/` directory is directly visible). The scripts resolve the default job registry path, `./.mam/jobs`, relative to the current working directory.

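A guard like the following can catch a wrong working directory before any verification command runs. It is a small sketch with hypothetical helper names, and it assumes that the presence of `.mam/` is a sufficient marker for the project root.

```python
import tempfile
from pathlib import Path


def is_project_root(path: Path) -> bool:
    """True when the .mam/ directory is directly visible under path."""
    return (path / ".mam").is_dir()


def default_jobs_dir(cwd: Path) -> Path:
    """The default registry location, resolved against the working directory."""
    return cwd / ".mam" / "jobs"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(is_project_root(root))  # False
        default_jobs_dir(root).mkdir(parents=True)
        print(is_project_root(root))  # True
```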
### Verification Test 1: Registry Script Load Check
Verify that the Python scripts and virtual environment libraries load correctly by listing jobs:

```bash
# Run using the python interpreter in the virtual environment
.venv/bin/python3 .agents/skills/multi-agent-mux-delegate-job/scripts/registry.py list
```
*   **Expected Output**: The command exits successfully and prints an empty JSON array `[]` (or a list of pending/running jobs if any exist), with no Python traceback.

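The expected output can also be checked programmatically. The sketch below only verifies that the command succeeds and prints a JSON array. It makes no assumption about the fields of individual job entries, and the helper names are hypothetical.

```python
import json
import subprocess

REGISTRY = ".agents/skills/multi-agent-mux-delegate-job/scripts/registry.py"


def is_job_list(stdout: str) -> bool:
    """True when the output parses as a JSON array (empty or not)."""
    try:
        parsed = json.loads(stdout)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, list)


def check_registry_list(python: str = ".venv/bin/python3") -> bool:
    """Run `registry.py list` from the project root and validate its output."""
    result = subprocess.run(
        [python, REGISTRY, "list"], capture_output=True, text=True
    )
    return result.returncode == 0 and is_job_list(result.stdout)
```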
### Verification Test 2: MQTT Broker Connection Handshake Test
Test the end-to-end communication through the broker to verify that events are published and received correctly:

```bash
# 1. Register a temporary test job and capture its 8-character Hex Job ID
JID=$(.venv/bin/python3 .agents/skills/multi-agent-mux-delegate-job/scripts/registry.py register \
  --agent "test-agent" \
  --prompt "Bootstrap check command" \
  --timeout 120)
echo "Generated Job ID: $JID"

# 2. Start the event subscriber for this Job ID in the background and record its PID
.venv/bin/python3 .agents/skills/multi-agent-mux-delegate-job/scripts/job_subscriber.py --job "$JID" &
SUB_PID=$!

# 3. Wait 2 seconds to allow the subscriber to establish its MQTT connection
sleep 2

# 4. Publish a start event (adhering to the Subscribe-before-Publish rule)
.venv/bin/python3 .agents/skills/multi-agent-mux-delegate-job/scripts/publish_event.py \
  --job "$JID" \
  --event started \
  --detail "Bootstrap MQTT verification connection check"

# 5. Verify that the event is printed to stdout and written to the audit log:
#    .mam/delegate_job_logs/events.ndjson

# 6. Stop the background subscriber by PID (other background jobs are unaffected)
#    and clean up the test job records
kill "$SUB_PID"
rm -f ".mam/jobs/$JID.json" ".mam/jobs/$JID.lock"
```
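Step 5 can be checked with a small script instead of by eye. The field names inside `events.ndjson` are not specified in this guide, so this sketch makes no assumption about them and matches the Job ID against any top-level value of each record. `events_for_job` is a hypothetical helper.

```python
import json
from typing import Iterable, List


def events_for_job(lines: Iterable[str], job_id: str) -> List[dict]:
    """Return the NDJSON records that carry job_id as one of their top-level values."""
    matches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or malformed lines
        if isinstance(record, dict) and job_id in record.values():
            matches.append(record)
    return matches


# Example use from the project root (path taken from this guide):
#   with open(".mam/delegate_job_logs/events.ndjson") as log:
#       print(events_for_job(log, "<job id>"))
```

A non-empty result for the test Job ID indicates that the subscriber received the event and wrote it to the audit log.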

---

## 6. Onboarding Collaborating Agents (New Agent Onboarding)

Once the setup is verified, onboarding agents should immediately read the **[AGENT.md](./AGENT.md)** guidelines in the project root.

The guidelines describe essential workflows, including **surgical change constraints, cross-verification review loops, and pane snapshotting to prevent viewport truncation**, so that new agents can integrate quickly and safely with the multi-agent workflow.