feat(tmux-agent-orchestrate-monitor): integrate watchdog pattern as skill
Moved /tmp/subscriber-watchdog.sh → skills/tmux-agent-orchestrate-monitor/scripts/watchdog.sh (skill-managed lifecycle, no longer lives outside workspace). Added lib.sh::start_watchdog() helper: - Spawns watchdog as background nohup process - Writes watchdog log to .hermes/jobs/<JID>.watchdog.log - Returns watchdog PID via stdout Wired create_session.sh --submit-job to auto-start watchdog after JOB registration. Fixes: - Bug: registry.py get first-line parse was fragile (empty status → infinite loop) → Now uses python3 json.load for robust parsing - Bug: old path skills/delegate-job/scripts/job_subscriber.py hardcoded → Now uses skills/tmux-agent-orchestrate-delegate-job/scripts/job_subscriber.py Verified on isolated server -L agy-watchdog-skill-test (kill-server after): - Syntax check PASS - E2E: register job → start watchdog → publish completed → watchdog exits - Global skill non-interference verified - Main isolated server -L multi-agent-canary untouched
This commit is contained in:
@@ -425,3 +425,25 @@ delegate_publish_event() {
|
||||
"$py_bin" "$pub" --job "$job_id" --event "$event" --detail "$detail" || true
|
||||
}
|
||||
|
||||
# start_watchdog <job_id> [workdir]
|
||||
# Spawns a watchdog process to monitor a delegate-job JOB in the background.
|
||||
# The watchdog re-spawns the subscriber every 2 minutes (or whatever hard
|
||||
# limit we set) and exits automatically when the JOB reaches terminal state.
|
||||
# Returns the watchdog PID via stdout.
|
||||
start_watchdog() {
|
||||
local job_id="$1"
|
||||
local workdir="${2:-$PWD}"
|
||||
local watchdog_script="$workdir/skills/tmux-agent-orchestrate-monitor/scripts/watchdog.sh"
|
||||
local log_file="$workdir/.hermes/jobs/${job_id}.watchdog.log"
|
||||
|
||||
if [ ! -x "$watchdog_script" ]; then
|
||||
echo "ERROR: watchdog not found or not executable: $watchdog_script" >&2
|
||||
return 1
|
||||
fi
|
||||
|
||||
nohup "$watchdog_script" "$job_id" "$workdir" > "$log_file" 2>&1 &
|
||||
local pid=$!
|
||||
echo "$pid"
|
||||
}
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user