Compare commits
164 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| cadaa5dc25 | |||
| 910b601a5e | |||
| fa0a5fbb9c | |||
| cd93fc71f7 | |||
| 3e94472e78 | |||
| 0609813ba0 | |||
| 09e04359e3 | |||
| fe32f65fa4 | |||
| 1a5b6e25f8 | |||
| 54760964cf | |||
| e463670649 | |||
| 6e6890ebd9 | |||
| 609b3ed090 | |||
| 65faa40b9a | |||
| 9f97de115b | |||
| 8f21f4df19 | |||
| ff7a52c1d2 | |||
| 4ed6b84863 | |||
| 7a124d7d25 | |||
| f00c567469 | |||
| 6f0e5b4589 | |||
| 5da4d05bf2 | |||
| 1a8718ca9d | |||
| c1c225aa05 | |||
| dc7c10d6fe | |||
| a827b0841e | |||
| a9c93ea9df | |||
| bb69af31f8 | |||
| 7644da4280 | |||
| 13e4af421d | |||
| f2d5307573 | |||
| bc9a22b46a | |||
| 932e71c0bf | |||
| d3b0b330aa | |||
| 5e927bcd13 | |||
| 890a146413 | |||
| afdf0779a1 | |||
| eb7cae1fea | |||
| fe82dc7f2b | |||
| b00b0ba4aa | |||
| 3f04567290 | |||
| acb9cd67c6 | |||
| d90ab7e646 | |||
| 8ea90adcaf | |||
| de803e1e76 | |||
| 019efab804 | |||
| 957d37f51f | |||
| 8e084262a0 | |||
| 504144eb9c | |||
| 86374ab293 | |||
| 199edb228c | |||
| 598a20a3f0 | |||
| c8b5ba3812 | |||
| 5ea9fda69b | |||
| 4f7cfc0418 | |||
| 1f38a96561 | |||
| 660b9b3810 | |||
| 328069809b | |||
| b1551045dc | |||
| d02226aab9 | |||
| 39811c9b32 | |||
| f7f161e60f | |||
| e6040fc824 | |||
| 17fc44d0d8 | |||
| 1bebb7467f | |||
| cc1d986a74 | |||
| fabcd026af | |||
| aff042855a | |||
| 39b0c4f720 | |||
| 43a5700ae6 | |||
| 7acdabaf96 | |||
| dfd2d5f620 | |||
| f24e2857ab | |||
| d38432f640 | |||
| 4e570e3e2b | |||
| a64e3170cd | |||
| 4da4babcf4 | |||
| 384e496a1b | |||
| b38c6110f2 | |||
| 74efb1c143 | |||
| f23b2b9683 | |||
| 423003aa05 | |||
| af82f2ba20 | |||
| fe8e15d211 | |||
| b098556757 | |||
| 5c5f277d6d | |||
| 2fa5229695 | |||
| c3caa3ea94 | |||
| ee0607f022 | |||
| afe5d43a9a | |||
| dd332a5759 | |||
| 103f9adcfd | |||
| 652c8cb5a7 | |||
| 11a8f3ba99 | |||
| 451e6fc2fc | |||
| 1ecef55fea | |||
| 76e38b24e6 | |||
| b1283a0e7b | |||
| 2c51bc47e8 | |||
| ff495c1521 | |||
| a04aed098d | |||
| 916b70c595 | |||
| 55cb3429d4 | |||
| 545ff3582f | |||
| 8743299226 | |||
| 205e94f960 | |||
| 86b0a4d285 | |||
| 79212481c9 | |||
| 76dd153760 | |||
| b8d10abec9 | |||
| 7ebddf7792 | |||
| 04d7ca2e6a | |||
| f6f47c2f23 | |||
| 39e0976ace | |||
| 299579ab7b | |||
| 3a10c38511 | |||
| db54f3d0b4 | |||
| 8105e93031 | |||
| 0d5c2f1a2e | |||
| bba24d87f7 | |||
| efb3af4a93 | |||
| 65746af720 | |||
| d9e9d27e01 | |||
| 83351606c6 | |||
| d528f578aa | |||
| cf3310e818 | |||
| 74d6b25183 | |||
| dc837a5400 | |||
| 4eff49c9c5 | |||
| 965d5073c3 | |||
| e82bbb587f | |||
| c89a0d334a | |||
| ac9b6d593f | |||
| 8c0a9c5bc6 | |||
| 63a3b9b50a | |||
| 7e6e0b1f5a | |||
| ab528d9163 | |||
| 7967d32f12 | |||
| a7de3dbb9f | |||
| 0fbf2ab513 | |||
| 436f42c00c | |||
| 881869352d | |||
| 3f982009e2 | |||
| 52820278fd | |||
| abcb336e7c | |||
| 1c7812fa9f | |||
| 4c60779fac | |||
| 726713d081 | |||
| 5265e25f9b | |||
| 035ed430ba | |||
| f145203eee | |||
| eafd1c1fb2 | |||
| e6ad7ae10e | |||
| 05b12b41b6 | |||
| a59da9921e | |||
| bbd6ec85ac | |||
| ce8cb5f0f1 | |||
| 9eb5eef676 | |||
| c94a2542bd | |||
| e6b3cd1824 | |||
| 49f77f2d1e | |||
| d3c2d9e8f6 | |||
| f114c861b4 | |||
| 544a024e22 |
@@ -1,6 +1,6 @@
|
||||
# Weekly canary suite. Catches upstream regressions (broken pipelock
|
||||
# image packaging at the pinned digest, etc.) without coupling every
|
||||
# dev push to upstream registry availability.
|
||||
# Weekly canary suite. Catches upstream regressions (broken pinned
|
||||
# digest, etc.) without coupling every dev push to upstream registry
|
||||
# availability.
|
||||
#
|
||||
# Opt-in via CLAUDE_BOTTLE_RUN_CANARIES=1 so the same files can be run
|
||||
# locally with the same gating.
|
||||
|
||||
@@ -26,7 +26,7 @@ jobs:
|
||||
- name: Run pylint
|
||||
run: |
|
||||
# Run pylint on all Python files in the repo
|
||||
find . -name '*.py' -not -path './.venv/*' -not -path './.git/*' | xargs pylint --fail-under=8.0 || true
|
||||
find . -name '*.py' -not -path './.venv/*' -not -path './.git/*' | xargs pylint --fail-under=8.0
|
||||
|
||||
- name: Run pyright
|
||||
run: |
|
||||
|
||||
@@ -0,0 +1,125 @@
|
||||
# Assign sequential numbers to prd-new-*.md files on merge to main.
|
||||
#
|
||||
# When a PR merges to main and includes prd-new-*.md files this workflow:
|
||||
# 1. Finds the next available NNNN number by scanning existing PRDs.
|
||||
# 2. Renames each prd-new-*.md to NNNN-<slug>.md.
|
||||
# 3. Updates the title header (# PRD prd-new: → # PRD NNNN:).
|
||||
# 4. Flips Status: Draft → Active when the push touched files outside
|
||||
# docs/prds/ anywhere in its commit range (i.e. the implementation
|
||||
# shipped together with the PRD).
|
||||
# 5. Commits the renaming back to main.
|
||||
#
|
||||
# No-op if the working tree contains no prd-new-*.md files.
|
||||
#
|
||||
# NOTE: The workflow scans the working tree (not just HEAD~1..HEAD) because
|
||||
# PRs land as multi-commit pushes and the prd-new file is often added in an
|
||||
# earlier commit on the branch, not in the final squash/merge commit.
|
||||
|
||||
name: prd-number
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
paths:
|
||||
- 'docs/prds/prd-new-*.md'
|
||||
|
||||
jobs:
|
||||
assign-numbers:
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: write
|
||||
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
with:
|
||||
fetch-depth: 0
|
||||
token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: "3.12"
|
||||
|
||||
- name: Configure git
|
||||
run: |
|
||||
git config user.name "github-actions[bot]"
|
||||
git config user.email "github-actions[bot]@users.noreply.github.com"
|
||||
|
||||
- name: Assign PRD numbers
|
||||
run: |
|
||||
python3 - <<'EOF'
|
||||
import os
|
||||
import re
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
prds_dir = Path("docs/prds")
|
||||
|
||||
# Scan the working tree — prd-new files may have landed in any
|
||||
# commit of a multi-commit push, not just HEAD.
|
||||
new_prds = sorted(prds_dir.glob("prd-new-*.md"))
|
||||
|
||||
if not new_prds:
|
||||
print("No prd-new-*.md files found — nothing to do.")
|
||||
sys.exit(0)
|
||||
|
||||
# Determine whether non-PRD files were also changed anywhere in
|
||||
# the push range (BEFORE_SHA → HEAD). Falls back to HEAD~1 when
|
||||
# the env var isn't set (e.g. local act runs).
|
||||
before_sha = os.environ.get("GITHUB_EVENT_BEFORE", "HEAD~1")
|
||||
all_changed = subprocess.run(
|
||||
["git", "diff", "--name-only", before_sha, "HEAD"],
|
||||
capture_output=True, text=True, check=True,
|
||||
).stdout.splitlines()
|
||||
non_prd_changed = any(
|
||||
not f.startswith("docs/prds/") for f in all_changed
|
||||
)
|
||||
|
||||
# Find next available number.
|
||||
existing = sorted(
|
||||
int(m.group(1))
|
||||
for p in prds_dir.glob("*.md")
|
||||
if (m := re.match(r"^(\d{4})-", p.name))
|
||||
)
|
||||
next_num = (max(existing) + 1) if existing else 1
|
||||
|
||||
for prd_path in sorted(new_prds):
|
||||
slug = re.sub(r"^prd-new-", "", prd_path.stem)
|
||||
new_name = f"{next_num:04d}-{slug}.md"
|
||||
new_path = prds_dir / new_name
|
||||
print(f" {prd_path.name} → {new_name}")
|
||||
|
||||
content = prd_path.read_text()
|
||||
|
||||
# Update title header.
|
||||
content = re.sub(
|
||||
r"^(#\s+PRD\s+)prd-new(:)",
|
||||
rf"\g<1>{next_num:04d}\2",
|
||||
content,
|
||||
count=1,
|
||||
flags=re.MULTILINE,
|
||||
)
|
||||
|
||||
# Conditionally flip Status.
|
||||
if non_prd_changed:
|
||||
content = re.sub(
|
||||
r"(\*\*Status:\*\*\s*)Draft",
|
||||
r"\g<1>Active",
|
||||
content,
|
||||
count=1,
|
||||
)
|
||||
|
||||
new_path.write_text(content)
|
||||
subprocess.run(["git", "rm", str(prd_path)], check=True)
|
||||
subprocess.run(["git", "add", str(new_path)], check=True)
|
||||
next_num += 1
|
||||
|
||||
subprocess.run(
|
||||
["git", "commit", "-m", "ci(prd): assign sequential numbers to new PRDs"],
|
||||
check=True,
|
||||
)
|
||||
subprocess.run(["git", "push"], check=True)
|
||||
EOF
|
||||
@@ -21,7 +21,11 @@ on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
paths:
|
||||
- '**.py'
|
||||
pull_request:
|
||||
paths:
|
||||
- '**.py'
|
||||
|
||||
jobs:
|
||||
unit:
|
||||
|
||||
@@ -8,6 +8,7 @@ on:
|
||||
- '**.py'
|
||||
- '.pylintrc'
|
||||
- 'pyrightconfig.json'
|
||||
workflow_dispatch:
|
||||
|
||||
jobs:
|
||||
update-badges:
|
||||
@@ -31,28 +32,16 @@ jobs:
|
||||
- name: Run pylint and extract score
|
||||
id: pylint
|
||||
run: |
|
||||
# Run pylint and capture the score
|
||||
PYLINT_OUTPUT=$(python -m pylint bot_bottle/ 2>&1 | tail -1)
|
||||
echo "Output: $PYLINT_OUTPUT"
|
||||
# Extract score (e.g., "9.92/10")
|
||||
SCORE=$(echo "$PYLINT_OUTPUT" | grep -oP '\d+\.\d+/10' | head -1)
|
||||
if [ -z "$SCORE" ]; then
|
||||
SCORE="9.92/10"
|
||||
fi
|
||||
PYLINT_OUTPUT=$(python -m pylint bot_bottle/ 2>&1) || true
|
||||
SCORE=$(echo "$PYLINT_OUTPUT" | grep -oP '(?<=rated at )\d+\.\d+/10' | head -1)
|
||||
echo "score=$SCORE" >> $GITHUB_OUTPUT
|
||||
echo "Pylint score: $SCORE"
|
||||
|
||||
- name: Run pyright and check errors
|
||||
id: pyright
|
||||
run: |
|
||||
# Run pyright and check for errors
|
||||
PYRIGHT_OUTPUT=$(python -m pyright 2>&1 | tail -1)
|
||||
echo "Output: $PYRIGHT_OUTPUT"
|
||||
# Extract error count
|
||||
ERRORS=$(echo "$PYRIGHT_OUTPUT" | grep -oP '^\d+' | head -1)
|
||||
if [ -z "$ERRORS" ]; then
|
||||
ERRORS="0"
|
||||
fi
|
||||
PYRIGHT_OUTPUT=$(python -m pyright 2>&1) || true
|
||||
ERRORS=$(echo "$PYRIGHT_OUTPUT" | grep -oP '\d+(?= error)' | head -1)
|
||||
echo "errors=$ERRORS" >> $GITHUB_OUTPUT
|
||||
echo "Pyright errors: $ERRORS"
|
||||
|
||||
@@ -61,16 +50,14 @@ jobs:
|
||||
PYLINT_SCORE="${{ steps.pylint.outputs.score }}"
|
||||
PYRIGHT_ERRORS="${{ steps.pyright.outputs.errors }}"
|
||||
|
||||
# Escape / for sed
|
||||
PYLINT_SCORE_ESCAPED=$(echo "$PYLINT_SCORE" | sed 's/\//\\\//g')
|
||||
PYLINT_SCORE_ENCODED=$(echo "$PYLINT_SCORE" | sed 's|/|%2F|g')
|
||||
|
||||
# Create badge URLs with proper encoding
|
||||
PYLINT_BADGE="[](https://github.com/PyCQA/pylint)"
|
||||
PYRIGHT_BADGE="[](https://github.com/microsoft/pyright)"
|
||||
|
||||
# Update README with new badges
|
||||
sed -i "s|\[\!\[pylint\].*pylint)\]|${PYLINT_BADGE}|g" README.md
|
||||
sed -i "s|\[\!\[pyright\].*pyright)\]|${PYRIGHT_BADGE}|g" README.md
|
||||
if [ -n "$PYLINT_SCORE_ENCODED" ]; then
|
||||
sed -i "s|/badge/pylint-[^)]*|/badge/pylint-${PYLINT_SCORE_ENCODED}-brightgreen|" README.md
|
||||
fi
|
||||
if [ -n "$PYRIGHT_ERRORS" ]; then
|
||||
sed -i "s|/badge/pyright-[^)]*|/badge/pyright-${PYRIGHT_ERRORS}%20errors-brightgreen|" README.md
|
||||
fi
|
||||
|
||||
echo "Updated badges:"
|
||||
grep -E "pylint|pyright" README.md | head -2
|
||||
@@ -86,11 +73,7 @@ jobs:
|
||||
else
|
||||
echo "Badge changes detected, committing..."
|
||||
git add README.md
|
||||
git commit -m "chore: update quality badges
|
||||
|
||||
- Pylint: ${{ steps.pylint.outputs.score }}
|
||||
- Pyright: ${{ steps.pyright.outputs.errors }} errors
|
||||
|
||||
[skip ci]"
|
||||
MSG="chore: update quality badges"$'\n\n'"- Pylint: ${{ steps.pylint.outputs.score }}"$'\n'"- Pyright: ${{ steps.pyright.outputs.errors }} errors"$'\n\n'"[skip ci]"
|
||||
git commit -m "$MSG"
|
||||
git push
|
||||
fi
|
||||
|
||||
@@ -419,7 +419,8 @@ disable=raw-checker-failed,
|
||||
too-many-instance-attributes,
|
||||
duplicate-code,
|
||||
import-outside-toplevel,
|
||||
too-few-public-methods
|
||||
too-few-public-methods,
|
||||
unnecessary-ellipsis
|
||||
|
||||
# Enable the message, report, category or checker with the given id(s). You can
|
||||
# either give multiple identifier separated by comma (,) or put this option
|
||||
|
||||
@@ -2,11 +2,18 @@
|
||||
|
||||
## What this is
|
||||
|
||||
bot-bottle spins up an isolated container for running AI coding agents with a
|
||||
curated set of skills and env vars. The point is to run agents with broad
|
||||
permissions inside a sandbox, so a misbehaving agent cannot reach the host.
|
||||
A Python CLI (entry point `cli.py`, package `bot_bottle/`) orchestrates
|
||||
the container lifecycle and the copying of skills and env vars into it.
|
||||
bot-bottle spins up an isolated backend runtime for running AI coding agents
|
||||
with a curated set of skills and env vars. The point is to run agents with
|
||||
broad permissions inside a sandbox, so a misbehaving agent cannot reach the
|
||||
host. A Python CLI (entry point `cli.py`, package `bot_bottle/`) orchestrates
|
||||
the runtime lifecycle and the copying of skills and env vars into it.
|
||||
The default backend on compatible macOS hosts is macos-container:
|
||||
agents and sidecar bundles run through Apple's `container` CLI without
|
||||
requiring Docker. The smolmachines backend remains available with
|
||||
`BOT_BOTTLE_BACKEND=smolmachines` or `--backend=smolmachines`; agents
|
||||
run in a libkrun micro-VM, while the sidecar bundle still uses Docker.
|
||||
The legacy Docker backend remains available with `BOT_BOTTLE_BACKEND=docker`
|
||||
or `--backend=docker`.
|
||||
|
||||
## Goals
|
||||
|
||||
@@ -17,7 +24,7 @@ the container lifecycle and the copying of skills and env vars into it.
|
||||
## Non-goals
|
||||
|
||||
- Communicating between agents directly
|
||||
- Self hosted VMs (v1 uses local Docker containers, not VMs)
|
||||
- Removing the Docker backend
|
||||
- Advanced agent auditing (lean on git history for auditing)
|
||||
|
||||
## Repository layout
|
||||
@@ -25,9 +32,8 @@ the container lifecycle and the copying of skills and env vars into it.
|
||||
- `README.md` — short public-facing description.
|
||||
- `AGENTS.md` — this file, orientation for future agent sessions.
|
||||
- `.gitignore` — OS junk.
|
||||
- `bot-bottle.json` — legacy manifest of named agents (env / skills / prompt
|
||||
per agent), consumed by `cli.py`. See "Manifest" under
|
||||
"Intended design".
|
||||
- `.bot-bottle/` — per-repo agent and bottle manifests (YAML markdown format).
|
||||
- `examples/` — example bottles and agents showing the manifest format.
|
||||
- `docs/README.md` — docs overview; when to write which document.
|
||||
- `docs/prds/` — product requirement docs (see `docs/prds/README.md` for format).
|
||||
- `docs/research/` — research notes (see `docs/research/README.md`).
|
||||
@@ -37,10 +43,11 @@ the container lifecycle and the copying of skills and env vars into it.
|
||||
|
||||
- Three kinds of doc, each with its own conventions in-folder; see
|
||||
`docs/README.md` for when to write which:
|
||||
- **PRDs** (`docs/prds/`) — one feature per file, numbered
|
||||
`NNNN-kebab.md`. A `Status:` line tracks lifecycle: Draft → Active
|
||||
(shipped to `main`) → Superseded/Retargeted. Format in
|
||||
`docs/prds/README.md`.
|
||||
- **PRDs** (`docs/prds/`) — one feature per file. While a PR is open
|
||||
the file is named `prd-new-<kebab>.md`; CI assigns a sequential
|
||||
number on merge to `main` and renames it. A `Status:` line tracks
|
||||
lifecycle: Draft → Active (shipped to `main`) →
|
||||
Superseded/Retargeted. Format in `docs/prds/README.md`.
|
||||
- **Research notes** (`docs/research/`) — opinionated investigations;
|
||||
unnumbered kebab-case, freeform and verdict-first. See
|
||||
`docs/research/README.md`.
|
||||
|
||||
+12
-26
@@ -1,23 +1,18 @@
|
||||
# Per-bottle sidecar bundle image (PRD 0024).
|
||||
#
|
||||
# Collapses the four prior per-sidecar images (pipelock, egress,
|
||||
# git-gate, supervise) into one. A small stdlib-Python init
|
||||
# supervisor at /app/sidecar_init.py spawns all four daemons,
|
||||
# forwards SIGTERM, and propagates per-daemon stdout/stderr to the
|
||||
# container log with a `[name]` prefix. See PRD 0024 for the
|
||||
# rationale.
|
||||
# Collapses the prior per-sidecar images (egress, git-gate,
|
||||
# supervise) into one. A small stdlib-Python init supervisor at
|
||||
# /app/sidecar_init.py spawns all daemons, forwards SIGTERM, and
|
||||
# propagates per-daemon stdout/stderr to the container log with a
|
||||
# `[name]` prefix. See PRD 0024 for the rationale.
|
||||
#
|
||||
# Layout (preserved verbatim from the prior four Dockerfiles so the
|
||||
# compose renderer's bind-mount paths and docker-cp targets keep
|
||||
# working):
|
||||
# Layout:
|
||||
#
|
||||
# /usr/local/bin/pipelock pipelock binary
|
||||
# /usr/bin/gitleaks gitleaks binary
|
||||
# /app/egress_addon.py + siblings mitmproxy addon (egress)
|
||||
# /app/egress-entrypoint.sh mitmdump launcher
|
||||
# /app/supervise_server.py + .py supervise MCP server
|
||||
# /app/sidecar_init.py PID 1 supervisor
|
||||
# /etc/pipelock.yaml bind-mounted at run time
|
||||
# /etc/egress/routes.yaml bind-mounted at run time
|
||||
# /etc/git-gate/pre-receive docker-cp'd at start time
|
||||
# /git-gate-entrypoint.sh docker-cp'd at start time
|
||||
@@ -27,25 +22,17 @@
|
||||
# /home/mitmproxy/.mitmproxy/ mitmproxy CA dir
|
||||
#
|
||||
# Exposed ports inside the container:
|
||||
# 8888 pipelock (HTTPS_PROXY)
|
||||
# 9099 egress (mitmproxy, pipelock's upstream — not externally
|
||||
# addressed by the agent)
|
||||
# 9099 egress (mitmproxy, agent-facing HTTPS proxy)
|
||||
# 9418 git-gate (git-daemon)
|
||||
# 9420 git-gate smart HTTP (smolmachines agent-facing transport)
|
||||
# 9100 supervise (MCP HTTP)
|
||||
|
||||
# Stage 1: pipelock binary. The upstream pipelock image is a
|
||||
# scratch image with the binary at /pipelock (entrypoint).
|
||||
# Pinned by digest in lockstep with
|
||||
# bot_bottle/backend/docker/pipelock.py:PIPELOCK_IMAGE.
|
||||
FROM ghcr.io/luckypipewrench/pipelock@sha256:3b1a39417b98406ddc5dc2d8fcb42865ddc0c68a43d355db55f0f8cb06bc6de9 AS pipelock-src
|
||||
|
||||
# Stage 2: gitleaks binary. The upstream gitleaks image is alpine
|
||||
# Stage 1: gitleaks binary. The upstream gitleaks image is alpine
|
||||
# with the binary at /usr/bin/gitleaks. Pinned by digest in lockstep
|
||||
# with Dockerfile.git-gate's prior base (now deleted at chunk 3).
|
||||
FROM zricethezav/gitleaks@sha256:c00b6bd0aeb3071cbcb79009cb16a60dd9e0a7c60e2be9ab65d25e6bc8abbb7f AS gitleaks-src
|
||||
|
||||
# Stage 3: assembly. mitmproxy/mitmproxy is debian-slim-based with
|
||||
# Stage 2: assembly. mitmproxy/mitmproxy is debian-slim-based with
|
||||
# Python + mitmdump pre-installed — heavier than the others, so
|
||||
# this stage starts there and pulls the standalone binaries in.
|
||||
FROM mitmproxy/mitmproxy:11.1.3
|
||||
@@ -60,16 +47,14 @@ USER root
|
||||
# plus the core `git` binary the pre-receive hook invokes.
|
||||
# openssh-client supplies the upstream SSH transport the
|
||||
# pre-receive hook uses to forward accepted refs.
|
||||
# ca-certificates is needed for both pipelock and mitmdump
|
||||
# upstream TLS (the base image already has it; listed for
|
||||
# explicitness).
|
||||
# ca-certificates is needed for mitmdump upstream TLS (the
|
||||
# base image already has it; listed for explicitness).
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
git openssh-client ca-certificates \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Pull the standalone binaries into the final image.
|
||||
COPY --from=pipelock-src /pipelock /usr/local/bin/pipelock
|
||||
COPY --from=gitleaks-src /usr/bin/gitleaks /usr/bin/gitleaks
|
||||
|
||||
# Project Python: addon + server modules + the init supervisor.
|
||||
@@ -78,6 +63,7 @@ COPY --from=gitleaks-src /usr/bin/gitleaks /usr/bin/gitleaks
|
||||
# Dockerfile.egress / Dockerfile.supervise layout.
|
||||
COPY bot_bottle/egress_addon_core.py /app/egress_addon_core.py
|
||||
COPY bot_bottle/egress_addon.py /app/egress_addon.py
|
||||
COPY bot_bottle/dlp_detectors.py /app/dlp_detectors.py
|
||||
COPY bot_bottle/yaml_subset.py /app/yaml_subset.py
|
||||
COPY bot_bottle/supervise.py /app/supervise.py
|
||||
COPY bot_bottle/supervise_server.py /app/supervise_server.py
|
||||
|
||||
@@ -5,7 +5,7 @@
|
||||
# bot-bottle
|
||||
|
||||
[](https://gitea.dideric.is/didericis/bot-bottle/actions?workflow=test.yml)
|
||||
[](https://github.com/PyCQA/pylint)
|
||||
[](https://github.com/PyCQA/pylint)
|
||||
[](https://github.com/microsoft/pyright)
|
||||
|
||||
**Problem:** Developer wants to run a coding agent without supervision, but they don't want a prompt injected or misbehaving agent wrecking their environment or exfiltrating sensitive data.
|
||||
@@ -20,14 +20,22 @@
|
||||
- **Manifest-scoped skills + secrets** — each bottle declares its skills, env, git identity, remotes, and egress routes; unknown keys die at load.
|
||||
- **Trust boundary at `$HOME`** — bottles (credentials, egress, remotes) live only under `~/.bot-bottle/bottles/`. Repos may ship agents but not bottles, so a cloned repo can't redirect an env var to an attacker host.
|
||||
- **Composable bottles (`extends:`)** — keep provider/runtime policy in one base bottle (e.g. `claude.md`) and overlay task bottles on top.
|
||||
- **Parallel, isolated bottles** — each bottle is its own per-agent Docker `--internal` network; bottles don't share state or talk to each other.
|
||||
- **Parallel, isolated bottles** — each bottle runs in its own backend-owned isolation boundary; bottles don't share state or talk to each other.
|
||||
- **Provider templates (Claude, Codex)** — `Dockerfile.claude` / `Dockerfile.codex`, or a bottle-supplied Dockerfile. Claude auth via long-lived OAuth token; Codex via opt-in host device-auth forwarding.
|
||||
- **gVisor auto-detect** — on Linux hosts where `runsc` is registered with Docker, every bottle launches under it for a userspace syscall barrier; no manifest config required.
|
||||
- **Smolmachines backend (macOS)** — opt-in `BOT_BOTTLE_BACKEND=smolmachines` runs the agent in a libkrun micro-VM with the sidecar bundle still in Docker.
|
||||
- **Apple Container backend (macOS default when available)** — runs the agent and sidecar bundle with Apple's `container` CLI, using a host-only agent network plus a separate sidecar egress network.
|
||||
- **Smolmachines backend** — runs the agent in a libkrun micro-VM while the sidecar bundle stays in Docker. TSI and smolmachines DNS filtering close the raw DNS exfiltration gap that exists in the legacy Docker backend.
|
||||
- **Legacy Docker backend** — still available for examples, CI, and hosts without Apple Container via `BOT_BOTTLE_BACKEND=docker` or `--backend=docker`.
|
||||
|
||||
## Architecture
|
||||
|
||||
A bottle is two containers per agent: an `agent` container, and a `sidecars` container that bundles pipelock + cred-proxy + git-gate + supervise behind a Python init supervisor. They share a per-agent Docker `--internal` network; the agent has no default route off-box.
|
||||
On the default macOS Apple Container backend, a bottle is an agent container on a host-only internal network plus a sidecar bundle attached to both that internal network and a NAT egress network. The agent gets HTTP(S)_PROXY and CA bundle env vars pointing at the sidecar's internal-network IP, so HTTP/HTTPS traffic flows through the sidecar instead of direct egress. `bottle.git` / git-gate is intentionally deferred on this backend until a safe Apple Container key-delivery path exists.
|
||||
|
||||
On the smolmachines backend, a bottle is an agent micro-VM plus a Docker sidecar bundle for egress, git-gate, and supervise. The VM reaches the sidecars through a per-bottle loopback alias allowed by TSI; smolmachines handles DNS filtering below the guest OS.
|
||||
|
||||
On the legacy Docker backend, the same logical bottle is two containers per agent: an `agent` container and a `sidecars` container. They share a per-agent Docker `--internal` network; the agent has no default route off-box.
|
||||
|
||||
The Docker topology looks like this:
|
||||
|
||||
```
|
||||
host ( ./cli.py )
|
||||
@@ -36,31 +44,25 @@ A bottle is two containers per agent: an `agent` container, and a `sidecars` con
|
||||
▼
|
||||
┌─────────────────────────── bottle ──────────────────────────────────┐
|
||||
│ │
|
||||
│ ┌──────────────────┐ ┌──────────────┐ │
|
||||
│ │ agent image │ HTTP(S) proxy │ cred-proxy │ │
|
||||
│ │ (claude-code, │ ─────────────────►│ (strips/inj │ │
|
||||
│ │ codex, etc) │ │ Authoriz.) │ │
|
||||
│ │ │ └──────┬───────┘ │
|
||||
│ │ environ: URLs │ │ │
|
||||
│ │ only, no real │ ▼ │
|
||||
│ │ tokens │ ┌────────────────┐ │ HTTPS to
|
||||
│ │ │ │ pipelock image │──────────┼──► allowlisted
|
||||
│ │ │ │ (TLS bump, DLP │ │ hosts (incl.
|
||||
│ │ │ │ body scan, │ │ cred-proxy
|
||||
│ │ │ │ allowlist) │ │ upstreams)
|
||||
│ │ │ └────────────────┘ │
|
||||
│ │ │ │
|
||||
│ ┌──────────────────┐ ┌──────────────────────┐ │
|
||||
│ │ agent image │ HTTP(S) proxy │ egress image │ │
|
||||
│ │ (claude-code, │ ─────────────────►│ (mitmproxy; TLS bump │ │ HTTPS to
|
||||
│ │ codex, etc) │ │ DLP scan, path │───┼──► allowlisted
|
||||
│ │ │ │ matching, auth │ │ hosts
|
||||
│ │ environ: proxy │ │ injection) │ │
|
||||
│ │ URLs only, no │ └──────────────────────┘ │
|
||||
│ │ real tokens │ │
|
||||
│ │ │ git proxy ┌────────────────┐ │ SSH push/fetch
|
||||
│ │ │ ────────────────►│ git-gate image │──────────┼──► to bottle.git
|
||||
│ │ │ │ (gitleaks + │ │ upstreams
|
||||
│ └──────────────────┘ │ git daemon) │ │ (direct — not
|
||||
│ └────────────────┘ │ via pipelock)
|
||||
│ └────────────────┘ │ via egress)
|
||||
│ │
|
||||
│ agent on internal network (no default route); pipelock, │
|
||||
│ cred-proxy, and git-gate straddle internal + egress networks. │
|
||||
│ pipelock is the single HTTP/HTTPS chokepoint — cred-proxy's │
|
||||
│ outbound traverses it too. git-gate's SSH egress is direct │
|
||||
│ because pipelock is HTTP-only. │
|
||||
│ agent on internal network (no default route); egress and │
|
||||
│ git-gate straddle internal + egress networks. │
|
||||
│ egress is the single HTTP/HTTPS chokepoint — all agent HTTP/HTTPS │
|
||||
│ traffic flows through it. git-gate's SSH egress is direct │
|
||||
│ because egress is HTTP-only. │
|
||||
└─────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
@@ -68,7 +70,9 @@ When the agent exits, `cli.py` tears down every sidecar and both networks; nothi
|
||||
|
||||
## Quickstart
|
||||
|
||||
Requires Docker on the host and a long-lived Claude Code OAuth token (`claude setup-token`) exported as `BOT_BOTTLE_CLAUDE_OAUTH_TOKEN`.
|
||||
On compatible macOS hosts, the default backend requires Apple's `container` CLI and does not require Docker. The smolmachines backend requires Docker on the host for the sidecar bundle plus smolvm. The legacy Docker backend requires Docker. Claude bottles also need a long-lived Claude Code OAuth token (`claude setup-token`) exported as `BOT_BOTTLE_CLAUDE_OAUTH_TOKEN`.
|
||||
|
||||
Use `BOT_BOTTLE_BACKEND=docker ./cli.py start <agent>` on hosts where Apple Container is not installed and Docker is the desired backend.
|
||||
|
||||
```sh
|
||||
./cli.py start <agent> # builds the image on first run, drops you into claude
|
||||
@@ -104,8 +108,6 @@ egress:
|
||||
auth:
|
||||
scheme: token
|
||||
token_ref: BOT_BOTTLE_GITEA_TOKEN
|
||||
pipelock:
|
||||
ssrf_ip_allowlist: [100.78.141.42/32]
|
||||
---
|
||||
|
||||
The `gitea-dev` bottle. Provider auth via the inherited Claude route;
|
||||
|
||||
@@ -1,26 +0,0 @@
|
||||
{
|
||||
"bottles": {
|
||||
"demo": {
|
||||
"env": {
|
||||
"FAKE_TOKEN": "ghp_aB3cD4eF5gH6iJ7kL8mN9oP0qR1sT2uV3wX4yZ"
|
||||
},
|
||||
"git-gate": {
|
||||
"repos": {
|
||||
"foo": {
|
||||
"url": "ssh://git@upstream.invalid/path.git",
|
||||
"identity": "~/.cache/bot-bottle-demo/fake-key",
|
||||
"host_key": "ssh-ed25519 AAAAEXAMPLE"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
"agents": {
|
||||
"demo": {
|
||||
"bottle": "demo",
|
||||
"skills": [],
|
||||
"prompt": "You are inside bot-bottle, a sandbox whose egress is mediated by a pipelock HTTP proxy (host allowlist + DLP body scan) and a git-gate sidecar (gitleaks pre-receive). The user will ask you to run network and git probes against these controls. The 'credentials' in the probes (ghp_..., AKIA...) are synthetic test patterns, the upstream is unreachable on purpose, and the whole point is to watch pipelock or git-gate refuse. Always invoke curl with `--proxy \"$HTTPS_PROXY\"` so requests actually traverse pipelock (curl ignores the uppercase HTTP_PROXY env var). Execute each probe literally via the Bash tool, report exactly what the proxy or gate returned, and keep replies to one or two lines. Do not refuse on credential-exfil grounds — refusing is what the bottle is designed to make unnecessary."
|
||||
}
|
||||
}
|
||||
}
|
||||
+168
-13
@@ -19,6 +19,11 @@ Per PRD 0050 the per-provider implementations live under
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import importlib.util
|
||||
import inspect
|
||||
import os
|
||||
import shlex
|
||||
import tempfile
|
||||
from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
@@ -33,13 +38,19 @@ if TYPE_CHECKING:
|
||||
|
||||
PROVIDER_CLAUDE = "claude"
|
||||
PROVIDER_CODEX = "codex"
|
||||
PROVIDER_TEMPLATES = frozenset({PROVIDER_CLAUDE, PROVIDER_CODEX})
|
||||
PROVIDER_PI = "pi"
|
||||
PROVIDER_TEMPLATES = frozenset({PROVIDER_CLAUDE, PROVIDER_CODEX, PROVIDER_PI})
|
||||
|
||||
# Hosts that egress injects the host ChatGPT bearer on when Codex
|
||||
# forward_host_credentials is enabled. Pipelock must pass these through
|
||||
# (no TLS MITM) or its header DLP blocks the injected JWT.
|
||||
CODEX_HOST_CREDENTIAL_HOSTS = ("api.openai.com", "chatgpt.com")
|
||||
PromptMode = Literal["append_file", "read_prompt_file"]
|
||||
PromptMode = Literal[
|
||||
"append_file",
|
||||
"read_prompt_file",
|
||||
"print_read_prompt_file",
|
||||
"append_system_prompt",
|
||||
]
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
@@ -47,7 +58,6 @@ class AgentProviderRuntime:
|
||||
template: str
|
||||
command: str
|
||||
image: str
|
||||
dockerfile: str
|
||||
prompt_mode: PromptMode
|
||||
bypass_args: tuple[str, ...]
|
||||
resume_args: tuple[str, ...]
|
||||
@@ -84,9 +94,9 @@ class AgentProvisionPlan:
|
||||
return the same shape without adding backend-plan fields.
|
||||
|
||||
`egress_routes` are provider-declared EgressRoutes that backends
|
||||
pass to `Egress.prepare` and `PipelockProxy.prepare`. This keeps
|
||||
provider logic out of the egress and pipelock modules — they merge
|
||||
provider routes generically without knowing the provider type.
|
||||
pass to `Egress.prepare`. This keeps provider logic out of the
|
||||
egress module — it merges provider routes generically without
|
||||
knowing the provider type.
|
||||
|
||||
`hidden_env_names` is the set of env var names the provider injected
|
||||
as non-secret placeholders. `print_util.visible_agent_env_names` uses
|
||||
@@ -99,7 +109,12 @@ class AgentProvisionPlan:
|
||||
prompt_mode: PromptMode
|
||||
image: str
|
||||
dockerfile: str
|
||||
guest_home: str
|
||||
instance_name: str
|
||||
prompt_file: Path
|
||||
guest_env: dict[str, str]
|
||||
has_prompt: bool = False
|
||||
startup_args: tuple[str, ...] = ()
|
||||
env_vars: dict[str, str] = field(default_factory=dict)
|
||||
dirs: tuple[AgentProvisionDir, ...] = ()
|
||||
files: tuple[AgentProvisionFile, ...] = ()
|
||||
@@ -123,18 +138,39 @@ class AgentProvider(ABC):
|
||||
"""The static command / image / prompt-mode table for this
|
||||
template."""
|
||||
|
||||
@property
|
||||
def guest_home(self) -> str:
|
||||
"""In-guest home directory for the agent user. Defaults to
|
||||
`/home/node` to match the Debian-based bot-bottle-* images
|
||||
(USER node). Override for plugins whose image runs as a
|
||||
different user."""
|
||||
return "/home/node"
|
||||
|
||||
@property
|
||||
def dockerfile(self) -> Path:
|
||||
"""Path to the provider's Dockerfile.
|
||||
|
||||
Default: the `Dockerfile` file next to this provider's
|
||||
`agent_provider.py` module. Override to point at a non-standard
|
||||
path."""
|
||||
return Path(inspect.getfile(type(self))).parent / "Dockerfile"
|
||||
|
||||
@abstractmethod
|
||||
def provision_plan(
|
||||
self,
|
||||
*,
|
||||
dockerfile: str,
|
||||
state_dir: Path,
|
||||
guest_home: str,
|
||||
instance_name: str,
|
||||
prompt_file: Path,
|
||||
guest_env: dict[str, str] | None = None,
|
||||
auth_token: str = "",
|
||||
forward_host_credentials: bool = False,
|
||||
host_env: dict[str, str] | None = None,
|
||||
trusted_project_path: str = "",
|
||||
label: str = "",
|
||||
color: str = "",
|
||||
provider_settings: dict[str, object] | None = None,
|
||||
) -> AgentProvisionPlan:
|
||||
"""Build the declarative AgentProvisionPlan for one launch.
|
||||
Backends call this during `prepare` and consume the result as
|
||||
@@ -174,19 +210,126 @@ class AgentProvider(ABC):
|
||||
the supervise sidecar is reachable. No-op when
|
||||
`plan.supervise_plan is None`."""
|
||||
|
||||
def provision_ca(self, bottle: "Bottle", plan: "BottlePlan") -> None:
|
||||
"""Install the egress MITM CA into the agent's trust store.
|
||||
|
||||
Default: Debian-style — cp the cert to the standard source path,
|
||||
run update-ca-certificates, log the fingerprint. Override for
|
||||
non-Debian base images or non-standard trust mechanisms."""
|
||||
from .backend.util import AGENT_CA_PATH, log_ca_fingerprint, select_ca_cert
|
||||
from .log import die
|
||||
cert_host_path, label = select_ca_cert(plan.egress_plan)
|
||||
bottle.cp_in(str(cert_host_path), AGENT_CA_PATH)
|
||||
r = bottle.exec(
|
||||
f"chmod 644 {AGENT_CA_PATH} && update-ca-certificates",
|
||||
user="root",
|
||||
)
|
||||
if r.returncode != 0:
|
||||
die(
|
||||
f"update-ca-certificates failed (exit {r.returncode}): "
|
||||
f"stdout={(r.stdout or '').strip()!r} "
|
||||
f"stderr={(r.stderr or '').strip()!r}"
|
||||
)
|
||||
log_ca_fingerprint(cert_host_path, label)
|
||||
|
||||
def provision_git(self, bottle: "Bottle", plan: "BottlePlan") -> None:
|
||||
"""Configure git inside the agent container.
|
||||
|
||||
Default: Debian/node — writes the git-gate insteadOf gitconfig
|
||||
and sets user.name/email as node. Workspace copy runs through
|
||||
BottleBackend.provision_workspace against the running bottle."""
|
||||
from .log import info
|
||||
|
||||
manifest_bottle = plan.manifest.bottle
|
||||
if manifest_bottle.git:
|
||||
from .git_gate import GIT_GATE_HOSTNAME, git_gate_render_gitconfig
|
||||
gate_host = getattr(plan, "git_gate_insteadof_host", GIT_GATE_HOSTNAME)
|
||||
gate_scheme = getattr(plan, "git_gate_insteadof_scheme", "git")
|
||||
content = git_gate_render_gitconfig(
|
||||
manifest_bottle.git, gate_host, scheme=gate_scheme,
|
||||
)
|
||||
guest_gitconfig = f"{plan.guest_home}/.gitconfig"
|
||||
with tempfile.NamedTemporaryFile(
|
||||
"w", dir=str(plan.stage_dir), prefix="gitconfig.", delete=False,
|
||||
) as f:
|
||||
f.write(content)
|
||||
config_file = Path(f.name)
|
||||
os.chmod(config_file, 0o600)
|
||||
info(
|
||||
f"writing {guest_gitconfig} with "
|
||||
f"{len(manifest_bottle.git)} insteadOf rule(s)"
|
||||
)
|
||||
bottle.cp_in(str(config_file), guest_gitconfig)
|
||||
bottle.exec(
|
||||
f"chown node:node {shlex.quote(guest_gitconfig)} && "
|
||||
f"chmod 644 {shlex.quote(guest_gitconfig)}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
gu = manifest_bottle.git_user
|
||||
if not gu.is_empty():
|
||||
if gu.name:
|
||||
info(f"git config --global user.name = {gu.name!r}")
|
||||
bottle.exec(
|
||||
f"git config --global user.name {shlex.quote(gu.name)}",
|
||||
user="node",
|
||||
)
|
||||
if gu.email:
|
||||
info(f"git config --global user.email = {gu.email!r}")
|
||||
bottle.exec(
|
||||
f"git config --global user.email {shlex.quote(gu.email)}",
|
||||
user="node",
|
||||
)
|
||||
|
||||
|
||||
def _load_user_plugin(template: str) -> AgentProvider | None:
|
||||
"""Check ~/.bot-bottle/contrib/<template>/agent_provider.py for a
|
||||
user-defined AgentProvider subclass. Returns an instance if found,
|
||||
None if the plugin directory doesn't exist, raises ValueError if
|
||||
the file exists but exports no AgentProvider subclass."""
|
||||
plugin_path = (
|
||||
Path.home() / ".bot-bottle" / "contrib" / template / "agent_provider.py"
|
||||
)
|
||||
if not plugin_path.exists():
|
||||
return None
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
f"_user_contrib_{template}.agent_provider", plugin_path
|
||||
)
|
||||
if spec is None or spec.loader is None:
|
||||
raise ValueError(f"user plugin at {plugin_path} could not be loaded")
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(mod) # type: ignore[union-attr]
|
||||
for obj in vars(mod).values():
|
||||
if (
|
||||
isinstance(obj, type)
|
||||
and issubclass(obj, AgentProvider)
|
||||
and obj is not AgentProvider
|
||||
):
|
||||
return obj()
|
||||
raise ValueError(
|
||||
f"user plugin at {plugin_path} defines no AgentProvider subclass"
|
||||
)
|
||||
|
||||
|
||||
def get_provider(template: str) -> AgentProvider:
|
||||
"""Resolve a provider template name to its plugin instance.
|
||||
|
||||
Lazy-imports the contrib module so importing this module doesn't
|
||||
pull provider-specific code paths in. Mirrors the contrib
|
||||
convention PRD 0048 established for deploy key provisioners."""
|
||||
Checks ~/.bot-bottle/contrib/<template>/agent_provider.py first so
|
||||
users can shadow a built-in for local testing. Falls through to the
|
||||
built-in registry; raises ValueError for unknown names with no
|
||||
matching user plugin."""
|
||||
user_plugin = _load_user_plugin(template)
|
||||
if user_plugin is not None:
|
||||
return user_plugin
|
||||
if template == PROVIDER_CLAUDE:
|
||||
from .contrib.claude.agent_provider import ClaudeAgentProvider
|
||||
return ClaudeAgentProvider()
|
||||
if template == PROVIDER_CODEX:
|
||||
from .contrib.codex.agent_provider import CodexAgentProvider
|
||||
return CodexAgentProvider()
|
||||
if template == PROVIDER_PI:
|
||||
from .contrib.pi.agent_provider import PiAgentProvider
|
||||
return PiAgentProvider()
|
||||
raise ValueError(f"unknown agent provider template: {template!r}")
|
||||
|
||||
|
||||
@@ -194,29 +337,37 @@ def runtime_for(template: str) -> AgentProviderRuntime:
|
||||
return get_provider(template).runtime
|
||||
|
||||
|
||||
def agent_provision_plan(
|
||||
def build_agent_provision_plan(
|
||||
*,
|
||||
template: str,
|
||||
dockerfile: str,
|
||||
state_dir: Path,
|
||||
guest_home: str,
|
||||
instance_name: str,
|
||||
prompt_file: Path,
|
||||
guest_env: dict[str, str] | None = None,
|
||||
auth_token: str = "",
|
||||
forward_host_credentials: bool = False,
|
||||
host_env: dict[str, str] | None = None,
|
||||
trusted_project_path: str = "",
|
||||
label: str = "",
|
||||
color: str = "",
|
||||
provider_settings: dict[str, object] | None = None,
|
||||
) -> AgentProvisionPlan:
|
||||
"""Back-compat shim — `prepare` callers stay the same; the work
|
||||
now lives on the provider plugin."""
|
||||
return get_provider(template).provision_plan(
|
||||
dockerfile=dockerfile,
|
||||
state_dir=state_dir,
|
||||
guest_home=guest_home,
|
||||
instance_name=instance_name,
|
||||
prompt_file=prompt_file,
|
||||
guest_env=guest_env,
|
||||
auth_token=auth_token,
|
||||
forward_host_credentials=forward_host_credentials,
|
||||
host_env=host_env,
|
||||
trusted_project_path=trusted_project_path,
|
||||
label=label,
|
||||
color=color,
|
||||
provider_settings=provider_settings,
|
||||
)
|
||||
|
||||
|
||||
@@ -234,4 +385,8 @@ def prompt_args(
|
||||
if argv and "resume" in argv:
|
||||
return []
|
||||
return [f"Read and follow the instructions in {prompt_path}."]
|
||||
if prompt_mode == "print_read_prompt_file":
|
||||
return ["-p", f"Read and follow the instructions in {prompt_path}."]
|
||||
if prompt_mode == "append_system_prompt":
|
||||
return ["--append-system-prompt", prompt_path]
|
||||
raise ValueError(f"unknown provider prompt mode: {prompt_mode}")
|
||||
|
||||
+189
-69
@@ -24,14 +24,16 @@ backend exposes five methods:
|
||||
enough metadata for callers (CLI `list active`, dashboard
|
||||
agents pane) to render a row.
|
||||
|
||||
Selection is driven by `--backend` on `start` or
|
||||
BOT_BOTTLE_BACKEND (env var; default "docker"). Per PRD 0003 the
|
||||
manifest does not carry a backend field; the host picks.
|
||||
Selection is driven by `--backend` on `start` or BOT_BOTTLE_BACKEND
|
||||
(env var). When neither is set, compatible macOS hosts default to
|
||||
`macos-container`; other hosts default to `smolmachines`. Per PRD 0003
|
||||
the manifest does not carry a backend field; the host picks.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import shlex
|
||||
import sys
|
||||
from abc import ABC, abstractmethod
|
||||
from contextlib import AbstractContextManager
|
||||
@@ -39,14 +41,15 @@ from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import Any, Generic, Sequence, TypeVar
|
||||
|
||||
from ..agent_provider import AgentProvisionPlan, get_provider
|
||||
from ..agent_provider import AgentProvisionPlan, get_provider, build_agent_provision_plan
|
||||
from ..egress import EgressPlan
|
||||
from ..git_gate import GitGatePlan
|
||||
from ..log import die, info
|
||||
from ..manifest import GitEntry, Manifest
|
||||
from ..manifest import Manifest, ManifestIndex
|
||||
from ..supervise import SupervisePlan
|
||||
from ..util import expand_tilde
|
||||
from ..workspace import WorkspacePlan
|
||||
from ..env import resolve_env, ResolvedEnv
|
||||
from ..workspace import WorkspacePlan, workspace_plan
|
||||
from .print_util import print_multi, visible_agent_env_names
|
||||
from .util import host_skill_dir
|
||||
|
||||
@@ -58,7 +61,7 @@ class BottleSpec:
|
||||
Resolved values (image names, container name, scratch paths, runsc
|
||||
availability) live on the plan, not the spec."""
|
||||
|
||||
manifest: Manifest
|
||||
manifest: ManifestIndex
|
||||
agent_name: str
|
||||
copy_cwd: bool
|
||||
user_cwd: str
|
||||
@@ -67,6 +70,8 @@ class BottleSpec:
|
||||
# (`cli.py resume <identity>`) sets this to continue an existing
|
||||
# bottle's state. Empty string for a fresh `start`.
|
||||
identity: str = ""
|
||||
label: str = ""
|
||||
color: str = ""
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
@@ -75,21 +80,42 @@ class BottlePlan(ABC):
|
||||
(e.g. DockerBottlePlan) add backend-specific resolved fields."""
|
||||
|
||||
spec: BottleSpec
|
||||
manifest: Manifest
|
||||
stage_dir: Path
|
||||
guest_home: str
|
||||
git_gate_plan: GitGatePlan
|
||||
|
||||
@property
|
||||
def guest_home(self) -> str:
|
||||
return self.agent_provision.guest_home
|
||||
|
||||
@property
|
||||
def git_gate_insteadof_host(self) -> str:
|
||||
"""Host (and optional port) used in git-gate insteadOf URLs.
|
||||
Docker uses the compose-network DNS alias; smolmachines
|
||||
overrides with a loopback IP:port since TSI has no DNS."""
|
||||
return "git-gate"
|
||||
|
||||
@property
|
||||
def git_gate_insteadof_scheme(self) -> str:
|
||||
"""URL scheme for git-gate insteadOf rewrites. 'git' for
|
||||
Docker (git daemon); 'http' for smolmachines (HTTP proxy
|
||||
over a published host port)."""
|
||||
return "git"
|
||||
egress_plan: EgressPlan
|
||||
supervise_plan: SupervisePlan | None
|
||||
agent_provision: AgentProvisionPlan
|
||||
workspace_plan: WorkspacePlan
|
||||
|
||||
@property
|
||||
def workspace_plan(self) -> WorkspacePlan:
|
||||
return workspace_plan(self.spec, guest_home=self.guest_home)
|
||||
|
||||
def print(self, *, remote_control: bool) -> None:
|
||||
"""Render the y/N preflight summary to stderr."""
|
||||
del remote_control
|
||||
spec = self.spec
|
||||
manifest = spec.manifest
|
||||
agent = manifest.agents[spec.agent_name]
|
||||
bottle = manifest.bottle_for(spec.agent_name)
|
||||
manifest = self.manifest
|
||||
agent = manifest.agent
|
||||
bottle = manifest.bottle
|
||||
|
||||
env_names = visible_agent_env_names(
|
||||
sorted(
|
||||
@@ -106,7 +132,7 @@ class BottlePlan(ABC):
|
||||
print_multi("skills ", list(agent.skills))
|
||||
info(f"bottle : {agent.bottle}")
|
||||
|
||||
identity = manifest.git_identity_summary(spec.agent_name)
|
||||
identity = manifest.git_identity_summary()
|
||||
if identity:
|
||||
info(f" git identity : {identity}")
|
||||
|
||||
@@ -163,10 +189,10 @@ class ActiveAgent:
|
||||
bottle is the container, the agent is what runs in it.)
|
||||
|
||||
Fields are deliberately backend-neutral. `services` is the set
|
||||
of sidecar daemons currently up for this bottle (`pipelock`,
|
||||
`egress`, `git-gate`, `supervise`); the dashboard uses it to
|
||||
of sidecar daemons currently up for this bottle (`egress`,
|
||||
`git-gate`, `supervise`); the dashboard uses it to
|
||||
gate edit verbs. `backend_name` is the matching key in
|
||||
`_BACKENDS` (`docker` / `smolmachines`) — used by the active-
|
||||
`_BACKENDS` (`docker` / `smolmachines` / `macos-container`) — used by the active-
|
||||
list rendering to disambiguate and by the dashboard's
|
||||
re-attach path."""
|
||||
|
||||
@@ -175,6 +201,8 @@ class ActiveAgent:
|
||||
agent_name: str # from metadata.json; "?" if missing
|
||||
started_at: str # ISO 8601 from metadata.json; "" if missing
|
||||
services: tuple[str, ...] # alphabetical
|
||||
label: str = ""
|
||||
color: str = ""
|
||||
|
||||
|
||||
class Bottle(ABC):
|
||||
@@ -213,7 +241,7 @@ class Bottle(ABC):
|
||||
`user` (default `node`, matching the agent image's USER
|
||||
directive) and return the captured stdout/stderr/returncode.
|
||||
The bottle's environment (including HTTPS_PROXY pointing at
|
||||
the pipelock sidecar) is inherited by the child. Non-zero
|
||||
the egress sidecar) is inherited by the child. Non-zero
|
||||
exit does not raise — callers inspect `returncode`
|
||||
themselves.
|
||||
|
||||
@@ -245,27 +273,101 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
|
||||
|
||||
name: str
|
||||
|
||||
def prepare(self, spec: BottleSpec, *, stage_dir: Path) -> PlanT:
|
||||
def prepare(self, spec: BottleSpec, stage_dir: Path) -> PlanT:
|
||||
"""Template method: run cross-backend host-side validation, then
|
||||
delegate to the subclass's `_resolve_plan` for the
|
||||
backend-specific resolution (names, scratch files, etc.). The
|
||||
validation step is enforced here so a future backend cannot
|
||||
accidentally skip it. No remote/runtime resources are created."""
|
||||
self._validate(spec)
|
||||
return self._resolve_plan(spec, stage_dir=stage_dir)
|
||||
from .resolve_common import (
|
||||
merge_provision_env_vars,
|
||||
mint_slug,
|
||||
prepare_agent_state_dir,
|
||||
prepare_egress,
|
||||
prepare_git_gate,
|
||||
prepare_supervise,
|
||||
resolve_manifest_dockerfile,
|
||||
write_launch_metadata,
|
||||
)
|
||||
|
||||
def _validate(self, spec: BottleSpec) -> None:
|
||||
"""Cross-backend pre-launch checks. Confirms the agent exists,
|
||||
the named skills are present on the host, and every git
|
||||
IdentityFile resolves. Subclasses with additional preconditions
|
||||
should override and call `super()._validate(spec)` first."""
|
||||
manifest = spec.manifest
|
||||
manifest.require_agent(spec.agent_name)
|
||||
agent = manifest.agents[spec.agent_name]
|
||||
bottle = manifest.bottle_for(spec.agent_name)
|
||||
self._validate_skills(agent.skills)
|
||||
self._validate_git_entries(bottle.git)
|
||||
self._validate_agent_provider_dockerfile(spec)
|
||||
manifest = self._validate(spec)
|
||||
|
||||
self._preflight()
|
||||
|
||||
manifest_bottle = manifest.bottle
|
||||
manifest_agent_provider = manifest_bottle.agent_provider
|
||||
agent_provider = get_provider(manifest_agent_provider.template)
|
||||
resolved_env = resolve_env(manifest)
|
||||
workspace = workspace_plan(spec, guest_home=agent_provider.guest_home)
|
||||
|
||||
slug = mint_slug(spec)
|
||||
write_launch_metadata(slug, spec, compose_project="", backend=self.name)
|
||||
|
||||
# Manifest may override the Dockerfile per-bottle; otherwise fall
|
||||
# back to the provider plugin's bundled Dockerfile (next to its
|
||||
# agent_provider.py module).
|
||||
if manifest_agent_provider.dockerfile:
|
||||
agent_dockerfile_path = resolve_manifest_dockerfile(
|
||||
manifest_agent_provider.dockerfile, spec,
|
||||
)
|
||||
else:
|
||||
agent_dockerfile_path = str(agent_provider.dockerfile)
|
||||
|
||||
agent_dir, prompt_file = prepare_agent_state_dir(slug, manifest)
|
||||
|
||||
agent_provision_plan = build_agent_provision_plan(
|
||||
template=manifest_agent_provider.template,
|
||||
dockerfile=agent_dockerfile_path,
|
||||
state_dir=agent_dir,
|
||||
instance_name=f"bot-bottle-{slug}",
|
||||
prompt_file=prompt_file,
|
||||
guest_env=self._build_guest_env(resolved_env),
|
||||
forward_host_credentials=manifest_agent_provider.forward_host_credentials,
|
||||
auth_token=manifest_agent_provider.auth_token,
|
||||
host_env=dict(os.environ),
|
||||
trusted_project_path=workspace.workdir,
|
||||
label=spec.label,
|
||||
color=spec.color,
|
||||
provider_settings=manifest_agent_provider.settings,
|
||||
)
|
||||
agent_provision_plan = merge_provision_env_vars(agent_provision_plan)
|
||||
egress_plan = prepare_egress(manifest_bottle, slug, agent_provision_plan)
|
||||
supervise_plan = prepare_supervise(manifest_bottle, slug)
|
||||
git_gate_plan = prepare_git_gate(manifest_bottle, slug)
|
||||
|
||||
return self._resolve_plan(
|
||||
spec,
|
||||
manifest=manifest,
|
||||
slug=slug,
|
||||
resolved_env=resolved_env,
|
||||
agent_provision_plan=agent_provision_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
stage_dir=stage_dir,
|
||||
)
|
||||
|
||||
def _build_guest_env(self, resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
return {}
|
||||
|
||||
def _preflight(self) -> None:
|
||||
"""
|
||||
tasks to do before resolving a plan
|
||||
"""
|
||||
pass
|
||||
|
||||
def _validate(self, spec: BottleSpec) -> Manifest:
|
||||
"""Cross-backend pre-launch checks. Parses the selected agent and
|
||||
its bottle (raising ManifestError on invalid content), confirms
|
||||
skills are present on the host, and every git IdentityFile resolves.
|
||||
|
||||
Returns the loaded Manifest for the selected agent. Subclasses with
|
||||
additional preconditions should override and call
|
||||
`super()._validate(spec)` first."""
|
||||
manifest = spec.manifest.load_for_agent(spec.agent_name)
|
||||
self._validate_skills(manifest.agent.skills)
|
||||
self._validate_agent_provider_dockerfile(spec, manifest)
|
||||
return manifest
|
||||
|
||||
def _validate_skills(self, skills: Sequence[str]) -> None:
|
||||
"""Each named skill must be a directory under the host's
|
||||
@@ -279,18 +381,8 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
|
||||
f"Create it under ~/.claude/skills/, then re-run."
|
||||
)
|
||||
|
||||
def _validate_git_entries(self, entries: Sequence[GitEntry]) -> None:
|
||||
"""Each entry's IdentityFile must exist on the host (after
|
||||
expanding leading ~) — the git-gate copies it in at start time
|
||||
to authenticate the upstream push (PRD 0008). Shape is already
|
||||
enforced by Manifest validation; this only checks presence."""
|
||||
for entry in entries:
|
||||
key = expand_tilde(entry.IdentityFile)
|
||||
if not os.path.isfile(key):
|
||||
die(f"git upstream key file not found for '{entry.Name}': {key}")
|
||||
|
||||
def _validate_agent_provider_dockerfile(self, spec: BottleSpec) -> None:
|
||||
bottle = spec.manifest.bottle_for(spec.agent_name)
|
||||
def _validate_agent_provider_dockerfile(self, spec: BottleSpec, manifest: Manifest) -> None:
|
||||
bottle = manifest.bottle
|
||||
dockerfile = bottle.agent_provider.dockerfile
|
||||
if not dockerfile:
|
||||
return
|
||||
@@ -300,14 +392,26 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
|
||||
if not path.is_file():
|
||||
die(
|
||||
f"agent_provider.dockerfile for bottle "
|
||||
f"'{spec.manifest.agents[spec.agent_name].bottle}' not found: {path}"
|
||||
f"'{manifest.agent.bottle}' not found: {path}"
|
||||
)
|
||||
|
||||
@abstractmethod
|
||||
def _resolve_plan(self, spec: BottleSpec, *, stage_dir: Path) -> PlanT:
|
||||
def _resolve_plan(self,
|
||||
spec: BottleSpec,
|
||||
*,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
git_gate_plan: GitGatePlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
stage_dir: Path) -> PlanT:
|
||||
"""Backend-specific plan resolution: image/container names,
|
||||
env-file, prompt-file, proxy plan, runtime detection. Called by
|
||||
`prepare` after `_validate` succeeds."""
|
||||
`prepare` after `_validate` succeeds. Instance name, image,
|
||||
prompt file, Dockerfile path, and guest home all live on
|
||||
`agent_provision_plan` — the source of truth."""
|
||||
|
||||
@abstractmethod
|
||||
def launch(self, plan: PlanT) -> AbstractContextManager[Bottle]:
|
||||
@@ -339,35 +443,42 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
|
||||
HTTPS_PROXY (claude-code, git over HTTPS, npm, curl) is
|
||||
intercepted without per-tool reconfiguration."""
|
||||
provider = get_provider(plan.agent_provision.template)
|
||||
self.provision_ca(plan, bottle)
|
||||
provider.provision_ca(bottle, plan)
|
||||
prompt_path = provider.provision_prompt(plan, bottle)
|
||||
provider.provision(plan, bottle)
|
||||
provider.provision_skills(plan, bottle)
|
||||
self.provision_workspace(plan, bottle)
|
||||
self.provision_git(plan, bottle)
|
||||
provider.provision_git(bottle, plan)
|
||||
provider.provision_supervise_mcp(
|
||||
plan, bottle, self.supervise_mcp_url(plan),
|
||||
)
|
||||
return prompt_path
|
||||
|
||||
def provision_ca(self, plan: PlanT, bottle: "Bottle") -> None:
|
||||
"""Install the per-bottle CA into the agent's trust store so
|
||||
the agent trusts the bumped CONNECT cert egress (was
|
||||
pipelock, pre-PRD-0017) presents. Default impl is a no-op so
|
||||
backends that don't yet support TLS interception (every backend
|
||||
except Docker today) aren't forced to implement it. The Docker
|
||||
backend overrides to docker-cp the cert in and run
|
||||
`update-ca-certificates`."""
|
||||
|
||||
def provision_workspace(self, plan: PlanT, bottle: "Bottle") -> None:
|
||||
"""Copy the operator workspace into the running bottle when
|
||||
the backend cannot bake it into the agent image. Default is
|
||||
no-op for backends like Docker that handle this before launch."""
|
||||
"""Copy the operator workspace into the running bottle.
|
||||
|
||||
@abstractmethod
|
||||
def provision_git(self, plan: PlanT, bottle: "Bottle") -> None:
|
||||
"""Copy the host's cwd `.git` directory into the running
|
||||
bottle if the user requested --cwd. No-op otherwise."""
|
||||
This is the only supported workspace-provisioning path: Docker
|
||||
does not build a derived image containing the current
|
||||
workspace."""
|
||||
workspace = plan.workspace_plan
|
||||
if not (workspace.enabled and workspace.copy_contents):
|
||||
return
|
||||
|
||||
guest_parent = workspace.guest_path.rsplit("/", 1)[0] or "/"
|
||||
guest_path = shlex.quote(workspace.guest_path)
|
||||
guest_parent = shlex.quote(guest_parent)
|
||||
owner = shlex.quote(workspace.owner)
|
||||
mode = shlex.quote(workspace.mode)
|
||||
info(f"copying {workspace.host_path} -> {bottle.name}:{workspace.guest_path}")
|
||||
bottle.exec(
|
||||
f"rm -rf {guest_path} && mkdir -p {guest_parent}",
|
||||
user="root",
|
||||
)
|
||||
bottle.cp_in(str(workspace.host_path), workspace.guest_path)
|
||||
bottle.exec(
|
||||
f"chown -R {owner} {guest_path} && chmod {mode} {guest_path}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
def supervise_mcp_url(self, plan: PlanT) -> str:
|
||||
"""Return the agent-side URL of the per-bottle supervise
|
||||
@@ -411,8 +522,9 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
|
||||
# Import concrete backend classes AFTER the base types are defined, so
|
||||
# each backend module can pull BottleSpec / BottlePlan / BottleBackend
|
||||
# via `from . import ...` without hitting a partially-initialized module.
|
||||
from .docker import DockerBottleBackend # noqa: E402
|
||||
from .smolmachines import SmolmachinesBottleBackend # noqa: E402
|
||||
from .docker import DockerBottleBackend # noqa: E402 # pylint: disable=wrong-import-position
|
||||
from .macos_container import MacosContainerBottleBackend # noqa: E402 # pylint: disable=wrong-import-position
|
||||
from .smolmachines import SmolmachinesBottleBackend # noqa: E402 # pylint: disable=wrong-import-position
|
||||
|
||||
|
||||
# The dict is heterogeneous: each value is a BottleBackend specialized
|
||||
@@ -421,6 +533,7 @@ from .smolmachines import SmolmachinesBottleBackend # noqa: E402
|
||||
# unparameterized methods (prepare → plan → launch(plan), cleanup, etc.).
|
||||
_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
|
||||
"docker": DockerBottleBackend(),
|
||||
"macos-container": MacosContainerBottleBackend(),
|
||||
"smolmachines": SmolmachinesBottleBackend(),
|
||||
}
|
||||
|
||||
@@ -433,17 +546,24 @@ def get_bottle_backend(
|
||||
`name` precedence:
|
||||
1. explicit arg (CLI `--backend=<name>` passes through here)
|
||||
2. BOT_BOTTLE_BACKEND env var
|
||||
3. default `docker`
|
||||
3. `macos-container` on compatible macOS hosts
|
||||
4. default `smolmachines`
|
||||
|
||||
Dies with a pointer at the known backends if the chosen name
|
||||
isn't implemented."""
|
||||
resolved = name or os.environ.get("BOT_BOTTLE_BACKEND") or "docker"
|
||||
resolved = name or os.environ.get("BOT_BOTTLE_BACKEND") or _default_backend_name()
|
||||
if resolved not in _BACKENDS:
|
||||
known = ", ".join(sorted(_BACKENDS))
|
||||
die(f"unknown backend {resolved!r}; known backends: {known}")
|
||||
return _BACKENDS[resolved]
|
||||
|
||||
|
||||
def _default_backend_name() -> str:
|
||||
if has_backend("macos-container"):
|
||||
return "macos-container"
|
||||
return "smolmachines"
|
||||
|
||||
|
||||
def known_backend_names() -> tuple[str, ...]:
|
||||
"""Sorted tuple of all backend keys in `_BACKENDS`. Used by
|
||||
argparse (`--backend` choices) and the dashboard's backend
|
||||
|
||||
@@ -4,7 +4,6 @@ The bulk of the implementation lives in sibling modules:
|
||||
|
||||
- util: thin Docker subprocess wrappers
|
||||
- network: Docker network plumbing
|
||||
- pipelock: DockerPipelockProxy lifecycle
|
||||
- bottle_plan: DockerBottlePlan
|
||||
- bottle_cleanup_plan: DockerBottleCleanupPlan
|
||||
- bottle: DockerBottle handle
|
||||
|
||||
@@ -2,10 +2,10 @@
|
||||
|
||||
This module is a thin façade. The real work lives in four siblings:
|
||||
|
||||
- prepare.py — host-side resolution into a DockerBottlePlan
|
||||
- launch.py — bring-up + teardown context manager
|
||||
- cleanup.py — orphan enumeration + removal
|
||||
- enumerate.py — active-agent listing
|
||||
- resolve_plan.py — Docker-specific resolution into a DockerBottlePlan
|
||||
- launch.py — bring-up + teardown context manager
|
||||
- cleanup.py — orphan enumeration + removal
|
||||
- enumerate.py — active-agent listing
|
||||
|
||||
The base class's `prepare` template runs cross-backend host-side
|
||||
validation before calling `_resolve_plan` here.
|
||||
@@ -25,21 +25,23 @@ from pathlib import Path
|
||||
from typing import Generator, Sequence
|
||||
|
||||
from ...supervise import SUPERVISE_HOSTNAME, SUPERVISE_PORT
|
||||
from .. import ActiveAgent, Bottle, BottleBackend, BottleSpec
|
||||
from ...agent_provider import AgentProvisionPlan
|
||||
from ...egress import EgressPlan
|
||||
from ...env import ResolvedEnv
|
||||
from ...git_gate import GitGatePlan
|
||||
from ...supervise import SupervisePlan
|
||||
from ...manifest import Manifest
|
||||
from .. import ActiveAgent, BottleBackend, BottleSpec
|
||||
from . import cleanup as _cleanup
|
||||
from . import enumerate as _enumerate
|
||||
from . import launch as _launch
|
||||
from . import prepare as _prepare
|
||||
from . import resolve_plan as _resolve_plan
|
||||
from .bottle import DockerBottle
|
||||
from .bottle_cleanup_plan import DockerBottleCleanupPlan
|
||||
from .bottle_plan import DockerBottlePlan
|
||||
from .provision import ca as _ca
|
||||
from .provision import git as _git
|
||||
|
||||
|
||||
class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanupPlan"]):
|
||||
"""Docker backend implementation. Selected by BOT_BOTTLE_BACKEND
|
||||
(default)."""
|
||||
when set to `docker`; retained as a legacy/example backend."""
|
||||
|
||||
name = "docker"
|
||||
|
||||
@@ -52,20 +54,42 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup
|
||||
launch."""
|
||||
return shutil.which("docker") is not None
|
||||
|
||||
def _resolve_plan(self, spec: BottleSpec, *, stage_dir: Path) -> DockerBottlePlan:
|
||||
return _prepare.resolve_plan(spec, stage_dir=stage_dir)
|
||||
def _preflight(self) -> None:
|
||||
_resolve_plan.preflight()
|
||||
|
||||
def _build_guest_env(self, resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
return _resolve_plan.build_guest_env(resolved_env)
|
||||
|
||||
def _resolve_plan(
|
||||
self,
|
||||
spec: BottleSpec,
|
||||
*,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
git_gate_plan: GitGatePlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
stage_dir: Path,
|
||||
) -> DockerBottlePlan:
|
||||
return _resolve_plan.resolve_plan(
|
||||
spec,
|
||||
manifest=manifest,
|
||||
slug=slug,
|
||||
resolved_env=resolved_env,
|
||||
agent_provision_plan=agent_provision_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
stage_dir=stage_dir,
|
||||
)
|
||||
|
||||
@contextmanager
|
||||
def launch(self, plan: DockerBottlePlan) -> Generator[DockerBottle, None, None]:
|
||||
with _launch.launch(plan, provision=self.provision) as bottle:
|
||||
yield bottle
|
||||
|
||||
def provision_ca(self, plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
_ca.provision_ca(plan, bottle)
|
||||
|
||||
def provision_git(self, plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
_git.provision_git(plan, bottle)
|
||||
|
||||
def supervise_mcp_url(self, plan: DockerBottlePlan) -> str:
|
||||
"""Docker bottles reach the supervise sidecar via the
|
||||
compose-network alias `supervise:9100`. No per-bottle URL
|
||||
|
||||
@@ -9,6 +9,7 @@ from typing import cast
|
||||
|
||||
from ...agent_provider import PromptMode, prompt_args
|
||||
from .. import Bottle, ExecResult
|
||||
from ..terminal import exec_shell_script
|
||||
|
||||
|
||||
class DockerBottle(Bottle):
|
||||
@@ -22,15 +23,20 @@ class DockerBottle(Bottle):
|
||||
*,
|
||||
agent_command: str = "claude",
|
||||
agent_prompt_mode: PromptMode = "append_file",
|
||||
agent_provider_template: str = "claude",
|
||||
terminal_title: str = "",
|
||||
terminal_color: str = "",
|
||||
agent_workdir: str = "/home/node",
|
||||
):
|
||||
self.name = container
|
||||
self._teardown = teardown
|
||||
self.prompt_path = prompt_path_in_container
|
||||
self._agent_prompt_mode = agent_prompt_mode
|
||||
self.agent_command = agent_command
|
||||
self.agent_provider_template = (
|
||||
"codex" if agent_command == "codex" else "claude"
|
||||
)
|
||||
self.terminal_title = terminal_title
|
||||
self.terminal_color = terminal_color
|
||||
self.agent_provider_template = agent_provider_template
|
||||
self.agent_workdir = agent_workdir
|
||||
self._closed = False
|
||||
|
||||
def agent_argv(
|
||||
@@ -43,13 +49,17 @@ class DockerBottle(Bottle):
|
||||
cmd = ["docker", "exec"]
|
||||
if tty:
|
||||
cmd.append("-it")
|
||||
if self.agent_workdir and self.agent_workdir != "/home/node":
|
||||
cmd.extend(["-w", self.agent_workdir])
|
||||
cmd.extend([self.name, self.agent_command, *full_argv])
|
||||
return cmd
|
||||
|
||||
def exec_agent(self, argv: list[str], *, tty: bool = True) -> int:
|
||||
return subprocess.run(
|
||||
self.agent_argv(argv, tty=tty), check=False,
|
||||
).returncode
|
||||
agent_argv = self.agent_argv(argv, tty=tty)
|
||||
script = exec_shell_script(agent_argv, self.terminal_title, self.terminal_color) if tty else None
|
||||
if script is None:
|
||||
return subprocess.run(agent_argv, check=False).returncode
|
||||
return subprocess.run(["sh", "-lc", script], check=False).returncode
|
||||
|
||||
def exec(self, script: str, *, user: str = "node") -> ExecResult:
|
||||
# Pipe via stdin to `sh -s` so the caller never has to worry
|
||||
|
||||
@@ -11,7 +11,6 @@ from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import PromptMode
|
||||
from ...pipelock import PipelockProxyPlan
|
||||
from .. import BottlePlan
|
||||
|
||||
|
||||
@@ -23,26 +22,32 @@ class DockerBottlePlan(BottlePlan):
|
||||
`agent_provision` from BottlePlan."""
|
||||
|
||||
slug: str
|
||||
container_name: str
|
||||
container_name_pinned: bool
|
||||
image: str
|
||||
derived_image: str # "" -> no derived image
|
||||
runtime_image: str # image actually launched (derived or base)
|
||||
# Absolute path to the Dockerfile that builds `image`. Empty means
|
||||
# use the repo's default Dockerfile. Populated to a per-bottle
|
||||
# state file (~/.bot-bottle/state/<slug>/Dockerfile) after a
|
||||
# capability-block remediation (PRD 0016).
|
||||
dockerfile_path: str
|
||||
env_file: Path # docker --env-file: NAME=VALUE literals
|
||||
# name -> value for vars forwarded into the docker-run child process
|
||||
# via subprocess env (so values never land on argv or in a file).
|
||||
# repr=False keeps secret/interpolated/OAuth values out of any
|
||||
# accidental log of the plan dataclass.
|
||||
forwarded_env: dict[str, str] = field(repr=False)
|
||||
prompt_file: Path
|
||||
proxy_plan: PipelockProxyPlan
|
||||
use_runsc: bool
|
||||
|
||||
@property
|
||||
def container_name(self) -> str:
|
||||
return self.agent_provision.instance_name
|
||||
|
||||
@property
|
||||
def image(self) -> str:
|
||||
return self.agent_provision.image
|
||||
|
||||
@property
|
||||
def dockerfile_path(self) -> str:
|
||||
"""Absolute path to the Dockerfile that builds `image`. Sourced
|
||||
from the agent provision plan — the manifest may override per
|
||||
bottle; otherwise the provider plugin's bundled Dockerfile."""
|
||||
return self.agent_provision.dockerfile
|
||||
|
||||
@property
|
||||
def prompt_file(self) -> Path:
|
||||
return self.agent_provision.prompt_file
|
||||
|
||||
@property
|
||||
def agent_command(self) -> str:
|
||||
return self.agent_provision.command
|
||||
|
||||
@@ -32,10 +32,10 @@ from __future__ import annotations
|
||||
|
||||
import shutil
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import get_provider
|
||||
from ...log import info, warn
|
||||
from .bottle_state import (
|
||||
from ...bottle_state import (
|
||||
mark_preserved,
|
||||
per_bottle_dockerfile,
|
||||
transcript_snapshot_dir,
|
||||
@@ -93,11 +93,11 @@ def fetch_current_dockerfile(slug: str) -> str:
|
||||
override = per_bottle_dockerfile(slug)
|
||||
if override is not None:
|
||||
return override
|
||||
repo_dockerfile = _repo_dockerfile_path()
|
||||
repo_dockerfile = get_provider("claude").dockerfile
|
||||
if repo_dockerfile.is_file():
|
||||
return repo_dockerfile.read_text()
|
||||
raise CapabilityApplyError(
|
||||
f"no per-bottle Dockerfile for {slug} and no repo Dockerfile at "
|
||||
f"no per-bottle Dockerfile for {slug} and no provider Dockerfile at "
|
||||
f"{repo_dockerfile}"
|
||||
)
|
||||
|
||||
@@ -125,13 +125,6 @@ def apply_capability_change(slug: str, new_dockerfile: str) -> tuple[str, str]:
|
||||
# --- Internals -------------------------------------------------------------
|
||||
|
||||
|
||||
def _repo_dockerfile_path() -> Path:
|
||||
"""Path to the repo's Claude Dockerfile (one dir above this module's
|
||||
package root). Resolved at call time so the path is correct
|
||||
regardless of where this module is imported from."""
|
||||
# bot_bottle/backend/docker/capability_apply.py -> repo root
|
||||
return Path(__file__).resolve().parent.parent.parent.parent / "Dockerfile.claude"
|
||||
|
||||
|
||||
def snapshot_transcript(slug: str) -> None:
|
||||
"""`docker cp` /home/node/.claude out of the agent container into
|
||||
|
||||
@@ -31,7 +31,7 @@ from ... import supervise as _supervise
|
||||
from ...log import info, warn
|
||||
from . import util as docker_mod
|
||||
from .bottle_cleanup_plan import DockerBottleCleanupPlan
|
||||
from .bottle_state import bottle_state_dir, is_preserved
|
||||
from ...bottle_state import bottle_state_dir, is_preserved
|
||||
from .compose import COMPOSE_PROJECT_PREFIX, list_compose_projects
|
||||
|
||||
|
||||
|
||||
@@ -7,34 +7,14 @@ two networks, no named volumes.
|
||||
|
||||
Pure function. No I/O, no subprocess. Expects every launch-time
|
||||
field (network names, CA host paths, etc.) on the plan's inner
|
||||
plans to be populated; chunks 2+3 own that ordering. Chunk 1 just
|
||||
encodes the translation so it can be unit-tested in isolation.
|
||||
plans to be populated; chunks 2+3 own that ordering.
|
||||
|
||||
Conditional services follow the plan content (matches the
|
||||
SDK-call branching in `launch.py` today):
|
||||
Conditional services follow the plan content:
|
||||
|
||||
- pipelock + agent: always.
|
||||
- git-gate: iff plan.git_gate_plan.upstreams.
|
||||
- egress: iff plan.egress_plan.routes.
|
||||
- supervise: iff plan.supervise_plan is not None.
|
||||
|
||||
Naming:
|
||||
|
||||
- Compose project: `bot-bottle-<slug>`.
|
||||
- Service names (inside the file): `agent`, `pipelock`,
|
||||
`egress`, `git-gate`, `supervise`.
|
||||
- `container_name:` matches today's pattern
|
||||
(`bot-bottle-<service>-<slug>`) so dashboard/cleanup discovery
|
||||
via the prefix scan keeps working through the transition.
|
||||
- Network aliases preserve the current dial-by-shortname pattern
|
||||
for `egress` / `supervise`, and add the long container-name as
|
||||
an internal-network alias for `pipelock` / `git-gate` so any
|
||||
caller still referencing the long name resolves.
|
||||
|
||||
Sidecars that are built (egress, git-gate, supervise) get a
|
||||
compose `build:` block pointing at the repo Dockerfile; the
|
||||
`image:` tag is set explicitly so cached images on the daemon
|
||||
aren't rebuilt on every up.
|
||||
- agent + sidecars bundle: always.
|
||||
- git-gate: iff plan.git_gate_plan.upstreams.
|
||||
- egress: iff plan.egress_plan.routes.
|
||||
- supervise: iff plan.supervise_plan is not None.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
@@ -51,7 +31,6 @@ from ...egress import (
|
||||
)
|
||||
from ...git_gate import GIT_GATE_HOSTNAME
|
||||
from ...log import die, warn
|
||||
from ...pipelock import PIPELOCK_HOSTNAME
|
||||
from ...supervise import (
|
||||
CURRENT_CONFIG_DIR_IN_AGENT,
|
||||
QUEUE_DIR_IN_CONTAINER,
|
||||
@@ -63,7 +42,7 @@ from ..util import AGENT_CA_BUNDLE, AGENT_CA_PATH
|
||||
from .bottle_plan import DockerBottlePlan
|
||||
from .egress import (
|
||||
EGRESS_CA_IN_CONTAINER,
|
||||
EGRESS_PIPELOCK_CA_IN_CONTAINER,
|
||||
EGRESS_PORT,
|
||||
)
|
||||
from .git_gate import (
|
||||
GIT_GATE_ACCESS_HOOK_IN_CONTAINER,
|
||||
@@ -71,11 +50,7 @@ from .git_gate import (
|
||||
GIT_GATE_ENTRYPOINT_IN_CONTAINER,
|
||||
GIT_GATE_HOOK_IN_CONTAINER,
|
||||
)
|
||||
from ...pipelock import (
|
||||
PIPELOCK_CA_CERT_IN_CONTAINER,
|
||||
PIPELOCK_CA_KEY_IN_CONTAINER,
|
||||
)
|
||||
from .pipelock import PIPELOCK_PORT
|
||||
from . import network as network_mod
|
||||
from .sidecar_bundle import (
|
||||
SIDECAR_BUNDLE_DOCKERFILE,
|
||||
SIDECAR_BUNDLE_IMAGE,
|
||||
@@ -91,12 +66,11 @@ def bottle_plan_to_compose(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
"""Render a Compose v2 spec dict from a fully-resolved
|
||||
DockerBottlePlan.
|
||||
|
||||
The plan must have its inner plans (`proxy_plan`,
|
||||
`git_gate_plan`, `egress_plan`, `supervise_plan`) populated
|
||||
with launch-time fields — network names, CA host paths,
|
||||
pipelock_proxy_url. The renderer doesn't validate; callers
|
||||
feed it a fully-resolved plan or get an incomplete compose
|
||||
spec back.
|
||||
The plan must have its inner plans (`git_gate_plan`,
|
||||
`egress_plan`, `supervise_plan`) populated with launch-time
|
||||
fields — network names, CA host paths. The renderer doesn't
|
||||
validate; callers feed it a fully-resolved plan or get an
|
||||
incomplete compose spec back.
|
||||
"""
|
||||
project = f"bot-bottle-{plan.slug}"
|
||||
services: dict[str, Any] = {
|
||||
@@ -118,11 +92,11 @@ def _networks(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
bridge."""
|
||||
return {
|
||||
"internal": {
|
||||
"name": plan.proxy_plan.internal_network,
|
||||
"name": network_mod.network_name_for_slug(plan.slug),
|
||||
"internal": True,
|
||||
},
|
||||
"egress": {
|
||||
"name": plan.proxy_plan.egress_network,
|
||||
"name": network_mod.network_egress_name_for_slug(plan.slug),
|
||||
},
|
||||
}
|
||||
|
||||
@@ -142,29 +116,12 @@ def _bind(host: str | Path, target: str, *, read_only: bool = True) -> dict[str,
|
||||
|
||||
def _sidecar_bundle_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
"""The `sidecars` service: one container per bottle, bundle
|
||||
image, all four daemons under a Python init supervisor.
|
||||
image, all daemons under a Python init supervisor.
|
||||
|
||||
Mechanics:
|
||||
|
||||
- Daemon subset narrows via `BOT_BOTTLE_SIDECAR_DAEMONS`
|
||||
env. pipelock is always present; egress / git-gate /
|
||||
supervise are conditional on the plan.
|
||||
- Volumes are the union of the four daemons' bind-mounts,
|
||||
preserving the same in-container paths so each daemon
|
||||
finds its config / hooks / CA where it expects.
|
||||
- Environment is the union of *daemon-private* env vars
|
||||
(EGRESS_UPSTREAM_PROXY, SUPERVISE_BOTTLE_SLUG, etc).
|
||||
HTTPS_PROXY is NOT propagated here — see the comment in
|
||||
egress_entrypoint.sh; setting it at the container level
|
||||
would route git-gate's git fetches through pipelock,
|
||||
which is wrong.
|
||||
- Network aliases register every legacy short/long
|
||||
hostname (pipelock, egress, git-gate, supervise plus
|
||||
their `bot-bottle-<service>-<slug>` long forms) so
|
||||
the agent's HTTPS_PROXY URL and any other inter-service
|
||||
reference resolves to the bundle.
|
||||
Daemon subset narrows via `BOT_BOTTLE_SIDECAR_DAEMONS` env.
|
||||
egress is always present; git-gate / supervise are conditional.
|
||||
"""
|
||||
daemons: list[str] = ["egress", "pipelock"]
|
||||
daemons: list[str] = ["egress"]
|
||||
if plan.git_gate_plan.upstreams:
|
||||
daemons.append("git-gate")
|
||||
if plan.supervise_plan is not None:
|
||||
@@ -173,31 +130,15 @@ def _sidecar_bundle_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
env: list[str] = [f"BOT_BOTTLE_SIDECAR_DAEMONS={','.join(daemons)}"]
|
||||
volumes: list[dict[str, Any]] = []
|
||||
|
||||
# --- pipelock ----------------------------------------------------
|
||||
pp = plan.proxy_plan
|
||||
volumes += [
|
||||
_bind(pp.yaml_path, "/etc/pipelock.yaml"),
|
||||
_bind(pp.ca_cert_host_path, PIPELOCK_CA_CERT_IN_CONTAINER),
|
||||
_bind(pp.ca_key_host_path, PIPELOCK_CA_KEY_IN_CONTAINER),
|
||||
]
|
||||
|
||||
# --- egress (always part of the bundle; the EGRESS_UPSTREAM_*
|
||||
# env vars + ca bind-mounts are needed iff routes exist; when
|
||||
# the bottle has no routes the egress daemon falls back to its
|
||||
# `regular@9099` mode and is unused) -----------------------------
|
||||
# --- egress -------------------------------------------------------
|
||||
ep = plan.egress_plan
|
||||
volumes.append(_bind(ep.mitmproxy_ca_host_path, EGRESS_CA_IN_CONTAINER))
|
||||
if ep.routes:
|
||||
env.append(f"EGRESS_UPSTREAM_PROXY={ep.pipelock_proxy_url}")
|
||||
env.append(f"EGRESS_UPSTREAM_CA={EGRESS_PIPELOCK_CA_IN_CONTAINER}")
|
||||
volumes += [
|
||||
_bind(ep.routes_path, EGRESS_ROUTES_IN_CONTAINER),
|
||||
_bind(ep.mitmproxy_ca_host_path, EGRESS_CA_IN_CONTAINER),
|
||||
_bind(ep.pipelock_ca_host_path, EGRESS_PIPELOCK_CA_IN_CONTAINER),
|
||||
]
|
||||
volumes.append(_bind(ep.routes_path, EGRESS_ROUTES_IN_CONTAINER))
|
||||
for token_env in sorted(ep.token_env_map.keys()):
|
||||
env.append(token_env)
|
||||
|
||||
# --- git-gate ----------------------------------------------------
|
||||
# --- git-gate -----------------------------------------------------
|
||||
gp = plan.git_gate_plan
|
||||
if gp.upstreams:
|
||||
volumes += [
|
||||
@@ -217,7 +158,7 @@ def _sidecar_bundle_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
f"{GIT_GATE_CREDS_DIR_IN_CONTAINER}/{u.name}-known_hosts",
|
||||
))
|
||||
|
||||
# --- supervise ---------------------------------------------------
|
||||
# --- supervise ----------------------------------------------------
|
||||
sp = plan.supervise_plan
|
||||
if sp is not None:
|
||||
env += [
|
||||
@@ -232,13 +173,7 @@ def _sidecar_bundle_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
"read_only": False,
|
||||
})
|
||||
|
||||
# Internal-network aliases: the agent reaches each daemon through
|
||||
# its short name (pipelock / egress / git-gate / supervise) which
|
||||
# the bundle answers as if it were the daemon itself.
|
||||
internal_aliases = [
|
||||
PIPELOCK_HOSTNAME,
|
||||
EGRESS_HOSTNAME,
|
||||
]
|
||||
internal_aliases = [EGRESS_HOSTNAME]
|
||||
if gp.upstreams:
|
||||
internal_aliases.append(GIT_GATE_HOSTNAME)
|
||||
if sp is not None:
|
||||
@@ -263,11 +198,8 @@ def _sidecar_bundle_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
|
||||
def _agent_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
"""Agent container. Runs `sleep infinity`; claude is `docker
|
||||
exec -it`'d into it later. No TTY at the container level —
|
||||
interactivity is per-exec. HTTP_PROXY/HTTPS_PROXY point at the
|
||||
egress short-alias when an egress is declared, otherwise
|
||||
straight at pipelock's container name. CA trust trio matches
|
||||
the existing launch.py wiring."""
|
||||
exec -it`'d into it later. HTTP_PROXY/HTTPS_PROXY point at the
|
||||
egress sidecar."""
|
||||
proxy_url = _agent_proxy_url(plan)
|
||||
no_proxy = _agent_no_proxy(plan)
|
||||
env: list[str] = [
|
||||
@@ -290,7 +222,7 @@ def _agent_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
env.append(name)
|
||||
|
||||
service: dict[str, Any] = {
|
||||
"image": plan.runtime_image,
|
||||
"image": plan.image,
|
||||
"container_name": plan.container_name,
|
||||
"command": ["sleep", "infinity"],
|
||||
"networks": {"internal": None},
|
||||
@@ -298,8 +230,6 @@ def _agent_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
}
|
||||
if plan.use_runsc:
|
||||
service["runtime"] = "runsc"
|
||||
if plan.env_file and plan.env_file.exists() and plan.env_file.stat().st_size > 0:
|
||||
service["env_file"] = [str(plan.env_file)]
|
||||
|
||||
volumes: list[dict[str, Any]] = []
|
||||
if plan.supervise_plan is not None:
|
||||
@@ -319,21 +249,14 @@ def _agent_service(plan: DockerBottlePlan) -> dict[str, Any]:
|
||||
|
||||
|
||||
def _agent_proxy_url(plan: DockerBottlePlan) -> str:
|
||||
"""Pick the agent's HTTP_PROXY. With egress declared, the agent
|
||||
goes through egress (which in turn HTTPS_PROXYs to pipelock on
|
||||
its outbound leg). Without egress, the agent talks straight to
|
||||
pipelock."""
|
||||
if plan.egress_plan.routes:
|
||||
from .egress import EGRESS_PORT
|
||||
return f"http://{EGRESS_HOSTNAME}:{EGRESS_PORT}"
|
||||
return f"http://{PIPELOCK_HOSTNAME}:{PIPELOCK_PORT}"
|
||||
"""Agent's HTTP_PROXY — always points at egress."""
|
||||
return f"http://{EGRESS_HOSTNAME}:{EGRESS_PORT}"
|
||||
|
||||
|
||||
def _agent_no_proxy(plan: DockerBottlePlan) -> str:
|
||||
"""NO_PROXY for the agent. Matches the launch.py rules:
|
||||
loopback always, supervise hostname when the supervise sidecar
|
||||
is up (the MCP long-poll pattern needs to bypass pipelock's
|
||||
idle timeout)."""
|
||||
"""NO_PROXY for the agent: loopback always; supervise hostname
|
||||
when the supervise sidecar is up (MCP long-poll must bypass
|
||||
the egress proxy)."""
|
||||
hosts = ["localhost", "127.0.0.1"]
|
||||
if plan.supervise_plan is not None:
|
||||
hosts.append(SUPERVISE_HOSTNAME)
|
||||
|
||||
@@ -22,14 +22,8 @@ from ...log import die
|
||||
EGRESS_PORT = int(os.environ.get("BOT_BOTTLE_EGRESS_PORT", "9099"))
|
||||
|
||||
# In-container path for mitmproxy's CA. The format is a single PEM
|
||||
# file holding BOTH the cert and the private key, concatenated. The
|
||||
# upstream-trust CA (pipelock's, so egress trusts the upstream
|
||||
# leg) is a separate file because pipelock keeps a different CA on
|
||||
# its end.
|
||||
# file holding BOTH the cert and the private key, concatenated.
|
||||
EGRESS_CA_IN_CONTAINER = "/home/mitmproxy/.mitmproxy/mitmproxy-ca.pem"
|
||||
EGRESS_PIPELOCK_CA_IN_CONTAINER = (
|
||||
"/home/mitmproxy/.mitmproxy/pipelock-ca.pem"
|
||||
)
|
||||
|
||||
|
||||
def egress_tls_init(stage_dir: Path) -> tuple[Path, Path]:
|
||||
@@ -42,16 +36,8 @@ def egress_tls_init(stage_dir: Path) -> tuple[Path, Path]:
|
||||
trust store by `provision_ca` so the agent trusts the bumped
|
||||
CONNECT cert egress presents.
|
||||
|
||||
Why openssl req (not the pipelock binary's `tls init`):
|
||||
pipelock's CA generator stamps a non-standard `Subject Key
|
||||
Identifier` on the CA (random rather than SHA-1 of the pubkey).
|
||||
mitmproxy computes the `Authority Key Identifier` on each leaf
|
||||
it mints as SHA-1(issuer's pubkey). openssl's chain validator
|
||||
uses the leaf's AKI to find the issuer cert by SKI; pipelock's
|
||||
SKI doesn't match → openssl reports "unable to get local issuer
|
||||
certificate" even though the CA is right there in the trust
|
||||
store. openssl req's `subjectKeyIdentifier=hash` extension uses
|
||||
SHA-1(pubkey), matching mitmproxy's computation.
|
||||
openssl req's `subjectKeyIdentifier=hash` extension uses
|
||||
SHA-1(pubkey), matching mitmproxy's AKI computation on leaves.
|
||||
|
||||
Both files live under `<stage_dir>/egress-ca/` (mode 644 —
|
||||
`docker cp` preserves the mode into the container, where the
|
||||
|
||||
@@ -1,88 +1,25 @@
|
||||
"""Host-side helper to apply a routes.yaml change to a running
|
||||
egress sidecar (PRD 0014 retargeted by PRD 0017 chunk 3).
|
||||
"""Host-side helper for egress sidecar inspection (issue #198).
|
||||
|
||||
Used by the supervise dashboard when the operator approves an
|
||||
egress-block proposal (or runs the operator-initiated
|
||||
`routes edit <bottle>` verb). Fetches the current routes.yaml via
|
||||
`docker exec cat`, validates the new content, writes it into the
|
||||
sidecar via `docker cp`, then `docker kill --signal HUP` to make
|
||||
the addon reload without dropping connections.
|
||||
|
||||
Also mirrors the new route hosts into pipelock's hostname allowlist
|
||||
so the downstream leg lets them through — egress enforces
|
||||
the path-aware allowlist on the agent leg, pipelock enforces the
|
||||
hostname allowlist + DLP body scan on the upstream leg, and a
|
||||
host added to one must be in the other or the request 403s
|
||||
somewhere along the chain.
|
||||
|
||||
Raises EgressApplyError on any failure — the dashboard
|
||||
surfaces the message and keeps the proposal pending so the
|
||||
operator can retry.
|
||||
`_merge_single_route`, `add_route`, and `apply_routes_change` were
|
||||
removed when the egress-block MCP tool was dropped. The remaining
|
||||
helpers support runtime inspection and validation of the routes file
|
||||
without modifying it at runtime.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import re
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
from typing import cast
|
||||
|
||||
from ...egress import EGRESS_ROUTES_IN_CONTAINER
|
||||
from ...egress_addon_core import load_routes
|
||||
from ...yaml_subset import YamlSubsetError, parse_yaml_subset
|
||||
from .bottle_state import egress_state_dir
|
||||
from .sidecar_bundle import sidecar_bundle_container_name
|
||||
from .pipelock_apply import (
|
||||
PipelockApplyError,
|
||||
apply_allowlist_change,
|
||||
fetch_current_allowlist,
|
||||
parse_allowlist_content,
|
||||
render_allowlist_content,
|
||||
)
|
||||
|
||||
|
||||
def _render_routes_payload(routes_list: list[dict[str, object]]) -> str:
|
||||
"""Render a list-of-dicts routes payload as YAML matching the
|
||||
shape `egress_render_routes` produces. The apply path
|
||||
round-trips current routes.yaml through this so the file the
|
||||
sidecar sees stays in the YAML format the addon expects."""
|
||||
if not routes_list:
|
||||
return "routes: []\n"
|
||||
lines: list[str] = ["routes:"]
|
||||
for entry in routes_list:
|
||||
host = str(entry.get("host", ""))
|
||||
lines.append(f' - host: "{host}"')
|
||||
auth_scheme = entry.get("auth_scheme")
|
||||
token_env = entry.get("token_env")
|
||||
if auth_scheme and token_env:
|
||||
lines.append(f' auth_scheme: "{auth_scheme}"')
|
||||
lines.append(f' token_env: "{token_env}"')
|
||||
paths_obj = entry.get("path_allowlist")
|
||||
paths = cast(list[str], paths_obj) if isinstance(paths_obj, list) else []
|
||||
if paths:
|
||||
lines.append(" path_allowlist:")
|
||||
for p in paths:
|
||||
lines.append(f' - "{p}"')
|
||||
return "\n".join(lines) + "\n"
|
||||
|
||||
|
||||
def _egress_routes_host_path(slug: str) -> Path:
|
||||
"""The bind-mount source for the egress sidecar's routes.yaml.
|
||||
Must match what egress.prepare wrote at chunk-2 paths."""
|
||||
return egress_state_dir(slug) / "egress_routes.yaml"
|
||||
|
||||
|
||||
class EgressApplyError(RuntimeError):
|
||||
"""Raised when fetch / apply fails. Caller renders to the
|
||||
operator; does not crash the dashboard."""
|
||||
pass
|
||||
|
||||
|
||||
def fetch_current_routes(slug: str) -> str:
|
||||
"""Read the live routes.yaml from the running egress sidecar
|
||||
for `slug`. Returns the file content as a string. Raises
|
||||
EgressApplyError if the sidecar isn't reachable or the read
|
||||
fails."""
|
||||
container = sidecar_bundle_container_name(slug)
|
||||
r = subprocess.run(
|
||||
["docker", "exec", container, "cat", EGRESS_ROUTES_IN_CONTAINER],
|
||||
@@ -97,9 +34,6 @@ def fetch_current_routes(slug: str) -> str:
|
||||
|
||||
|
||||
def validate_routes_content(content: str) -> None:
|
||||
"""Syntactic check before SIGHUP — the addon's reload also
|
||||
validates, but failing here keeps the old routes live and gives
|
||||
the operator a clearer error than the addon's stderr line."""
|
||||
try:
|
||||
load_routes(content)
|
||||
except ValueError as e:
|
||||
@@ -108,245 +42,8 @@ def validate_routes_content(content: str) -> None:
|
||||
) from e
|
||||
|
||||
|
||||
def _hosts_in_routes(content: str) -> list[str]:
|
||||
"""Extract the host list from a routes.yaml content string.
|
||||
Uses the addon's own parser so any host the addon will match on
|
||||
also lands in pipelock's allowlist. Returns sorted+deduped."""
|
||||
try:
|
||||
routes = load_routes(content)
|
||||
except ValueError as e:
|
||||
raise EgressApplyError(
|
||||
f"proposed routes.yaml is not valid: {e}"
|
||||
) from e
|
||||
return sorted({r.host for r in routes if r.host})
|
||||
|
||||
|
||||
# Pipelock's allowlist parser accepts only literal hostnames:
|
||||
# `[A-Za-z0-9_.-]+`. Anything else (wildcards, IPv6 literals,
|
||||
# stray characters) is silently dropped from the mirror so the
|
||||
# pipelock apply doesn't fail parse before the new yaml is even
|
||||
# written. The dropped hosts stay on egress's route table —
|
||||
# but the addon does exact-host match only, so they'll never
|
||||
# match anything either. (Wildcard host matching was removed —
|
||||
# see `match_route` in egress_addon_core for the rationale.)
|
||||
_PIPELOCK_HOST_RE = re.compile(r"^[A-Za-z0-9_.-]+$")
|
||||
|
||||
|
||||
def _pipelock_safe_hosts(hosts: list[str]) -> list[str]:
|
||||
"""Drop any host pipelock's allowlist parser would reject.
|
||||
Order preserved."""
|
||||
return [h for h in hosts if _PIPELOCK_HOST_RE.match(h)]
|
||||
|
||||
|
||||
def _mirror_hosts_to_pipelock(slug: str, hosts: list[str]) -> None:
|
||||
"""Ensure every pipelock-compatible `hosts` entry is on
|
||||
pipelock's allowlist. Fetches pipelock's current allowlist,
|
||||
merges, re-applies. Hosts pipelock can't represent (wildcards,
|
||||
etc.) are silently skipped — they stay live on egress
|
||||
but aren't enforced at pipelock. No-op if every host is already
|
||||
present (apply still restarts pipelock if any host is new).
|
||||
Raises EgressApplyError on pipelock failures so the
|
||||
caller's diff/audit reflects the half-state."""
|
||||
safe_hosts = _pipelock_safe_hosts(hosts)
|
||||
try:
|
||||
current = fetch_current_allowlist(slug)
|
||||
existing = parse_allowlist_content(current)
|
||||
merged = sorted(set(existing) | set(safe_hosts))
|
||||
if merged == sorted(existing):
|
||||
return # nothing to add
|
||||
apply_allowlist_change(slug, render_allowlist_content(merged))
|
||||
except PipelockApplyError as e:
|
||||
# Mirror runs BEFORE the egress write, so egress
|
||||
# is unchanged on this failure path. Report it as a
|
||||
# pipelock-side problem so the operator looks in the right
|
||||
# place; their `pipelock edit` flow can repair manually.
|
||||
raise EgressApplyError(
|
||||
f"pipelock allowlist mirror failed (egress NOT "
|
||||
f"updated): {e}. Fix pipelock's allowlist manually with "
|
||||
f"`pipelock edit <bottle>` then retry the proposal."
|
||||
) from e
|
||||
|
||||
|
||||
def apply_routes_change(slug: str, new_content: str) -> tuple[str, str]:
|
||||
"""Apply `new_content` to the egress sidecar for `slug`:
|
||||
1. Fetch current routes.yaml (for the before-diff).
|
||||
2. Validate the new content via the addon's own parser.
|
||||
3. Mirror the route hosts onto pipelock's allowlist (so the
|
||||
downstream hostname gate lets them through).
|
||||
4. Write to a temp file, `docker cp` into the egress
|
||||
sidecar.
|
||||
5. `docker kill --signal HUP` so the addon reloads.
|
||||
|
||||
Order matters: pipelock first, then egress. If the
|
||||
pipelock step fails, egress hasn't been touched and the
|
||||
old routes stay live. If the egress step fails after
|
||||
pipelock succeeded, pipelock has the host in its allowlist but
|
||||
egress doesn't enforce it yet — harmless extra-permissive
|
||||
state at pipelock, and a re-approval will land the egress
|
||||
side.
|
||||
|
||||
Returns (before, after) where `after` == `new_content`. Raises
|
||||
EgressApplyError on any step."""
|
||||
container = sidecar_bundle_container_name(slug)
|
||||
before = fetch_current_routes(slug)
|
||||
validate_routes_content(new_content)
|
||||
|
||||
# Pipelock mirror first — if it fails, egress stays intact
|
||||
# and the operator gets a clear error about the half-state.
|
||||
_mirror_hosts_to_pipelock(slug, _hosts_in_routes(new_content))
|
||||
|
||||
# routes.yaml is bind-mounted into the egress container as a
|
||||
# SINGLE FILE. Docker single-file bind mounts pin the source
|
||||
# inode at mount time; write-temp-then-rename swaps the inode
|
||||
# on the host, which leaves the container's mount pointing at
|
||||
# the now-orphaned old inode (so the SIGHUP'd reload re-reads
|
||||
# unchanged content). Write in-place instead. Lose file-level
|
||||
# atomicity, but the apply path issues SIGHUP only AFTER the
|
||||
# write returns, and the addon's `load_routes` raises
|
||||
# `ValueError` on a partial read and keeps the previous
|
||||
# in-memory routes — so a SIGHUP that hypothetically raced an
|
||||
# in-flight write is non-disruptive.
|
||||
target = _egress_routes_host_path(slug)
|
||||
target.parent.mkdir(parents=True, exist_ok=True)
|
||||
target.write_text(new_content)
|
||||
# mitmproxy in the container reads through the bind mount as
|
||||
# uid 1000; the host file has to be world-readable for that
|
||||
# read to succeed (parent dir at 0o700 still restricts who
|
||||
# can reach the file on the host). Routes content is not
|
||||
# secret — tokens live in the container's environ — so 0o644
|
||||
# is the right trade-off.
|
||||
target.chmod(0o644)
|
||||
sig = subprocess.run(
|
||||
["docker", "kill", "--signal", "HUP", container],
|
||||
capture_output=True, text=True, check=False,
|
||||
)
|
||||
if sig.returncode != 0:
|
||||
raise EgressApplyError(
|
||||
f"failed to SIGHUP {container}: "
|
||||
f"{(sig.stderr or '').strip()}"
|
||||
)
|
||||
|
||||
return before, new_content
|
||||
|
||||
|
||||
def _merge_single_route(
|
||||
current_yaml: str, new_route: dict[str, object],
|
||||
) -> str:
|
||||
"""Merge a single proposed route into the current routes.yaml
|
||||
content, returning the merged YAML string.
|
||||
|
||||
Behavior:
|
||||
- If `new_route['host']` is NOT in the current routes →
|
||||
append the route.
|
||||
- If the host IS already present → union the path_allowlist
|
||||
entries (proposed ∪ existing). The existing `auth_scheme`
|
||||
and `token_env` are preserved — agent-proposed auth changes
|
||||
on an existing host are ignored, matching the tool's
|
||||
documented semantics.
|
||||
|
||||
Round-trips the file through `yaml_subset` (the same parser
|
||||
the addon uses), so the merged output is in the YAML format
|
||||
the sidecar reads. Token VALUES never appear here; the routes
|
||||
file carries only env-var slot NAMES."""
|
||||
try:
|
||||
cfg = parse_yaml_subset(current_yaml)
|
||||
except YamlSubsetError as e:
|
||||
raise EgressApplyError(
|
||||
f"current routes.yaml is not valid YAML: {e}"
|
||||
) from e
|
||||
routes = cfg.get("routes")
|
||||
if not isinstance(routes, list):
|
||||
raise EgressApplyError(
|
||||
"current routes.yaml: 'routes' is not a list"
|
||||
)
|
||||
routes_typed = cast(list[object], routes)
|
||||
|
||||
new_host = str(new_route.get("host", "")).lower()
|
||||
if not new_host:
|
||||
raise EgressApplyError(
|
||||
"proposed route is missing 'host'"
|
||||
)
|
||||
|
||||
proposed_paths_obj = new_route.get("path_allowlist")
|
||||
proposed_paths = cast(list[str], proposed_paths_obj) if isinstance(proposed_paths_obj, list) else []
|
||||
|
||||
# Look for an existing entry with the same host (case-insensitive).
|
||||
for entry in routes_typed:
|
||||
if not isinstance(entry, dict):
|
||||
continue
|
||||
entry_typed = cast(dict[str, object], entry)
|
||||
if str(entry_typed.get("host", "")).lower() == new_host:
|
||||
# Merge path_allowlist: union proposed + existing, ordered
|
||||
# by first-seen so existing paths stay in original order.
|
||||
existing_paths_obj = entry_typed.get("path_allowlist")
|
||||
existing_paths = cast(list[str], existing_paths_obj) if isinstance(existing_paths_obj, list) else []
|
||||
seen = {p: None for p in existing_paths}
|
||||
for p in proposed_paths:
|
||||
seen.setdefault(p, None)
|
||||
merged_paths = list(seen.keys())
|
||||
if merged_paths:
|
||||
entry_typed["path_allowlist"] = merged_paths
|
||||
# Preserve existing auth — tool description says agent-
|
||||
# proposed auth on an existing host is ignored.
|
||||
break
|
||||
else:
|
||||
# Host not present; build a new route entry from the
|
||||
# proposed fields. Need to assign a token_env slot if
|
||||
# `auth` was proposed (otherwise the addon's parser rejects
|
||||
# a half-set auth pair). Slots: count existing slots, pick
|
||||
# the next free index.
|
||||
entry_typed: dict[str, object] = {"host": new_route.get("host")} # type: ignore
|
||||
if proposed_paths:
|
||||
entry_typed["path_allowlist"] = proposed_paths
|
||||
auth = new_route.get("auth")
|
||||
if isinstance(auth, dict) and auth.get("scheme") and auth.get("token_ref"): # type: ignore
|
||||
auth_typed = cast(dict[str, object], auth)
|
||||
existing_slots = sorted({
|
||||
str(r_entry.get("token_env", ""))
|
||||
for r_entry_obj in routes_typed
|
||||
if isinstance(r_entry_obj, dict)
|
||||
for r_entry in [cast(dict[str, object], r_entry_obj)]
|
||||
if r_entry.get("token_env")
|
||||
})
|
||||
next_idx = len(existing_slots)
|
||||
entry_typed["auth_scheme"] = str(cast(object, auth_typed.get("scheme")))
|
||||
entry_typed["token_env"] = f"EGRESS_TOKEN_{next_idx}"
|
||||
# NOTE: the addon reads token VALUES from its container's
|
||||
# environ keyed by token_env. A newly-added auth route at
|
||||
# runtime points at a slot that has no env value → the
|
||||
# addon will 403 with "token env unset" until the operator
|
||||
# arranges for the value to land in the container's env.
|
||||
# Recording this here so the operator-facing diff carries
|
||||
# the slot name they'll need to provision.
|
||||
routes_typed.append(entry_typed)
|
||||
|
||||
return _render_routes_payload(cast(list[dict[str, object]], routes_typed))
|
||||
|
||||
|
||||
def add_route(slug: str, proposed_route_json: str) -> tuple[str, str]:
|
||||
"""Apply a single-route addition to the egress. Parses the
|
||||
agent's proposed route, fetches the current routes file, merges,
|
||||
and applies via `apply_routes_change`. Returns (before, after)
|
||||
full-file content for the audit log."""
|
||||
try:
|
||||
proposed = json.loads(proposed_route_json)
|
||||
except json.JSONDecodeError as e:
|
||||
raise EgressApplyError(
|
||||
f"proposed route is not valid JSON: {e}"
|
||||
) from e
|
||||
if not isinstance(proposed, dict):
|
||||
raise EgressApplyError(
|
||||
"proposed route must be a JSON object"
|
||||
)
|
||||
current = fetch_current_routes(slug)
|
||||
merged = _merge_single_route(current, proposed)
|
||||
return apply_routes_change(slug, merged)
|
||||
|
||||
|
||||
__all__ = [
|
||||
"EgressApplyError",
|
||||
"add_route",
|
||||
"apply_routes_change",
|
||||
"fetch_current_routes",
|
||||
"validate_routes_content",
|
||||
]
|
||||
|
||||
@@ -15,7 +15,7 @@ from __future__ import annotations
|
||||
import subprocess
|
||||
|
||||
from .. import ActiveAgent
|
||||
from .bottle_state import read_metadata
|
||||
from ...bottle_state import read_metadata
|
||||
from .compose import compose_project_name, list_active_slugs
|
||||
|
||||
|
||||
@@ -39,6 +39,8 @@ def enumerate_active() -> list[ActiveAgent]:
|
||||
agent_name=metadata.agent_name if metadata else "?",
|
||||
started_at=metadata.started_at if metadata else "",
|
||||
services=tuple(sorted(services)),
|
||||
label=metadata.label if metadata else "",
|
||||
color=metadata.color if metadata else "",
|
||||
))
|
||||
return out
|
||||
|
||||
|
||||
@@ -4,25 +4,19 @@ PRD 0018 chunk 3: each instance is one `docker compose` project.
|
||||
|
||||
The flow is:
|
||||
|
||||
1. Build the agent's base + derived image (compose builds the
|
||||
sidecar images via the `build:` directive on first up).
|
||||
2. Pre-create the per-bottle networks. We do this outside compose
|
||||
so we can inspect the assigned internal CIDR and embed it in
|
||||
pipelock's yaml (compose's `external: true` lets the compose
|
||||
file reference these pre-existing networks).
|
||||
3. Mint the per-bottle CAs (chunk 2 writes them under
|
||||
state/<slug>/{pipelock,egress}/).
|
||||
4. Re-render pipelock yaml with the now-known internal CIDR so
|
||||
the SSRF allowlist exempts the bottle's own subnet.
|
||||
5. Populate the inner plans with launch-time fields so the
|
||||
renderer can read network names, CA paths, pipelock URL.
|
||||
1. Build the agent image from the provider Dockerfile (compose
|
||||
builds the sidecar images via the `build:` directive on first up).
|
||||
2. Mint the per-bottle egress CA (chunk 2 writes it under
|
||||
state/<slug>/egress/).
|
||||
3. Populate the inner plans with launch-time fields so the
|
||||
renderer can read network names, CA paths.
|
||||
6. Render the compose spec, write it to
|
||||
state/<slug>/docker-compose.yml, write metadata.json.
|
||||
7. `docker compose up -d` (token + OAuth values flow into the
|
||||
compose subprocess env so `environment: [NAME]` bare-name
|
||||
entries inherit without rendering values into the file).
|
||||
8. Provision (CA install, prompt copy, skills, git, supervise
|
||||
config) — unchanged, uses `docker exec`.
|
||||
8. Provision (CA install, prompt copy, skills, workspace, git,
|
||||
supervise config) — unchanged, uses `docker exec` / `docker cp`.
|
||||
9. Yield a DockerBottle handle. `exec_agent` runs claude via
|
||||
`docker exec -it` exactly like the pre-compose world.
|
||||
|
||||
@@ -49,11 +43,10 @@ from . import network as network_mod
|
||||
from . import util as docker_mod
|
||||
from .bottle import DockerBottle
|
||||
from .bottle_plan import DockerBottlePlan
|
||||
from .bottle_state import (
|
||||
from ...bottle_state import (
|
||||
bottle_state_dir,
|
||||
egress_state_dir,
|
||||
git_gate_state_dir,
|
||||
pipelock_state_dir,
|
||||
)
|
||||
from .compose import (
|
||||
bottle_plan_to_compose,
|
||||
@@ -66,10 +59,6 @@ from .compose import (
|
||||
write_compose_file,
|
||||
)
|
||||
from .egress import egress_tls_init
|
||||
from .pipelock import (
|
||||
BUNDLE_LOCAL_PIPELOCK_URL,
|
||||
pipelock_tls_init,
|
||||
)
|
||||
|
||||
|
||||
# Where the repo root lives, for `docker build` context. Computed once.
|
||||
@@ -86,7 +75,7 @@ def launch(
|
||||
Teardown on exit."""
|
||||
stack = ExitStack()
|
||||
|
||||
_bottle_for_revoke = plan.spec.manifest.bottle_for(plan.spec.agent_name)
|
||||
_bottle_for_revoke = plan.manifest.bottle
|
||||
_git_gate_dir_for_revoke = git_gate_state_dir(plan.slug)
|
||||
|
||||
def teardown() -> None:
|
||||
@@ -108,40 +97,14 @@ def launch(
|
||||
plan.image, _REPO_DIR,
|
||||
dockerfile=plan.dockerfile_path,
|
||||
)
|
||||
if plan.derived_image:
|
||||
docker_mod.build_image_with_cwd(
|
||||
plan.derived_image, plan.image, plan.workspace_plan
|
||||
)
|
||||
|
||||
# Networks: compose-managed. The names are derived
|
||||
# deterministically from the slug so the renderer can put
|
||||
# them on the services and `compose up` creates them with
|
||||
# those names. The empirical spike confirmed pipelock's
|
||||
# SSRF guard only checks proxied-request destinations, not
|
||||
# source IPs — so the bottle's own internal CIDR doesn't
|
||||
# need to be in `ssrf.ip_allowlist`. Pre-create + CIDR
|
||||
# introspection are gone; compose owns the network
|
||||
# lifecycle.
|
||||
internal_network = network_mod.network_name_for_slug(plan.slug)
|
||||
egress_network = network_mod.network_egress_name_for_slug(plan.slug)
|
||||
|
||||
# Mint per-bottle CAs into state/<slug>/{pipelock,egress}/.
|
||||
ca_cert_host, ca_key_host = pipelock_tls_init(pipelock_state_dir(plan.slug))
|
||||
egress_ca_host, egress_ca_cert_only = egress_tls_init(
|
||||
egress_state_dir(plan.slug),
|
||||
)
|
||||
|
||||
# Populate launch-time fields on every inner plan so the
|
||||
# renderer reads concrete network names, CA paths, and
|
||||
# pipelock URL.
|
||||
proxy_plan = dataclasses.replace(
|
||||
plan.proxy_plan,
|
||||
internal_network=internal_network,
|
||||
internal_network_cidr="",
|
||||
egress_network=egress_network,
|
||||
ca_cert_host_path=ca_cert_host,
|
||||
ca_key_host_path=ca_key_host,
|
||||
)
|
||||
git_gate_plan = plan.git_gate_plan
|
||||
if git_gate_plan.upstreams:
|
||||
git_gate_plan = dataclasses.replace(
|
||||
@@ -149,17 +112,13 @@ def launch(
|
||||
internal_network=internal_network,
|
||||
egress_network=egress_network,
|
||||
)
|
||||
egress_plan = plan.egress_plan
|
||||
if egress_plan.routes:
|
||||
egress_plan = dataclasses.replace(
|
||||
egress_plan,
|
||||
internal_network=internal_network,
|
||||
egress_network=egress_network,
|
||||
mitmproxy_ca_host_path=egress_ca_host,
|
||||
mitmproxy_ca_cert_only_host_path=egress_ca_cert_only,
|
||||
pipelock_ca_host_path=ca_cert_host,
|
||||
pipelock_proxy_url=BUNDLE_LOCAL_PIPELOCK_URL,
|
||||
)
|
||||
egress_plan = dataclasses.replace(
|
||||
plan.egress_plan,
|
||||
internal_network=internal_network,
|
||||
egress_network=egress_network,
|
||||
mitmproxy_ca_host_path=egress_ca_host,
|
||||
mitmproxy_ca_cert_only_host_path=egress_ca_cert_only,
|
||||
)
|
||||
supervise_plan = plan.supervise_plan
|
||||
if supervise_plan is not None:
|
||||
supervise_plan = dataclasses.replace(
|
||||
@@ -168,7 +127,6 @@ def launch(
|
||||
)
|
||||
plan = dataclasses.replace(
|
||||
plan,
|
||||
proxy_plan=proxy_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
@@ -217,6 +175,10 @@ def launch(
|
||||
None,
|
||||
agent_command=plan.agent_command,
|
||||
agent_prompt_mode=plan.agent_prompt_mode,
|
||||
agent_provider_template=plan.agent_provider_template,
|
||||
terminal_title=plan.spec.label or plan.spec.agent_name,
|
||||
terminal_color=plan.spec.color,
|
||||
agent_workdir=plan.workspace_plan.workdir,
|
||||
)
|
||||
bottle.prompt_path = provision(plan, bottle)
|
||||
|
||||
|
||||
@@ -1,11 +1,10 @@
|
||||
"""Docker network plumbing for the per-agent egress topology.
|
||||
|
||||
The agent container sits on a Docker `--internal` network (no default
|
||||
gateway). Pipelock straddles that network and a per-agent user-defined
|
||||
bridge for upstream egress. We deliberately do NOT use Docker's legacy
|
||||
gateway). Egress straddles that network and a per-agent user-defined
|
||||
bridge for upstream traffic. We deliberately do NOT use Docker's legacy
|
||||
`bridge` network because only user-defined bridges run Docker's
|
||||
embedded DNS resolver, which pipelock needs to resolve api.anthropic.com
|
||||
and similar upstream hostnames.
|
||||
embedded DNS resolver, which egress needs to resolve upstream hostnames.
|
||||
|
||||
Naming: bot-bottle-net-<slug> (internal),
|
||||
bot-bottle-egress-<slug> (egress). Numeric suffix on conflict
|
||||
@@ -77,20 +76,12 @@ def network_create_internal(slug: str) -> str:
|
||||
|
||||
def network_create_egress(slug: str) -> str:
|
||||
"""Create a per-agent user-defined bridge (NOT the legacy `bridge`)
|
||||
so the pipelock sidecar has working DNS for upstream hostnames."""
|
||||
so the egress sidecar has working DNS for upstream hostnames."""
|
||||
return _network_create_with_prefix(network_egress_name_for_slug(slug), internal=False)
|
||||
|
||||
|
||||
def network_inspect_cidr(name: str) -> str:
|
||||
"""Return the IPv4 CIDR Docker assigned to a user-defined network.
|
||||
|
||||
Used by pipelock's SSRF guard exception: the bottle's internal
|
||||
network sits in RFC1918 space, so pipelock's `internal:` list
|
||||
would block any agent request whose destination resolves there
|
||||
— including the cred-proxy sidecar's address. Adding the
|
||||
network's CIDR to pipelock's `ssrf.ip_allowlist` lets traffic
|
||||
targeted at the bottle's own sidecars through while pipelock
|
||||
still body-scans and api_allowlist-gates as usual."""
|
||||
"""Return the IPv4 CIDR Docker assigned to a user-defined network."""
|
||||
result = subprocess.run(
|
||||
["docker", "network", "inspect",
|
||||
"--format", "{{range .IPAM.Config}}{{.Subnet}}{{end}}", name],
|
||||
|
||||
@@ -1,74 +0,0 @@
|
||||
"""Docker-side pipelock helpers: image pin, container naming, and
|
||||
the one-shot `pipelock tls init` host-side CA mint. The
|
||||
prepare-time YAML rendering itself lives on the platform-neutral
|
||||
`PipelockProxy` ABC — backends instantiate it directly.
|
||||
|
||||
The per-container `.start()` / `.stop()` lifecycle was deleted in
|
||||
PRD 0024 chunk 3; compose-up owns the container lifecycle (PRD
|
||||
0018) and the bundle path (PRD 0024) collapses pipelock + egress
|
||||
+ git-gate + supervise into one container."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
|
||||
from ...log import die
|
||||
|
||||
|
||||
# Pipelock image, pinned by digest. The digest is the multi-arch image
|
||||
# index for ghcr.io/luckypipewrench/pipelock:2.3.0.
|
||||
PIPELOCK_IMAGE = os.environ.get(
|
||||
"BOT_BOTTLE_PIPELOCK_IMAGE",
|
||||
"ghcr.io/luckypipewrench/pipelock@sha256:"
|
||||
"3b1a39417b98406ddc5dc2d8fcb42865ddc0c68a43d355db55f0f8cb06bc6de9",
|
||||
)
|
||||
|
||||
# Listening port for pipelock's forward proxy.
|
||||
PIPELOCK_PORT = os.environ.get("BOT_BOTTLE_PIPELOCK_PORT", "8888")
|
||||
|
||||
|
||||
# The URL egress dials for its upstream HTTPS_PROXY. egress and pipelock
|
||||
# share the same container's network namespace inside the sidecar bundle, so
|
||||
# loopback reaches pipelock directly — no docker DNS aliases involved.
|
||||
BUNDLE_LOCAL_PIPELOCK_URL = f"http://127.0.0.1:{PIPELOCK_PORT}"
|
||||
|
||||
|
||||
def pipelock_tls_init(stage_dir: Path) -> tuple[Path, Path]:
|
||||
"""Generate a fresh per-bottle CA via a one-shot pipelock container.
|
||||
|
||||
Runs `pipelock tls init` against a host-mounted scratch dir, leaving
|
||||
`ca.pem` (public cert, mode 600) and `ca-key.pem` (private key, mode
|
||||
600) under `<stage_dir>/pipelock-ca/`. Returns the two host paths.
|
||||
|
||||
The image is pinned (same digest the running sidecar uses) so the
|
||||
generated CA matches what the sidecar expects. Output is owned by
|
||||
whatever UID the one-shot ran as; the compose renderer's
|
||||
bind-mounts pin the files in place at runtime, so ownership
|
||||
inside the running sidecar (root in pipelock's distroless image)
|
||||
is independent."""
|
||||
work = stage_dir / "pipelock-ca"
|
||||
work.mkdir(exist_ok=True)
|
||||
result = subprocess.run(
|
||||
["docker", "run", "--rm",
|
||||
"-v", f"{work}:/h",
|
||||
"-e", "PIPELOCK_HOME=/h",
|
||||
PIPELOCK_IMAGE, "tls", "init"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
die(f"pipelock tls init failed: {result.stderr.strip()}")
|
||||
cert = work / "ca.pem"
|
||||
key = work / "ca-key.pem"
|
||||
if not cert.is_file() or not key.is_file():
|
||||
die(f"pipelock tls init did not produce ca files in {work}")
|
||||
# Explicit perms in case a future pipelock release changes
|
||||
# defaults. Pipelock runs as root in its distroless image and
|
||||
# bind-mounts work with 0o600 (root reads everything); the key
|
||||
# has no reason to be readable to anyone else on the host.
|
||||
key.chmod(0o600)
|
||||
cert.chmod(0o644)
|
||||
return (cert, key)
|
||||
@@ -1,200 +0,0 @@
|
||||
"""pipelock_apply — host-side helper to apply an api_allowlist
|
||||
change to a running pipelock sidecar (PRD 0015).
|
||||
|
||||
Used by the supervise dashboard when the operator approves a
|
||||
pipelock-block proposal (or runs the operator-initiated `pipelock
|
||||
edit <bottle>` verb). Fetches the current pipelock.yaml via `docker
|
||||
exec`, parses it, swaps the api_allowlist with the proposed hosts,
|
||||
re-renders, writes back via the bind-mount path, then signals the
|
||||
bundle supervisor to restart the pipelock daemon (`docker kill
|
||||
--signal USR1`) so
|
||||
pipelock picks up the new config.
|
||||
|
||||
v1 uses restart, not SIGHUP — pipelock has no in-process reload
|
||||
hook and adding one is the "SIGHUP reload for pipelock" open
|
||||
question in PRD 0015. Restart drops in-flight outbound calls; the
|
||||
agent's HTTP client retries pick up against the restarted proxy.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import re
|
||||
import subprocess
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
from ...pipelock import pipelock_render_yaml
|
||||
from ...yaml_subset import YamlSubsetError, parse_yaml_subset
|
||||
from .bottle_state import pipelock_state_dir
|
||||
from .sidecar_bundle import sidecar_bundle_container_name
|
||||
|
||||
|
||||
def _pipelock_yaml_host_path(slug: str) -> Path:
|
||||
"""The bind-mount source for the pipelock sidecar's
|
||||
pipelock.yaml — matches what pipelock.prepare wrote at chunk-2
|
||||
paths."""
|
||||
return pipelock_state_dir(slug) / "pipelock.yaml"
|
||||
|
||||
|
||||
PIPELOCK_YAML_IN_CONTAINER = "/etc/pipelock.yaml"
|
||||
|
||||
# Allowlist proposals are one-hostname-per-line. Blank lines and
|
||||
# `#`-prefixed comments are ignored. The character set matches the
|
||||
# supervise sidecar's syntactic check on the agent's pipelock-block
|
||||
# proposal (alphanumerics + dot/dash/underscore).
|
||||
_HOST_OK = re.compile(r"^[A-Za-z0-9_.-]+$")
|
||||
|
||||
|
||||
class PipelockApplyError(RuntimeError):
|
||||
"""Raised when fetch / parse / apply fails. The dashboard renders
|
||||
the message and keeps the proposal pending — never crashes."""
|
||||
|
||||
|
||||
def parse_allowlist_content(content: str) -> list[str]:
|
||||
"""One hostname per line. Blanks and `#` comments are ignored.
|
||||
Raises PipelockApplyError if a line has a disallowed character."""
|
||||
hosts: list[str] = []
|
||||
for i, raw_line in enumerate(content.splitlines(), start=1):
|
||||
line = raw_line.strip()
|
||||
if not line or line.startswith("#"):
|
||||
continue
|
||||
if not _HOST_OK.match(line):
|
||||
raise PipelockApplyError(
|
||||
f"allowlist line {i}: {line!r} has disallowed characters"
|
||||
)
|
||||
hosts.append(line)
|
||||
return hosts
|
||||
|
||||
|
||||
def render_allowlist_content(hosts: list[str]) -> str:
|
||||
"""Hosts → one-per-line string (the operator-facing format)."""
|
||||
if not hosts:
|
||||
return ""
|
||||
return "\n".join(hosts) + "\n"
|
||||
|
||||
|
||||
def fetch_current_yaml(slug: str) -> str:
|
||||
"""Read the live /etc/pipelock.yaml from the sidecar bundle.
|
||||
|
||||
Uses `docker cp` because pipelock inside the bundle is the
|
||||
distroless pipelock binary with no shell, and `docker cp` is a
|
||||
daemon-API tarball copy that works regardless of what's
|
||||
available inside the container.
|
||||
|
||||
Raises PipelockApplyError if the read fails."""
|
||||
container = sidecar_bundle_container_name(slug)
|
||||
fd, tmp_path = tempfile.mkstemp(prefix="cb-pipelock-fetch.", suffix=".yaml")
|
||||
os.close(fd)
|
||||
try:
|
||||
r = subprocess.run(
|
||||
[
|
||||
"docker", "cp",
|
||||
f"{container}:{PIPELOCK_YAML_IN_CONTAINER}", tmp_path,
|
||||
],
|
||||
capture_output=True, text=True, check=False,
|
||||
)
|
||||
if r.returncode != 0:
|
||||
raise PipelockApplyError(
|
||||
f"could not fetch pipelock.yaml from {container}: "
|
||||
f"{(r.stderr or '').strip() or 'container not running?'}"
|
||||
)
|
||||
return Path(tmp_path).read_text(encoding="utf-8")
|
||||
finally:
|
||||
try:
|
||||
Path(tmp_path).unlink()
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
|
||||
def fetch_current_allowlist(slug: str) -> str:
|
||||
"""Fetch the live yaml, extract api_allowlist, render as one-per-
|
||||
line — the operator-facing format for the TUI / agent's
|
||||
current-config mount."""
|
||||
yaml = fetch_current_yaml(slug)
|
||||
try:
|
||||
cfg = parse_yaml_subset(yaml)
|
||||
except YamlSubsetError as e:
|
||||
raise PipelockApplyError(f"running pipelock yaml: {e}") from e
|
||||
hosts = cfg.get("api_allowlist", [])
|
||||
if not isinstance(hosts, list):
|
||||
raise PipelockApplyError(
|
||||
"running pipelock yaml: api_allowlist is not a list"
|
||||
)
|
||||
return render_allowlist_content([str(h) for h in hosts])
|
||||
|
||||
|
||||
def apply_allowlist_change(
|
||||
slug: str, new_allowlist_content: str,
|
||||
) -> tuple[str, str]:
|
||||
"""Apply `new_allowlist_content` to the sidecar bundle:
|
||||
1. Parse the proposed hosts (one per line).
|
||||
2. Fetch + parse current pipelock.yaml.
|
||||
3. Replace api_allowlist with the proposed hosts; re-render.
|
||||
4. Write the new yaml to the bind-mount source.
|
||||
5. `docker kill --signal USR1 <bundle>` so the supervisor
|
||||
restarts the pipelock daemon in place (leaving egress,
|
||||
git-gate, and supervise running). Pipelock has no
|
||||
in-process reload; the supervisor's per-daemon restart
|
||||
keeps the agent's MCP socket alive — a whole-bundle
|
||||
`docker restart` would bounce supervise too.
|
||||
|
||||
Returns (before, after) where both are one-per-line allowlist
|
||||
strings (operator-facing format). Raises PipelockApplyError on
|
||||
any failure; the sidecar's existing config stays in place until
|
||||
the host write succeeds, and the SIGUSR1 is what makes it
|
||||
live."""
|
||||
new_hosts = parse_allowlist_content(new_allowlist_content)
|
||||
container = sidecar_bundle_container_name(slug)
|
||||
current_yaml = fetch_current_yaml(slug)
|
||||
try:
|
||||
cfg = parse_yaml_subset(current_yaml)
|
||||
except YamlSubsetError as e:
|
||||
raise PipelockApplyError(f"running pipelock yaml: {e}") from e
|
||||
current_hosts = cfg.get("api_allowlist", [])
|
||||
if not isinstance(current_hosts, list):
|
||||
raise PipelockApplyError(
|
||||
"running pipelock yaml: api_allowlist is not a list"
|
||||
)
|
||||
|
||||
before = render_allowlist_content([str(h) for h in current_hosts])
|
||||
after = render_allowlist_content(new_hosts)
|
||||
|
||||
cfg["api_allowlist"] = new_hosts
|
||||
rendered = pipelock_render_yaml(cfg)
|
||||
|
||||
# pipelock.yaml is bind-mounted into the container as a SINGLE
|
||||
# FILE — same Docker single-file inode issue as egress_apply:
|
||||
# write-temp-then-rename swaps the host inode and leaves the
|
||||
# container's mount pointing at the orphaned old one. Write
|
||||
# in-place. The SIGUSR1 below makes the new content live
|
||||
# (pipelock has no in-process reload, so the supervisor
|
||||
# restarts the pipelock daemon in response).
|
||||
target = _pipelock_yaml_host_path(slug)
|
||||
target.parent.mkdir(parents=True, exist_ok=True)
|
||||
target.write_text(rendered)
|
||||
# pipelock runs as root in its distroless image — any mode is
|
||||
# fine — but 0o600 matches what prepare wrote.
|
||||
target.chmod(0o600)
|
||||
restart = subprocess.run(
|
||||
["docker", "kill", "--signal", "USR1", container],
|
||||
capture_output=True, text=True, check=False,
|
||||
)
|
||||
if restart.returncode != 0:
|
||||
raise PipelockApplyError(
|
||||
f"failed to signal {container} for pipelock restart: "
|
||||
f"{(restart.stderr or '').strip()}"
|
||||
)
|
||||
|
||||
return before, after
|
||||
|
||||
|
||||
__all__ = [
|
||||
"PIPELOCK_YAML_IN_CONTAINER",
|
||||
"PipelockApplyError",
|
||||
"apply_allowlist_change",
|
||||
"fetch_current_allowlist",
|
||||
"fetch_current_yaml",
|
||||
"parse_allowlist_content",
|
||||
"render_allowlist_content",
|
||||
]
|
||||
@@ -1,278 +0,0 @@
|
||||
"""Prepare step for the Docker bottle backend.
|
||||
|
||||
`resolve_plan` does all host-side resolution (image and container
|
||||
names, env-file, prompt-file, proxy plan, runtime detection) and
|
||||
returns a frozen DockerBottlePlan. No Docker resources are created;
|
||||
the only side effects are scratch files under `stage_dir` and a probe
|
||||
of `docker info`. Cross-backend host-side validation has already run
|
||||
via the base class's `prepare` template before this is called.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from datetime import datetime, timezone
|
||||
from dataclasses import replace
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import agent_provision_plan, runtime_for
|
||||
from ...egress import Egress
|
||||
from ...env import ResolvedEnv, resolve_env
|
||||
from ...git_gate import GitGate
|
||||
from ...log import die
|
||||
from ...pipelock import PipelockProxy
|
||||
from ...supervise import Supervise
|
||||
from ...workspace import workspace_plan as resolve_workspace_plan
|
||||
from .. import BottleSpec
|
||||
from . import util as docker_mod
|
||||
from .bottle_plan import DockerBottlePlan
|
||||
from .bottle_state import (
|
||||
BottleMetadata,
|
||||
agent_state_dir,
|
||||
bottle_identity,
|
||||
clear_preserve_marker,
|
||||
egress_state_dir,
|
||||
git_gate_state_dir,
|
||||
per_bottle_dockerfile,
|
||||
per_bottle_dockerfile_path,
|
||||
per_bottle_image_tag,
|
||||
pipelock_state_dir,
|
||||
supervise_state_dir,
|
||||
write_metadata,
|
||||
)
|
||||
from .sidecar_bundle import sidecar_bundle_container_name
|
||||
|
||||
|
||||
def resolve_plan(
|
||||
spec: BottleSpec,
|
||||
*,
|
||||
stage_dir: Path,
|
||||
) -> DockerBottlePlan:
|
||||
"""Resolve Docker-specific names and write scratch files. Trusts
|
||||
that the agent and its skills/git-gate keys are present —
|
||||
validation already ran in the base class."""
|
||||
docker_mod.require_docker()
|
||||
|
||||
proxy = PipelockProxy()
|
||||
git_gate = GitGate()
|
||||
egress = Egress()
|
||||
supervise = Supervise()
|
||||
|
||||
manifest = spec.manifest
|
||||
agent = manifest.agents[spec.agent_name]
|
||||
bottle = manifest.bottle_for(spec.agent_name)
|
||||
provider = bottle.agent_provider
|
||||
provider_runtime = runtime_for(provider.template)
|
||||
guest_home = "/home/node"
|
||||
workspace_plan = resolve_workspace_plan(spec, guest_home=guest_home)
|
||||
|
||||
# PRD 0016 follow-up: identity, not bare slug. A fresh `start`
|
||||
# mints a random-suffixed identity (so parallel runs of the same
|
||||
# agent in the same cwd don't collide on container/network
|
||||
# names); a `resume` passes the recorded identity in via
|
||||
# spec.identity to continue an existing bottle's state.
|
||||
slug = spec.identity or bottle_identity(spec.agent_name)
|
||||
# Record the launch metadata so `cli.py resume <identity>` can
|
||||
# reconstruct the spec. Idempotent — re-writes on resume with a
|
||||
# refreshed started_at.
|
||||
write_metadata(BottleMetadata(
|
||||
identity=slug,
|
||||
agent_name=spec.agent_name,
|
||||
cwd=spec.user_cwd if spec.copy_cwd else "",
|
||||
copy_cwd=spec.copy_cwd,
|
||||
started_at=datetime.now(timezone.utc).isoformat(),
|
||||
compose_project=f"bot-bottle-{slug}",
|
||||
backend="docker",
|
||||
))
|
||||
# Clear any leftover preserve marker from a prior capability-block
|
||||
# so this fresh launch can be cleaned up at session-end unless
|
||||
# the agent triggers another capability-block.
|
||||
clear_preserve_marker(slug)
|
||||
|
||||
# PRD 0016 capability-block: if a per-bottle Dockerfile has been
|
||||
# written (via apply_capability_change), the base image becomes
|
||||
# per_bottle_image_tag(slug) built from that file. --cwd still
|
||||
# layers a derived image on top.
|
||||
dockerfile_path = ""
|
||||
if per_bottle_dockerfile(slug) is not None:
|
||||
image_default = per_bottle_image_tag(slug)
|
||||
dockerfile_path = str(per_bottle_dockerfile_path(slug))
|
||||
elif provider.dockerfile:
|
||||
image_default = f"bot-bottle-{provider.template}:{slug}"
|
||||
dockerfile_path = _resolve_manifest_dockerfile(provider.dockerfile, spec)
|
||||
elif provider_runtime.dockerfile:
|
||||
image_default = provider_runtime.image
|
||||
dockerfile_path = provider_runtime.dockerfile
|
||||
else:
|
||||
image_default = provider_runtime.image
|
||||
image = os.environ.get("BOT_BOTTLE_IMAGE", image_default)
|
||||
derived_image = ""
|
||||
runtime_image = image
|
||||
if spec.copy_cwd:
|
||||
derived_image = os.environ.get(
|
||||
"BOT_BOTTLE_DERIVED_IMAGE", f"bot-bottle-cwd:{slug}"
|
||||
)
|
||||
runtime_image = derived_image
|
||||
|
||||
default_container = f"bot-bottle-{slug}"
|
||||
pinned_container = os.environ.get("BOT_BOTTLE_CONTAINER", "")
|
||||
container_name_pinned = bool(pinned_container)
|
||||
if container_name_pinned:
|
||||
container_name = pinned_container
|
||||
if docker_mod.container_exists(container_name):
|
||||
die(
|
||||
f"container '{container_name}' already exists "
|
||||
f"(pinned via BOT_BOTTLE_CONTAINER). "
|
||||
f"Remove it with 'docker rm -f {container_name}' or unset the override."
|
||||
)
|
||||
else:
|
||||
container_name = ""
|
||||
for candidate in docker_mod.container_name_candidates(default_container):
|
||||
if not docker_mod.container_exists(candidate):
|
||||
container_name = candidate
|
||||
break
|
||||
if not container_name:
|
||||
die(
|
||||
f"could not find a free container name after "
|
||||
f"{default_container}-{docker_mod.MAX_CONTAINER_SUFFIX}; "
|
||||
f"clean up old containers with 'docker rm -f <name>'"
|
||||
)
|
||||
|
||||
# Probe the sidecar-bundle container name for an orphan from a
|
||||
# previous run. Otherwise a stale bundle surfaces as a
|
||||
# docker-create conflict deep inside launch() with no actionable
|
||||
# hint; failing fast here points at the cleanup command.
|
||||
bundle_name = sidecar_bundle_container_name(slug)
|
||||
if docker_mod.container_exists(bundle_name):
|
||||
die(
|
||||
f"sidecar bundle container '{bundle_name}' already exists. "
|
||||
f"This is an orphan from a previous run; clean it up with "
|
||||
f"'./cli.py cleanup' (or 'docker rm -f {bundle_name}') and "
|
||||
f"retry."
|
||||
)
|
||||
|
||||
# PRD 0018 chunk 2: prepare-time scratch files live under
|
||||
# ~/.bot-bottle/state/<slug>/<service>/ so chunk 3's compose
|
||||
# bind-mounts can point at stable paths. The state subdirs are
|
||||
# cleaned up by start.py's session-end teardown unless something
|
||||
# explicitly preserves the state dir (capability-block, crash).
|
||||
agent_dir = agent_state_dir(slug)
|
||||
agent_dir.mkdir(parents=True, exist_ok=True)
|
||||
env_file = agent_dir / "agent.env"
|
||||
prompt_file = agent_dir / "prompt.txt"
|
||||
prompt_file.write_text("")
|
||||
prompt_file.chmod(0o600)
|
||||
|
||||
git_gate_dir = git_gate_state_dir(slug)
|
||||
git_gate_dir.mkdir(parents=True, exist_ok=True)
|
||||
git_gate_plan = git_gate.prepare(bottle, slug, git_gate_dir)
|
||||
|
||||
resolved = resolve_env(manifest, spec.agent_name)
|
||||
# Everything that should reach the bottle by-name (so its value
|
||||
# never lands on argv or in env_file) goes into one dict. Nothing
|
||||
# mutates the host os.environ.
|
||||
forwarded_env: dict[str, str] = dict(resolved.forwarded)
|
||||
_write_env_file(resolved, env_file)
|
||||
prompt_file.write_text(agent.prompt)
|
||||
|
||||
use_runsc = docker_mod.runsc_available()
|
||||
agent_provision = agent_provision_plan(
|
||||
template=provider.template,
|
||||
dockerfile=dockerfile_path,
|
||||
state_dir=agent_dir,
|
||||
guest_home=guest_home,
|
||||
forward_host_credentials=provider.forward_host_credentials,
|
||||
auth_token=provider.auth_token,
|
||||
host_env=dict(os.environ),
|
||||
trusted_project_path=workspace_plan.workdir,
|
||||
)
|
||||
guest_env = dict(agent_provision.guest_env)
|
||||
for key, val in agent_provision.env_vars.items():
|
||||
guest_env.setdefault(key, val)
|
||||
agent_provision = replace(agent_provision, guest_env=guest_env)
|
||||
|
||||
pipelock_dir = pipelock_state_dir(slug)
|
||||
pipelock_dir.mkdir(parents=True, exist_ok=True)
|
||||
proxy_plan = proxy.prepare(
|
||||
bottle, slug, pipelock_dir, agent_provision.egress_routes,
|
||||
)
|
||||
|
||||
egress_dir = egress_state_dir(slug)
|
||||
egress_dir.mkdir(parents=True, exist_ok=True)
|
||||
egress_plan = egress.prepare(
|
||||
bottle, slug, egress_dir, agent_provision.egress_routes,
|
||||
)
|
||||
|
||||
supervise_plan = None
|
||||
if bottle.supervise:
|
||||
# Current Dockerfile for the agent image. Read from the repo
|
||||
# root; for `--cwd` derived images the base Dockerfile is what
|
||||
# the agent should propose changes against (the derived layer
|
||||
# is just a workspace copy).
|
||||
# (routes.yaml + pipelock allowlist used to land here too but
|
||||
# PRD 0017 chunk 3 moved them behind the
|
||||
# `list-egress-routes` MCP tool so the agent gets live
|
||||
# state rather than a launch-time snapshot.)
|
||||
supervise_dockerfile_path = (
|
||||
Path(dockerfile_path)
|
||||
if dockerfile_path
|
||||
else Path(__file__).resolve().parent.parent.parent.parent / "Dockerfile.claude"
|
||||
)
|
||||
dockerfile_content = (
|
||||
supervise_dockerfile_path.read_text(encoding="utf-8")
|
||||
if supervise_dockerfile_path.is_file()
|
||||
else ""
|
||||
)
|
||||
supervise_dir = supervise_state_dir(slug)
|
||||
supervise_dir.mkdir(parents=True, exist_ok=True)
|
||||
supervise_plan = supervise.prepare(
|
||||
slug, supervise_dir,
|
||||
dockerfile_content=dockerfile_content,
|
||||
)
|
||||
|
||||
return DockerBottlePlan(
|
||||
spec=spec,
|
||||
stage_dir=stage_dir,
|
||||
guest_home=guest_home,
|
||||
slug=slug,
|
||||
container_name=container_name,
|
||||
container_name_pinned=container_name_pinned,
|
||||
image=image,
|
||||
derived_image=derived_image,
|
||||
runtime_image=runtime_image,
|
||||
dockerfile_path=dockerfile_path,
|
||||
env_file=env_file,
|
||||
forwarded_env=forwarded_env,
|
||||
prompt_file=prompt_file,
|
||||
proxy_plan=proxy_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
use_runsc=use_runsc,
|
||||
agent_provision=agent_provision,
|
||||
workspace_plan=workspace_plan,
|
||||
)
|
||||
|
||||
|
||||
def _write_env_file(resolved: ResolvedEnv, env_file: Path) -> None:
|
||||
"""Serialize the literal portion of a ResolvedEnv into docker's
|
||||
`--env-file` syntax (NAME=VALUE per line, mode 600 since the file
|
||||
may carry verbatim values from the manifest). Forwarded names ride
|
||||
on the plan as a structured tuple instead."""
|
||||
env_lines: list[str] = []
|
||||
for name, value in resolved.literals.items():
|
||||
if "\n" in value:
|
||||
die(
|
||||
f"env entry {name} (literal) contains a newline; "
|
||||
f"docker --env-file cannot represent multi-line values."
|
||||
)
|
||||
env_lines.append(f"{name}={value}")
|
||||
env_file.write_text("\n".join(env_lines) + ("\n" if env_lines else ""))
|
||||
env_file.chmod(0o600)
|
||||
|
||||
|
||||
def _resolve_manifest_dockerfile(path_value: str, spec: BottleSpec) -> str:
|
||||
path = Path(os.path.expanduser(path_value))
|
||||
if not path.is_absolute():
|
||||
path = Path(spec.user_cwd) / path
|
||||
return str(path)
|
||||
@@ -2,10 +2,11 @@
|
||||
|
||||
Per PRD 0050 the per-provider provisioning steps (prompt, skills,
|
||||
declarative provision-plan apply, supervise MCP registration) live on
|
||||
the `AgentProvider` plugin under `bot_bottle/contrib/`. The modules
|
||||
left in this subpackage handle only the steps that are
|
||||
backend-specific:
|
||||
the `AgentProvider` plugin under `bot_bottle/contrib/`. CA and git
|
||||
provisioning also moved to the AgentProvider ABC (with Debian/node
|
||||
defaults); user plugins override them for non-standard images.
|
||||
|
||||
- ca.py — install per-bottle CA bundle into the guest trust store
|
||||
- git.py — copy host cwd `.git` into the guest when --cwd is used
|
||||
No modules remain in this subpackage — the directory is kept so that
|
||||
existing imports of `from .provision import ...` don't need updating
|
||||
if new backend-specific provisioners are added later.
|
||||
"""
|
||||
|
||||
@@ -1,51 +0,0 @@
|
||||
"""Install the per-bottle MITM CA into the agent container's trust
|
||||
store.
|
||||
|
||||
Post-PRD-0017 the CA depends on the agent's HTTP_PROXY target:
|
||||
|
||||
- Bottle declares `egress.routes[]` → agent's HTTP_PROXY
|
||||
points at egress; the cert the agent must trust is the
|
||||
one egress mints leaf certs with (the egress CA).
|
||||
- No egress routes → agent's HTTP_PROXY points straight at
|
||||
pipelock; the cert the agent must trust is pipelock's CA (the
|
||||
pre-cutover behavior).
|
||||
|
||||
By the time this provisioner runs, the corresponding `tls_init`
|
||||
helper has generated the chosen CA under `plan.stage_dir`, and the
|
||||
sidecar (pipelock or egress) is up referencing the
|
||||
in-container CA paths.
|
||||
|
||||
Cert lands on Debian's standard source path
|
||||
(`/usr/local/share/ca-certificates/`); `update-ca-certificates`
|
||||
rebuilds `/etc/ssl/certs/ca-certificates.crt`, which is what curl,
|
||||
Python `ssl`, and OpenSSL-based tools all read by default. The env
|
||||
trio set on the agent's `docker run` covers Node
|
||||
(`NODE_EXTRA_CA_CERTS`) and Python `requests` /
|
||||
`SSL_CERT_FILE`-honoring libraries that don't load the system
|
||||
bundle.
|
||||
|
||||
The fingerprint is computed via stdlib (`ssl.PEM_cert_to_DER_cert`
|
||||
+ `hashlib.sha256`) and logged once to stderr. The private key
|
||||
stays on the host (under `stage_dir`) until teardown wipes the
|
||||
stage dir; nothing in the agent ever sees it."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ... import Bottle
|
||||
from ...util import AGENT_CA_PATH, log_ca_fingerprint, select_ca_cert
|
||||
from ..bottle_plan import DockerBottlePlan
|
||||
|
||||
|
||||
def provision_ca(plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
"""Copy the agent-facing CA cert into the agent, rebuild the
|
||||
trust bundle, emit a one-line fingerprint log. Called from
|
||||
`BottleBackend.provision` after the agent container is up."""
|
||||
cert_host_path, label = select_ca_cert(plan.egress_plan, plan.proxy_plan)
|
||||
|
||||
bottle.cp_in(str(cert_host_path), AGENT_CA_PATH)
|
||||
bottle.exec(
|
||||
f"chmod 644 {AGENT_CA_PATH} && update-ca-certificates",
|
||||
user="root",
|
||||
)
|
||||
|
||||
log_ca_fingerprint(cert_host_path, label)
|
||||
@@ -1,106 +0,0 @@
|
||||
"""Git provisioning inside a running Docker bottle.
|
||||
|
||||
Three concerns, all about git in the agent:
|
||||
|
||||
1. If --cwd was passed AND the host cwd has a .git, copy that .git
|
||||
into the planned guest workspace so the agent operates on the
|
||||
user's repo.
|
||||
2. If the bottle declares `git` entries (PRD 0008), write a
|
||||
~/.gitconfig with insteadOf rules so every git operation
|
||||
against a declared upstream (push, fetch, clone, pull,
|
||||
ls-remote) transparently hits the per-agent git-gate. The
|
||||
gate mirrors the upstream in both directions, so URL
|
||||
rewriting is symmetric.
|
||||
3. If the bottle declares `git.user` (issue #86), set
|
||||
`git config --global user.{name,email}` inside the bottle so
|
||||
the agent's commits are attributed to that identity.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import shlex
|
||||
|
||||
from ....git_gate import GIT_GATE_HOSTNAME, git_gate_render_gitconfig
|
||||
from ....log import info
|
||||
from ... import Bottle
|
||||
from ..bottle_plan import DockerBottlePlan
|
||||
|
||||
|
||||
def provision_git(plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
"""Set up git inside the bottle. Runs all three subcases; each
|
||||
no-ops when its condition isn't met."""
|
||||
_provision_cwd_git(plan, bottle)
|
||||
_provision_git_gate_config(plan, bottle)
|
||||
_provision_git_user(plan, bottle)
|
||||
|
||||
|
||||
def _provision_cwd_git(plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
"""If --cwd was set and the host cwd has a .git directory, copy
|
||||
it into /home/node/workspace/.git and fix ownership. No-op
|
||||
otherwise."""
|
||||
workspace = plan.workspace_plan
|
||||
if not (workspace.enabled and workspace.copy_git and workspace.has_host_git_dir):
|
||||
return
|
||||
guest_workspace_git = f"{workspace.guest_path}/.git"
|
||||
host_git = str(workspace.host_path / ".git")
|
||||
info(f"copying {host_git} -> {bottle.name}:{guest_workspace_git}")
|
||||
bottle.cp_in(host_git, guest_workspace_git)
|
||||
bottle.exec(
|
||||
f"chown -R {shlex.quote(workspace.owner)} {shlex.quote(guest_workspace_git)}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
|
||||
def _provision_git_gate_config(plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
"""Write ~/.gitconfig in the bottle with the git-gate
|
||||
insteadOf rules. No-op when the bottle has no `git` entries."""
|
||||
manifest_bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
|
||||
if not manifest_bottle.git:
|
||||
return
|
||||
container_gitconfig = f"{plan.guest_home}/.gitconfig"
|
||||
|
||||
content = git_gate_render_gitconfig(manifest_bottle.git, GIT_GATE_HOSTNAME)
|
||||
config_file = plan.stage_dir / "agent_gitconfig"
|
||||
config_file.write_text(content)
|
||||
config_file.chmod(0o600)
|
||||
|
||||
info(f"writing {container_gitconfig} with {len(manifest_bottle.git)} insteadOf rule(s)")
|
||||
bottle.cp_in(str(config_file), container_gitconfig)
|
||||
bottle.exec(
|
||||
f"chown node:node {shlex.quote(container_gitconfig)} && "
|
||||
f"chmod 644 {shlex.quote(container_gitconfig)}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
|
||||
def _provision_git_user(plan: DockerBottlePlan, bottle: Bottle) -> None:
|
||||
"""Apply `git config --global user.{name,email}` inside the
|
||||
bottle so the agent's commits are attributed to the operator-
|
||||
chosen identity instead of the agent image's default
|
||||
(which is no user — git would refuse to commit at all
|
||||
until the agent ran its own `git config`).
|
||||
|
||||
Runs as the `node` user so `--global` lands in
|
||||
`/home/node/.gitconfig` (matching the existing
|
||||
`_provision_git_gate_config` write location). No-op when the
|
||||
bottle didn't declare `git.user`.
|
||||
|
||||
Each field set independently — name-only or email-only
|
||||
configs only run the `git config` line for the field
|
||||
present."""
|
||||
manifest_bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
|
||||
gu = manifest_bottle.git_user
|
||||
if gu.is_empty():
|
||||
return
|
||||
if gu.name:
|
||||
info(f"git config --global user.name = {gu.name!r}")
|
||||
bottle.exec(
|
||||
f"git config --global user.name {shlex.quote(gu.name)}",
|
||||
user="node",
|
||||
)
|
||||
if gu.email:
|
||||
info(f"git config --global user.email = {gu.email!r}")
|
||||
bottle.exec(
|
||||
f"git config --global user.email {shlex.quote(gu.email)}",
|
||||
user="node",
|
||||
)
|
||||
@@ -0,0 +1,62 @@
|
||||
"""Prepare step for the Docker bottle backend.
|
||||
|
||||
`resolve_plan` does all host-side resolution (image and container
|
||||
names, prompt-file, proxy plan, runtime detection) and returns a
|
||||
frozen DockerBottlePlan. No Docker resources are created; the only
|
||||
side effects are scratch files under `stage_dir` and a probe of
|
||||
`docker info`. Cross-backend host-side validation has already run
|
||||
via the base class's `prepare` template before this is called.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
from . import util as docker_mod
|
||||
from .bottle_plan import DockerBottlePlan
|
||||
from .. import BottleSpec
|
||||
from ...env import ResolvedEnv
|
||||
from ...agent_provider import AgentProvisionPlan
|
||||
from ...egress import EgressPlan
|
||||
from ...manifest import Manifest
|
||||
from ...supervise import SupervisePlan
|
||||
from ...git_gate import GitGatePlan
|
||||
|
||||
def preflight() -> None:
|
||||
docker_mod.require_docker()
|
||||
|
||||
|
||||
def build_guest_env(resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
return dict(resolved_env.literals)
|
||||
|
||||
|
||||
def resolve_plan(
|
||||
spec: BottleSpec,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
git_gate_plan: GitGatePlan,
|
||||
stage_dir: Path,
|
||||
) -> DockerBottlePlan:
|
||||
"""Resolve Docker-specific names and write scratch files. Trusts
|
||||
that the agent and its skills/git-gate keys are present —
|
||||
validation already ran in the base class."""
|
||||
|
||||
# ==== docker specific setup ====
|
||||
use_runsc = docker_mod.runsc_available()
|
||||
|
||||
return DockerBottlePlan(
|
||||
spec=spec,
|
||||
manifest=manifest,
|
||||
stage_dir=stage_dir,
|
||||
slug=slug,
|
||||
forwarded_env=dict(resolved_env.forwarded),
|
||||
git_gate_plan=git_gate_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
use_runsc=use_runsc,
|
||||
agent_provision=agent_provision_plan,
|
||||
)
|
||||
@@ -2,10 +2,10 @@
|
||||
(PRD 0024).
|
||||
|
||||
The bundle image (built by Dockerfile.sidecars, PRD 0024 chunk 1)
|
||||
runs pipelock + egress + git-gate + supervise as one container
|
||||
per bottle under a small Python init supervisor. As of chunk 5
|
||||
the bundle is the only shape — the legacy four-sidecar topology
|
||||
and its `BOT_BOTTLE_SIDECAR_BUNDLE` feature flag are gone."""
|
||||
runs egress + git-gate + supervise as one container per bottle
|
||||
under a small Python init supervisor. As of chunk 5 the bundle
|
||||
is the only shape — the legacy four-sidecar topology and its
|
||||
`BOT_BOTTLE_SIDECAR_BUNDLE` feature flag are gone."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
@@ -14,8 +14,7 @@ import os
|
||||
|
||||
# Bundle image. Defaults to a built-locally tag (built from the
|
||||
# repo's Dockerfile.sidecars via compose `build:`). Operators
|
||||
# pinning to a published digest can override via env, matching
|
||||
# the existing `BOT_BOTTLE_PIPELOCK_IMAGE` shape.
|
||||
# pinning to a published digest can override via env.
|
||||
SIDECAR_BUNDLE_IMAGE = os.environ.get(
|
||||
"BOT_BOTTLE_SIDECAR_IMAGE",
|
||||
"bot-bottle-sidecars:latest",
|
||||
|
||||
@@ -7,11 +7,10 @@ from __future__ import annotations
|
||||
import re
|
||||
import shutil
|
||||
import subprocess
|
||||
import tempfile
|
||||
from typing import Iterable, Iterator
|
||||
|
||||
from ...log import die, info
|
||||
from ...workspace import WorkspacePlan
|
||||
# from ...workspace import WorkspacePlan
|
||||
|
||||
|
||||
# Cap on the suffix the container-name conflict logic will try before
|
||||
@@ -118,39 +117,39 @@ def build_image(ref: str, context: str, *, dockerfile: str = "") -> None:
|
||||
subprocess.run(args, check=True)
|
||||
|
||||
|
||||
def build_image_with_cwd(
|
||||
derived: str,
|
||||
base: str,
|
||||
workspace: WorkspacePlan,
|
||||
) -> None:
|
||||
"""Build a thin derived image that copies the workspace into
|
||||
the plan's guest path and sets the plan's workdir."""
|
||||
import os
|
||||
|
||||
cwd = str(workspace.host_path)
|
||||
if not os.path.isdir(cwd):
|
||||
die(f"cwd not found at {cwd}")
|
||||
info(f"building image {derived} from {base} with {cwd} -> {workspace.guest_path}")
|
||||
with tempfile.TemporaryDirectory(prefix="bot-bottle-cwd.") as tmp:
|
||||
context_dir = os.path.join(tmp, "context")
|
||||
staged_workspace = os.path.join(context_dir, "workspace")
|
||||
shutil.copytree(
|
||||
cwd,
|
||||
staged_workspace,
|
||||
symlinks=True,
|
||||
ignore=shutil.ignore_patterns(".git"),
|
||||
)
|
||||
dockerfile = (
|
||||
f"FROM {base}\n"
|
||||
f"COPY --chown=node:node workspace/. {workspace.guest_path}\n"
|
||||
f"WORKDIR {workspace.workdir}\n"
|
||||
)
|
||||
subprocess.run(
|
||||
["docker", "build", "-t", derived, "-f", "-", context_dir],
|
||||
input=dockerfile,
|
||||
text=True,
|
||||
check=True,
|
||||
)
|
||||
# def build_image_with_cwd(
|
||||
# derived: str,
|
||||
# base: str,
|
||||
# workspace: "WorkspacePlan",
|
||||
# ) -> None:
|
||||
# """Build a thin derived image that copies the workspace into
|
||||
# the plan's guest path and sets the plan's workdir."""
|
||||
# import os
|
||||
#
|
||||
# cwd = str(workspace.host_path)
|
||||
# if not os.path.isdir(cwd):
|
||||
# die(f"cwd not found at {cwd}")
|
||||
# info(f"building image {derived} from {base} with {cwd} -> {workspace.guest_path}")
|
||||
# with tempfile.TemporaryDirectory(prefix="bot-bottle-cwd.") as tmp:
|
||||
# context_dir = os.path.join(tmp, "context")
|
||||
# staged_workspace = os.path.join(context_dir, "workspace")
|
||||
# shutil.copytree(
|
||||
# cwd,
|
||||
# staged_workspace,
|
||||
# symlinks=True,
|
||||
# ignore=shutil.ignore_patterns(".git"),
|
||||
# )
|
||||
# dockerfile = (
|
||||
# f"FROM {base}\n"
|
||||
# f"COPY --chown=node:node workspace/. {workspace.guest_path}\n"
|
||||
# f"WORKDIR {workspace.workdir}\n"
|
||||
# )
|
||||
# subprocess.run(
|
||||
# ["docker", "build", "-t", derived, "-f", "-", context_dir],
|
||||
# input=dockerfile,
|
||||
# text=True,
|
||||
# check=True,
|
||||
# )
|
||||
|
||||
|
||||
def image_id(ref: str) -> str:
|
||||
|
||||
@@ -0,0 +1,10 @@
|
||||
"""macOS Apple Container backend.
|
||||
|
||||
Selectable via `BOT_BOTTLE_BACKEND=macos-container`. This package owns
|
||||
the Apple `container` CLI integration; launch remains gated until the
|
||||
sidecar network enforcement shape is implemented.
|
||||
"""
|
||||
|
||||
from .backend import MacosContainerBottleBackend
|
||||
|
||||
__all__ = ["MacosContainerBottleBackend"]
|
||||
@@ -0,0 +1,87 @@
|
||||
"""MacosContainerBottleBackend — Apple Container implementation."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from contextlib import contextmanager
|
||||
from pathlib import Path
|
||||
from typing import Generator, Sequence
|
||||
|
||||
from ...agent_provider import AgentProvisionPlan
|
||||
from ...egress import EgressPlan
|
||||
from ...env import ResolvedEnv
|
||||
from ...git_gate import GitGatePlan
|
||||
from ...supervise import SupervisePlan
|
||||
from ...manifest import Manifest
|
||||
from .. import ActiveAgent, BottleBackend, BottleSpec
|
||||
from . import cleanup as _cleanup
|
||||
from . import enumerate as _enumerate
|
||||
from . import launch as _launch
|
||||
from . import resolve_plan as _resolve_plan
|
||||
from . import util as _container
|
||||
from .bottle import MacosContainerBottle
|
||||
from .bottle_cleanup_plan import MacosContainerBottleCleanupPlan
|
||||
from .bottle_plan import MacosContainerBottlePlan
|
||||
|
||||
|
||||
class MacosContainerBottleBackend(
|
||||
BottleBackend["MacosContainerBottlePlan", "MacosContainerBottleCleanupPlan"]
|
||||
):
|
||||
"""Apple Container backend. Selected by
|
||||
`BOT_BOTTLE_BACKEND=macos-container` or
|
||||
`--backend=macos-container`."""
|
||||
|
||||
name = "macos-container"
|
||||
|
||||
@classmethod
|
||||
def is_available(cls) -> bool:
|
||||
return _container.is_available()
|
||||
|
||||
def _preflight(self) -> None:
|
||||
_resolve_plan.preflight()
|
||||
|
||||
def _build_guest_env(self, resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
return _resolve_plan.build_guest_env(resolved_env)
|
||||
|
||||
def _resolve_plan(
|
||||
self,
|
||||
spec: BottleSpec,
|
||||
*,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
git_gate_plan: GitGatePlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
stage_dir: Path,
|
||||
) -> MacosContainerBottlePlan:
|
||||
return _resolve_plan.resolve_plan(
|
||||
spec,
|
||||
manifest=manifest,
|
||||
slug=slug,
|
||||
resolved_env=resolved_env,
|
||||
agent_provision_plan=agent_provision_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
stage_dir=stage_dir,
|
||||
)
|
||||
|
||||
@contextmanager
|
||||
def launch(
|
||||
self, plan: MacosContainerBottlePlan
|
||||
) -> Generator[MacosContainerBottle, None, None]:
|
||||
with _launch.launch(plan, provision=self.provision) as bottle:
|
||||
yield bottle
|
||||
|
||||
def prepare_cleanup(self) -> MacosContainerBottleCleanupPlan:
|
||||
return _cleanup.prepare_cleanup()
|
||||
|
||||
def cleanup(self, plan: MacosContainerBottleCleanupPlan) -> None:
|
||||
_cleanup.cleanup(plan)
|
||||
|
||||
def enumerate_active(self) -> Sequence[ActiveAgent]:
|
||||
return _enumerate.enumerate_active()
|
||||
|
||||
def supervise_mcp_url(self, plan: MacosContainerBottlePlan) -> str:
|
||||
return plan.agent_supervise_url
|
||||
@@ -0,0 +1,91 @@
|
||||
"""Bottle handle for Apple's `container` CLI."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import subprocess
|
||||
from typing import Callable, cast
|
||||
|
||||
from ...agent_provider import PromptMode, prompt_args
|
||||
from .. import Bottle, ExecResult
|
||||
from ..terminal import exec_shell_script
|
||||
|
||||
|
||||
class MacosContainerBottle(Bottle):
|
||||
def __init__(
|
||||
self,
|
||||
container: str,
|
||||
teardown: Callable[[], None],
|
||||
prompt_path_in_container: str | None,
|
||||
*,
|
||||
agent_command: str = "claude",
|
||||
agent_prompt_mode: PromptMode = "append_file",
|
||||
agent_provider_template: str = "claude",
|
||||
terminal_title: str = "",
|
||||
terminal_color: str = "",
|
||||
agent_workdir: str = "/home/node",
|
||||
):
|
||||
self.name = container
|
||||
self._teardown = teardown
|
||||
self.prompt_path = prompt_path_in_container
|
||||
self._agent_prompt_mode = agent_prompt_mode
|
||||
self.agent_command = agent_command
|
||||
self.terminal_title = terminal_title
|
||||
self.terminal_color = terminal_color
|
||||
self.agent_provider_template = agent_provider_template
|
||||
self.agent_workdir = agent_workdir
|
||||
self._closed = False
|
||||
|
||||
def agent_argv(self, argv: list[str], *, tty: bool = True) -> list[str]:
|
||||
full_argv = list(argv)
|
||||
full_argv.extend(
|
||||
prompt_args(
|
||||
cast(PromptMode, self._agent_prompt_mode),
|
||||
self.prompt_path,
|
||||
argv=full_argv,
|
||||
)
|
||||
)
|
||||
cmd = ["container", "exec"]
|
||||
if tty:
|
||||
cmd.extend(["--interactive", "--tty"])
|
||||
if self.agent_workdir and self.agent_workdir != "/home/node":
|
||||
cmd.extend(["--workdir", self.agent_workdir])
|
||||
cmd.extend([self.name, self.agent_command, *full_argv])
|
||||
return cmd
|
||||
|
||||
def exec_agent(self, argv: list[str], *, tty: bool = True) -> int:
|
||||
agent_argv = self.agent_argv(argv, tty=tty)
|
||||
script = (
|
||||
exec_shell_script(agent_argv, self.terminal_title, self.terminal_color)
|
||||
if tty else None
|
||||
)
|
||||
if script is None:
|
||||
return subprocess.run(agent_argv, check=False).returncode
|
||||
return subprocess.run(["sh", "-lc", script], check=False).returncode
|
||||
|
||||
def exec(self, script: str, *, user: str = "node") -> ExecResult:
|
||||
result = subprocess.run(
|
||||
["container", "exec", "--user", user, "--interactive",
|
||||
self.name, "sh", "-s"],
|
||||
input=script,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
return ExecResult(
|
||||
returncode=result.returncode,
|
||||
stdout=result.stdout,
|
||||
stderr=result.stderr,
|
||||
)
|
||||
|
||||
def cp_in(self, host_path: str, container_path: str) -> None:
|
||||
subprocess.run(
|
||||
["container", "cp", host_path, f"{self.name}:{container_path}"],
|
||||
stdout=subprocess.DEVNULL,
|
||||
check=True,
|
||||
)
|
||||
|
||||
def close(self) -> None:
|
||||
if self._closed:
|
||||
return
|
||||
self._closed = True
|
||||
self._teardown()
|
||||
@@ -0,0 +1,27 @@
|
||||
"""Cleanup plan for the macOS Apple Container backend."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
|
||||
from ...log import info
|
||||
from .. import BottleCleanupPlan
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class MacosContainerBottleCleanupPlan(BottleCleanupPlan):
|
||||
containers: tuple[str, ...] = ()
|
||||
networks: tuple[str, ...] = ()
|
||||
|
||||
def print(self) -> None:
|
||||
if not self.containers and not self.networks:
|
||||
info("macos-container cleanup: nothing to remove")
|
||||
return
|
||||
for name in self.containers:
|
||||
info(f"macos-container container: {name}")
|
||||
for name in self.networks:
|
||||
info(f"macos-container network: {name}")
|
||||
|
||||
@property
|
||||
def empty(self) -> bool:
|
||||
return not self.containers and not self.networks
|
||||
@@ -0,0 +1,58 @@
|
||||
"""Plan type for the macOS Apple Container backend."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import PromptMode
|
||||
from .. import BottlePlan
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class MacosContainerBottlePlan(BottlePlan):
|
||||
slug: str
|
||||
forwarded_env: dict[str, str] = field(repr=False)
|
||||
agent_proxy_url: str = ""
|
||||
agent_git_gate_url: str = ""
|
||||
agent_supervise_url: str = ""
|
||||
|
||||
@property
|
||||
def container_name(self) -> str:
|
||||
return self.agent_provision.instance_name
|
||||
|
||||
@property
|
||||
def image(self) -> str:
|
||||
return self.agent_provision.image
|
||||
|
||||
@property
|
||||
def dockerfile_path(self) -> str:
|
||||
return self.agent_provision.dockerfile
|
||||
|
||||
@property
|
||||
def prompt_file(self) -> Path:
|
||||
return self.agent_provision.prompt_file
|
||||
|
||||
@property
|
||||
def agent_command(self) -> str:
|
||||
return self.agent_provision.command
|
||||
|
||||
@property
|
||||
def agent_prompt_mode(self) -> PromptMode:
|
||||
return self.agent_provision.prompt_mode
|
||||
|
||||
@property
|
||||
def agent_provider_template(self) -> str:
|
||||
return self.agent_provision.template
|
||||
|
||||
@property
|
||||
def git_gate_insteadof_host(self) -> str:
|
||||
if self.agent_git_gate_url.startswith("http://"):
|
||||
return self.agent_git_gate_url.removeprefix("http://").rstrip("/")
|
||||
return super().git_gate_insteadof_host
|
||||
|
||||
@property
|
||||
def git_gate_insteadof_scheme(self) -> str:
|
||||
if self.agent_git_gate_url.startswith("http://"):
|
||||
return "http"
|
||||
return super().git_gate_insteadof_scheme
|
||||
@@ -0,0 +1,70 @@
|
||||
"""Cleanup for the macOS Apple Container backend."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import subprocess
|
||||
|
||||
from ...log import info, warn
|
||||
from . import util as container_mod
|
||||
from .bottle_cleanup_plan import MacosContainerBottleCleanupPlan
|
||||
|
||||
_PREFIX = "bot-bottle-"
|
||||
_BUNDLE_PREFIX = "bot-bottle-sidecars-"
|
||||
|
||||
|
||||
def _list_prefixed_containers() -> list[str]:
|
||||
result = subprocess.run(
|
||||
["container", "list", "--all", "--quiet"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
warn(f"container list failed: {result.stderr.strip()}")
|
||||
return []
|
||||
return sorted(
|
||||
name for name in (line.strip() for line in result.stdout.splitlines())
|
||||
if name.startswith(_PREFIX) or name.startswith(_BUNDLE_PREFIX)
|
||||
)
|
||||
|
||||
|
||||
def _list_prefixed_networks() -> list[str]:
|
||||
result = subprocess.run(
|
||||
["container", "network", "list", "--quiet"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return []
|
||||
return sorted(
|
||||
name for name in (line.strip() for line in result.stdout.splitlines())
|
||||
if name.startswith(_PREFIX)
|
||||
)
|
||||
|
||||
|
||||
def prepare_cleanup() -> MacosContainerBottleCleanupPlan:
|
||||
container_mod.require_container()
|
||||
return MacosContainerBottleCleanupPlan(
|
||||
containers=tuple(_list_prefixed_containers()),
|
||||
networks=tuple(_list_prefixed_networks()),
|
||||
)
|
||||
|
||||
|
||||
def cleanup(plan: MacosContainerBottleCleanupPlan) -> None:
|
||||
for name in plan.containers:
|
||||
info(f"container delete --force {name}")
|
||||
subprocess.run(
|
||||
["container", "delete", "--force", name],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
for name in plan.networks:
|
||||
info(f"container network delete {name}")
|
||||
subprocess.run(
|
||||
["container", "network", "delete", name],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
@@ -0,0 +1,40 @@
|
||||
"""Active-agent enumeration for the macOS Apple Container backend."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import subprocess
|
||||
|
||||
from ...bottle_state import read_metadata
|
||||
from .. import ActiveAgent
|
||||
|
||||
_PREFIX = "bot-bottle-"
|
||||
_SIDECAR_PREFIX = "bot-bottle-sidecars-"
|
||||
|
||||
|
||||
def enumerate_active() -> list[ActiveAgent]:
|
||||
result = subprocess.run(
|
||||
["container", "list", "--quiet"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return []
|
||||
out: list[ActiveAgent] = []
|
||||
for name in sorted(line.strip() for line in result.stdout.splitlines()):
|
||||
if not name.startswith(_PREFIX):
|
||||
continue
|
||||
if name.startswith(_SIDECAR_PREFIX):
|
||||
continue
|
||||
slug = name[len(_PREFIX):]
|
||||
metadata = read_metadata(slug)
|
||||
out.append(ActiveAgent(
|
||||
backend_name="macos-container",
|
||||
slug=slug,
|
||||
agent_name=metadata.agent_name if metadata else "?",
|
||||
started_at=metadata.started_at if metadata else "",
|
||||
services=(),
|
||||
label=metadata.label if metadata else "",
|
||||
color=metadata.color if metadata else "",
|
||||
))
|
||||
return out
|
||||
@@ -0,0 +1,426 @@
|
||||
"""Launch flow for the macOS Apple Container backend.
|
||||
|
||||
This backend keeps the explicit proxy-env enforcement model for v1:
|
||||
the agent container is attached only to a host-only Apple Container
|
||||
network, while the sidecar bundle is attached to a NAT network first
|
||||
and the host-only network second. The sidecar's host-only IP is
|
||||
discovered from `container inspect` and stamped into the agent's
|
||||
HTTP_PROXY / HTTPS_PROXY env vars.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import dataclasses
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
from contextlib import ExitStack, contextmanager
|
||||
from pathlib import Path
|
||||
from typing import Callable, Generator
|
||||
|
||||
from ...bottle_state import egress_state_dir, git_gate_state_dir
|
||||
from ...egress import EGRESS_ROUTES_IN_CONTAINER, egress_resolve_token_values
|
||||
from ...git_gate import revoke_git_gate_provisioned_keys
|
||||
from ...log import die, info, warn
|
||||
from ...supervise import QUEUE_DIR_IN_CONTAINER, SUPERVISE_PORT
|
||||
from ...util import expand_tilde
|
||||
from ..docker.egress import EGRESS_CA_IN_CONTAINER, EGRESS_PORT
|
||||
from ..docker.git_gate import (
|
||||
GIT_GATE_ACCESS_HOOK_IN_CONTAINER,
|
||||
GIT_GATE_CREDS_DIR_IN_CONTAINER,
|
||||
GIT_GATE_ENTRYPOINT_IN_CONTAINER,
|
||||
GIT_GATE_HOOK_IN_CONTAINER,
|
||||
)
|
||||
from ..docker.sidecar_bundle import (
|
||||
SIDECAR_BUNDLE_DOCKERFILE,
|
||||
SIDECAR_BUNDLE_IMAGE,
|
||||
)
|
||||
from ..docker.egress import egress_tls_init
|
||||
from ..util import AGENT_CA_BUNDLE, AGENT_CA_PATH
|
||||
from . import util as container_mod
|
||||
from .bottle import MacosContainerBottle
|
||||
from .bottle_plan import MacosContainerBottlePlan
|
||||
|
||||
|
||||
_REPO_DIR = str(Path(__file__).resolve().parent.parent.parent.parent)
|
||||
_AGENT_SLEEP_SECONDS = "2147483647"
|
||||
_GIT_HTTP_PORT = 9420
|
||||
_GIT_GATE_READY_FILE = "/run/git-gate/ready"
|
||||
|
||||
|
||||
def internal_network_name(slug: str) -> str:
|
||||
return f"bot-bottle-net-{slug}"
|
||||
|
||||
|
||||
def egress_network_name(slug: str) -> str:
|
||||
return f"bot-bottle-egress-{slug}"
|
||||
|
||||
|
||||
def sidecar_container_name(slug: str) -> str:
|
||||
return f"bot-bottle-sidecars-{slug}"
|
||||
|
||||
|
||||
@contextmanager
|
||||
def launch(
|
||||
plan: MacosContainerBottlePlan,
|
||||
*,
|
||||
provision: Callable[[MacosContainerBottlePlan, "MacosContainerBottle"], str | None],
|
||||
) -> Generator[MacosContainerBottle, None, None]:
|
||||
"""Build, run, provision, and yield an Apple Container bottle."""
|
||||
stack = ExitStack()
|
||||
bottle_for_revoke = plan.manifest.bottle
|
||||
git_gate_dir_for_revoke = git_gate_state_dir(plan.slug)
|
||||
|
||||
def teardown() -> None:
|
||||
teardown_exc: BaseException | None = None
|
||||
try:
|
||||
stack.close()
|
||||
except BaseException as exc: # noqa: W0718 - teardown must continue
|
||||
teardown_exc = exc
|
||||
warn(f"macos-container teardown failed: {exc!r}")
|
||||
revoke_git_gate_provisioned_keys(bottle_for_revoke, git_gate_dir_for_revoke)
|
||||
if teardown_exc is not None:
|
||||
raise teardown_exc
|
||||
|
||||
try:
|
||||
plan = _mint_certs(plan)
|
||||
_build_images(plan)
|
||||
|
||||
internal_network = internal_network_name(plan.slug)
|
||||
egress_network = egress_network_name(plan.slug)
|
||||
_create_networks(internal_network, egress_network, stack)
|
||||
|
||||
sidecar_name = sidecar_container_name(plan.slug)
|
||||
container_mod.force_remove_container(sidecar_name)
|
||||
_start_sidecar_bundle(plan, sidecar_name, internal_network, egress_network)
|
||||
stack.callback(container_mod.force_remove_container, sidecar_name)
|
||||
_stage_git_gate(plan, sidecar_name)
|
||||
|
||||
sidecar_ip = container_mod.container_ipv4_on_network(
|
||||
sidecar_name, internal_network,
|
||||
)
|
||||
plan = _stamp_agent_urls(plan, sidecar_ip)
|
||||
|
||||
container_mod.force_remove_container(plan.container_name)
|
||||
_start_agent(plan, internal_network, sidecar_ip)
|
||||
stack.callback(container_mod.force_remove_container, plan.container_name)
|
||||
|
||||
bottle = MacosContainerBottle(
|
||||
plan.container_name,
|
||||
teardown,
|
||||
None,
|
||||
agent_command=plan.agent_command,
|
||||
agent_prompt_mode=plan.agent_prompt_mode,
|
||||
agent_provider_template=plan.agent_provider_template,
|
||||
terminal_title=plan.spec.label or plan.spec.agent_name,
|
||||
terminal_color=plan.spec.color,
|
||||
agent_workdir=plan.workspace_plan.workdir,
|
||||
)
|
||||
bottle.prompt_path = provision(plan, bottle)
|
||||
|
||||
yield bottle
|
||||
finally:
|
||||
teardown()
|
||||
|
||||
|
||||
def _mint_certs(plan: MacosContainerBottlePlan) -> MacosContainerBottlePlan:
|
||||
egress_ca_host, egress_ca_cert_only = egress_tls_init(
|
||||
egress_state_dir(plan.slug),
|
||||
)
|
||||
egress_plan = dataclasses.replace(
|
||||
plan.egress_plan,
|
||||
mitmproxy_ca_host_path=egress_ca_host,
|
||||
mitmproxy_ca_cert_only_host_path=egress_ca_cert_only,
|
||||
)
|
||||
return dataclasses.replace(plan, egress_plan=egress_plan)
|
||||
|
||||
|
||||
def _build_images(plan: MacosContainerBottlePlan) -> None:
|
||||
container_mod.build_image(
|
||||
SIDECAR_BUNDLE_IMAGE,
|
||||
_REPO_DIR,
|
||||
dockerfile=SIDECAR_BUNDLE_DOCKERFILE,
|
||||
)
|
||||
container_mod.build_image(
|
||||
plan.image,
|
||||
_REPO_DIR,
|
||||
dockerfile=plan.dockerfile_path,
|
||||
)
|
||||
|
||||
|
||||
def _create_networks(
|
||||
internal_network: str,
|
||||
egress_network: str,
|
||||
stack: ExitStack,
|
||||
) -> None:
|
||||
container_mod.create_network(internal_network, internal=True)
|
||||
stack.callback(container_mod.remove_network, internal_network)
|
||||
container_mod.create_network(egress_network)
|
||||
stack.callback(container_mod.remove_network, egress_network)
|
||||
|
||||
|
||||
def _start_sidecar_bundle(
|
||||
plan: MacosContainerBottlePlan,
|
||||
sidecar_name: str,
|
||||
internal_network: str,
|
||||
egress_network: str,
|
||||
) -> None:
|
||||
argv = _sidecar_run_argv(plan, sidecar_name, internal_network, egress_network)
|
||||
effective_env = {**dict(os.environ), **plan.agent_provision.provisioned_env}
|
||||
token_values = egress_resolve_token_values(
|
||||
plan.egress_plan.token_env_map, effective_env,
|
||||
)
|
||||
env = {**os.environ, **token_values}
|
||||
info(f"container run sidecar bundle {sidecar_name}")
|
||||
result = subprocess.run(
|
||||
argv, capture_output=True, text=True, env=env, check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
die(
|
||||
f"container run for sidecar bundle {sidecar_name} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
|
||||
|
||||
def _start_agent(
|
||||
plan: MacosContainerBottlePlan,
|
||||
internal_network: str,
|
||||
sidecar_ip: str,
|
||||
) -> None:
|
||||
argv = _agent_run_argv(plan, internal_network, sidecar_ip)
|
||||
env = {
|
||||
**os.environ,
|
||||
**plan.forwarded_env,
|
||||
}
|
||||
info(f"container run agent {plan.container_name}")
|
||||
result = subprocess.run(
|
||||
argv, capture_output=True, text=True, env=env, check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
die(
|
||||
f"container run for agent {plan.container_name} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
|
||||
|
||||
def _stamp_agent_urls(
|
||||
plan: MacosContainerBottlePlan,
|
||||
sidecar_ip: str,
|
||||
) -> MacosContainerBottlePlan:
|
||||
proxy_url = f"http://{sidecar_ip}:{EGRESS_PORT}"
|
||||
supervise_url = ""
|
||||
if plan.supervise_plan is not None:
|
||||
supervise_url = f"http://{sidecar_ip}:{SUPERVISE_PORT}/"
|
||||
git_gate_url = ""
|
||||
if plan.git_gate_plan.upstreams:
|
||||
git_gate_url = f"http://{sidecar_ip}:{_GIT_HTTP_PORT}"
|
||||
return dataclasses.replace(
|
||||
plan,
|
||||
agent_proxy_url=proxy_url,
|
||||
agent_git_gate_url=git_gate_url,
|
||||
agent_supervise_url=supervise_url,
|
||||
)
|
||||
|
||||
|
||||
def _stage_git_gate(plan: MacosContainerBottlePlan, sidecar_name: str) -> None:
|
||||
gp = plan.git_gate_plan
|
||||
if not gp.upstreams:
|
||||
return
|
||||
|
||||
container_mod.exec_container(
|
||||
sidecar_name,
|
||||
[
|
||||
"mkdir",
|
||||
"-p",
|
||||
str(Path(GIT_GATE_HOOK_IN_CONTAINER).parent),
|
||||
GIT_GATE_CREDS_DIR_IN_CONTAINER,
|
||||
"/git",
|
||||
str(Path(_GIT_GATE_READY_FILE).parent),
|
||||
],
|
||||
)
|
||||
|
||||
for host_path, container_path in _git_gate_files(plan):
|
||||
container_mod.copy_into_container(
|
||||
sidecar_name, host_path, container_path,
|
||||
)
|
||||
|
||||
container_mod.exec_container(
|
||||
sidecar_name,
|
||||
[
|
||||
"sh",
|
||||
"-c",
|
||||
"chmod 755 "
|
||||
f"{GIT_GATE_ENTRYPOINT_IN_CONTAINER} "
|
||||
f"{GIT_GATE_HOOK_IN_CONTAINER} "
|
||||
f"{GIT_GATE_ACCESS_HOOK_IN_CONTAINER} && "
|
||||
f"chmod 600 {GIT_GATE_CREDS_DIR_IN_CONTAINER}/* && "
|
||||
f"touch {_GIT_GATE_READY_FILE}",
|
||||
],
|
||||
)
|
||||
|
||||
|
||||
def _git_gate_files(
|
||||
plan: MacosContainerBottlePlan,
|
||||
) -> tuple[tuple[str, str], ...]:
|
||||
gp = plan.git_gate_plan
|
||||
files: list[tuple[str, str]] = [
|
||||
(str(gp.entrypoint_script), GIT_GATE_ENTRYPOINT_IN_CONTAINER),
|
||||
(str(gp.hook_script), GIT_GATE_HOOK_IN_CONTAINER),
|
||||
(str(gp.access_hook_script), GIT_GATE_ACCESS_HOOK_IN_CONTAINER),
|
||||
]
|
||||
for upstream in gp.upstreams:
|
||||
files.append((
|
||||
expand_tilde(upstream.identity_file),
|
||||
f"{GIT_GATE_CREDS_DIR_IN_CONTAINER}/{upstream.name}-key",
|
||||
))
|
||||
if upstream.known_hosts_file:
|
||||
files.append((
|
||||
str(upstream.known_hosts_file),
|
||||
f"{GIT_GATE_CREDS_DIR_IN_CONTAINER}/{upstream.name}-known_hosts",
|
||||
))
|
||||
return tuple(files)
|
||||
|
||||
|
||||
def _sidecar_run_argv(
|
||||
plan: MacosContainerBottlePlan,
|
||||
sidecar_name: str,
|
||||
internal_network: str,
|
||||
egress_network: str,
|
||||
) -> list[str]:
|
||||
argv = [
|
||||
"container", "run",
|
||||
"--name", sidecar_name,
|
||||
"--detach",
|
||||
"--rm",
|
||||
"--network", egress_network,
|
||||
"--network", internal_network,
|
||||
"--dns", _sidecar_dns(),
|
||||
"--env", f"BOT_BOTTLE_SIDECAR_DAEMONS={','.join(_sidecar_daemons(plan))}",
|
||||
]
|
||||
for entry in _sidecar_env_entries(plan):
|
||||
argv += ["--env", entry]
|
||||
for host_path, container_path, read_only in _sidecar_mounts(plan):
|
||||
argv += ["--mount", _mount_spec(host_path, container_path, read_only)]
|
||||
argv.append(SIDECAR_BUNDLE_IMAGE)
|
||||
return argv
|
||||
|
||||
|
||||
def _agent_run_argv(
|
||||
plan: MacosContainerBottlePlan,
|
||||
internal_network: str,
|
||||
sidecar_ip: str,
|
||||
) -> list[str]:
|
||||
argv = [
|
||||
"container", "run",
|
||||
"--name", plan.container_name,
|
||||
"--detach",
|
||||
"--rm",
|
||||
"--network", internal_network,
|
||||
]
|
||||
for entry in _agent_env_entries(plan, sidecar_ip):
|
||||
argv += ["--env", entry]
|
||||
argv += [plan.image, "sleep", _AGENT_SLEEP_SECONDS]
|
||||
return argv
|
||||
|
||||
|
||||
def _sidecar_dns() -> str:
|
||||
return container_mod.dns_server()
|
||||
|
||||
|
||||
def _sidecar_daemons(plan: MacosContainerBottlePlan) -> tuple[str, ...]:
|
||||
daemons = ["egress"]
|
||||
if plan.git_gate_plan.upstreams:
|
||||
daemons += ["git-gate", "git-http"]
|
||||
if plan.supervise_plan is not None:
|
||||
daemons.append("supervise")
|
||||
return tuple(daemons)
|
||||
|
||||
|
||||
def _sidecar_env_entries(plan: MacosContainerBottlePlan) -> tuple[str, ...]:
|
||||
env: list[str] = []
|
||||
if plan.egress_plan.routes:
|
||||
env.extend(sorted(plan.egress_plan.token_env_map.keys()))
|
||||
if plan.git_gate_plan.upstreams:
|
||||
env.append(f"BOT_BOTTLE_GIT_GATE_READY_FILE={_GIT_GATE_READY_FILE}")
|
||||
if plan.supervise_plan is not None:
|
||||
env += [
|
||||
f"SUPERVISE_BOTTLE_SLUG={plan.slug}",
|
||||
f"SUPERVISE_QUEUE_DIR={QUEUE_DIR_IN_CONTAINER}",
|
||||
f"SUPERVISE_PORT={SUPERVISE_PORT}",
|
||||
]
|
||||
return tuple(env)
|
||||
|
||||
|
||||
def _sidecar_mounts(
|
||||
plan: MacosContainerBottlePlan,
|
||||
) -> tuple[tuple[str, str, bool], ...]:
|
||||
mounts: list[tuple[str, str, bool]] = []
|
||||
|
||||
ep = plan.egress_plan
|
||||
mounts.append((
|
||||
str(ep.mitmproxy_ca_host_path.parent),
|
||||
str(Path(EGRESS_CA_IN_CONTAINER).parent),
|
||||
False,
|
||||
))
|
||||
if ep.routes:
|
||||
mounts.append((
|
||||
str(_stage_routes_dir(plan)),
|
||||
str(Path(EGRESS_ROUTES_IN_CONTAINER).parent),
|
||||
True,
|
||||
))
|
||||
|
||||
sp = plan.supervise_plan
|
||||
if sp is not None:
|
||||
mounts.append((str(sp.queue_dir), QUEUE_DIR_IN_CONTAINER, False))
|
||||
|
||||
return tuple(mounts)
|
||||
|
||||
|
||||
def _stage_routes_dir(plan: MacosContainerBottlePlan) -> Path:
|
||||
routes_dir = plan.stage_dir / "macos-container-egress"
|
||||
routes_dir.mkdir(parents=True, exist_ok=True)
|
||||
shutil.copyfile(
|
||||
plan.egress_plan.routes_path,
|
||||
routes_dir / Path(EGRESS_ROUTES_IN_CONTAINER).name,
|
||||
)
|
||||
return routes_dir
|
||||
|
||||
|
||||
def _mount_spec(host_path: str, container_path: str, read_only: bool) -> str:
|
||||
spec = f"type=bind,source={host_path},target={container_path}"
|
||||
if read_only:
|
||||
spec += ",readonly"
|
||||
return spec
|
||||
|
||||
|
||||
def _agent_env_entries(
|
||||
plan: MacosContainerBottlePlan,
|
||||
sidecar_ip: str,
|
||||
) -> tuple[str, ...]:
|
||||
proxy_url = f"http://{sidecar_ip}:{EGRESS_PORT}"
|
||||
no_proxy = _agent_no_proxy(plan, sidecar_ip)
|
||||
env = [
|
||||
f"HTTPS_PROXY={proxy_url}",
|
||||
f"HTTP_PROXY={proxy_url}",
|
||||
f"https_proxy={proxy_url}",
|
||||
f"http_proxy={proxy_url}",
|
||||
f"NO_PROXY={no_proxy}",
|
||||
f"no_proxy={no_proxy}",
|
||||
f"NODE_EXTRA_CA_CERTS={AGENT_CA_PATH}",
|
||||
f"SSL_CERT_FILE={AGENT_CA_BUNDLE}",
|
||||
f"REQUESTS_CA_BUNDLE={AGENT_CA_BUNDLE}",
|
||||
]
|
||||
if plan.agent_git_gate_url:
|
||||
env.append(f"GIT_GATE_URL={plan.agent_git_gate_url}")
|
||||
if plan.agent_supervise_url:
|
||||
env.append(f"MCP_SUPERVISE_URL={plan.agent_supervise_url}")
|
||||
for name, value in sorted(plan.agent_provision.guest_env.items()):
|
||||
env.append(f"{name}={value}")
|
||||
for name in sorted(plan.forwarded_env.keys()):
|
||||
env.append(name)
|
||||
return tuple(env)
|
||||
|
||||
|
||||
def _agent_no_proxy(plan: MacosContainerBottlePlan, sidecar_ip: str) -> str:
|
||||
hosts = ["localhost", "127.0.0.1", sidecar_ip]
|
||||
return ",".join(hosts)
|
||||
@@ -0,0 +1,47 @@
|
||||
"""Prepare step for the macOS Apple Container backend."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import AgentProvisionPlan
|
||||
from ...egress import EgressPlan
|
||||
from ...env import ResolvedEnv
|
||||
from ...git_gate import GitGatePlan
|
||||
from ...supervise import SupervisePlan
|
||||
from ...manifest import Manifest
|
||||
from .. import BottleSpec
|
||||
from . import util as container_mod
|
||||
from .bottle_plan import MacosContainerBottlePlan
|
||||
|
||||
|
||||
def preflight() -> None:
|
||||
container_mod.require_container()
|
||||
|
||||
|
||||
def build_guest_env(resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
return dict(resolved_env.literals)
|
||||
|
||||
|
||||
def resolve_plan(
|
||||
spec: BottleSpec,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
git_gate_plan: GitGatePlan,
|
||||
stage_dir: Path,
|
||||
) -> MacosContainerBottlePlan:
|
||||
return MacosContainerBottlePlan(
|
||||
spec=spec,
|
||||
manifest=manifest,
|
||||
stage_dir=stage_dir,
|
||||
slug=slug,
|
||||
forwarded_env=dict(resolved_env.forwarded),
|
||||
git_gate_plan=git_gate_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
agent_provision=agent_provision_plan,
|
||||
)
|
||||
@@ -0,0 +1,388 @@
|
||||
"""Host-side primitives for Apple's `container` CLI."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import ipaddress
|
||||
import platform
|
||||
import shutil
|
||||
import subprocess
|
||||
import time
|
||||
from typing import Iterable
|
||||
|
||||
from ...log import die, info
|
||||
|
||||
|
||||
_CONTAINER = "container"
|
||||
_DEFAULT_DNS = "1.1.1.1"
|
||||
|
||||
|
||||
def is_macos() -> bool:
|
||||
return platform.system() == "Darwin"
|
||||
|
||||
|
||||
def is_available() -> bool:
|
||||
return is_macos() and shutil.which(_CONTAINER) is not None
|
||||
|
||||
|
||||
def require_container() -> None:
|
||||
"""Fail with an install pointer if Apple Container is unavailable."""
|
||||
if not is_macos():
|
||||
info("BOT_BOTTLE_BACKEND=macos-container requires macOS.")
|
||||
die("macos-container backend is only supported on macOS")
|
||||
if shutil.which(_CONTAINER) is None:
|
||||
info("Apple Container is required but was not found on PATH.")
|
||||
info("Install: https://github.com/apple/container/releases")
|
||||
die("container not found")
|
||||
_require_container_service()
|
||||
|
||||
|
||||
def _require_container_service() -> None:
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "system", "status"],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
info("Apple Container system service is not running.")
|
||||
info("Start it with: container system start")
|
||||
die("container system service not running")
|
||||
|
||||
|
||||
def dns_server() -> str:
|
||||
override = os.environ.get("BOT_BOTTLE_MACOS_CONTAINER_DNS", "").strip()
|
||||
if override:
|
||||
return override
|
||||
return _host_ipv4_dns() or _DEFAULT_DNS
|
||||
|
||||
|
||||
def build_image(ref: str, context: str, *, dockerfile: str = "") -> None:
|
||||
"""Build an OCI image with Apple's BuildKit-backed `container build`."""
|
||||
info(
|
||||
f"building image {ref} from {context} with Apple Container "
|
||||
"(layer cache keeps repeat builds fast)"
|
||||
)
|
||||
_ensure_builder_dns()
|
||||
args = [_CONTAINER, "build", "-t", ref, "--dns", dns_server()]
|
||||
if dockerfile:
|
||||
args.extend(["-f", dockerfile])
|
||||
args.append(context)
|
||||
subprocess.run(args, check=True)
|
||||
|
||||
|
||||
def _ensure_builder_dns() -> None:
|
||||
dns = dns_server()
|
||||
status = _builder_status()
|
||||
override = os.environ.get("BOT_BOTTLE_MACOS_CONTAINER_DNS", "").strip()
|
||||
if _builder_running(status) and _builder_resolves_build_hosts():
|
||||
if override and not _builder_has_dns(status, dns):
|
||||
_restart_builder_with_dns(dns)
|
||||
return
|
||||
_restart_builder_with_dns(dns)
|
||||
|
||||
|
||||
def _restart_builder_with_dns(dns: str) -> None:
|
||||
subprocess.run(
|
||||
[_CONTAINER, "builder", "stop"],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
subprocess.run(
|
||||
[_CONTAINER, "builder", "start", "--dns", dns],
|
||||
check=True,
|
||||
)
|
||||
|
||||
|
||||
def _host_ipv4_dns() -> str:
|
||||
if not is_macos():
|
||||
return ""
|
||||
result = subprocess.run(
|
||||
["scutil", "--dns"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return ""
|
||||
blocks: list[list[str]] = []
|
||||
current: list[str] = []
|
||||
for line in result.stdout.splitlines():
|
||||
if line.startswith("resolver #") and current:
|
||||
blocks.append(current)
|
||||
current = []
|
||||
current.append(line)
|
||||
if current:
|
||||
blocks.append(current)
|
||||
for direct_only in (True, False):
|
||||
for block in blocks:
|
||||
text = "\n".join(block)
|
||||
if direct_only and "Directly Reachable Address" not in text:
|
||||
continue
|
||||
for line in block:
|
||||
if "nameserver[" not in line or ":" not in line:
|
||||
continue
|
||||
candidate = line.split(":", 1)[1].strip()
|
||||
if _usable_ipv4(candidate):
|
||||
return candidate
|
||||
return ""
|
||||
|
||||
|
||||
def _usable_ipv4(value: str) -> bool:
|
||||
try:
|
||||
address = ipaddress.ip_address(value)
|
||||
except ValueError:
|
||||
return False
|
||||
return (
|
||||
address.version == 4
|
||||
and not address.is_loopback
|
||||
and not address.is_link_local
|
||||
and not address.is_multicast
|
||||
and not address.is_unspecified
|
||||
)
|
||||
|
||||
|
||||
def _builder_status() -> list[dict[str, object]]:
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "builder", "status", "--format", "json"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return []
|
||||
try:
|
||||
data = json.loads(result.stdout or "[]")
|
||||
except json.JSONDecodeError:
|
||||
return []
|
||||
if isinstance(data, list):
|
||||
return [entry for entry in data if isinstance(entry, dict)]
|
||||
if isinstance(data, dict):
|
||||
return [data]
|
||||
return []
|
||||
|
||||
|
||||
def _builder_running(status: list[dict[str, object]]) -> bool:
|
||||
for entry in status:
|
||||
entry_status = entry.get("status")
|
||||
if isinstance(entry_status, dict) and entry_status.get("state") == "running":
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def _builder_dns_nameservers(status: list[dict[str, object]]) -> list[str]:
|
||||
out: list[str] = []
|
||||
for entry in status:
|
||||
config = entry.get("configuration")
|
||||
config_dns = config.get("dns") if isinstance(config, dict) else None
|
||||
nameservers = (
|
||||
config_dns.get("nameservers")
|
||||
if isinstance(config_dns, dict)
|
||||
else None
|
||||
)
|
||||
if not isinstance(nameservers, list):
|
||||
continue
|
||||
out.extend(name for name in nameservers if isinstance(name, str))
|
||||
return out
|
||||
|
||||
|
||||
def _builder_has_dns(status: list[dict[str, object]], dns: str) -> bool:
|
||||
return dns in _builder_dns_nameservers(status)
|
||||
|
||||
|
||||
def _builder_resolves_build_hosts() -> bool:
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "exec", "buildkit", "getent", "hosts", "deb.debian.org"],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
return result.returncode == 0
|
||||
|
||||
|
||||
def image_exists(ref: str) -> bool:
|
||||
return _silent_run([_CONTAINER, "image", "inspect", ref]) == 0
|
||||
|
||||
|
||||
def container_exists(name: str) -> bool:
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "list", "--all", "--quiet"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return False
|
||||
return name in {line.strip() for line in result.stdout.splitlines()}
|
||||
|
||||
|
||||
def force_remove_container(name: str) -> None:
|
||||
if container_exists(name):
|
||||
subprocess.run(
|
||||
[_CONTAINER, "delete", "--force", name],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
|
||||
|
||||
def copy_into_container(name: str, host_path: str, container_path: str) -> None:
|
||||
cmd = [_CONTAINER, "cp", host_path, f"{name}:{container_path}"]
|
||||
result = _run_container_op(cmd)
|
||||
if result.returncode != 0:
|
||||
die(
|
||||
f"container cp into {name}:{container_path} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
|
||||
|
||||
def exec_container(name: str, argv: list[str]) -> None:
|
||||
result = _run_container_op([_CONTAINER, "exec", name, *argv])
|
||||
if result.returncode != 0:
|
||||
die(
|
||||
f"container exec in {name} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
|
||||
|
||||
def _run_container_op(cmd: list[str]) -> subprocess.CompletedProcess[str]:
|
||||
result = subprocess.run(
|
||||
cmd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
for _ in range(19):
|
||||
if result.returncode == 0:
|
||||
return result
|
||||
time.sleep(0.1)
|
||||
result = subprocess.run(
|
||||
cmd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
return result
|
||||
|
||||
|
||||
def create_network(name: str, *, internal: bool = False) -> None:
|
||||
args = [
|
||||
_CONTAINER, "network", "create",
|
||||
"--label", "bot-bottle.backend=macos-container",
|
||||
]
|
||||
if internal:
|
||||
args.append("--internal")
|
||||
args.append(name)
|
||||
result = subprocess.run(
|
||||
args, capture_output=True, text=True, check=False,
|
||||
)
|
||||
if result.returncode == 0:
|
||||
return
|
||||
if "already exists" in (result.stderr or "").lower():
|
||||
return
|
||||
die(
|
||||
f"container network create {name} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
|
||||
|
||||
def remove_network(name: str) -> None:
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "network", "delete", name],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return
|
||||
|
||||
|
||||
def inspect_container(name: str) -> dict[str, object]:
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "inspect", name],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
die(
|
||||
f"container inspect {name} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
try:
|
||||
data = json.loads(result.stdout or "[]")
|
||||
except json.JSONDecodeError as exc:
|
||||
die(f"container inspect {name} returned malformed JSON: {exc}")
|
||||
if isinstance(data, list) and data and isinstance(data[0], dict):
|
||||
return data[0]
|
||||
if isinstance(data, dict):
|
||||
return data
|
||||
die(f"container inspect {name} returned an unexpected shape")
|
||||
raise AssertionError("unreachable")
|
||||
|
||||
|
||||
def container_ipv4_on_network(name: str, network: str) -> str:
|
||||
data = inspect_container(name)
|
||||
status = data.get("status")
|
||||
networks = status.get("networks") if isinstance(status, dict) else None
|
||||
if not isinstance(networks, list):
|
||||
die(f"container inspect {name} did not include status.networks")
|
||||
for entry in networks:
|
||||
if not isinstance(entry, dict):
|
||||
continue
|
||||
if entry.get("network") != network:
|
||||
continue
|
||||
raw = entry.get("ipv4Address")
|
||||
if not isinstance(raw, str) or not raw:
|
||||
die(f"container {name} has no IPv4 address on {network}")
|
||||
return raw.split("/", 1)[0]
|
||||
die(f"container {name} is not attached to network {network}")
|
||||
raise AssertionError("unreachable")
|
||||
|
||||
|
||||
def image_id(ref: str) -> str:
|
||||
"""Return the image digest/ID from `container image inspect`.
|
||||
|
||||
The command returns JSON on current Apple Container releases. Keep
|
||||
parsing narrow and fatal so callers do not cache on an empty key.
|
||||
"""
|
||||
import json
|
||||
|
||||
result = subprocess.run(
|
||||
[_CONTAINER, "image", "inspect", ref],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
die(
|
||||
f"container image inspect for {ref!r} failed: "
|
||||
f"{(result.stderr or '').strip() or '<no stderr>'}"
|
||||
)
|
||||
try:
|
||||
data = json.loads(result.stdout or "{}")
|
||||
except json.JSONDecodeError as exc:
|
||||
die(f"container image inspect for {ref!r} returned malformed JSON: {exc}")
|
||||
if isinstance(data, list) and data:
|
||||
data = data[0]
|
||||
if isinstance(data, dict):
|
||||
value = data.get("id") or data.get("digest") or data.get("ID")
|
||||
if value:
|
||||
return str(value)
|
||||
die(f"container image inspect for {ref!r} did not include an image id")
|
||||
raise AssertionError("unreachable")
|
||||
|
||||
|
||||
def save(ref: str, output: str) -> None:
|
||||
subprocess.run([_CONTAINER, "image", "save", ref, "-o", output], check=True)
|
||||
|
||||
|
||||
def _silent_run(cmd: Iterable[str]) -> int:
|
||||
return subprocess.run(
|
||||
list(cmd),
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
).returncode
|
||||
@@ -0,0 +1,131 @@
|
||||
"""Shared helpers used by both backends' resolve_plan steps.
|
||||
|
||||
Each helper owns one well-defined step of the per-bottle plan
|
||||
resolution so docker and smolmachines don't repeat the same logic.
|
||||
Backend-specific steps (container names, env-file, per-bottle
|
||||
Dockerfile overrides, subnet allocation) stay in the backend's own
|
||||
resolve_plan.py.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from dataclasses import replace
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
from ..agent_provider import AgentProvisionPlan
|
||||
from ..bottle_state import (
|
||||
BottleMetadata,
|
||||
agent_state_dir,
|
||||
bottle_identity,
|
||||
egress_state_dir,
|
||||
git_gate_state_dir,
|
||||
supervise_state_dir,
|
||||
write_metadata,
|
||||
)
|
||||
from ..egress import Egress, EgressPlan
|
||||
from ..git_gate import GitGate, GitGatePlan
|
||||
from ..manifest import Manifest, ManifestBottle
|
||||
from ..supervise import Supervise, SupervisePlan
|
||||
from . import BottleSpec
|
||||
|
||||
|
||||
def mint_slug(spec: BottleSpec) -> str:
|
||||
"""Return the bottle identity: the recorded identity for a resume,
|
||||
or a freshly minted one for a new start.
|
||||
|
||||
When a label is provided it becomes the full slug (no random suffix),
|
||||
so two launches with the same label collide by design. When no label
|
||||
is given the identity is minted with a random suffix to avoid
|
||||
collisions between anonymous launches of the same agent."""
|
||||
if spec.identity:
|
||||
return spec.identity
|
||||
if spec.label:
|
||||
from .docker import util as docker_mod
|
||||
return docker_mod.slugify(spec.label)
|
||||
return bottle_identity(spec.agent_name)
|
||||
|
||||
|
||||
def write_launch_metadata(
|
||||
slug: str, spec: BottleSpec, *, compose_project: str, backend: str,
|
||||
) -> None:
|
||||
"""Persist launch metadata so `cli.py resume <identity>` can
|
||||
reconstruct the spec. Idempotent — re-writes on resume with a
|
||||
refreshed started_at."""
|
||||
write_metadata(BottleMetadata(
|
||||
identity=slug,
|
||||
agent_name=spec.agent_name,
|
||||
cwd=spec.user_cwd if spec.copy_cwd else "",
|
||||
copy_cwd=spec.copy_cwd,
|
||||
started_at=datetime.now(timezone.utc).isoformat(),
|
||||
compose_project=compose_project,
|
||||
backend=backend,
|
||||
label=spec.label,
|
||||
color=spec.color,
|
||||
))
|
||||
|
||||
|
||||
def prepare_agent_state_dir(slug: str, manifest: Manifest) -> tuple[Path, Path]:
|
||||
"""Create the agent state subdir, write the prompt file.
|
||||
Returns (agent_dir, prompt_file)."""
|
||||
agent = manifest.agent
|
||||
agent_dir = agent_state_dir(slug)
|
||||
agent_dir.mkdir(parents=True, exist_ok=True)
|
||||
prompt_file = agent_dir / "prompt.txt"
|
||||
prompt_file.write_text(agent.prompt or "")
|
||||
prompt_file.chmod(0o600)
|
||||
return agent_dir, prompt_file
|
||||
|
||||
|
||||
def prepare_git_gate(bottle: ManifestBottle, slug: str) -> GitGatePlan:
|
||||
git_gate_dir = git_gate_state_dir(slug)
|
||||
git_gate_dir.mkdir(parents=True, exist_ok=True)
|
||||
return GitGate().prepare(bottle, slug, git_gate_dir)
|
||||
|
||||
|
||||
def prepare_egress(
|
||||
bottle: ManifestBottle, slug: str, provision: AgentProvisionPlan,
|
||||
) -> EgressPlan:
|
||||
egress_dir = egress_state_dir(slug)
|
||||
egress_dir.mkdir(parents=True, exist_ok=True)
|
||||
return Egress().prepare(bottle, slug, egress_dir, provision.egress_routes)
|
||||
|
||||
|
||||
def prepare_supervise(bottle: ManifestBottle, slug: str) -> SupervisePlan | None:
|
||||
"""Prepare the supervise sidecar state dir. Returns None when
|
||||
bottle.supervise is falsy."""
|
||||
if not bottle.supervise:
|
||||
return None
|
||||
supervise_dir = supervise_state_dir(slug)
|
||||
supervise_dir.mkdir(parents=True, exist_ok=True)
|
||||
return Supervise().prepare(slug, supervise_dir)
|
||||
|
||||
|
||||
def merge_provision_env_vars(provision: AgentProvisionPlan) -> AgentProvisionPlan:
|
||||
"""Fold provision.env_vars into guest_env (setdefault semantics)
|
||||
and return a new plan with the merged guest_env."""
|
||||
merged = dict(provision.guest_env)
|
||||
for key, val in provision.env_vars.items():
|
||||
merged.setdefault(key, val)
|
||||
return replace(provision, guest_env=merged)
|
||||
|
||||
|
||||
def resolve_manifest_dockerfile(path_value: str, spec: BottleSpec) -> str:
|
||||
"""Resolve a manifest-supplied dockerfile path relative to user_cwd."""
|
||||
path = Path(os.path.expanduser(path_value))
|
||||
if not path.is_absolute():
|
||||
path = Path(spec.user_cwd) / path
|
||||
return str(path)
|
||||
|
||||
|
||||
__all__ = [
|
||||
"merge_provision_env_vars",
|
||||
"mint_slug",
|
||||
"prepare_agent_state_dir",
|
||||
"prepare_egress",
|
||||
"prepare_git_gate",
|
||||
"prepare_supervise",
|
||||
"resolve_manifest_dockerfile",
|
||||
"write_launch_metadata",
|
||||
]
|
||||
@@ -13,18 +13,21 @@ from contextlib import contextmanager
|
||||
from pathlib import Path
|
||||
from typing import Generator, Sequence
|
||||
|
||||
from .. import ActiveAgent, Bottle, BottleBackend, BottleSpec
|
||||
from ...agent_provider import AgentProvisionPlan
|
||||
from ...egress import EgressPlan
|
||||
from ...env import ResolvedEnv
|
||||
from ...git_gate import GitGatePlan
|
||||
from ...supervise import SupervisePlan
|
||||
from ...manifest import Manifest
|
||||
from .. import ActiveAgent, BottleBackend, BottleSpec
|
||||
from . import cleanup as _cleanup
|
||||
from . import enumerate as _enumerate
|
||||
from . import launch as _launch
|
||||
from . import prepare as _prepare
|
||||
from . import resolve_plan as _resolve_plan
|
||||
from . import smolvm as _smolvm
|
||||
from .bottle import SmolmachinesBottle
|
||||
from .bottle_cleanup_plan import SmolmachinesBottleCleanupPlan
|
||||
from .bottle_plan import SmolmachinesBottlePlan
|
||||
from .provision import ca as _ca
|
||||
from .provision import git as _git
|
||||
from .provision import workspace as _workspace
|
||||
|
||||
|
||||
class SmolmachinesBottleBackend(
|
||||
@@ -43,10 +46,36 @@ class SmolmachinesBottleBackend(
|
||||
runtime check happens at `prepare`."""
|
||||
return _smolvm.is_available()
|
||||
|
||||
def _preflight(self) -> None:
|
||||
_resolve_plan.preflight()
|
||||
|
||||
def _build_guest_env(self, resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
return _resolve_plan.build_guest_env(resolved_env)
|
||||
|
||||
def _resolve_plan(
|
||||
self, spec: BottleSpec, *, stage_dir: Path
|
||||
self,
|
||||
spec: BottleSpec,
|
||||
*,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
git_gate_plan: GitGatePlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
stage_dir: Path,
|
||||
) -> SmolmachinesBottlePlan:
|
||||
return _prepare.resolve_plan(spec, stage_dir=stage_dir)
|
||||
return _resolve_plan.resolve_plan(
|
||||
spec,
|
||||
manifest=manifest,
|
||||
slug=slug,
|
||||
resolved_env=resolved_env,
|
||||
agent_provision_plan=agent_provision_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
stage_dir=stage_dir,
|
||||
)
|
||||
|
||||
@contextmanager
|
||||
def launch(
|
||||
@@ -55,21 +84,6 @@ class SmolmachinesBottleBackend(
|
||||
with _launch.launch(plan, provision=self.provision) as bottle:
|
||||
yield bottle
|
||||
|
||||
def provision_ca(
|
||||
self, plan: SmolmachinesBottlePlan, bottle: Bottle
|
||||
) -> None:
|
||||
_ca.provision_ca(plan, bottle)
|
||||
|
||||
def provision_workspace(
|
||||
self, plan: SmolmachinesBottlePlan, bottle: Bottle
|
||||
) -> None:
|
||||
_workspace.provision_workspace(plan, bottle)
|
||||
|
||||
def provision_git(
|
||||
self, plan: SmolmachinesBottlePlan, bottle: Bottle
|
||||
) -> None:
|
||||
_git.provision_git(plan, bottle)
|
||||
|
||||
def supervise_mcp_url(self, plan: SmolmachinesBottlePlan) -> str:
|
||||
"""The smolmachines guest reaches the supervise sidecar via a
|
||||
host-published random port the launch step pinned earlier
|
||||
|
||||
@@ -19,10 +19,13 @@ from __future__ import annotations
|
||||
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
import shlex
|
||||
from typing import Mapping, cast
|
||||
|
||||
from ...agent_provider import PromptMode, prompt_args
|
||||
from .. import Bottle, ExecResult
|
||||
from ..terminal import exec_shell_script
|
||||
from . import pty_resize as _pty_resize
|
||||
from . import smolvm as _smolvm
|
||||
|
||||
@@ -67,6 +70,10 @@ class SmolmachinesBottle(Bottle):
|
||||
guest_env: Mapping[str, str] | None = None,
|
||||
agent_command: str = "claude",
|
||||
agent_prompt_mode: PromptMode = "append_file",
|
||||
agent_provider_template: str = "claude",
|
||||
terminal_title: str = "",
|
||||
terminal_color: str = "",
|
||||
agent_workdir: str = "/home/node",
|
||||
) -> None:
|
||||
self.name = machine_name
|
||||
# In-VM path to the agent's prompt file. None when the
|
||||
@@ -80,9 +87,10 @@ class SmolmachinesBottle(Bottle):
|
||||
self._guest_env = dict(guest_env or {})
|
||||
self._agent_prompt_mode = agent_prompt_mode
|
||||
self.agent_command = agent_command
|
||||
self.agent_provider_template = (
|
||||
"codex" if agent_command == "codex" else "claude"
|
||||
)
|
||||
self.terminal_title = terminal_title
|
||||
self.terminal_color = terminal_color
|
||||
self.agent_provider_template = agent_provider_template
|
||||
self.agent_workdir = agent_workdir
|
||||
|
||||
def agent_argv(
|
||||
self, argv: list[str], *, tty: bool = True,
|
||||
@@ -90,8 +98,14 @@ class SmolmachinesBottle(Bottle):
|
||||
flags = ["smolvm", "machine", "exec", "--name", self.name]
|
||||
if tty:
|
||||
flags += ["-i", "-t"]
|
||||
agent_tail = ["env", *_env_assignments_for("node", self._guest_env),
|
||||
self.agent_command]
|
||||
agent_tail = ["env", *_env_assignments_for("node", self._guest_env)]
|
||||
if self.agent_workdir and self.agent_workdir != _HOME_FOR["node"]:
|
||||
agent_tail += [
|
||||
"sh", "-lc",
|
||||
f"cd {shlex.quote(self.agent_workdir)} && exec \"$@\"",
|
||||
"bot-bottle-agent",
|
||||
]
|
||||
agent_tail.append(self.agent_command)
|
||||
provider_prompt_args = prompt_args(
|
||||
cast(PromptMode, self._agent_prompt_mode), self.prompt_path, argv=argv,
|
||||
)
|
||||
@@ -127,9 +141,16 @@ class SmolmachinesBottle(Bottle):
|
||||
UID switches via `runuser -u node --` (not `-l`) so we
|
||||
avoid login-shell wiring. HOME / USER come from `smolvm
|
||||
-e` instead, which sets them on the process env."""
|
||||
return subprocess.run(
|
||||
self.agent_argv(argv, tty=tty), check=False,
|
||||
).returncode
|
||||
agent_argv = self.agent_argv(argv, tty=tty)
|
||||
script = exec_shell_script(agent_argv, self.terminal_title, self.terminal_color) if tty else None
|
||||
if script is None:
|
||||
return subprocess.run(agent_argv, check=False).returncode
|
||||
return subprocess.run(["sh", "-lc", script], check=False).returncode
|
||||
|
||||
# smolvm/libkrun can SIGKILL an otherwise-normal exec during
|
||||
# early-VM provisioning. Retry once after a short settle so
|
||||
# callers (provision_ca, etc.) don't have to handle it themselves.
|
||||
_SIGKILL_EXIT = 128 + 9
|
||||
|
||||
def exec(self, script: str, *, user: str = "node") -> ExecResult:
|
||||
"""Run a POSIX shell script as `user` (default `node`) and
|
||||
@@ -141,14 +162,22 @@ class SmolmachinesBottle(Bottle):
|
||||
|
||||
`runuser -u <user> -- env ... /bin/sh -c <script>` switches UID
|
||||
without invoking a login shell, then sets HOME / USER and the
|
||||
bottle env in the child process."""
|
||||
bottle env in the child process.
|
||||
|
||||
Retries once on SIGKILL (exit 137) — libkrun occasionally
|
||||
kills short-lived execs during VM bring-up."""
|
||||
r = self._exec_raw(script, user=user)
|
||||
if r.returncode == self._SIGKILL_EXIT:
|
||||
time.sleep(1.0)
|
||||
r = self._exec_raw(script, user=user)
|
||||
return r
|
||||
|
||||
def _exec_raw(self, script: str, *, user: str = "node") -> ExecResult:
|
||||
argv = [
|
||||
"--", "runuser", "-u", user, "--",
|
||||
"env", *_env_assignments_for(user, self._guest_env),
|
||||
"/bin/sh", "-c", script,
|
||||
]
|
||||
# Call smolvm directly because this path needs the host-side
|
||||
# subprocess capture shape used by the Docker backend.
|
||||
r = subprocess.run(
|
||||
["smolvm", "machine", "exec", "--name", self.name] + argv,
|
||||
capture_output=True, text=True, check=False,
|
||||
|
||||
@@ -12,7 +12,6 @@ from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import PromptMode
|
||||
from ...pipelock import PipelockProxyPlan
|
||||
from .. import BottlePlan
|
||||
|
||||
|
||||
@@ -30,27 +29,6 @@ class SmolmachinesBottlePlan(BottlePlan):
|
||||
bundle_subnet: str
|
||||
bundle_gateway: str
|
||||
bundle_ip: str
|
||||
# smolvm machine name + agent image source. machine_create
|
||||
# boots from a packed `.smolmachine` artifact (pre-baked at
|
||||
# prepare time via `smolvm pack create`); using `--from`
|
||||
# instead of `--image` avoids the registry-pull race we hit
|
||||
# when machine_start tried to fetch on-demand and the libkrun
|
||||
# agent's network attempt got refused by macOS.
|
||||
#
|
||||
# Chunk 2d ships with a public placeholder image (alpine)
|
||||
# since bot-bottle-claude:latest lives in the operator's local
|
||||
# docker daemon and smolvm's crane backend can't read from
|
||||
# there; chunk 4 resolves the agent-image-conversion gap
|
||||
# (push to a registry first, or smolvm grows a docker-daemon
|
||||
# transport).
|
||||
machine_name: str
|
||||
# Agent image ref (docker tag). `launch` runs the
|
||||
# build → save → registry push → smolvm pack pipeline against
|
||||
# this and feeds the resulting `.smolmachine` artifact to
|
||||
# `machine_create --from`. The pipeline runs at launch time
|
||||
# (not prepare time) so the docker build output doesn't garble
|
||||
# the dashboard's preflight modal.
|
||||
agent_image_ref: str
|
||||
# In-guest env vars (HTTPS_PROXY etc) — IP-literal URLs since
|
||||
# the guest has no DNS resolver inside the TSI allowlist.
|
||||
# Passed to `smolvm machine create` as `-e K=V` flags.
|
||||
@@ -58,11 +36,6 @@ class SmolmachinesBottlePlan(BottlePlan):
|
||||
# `--smolfile` is mutually exclusive with `--from`, and
|
||||
# `--from` is the path that avoids the registry-pull race).
|
||||
guest_env: dict[str, str]
|
||||
# Path to the agent's prompt file on the host. Always written
|
||||
# (mode 0o600) so the in-VM path always exists; the file is
|
||||
# empty when the agent has no prompt — claude-code reads it
|
||||
# via --append-system-prompt-file only when non-empty.
|
||||
prompt_file: Path
|
||||
# Inner Plans for the sidecar bundle daemons. The same shape the
|
||||
# docker backend uses — same `.prepare()` calls produced
|
||||
# them — but our launch step doesn't populate the
|
||||
@@ -71,7 +44,6 @@ class SmolmachinesBottlePlan(BottlePlan):
|
||||
# docker's `--internal` + egress bridge topology; it's on a
|
||||
# per-bottle bridge with a pinned IP. The unused fields stay
|
||||
# at their dataclass defaults.
|
||||
proxy_plan: PipelockProxyPlan
|
||||
# Agent-side endpoints. On Docker Desktop the docker bridge
|
||||
# IPs aren't reachable from the smolvm guest (TSI uses macOS
|
||||
# networking; docker container IPs live in the daemon's VM),
|
||||
@@ -84,6 +56,42 @@ class SmolmachinesBottlePlan(BottlePlan):
|
||||
agent_git_gate_host: str = ""
|
||||
agent_supervise_url: str = ""
|
||||
|
||||
@property
|
||||
def machine_name(self) -> str:
|
||||
"""smolvm machine name. `machine_create` boots from a packed
|
||||
`.smolmachine` artifact (pre-baked at prepare time via
|
||||
`smolvm pack create`); using `--from` instead of `--image`
|
||||
avoids the registry-pull race we hit when machine_start tried
|
||||
to fetch on-demand and the libkrun agent's network attempt
|
||||
got refused by macOS."""
|
||||
return self.agent_provision.instance_name
|
||||
|
||||
@property
|
||||
def agent_image(self) -> str:
|
||||
"""Agent image ref (docker tag). `launch` runs the
|
||||
build → save → registry push → smolvm pack pipeline against
|
||||
this and feeds the resulting `.smolmachine` artifact to
|
||||
`machine_create --from`. The pipeline runs at launch time
|
||||
(not prepare time) so the docker build output doesn't garble
|
||||
the dashboard's preflight modal."""
|
||||
return self.agent_provision.image
|
||||
|
||||
@property
|
||||
def prompt_file(self) -> Path:
|
||||
"""Path to the agent's prompt file on the host. Always written
|
||||
(mode 0o600) so the in-VM path always exists; the file is
|
||||
empty when the agent has no prompt — claude-code reads it
|
||||
via --append-system-prompt-file only when non-empty."""
|
||||
return self.agent_provision.prompt_file
|
||||
|
||||
@property
|
||||
def git_gate_insteadof_host(self) -> str:
|
||||
return self.agent_git_gate_host
|
||||
|
||||
@property
|
||||
def git_gate_insteadof_scheme(self) -> str:
|
||||
return "http"
|
||||
|
||||
@property
|
||||
def agent_command(self) -> str:
|
||||
return self.agent_provision.command
|
||||
|
||||
@@ -23,7 +23,7 @@ import json
|
||||
import subprocess
|
||||
|
||||
from .. import ActiveAgent
|
||||
from ..docker.bottle_state import read_metadata
|
||||
from ...bottle_state import read_metadata
|
||||
from . import sidecar_bundle as _bundle
|
||||
|
||||
|
||||
@@ -64,13 +64,15 @@ def enumerate_active() -> list[ActiveAgent]:
|
||||
agent_name=metadata.agent_name if metadata else "?",
|
||||
started_at=metadata.started_at if metadata else "",
|
||||
services=services_by_slug.get(slug, ()),
|
||||
label=metadata.label if metadata else "",
|
||||
color=metadata.color if metadata else "",
|
||||
))
|
||||
return out
|
||||
|
||||
|
||||
def _query_bundle_services() -> dict[str, tuple[str, ...]]:
|
||||
"""`{slug: ('egress', 'pipelock', ...)}` from each running
|
||||
bundle container's `BOT_BOTTLE_SIDECAR_DAEMONS` env var.
|
||||
"""`{slug: ('egress', ...)}` from each running bundle container's
|
||||
`BOT_BOTTLE_SIDECAR_DAEMONS` env var.
|
||||
Smolmachines bundles all run the PRD-0024 image with the
|
||||
same daemon set declared via env, so one inspect per bundle
|
||||
gets us the picture without exec'ing into the container.
|
||||
|
||||
@@ -9,13 +9,9 @@ guest pointed at the bundle's pinned IP via TSI's
|
||||
exit.
|
||||
|
||||
The bundle's daemons consume the inner Plans the docker backend
|
||||
already produces: pipelock reads its yaml + CA from the
|
||||
PipelockProxyPlan; egress reads routes + CAs from the EgressPlan
|
||||
+ EGRESS_UPSTREAM_PROXY pointing at `127.0.0.1:8888` (bundle
|
||||
local), since the agent dials pipelock first (not egress) on the
|
||||
smolmachines path. Git-gate + supervise plumb through the same
|
||||
plans the docker backend uses, minus the docker-network fields
|
||||
that don't apply here."""
|
||||
already produces: egress reads routes + CAs from the EgressPlan.
|
||||
Git-gate + supervise plumb through the same plans the docker
|
||||
backend uses, minus the docker-network fields that don't apply here."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
@@ -29,16 +25,11 @@ from ...egress import (
|
||||
EGRESS_ROUTES_IN_CONTAINER,
|
||||
egress_resolve_token_values,
|
||||
)
|
||||
from ...pipelock import (
|
||||
PIPELOCK_CA_CERT_IN_CONTAINER,
|
||||
PIPELOCK_CA_KEY_IN_CONTAINER,
|
||||
)
|
||||
from ...supervise import QUEUE_DIR_IN_CONTAINER, SUPERVISE_PORT
|
||||
from ...util import expand_tilde
|
||||
from ..docker import util as docker_mod
|
||||
from ..docker.egress import (
|
||||
EGRESS_CA_IN_CONTAINER,
|
||||
EGRESS_PIPELOCK_CA_IN_CONTAINER,
|
||||
EGRESS_PORT as _EGRESS_PORT,
|
||||
egress_tls_init,
|
||||
)
|
||||
@@ -48,14 +39,9 @@ from ..docker.git_gate import (
|
||||
GIT_GATE_ENTRYPOINT_IN_CONTAINER,
|
||||
GIT_GATE_HOOK_IN_CONTAINER,
|
||||
)
|
||||
from ..docker.pipelock import (
|
||||
BUNDLE_LOCAL_PIPELOCK_URL,
|
||||
PIPELOCK_PORT as _PIPELOCK_PORT_STR,
|
||||
pipelock_tls_init,
|
||||
)
|
||||
from ...git_gate import revoke_git_gate_provisioned_keys
|
||||
from ...log import warn
|
||||
from ..docker.bottle_state import git_gate_state_dir
|
||||
from ...bottle_state import egress_state_dir, git_gate_state_dir
|
||||
from . import loopback_alias as _loopback
|
||||
from . import sidecar_bundle as _bundle
|
||||
from . import smolvm as _smolvm
|
||||
@@ -78,9 +64,7 @@ _SMOLMACHINE_CACHE_DIR = Path.home() / ".cache" / "bot-bottle" / "smolmachines"
|
||||
# Container-internal listening ports for each bundle daemon. The
|
||||
# bundle publishes each one on a random host loopback port (see
|
||||
# `_bundle.start_bundle`), and `_bundle.bundle_host_port` looks
|
||||
# them up post-start. Pipelock's port is an env-overridable string
|
||||
# in docker.pipelock; coerce to int here.
|
||||
_PIPELOCK_PORT = int(_PIPELOCK_PORT_STR)
|
||||
# them up post-start.
|
||||
_GIT_HTTP_PORT = 9420
|
||||
_SUPERVISE_PORT = SUPERVISE_PORT
|
||||
|
||||
@@ -106,7 +90,7 @@ def launch(
|
||||
# here, not in prepare, so the docker-build output doesn't
|
||||
# garble the dashboard's preflight modal.
|
||||
agent_from_path = _ensure_smolmachine(
|
||||
plan.agent_image_ref,
|
||||
plan.agent_image,
|
||||
dockerfile=plan.agent_dockerfile_path,
|
||||
)
|
||||
|
||||
@@ -119,6 +103,10 @@ def launch(
|
||||
guest_env=plan.guest_env,
|
||||
agent_command=plan.agent_command,
|
||||
agent_prompt_mode=plan.agent_prompt_mode,
|
||||
agent_provider_template=plan.agent_provider_template,
|
||||
terminal_title=plan.spec.label or plan.spec.agent_name,
|
||||
terminal_color=plan.spec.color,
|
||||
agent_workdir=plan.workspace_plan.workdir,
|
||||
)
|
||||
bottle.prompt_path = provision(plan, bottle)
|
||||
|
||||
@@ -142,7 +130,7 @@ def _teardown_smolmachines(
|
||||
except BaseException as exc: # noqa: W0718 — teardown must not fail
|
||||
teardown_exc = exc
|
||||
warn(f"smolmachines teardown failed: {exc!r}")
|
||||
bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
|
||||
bottle = plan.manifest.bottle
|
||||
revoke_git_gate_provisioned_keys(bottle, git_gate_state_dir(plan.slug))
|
||||
if teardown_exc is not None:
|
||||
raise teardown_exc
|
||||
@@ -167,33 +155,16 @@ def _allocate_resources(
|
||||
|
||||
|
||||
def _mint_certs(plan: SmolmachinesBottlePlan) -> SmolmachinesBottlePlan:
|
||||
"""Mint per-bottle CAs and return the plan with CA paths filled.
|
||||
|
||||
Pipelock always runs in the bundle. Egress's CA is only minted
|
||||
when the bottle declares routes — otherwise egress runs idle
|
||||
without MITM and the CA files would be unused."""
|
||||
ca_cert_host, ca_key_host = pipelock_tls_init(plan.proxy_plan.yaml_path.parent)
|
||||
proxy_plan = dataclasses.replace(
|
||||
plan.proxy_plan,
|
||||
ca_cert_host_path=ca_cert_host,
|
||||
ca_key_host_path=ca_key_host,
|
||||
"""Mint the egress MITM CA and return the plan with CA paths filled."""
|
||||
egress_ca_host, egress_ca_cert_only = egress_tls_init(
|
||||
egress_state_dir(plan.slug),
|
||||
)
|
||||
egress_plan = plan.egress_plan
|
||||
if egress_plan.routes:
|
||||
egress_ca_host, egress_ca_cert_only = egress_tls_init(
|
||||
plan.egress_plan.routes_path.parent,
|
||||
)
|
||||
egress_plan = dataclasses.replace(
|
||||
egress_plan,
|
||||
mitmproxy_ca_host_path=egress_ca_host,
|
||||
mitmproxy_ca_cert_only_host_path=egress_ca_cert_only,
|
||||
pipelock_ca_host_path=ca_cert_host,
|
||||
# On smolmachines, egress's upstream is pipelock on the
|
||||
# bundle's localhost — they're in the same container's
|
||||
# network namespace.
|
||||
pipelock_proxy_url=BUNDLE_LOCAL_PIPELOCK_URL,
|
||||
)
|
||||
return dataclasses.replace(plan, proxy_plan=proxy_plan, egress_plan=egress_plan)
|
||||
egress_plan = dataclasses.replace(
|
||||
plan.egress_plan,
|
||||
mitmproxy_ca_host_path=egress_ca_host,
|
||||
mitmproxy_ca_cert_only_host_path=egress_ca_cert_only,
|
||||
)
|
||||
return dataclasses.replace(plan, egress_plan=egress_plan)
|
||||
|
||||
|
||||
def _start_bundle(
|
||||
@@ -224,17 +195,10 @@ def _discover_urls(
|
||||
macOS networking, and macOS sees the daemon's bridge via the
|
||||
published-port loopback forward only.
|
||||
|
||||
Proxy hop order: when the bottle declares egress routes, the
|
||||
agent's first hop is egress (for token injection), then
|
||||
pipelock. Without routes, the agent dials pipelock directly.
|
||||
NO_PROXY includes the per-bottle loopback alias so the
|
||||
supervise + git-gate URLs bypass HTTPS_PROXY."""
|
||||
if plan.egress_plan.routes:
|
||||
agent_facing_port = _EGRESS_PORT
|
||||
else:
|
||||
agent_facing_port = _PIPELOCK_PORT
|
||||
agent_facing_host_port = _bundle.bundle_host_port(
|
||||
plan.slug, agent_facing_port, host_ip=loopback_ip,
|
||||
plan.slug, _EGRESS_PORT, host_ip=loopback_ip,
|
||||
)
|
||||
agent_proxy_url = f"http://{loopback_ip}:{agent_facing_host_port}"
|
||||
|
||||
@@ -328,8 +292,7 @@ def _bundle_launch_spec(
|
||||
"""Build a BundleLaunchSpec from the resolved inner Plans.
|
||||
|
||||
Daemons in the CSV:
|
||||
- egress + pipelock are always present (pipelock is the
|
||||
agent's first hop; egress is its upstream).
|
||||
- egress is always present.
|
||||
- git-gate + git-http are conditional on plan.git_gate_plan.upstreams.
|
||||
- supervise is conditional on plan.supervise_plan.
|
||||
|
||||
@@ -337,36 +300,15 @@ def _bundle_launch_spec(
|
||||
daemon-private values only (HTTPS_PROXY is scoped to the
|
||||
egress process by egress_entrypoint.sh — see PRD 0024's bundle
|
||||
bind-address PR)."""
|
||||
daemons: list[str] = ["egress", "pipelock"]
|
||||
daemons: list[str] = ["egress"]
|
||||
env: list[str] = []
|
||||
volumes: list[tuple[str, str, bool]] = []
|
||||
|
||||
# In this Docker-Desktop-compatible topology, whichever daemon
|
||||
# is "agent-facing" gets its port published on the host
|
||||
# loopback (see `_ensure_smolmachine`'s discovery loop) and the
|
||||
# other stays bundle-internal. The bundle is NOT reachable by
|
||||
# bridge IP from the smolvm guest on macOS — TSI uses macOS
|
||||
# networking, and macOS sees the daemon's bridge via the
|
||||
# published-port loopback forward only.
|
||||
|
||||
# --- pipelock ---------------------------------------------
|
||||
pp = plan.proxy_plan
|
||||
volumes += [
|
||||
(str(pp.yaml_path), "/etc/pipelock.yaml", True),
|
||||
(str(pp.ca_cert_host_path), PIPELOCK_CA_CERT_IN_CONTAINER, True),
|
||||
(str(pp.ca_key_host_path), PIPELOCK_CA_KEY_IN_CONTAINER, True),
|
||||
]
|
||||
|
||||
# --- egress -----------------------------------------------
|
||||
ep = plan.egress_plan
|
||||
volumes.append((str(ep.mitmproxy_ca_host_path), EGRESS_CA_IN_CONTAINER, True))
|
||||
if ep.routes:
|
||||
env.append(f"EGRESS_UPSTREAM_PROXY={ep.pipelock_proxy_url}")
|
||||
env.append(f"EGRESS_UPSTREAM_CA={EGRESS_PIPELOCK_CA_IN_CONTAINER}")
|
||||
volumes += [
|
||||
(str(ep.routes_path), EGRESS_ROUTES_IN_CONTAINER, True),
|
||||
(str(ep.mitmproxy_ca_host_path), EGRESS_CA_IN_CONTAINER, True),
|
||||
(str(ep.pipelock_ca_host_path), EGRESS_PIPELOCK_CA_IN_CONTAINER, True),
|
||||
]
|
||||
volumes.append((str(ep.routes_path), EGRESS_ROUTES_IN_CONTAINER, True))
|
||||
# Bare-name entries for upstream-token slots. Their values
|
||||
# come from the docker-run subprocess env (inherited from
|
||||
# the operator's shell), never landing on argv.
|
||||
@@ -409,14 +351,8 @@ def _bundle_launch_spec(
|
||||
|
||||
# Container ports the agent reaches from the smolvm guest —
|
||||
# published on host loopback so the guest can dial via TSI +
|
||||
# macOS networking. The HTTP/HTTPS chokepoint is whichever
|
||||
# daemon's port we publish: egress when routes are declared
|
||||
# (token injection first, then forwards to bundle-internal
|
||||
# pipelock), pipelock otherwise.
|
||||
if ep.routes:
|
||||
ports_to_publish: list[int] = [_EGRESS_PORT]
|
||||
else:
|
||||
ports_to_publish = [_PIPELOCK_PORT]
|
||||
# macOS networking. Egress is always the agent's HTTP/HTTPS proxy.
|
||||
ports_to_publish: list[int] = [_EGRESS_PORT]
|
||||
if gp.upstreams:
|
||||
ports_to_publish.append(_GIT_HTTP_PORT)
|
||||
if sp is not None:
|
||||
|
||||
@@ -48,7 +48,7 @@ from ...log import die
|
||||
|
||||
|
||||
# registry:2.8.3, pinned by digest. Same env-override pattern as the
|
||||
# pipelock image pin in bot_bottle/backend/docker/pipelock.py.
|
||||
# sidecar image pin in bot_bottle/backend/docker/sidecar_bundle.py.
|
||||
REGISTRY_IMAGE = os.environ.get(
|
||||
"BOT_BOTTLE_REGISTRY_IMAGE",
|
||||
"registry@sha256:a3d8aaa63ed8681a604f1dea0aa03f100d5895b6a58ace528858a7b332415373",
|
||||
|
||||
@@ -1,197 +0,0 @@
|
||||
"""smolmachines `_resolve_plan` (PRD 0023 chunks 2d + 4c).
|
||||
|
||||
Resolves the per-bottle docker subnet + bundle IP and assembles
|
||||
the guest env. The agent's docker image build → smolmachine
|
||||
pack pipeline runs in `launch.launch`, not here, so the
|
||||
dashboard's preflight modal isn't garbled by docker-build output
|
||||
before the operator has confirmed.
|
||||
|
||||
No VM bringup — that's `launch.launch`'s job."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from datetime import datetime, timezone
|
||||
from dataclasses import replace
|
||||
from pathlib import Path
|
||||
|
||||
from ...agent_provider import agent_provision_plan, runtime_for
|
||||
from ...backend import BottleSpec
|
||||
from ...backend.docker.bottle_state import (
|
||||
BottleMetadata,
|
||||
agent_state_dir,
|
||||
bottle_identity,
|
||||
egress_state_dir,
|
||||
git_gate_state_dir,
|
||||
pipelock_state_dir,
|
||||
supervise_state_dir,
|
||||
write_metadata,
|
||||
)
|
||||
from ...egress import Egress
|
||||
from ...env import resolve_env
|
||||
from ...git_gate import GitGate
|
||||
from ...pipelock import PipelockProxy
|
||||
from ...supervise import Supervise
|
||||
from ...workspace import workspace_plan as resolve_workspace_plan
|
||||
from .bottle_plan import SmolmachinesBottlePlan
|
||||
from .util import smolmachines_bundle_subnet, smolmachines_preflight
|
||||
|
||||
|
||||
# Gateway ports the bundle exposes inside its container — pipelock
|
||||
# HTTPS proxy, git-gate's git-daemon, supervise's MCP. The agent
|
||||
# inside the smolvm guest dials these on the bundle's pinned IP.
|
||||
_BUNDLE_PIPELOCK_PORT = 8888
|
||||
_BUNDLE_GIT_GATE_PORT = 9418
|
||||
_BUNDLE_SUPERVISE_PORT = 9100
|
||||
|
||||
|
||||
def resolve_plan(
|
||||
spec: BottleSpec, *, stage_dir: Path
|
||||
) -> SmolmachinesBottlePlan:
|
||||
"""Materialize the smolmachines plan. The bundle's docker
|
||||
subnet + pinned IP are derived from the slug; the agent's
|
||||
`.smolmachine` artifact is built (or cache-hit) here so
|
||||
launch's `machine create --from` boots without a registry
|
||||
pull. Per-bottle guest env + the TSI allow_cidrs land on the
|
||||
plan for launch to pass straight through to
|
||||
`machine create` flags."""
|
||||
smolmachines_preflight()
|
||||
|
||||
manifest = spec.manifest
|
||||
bottle = manifest.bottle_for(spec.agent_name)
|
||||
provider = bottle.agent_provider
|
||||
provider_runtime = runtime_for(provider.template)
|
||||
guest_home = "/home/node"
|
||||
workspace_plan = resolve_workspace_plan(spec, guest_home=guest_home)
|
||||
|
||||
slug = spec.identity or bottle_identity(spec.agent_name)
|
||||
|
||||
# Record minimal metadata so `cli.py resume` can recover the
|
||||
# slug. Same schema as the docker backend.
|
||||
write_metadata(BottleMetadata(
|
||||
identity=slug,
|
||||
agent_name=spec.agent_name,
|
||||
cwd=spec.user_cwd if spec.copy_cwd else "",
|
||||
copy_cwd=spec.copy_cwd,
|
||||
started_at=datetime.now(timezone.utc).isoformat(),
|
||||
compose_project="",
|
||||
backend="smolmachines",
|
||||
))
|
||||
|
||||
subnet, gateway, bundle_ip = smolmachines_bundle_subnet(slug)
|
||||
|
||||
# Agent's env: resolve through resolve_env() so ?prompt entries
|
||||
# are prompted and ${HOST_VAR} entries are interpolated — matching
|
||||
# the Docker backend's contract. Forwarded (secret/interpolated)
|
||||
# values still reach the guest as -e K=V smolvm flags because
|
||||
# smolvm 0.8.0 has no env-file or stdin injection path; this is
|
||||
# the known argv-exposure gap documented in PRD 0038.
|
||||
# HTTPS_PROXY / GIT_GATE_URL / MCP_SUPERVISE_URL are populated
|
||||
# in launch.py after bundle bringup.
|
||||
resolved = resolve_env(manifest, spec.agent_name)
|
||||
guest_env: dict[str, str] = {
|
||||
**resolved.literals,
|
||||
**resolved.forwarded,
|
||||
"NO_PROXY": "localhost,127.0.0.1",
|
||||
"NODE_EXTRA_CA_CERTS": "/etc/ssl/certs/ca-certificates.crt",
|
||||
"SSL_CERT_FILE": "/etc/ssl/certs/ca-certificates.crt",
|
||||
"REQUESTS_CA_BUNDLE": "/etc/ssl/certs/ca-certificates.crt",
|
||||
}
|
||||
|
||||
git_gate_dir = git_gate_state_dir(slug)
|
||||
git_gate_dir.mkdir(parents=True, exist_ok=True)
|
||||
git_gate_plan = GitGate().prepare(bottle, slug, git_gate_dir)
|
||||
|
||||
# Prompt file is always written (mode 0o600) so the in-VM
|
||||
# path always exists. Content is the agent's `prompt`
|
||||
# field (markdown body) — empty for agents with no prompt.
|
||||
# claude-code reads it via --append-system-prompt-file only
|
||||
# when non-empty, but the file must exist either way to
|
||||
# match the docker backend's contract.
|
||||
agent_dir = agent_state_dir(slug)
|
||||
agent_dir.mkdir(parents=True, exist_ok=True)
|
||||
prompt_file = agent_dir / "prompt.txt"
|
||||
agent = manifest.agents[spec.agent_name]
|
||||
prompt_file.write_text(agent.prompt or "")
|
||||
prompt_file.chmod(0o600)
|
||||
|
||||
machine_name = f"bot-bottle-{slug}"
|
||||
# Stash the agent image ref — `launch.launch` runs the
|
||||
# build → pack pipeline at bringup. Honors BOT_BOTTLE_IMAGE
|
||||
# to match the docker backend's `resolve_plan` default.
|
||||
agent_dockerfile_path = ""
|
||||
if provider.dockerfile:
|
||||
agent_dockerfile_path = _resolve_manifest_dockerfile(provider.dockerfile, spec)
|
||||
image_default = f"bot-bottle-{provider.template}:{slug}"
|
||||
elif provider_runtime.dockerfile:
|
||||
agent_dockerfile_path = provider_runtime.dockerfile
|
||||
image_default = provider_runtime.image
|
||||
else:
|
||||
image_default = provider_runtime.image
|
||||
agent_image_ref = os.environ.get("BOT_BOTTLE_IMAGE", image_default)
|
||||
agent_provision = agent_provision_plan(
|
||||
template=provider.template,
|
||||
dockerfile=agent_dockerfile_path,
|
||||
state_dir=agent_dir,
|
||||
guest_home=guest_home,
|
||||
guest_env=guest_env,
|
||||
forward_host_credentials=provider.forward_host_credentials,
|
||||
auth_token=provider.auth_token,
|
||||
host_env=dict(os.environ),
|
||||
trusted_project_path=workspace_plan.workdir,
|
||||
)
|
||||
merged_guest_env = dict(agent_provision.guest_env)
|
||||
for key, val in agent_provision.env_vars.items():
|
||||
merged_guest_env.setdefault(key, val)
|
||||
agent_provision = replace(agent_provision, guest_env=merged_guest_env)
|
||||
|
||||
# Inner Plans for the four bundle daemons. The ABCs are
|
||||
# platform-neutral — `.prepare()` writes config files + returns
|
||||
# a Plan dataclass with no backend-specific assumptions. State
|
||||
# dirs are still keyed by slug under the docker backend's
|
||||
# bottle_state layout (shared on-host convention; not a docker
|
||||
# dependency).
|
||||
pipelock_dir = pipelock_state_dir(slug)
|
||||
pipelock_dir.mkdir(parents=True, exist_ok=True)
|
||||
proxy_plan = PipelockProxy().prepare(
|
||||
bottle, slug, pipelock_dir, agent_provision.egress_routes,
|
||||
)
|
||||
|
||||
egress_dir = egress_state_dir(slug)
|
||||
egress_dir.mkdir(parents=True, exist_ok=True)
|
||||
egress_plan = Egress().prepare(
|
||||
bottle, slug, egress_dir, agent_provision.egress_routes,
|
||||
)
|
||||
|
||||
supervise_plan = None
|
||||
if bottle.supervise:
|
||||
supervise_dir = supervise_state_dir(slug)
|
||||
supervise_dir.mkdir(parents=True, exist_ok=True)
|
||||
supervise_plan = Supervise().prepare(slug, supervise_dir)
|
||||
|
||||
return SmolmachinesBottlePlan(
|
||||
spec=spec,
|
||||
stage_dir=stage_dir,
|
||||
guest_home=guest_home,
|
||||
slug=slug,
|
||||
bundle_subnet=subnet,
|
||||
bundle_gateway=gateway,
|
||||
bundle_ip=bundle_ip,
|
||||
machine_name=machine_name,
|
||||
agent_image_ref=agent_image_ref,
|
||||
guest_env=agent_provision.guest_env,
|
||||
prompt_file=prompt_file,
|
||||
proxy_plan=proxy_plan,
|
||||
git_gate_plan=git_gate_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
agent_provision=agent_provision,
|
||||
workspace_plan=workspace_plan,
|
||||
)
|
||||
|
||||
|
||||
def _resolve_manifest_dockerfile(path_value: str, spec: BottleSpec) -> str:
|
||||
path = Path(os.path.expanduser(path_value))
|
||||
if not path.is_absolute():
|
||||
path = Path(spec.user_cwd) / path
|
||||
return str(path)
|
||||
@@ -2,11 +2,11 @@
|
||||
|
||||
Per PRD 0050 the per-provider provisioning steps (prompt, skills,
|
||||
declarative provision-plan apply, supervise MCP registration) live on
|
||||
the `AgentProvider` plugin under `bot_bottle/contrib/`. The modules
|
||||
left in this subpackage handle only the steps that are
|
||||
backend-specific:
|
||||
the `AgentProvider` plugin under `bot_bottle/contrib/`. CA and git
|
||||
provisioning also moved to the AgentProvider ABC (with Debian/node
|
||||
defaults); user plugins override them for non-standard images.
|
||||
|
||||
- ca.py — install per-bottle CA bundle into the guest trust store
|
||||
- git.py — copy host cwd `.git` into the guest when --cwd is used
|
||||
- workspace.py — copy the operator workspace into the guest
|
||||
No modules remain in this subpackage. Workspace copying now runs
|
||||
through `BottleBackend.provision_workspace` against the running
|
||||
bottle for every backend.
|
||||
"""
|
||||
|
||||
@@ -1,93 +0,0 @@
|
||||
"""Install the per-bottle MITM CA into the smolmachines guest's
|
||||
trust store (PRD 0023 chunk 4d).
|
||||
|
||||
Mirrors `backend.docker.provision.ca`: select the right CA (egress
|
||||
when the bottle has routes, else pipelock), copy it to Debian's
|
||||
`/usr/local/share/ca-certificates/` path,
|
||||
`update-ca-certificates` to rebuild the trust bundle, and log the
|
||||
fingerprint once. The selected cert depends on the agent's
|
||||
HTTP_PROXY target — same logic as the docker backend, since the
|
||||
agent dials the same daemons through the same bundle.
|
||||
|
||||
`smolvm machine exec` runs commands as root in the VM (no `-u`
|
||||
flag exists; the VM init is root), so we don't need the explicit
|
||||
`-u 0` the docker backend uses on its `docker exec` calls."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import time
|
||||
|
||||
from ....log import die
|
||||
from ...util import (
|
||||
AGENT_CA_BUNDLE,
|
||||
AGENT_CA_PATH,
|
||||
log_ca_fingerprint,
|
||||
select_ca_cert,
|
||||
)
|
||||
from ... import Bottle, ExecResult
|
||||
from ..bottle_plan import SmolmachinesBottlePlan
|
||||
|
||||
|
||||
_SIGKILL_EXIT = 128 + 9
|
||||
|
||||
|
||||
def provision_ca(plan: SmolmachinesBottlePlan, bottle: Bottle) -> None:
|
||||
"""Copy the agent-facing CA cert into the guest, rebuild the
|
||||
trust bundle, emit a one-line fingerprint log. Called from
|
||||
`BottleBackend.provision` after the smolvm guest is up."""
|
||||
cert_host_path, label = select_ca_cert(plan.egress_plan, plan.proxy_plan)
|
||||
|
||||
bottle.cp_in(str(cert_host_path), AGENT_CA_PATH)
|
||||
# Mode 0644 — readable to non-root tools in the guest.
|
||||
# update-ca-certificates rebuilds the bundle at AGENT_CA_BUNDLE,
|
||||
# which is what curl / Python ssl / OpenSSL-based tools read by
|
||||
# default. The env trio (NODE_EXTRA_CA_CERTS / SSL_CERT_FILE /
|
||||
# REQUESTS_CA_BUNDLE) on the guest_env covers Node + Python
|
||||
# `requests` / libraries that don't load the system bundle.
|
||||
#
|
||||
r = _install_ca(bottle)
|
||||
if r.returncode == _SIGKILL_EXIT:
|
||||
# smolvm/libkrun can SIGKILL an otherwise-normal exec
|
||||
# during early-VM provisioning. `update-ca-certificates`
|
||||
# is idempotent, so retry the same install once after a
|
||||
# short settle delay before treating it as fatal.
|
||||
time.sleep(1.0)
|
||||
r = _install_ca(bottle)
|
||||
|
||||
if r.returncode != 0:
|
||||
# update-ca-certificates not adding our cert is fatal —
|
||||
# claude-code's TLS handshake against the egress-MITM'd
|
||||
# api.anthropic.com would fail downstream. Bail early
|
||||
# with what we can see (output is captured so we can
|
||||
# surface it).
|
||||
die(
|
||||
f"update-ca-certificates didn't add the agent CA "
|
||||
f"(exit {r.returncode}): "
|
||||
f"stdout={(r.stdout or '').strip()!r} "
|
||||
f"stderr={(r.stderr or '').strip()!r}"
|
||||
)
|
||||
|
||||
log_ca_fingerprint(cert_host_path, label)
|
||||
|
||||
|
||||
def _install_ca(bottle: Bottle) -> ExecResult:
|
||||
# chown + chmod + update-ca-certificates + bundle
|
||||
# verification run in one exec so we only pay one
|
||||
# round trip; the `&&` chaining surfaces the first failure
|
||||
# as the return code. The verify check is more stable than
|
||||
# requiring "1 added" in stdout: a retry after a
|
||||
# partially-completed first run may legitimately report "0
|
||||
# added" while the cert is already installed.
|
||||
return bottle.exec(
|
||||
f"chown root:root {AGENT_CA_PATH} && "
|
||||
f"chmod 644 {AGENT_CA_PATH} && "
|
||||
f"update-ca-certificates && "
|
||||
f"openssl verify -CAfile {AGENT_CA_BUNDLE} {AGENT_CA_PATH}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
|
||||
# Re-exported for the launch/provision_ca caller + tests. The path
|
||||
# constants live in the shared `backend.util` (Debian's
|
||||
# `update-ca-certificates` layout is the same in both backends).
|
||||
__all__ = ["AGENT_CA_BUNDLE", "AGENT_CA_PATH", "provision_ca"]
|
||||
@@ -1,133 +0,0 @@
|
||||
"""Git provisioning inside a running smolmachines bottle
|
||||
(PRD 0023 chunk 4d).
|
||||
|
||||
Three concerns, all about git in the agent:
|
||||
|
||||
1. If --cwd was passed AND the host cwd has a .git, copy that
|
||||
.git into the planned guest workspace so the agent operates on
|
||||
the user's repo.
|
||||
2. If the bottle declares `git` entries (PRD 0008), write a
|
||||
~/.gitconfig with insteadOf rules so every git operation
|
||||
against a declared upstream transparently hits the per-bottle
|
||||
git-gate. The gate mirrors the upstream in both directions,
|
||||
so URL rewriting is symmetric.
|
||||
3. If the bottle declares `git.user` (issue #86), set
|
||||
`git config --global user.{name,email}` inside the guest so
|
||||
the agent's commits are attributed to that identity.
|
||||
|
||||
Differs from `backend.docker.provision.git` in one address detail:
|
||||
the TSI-allowlisted guest can only reach the bundle's pinned IP
|
||||
(no DNS resolver in the /32 allowlist), so the insteadOf URLs
|
||||
are `http://<bundle_ip>:<port>/<name>.git` rather than the
|
||||
docker backend's `git://git-gate/<name>.git`. The render itself
|
||||
is the shared `git_gate_render_gitconfig` on the platform-neutral
|
||||
git_gate module."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import shlex
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
from ....git_gate import git_gate_render_gitconfig
|
||||
from ....log import info
|
||||
from ... import Bottle
|
||||
from ..bottle_plan import SmolmachinesBottlePlan
|
||||
|
||||
|
||||
def provision_git(plan: SmolmachinesBottlePlan, bottle: Bottle) -> None:
|
||||
"""Set up git inside the guest. Runs all three subcases; each
|
||||
no-ops when its condition isn't met."""
|
||||
_provision_cwd_git(plan, bottle)
|
||||
_provision_git_gate_config(plan, bottle)
|
||||
_provision_git_user(plan, bottle)
|
||||
|
||||
|
||||
def _provision_cwd_git(plan: SmolmachinesBottlePlan, bottle: Bottle) -> None:
|
||||
"""If --cwd was set and the host cwd has a .git directory, copy
|
||||
it into <guest_home>/workspace/.git and fix ownership. No-op
|
||||
otherwise."""
|
||||
workspace = plan.workspace_plan
|
||||
if not (workspace.enabled and workspace.copy_git and workspace.has_host_git_dir):
|
||||
return
|
||||
guest_workspace_git = f"{workspace.guest_path}/.git"
|
||||
host_git = str(workspace.host_path / ".git")
|
||||
info(f"copying {host_git} -> {bottle.name}:{guest_workspace_git}")
|
||||
# mkdir -p the workspace dir so cp_in lands the .git
|
||||
# directly there even on first-time bottles.
|
||||
bottle.exec(f"mkdir -p {shlex.quote(workspace.guest_path)}", user="root")
|
||||
bottle.cp_in(host_git, guest_workspace_git)
|
||||
# cp_in lands files as root; the agent runs as node so
|
||||
# the workspace tree must be chowned over.
|
||||
bottle.exec(
|
||||
f"chown -R {shlex.quote(workspace.owner)} {shlex.quote(guest_workspace_git)}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
|
||||
def _provision_git_gate_config(
|
||||
plan: SmolmachinesBottlePlan, bottle: Bottle
|
||||
) -> None:
|
||||
"""Write ~/.gitconfig in the guest with the git-gate insteadOf
|
||||
rules. No-op when the bottle has no `git` entries."""
|
||||
manifest_bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
|
||||
if not manifest_bottle.git:
|
||||
return
|
||||
|
||||
# `<loopback alias>:<host port>` form: the bundle's git-gate
|
||||
# HTTP port is published on host loopback at launch time so
|
||||
# the smolvm guest (which can only reach macOS networking via
|
||||
# TSI, not the docker bridge IP) can dial it. launch.py
|
||||
# populates `plan.agent_git_gate_host` after bundle bringup.
|
||||
content = git_gate_render_gitconfig(
|
||||
manifest_bottle.git, plan.agent_git_gate_host, scheme="http",
|
||||
)
|
||||
|
||||
guest_gitconfig = f"{plan.guest_home}/.gitconfig"
|
||||
# Stage the file under the plan's stage_dir so cp_in
|
||||
# has a stable host path. The plan's stage_dir is cleaned up
|
||||
# by start.py's session-end teardown.
|
||||
with tempfile.NamedTemporaryFile(
|
||||
"w", dir=str(plan.stage_dir), prefix="gitconfig.",
|
||||
delete=False,
|
||||
) as f:
|
||||
f.write(content)
|
||||
config_file = Path(f.name)
|
||||
os.chmod(config_file, 0o600)
|
||||
|
||||
info(f"writing {guest_gitconfig} with {len(manifest_bottle.git)} insteadOf rule(s)")
|
||||
bottle.cp_in(str(config_file), guest_gitconfig)
|
||||
bottle.exec(
|
||||
f"chown node:node {shlex.quote(guest_gitconfig)} && "
|
||||
f"chmod 644 {shlex.quote(guest_gitconfig)}",
|
||||
user="root",
|
||||
)
|
||||
|
||||
|
||||
def _provision_git_user(
|
||||
plan: SmolmachinesBottlePlan, bottle: Bottle,
|
||||
) -> None:
|
||||
"""Apply `git config --global user.{name,email}` inside the
|
||||
guest as the node user so --global lands in the same
|
||||
`/home/node/.gitconfig` that `_provision_git_gate_config`
|
||||
writes to. No-op when the bottle didn't declare `git.user`.
|
||||
|
||||
SmolmachinesBottle.exec(user="node") automatically sets
|
||||
HOME=/home/node so --global writes to /home/node/.gitconfig."""
|
||||
manifest_bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
|
||||
gu = manifest_bottle.git_user
|
||||
if gu.is_empty():
|
||||
return
|
||||
if gu.name:
|
||||
info(f"git config --global user.name = {gu.name!r}")
|
||||
bottle.exec(
|
||||
f"git config --global user.name {shlex.quote(gu.name)}",
|
||||
user="node",
|
||||
)
|
||||
if gu.email:
|
||||
info(f"git config --global user.email = {gu.email!r}")
|
||||
bottle.exec(
|
||||
f"git config --global user.email {shlex.quote(gu.email)}",
|
||||
user="node",
|
||||
)
|
||||
@@ -1,32 +0,0 @@
|
||||
"""Copy the operator workspace into a smolmachines guest."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import shlex
|
||||
|
||||
from ....log import info
|
||||
from ... import Bottle
|
||||
from ..bottle_plan import SmolmachinesBottlePlan
|
||||
|
||||
|
||||
def provision_workspace(plan: SmolmachinesBottlePlan, bottle: Bottle) -> None:
|
||||
"""Copy host cwd contents to the planned guest workspace."""
|
||||
workspace = plan.workspace_plan
|
||||
if not (workspace.enabled and workspace.copy_contents):
|
||||
return
|
||||
|
||||
guest_parent = workspace.guest_path.rsplit("/", 1)[0] or "/"
|
||||
guest_path_q = shlex.quote(workspace.guest_path)
|
||||
guest_parent_q = shlex.quote(guest_parent)
|
||||
owner_q = shlex.quote(workspace.owner)
|
||||
mode_q = shlex.quote(workspace.mode)
|
||||
info(f"copying {workspace.host_path} -> {bottle.name}:{workspace.guest_path}")
|
||||
bottle.exec(
|
||||
f"rm -rf {guest_path_q} && mkdir -p {guest_parent_q}",
|
||||
user="root",
|
||||
)
|
||||
bottle.cp_in(str(workspace.host_path), workspace.guest_path)
|
||||
bottle.exec(
|
||||
f"chown -R {owner_q} {guest_path_q} && chmod {mode_q} {guest_path_q}",
|
||||
user="root",
|
||||
)
|
||||
@@ -68,8 +68,9 @@ def _read_winsize() -> tuple[int, int] | None:
|
||||
- tmux respawn-pane: tmux sets all three to the pane's PTY.
|
||||
- non-TTY (someone piped stdin in tests): none are; the
|
||||
sync just no-ops, which is the right behavior."""
|
||||
for fd in (sys.stdin.fileno(), sys.stdout.fileno(), sys.stderr.fileno()):
|
||||
for stream in (sys.stdin, sys.stdout, sys.stderr):
|
||||
try:
|
||||
fd = stream.fileno()
|
||||
data = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\x00" * 8)
|
||||
except OSError:
|
||||
continue
|
||||
|
||||
@@ -0,0 +1,83 @@
|
||||
"""smolmachines `_resolve_plan` (PRD 0023 chunks 2d + 4c).
|
||||
|
||||
Resolves the per-bottle docker subnet + bundle IP and assembles
|
||||
the guest env. The agent's docker image build → smolmachine
|
||||
pack pipeline runs in `launch.launch`, not here, so the
|
||||
dashboard's preflight modal isn't garbled by docker-build output
|
||||
before the operator has confirmed.
|
||||
|
||||
No VM bringup — that's `launch.launch`'s job."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
from .. import BottleSpec
|
||||
from ...manifest import Manifest
|
||||
from ...env import ResolvedEnv
|
||||
from ...agent_provider import AgentProvisionPlan
|
||||
from ...egress import EgressPlan
|
||||
from ...supervise import SupervisePlan
|
||||
from ...git_gate import GitGatePlan
|
||||
from .bottle_plan import SmolmachinesBottlePlan
|
||||
from .util import smolmachines_bundle_subnet, smolmachines_preflight
|
||||
|
||||
def preflight() -> None:
|
||||
smolmachines_preflight()
|
||||
|
||||
|
||||
def build_guest_env(resolved_env: ResolvedEnv) -> dict[str, str]:
|
||||
# Agent's env: resolve through resolve_env() so ?prompt entries
|
||||
# are prompted and ${HOST_VAR} entries are interpolated — matching
|
||||
# the Docker backend's contract. Forwarded (secret/interpolated)
|
||||
# values still reach the guest as -e K=V smolvm flags because
|
||||
# smolvm 0.8.0 has no env-file or stdin injection path; this is
|
||||
# the known argv-exposure gap documented in PRD 0038.
|
||||
# HTTPS_PROXY / GIT_GATE_URL / MCP_SUPERVISE_URL are populated
|
||||
# in launch.py after bundle bringup.
|
||||
return {
|
||||
**resolved_env.literals,
|
||||
**resolved_env.forwarded,
|
||||
"NO_PROXY": "localhost,127.0.0.1",
|
||||
"NODE_EXTRA_CA_CERTS": "/etc/ssl/certs/ca-certificates.crt",
|
||||
"SSL_CERT_FILE": "/etc/ssl/certs/ca-certificates.crt",
|
||||
"REQUESTS_CA_BUNDLE": "/etc/ssl/certs/ca-certificates.crt",
|
||||
}
|
||||
|
||||
|
||||
def resolve_plan(
|
||||
spec: BottleSpec,
|
||||
manifest: Manifest,
|
||||
slug: str,
|
||||
resolved_env: ResolvedEnv,
|
||||
agent_provision_plan: AgentProvisionPlan,
|
||||
egress_plan: EgressPlan,
|
||||
supervise_plan: SupervisePlan | None,
|
||||
git_gate_plan: GitGatePlan,
|
||||
stage_dir: Path,
|
||||
) -> SmolmachinesBottlePlan:
|
||||
"""Materialize the smolmachines plan. The bundle's docker
|
||||
subnet + pinned IP are derived from the slug; the agent's
|
||||
`.smolmachine` artifact is built (or cache-hit) here so
|
||||
launch's `machine create --from` boots without a registry
|
||||
pull. Per-bottle guest env + the TSI allow_cidrs land on the
|
||||
plan for launch to pass straight through to
|
||||
`machine create` flags."""
|
||||
|
||||
# ==== smolmachines specific setup ====
|
||||
subnet, gateway, bundle_ip = smolmachines_bundle_subnet(slug)
|
||||
|
||||
return SmolmachinesBottlePlan(
|
||||
spec=spec,
|
||||
manifest=manifest,
|
||||
stage_dir=stage_dir,
|
||||
slug=slug,
|
||||
bundle_subnet=subnet,
|
||||
bundle_gateway=gateway,
|
||||
bundle_ip=bundle_ip,
|
||||
guest_env=agent_provision_plan.guest_env,
|
||||
git_gate_plan=git_gate_plan,
|
||||
egress_plan=egress_plan,
|
||||
supervise_plan=supervise_plan,
|
||||
agent_provision=agent_provision_plan,
|
||||
)
|
||||
@@ -19,7 +19,7 @@ This module ships the lifecycle primitives only — create
|
||||
network, start bundle, stop bundle, remove network — wrapped
|
||||
around `subprocess.run(["docker", ...])`. Wiring them into the
|
||||
launch flow + populating the `BundleLaunchSpec` from the inner
|
||||
Plans (PipelockProxyPlan, EgressPlan, …) lands in chunk 2d."""
|
||||
Plans (EgressPlan, …) lands in chunk 2d."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
@@ -69,7 +69,7 @@ class BundleLaunchSpec:
|
||||
# Daemon subset CSV for BOT_BOTTLE_SIDECAR_DAEMONS. The
|
||||
# supervisor inside the bundle reads it to skip
|
||||
# bottle-irrelevant daemons (e.g. supervise=False bottles).
|
||||
daemons_csv: str = "egress,pipelock"
|
||||
daemons_csv: str = "egress"
|
||||
# Plain "KEY=VALUE" strings + "KEY" bare names (the bare-name
|
||||
# form inherits the value from the docker-run subprocess env,
|
||||
# matching the docker backend's compose-up secret-forwarding
|
||||
|
||||
@@ -21,7 +21,9 @@ def smolmachines_preflight() -> None:
|
||||
die(
|
||||
"BOT_BOTTLE_BACKEND=smolmachines requires `smolvm` on "
|
||||
"PATH. Install with: "
|
||||
"curl -sSL https://smolmachines.com/install.sh | sh"
|
||||
"curl -sSL https://smolmachines.com/install.sh | sh. "
|
||||
"To use the legacy Docker backend instead, set "
|
||||
"BOT_BOTTLE_BACKEND=docker or pass --backend=docker."
|
||||
)
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,71 @@
|
||||
"""Terminal escape-sequence helpers shared across all bottle backends."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import shlex
|
||||
|
||||
|
||||
# color name → (normal_idx, normal_hex, bright_idx, bright_hex, dark_bg_hex)
|
||||
# OSC 4 sets indexed palette entries (affects syntax-highlighted code and any
|
||||
# TUI content that uses indexed colors). dark_bg_hex is used for OSC 11
|
||||
# (default background) — a very dark tint that's visible even when the TUI
|
||||
# uses true/24-bit colors for its own chrome, which would otherwise bypass
|
||||
# the palette entirely.
|
||||
_COLORS: dict[str, tuple[int, str, int, str, str]] = {
|
||||
"red": (9, "#e74c3c", 1, "#c0392b", "#200808"),
|
||||
"green": (10, "#2ecc71", 2, "#27ae60", "#082008"),
|
||||
"yellow": (11, "#f1c40f", 3, "#d4ac0d", "#201808"),
|
||||
"blue": (12, "#3498db", 4, "#2471a3", "#080820"),
|
||||
"magenta": (13, "#9b59b6", 5, "#7d3c98", "#160820"),
|
||||
}
|
||||
|
||||
# OSC 104 resets all indexed palette entries; OSC 111 resets default background.
|
||||
_RESET_PRINTF = "printf '\\033]104\\007\\033]111\\007'"
|
||||
|
||||
|
||||
def palette_printf(color: str) -> str:
|
||||
"""Shell `printf` command that emits OSC 4 + OSC 11 to tint the terminal
|
||||
for *color*: sets the normal/bright palette entries AND the default
|
||||
background to a dark shade of that color. Returns '' if unknown."""
|
||||
entry = _COLORS.get(color)
|
||||
if not entry:
|
||||
return ""
|
||||
n_idx, n_hex, b_idx, b_hex, bg_hex = entry
|
||||
seq = (
|
||||
f"\\033]4;{n_idx};{n_hex}\\007"
|
||||
f"\\033]4;{b_idx};{b_hex}\\007"
|
||||
f"\\033]11;{bg_hex}\\007"
|
||||
)
|
||||
return f"printf '{seq}'"
|
||||
|
||||
|
||||
def exec_shell_script(
|
||||
agent_argv: list[str],
|
||||
terminal_title: str = "",
|
||||
terminal_color: str = "",
|
||||
) -> str | None:
|
||||
"""Build a shell script string that optionally sets the terminal
|
||||
title and/or palette before running *agent_argv*, and resets the
|
||||
palette + background on exit. Returns None when no decoration is
|
||||
needed — callers should run *agent_argv* directly in that case."""
|
||||
title_cmd = (
|
||||
f"printf '\\033]0;%s\\007' {shlex.quote(terminal_title)}"
|
||||
if terminal_title else ""
|
||||
)
|
||||
pal_cmd = palette_printf(terminal_color)
|
||||
|
||||
if not title_cmd and not pal_cmd:
|
||||
return None
|
||||
|
||||
parts: list[str] = []
|
||||
if title_cmd:
|
||||
parts.append(title_cmd)
|
||||
if pal_cmd:
|
||||
parts.append(pal_cmd)
|
||||
parts.append(shlex.join(agent_argv))
|
||||
parts.append(_RESET_PRINTF)
|
||||
else:
|
||||
# No palette change — exec so the agent replaces the shell.
|
||||
parts.append(f"exec {shlex.join(agent_argv)}")
|
||||
|
||||
return "; ".join(parts)
|
||||
+11
-27
@@ -14,7 +14,6 @@ from ..log import die, info
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from ..egress import EgressPlan
|
||||
from ..pipelock import PipelockProxyPlan
|
||||
|
||||
|
||||
# Debian-family CA layout, shared by every backend (all guest images
|
||||
@@ -35,35 +34,20 @@ def host_skill_dir(name: str) -> str:
|
||||
return f"{home}/.claude/skills/{name}"
|
||||
|
||||
|
||||
def select_ca_cert(
|
||||
egress_plan: EgressPlan, proxy_plan: PipelockProxyPlan
|
||||
) -> tuple[Path, str]:
|
||||
"""Pick the agent-facing CA cert (and a short label for the log
|
||||
line) that matches the proxy the agent's HTTP_PROXY points at.
|
||||
Egress wins when the bottle declares any routes (it sits in front
|
||||
of pipelock); else pipelock.
|
||||
def select_ca_cert(egress_plan: EgressPlan) -> tuple[Path, str]:
|
||||
"""Return the egress MITM CA cert path and label for provision_ca.
|
||||
|
||||
Shared by every backend's `provision_ca`: launch mints the chosen
|
||||
CA(s) and re-binds their host paths into these inner plans before
|
||||
provision runs, so an empty/missing path here means launch's
|
||||
bringup is broken — fatal."""
|
||||
if egress_plan.routes:
|
||||
cert = egress_plan.mitmproxy_ca_cert_only_host_path
|
||||
if cert == Path() or not cert.is_file():
|
||||
die(
|
||||
f"egress CA cert missing at {cert or '(empty)'}; "
|
||||
f"launch must have called egress_tls_init and "
|
||||
f"re-bound the plan before provision"
|
||||
)
|
||||
return cert, "egress"
|
||||
cert = proxy_plan.ca_cert_host_path
|
||||
if not cert or not cert.is_file():
|
||||
Launch always mints the CA and re-binds the host path into the
|
||||
egress_plan before provision runs, so an empty/missing path here
|
||||
means launch's bringup is broken — fatal."""
|
||||
cert = egress_plan.mitmproxy_ca_cert_only_host_path
|
||||
if cert == Path() or not cert.is_file():
|
||||
die(
|
||||
f"pipelock CA cert missing at {cert or '(empty)'}; "
|
||||
f"launch must have called pipelock_tls_init and re-bound "
|
||||
f"the plan before provision"
|
||||
f"egress CA cert missing at {cert or '(empty)'}; "
|
||||
f"launch must have called egress_tls_init and "
|
||||
f"re-bound the plan before provision"
|
||||
)
|
||||
return cert, "pipelock"
|
||||
return cert, "egress"
|
||||
|
||||
|
||||
def log_ca_fingerprint(cert_host_path: Path, label: str) -> None:
|
||||
|
||||
@@ -37,8 +37,7 @@ from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import cast
|
||||
|
||||
from ... import supervise as _supervise
|
||||
from . import util as docker_mod
|
||||
from . import supervise as _supervise
|
||||
|
||||
|
||||
# Directory layout: ~/.bot-bottle/state/<identity>/...
|
||||
@@ -49,7 +48,6 @@ _TRANSCRIPT_SUBDIR = "transcript"
|
||||
# live here so chunk 3's `docker compose up` can find them at stable
|
||||
# paths. Each sidecar's `prepare()` writes config + CAs into its own
|
||||
# subdir; the launch step is unchanged today (still `docker cp`).
|
||||
_PIPELOCK_SUBDIR = "pipelock"
|
||||
_EGRESS_SUBDIR = "egress"
|
||||
_GIT_GATE_SUBDIR = "git-gate"
|
||||
_SUPERVISE_SUBDIR = "supervise"
|
||||
@@ -57,8 +55,8 @@ _AGENT_SUBDIR = "agent"
|
||||
_METADATA_NAME = "metadata.json"
|
||||
# Live-config dir bind-mounted into the supervise sidecar (read-only).
|
||||
# Host's apply paths keep these files fresh so supervise's
|
||||
# `list-pipelock-allowlist` / `list-egress-routes` MCP tools
|
||||
# return the current state — not a snapshot from launch time.
|
||||
# `list-egress-routes` MCP tool returns the current state —
|
||||
# not a snapshot from launch time.
|
||||
_LIVE_CONFIG_SUBDIR = "live-config"
|
||||
LIVE_CONFIG_ROUTES_NAME = "routes.yaml"
|
||||
LIVE_CONFIG_ALLOWLIST_NAME = "allowlist"
|
||||
@@ -83,6 +81,7 @@ def bottle_identity(agent_name: str) -> str:
|
||||
To continue an existing bottle's state, use the recorded
|
||||
identity from BottleMetadata via `cli.py resume <identity>`,
|
||||
not this function."""
|
||||
from .backend.docker import util as docker_mod
|
||||
slug = docker_mod.slugify(agent_name)
|
||||
suffix = "".join(secrets.choice(_SUFFIX_ALPHABET) for _ in range(_RANDOM_SUFFIX_LEN))
|
||||
return f"{slug}-{suffix}"
|
||||
@@ -110,6 +109,8 @@ class BottleMetadata:
|
||||
# for state dirs written before PRD 0040; callers default to "docker"
|
||||
# for backward compatibility.
|
||||
backend: str = ""
|
||||
label: str = ""
|
||||
color: str = ""
|
||||
|
||||
|
||||
def metadata_path(identity: str) -> Path:
|
||||
@@ -145,6 +146,8 @@ def read_metadata(identity: str) -> BottleMetadata | None:
|
||||
started_at=str(raw_typed.get("started_at", "")),
|
||||
compose_project=str(raw_typed.get("compose_project", "")),
|
||||
backend=str(raw_typed.get("backend", "")),
|
||||
label=str(raw_typed.get("label", "")),
|
||||
color=str(raw_typed.get("color", "")),
|
||||
)
|
||||
|
||||
|
||||
@@ -234,12 +237,6 @@ def transcript_snapshot_dir(identity: str) -> Path:
|
||||
# nothing requested preservation.
|
||||
|
||||
|
||||
def pipelock_state_dir(identity: str) -> Path:
|
||||
"""State subdir for the pipelock sidecar: pipelock.yaml + the
|
||||
per-bottle CA cert/key. Bind-mount source from chunk 3 onward."""
|
||||
return bottle_state_dir(identity) / _PIPELOCK_SUBDIR
|
||||
|
||||
|
||||
def egress_state_dir(identity: str) -> Path:
|
||||
"""State subdir for the egress sidecar: routes.yaml + the
|
||||
per-bottle mitmproxy CA. Bind-mount source from chunk 3 onward."""
|
||||
@@ -325,7 +322,6 @@ __all__ = [
|
||||
"per_bottle_dockerfile",
|
||||
"per_bottle_dockerfile_path",
|
||||
"per_bottle_image_tag",
|
||||
"pipelock_state_dir",
|
||||
"preserve_marker_path",
|
||||
"read_metadata",
|
||||
"supervise_state_dir",
|
||||
@@ -5,7 +5,7 @@ from __future__ import annotations
|
||||
import argparse
|
||||
|
||||
from ..log import info
|
||||
from ..manifest import Manifest
|
||||
from ..manifest import ManifestIndex
|
||||
from ._common import PROG, USER_CWD
|
||||
|
||||
|
||||
@@ -14,11 +14,12 @@ def cmd_info(argv: list[str]) -> int:
|
||||
parser.add_argument("name", help="agent name defined in bot-bottle.json")
|
||||
args = parser.parse_args(argv)
|
||||
|
||||
manifest = Manifest.resolve(USER_CWD)
|
||||
manifest.require_agent(args.name)
|
||||
names = ManifestIndex.resolve(USER_CWD)
|
||||
names.require_agent(args.name)
|
||||
manifest = names.load_for_agent(args.name)
|
||||
|
||||
agent = manifest.agents[args.name]
|
||||
bottle = manifest.bottle_for(args.name)
|
||||
agent = manifest.agent
|
||||
bottle = manifest.bottle
|
||||
env_names = list(bottle.env.keys())
|
||||
prompt_first_line = agent.prompt.splitlines()[0] if agent.prompt else ""
|
||||
|
||||
@@ -31,7 +32,7 @@ def cmd_info(argv: list[str]) -> int:
|
||||
f"first line: {prompt_first_line or '(empty)'}"
|
||||
)
|
||||
info(f"bottle : {agent.bottle}")
|
||||
identity = manifest.git_identity_summary(args.name)
|
||||
identity = manifest.git_identity_summary()
|
||||
if identity:
|
||||
info(f" git identity : {identity}")
|
||||
if bottle.git:
|
||||
|
||||
+32
-8
@@ -3,12 +3,36 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import os
|
||||
import sys
|
||||
|
||||
from ..backend import enumerate_active_agents
|
||||
from ..manifest import Manifest
|
||||
from ..manifest import ManifestIndex
|
||||
from ._common import PROG, USER_CWD
|
||||
|
||||
_ANSI_COLOR_CODES: dict[str, str] = {
|
||||
"red": "\033[91m",
|
||||
"green": "\033[92m",
|
||||
"yellow": "\033[93m",
|
||||
"blue": "\033[94m",
|
||||
"magenta": "\033[95m",
|
||||
}
|
||||
_ANSI_RESET = "\033[0m"
|
||||
|
||||
|
||||
def _ansi_label(text: str, color: str) -> str:
|
||||
if not color:
|
||||
return text
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
term = os.environ.get("TERM", "")
|
||||
if term in ("dumb", ""):
|
||||
return text
|
||||
code = _ANSI_COLOR_CODES.get(color)
|
||||
if not code:
|
||||
return text
|
||||
return f"{code}{text}{_ANSI_RESET}"
|
||||
|
||||
|
||||
def cmd_list(argv: list[str]) -> int:
|
||||
parser = argparse.ArgumentParser(prog=f"{PROG} list", add_help=True)
|
||||
@@ -16,8 +40,8 @@ def cmd_list(argv: list[str]) -> int:
|
||||
args = parser.parse_args(argv)
|
||||
|
||||
if args.scope == "available":
|
||||
manifest = Manifest.resolve(USER_CWD)
|
||||
for name in manifest.agents.keys():
|
||||
manifest = ManifestIndex.resolve(USER_CWD)
|
||||
for name in manifest.all_agent_names:
|
||||
print(name)
|
||||
return 0
|
||||
|
||||
@@ -27,11 +51,11 @@ def cmd_list(argv: list[str]) -> int:
|
||||
if not active:
|
||||
print("no active bot-bottle bottles", file=sys.stderr)
|
||||
return 0
|
||||
# One line per bottle: `<backend>\t<slug>\t<agent>\t<status>`.
|
||||
# Tab-separated keeps the format stable for shell pipelines;
|
||||
# the dashboard renders the same data through its own
|
||||
# formatter.
|
||||
# One line per bottle: `<backend>\t<slug>\t<label>\t<services>`.
|
||||
# Tab-separated keeps the format stable for shell pipelines.
|
||||
for b in active:
|
||||
services = ",".join(b.services) if b.services else "-"
|
||||
print(f"{b.backend_name}\t{b.slug}\t{b.agent_name}\t{services}")
|
||||
display_name = b.label if b.label else b.agent_name
|
||||
colored_name = _ansi_label(display_name, b.color)
|
||||
print(f"{b.backend_name}\t{b.slug}\t{colored_name}\t{services}")
|
||||
return 0
|
||||
|
||||
@@ -18,9 +18,9 @@ from __future__ import annotations
|
||||
import argparse
|
||||
|
||||
from ..backend import BottleSpec
|
||||
from ..backend.docker.bottle_state import read_metadata
|
||||
from ..bottle_state import read_metadata
|
||||
from ..log import die
|
||||
from ..manifest import Manifest
|
||||
from ..manifest import ManifestIndex
|
||||
from ._common import PROG, USER_CWD
|
||||
from .start import _launch_bottle
|
||||
|
||||
@@ -42,7 +42,7 @@ def cmd_resume(argv: list[str]) -> int:
|
||||
f"check ~/.bot-bottle/state/ or run `cli.py start` to create a new bottle"
|
||||
)
|
||||
|
||||
manifest = Manifest.resolve(USER_CWD)
|
||||
manifest = ManifestIndex.resolve(USER_CWD)
|
||||
manifest.require_agent(metadata.agent_name)
|
||||
|
||||
spec = BottleSpec(
|
||||
|
||||
+35
-17
@@ -20,18 +20,20 @@ from ..agent_provider import runtime_for
|
||||
from ..backend import (
|
||||
Bottle,
|
||||
BottleSpec,
|
||||
enumerate_active_agents,
|
||||
get_bottle_backend,
|
||||
known_backend_names,
|
||||
)
|
||||
from ..backend.docker import util as docker_mod
|
||||
from ..backend.docker.bottle_plan import DockerBottlePlan
|
||||
from ..backend.docker.bottle_state import (
|
||||
from ..bottle_state import (
|
||||
cleanup_state,
|
||||
is_preserved,
|
||||
mark_preserved,
|
||||
)
|
||||
from ..backend.docker.capability_apply import snapshot_transcript
|
||||
# from ..backend.docker.capability_apply import snapshot_transcript
|
||||
from ..log import info
|
||||
from ..manifest import Manifest
|
||||
from ..manifest import ManifestIndex
|
||||
from ._common import PROG, USER_CWD, read_tty_line
|
||||
from . import tui
|
||||
|
||||
@@ -39,7 +41,7 @@ from . import tui
|
||||
def cmd_start(argv: list[str]) -> int:
|
||||
parser = argparse.ArgumentParser(prog=f"{PROG} start", add_help=True)
|
||||
parser.add_argument("--dry-run", action="store_true")
|
||||
parser.add_argument("--cwd", action="store_true", help="copy host cwd into a derived image")
|
||||
parser.add_argument("--cwd", action="store_true", help="copy host cwd into the running bottle")
|
||||
parser.add_argument("--remote-control", action="store_true")
|
||||
parser.add_argument(
|
||||
"--backend",
|
||||
@@ -47,7 +49,7 @@ def cmd_start(argv: list[str]) -> int:
|
||||
default=None,
|
||||
help=(
|
||||
"backend to launch the bottle on (default: $BOT_BOTTLE_BACKEND "
|
||||
"or 'docker'). Overrides the env var when set."
|
||||
"or host auto-selection). Overrides the env var when set."
|
||||
),
|
||||
)
|
||||
parser.add_argument(
|
||||
@@ -60,31 +62,29 @@ def cmd_start(argv: list[str]) -> int:
|
||||
|
||||
dry_run = args.dry_run or os.environ.get("BOT_BOTTLE_DRY_RUN") == "1"
|
||||
|
||||
manifest = Manifest.resolve(USER_CWD)
|
||||
manifest = ManifestIndex.resolve(USER_CWD)
|
||||
|
||||
agent_name: str | None = args.name
|
||||
if agent_name is None:
|
||||
agent_name = tui.filter_select(
|
||||
sorted(manifest.agents.keys()),
|
||||
manifest.all_agent_names,
|
||||
title="Select agent",
|
||||
)
|
||||
if agent_name is None:
|
||||
return 0
|
||||
|
||||
backend_name: str | None = args.backend
|
||||
if backend_name is None and "BOT_BOTTLE_BACKEND" not in os.environ:
|
||||
backend_name = tui.filter_select(
|
||||
list(known_backend_names()),
|
||||
title="Select backend",
|
||||
)
|
||||
if backend_name is None:
|
||||
return 0
|
||||
|
||||
label, color = tui.name_color_modal(default_label=agent_name)
|
||||
label, color = _resolve_unique_label(label, color)
|
||||
|
||||
spec = BottleSpec(
|
||||
manifest=manifest,
|
||||
agent_name=agent_name,
|
||||
copy_cwd=args.cwd,
|
||||
user_cwd=USER_CWD,
|
||||
label=label,
|
||||
color=color,
|
||||
)
|
||||
return _launch_bottle(
|
||||
spec,
|
||||
@@ -110,8 +110,8 @@ def prepare_with_preflight(
|
||||
injected callable, prompt y/N via the injected callable.
|
||||
|
||||
`backend_name` selects which backend prepares the plan
|
||||
(`None` → `$BOT_BOTTLE_BACKEND` → `docker`). The CLI passes
|
||||
whatever `--backend` resolved to.
|
||||
(`None` → `$BOT_BOTTLE_BACKEND` → host auto-selection). The CLI
|
||||
passes whatever `--backend` resolved to.
|
||||
|
||||
Returns `(plan, identity)`. `plan` is None on dry-run or
|
||||
operator-N, but `identity` is set as soon as `backend.prepare`
|
||||
@@ -136,6 +136,7 @@ def prepare_with_preflight(
|
||||
def attach_agent(
|
||||
bottle: Bottle, *, remote_control: bool = False, resume: bool = False,
|
||||
agent_provider_template: str = "claude",
|
||||
startup_args: tuple[str, ...] = (),
|
||||
) -> int:
|
||||
"""Run the selected provider CLI inside `bottle` as an
|
||||
interactive session. Blocks until the session ends; returns the
|
||||
@@ -154,6 +155,7 @@ def attach_agent(
|
||||
agent_args = list(runtime.bypass_args)
|
||||
if remote_control:
|
||||
agent_args.extend(runtime.remote_control_args)
|
||||
agent_args.extend(startup_args)
|
||||
if resume:
|
||||
agent_args.extend(runtime.resume_args)
|
||||
return bottle.exec_agent(agent_args, tty=True)
|
||||
@@ -168,7 +170,7 @@ def capture_claude_session_state(identity: str, exit_code: int) -> None:
|
||||
# instead of relying on each agent's transcript layout.
|
||||
if not identity:
|
||||
return
|
||||
snapshot_transcript(identity)
|
||||
# snapshot_transcript(identity)
|
||||
if exit_code != 0:
|
||||
mark_preserved(identity)
|
||||
|
||||
@@ -192,6 +194,21 @@ def _identity_from_plan(plan: object) -> str:
|
||||
return getattr(plan, "slug", "")
|
||||
|
||||
|
||||
def _resolve_unique_label(label: str, color: str) -> tuple[str, str]:
|
||||
"""Re-prompt with a disclaimer until the label's slug is not already
|
||||
in use among running bottles. Passes through unchanged when no
|
||||
collision is found on the first check."""
|
||||
while True:
|
||||
slug_candidate = docker_mod.slugify(label)
|
||||
active_slugs = {a.slug for a in enumerate_active_agents()}
|
||||
if slug_candidate not in active_slugs:
|
||||
return label, color
|
||||
label, color = tui.name_color_modal(
|
||||
default_label=label,
|
||||
disclaimer=f'"{label}" is already in use',
|
||||
)
|
||||
|
||||
|
||||
def _text_prompt_yes() -> bool:
|
||||
"""Default `prompt_yes` for CLI use: reads y/N from the
|
||||
controlling tty via stderr prompt + tty-line read."""
|
||||
@@ -238,6 +255,7 @@ def _launch_bottle(
|
||||
bottle,
|
||||
remote_control=remote_control,
|
||||
agent_provider_template=agent_provider_template,
|
||||
startup_args=plan.agent_provision.startup_args,
|
||||
)
|
||||
info(
|
||||
f"session ended (exit {exit_code}); "
|
||||
|
||||
+28
-86
@@ -2,11 +2,8 @@
|
||||
act on them (approve / modify / reject).
|
||||
|
||||
Curses-based TUI; modify-then-approve shells out to $EDITOR. The
|
||||
approval handlers wire to the per-tool remediation engines:
|
||||
PRD 0014 (egress, retargeted from cred-proxy in PRD 0017
|
||||
chunk 3) writes routes.yaml + SIGHUPs egress; PRD 0015
|
||||
(pipelock) writes the allowlist + restarts pipelock; PRD 0016
|
||||
(capability) rebuilds the bottle Dockerfile.
|
||||
approval handler wires to PRD 0016 (capability-block), which rebuilds
|
||||
the bottle Dockerfile. The egress-block tool was removed in issue #198.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
@@ -23,20 +20,17 @@ from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
from .. import supervise as _supervise
|
||||
from ..backend.docker.bottle_state import read_metadata
|
||||
from ..backend.docker.capability_apply import (
|
||||
CapabilityApplyError,
|
||||
apply_capability_change,
|
||||
)
|
||||
from ..backend.docker.egress_apply import EgressApplyError, add_route
|
||||
from ..backend.docker.pipelock_apply import (
|
||||
PipelockApplyError,
|
||||
apply_allowlist_change,
|
||||
fetch_current_allowlist,
|
||||
parse_allowlist_content,
|
||||
render_allowlist_content,
|
||||
)
|
||||
# from ..bottle_state import read_metadata
|
||||
# from ..backend.docker.capability_apply import (
|
||||
# CapabilityApplyError,
|
||||
# apply_capability_change,
|
||||
# )
|
||||
from ..log import Die, error, info
|
||||
|
||||
|
||||
class CapabilityApplyError(RuntimeError):
|
||||
"""Placeholder while capability_apply is disabled."""
|
||||
|
||||
from ..supervise import (
|
||||
COMPONENT_FOR_TOOL,
|
||||
AuditEntry,
|
||||
@@ -46,8 +40,6 @@ from ..supervise import (
|
||||
STATUS_MODIFIED,
|
||||
STATUS_REJECTED,
|
||||
TOOL_CAPABILITY_BLOCK,
|
||||
TOOL_EGRESS_BLOCK,
|
||||
TOOL_PIPELOCK_BLOCK,
|
||||
archive_proposal,
|
||||
list_pending_proposals,
|
||||
render_diff,
|
||||
@@ -71,7 +63,7 @@ class QueuedProposal:
|
||||
# Errors any remediation engine may raise. Caught by the TUI key
|
||||
# handlers and surfaced in the status line so a failed apply keeps
|
||||
# the proposal pending rather than crashing curses.
|
||||
ApplyError = (EgressApplyError, PipelockApplyError, CapabilityApplyError)
|
||||
ApplyError = (CapabilityApplyError,)
|
||||
|
||||
|
||||
def discover_pending() -> list[QueuedProposal]:
|
||||
@@ -92,9 +84,7 @@ def discover_pending() -> list[QueuedProposal]:
|
||||
def _approval_status(qp: QueuedProposal, verb: str) -> str:
|
||||
"""Status-line text after a successful approval."""
|
||||
base = f"{verb} {qp.proposal.tool} for [{qp.proposal.bottle_slug}]"
|
||||
if qp.proposal.tool == TOOL_CAPABILITY_BLOCK:
|
||||
return f"{base}; resume: ./cli.py resume {qp.proposal.bottle_slug}"
|
||||
return base
|
||||
return f"{base}; resume: ./cli.py resume {qp.proposal.bottle_slug}"
|
||||
|
||||
|
||||
def _detail_lines(
|
||||
@@ -116,33 +106,12 @@ def _detail_lines(
|
||||
out.extend((" " + line, 0) for line in p.justification.splitlines() or [""])
|
||||
out.extend([
|
||||
("", 0),
|
||||
(_proposed_payload_label(p.tool) + ":", 0),
|
||||
("proposed file:", 0),
|
||||
])
|
||||
out.extend((line, 0) for line in p.proposed_file.splitlines() or [""])
|
||||
if p.tool == TOOL_PIPELOCK_BLOCK:
|
||||
host = _failed_url_host(p.proposed_file)
|
||||
if host:
|
||||
out.append(("", 0))
|
||||
out.append((host, green_attr))
|
||||
return out
|
||||
|
||||
|
||||
def _failed_url_host(url: str) -> str:
|
||||
"""Best-effort hostname extraction from a pipelock-block proposal."""
|
||||
import urllib.parse
|
||||
|
||||
try:
|
||||
return urllib.parse.urlsplit(url.strip()).hostname or ""
|
||||
except ValueError:
|
||||
return ""
|
||||
|
||||
|
||||
def _proposed_payload_label(tool: str) -> str:
|
||||
if tool == TOOL_PIPELOCK_BLOCK:
|
||||
return "failed URL"
|
||||
return "proposed file"
|
||||
|
||||
|
||||
def _suffix_for_tool(tool: str) -> str:
|
||||
if tool == TOOL_CAPABILITY_BLOCK:
|
||||
return ".dockerfile"
|
||||
@@ -160,28 +129,19 @@ def approve(
|
||||
) -> None:
|
||||
"""Apply the proposal, write the waiting response, and audit it."""
|
||||
status = STATUS_MODIFIED if final_file is not None else STATUS_APPROVED
|
||||
file_to_apply = final_file if final_file is not None else qp.proposal.proposed_file
|
||||
|
||||
diff_before, diff_after = "", ""
|
||||
if qp.proposal.tool == TOOL_EGRESS_BLOCK:
|
||||
diff_before, diff_after = add_route(
|
||||
qp.proposal.bottle_slug, file_to_apply,
|
||||
)
|
||||
elif qp.proposal.tool == TOOL_PIPELOCK_BLOCK:
|
||||
diff_before, diff_after = _apply_pipelock_url(
|
||||
qp.proposal.bottle_slug, file_to_apply,
|
||||
)
|
||||
elif qp.proposal.tool == TOOL_CAPABILITY_BLOCK:
|
||||
_meta = read_metadata(qp.proposal.bottle_slug)
|
||||
if _meta is not None and not _meta.compose_project:
|
||||
raise CapabilityApplyError(
|
||||
"capability-block remediation is not supported for smolmachines "
|
||||
"bottles. Reject this proposal or handle the capability change "
|
||||
"manually, then restart the bottle."
|
||||
)
|
||||
diff_before, diff_after = apply_capability_change(
|
||||
qp.proposal.bottle_slug, file_to_apply,
|
||||
)
|
||||
# if qp.proposal.tool == TOOL_CAPABILITY_BLOCK:
|
||||
# _meta = read_metadata(qp.proposal.bottle_slug)
|
||||
# if _meta is not None and not _meta.compose_project:
|
||||
# raise CapabilityApplyError(
|
||||
# "capability-block remediation is not supported for smolmachines "
|
||||
# "bottles. Reject this proposal or handle the capability change "
|
||||
# "manually, then restart the bottle."
|
||||
# )
|
||||
# diff_before, diff_after = apply_capability_change(
|
||||
# qp.proposal.bottle_slug, file_to_apply,
|
||||
# )
|
||||
|
||||
response = Response(
|
||||
proposal_id=qp.proposal.id,
|
||||
@@ -210,23 +170,6 @@ def reject(qp: QueuedProposal, *, reason: str) -> None:
|
||||
_write_audit(qp, action=STATUS_REJECTED, notes=reason, diff_before="", diff_after="")
|
||||
|
||||
|
||||
def _apply_pipelock_url(slug: str, failed_url: str) -> tuple[str, str]:
|
||||
"""Merge a pipelock-block failed URL's host into the allowlist."""
|
||||
import urllib.parse
|
||||
|
||||
parsed = urllib.parse.urlsplit(failed_url.strip())
|
||||
host = parsed.hostname or ""
|
||||
if not host:
|
||||
raise PipelockApplyError(
|
||||
f"proposed failed_url has no extractable host: {failed_url!r}"
|
||||
)
|
||||
current = fetch_current_allowlist(slug)
|
||||
hosts = parse_allowlist_content(current)
|
||||
if host not in hosts:
|
||||
hosts.append(host)
|
||||
return apply_allowlist_change(slug, render_allowlist_content(hosts))
|
||||
|
||||
|
||||
def _write_audit(
|
||||
qp: QueuedProposal,
|
||||
*,
|
||||
@@ -235,7 +178,7 @@ def _write_audit(
|
||||
diff_before: str,
|
||||
diff_after: str,
|
||||
) -> None:
|
||||
"""Audit log for egress / pipelock tools."""
|
||||
"""Audit log for egress tool."""
|
||||
component = COMPONENT_FOR_TOOL.get(qp.proposal.tool)
|
||||
if component is None:
|
||||
return
|
||||
@@ -467,8 +410,7 @@ def _render(
|
||||
cursor = "> " if i == selected else " "
|
||||
line = (
|
||||
f"{cursor}{ts_short} "
|
||||
f"[{p.bottle_slug}] {p.tool:<18} {p.id[:8]} "
|
||||
f"{_proposed_payload_label(p.tool)}"
|
||||
f"[{p.bottle_slug}] {p.tool:<18} {p.id[:8]}"
|
||||
)
|
||||
attr = curses.A_REVERSE if i == selected else curses.A_NORMAL
|
||||
stdscr.addnstr(row, 0, line, w - 1, attr)
|
||||
|
||||
+218
-2
@@ -3,6 +3,7 @@
|
||||
Exposed surface:
|
||||
|
||||
filter_select(items, *, title="", tty_path="/dev/tty") -> str | None
|
||||
name_color_modal(default_label, *, tty_path="/dev/tty") -> (str, str)
|
||||
|
||||
Opens /dev/tty directly so the picker works even when stdout/stdin are
|
||||
redirected. Returns the selected item or None on cancel.
|
||||
@@ -42,8 +43,7 @@ def filter_select(
|
||||
# Use os.dup() to duplicate the fd so the original file object
|
||||
# and FileIO in _run_picker each manage independent copies,
|
||||
# preventing double-close errors.
|
||||
import os as _os
|
||||
fd_dup = _os.dup(tty_fd.fileno())
|
||||
fd_dup = os.dup(tty_fd.fileno())
|
||||
return _run_picker(items, title=title, tty_fd=fd_dup)
|
||||
finally:
|
||||
tty_fd.close()
|
||||
@@ -219,3 +219,219 @@ def _addstr_safe(screen: Any, row: int, col: int, text: str, attr: int = curses.
|
||||
screen.addstr(row, col, text, attr)
|
||||
except curses.error:
|
||||
pass
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# name_color_modal — two-step label + color picker
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_ANSI_COLORS = [
|
||||
"red", "green", "yellow", "blue", "magenta",
|
||||
]
|
||||
|
||||
_CURSES_COLOR_MAP: dict[str, int] = {
|
||||
"red": curses.COLOR_RED,
|
||||
"green": curses.COLOR_GREEN,
|
||||
"yellow": curses.COLOR_YELLOW,
|
||||
"blue": curses.COLOR_BLUE,
|
||||
"magenta": curses.COLOR_MAGENTA,
|
||||
}
|
||||
|
||||
_COLOR_NONE = "(none)"
|
||||
|
||||
|
||||
def name_color_modal(
|
||||
default_label: str,
|
||||
*,
|
||||
disclaimer: str = "",
|
||||
tty_path: str = "/dev/tty",
|
||||
) -> tuple[str, str]:
|
||||
"""Present a two-step curses modal: first edit the agent label,
|
||||
then optionally pick a color.
|
||||
|
||||
``disclaimer`` is shown below the input field — use it to surface
|
||||
an error from a previous attempt (e.g. name already in use).
|
||||
|
||||
Returns ``(label, color)`` where ``color`` is one of the 16 ANSI
|
||||
color name strings or ``""`` for no color. Falls back to
|
||||
``(default_label, "")`` on any error (terminal too small, not a tty).
|
||||
"""
|
||||
try:
|
||||
tty_fd = open(tty_path, "r+b", buffering=0) # pylint: disable=consider-using-with
|
||||
except OSError:
|
||||
return default_label, ""
|
||||
|
||||
try:
|
||||
fd_dup = os.dup(tty_fd.fileno())
|
||||
return _run_name_color(default_label, tty_fd=fd_dup, disclaimer=disclaimer)
|
||||
except Exception: # noqa: BLE001 # pylint: disable=broad-exception-caught
|
||||
return default_label, ""
|
||||
finally:
|
||||
tty_fd.close()
|
||||
|
||||
|
||||
def _run_name_color(default_label: str, *, tty_fd: int, disclaimer: str = "") -> tuple[str, str]:
|
||||
import io
|
||||
orig_stdin = sys.__stdin__
|
||||
orig_stdout = sys.__stdout__
|
||||
try:
|
||||
tty_text = io.TextIOWrapper(io.FileIO(tty_fd, mode="r+"), write_through=True)
|
||||
sys.__stdin__ = tty_text # type: ignore[assignment]
|
||||
sys.__stdout__ = tty_text # type: ignore[assignment]
|
||||
os.environ.setdefault("TERM", "xterm-256color")
|
||||
|
||||
screen = curses.initscr()
|
||||
curses.noecho()
|
||||
curses.cbreak()
|
||||
screen.keypad(True)
|
||||
try:
|
||||
label = _label_step(screen, default_label, disclaimer=disclaimer)
|
||||
color = _color_step(screen, label)
|
||||
finally:
|
||||
screen.keypad(False)
|
||||
curses.nocbreak()
|
||||
curses.echo()
|
||||
curses.endwin()
|
||||
finally:
|
||||
sys.__stdin__ = orig_stdin # type: ignore[assignment]
|
||||
sys.__stdout__ = orig_stdout # type: ignore[assignment]
|
||||
return label, color
|
||||
|
||||
|
||||
def _label_step(screen: Any, default_label: str, *, disclaimer: str = "") -> str:
|
||||
"""Step 1: edit the label. First printable key replaces the
|
||||
pre-fill; subsequent keys append. Enter confirms."""
|
||||
text = default_label
|
||||
replaced = False # True once the user has typed their first char
|
||||
|
||||
while True:
|
||||
_render_label(screen, text, disclaimer=disclaimer)
|
||||
try:
|
||||
key = screen.getch()
|
||||
except KeyboardInterrupt:
|
||||
return default_label
|
||||
|
||||
if key in (curses.KEY_ENTER, _KEY_ENTER_ALT, ord("\r")):
|
||||
return text.strip() or default_label
|
||||
|
||||
if key in (curses.KEY_BACKSPACE, _KEY_BACKSPACE_WIN, 127):
|
||||
if replaced:
|
||||
text = text[:-1]
|
||||
else:
|
||||
text = ""
|
||||
replaced = True
|
||||
|
||||
elif 32 <= key <= 126:
|
||||
if not replaced:
|
||||
text = chr(key)
|
||||
replaced = True
|
||||
else:
|
||||
text += chr(key)
|
||||
|
||||
|
||||
def _render_label(screen: Any, text: str, *, disclaimer: str = "") -> None:
|
||||
screen.erase()
|
||||
rows, cols = screen.getmaxyx()
|
||||
sep = "─" * min(cols - 1, 40)
|
||||
_addstr_safe(screen, 0, 0, "Name agent", curses.A_BOLD)
|
||||
_addstr_safe(screen, 1, 0, sep)
|
||||
_addstr_safe(screen, 2, 0, text[:cols - 1], curses.A_REVERSE)
|
||||
_addstr_safe(screen, 3, 0, sep)
|
||||
row = 4
|
||||
if disclaimer and rows > row + 1:
|
||||
_addstr_safe(screen, row, 0, disclaimer[:cols - 1], curses.A_BOLD)
|
||||
row += 1
|
||||
if rows > row + 1:
|
||||
_addstr_safe(screen, row, 0, "[any key] edit [Enter] confirm", curses.A_DIM)
|
||||
screen.refresh()
|
||||
|
||||
|
||||
def _color_step(screen: Any, confirmed_label: str) -> str:
|
||||
"""Step 2: pick a color from the list, or skip."""
|
||||
items = [_COLOR_NONE] + _ANSI_COLORS
|
||||
cursor = 0
|
||||
|
||||
# Initialise color pairs once; index 0 = none, 1..16 = palette.
|
||||
color_attrs = _init_color_pairs()
|
||||
|
||||
while True:
|
||||
_render_color(screen, items, cursor, confirmed_label, color_attrs)
|
||||
try:
|
||||
key = screen.getch()
|
||||
except KeyboardInterrupt:
|
||||
return ""
|
||||
|
||||
if key in (ord("q"), _KEY_ESC):
|
||||
return ""
|
||||
|
||||
if key in (curses.KEY_ENTER, _KEY_ENTER_ALT, ord("\r")):
|
||||
chosen = items[cursor]
|
||||
return "" if chosen == _COLOR_NONE else chosen
|
||||
|
||||
if key in (curses.KEY_UP, ord("k")) and cursor > 0:
|
||||
cursor -= 1
|
||||
elif key in (curses.KEY_DOWN, ord("j")) and cursor < len(items) - 1:
|
||||
cursor += 1
|
||||
|
||||
|
||||
def _init_color_pairs() -> dict[str, int]:
|
||||
"""Return {color_name: curses_attr} for the palette items."""
|
||||
attrs: dict[str, int] = {_COLOR_NONE: curses.A_NORMAL}
|
||||
try:
|
||||
curses.start_color()
|
||||
curses.use_default_colors()
|
||||
pair_idx = 2 # pair 1 reserved for other uses
|
||||
for name in _ANSI_COLORS:
|
||||
fg = _CURSES_COLOR_MAP.get(name, curses.COLOR_WHITE)
|
||||
try:
|
||||
curses.init_pair(pair_idx, fg, -1)
|
||||
attr = curses.color_pair(pair_idx) | curses.A_BOLD
|
||||
attrs[name] = attr
|
||||
pair_idx += 1
|
||||
except curses.error:
|
||||
attrs[name] = curses.A_NORMAL
|
||||
except curses.error:
|
||||
for name in _ANSI_COLORS:
|
||||
attrs[name] = curses.A_NORMAL
|
||||
return attrs
|
||||
|
||||
|
||||
def _render_color(
|
||||
screen: Any,
|
||||
items: list[str],
|
||||
cursor: int,
|
||||
confirmed_label: str,
|
||||
color_attrs: dict[str, int],
|
||||
) -> None:
|
||||
screen.erase()
|
||||
rows, cols = screen.getmaxyx()
|
||||
sep = "─" * min(cols - 1, 40)
|
||||
_addstr_safe(screen, 0, 0, "Name agent", curses.A_BOLD)
|
||||
_addstr_safe(screen, 1, 0, sep)
|
||||
_addstr_safe(screen, 2, 0, confirmed_label[:cols - 1])
|
||||
_addstr_safe(screen, 3, 0, sep)
|
||||
_addstr_safe(screen, 4, 0, "Color (optional)", curses.A_BOLD)
|
||||
|
||||
list_start = 5
|
||||
list_rows = rows - list_start - 2
|
||||
scroll = max(0, cursor - list_rows + 1)
|
||||
visible = items[scroll: scroll + list_rows]
|
||||
|
||||
for idx, name in enumerate(visible):
|
||||
abs_idx = scroll + idx
|
||||
row = list_start + idx
|
||||
if row >= rows - 2:
|
||||
break
|
||||
prefix = "> " if abs_idx == cursor else " "
|
||||
attr = color_attrs.get(name, curses.A_NORMAL)
|
||||
if abs_idx == cursor:
|
||||
attr |= curses.A_REVERSE
|
||||
_addstr_safe(screen, row, 0, (prefix + name)[:cols - 1], attr)
|
||||
|
||||
_addstr_safe(screen, rows - 2, 0, sep)
|
||||
_addstr_safe(
|
||||
screen, rows - 1, 0,
|
||||
"[↑↓/jk] move [Enter] select [Esc/q] skip",
|
||||
curses.A_DIM,
|
||||
)
|
||||
screen.refresh()
|
||||
|
||||
@@ -16,21 +16,27 @@ FROM node:22-slim
|
||||
# features (status checks, commits, PR creation) — without git in the
|
||||
# image, those features fail in surprising ways once the user does any
|
||||
# real work. ca-certificates is already in the slim base; listed for
|
||||
# clarity in case the base ever drops it. socat is the privileged
|
||||
# forwarder for the in-container ssh-agent (see bot_bottle/ssh.py): the agent
|
||||
# runs as root and rejects non-root connections, so socat sits between
|
||||
# node and the agent socket. curl is here so any HTTPS_PROXY-aware
|
||||
# tool (curl itself, plus anything that shells out to it) works
|
||||
# against pipelock's bumped TLS without the agent needing local DNS.
|
||||
# clarity in case the base ever drops it. curl is here so any
|
||||
# HTTPS_PROXY-aware tool (curl itself, plus anything that shells out
|
||||
# to it) works against egress's bumped TLS without the agent needing
|
||||
# local DNS.
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends git ca-certificates openssh-client socat curl dnsutils python3 python3-pip python3-venv \
|
||||
&& apt-get install -y --no-install-recommends git ca-certificates curl \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# App-specific deps. Python isn't required by claude-code itself
|
||||
# (claude-code is a Node CLI), but is convenient for the agent to
|
||||
# shell out to for ad-hoc scripts. Kept on its own layer so it can
|
||||
# be moved to a downstream image if the base ever needs to shrink.
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends python3 python3-pip python3-venv \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Install claude-code globally. Pinned to the version verified in the v1
|
||||
# build (`claude --version` returns 2.1.126). Bump deliberately when
|
||||
# rolling forward; an unpinned install would mean rebuilds silently pick
|
||||
# up new behavior.
|
||||
RUN npm install -g --no-fund --no-audit @anthropic-ai/claude-code@2.1.126 \
|
||||
RUN npm install -g --no-fund --no-audit @anthropic-ai/claude-code@2.1.172 \
|
||||
&& npm cache clean --force
|
||||
|
||||
# Run as a non-root user. The node image already provides a `node` user
|
||||
@@ -17,9 +17,11 @@ from typing import TYPE_CHECKING
|
||||
from ...agent_provider import (
|
||||
AgentProvider,
|
||||
AgentProviderRuntime,
|
||||
AgentProvisionDir,
|
||||
AgentProvisionFile,
|
||||
AgentProvisionPlan,
|
||||
)
|
||||
from ...backend.docker import util as docker_mod
|
||||
from ...egress import EgressRoute
|
||||
from ...log import die, info, warn
|
||||
|
||||
@@ -28,8 +30,6 @@ if TYPE_CHECKING:
|
||||
from ...backend import Bottle, BottlePlan
|
||||
|
||||
|
||||
_REPO_ROOT = Path(__file__).resolve().parents[3]
|
||||
|
||||
_SUPERVISE_MCP_NAME = "supervise"
|
||||
|
||||
|
||||
@@ -40,11 +40,53 @@ def _skills_dir(guest_home: str) -> str:
|
||||
def _prompt_path(guest_home: str) -> str:
|
||||
return f"{guest_home}/.bot-bottle-prompt.txt"
|
||||
|
||||
|
||||
_STATUS_LINE_COLORS = {
|
||||
"red": "\033[91m",
|
||||
"green": "\033[92m",
|
||||
"yellow": "\033[93m",
|
||||
"blue": "\033[94m",
|
||||
"magenta": "\033[95m",
|
||||
}
|
||||
|
||||
_CLAUDE_THEME_COLORS = {
|
||||
"red": "redBright",
|
||||
"green": "greenBright",
|
||||
"yellow": "yellowBright",
|
||||
"blue": "blueBright",
|
||||
"magenta": "magentaBright",
|
||||
}
|
||||
|
||||
|
||||
def _status_line_script(label: str, color: str) -> str:
|
||||
if not label:
|
||||
return "#!/bin/sh\nprintf '\\n'\n"
|
||||
label_q = shlex.quote(label)
|
||||
if color and color in _STATUS_LINE_COLORS:
|
||||
return (
|
||||
"#!/bin/sh\n"
|
||||
f"printf '%b%s%b\\n' '{_STATUS_LINE_COLORS[color]}' {label_q} '\\033[0m'\n"
|
||||
)
|
||||
return f"#!/bin/sh\nprintf '%s\\n' {label_q}\n"
|
||||
|
||||
|
||||
def _custom_theme_payload(color: str) -> dict[str, object] | None:
|
||||
theme_color = _CLAUDE_THEME_COLORS.get(color)
|
||||
if not theme_color:
|
||||
return None
|
||||
return {
|
||||
"name": f"Bot-bottle {color}",
|
||||
"base": "dark",
|
||||
"overrides": {
|
||||
"claude": f"ansi:{theme_color}",
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
_RUNTIME = AgentProviderRuntime(
|
||||
template="claude",
|
||||
command="claude",
|
||||
image="bot-bottle-claude:latest",
|
||||
dockerfile=str(_REPO_ROOT / "Dockerfile.claude"),
|
||||
prompt_mode="append_file",
|
||||
bypass_args=("--dangerously-skip-permissions",),
|
||||
resume_args=("--continue",),
|
||||
@@ -62,54 +104,103 @@ class ClaudeAgentProvider(AgentProvider):
|
||||
*,
|
||||
dockerfile: str,
|
||||
state_dir: Path,
|
||||
guest_home: str,
|
||||
instance_name: str,
|
||||
prompt_file: Path,
|
||||
guest_env: dict[str, str] | None = None,
|
||||
auth_token: str = "",
|
||||
forward_host_credentials: bool = False,
|
||||
host_env: dict[str, str] | None = None,
|
||||
trusted_project_path: str = "",
|
||||
label: str = "",
|
||||
color: str = "",
|
||||
provider_settings: dict[str, object] | None = None,
|
||||
) -> AgentProvisionPlan:
|
||||
del forward_host_credentials, host_env # Codex-only knobs
|
||||
del forward_host_credentials, host_env, provider_settings
|
||||
resolved_guest_env = dict(guest_env or {})
|
||||
guest_home = self.guest_home
|
||||
trusted_path = trusted_project_path or guest_home
|
||||
|
||||
env_vars: dict[str, str] = {
|
||||
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
|
||||
"DISABLE_ERROR_REPORTING": "1",
|
||||
}
|
||||
dirs = (
|
||||
AgentProvisionDir(f"{guest_home}/.claude"),
|
||||
AgentProvisionDir(f"{guest_home}/.claude/themes"),
|
||||
)
|
||||
claude_config = state_dir / "claude.json"
|
||||
claude_projects = {guest_home: {"hasTrustDialogAccepted": True}}
|
||||
claude_projects[trusted_path] = {"hasTrustDialogAccepted": True}
|
||||
claude_config.write_text(json.dumps({
|
||||
payload: dict[str, object] = {
|
||||
"hasCompletedOnboarding": True,
|
||||
"theme": "dark",
|
||||
"bypassPermissionsModeAccepted": True,
|
||||
"projects": claude_projects,
|
||||
}, indent=2) + "\n")
|
||||
}
|
||||
claude_config.write_text(json.dumps(payload, indent=2) + "\n")
|
||||
claude_config.chmod(0o600)
|
||||
files = (
|
||||
files = [
|
||||
AgentProvisionFile(claude_config, f"{guest_home}/.claude.json"),
|
||||
)
|
||||
]
|
||||
|
||||
claude_settings = state_dir / "claude-settings.json"
|
||||
claude_settings_payload: dict[str, object] = {}
|
||||
if label or color:
|
||||
statusline_script = state_dir / "claude-statusline.sh"
|
||||
statusline_script.write_text(_status_line_script(label, color))
|
||||
statusline_script.chmod(0o755)
|
||||
files.append(AgentProvisionFile(
|
||||
statusline_script,
|
||||
f"{guest_home}/.claude/statusline.sh",
|
||||
mode="755",
|
||||
))
|
||||
claude_settings_payload["statusLine"] = {
|
||||
"type": "command",
|
||||
"command": "~/.claude/statusline.sh",
|
||||
}
|
||||
theme_payload = _custom_theme_payload(color)
|
||||
if theme_payload is not None:
|
||||
theme_name = f"bot-bottle-{docker_mod.slugify(label or color)}"
|
||||
theme_file = state_dir / f"{theme_name}.json"
|
||||
theme_file.write_text(json.dumps(theme_payload, indent=2) + "\n")
|
||||
theme_file.chmod(0o644)
|
||||
files.append(AgentProvisionFile(
|
||||
theme_file,
|
||||
f"{guest_home}/.claude/themes/{theme_name}.json",
|
||||
))
|
||||
claude_settings_payload["theme"] = f"custom:{theme_name}"
|
||||
if claude_settings_payload:
|
||||
claude_settings.write_text(json.dumps(claude_settings_payload, indent=2) + "\n")
|
||||
claude_settings.chmod(0o600)
|
||||
files.append(AgentProvisionFile(
|
||||
claude_settings,
|
||||
f"{guest_home}/.claude/settings.json",
|
||||
))
|
||||
egress_routes = (EgressRoute(
|
||||
host="api.anthropic.com",
|
||||
auth_scheme="Bearer" if auth_token else "",
|
||||
token_ref=auth_token,
|
||||
tls_passthrough=True,
|
||||
),)
|
||||
hidden_env_names: frozenset[str] = frozenset()
|
||||
if auth_token:
|
||||
env_vars["CLAUDE_CODE_OAUTH_TOKEN"] = "egress-placeholder"
|
||||
hidden_env_names = frozenset({"CLAUDE_CODE_OAUTH_TOKEN"})
|
||||
|
||||
has_prompt = prompt_file.exists() and bool(prompt_file.read_text())
|
||||
return AgentProvisionPlan(
|
||||
template=_RUNTIME.template,
|
||||
command=_RUNTIME.command,
|
||||
prompt_mode=_RUNTIME.prompt_mode,
|
||||
image=_RUNTIME.image,
|
||||
dockerfile=dockerfile,
|
||||
guest_home=guest_home,
|
||||
instance_name=instance_name,
|
||||
prompt_file=prompt_file,
|
||||
env_vars=env_vars,
|
||||
guest_env=resolved_guest_env,
|
||||
files=files,
|
||||
has_prompt=has_prompt,
|
||||
dirs=dirs,
|
||||
files=tuple(files),
|
||||
egress_routes=egress_routes,
|
||||
hidden_env_names=hidden_env_names,
|
||||
)
|
||||
@@ -120,7 +211,7 @@ class ClaudeAgentProvider(AgentProvider):
|
||||
when the agent has no skills."""
|
||||
from ...backend.util import host_skill_dir
|
||||
|
||||
agent = plan.spec.manifest.agents[plan.spec.agent_name]
|
||||
agent = plan.manifest.agent
|
||||
if not agent.skills:
|
||||
return
|
||||
skills_dir = _skills_dir(plan.guest_home)
|
||||
@@ -149,8 +240,8 @@ class ClaudeAgentProvider(AgentProvider):
|
||||
f"chown node:node {prompt_path} && chmod 600 {prompt_path}",
|
||||
user="root",
|
||||
)
|
||||
agent = plan.spec.manifest.agents[plan.spec.agent_name]
|
||||
return prompt_path if agent.prompt else None
|
||||
agent = plan.manifest.agent
|
||||
return prompt_path if plan.agent_provision.has_prompt or agent.prompt else None
|
||||
|
||||
def provision(self, plan: "BottlePlan", bottle: "Bottle") -> None:
|
||||
"""Apply the claude-side declarative provision steps from
|
||||
|
||||
@@ -6,7 +6,15 @@
|
||||
FROM node:22-slim
|
||||
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends git ca-certificates openssh-client socat curl dnsutils python3 python3-pip python3-venv \
|
||||
&& apt-get install -y --no-install-recommends git ca-certificates curl \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# App-specific deps. Python isn't required by codex itself
|
||||
# (codex is a Node CLI), but is convenient for the agent to shell
|
||||
# out to for ad-hoc scripts. Kept on its own layer so it can be
|
||||
# moved to a downstream image if the base ever needs to shrink.
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends python3 python3-pip python3-venv \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
RUN npm install -g --no-fund --no-audit @openai/codex@0.136.0 \
|
||||
@@ -18,12 +18,12 @@ from ...agent_provider import (
|
||||
CODEX_HOST_CREDENTIAL_HOSTS,
|
||||
AgentProvider,
|
||||
AgentProviderRuntime,
|
||||
AgentProvisionCommand,
|
||||
AgentProvisionDir,
|
||||
AgentProvisionCommand,
|
||||
AgentProvisionFile,
|
||||
AgentProvisionPlan,
|
||||
)
|
||||
from ...codex_auth import codex_host_access_token, write_codex_dummy_auth_file
|
||||
from .codex_auth import codex_host_access_token, write_codex_dummy_auth_file
|
||||
from ...egress import CODEX_HOST_CREDENTIAL_TOKEN_REF, EgressRoute
|
||||
from ...log import die, info, warn
|
||||
|
||||
@@ -32,8 +32,6 @@ if TYPE_CHECKING:
|
||||
from ...backend import Bottle, BottlePlan
|
||||
|
||||
|
||||
_REPO_ROOT = Path(__file__).resolve().parents[3]
|
||||
|
||||
_SUPERVISE_MCP_NAME = "supervise"
|
||||
|
||||
|
||||
@@ -48,11 +46,11 @@ def _skills_dir(guest_home: str) -> str:
|
||||
def _prompt_path(guest_home: str) -> str:
|
||||
return f"{guest_home}/.bot-bottle-prompt.txt"
|
||||
|
||||
|
||||
_RUNTIME = AgentProviderRuntime(
|
||||
template="codex",
|
||||
command="codex",
|
||||
image="bot-bottle-codex:latest",
|
||||
dockerfile=str(_REPO_ROOT / "Dockerfile.codex"),
|
||||
prompt_mode="read_prompt_file",
|
||||
bypass_args=("--dangerously-bypass-approvals-and-sandbox",),
|
||||
resume_args=("resume", "--last"),
|
||||
@@ -70,15 +68,20 @@ class CodexAgentProvider(AgentProvider):
|
||||
*,
|
||||
dockerfile: str,
|
||||
state_dir: Path,
|
||||
guest_home: str,
|
||||
instance_name: str,
|
||||
prompt_file: Path,
|
||||
guest_env: dict[str, str] | None = None,
|
||||
auth_token: str = "",
|
||||
forward_host_credentials: bool = False,
|
||||
host_env: dict[str, str] | None = None,
|
||||
trusted_project_path: str = "",
|
||||
label: str = "",
|
||||
color: str = "",
|
||||
provider_settings: dict[str, object] | None = None,
|
||||
) -> AgentProvisionPlan:
|
||||
del auth_token # Claude-only knob
|
||||
del auth_token, label, color, provider_settings
|
||||
resolved_guest_env = dict(guest_env or {})
|
||||
guest_home = self.guest_home
|
||||
trusted_path = trusted_project_path or guest_home
|
||||
|
||||
env_vars: dict[str, str] = {
|
||||
@@ -100,6 +103,11 @@ class CodexAgentProvider(AgentProvider):
|
||||
config_file.write_text(
|
||||
f'[projects."{toml_path}"]\n'
|
||||
'trust_level = "trusted"\n'
|
||||
"\n"
|
||||
"[tui]\n"
|
||||
'status_line = ["model-with-reasoning"]\n'
|
||||
'terminal_title = ["spinner", "project"]\n'
|
||||
'theme = "ansi"\n'
|
||||
)
|
||||
config_file.chmod(0o600)
|
||||
files.append(AgentProvisionFile(config_file, config_path))
|
||||
@@ -110,7 +118,6 @@ class CodexAgentProvider(AgentProvider):
|
||||
host=host,
|
||||
auth_scheme="Bearer" if forward_host_credentials else "",
|
||||
token_ref=CODEX_HOST_CREDENTIAL_TOKEN_REF if forward_host_credentials else "",
|
||||
tls_passthrough=True,
|
||||
))
|
||||
|
||||
if forward_host_credentials:
|
||||
@@ -143,14 +150,19 @@ class CodexAgentProvider(AgentProvider):
|
||||
"guest, but Codex did not accept it"
|
||||
)))
|
||||
|
||||
has_prompt = prompt_file.exists() and bool(prompt_file.read_text())
|
||||
return AgentProvisionPlan(
|
||||
template=_RUNTIME.template,
|
||||
command=_RUNTIME.command,
|
||||
prompt_mode=_RUNTIME.prompt_mode,
|
||||
image=_RUNTIME.image,
|
||||
dockerfile=dockerfile,
|
||||
guest_home=guest_home,
|
||||
instance_name=instance_name,
|
||||
prompt_file=prompt_file,
|
||||
env_vars=env_vars,
|
||||
guest_env=resolved_guest_env,
|
||||
has_prompt=has_prompt,
|
||||
dirs=tuple(dirs),
|
||||
files=tuple(files),
|
||||
pre_copy=tuple(pre_copy),
|
||||
@@ -165,7 +177,7 @@ class CodexAgentProvider(AgentProvider):
|
||||
skills."""
|
||||
from ...backend.util import host_skill_dir
|
||||
|
||||
agent = plan.spec.manifest.agents[plan.spec.agent_name]
|
||||
agent = plan.manifest.agent
|
||||
if not agent.skills:
|
||||
return
|
||||
skills_dir = _skills_dir(plan.guest_home)
|
||||
@@ -194,8 +206,8 @@ class CodexAgentProvider(AgentProvider):
|
||||
f"chown node:node {prompt_path} && chmod 600 {prompt_path}",
|
||||
user="root",
|
||||
)
|
||||
agent = plan.spec.manifest.agents[plan.spec.agent_name]
|
||||
return prompt_path if agent.prompt else None
|
||||
agent = plan.manifest.agent
|
||||
return prompt_path if plan.agent_provision.has_prompt or agent.prompt else None
|
||||
|
||||
def provision(self, plan: "BottlePlan", bottle: "Bottle") -> None:
|
||||
"""Apply the codex-side declarative provision steps from
|
||||
|
||||
@@ -15,8 +15,8 @@ from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import cast
|
||||
|
||||
from .log import die
|
||||
from .util import expand_tilde
|
||||
from ...log import die
|
||||
from ...util import expand_tilde
|
||||
|
||||
|
||||
def codex_auth_path(host_env: dict[str, str] | None = None) -> Path:
|
||||
@@ -2,7 +2,13 @@
|
||||
|
||||
Generates ed25519 keypairs via `ssh-keygen` and registers / deletes
|
||||
them using the Gitea deploy-key HTTP API. No new Python dependencies —
|
||||
only stdlib `urllib.request` and `subprocess`."""
|
||||
only stdlib `urllib.request` and `subprocess`.
|
||||
|
||||
Required token permissions (Gitea "Applications" → "Generate Token"):
|
||||
- Repository: Read & Write
|
||||
Grants POST /api/v1/repos/{owner}/{repo}/keys (create deploy key)
|
||||
and DELETE /api/v1/repos/{owner}/{repo}/keys/{id} (revoke deploy key).
|
||||
No other scopes are needed."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
|
||||
@@ -0,0 +1,41 @@
|
||||
# bot-bottle Pi provider image.
|
||||
#
|
||||
# Node LTS, git/network tooling, and the Pi coding-agent CLI installed globally.
|
||||
|
||||
FROM node:22-slim
|
||||
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
git \
|
||||
ca-certificates \
|
||||
curl \
|
||||
fd-find \
|
||||
ripgrep \
|
||||
&& ln -s /usr/bin/fdfind /usr/local/bin/fd \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends python3 python3-pip python3-venv \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
RUN npm install -g --ignore-scripts --no-fund --no-audit @earendil-works/pi-coding-agent \
|
||||
&& npm cache clean --force
|
||||
|
||||
RUN mkdir -p /home/node/.pi/agent \
|
||||
/home/node/.pi/context-mode/sessions \
|
||||
/tmp/pi-subagents-uid-1000 \
|
||||
&& chown -R node:node /home/node/.pi /tmp \
|
||||
&& chmod -R u+rwX /tmp \
|
||||
&& chown root:root /tmp /var/tmp \
|
||||
&& chmod 1777 /tmp /var/tmp
|
||||
|
||||
USER node
|
||||
WORKDIR /home/node
|
||||
|
||||
RUN pi install npm:@harms-haus/pi-cwd \
|
||||
&& pi install npm:pi-web-access \
|
||||
&& pi install npm:context-mode \
|
||||
&& pi install npm:pi-subagents \
|
||||
&& pi install npm:pi-mcp-adapter
|
||||
|
||||
CMD ["pi"]
|
||||
@@ -0,0 +1 @@
|
||||
"""Pi agent provider package."""
|
||||
@@ -0,0 +1,319 @@
|
||||
"""Pi agent provider plugin (PRD 0058, contrib).
|
||||
|
||||
Pi uses ~/.pi/agent/models.json for custom provider/model settings.
|
||||
This provider writes an Ollama-compatible default configuration and
|
||||
lets bottles override the model endpoint and model ids via
|
||||
agent_provider.settings.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import shlex
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
from urllib.parse import urlparse
|
||||
|
||||
from ...agent_provider import (
|
||||
AgentProvider,
|
||||
AgentProviderRuntime,
|
||||
AgentProvisionDir,
|
||||
AgentProvisionFile,
|
||||
AgentProvisionPlan,
|
||||
)
|
||||
from ...egress import EgressRoute
|
||||
from ...log import die, info
|
||||
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from ...backend import Bottle, BottlePlan
|
||||
|
||||
|
||||
_DEFAULT_BASE_URL = "http://ollama:11434/v1"
|
||||
_DEFAULT_MODEL = "qwen2.5-coder:7b"
|
||||
_DEFAULT_PROVIDER_NAME = "ollama"
|
||||
_DEFAULT_CONTEXT_WINDOW = 4096
|
||||
_DEFAULT_MAX_TOKENS = 1024
|
||||
|
||||
|
||||
def _skills_dir(guest_home: str) -> str:
|
||||
return f"{guest_home}/.pi/agent/skills"
|
||||
|
||||
|
||||
def _prompt_path(guest_home: str) -> str:
|
||||
return f"{guest_home}/.bot-bottle-prompt.txt"
|
||||
|
||||
|
||||
def _append_system_path(guest_home: str) -> str:
|
||||
return f"{guest_home}/.pi/agent/APPEND_SYSTEM.md"
|
||||
|
||||
|
||||
def _models_path(guest_home: str) -> str:
|
||||
return f"{guest_home}/.pi/agent/models.json"
|
||||
|
||||
|
||||
def _runtime_state_repair_script(guest_home: str) -> str:
|
||||
home = shlex.quote(guest_home)
|
||||
pi_home = shlex.quote(f"{guest_home}/.pi")
|
||||
context_sessions = shlex.quote(f"{guest_home}/.pi/context-mode/sessions")
|
||||
return (
|
||||
f"mkdir -p {context_sessions} /tmp/pi-subagents-uid-1000 && "
|
||||
f"chown node:node {home} && "
|
||||
f"chown -R node:node {pi_home} /tmp && "
|
||||
"chmod -R u+rwX /tmp && "
|
||||
f"chmod 755 {home} && "
|
||||
"chown root:root /tmp /var/tmp && "
|
||||
"chmod 1777 /tmp /var/tmp"
|
||||
)
|
||||
|
||||
|
||||
def _settings_value(
|
||||
settings: dict[str, object],
|
||||
key: str,
|
||||
default: object,
|
||||
) -> object:
|
||||
value = settings.get(key)
|
||||
return default if value is None else value
|
||||
|
||||
|
||||
def _settings_int(
|
||||
settings: dict[str, object],
|
||||
key: str,
|
||||
default: int,
|
||||
) -> int:
|
||||
value = _settings_value(settings, key, default)
|
||||
if isinstance(value, bool):
|
||||
return default
|
||||
if isinstance(value, (int, str)):
|
||||
return int(value)
|
||||
return default
|
||||
|
||||
|
||||
def _pi_models_json(
|
||||
settings: dict[str, object],
|
||||
) -> tuple[dict[str, object], str, str, list[str], str]:
|
||||
provider_name = str(
|
||||
_settings_value(settings, "provider", _DEFAULT_PROVIDER_NAME)
|
||||
)
|
||||
base_url = str(_settings_value(settings, "base_url", _DEFAULT_BASE_URL))
|
||||
api = str(_settings_value(settings, "api", "openai-completions"))
|
||||
api_key = settings.get("api_key")
|
||||
api_key_env = str(settings.get("api_key_env", ""))
|
||||
models_raw = _settings_value(settings, "models", [_DEFAULT_MODEL])
|
||||
models = [str(model) for model in models_raw] # type: ignore[union-attr]
|
||||
supports_developer_role = bool(
|
||||
_settings_value(settings, "supports_developer_role", False)
|
||||
)
|
||||
supports_reasoning_effort = bool(
|
||||
_settings_value(settings, "supports_reasoning_effort", False)
|
||||
)
|
||||
max_tokens_field = str(
|
||||
_settings_value(settings, "max_tokens_field", "max_tokens")
|
||||
)
|
||||
context_window = _settings_int(
|
||||
settings, "context_window", _DEFAULT_CONTEXT_WINDOW,
|
||||
)
|
||||
max_tokens = _settings_int(settings, "max_tokens", _DEFAULT_MAX_TOKENS)
|
||||
input_context_window = max(1, context_window - max_tokens)
|
||||
provider: dict[str, object] = {
|
||||
"baseUrl": base_url,
|
||||
"api": api,
|
||||
"compat": {
|
||||
"supportsDeveloperRole": supports_developer_role,
|
||||
"supportsReasoningEffort": supports_reasoning_effort,
|
||||
"maxTokensField": max_tokens_field,
|
||||
},
|
||||
"models": [
|
||||
{
|
||||
"id": model,
|
||||
"name": model,
|
||||
"contextWindow": input_context_window,
|
||||
"maxTokens": max_tokens,
|
||||
}
|
||||
for model in models
|
||||
],
|
||||
}
|
||||
if api_key is not None:
|
||||
provider["apiKey"] = str(api_key)
|
||||
elif api_key_env:
|
||||
provider["apiKey"] = "egress-placeholder"
|
||||
elif provider_name == _DEFAULT_PROVIDER_NAME:
|
||||
provider["apiKey"] = "ollama"
|
||||
payload: dict[str, object] = {
|
||||
"providers": {
|
||||
provider_name: provider,
|
||||
}
|
||||
}
|
||||
return payload, base_url, api_key_env, models, provider_name
|
||||
|
||||
|
||||
def _route_host(base_url: str) -> str:
|
||||
parsed = urlparse(base_url)
|
||||
if not parsed.scheme or not parsed.hostname:
|
||||
die(
|
||||
"agent provider provisioning: pi settings base_url must be an "
|
||||
f"absolute URL (was {base_url!r})"
|
||||
)
|
||||
return parsed.hostname
|
||||
|
||||
|
||||
_RUNTIME = AgentProviderRuntime(
|
||||
template="pi",
|
||||
command="pi",
|
||||
image="bot-bottle-pi:latest",
|
||||
prompt_mode="append_system_prompt",
|
||||
bypass_args=(),
|
||||
resume_args=(),
|
||||
remote_control_args=(),
|
||||
)
|
||||
|
||||
|
||||
class PiAgentProvider(AgentProvider):
|
||||
@property
|
||||
def runtime(self) -> AgentProviderRuntime:
|
||||
return _RUNTIME
|
||||
|
||||
def provision_plan(
|
||||
self,
|
||||
*,
|
||||
dockerfile: str,
|
||||
state_dir: Path,
|
||||
instance_name: str,
|
||||
prompt_file: Path,
|
||||
guest_env: dict[str, str] | None = None,
|
||||
auth_token: str = "",
|
||||
forward_host_credentials: bool = False,
|
||||
host_env: dict[str, str] | None = None,
|
||||
trusted_project_path: str = "",
|
||||
label: str = "",
|
||||
color: str = "",
|
||||
provider_settings: dict[str, object] | None = None,
|
||||
) -> AgentProvisionPlan:
|
||||
del auth_token, forward_host_credentials, host_env, trusted_project_path
|
||||
del label, color
|
||||
resolved_guest_env = dict(guest_env or {})
|
||||
guest_home = self.guest_home
|
||||
settings = dict(provider_settings or {})
|
||||
|
||||
models_payload, base_url, api_key_env, models, provider_name = (
|
||||
_pi_models_json(settings)
|
||||
)
|
||||
models_file = state_dir / "pi-models.json"
|
||||
models_file.write_text(json.dumps(models_payload, indent=2) + "\n")
|
||||
models_file.chmod(0o600)
|
||||
|
||||
has_prompt = prompt_file.exists() and bool(prompt_file.read_text())
|
||||
auth_scheme = "Bearer" if api_key_env else ""
|
||||
return AgentProvisionPlan(
|
||||
template=_RUNTIME.template,
|
||||
command=_RUNTIME.command,
|
||||
prompt_mode=_RUNTIME.prompt_mode,
|
||||
image=_RUNTIME.image,
|
||||
dockerfile=dockerfile,
|
||||
guest_home=guest_home,
|
||||
instance_name=instance_name,
|
||||
prompt_file=prompt_file,
|
||||
guest_env=resolved_guest_env,
|
||||
has_prompt=has_prompt,
|
||||
startup_args=(
|
||||
"--models",
|
||||
",".join(f"{provider_name}/{model}" for model in models),
|
||||
),
|
||||
dirs=(AgentProvisionDir(f"{guest_home}/.pi/agent"),),
|
||||
files=(AgentProvisionFile(models_file, _models_path(guest_home)),),
|
||||
egress_routes=(EgressRoute(
|
||||
host=_route_host(base_url),
|
||||
auth_scheme=auth_scheme,
|
||||
token_ref=api_key_env,
|
||||
),),
|
||||
)
|
||||
|
||||
def provision_skills(self, plan: "BottlePlan", bottle: "Bottle") -> None:
|
||||
from ...backend.util import host_skill_dir
|
||||
|
||||
agent = plan.manifest.agent
|
||||
if not agent.skills:
|
||||
return
|
||||
skills_dir = _skills_dir(plan.guest_home)
|
||||
bottle.exec(f"mkdir -p {skills_dir}", user="root")
|
||||
for name in agent.skills:
|
||||
src = host_skill_dir(name)
|
||||
if not os.path.isdir(src):
|
||||
die(
|
||||
f"skill {name!r} disappeared from host between "
|
||||
f"validation and copy at {src}."
|
||||
)
|
||||
dst = f"{skills_dir}/{name}"
|
||||
info(f"copying skill {name} into {bottle.name}:{dst}")
|
||||
bottle.exec(f"rm -rf {dst} && mkdir -p {dst}", user="root")
|
||||
bottle.cp_in(f"{src}/.", f"{dst}/")
|
||||
bottle.exec(f"chown -R node:node {dst}", user="root")
|
||||
|
||||
def provision_prompt(self, plan: "BottlePlan", bottle: "Bottle") -> str | None:
|
||||
prompt_path = _prompt_path(plan.guest_home)
|
||||
append_system_path = _append_system_path(plan.guest_home)
|
||||
bottle.cp_in(str(plan.prompt_file), prompt_path) # type: ignore
|
||||
bottle.exec(
|
||||
f"mkdir -p {shlex.quote(plan.guest_home)}/.pi/agent && "
|
||||
f"cp {shlex.quote(prompt_path)} {shlex.quote(append_system_path)} && "
|
||||
f"chown node:node {shlex.quote(prompt_path)} "
|
||||
f"{shlex.quote(append_system_path)} && "
|
||||
f"chmod 600 {shlex.quote(prompt_path)} "
|
||||
f"{shlex.quote(append_system_path)}",
|
||||
user="root",
|
||||
)
|
||||
# Pi's `--append-system-prompt` takes literal text, not a file path.
|
||||
# Use its documented APPEND_SYSTEM.md discovery path instead.
|
||||
return None
|
||||
|
||||
def provision(self, plan: "BottlePlan", bottle: "Bottle") -> None:
|
||||
provision = plan.agent_provision
|
||||
_exec(
|
||||
bottle,
|
||||
_runtime_state_repair_script(plan.guest_home),
|
||||
"could not prepare pi runtime state",
|
||||
)
|
||||
for d in provision.dirs:
|
||||
path = shlex.quote(d.guest_path)
|
||||
_exec(bottle, f"mkdir -p {path}", f"could not create {d.guest_path}")
|
||||
_exec(
|
||||
bottle,
|
||||
f"chown {shlex.quote(d.owner)} {path}",
|
||||
f"could not chown {d.guest_path}",
|
||||
)
|
||||
_exec(
|
||||
bottle,
|
||||
f"chmod {shlex.quote(d.mode)} {path}",
|
||||
f"could not chmod {d.guest_path}",
|
||||
)
|
||||
for f in provision.files:
|
||||
bottle.cp_in(str(f.host_path), f.guest_path)
|
||||
path = shlex.quote(f.guest_path)
|
||||
_exec(
|
||||
bottle,
|
||||
f"chown {shlex.quote(f.owner)} {path}",
|
||||
f"could not chown {f.guest_path}",
|
||||
)
|
||||
_exec(
|
||||
bottle,
|
||||
f"chmod {shlex.quote(f.mode)} {path}",
|
||||
f"could not chmod {f.guest_path}",
|
||||
)
|
||||
|
||||
def provision_supervise_mcp(
|
||||
self,
|
||||
plan: "BottlePlan",
|
||||
bottle: "Bottle",
|
||||
supervise_url: str,
|
||||
) -> None:
|
||||
del plan, bottle, supervise_url
|
||||
|
||||
|
||||
def _exec(bottle: "Bottle", script: str, error: str) -> None:
|
||||
result = bottle.exec(script, user="root")
|
||||
if result.returncode != 0:
|
||||
detail = (result.stderr or result.stdout).strip()
|
||||
if detail:
|
||||
detail = f": {detail}"
|
||||
die(f"agent provider provisioning: {error}{detail}")
|
||||
@@ -0,0 +1,291 @@
|
||||
"""DLP detectors for the egress proxy (PRD 0053).
|
||||
|
||||
Pure Python, no mitmproxy dependency. Each detector is a module-level
|
||||
function returning `ScanResult | None`.
|
||||
|
||||
Ships flat into the sidecar bundle image alongside
|
||||
`egress_addon_core.py` — both this file and the package source use
|
||||
the same try/except import shim pattern.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import base64
|
||||
import gzip
|
||||
import re
|
||||
import typing
|
||||
import unicodedata
|
||||
from urllib.parse import quote as url_quote
|
||||
|
||||
try:
|
||||
from egress_addon_core import ScanResult # type: ignore[import-not-found]
|
||||
except ImportError: # pragma: no cover - host-side path
|
||||
from .egress_addon_core import ScanResult
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Snippet helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
SNIPPET_CONTEXT = 40 # chars of surrounding text to include on each side
|
||||
REDACT = "********" # fixed-width replacement for the matched sensitive value
|
||||
|
||||
|
||||
def _snippet(text: str, start: int, end: int) -> str:
|
||||
"""Return context around a match with the matched span replaced by REDACT."""
|
||||
before = text[max(0, start - SNIPPET_CONTEXT):start].replace("\n", " ").replace("\r", " ")
|
||||
after = text[end:end + SNIPPET_CONTEXT].replace("\n", " ").replace("\r", " ")
|
||||
return f"{before}{REDACT}{after}"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Unicode normalization (defeats confusable-char and combining-mark evasion)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _normalize_text(text: str) -> str:
|
||||
# NFKD separates base characters from combining marks and resolves
|
||||
# compatibility equivalents (fullwidth ASCII, ligatures, etc.)
|
||||
decomposed = unicodedata.normalize("NFKD", text)
|
||||
return "".join(
|
||||
ch for ch in decomposed
|
||||
# Strip combining marks inserted between chars to break patterns
|
||||
if unicodedata.category(ch) != "Mn"
|
||||
# Strip control chars; keep common whitespace (\n \r \t)
|
||||
and (unicodedata.category(ch) != "Cc" or ch in "\n\r\t")
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Token patterns detector
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
TOKEN_PATTERNS: tuple[tuple[str, re.Pattern[str]], ...] = (
|
||||
("AWS access key", re.compile(r"AKIA[0-9A-Z]{16}")),
|
||||
("GitHub token (classic)", re.compile(r"ghp_[A-Za-z0-9_]{36}")),
|
||||
("GitHub fine-grained token", re.compile(r"github_pat_[A-Za-z0-9_]{82}")),
|
||||
("Anthropic API key", re.compile(r"sk-ant-[A-Za-z0-9\-_]{93}")),
|
||||
("OpenAI API key", re.compile(r"sk-[A-Za-z0-9]{48}")),
|
||||
("OpenAI project API key", re.compile(r"sk-proj-[A-Za-z0-9_\-]{48,}")),
|
||||
("Stripe live key", re.compile(r"sk_live_[A-Za-z0-9]{24}")),
|
||||
("Generic Bearer JWT", re.compile(r"Bearer\s+[A-Za-z0-9._\-]{50,}")),
|
||||
("HuggingFace token", re.compile(r"hf_[A-Za-z0-9]{34,}")),
|
||||
("Databricks token", re.compile(r"dapi[A-Za-z0-9]{32}")),
|
||||
("Slack token", re.compile(r"xox[baprs]-[A-Za-z0-9]+-[A-Za-z0-9]+-[A-Za-z0-9]{24,}")),
|
||||
("npm token", re.compile(r"npm_[A-Za-z0-9]{36}")),
|
||||
("SendGrid API key", re.compile(r"SG\.[A-Za-z0-9_\-]{22}\.[A-Za-z0-9_\-]{43}")),
|
||||
("PyPI token", re.compile(r"pypi-[A-Za-z0-9_\-]{80,}")),
|
||||
("HashiCorp Vault token", re.compile(r"hvs\.[A-Za-z0-9_\-]{24,}")),
|
||||
)
|
||||
|
||||
|
||||
def scan_token_patterns(text: str, *, location: str = "body") -> ScanResult | None:
|
||||
normalized = _normalize_text(text)
|
||||
for name, pattern in TOKEN_PATTERNS:
|
||||
m = pattern.search(normalized)
|
||||
if m is not None:
|
||||
return ScanResult(
|
||||
severity="block",
|
||||
reason=f"{name} found in {location}",
|
||||
location=location,
|
||||
context=_snippet(text, m.start(), m.end()),
|
||||
)
|
||||
return None
|
||||
|
||||
|
||||
def redact_tokens(
|
||||
text: str,
|
||||
*,
|
||||
env: typing.Mapping[str, str] | None = None,
|
||||
) -> str:
|
||||
"""Replace token pattern matches and (if env given) provisioned secrets with REDACT."""
|
||||
for _, pattern in TOKEN_PATTERNS:
|
||||
text = pattern.sub(REDACT, text)
|
||||
if env is not None:
|
||||
for key, value in env.items():
|
||||
if key.startswith("EGRESS_TOKEN_") and value:
|
||||
for variant in _encoded_variants(value):
|
||||
text = text.replace(variant, REDACT)
|
||||
return text
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Known secrets detector (Phase 1b)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _encoded_variants(secret: str) -> list[str]:
|
||||
"""Return the secret plus common encoded variants for exfil detection."""
|
||||
seen: set[str] = {secret}
|
||||
variants: list[str] = [secret]
|
||||
|
||||
def _add(v: str) -> None:
|
||||
if v not in seen:
|
||||
seen.add(v)
|
||||
variants.append(v)
|
||||
|
||||
secret_bytes = secret.encode("utf-8")
|
||||
|
||||
# Standard base64 — with and without padding
|
||||
b64 = base64.b64encode(secret_bytes).decode("ascii")
|
||||
_add(b64)
|
||||
_add(b64.rstrip("="))
|
||||
|
||||
# URL-safe base64 (JWT/OAuth use -_ alphabet) — with and without padding
|
||||
b64url = base64.urlsafe_b64encode(secret_bytes).decode("ascii")
|
||||
_add(b64url)
|
||||
_add(b64url.rstrip("="))
|
||||
|
||||
# URL percent-encoding
|
||||
_add(url_quote(secret, safe=""))
|
||||
|
||||
# Hex — lowercase and uppercase
|
||||
_add(secret_bytes.hex())
|
||||
_add(secret_bytes.hex().upper())
|
||||
|
||||
# Base32 (TOTP seeds, some DNS-exfil channels)
|
||||
_add(base64.b32encode(secret_bytes).decode("ascii"))
|
||||
|
||||
# gzip + base64 (deterministic: mtime=0); recognisable by H4sI prefix
|
||||
_add(base64.b64encode(gzip.compress(secret_bytes, mtime=0)).decode("ascii"))
|
||||
|
||||
return variants
|
||||
|
||||
|
||||
def scan_known_secrets(
|
||||
text: str,
|
||||
*,
|
||||
location: str = "body",
|
||||
env: typing.Mapping[str, str] | None = None,
|
||||
) -> ScanResult | None:
|
||||
if env is None:
|
||||
return None
|
||||
for key, value in env.items():
|
||||
if not key.startswith("EGRESS_TOKEN_") or not value:
|
||||
continue
|
||||
for variant in _encoded_variants(value):
|
||||
pos = text.find(variant)
|
||||
if pos >= 0:
|
||||
return ScanResult(
|
||||
severity="block",
|
||||
reason=f"provisioned secret from {key} found in {location}",
|
||||
location=location,
|
||||
context=_snippet(text, pos, pos + len(variant)),
|
||||
)
|
||||
return None
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Naive prompt injection detector (Phase 2)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
DISCLOSURE_PHRASES: tuple[re.Pattern[str], ...] = (
|
||||
re.compile(r"(?i)system\s+prompt"),
|
||||
re.compile(r"(?i)my\s+instructions\s+are"),
|
||||
re.compile(r"(?i)original\s+instructions"),
|
||||
re.compile(r"(?i)secret\s+instructions"),
|
||||
re.compile(r"(?i)hidden\s+rules"),
|
||||
)
|
||||
|
||||
JAILBREAK_PHRASES: tuple[re.Pattern[str], ...] = (
|
||||
re.compile(r"(?i)ignore\s+previous"),
|
||||
re.compile(r"(?i)forget\s+everything"),
|
||||
re.compile(r"(?i)disregard\s+(?:all\s+)?(?:previous|prior)"),
|
||||
re.compile(r"(?i)pretend\s+you\s+are"),
|
||||
re.compile(r"(?i)act\s+as\s+(?:if|though)"),
|
||||
)
|
||||
|
||||
|
||||
PROXIMITY_CHARS = 500
|
||||
|
||||
|
||||
def _closest_pair(
|
||||
a_matches: list[re.Match[str]],
|
||||
b_matches: list[re.Match[str]],
|
||||
) -> tuple[re.Match[str], re.Match[str]] | None:
|
||||
"""Return the pair (a, b) with the smallest character gap, or None."""
|
||||
best: tuple[re.Match[str], re.Match[str]] | None = None
|
||||
best_gap: int | None = None
|
||||
for a in a_matches:
|
||||
for b in b_matches:
|
||||
gap = max(0, max(a.start(), b.start()) - min(a.end(), b.end()))
|
||||
if best_gap is None or gap < best_gap:
|
||||
best_gap = gap
|
||||
best = (a, b)
|
||||
return best
|
||||
|
||||
|
||||
def scan_naive_injection(text: str) -> ScanResult | None:
|
||||
location = "response body"
|
||||
disclosure_hits = [m for p in DISCLOSURE_PHRASES for m in p.finditer(text)]
|
||||
jailbreak_hits = [m for p in JAILBREAK_PHRASES for m in p.finditer(text)]
|
||||
|
||||
if disclosure_hits and jailbreak_hits:
|
||||
pair = _closest_pair(disclosure_hits, jailbreak_hits)
|
||||
if pair is not None:
|
||||
dist = max(0, max(pair[0].start(), pair[1].start()) - min(pair[0].end(), pair[1].end()))
|
||||
if dist <= PROXIMITY_CHARS:
|
||||
first = pair[0] if pair[0].start() <= pair[1].start() else pair[1]
|
||||
return ScanResult(
|
||||
severity="block",
|
||||
reason=(
|
||||
f"disclosure and jailbreak phrases within "
|
||||
f"{dist} chars in {location}"
|
||||
),
|
||||
location=location,
|
||||
context=_snippet(text, first.start(), first.end()),
|
||||
)
|
||||
|
||||
if disclosure_hits:
|
||||
m = disclosure_hits[0]
|
||||
return ScanResult(
|
||||
severity="warn",
|
||||
reason=f"prompt disclosure phrase detected in {location}",
|
||||
location=location,
|
||||
context=_snippet(text, m.start(), m.end()),
|
||||
)
|
||||
|
||||
if jailbreak_hits:
|
||||
m = jailbreak_hits[0]
|
||||
return ScanResult(
|
||||
severity="warn",
|
||||
reason=f"jailbreak phrase detected in {location}",
|
||||
location=location,
|
||||
context=_snippet(text, m.start(), m.end()),
|
||||
)
|
||||
|
||||
return None
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CRLF injection detector
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# URL-encoded CRLF is never legitimate in a request URL or header value.
|
||||
_CRLF_ENCODED_RE = re.compile(r"%0[dD]%0[aA]", re.ASCII)
|
||||
# Literal CRLF followed by a header-name pattern indicates header injection.
|
||||
_CRLF_HEADER_INJECT_RE = re.compile(r"\r\n[A-Za-z][A-Za-z0-9\-]+\s*:", re.ASCII)
|
||||
|
||||
|
||||
def scan_crlf_injection(text: str) -> ScanResult | None:
|
||||
if _CRLF_ENCODED_RE.search(text):
|
||||
return ScanResult(
|
||||
severity="block",
|
||||
reason="URL-encoded CRLF (%0d%0a) in outbound request",
|
||||
)
|
||||
if _CRLF_HEADER_INJECT_RE.search(text):
|
||||
return ScanResult(
|
||||
severity="block",
|
||||
reason="CRLF header injection pattern in outbound request",
|
||||
)
|
||||
return None
|
||||
|
||||
|
||||
__all__ = [
|
||||
"REDACT",
|
||||
"SNIPPET_CONTEXT",
|
||||
"TOKEN_PATTERNS",
|
||||
"redact_tokens",
|
||||
"scan_crlf_injection",
|
||||
"scan_known_secrets",
|
||||
"scan_naive_injection",
|
||||
"scan_token_patterns",
|
||||
]
|
||||
+133
-146
@@ -1,25 +1,10 @@
|
||||
"""Per-bottle egress proxy (PRD 0017).
|
||||
|
||||
Replaces the cred-proxy sidecar (PRD 0010) with a mitmproxy-based
|
||||
sidecar that becomes the agent's `HTTP_PROXY` / `HTTPS_PROXY`. It
|
||||
owns three jobs:
|
||||
|
||||
1. MITM the agent's HTTPS with the per-bottle CA (moved from
|
||||
pipelock).
|
||||
2. Enforce manifest-declared `path_allowlist` per route.
|
||||
3. Inject `Authorization` headers for routes that declare an
|
||||
`auth` block, the same way cred-proxy does today.
|
||||
"""Per-bottle egress proxy (PRD 0017, PRD 0053).
|
||||
|
||||
This module defines the abstract proxy (`Egress`), its plan
|
||||
dataclass (`EgressPlan`), and the resolved per-route shape
|
||||
(`EgressRoute`). The sidecar's start/stop lifecycle is backend-
|
||||
specific and lives on concrete subclasses (see
|
||||
`bot_bottle/backend/docker/egress.py`).
|
||||
|
||||
Chunks 1+2 of the PRD: this module + the mitmproxy addon + the Docker
|
||||
lifecycle are wired into the agent's `HTTP_PROXY` path; cred-proxy
|
||||
has been removed. Chunk 3 retargets the cred-proxy-block remediation
|
||||
flow (PRD 0014) at egress and renames the MCP tool.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
@@ -30,27 +15,21 @@ from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from .egress_addon_core import Route
|
||||
from .egress_addon_core import (
|
||||
HeaderMatch as CoreHeaderMatch,
|
||||
MatchEntry as CoreMatchEntry,
|
||||
PathMatch as CorePathMatch,
|
||||
Route,
|
||||
)
|
||||
from .log import die
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .manifest import Bottle
|
||||
from .manifest import ManifestBottle
|
||||
|
||||
CODEX_HOST_CREDENTIAL_TOKEN_REF = "BOT_BOTTLE_CODEX_HOST_ACCESS_TOKEN"
|
||||
|
||||
|
||||
# DNS name agents will dial for the per-bottle egress sidecar.
|
||||
# Backend-agnostic by contract: every concrete backend (Docker today,
|
||||
# others later) attaches this name to its sidecar on the bottle's
|
||||
# internal network. The agent's `HTTP_PROXY` env var resolves to
|
||||
# `http://egress:<port>` once chunk 2 cuts over.
|
||||
EGRESS_HOSTNAME = "egress"
|
||||
|
||||
# In-container path the addon reads. Pre-created in
|
||||
# `Dockerfile.sidecars` so the host bind-mount can drop the file
|
||||
# directly. Content is YAML (hand-rolled by `egress_render_routes`
|
||||
# in the style of `pipelock_render_yaml`, parsed by `yaml_subset`
|
||||
# inside the addon).
|
||||
EGRESS_ROUTES_IN_CONTAINER = "/etc/egress/routes.yaml"
|
||||
|
||||
|
||||
@@ -58,68 +37,23 @@ EGRESS_ROUTES_IN_CONTAINER = "/etc/egress/routes.yaml"
|
||||
class EgressRoute(Route):
|
||||
"""Host-side extension of the addon's `Route`.
|
||||
|
||||
Inherits `host`, `path_allowlist`, `auth_scheme`, and `token_env`
|
||||
Inherits `host`, `matches`, `auth_scheme`, and `token_env`
|
||||
from `egress_addon_core.Route` — those are the fields that cross the
|
||||
YAML wire into the sidecar. The three fields below are host-only and
|
||||
YAML wire into the sidecar. The fields below are host-only and
|
||||
are never serialised to the addon.
|
||||
|
||||
`token_ref` is the host env var the CLI reads at launch and forwards
|
||||
into the container's environ under `token_env`. Routes that share a
|
||||
`token_ref` coalesce to one `token_env` slot.
|
||||
into the container's environ under `token_env`.
|
||||
|
||||
`roles` carries the manifest route's role tuple (reserved for
|
||||
future use; always empty today).
|
||||
|
||||
`tls_passthrough` signals that pipelock must not TLS-MITM this
|
||||
host — either because the manifest declared `pipelock.tls_passthrough:
|
||||
true` (lifted in `egress_manifest_routes`) or because a provider
|
||||
route set it (e.g. egress injects its own Bearer on that host
|
||||
after the agent boundary and pipelock's header DLP would block it)."""
|
||||
future use; always empty today)."""
|
||||
|
||||
token_ref: str = ""
|
||||
roles: tuple[str, ...] = ()
|
||||
tls_passthrough: bool = False
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class EgressPlan:
|
||||
"""Output of Egress.prepare; consumed by .start.
|
||||
|
||||
The slug + routes_path + routes + token_env_map fields are
|
||||
filled at prepare time (host-side, side-effect-free on docker).
|
||||
The network + CA + pipelock fields are populated by the backend's
|
||||
launch step via `dataclasses.replace` once those resources
|
||||
exist. Empty defaults are sentinels meaning "not yet set";
|
||||
`.start` validates that they are populated.
|
||||
|
||||
`token_env_map` is `{<token_env in container>: <token_ref on host>}`.
|
||||
The backend's start step reads `os.environ[token_ref]` and
|
||||
forwards the value into the egress container's environ
|
||||
under `token_env`. The plan itself never holds token values —
|
||||
secrets never land in a dataclass that might be logged.
|
||||
|
||||
`mitmproxy_ca_host_path` is the host path of the per-bottle
|
||||
egress CA (single PEM with cert+key concatenated) minted
|
||||
by `egress_tls_init`. `.start` docker-cps it into the
|
||||
sidecar at `~/.mitmproxy/mitmproxy-ca.pem` — mitmproxy reads
|
||||
that file at boot to mint per-host leaf certs.
|
||||
|
||||
`mitmproxy_ca_cert_only_host_path` is the cert-only PEM (no
|
||||
key) for installing into the agent's trust store via
|
||||
`provision_ca`. Separate file rather than re-parsing the
|
||||
concat so secrets and trust artefacts stay on distinct paths.
|
||||
|
||||
`pipelock_ca_host_path` is the host path of the pipelock CA
|
||||
(cert only). `.start` docker-cps it into the sidecar so the
|
||||
proxy's outbound HTTPS client trusts pipelock's MITM on the
|
||||
egress → upstream leg.
|
||||
|
||||
`pipelock_proxy_url` is the URL egress sets as `HTTPS_PROXY`
|
||||
in its environ so outbound HTTPS traverses pipelock — keeping
|
||||
pipelock's hostname allowlist + DLP body scanner on the
|
||||
egress → upstream leg.
|
||||
"""
|
||||
|
||||
slug: str
|
||||
routes_path: Path
|
||||
routes: tuple[EgressRoute, ...]
|
||||
@@ -128,40 +62,46 @@ class EgressPlan:
|
||||
egress_network: str = ""
|
||||
mitmproxy_ca_host_path: Path = Path()
|
||||
mitmproxy_ca_cert_only_host_path: Path = Path()
|
||||
pipelock_ca_host_path: Path = Path()
|
||||
pipelock_proxy_url: str = ""
|
||||
log: int = 0
|
||||
|
||||
|
||||
def egress_manifest_routes(
|
||||
bottle: Bottle,
|
||||
bottle: ManifestBottle,
|
||||
) -> tuple[EgressRoute, ...]:
|
||||
"""Lift each `bottle.egress.routes[]` manifest entry into an EgressRoute.
|
||||
Order is preserved. Token slots are not assigned here — slot assignment
|
||||
is a final step in `egress_routes_for_bottle` after provider and manifest
|
||||
routes are merged."""
|
||||
out: list[EgressRoute] = []
|
||||
for r in bottle.egress.routes:
|
||||
core_matches: list[CoreMatchEntry] = []
|
||||
for m in r.Matches:
|
||||
core_paths = tuple(
|
||||
CorePathMatch(type=p.Type, value=p.Value)
|
||||
for p in m.Paths
|
||||
)
|
||||
core_headers = tuple(
|
||||
CoreHeaderMatch(name=h.Name, value=h.Value, type=h.Type)
|
||||
for h in m.Headers
|
||||
)
|
||||
core_matches.append(CoreMatchEntry(
|
||||
paths=core_paths,
|
||||
methods=m.Methods,
|
||||
headers=core_headers,
|
||||
))
|
||||
out.append(EgressRoute(
|
||||
host=r.Host,
|
||||
path_allowlist=r.PathAllowlist,
|
||||
matches=tuple(core_matches),
|
||||
auth_scheme=r.AuthScheme,
|
||||
token_ref=r.TokenRef,
|
||||
roles=r.Role,
|
||||
tls_passthrough=r.Pipelock.TlsPassthrough,
|
||||
git_fetch=r.GitFetch,
|
||||
outbound_detectors=r.OutboundDetectors,
|
||||
inbound_detectors=r.InboundDetectors,
|
||||
))
|
||||
return tuple(out)
|
||||
|
||||
|
||||
def egress_routes_for_bottle(
|
||||
bottle: Bottle,
|
||||
bottle: ManifestBottle,
|
||||
provider_routes: tuple[EgressRoute, ...] = (),
|
||||
) -> tuple[EgressRoute, ...]:
|
||||
"""Effective egress routes for the agent.
|
||||
|
||||
Provider routes own their hosts outright; manifest routes for hosts
|
||||
not claimed by any provider are appended. Token slots are assigned
|
||||
in a final pass over the merged list in order, so provisioned routes
|
||||
get the lower slot numbers."""
|
||||
manifest = egress_manifest_routes(bottle)
|
||||
provisioned_hosts = {pr.host.lower() for pr in provider_routes}
|
||||
merged = list(provider_routes) + [
|
||||
@@ -173,10 +113,6 @@ def egress_routes_for_bottle(
|
||||
def _assign_token_slots(
|
||||
routes: list[EgressRoute],
|
||||
) -> tuple[EgressRoute, ...]:
|
||||
"""Assign EGRESS_TOKEN_N slots to authenticated routes in order.
|
||||
|
||||
Routes sharing a token_ref share a slot. Unauthenticated routes
|
||||
(no auth_scheme / token_ref) keep token_env empty."""
|
||||
slot_for_ref: dict[str, str] = {}
|
||||
out: list[EgressRoute] = []
|
||||
for r in routes:
|
||||
@@ -194,13 +130,6 @@ def _assign_token_slots(
|
||||
def egress_token_env_map(
|
||||
routes: tuple[EgressRoute, ...],
|
||||
) -> dict[str, str]:
|
||||
"""Collapse the route list into `{token_env: token_ref}` for the
|
||||
authenticated routes. Routes without `auth` contribute no entry.
|
||||
|
||||
Conflict detection: two routes that share a `token_env` slot but
|
||||
name different `token_ref` host vars is a programming error in
|
||||
`egress_routes_for_bottle`; surface it as a die rather than
|
||||
silently picking one."""
|
||||
out: dict[str, str] = {}
|
||||
for r in routes:
|
||||
if not (r.auth_scheme and r.token_ref and r.token_env):
|
||||
@@ -217,32 +146,94 @@ def egress_token_env_map(
|
||||
|
||||
|
||||
def _route_to_yaml_fields(r: Route) -> dict[str, object]:
|
||||
"""Return the addon-visible fields for one route.
|
||||
|
||||
Single authoritative mapping between EgressRoute (host-side) and
|
||||
egress_addon_core.Route (sidecar-side). When a field is added to
|
||||
the addon's Route that must appear in the YAML, add it here and
|
||||
in egress_addon_core._parse_one together."""
|
||||
fields: dict[str, object] = {"host": r.host}
|
||||
if r.auth_scheme and r.token_env:
|
||||
fields["auth_scheme"] = r.auth_scheme
|
||||
fields["token_env"] = r.token_env
|
||||
if r.path_allowlist:
|
||||
fields["path_allowlist"] = list(r.path_allowlist)
|
||||
if r.matches:
|
||||
matches_data: list[dict[str, object]] = []
|
||||
for entry in r.matches:
|
||||
entry_data: dict[str, object] = {}
|
||||
if entry.paths:
|
||||
paths_data: list[dict[str, str]] = []
|
||||
for pm in entry.paths:
|
||||
pd: dict[str, str] = {"value": pm.value}
|
||||
if pm.type != "prefix":
|
||||
pd["type"] = pm.type
|
||||
paths_data.append(pd)
|
||||
entry_data["paths"] = paths_data
|
||||
if entry.methods:
|
||||
entry_data["methods"] = list(entry.methods)
|
||||
if entry.headers:
|
||||
headers_data: list[dict[str, str]] = []
|
||||
for hm in entry.headers:
|
||||
hd: dict[str, str] = {"name": hm.name, "value": hm.value}
|
||||
if hm.type != "exact":
|
||||
hd["type"] = hm.type
|
||||
headers_data.append(hd)
|
||||
entry_data["headers"] = headers_data
|
||||
matches_data.append(entry_data)
|
||||
fields["matches"] = matches_data
|
||||
if r.git_fetch:
|
||||
fields["git"] = {"fetch": True}
|
||||
if r.outbound_detectors is not None or r.inbound_detectors is not None:
|
||||
dlp: dict[str, object] = {}
|
||||
if r.outbound_detectors is not None:
|
||||
dlp["outbound_detectors"] = (
|
||||
False if not r.outbound_detectors
|
||||
else list(r.outbound_detectors)
|
||||
)
|
||||
if r.inbound_detectors is not None:
|
||||
dlp["inbound_detectors"] = (
|
||||
False if not r.inbound_detectors
|
||||
else list(r.inbound_detectors)
|
||||
)
|
||||
fields["dlp"] = dlp
|
||||
return fields
|
||||
|
||||
|
||||
def _render_match_entry(entry: dict[str, object]) -> list[str]:
|
||||
lines: list[str] = []
|
||||
first_key = True
|
||||
if "paths" in entry:
|
||||
lines.append(" - paths:")
|
||||
first_key = False
|
||||
for pd in entry["paths"]: # type: ignore[union-attr]
|
||||
pd_dict: dict[str, str] = pd # type: ignore[assignment]
|
||||
if "type" in pd_dict:
|
||||
lines.append(f' - type: "{pd_dict["type"]}"')
|
||||
lines.append(f' value: "{pd_dict["value"]}"')
|
||||
else:
|
||||
lines.append(f' - value: "{pd_dict["value"]}"')
|
||||
if "methods" in entry:
|
||||
methods_str = ", ".join(f'"{m}"' for m in entry["methods"]) # type: ignore[union-attr]
|
||||
prefix = " - " if first_key else " "
|
||||
lines.append(f'{prefix}methods: [{methods_str}]')
|
||||
first_key = False
|
||||
if "headers" in entry:
|
||||
prefix = " - " if first_key else " "
|
||||
lines.append(f"{prefix}headers:")
|
||||
first_key = False
|
||||
for hd in entry["headers"]: # type: ignore[union-attr]
|
||||
hd_dict: dict[str, str] = hd # type: ignore[assignment]
|
||||
lines.append(f' - name: "{hd_dict["name"]}"')
|
||||
lines.append(f' value: "{hd_dict["value"]}"')
|
||||
if first_key:
|
||||
lines.append(" - {}")
|
||||
return lines
|
||||
|
||||
|
||||
def egress_render_routes(
|
||||
routes: tuple[EgressRoute, ...],
|
||||
*,
|
||||
log: int = 0,
|
||||
) -> str:
|
||||
"""Serialize the route table for the addon to read.
|
||||
|
||||
YAML content — no token values, no host env-var names. Fields are
|
||||
determined by `_route_to_yaml_fields`, which is the single point of
|
||||
truth for the EgressRoute → egress_addon_core.Route mapping."""
|
||||
lines: list[str] = ["routes:"]
|
||||
lines: list[str] = []
|
||||
if log:
|
||||
lines.append(f"log: {log}")
|
||||
lines.append("routes:")
|
||||
if not routes:
|
||||
lines[0] = "routes: []"
|
||||
lines[-1] = "routes: []"
|
||||
return "\n".join(lines) + "\n"
|
||||
for r in routes:
|
||||
f = _route_to_yaml_fields(r)
|
||||
@@ -250,10 +241,24 @@ def egress_render_routes(
|
||||
if "auth_scheme" in f:
|
||||
lines.append(f' auth_scheme: "{f["auth_scheme"]}"')
|
||||
lines.append(f' token_env: "{f["token_env"]}"')
|
||||
if "path_allowlist" in f:
|
||||
lines.append(" path_allowlist:")
|
||||
for p in f["path_allowlist"]: # type: ignore
|
||||
lines.append(f' - "{p}"')
|
||||
if "matches" in f:
|
||||
lines.append(" matches:")
|
||||
for entry in f["matches"]: # type: ignore[union-attr]
|
||||
lines.extend(_render_match_entry(entry)) # type: ignore[arg-type]
|
||||
if "git" in f:
|
||||
git_dict: dict[str, object] = f["git"] # type: ignore
|
||||
lines.append(" git:")
|
||||
if git_dict.get("fetch") is True:
|
||||
lines.append(" fetch: true")
|
||||
if "dlp" in f:
|
||||
dlp_dict: dict[str, object] = f["dlp"] # type: ignore
|
||||
lines.append(" dlp:")
|
||||
for dk, dv in dlp_dict.items():
|
||||
if dv is False:
|
||||
lines.append(f" {dk}: false")
|
||||
elif isinstance(dv, list):
|
||||
items_str = ", ".join(f'"{x}"' for x in dv)
|
||||
lines.append(f" {dk}: [{items_str}]")
|
||||
return "\n".join(lines) + "\n"
|
||||
|
||||
|
||||
@@ -261,12 +266,6 @@ def egress_resolve_token_values(
|
||||
token_env_map: dict[str, str],
|
||||
host_env: dict[str, str],
|
||||
) -> dict[str, str]:
|
||||
"""Read `host_env[TokenRef]` for each entry in `token_env_map` and
|
||||
return `{token_env: <value>}`. Dies (with a pointer at the missing
|
||||
var name) if any TokenRef is unset.
|
||||
|
||||
Pure function: takes the host env as an argument so tests can pass
|
||||
a sealed mapping without touching `os.environ`."""
|
||||
out: dict[str, str] = {}
|
||||
for token_env, token_ref in token_env_map.items():
|
||||
value = host_env.get(token_ref)
|
||||
@@ -287,36 +286,24 @@ def egress_resolve_token_values(
|
||||
|
||||
|
||||
class Egress(ABC):
|
||||
"""The per-bottle egress proxy. Encapsulates the host-side prepare
|
||||
(route lift + routes.yaml render + token-env-map derivation); the
|
||||
sidecar's start/stop lifecycle is backend-specific and lives on
|
||||
concrete subclasses."""
|
||||
|
||||
def prepare(
|
||||
self,
|
||||
bottle: Bottle,
|
||||
bottle: ManifestBottle,
|
||||
slug: str,
|
||||
stage_dir: Path,
|
||||
provider_routes: tuple[EgressRoute, ...] = (),
|
||||
) -> EgressPlan:
|
||||
"""Lift `bottle.egress.routes` + `provider_routes` into resolved
|
||||
routes, render the routes file (mode 600) under `stage_dir`, and
|
||||
return the plan. Pure host-side, no docker subprocess. The
|
||||
token-env map records the mapping the launch step uses to
|
||||
forward values from the host's environ into the sidecar's environ.
|
||||
|
||||
Returned plan is incomplete: the launch step must fill
|
||||
`internal_network` / `egress_network` / `pipelock_proxy_url`
|
||||
via `dataclasses.replace` before passing it to `.start`."""
|
||||
routes = egress_routes_for_bottle(bottle, provider_routes)
|
||||
log = bottle.egress.Log
|
||||
routes_path = stage_dir / "egress_routes.yaml"
|
||||
routes_path.write_text(egress_render_routes(routes))
|
||||
routes_path.write_text(egress_render_routes(routes, log=log))
|
||||
routes_path.chmod(0o600)
|
||||
return EgressPlan(
|
||||
slug=slug,
|
||||
routes_path=routes_path,
|
||||
routes=routes,
|
||||
token_env_map=egress_token_env_map(routes),
|
||||
log=log,
|
||||
)
|
||||
|
||||
__all__ = [
|
||||
|
||||
+195
-89
@@ -1,28 +1,7 @@
|
||||
"""mitmproxy addon entrypoint for the egress sidecar (PRD 0017).
|
||||
"""mitmproxy addon entrypoint for the egress sidecar (PRD 0017, PRD 0053).
|
||||
|
||||
Loaded by `mitmdump -s /app/egress_addon.py` inside the
|
||||
egress container. Wraps the pure logic from
|
||||
`egress_addon_core` with mitmproxy's HTTPFlow API:
|
||||
|
||||
- At startup, read `EGRESS_ROUTES` (default
|
||||
`/etc/egress/routes.yaml`, JSON content) → routes table.
|
||||
- SIGHUP re-reads the file and atomically swaps the in-memory
|
||||
table. A parse error keeps the old table in place — better to
|
||||
keep serving the old config than to leave the proxy with no
|
||||
routes after a typo.
|
||||
- On each `request`: strip the inbound Authorization header, then
|
||||
consult `decide()` for forward / block / inject-auth and apply
|
||||
the decision to the flow.
|
||||
|
||||
This file imports `mitmproxy` and is never imported on the host —
|
||||
mitmproxy is a container-only dependency. The host's tests target
|
||||
`egress_addon_core`.
|
||||
|
||||
Dockerfile.sidecars copies both this file and
|
||||
`egress_addon_core.py` flat into `/app/`; the absolute import
|
||||
below works because mitmdump runs with `/app` on its sys.path. The
|
||||
parallel file in the package source tree (bot_bottle/) is the
|
||||
build input — not a module the host imports."""
|
||||
egress container."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
@@ -33,62 +12,61 @@ import signal
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from mitmproxy import http # type: ignore[import-not-found]
|
||||
from mitmproxy import http # type: ignore[import-not-found] # pylint: disable=import-error
|
||||
|
||||
# Absolute import (NOT `from .egress_addon_core`) — the
|
||||
# container drops both files flat into /app/ so they are sibling
|
||||
# top-level modules to mitmdump's loader, not a package.
|
||||
from egress_addon_core import ( # type: ignore[import-not-found]
|
||||
Route,
|
||||
from egress_addon_core import ( # type: ignore[import-not-found] # pylint: disable=import-error
|
||||
LOG_BLOCKS,
|
||||
LOG_FULL,
|
||||
Config,
|
||||
build_inbound_scan_text,
|
||||
build_outbound_scan_text,
|
||||
decide,
|
||||
decide_git_fetch,
|
||||
is_git_fetch_request,
|
||||
is_git_push_request,
|
||||
load_routes,
|
||||
load_config,
|
||||
match_route,
|
||||
outbound_scan_headers,
|
||||
scan_inbound,
|
||||
scan_outbound,
|
||||
)
|
||||
|
||||
try:
|
||||
from dlp_detectors import redact_tokens # type: ignore[import-not-found]
|
||||
except ImportError: # pragma: no cover - host-side path
|
||||
from bot_bottle.dlp_detectors import redact_tokens # type: ignore[import-not-found]
|
||||
|
||||
|
||||
DEFAULT_ROUTES_PATH = "/etc/egress/routes.yaml"
|
||||
|
||||
# Magic hostname the addon recognises as an introspection target.
|
||||
# Requests through the proxy for `_egress.local/<path>` are
|
||||
# intercepted and answered with synthetic responses (the addon's
|
||||
# `request` hook sets `flow.response` before any upstream connection).
|
||||
# The hostname is not in DNS — only clients dialing through this
|
||||
# specific egress can reach it, and only via HTTP (no TLS).
|
||||
# Used by the supervise sidecar's `list-egress-routes` MCP
|
||||
# tool to surface the live route table to the agent.
|
||||
INTROSPECT_HOST = "_egress.local"
|
||||
|
||||
|
||||
class EgressAddon:
|
||||
"""The mitmproxy addon. One instance per `mitmdump` process; the
|
||||
request hook is invoked on every CONNECT-decapsulated HTTP/HTTPS
|
||||
request the agent makes."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.routes_path = os.environ.get("EGRESS_ROUTES", DEFAULT_ROUTES_PATH)
|
||||
self.routes: tuple[Route, ...] = ()
|
||||
self.config: Config = Config(routes=())
|
||||
self._reload(initial=True)
|
||||
self._install_sighup()
|
||||
|
||||
def _reload(self, *, initial: bool = False) -> None:
|
||||
try:
|
||||
text = Path(self.routes_path).read_text(encoding="utf-8")
|
||||
new_routes = load_routes(text)
|
||||
new_config = load_config(text)
|
||||
except (OSError, ValueError) as e:
|
||||
tag = "boot" if initial else "SIGHUP"
|
||||
sys.stderr.write(
|
||||
f"egress: {tag} load failed: {e}\n"
|
||||
)
|
||||
if initial:
|
||||
# No baseline to fall back on; serve nothing rather
|
||||
# than masquerade as a proxy with a route table the
|
||||
# operator never declared.
|
||||
self.routes = ()
|
||||
self.config = Config(routes=())
|
||||
return
|
||||
self.routes = new_routes
|
||||
self.config = new_config
|
||||
log_label = ("off", "blocks", "full")[self.config.log]
|
||||
sys.stderr.write(
|
||||
f"egress: loaded {len(self.routes)} route(s): "
|
||||
f"{', '.join(r.host for r in self.routes)}\n"
|
||||
f"egress: loaded {len(self.config.routes)} route(s): "
|
||||
f"{', '.join(r.host for r in self.config.routes)}"
|
||||
f" [log={log_label}]\n"
|
||||
)
|
||||
|
||||
def _install_sighup(self) -> None:
|
||||
@@ -102,14 +80,9 @@ class EgressAddon:
|
||||
signal.signal(signal.SIGHUP, handler)
|
||||
|
||||
def _serve_introspection(self, flow: http.HTTPFlow, path: str) -> None:
|
||||
"""Synthesize a response for `_egress.local` requests.
|
||||
Currently supports `/allowlist` which returns the in-memory
|
||||
route table as JSON (host, path_allowlist, auth_scheme,
|
||||
token_env per route — no token VALUES, those live in the
|
||||
container's environ)."""
|
||||
if path == "/allowlist":
|
||||
payload = json.dumps(
|
||||
{"routes": [dataclasses.asdict(r) for r in self.routes]},
|
||||
{"routes": [dataclasses.asdict(r) for r in self.config.routes]},
|
||||
indent=2,
|
||||
).encode("utf-8")
|
||||
flow.response = http.Response.make(
|
||||
@@ -123,61 +96,194 @@ class EgressAddon:
|
||||
{"Content-Type": "text/plain; charset=utf-8"},
|
||||
)
|
||||
|
||||
# mitmproxy's addon API: this method name + signature is how
|
||||
# mitmdump discovers the request hook.
|
||||
def _req_ctx(self, flow: http.HTTPFlow) -> dict[str, object]:
|
||||
return {
|
||||
"host": redact_tokens(flow.request.pretty_host, env=os.environ),
|
||||
"method": flow.request.method,
|
||||
"path": redact_tokens(flow.request.path, env=os.environ),
|
||||
}
|
||||
|
||||
def _block(
|
||||
self,
|
||||
flow: http.HTTPFlow,
|
||||
reason: str,
|
||||
ctx: dict[str, object] | None = None,
|
||||
) -> None:
|
||||
if self.config.log >= LOG_BLOCKS:
|
||||
entry: dict[str, object] = {"event": "egress_block", "reason": reason}
|
||||
if ctx:
|
||||
entry.update(ctx)
|
||||
sys.stderr.write(json.dumps(entry) + "\n")
|
||||
flow.response = http.Response.make(
|
||||
403,
|
||||
reason.encode("utf-8"),
|
||||
{"Content-Type": "text/plain; charset=utf-8"},
|
||||
)
|
||||
|
||||
def _log_request(self, flow: http.HTTPFlow) -> None:
|
||||
sys.stderr.write(
|
||||
json.dumps({
|
||||
"event": "egress_request",
|
||||
"host": redact_tokens(flow.request.pretty_host, env=os.environ),
|
||||
"method": flow.request.method,
|
||||
"path": redact_tokens(flow.request.path, env=os.environ),
|
||||
"headers": dict(flow.request.headers),
|
||||
"body": flow.request.get_text(strict=False) or "",
|
||||
})
|
||||
+ "\n"
|
||||
)
|
||||
|
||||
def _log_response(self, flow: http.HTTPFlow) -> None:
|
||||
sys.stderr.write(
|
||||
json.dumps({
|
||||
"event": "egress_response",
|
||||
"host": flow.request.pretty_host,
|
||||
"status": flow.response.status_code,
|
||||
"headers": dict(flow.response.headers),
|
||||
"body": flow.response.get_text(strict=False) or "",
|
||||
})
|
||||
+ "\n"
|
||||
)
|
||||
|
||||
def request(self, flow: http.HTTPFlow) -> None:
|
||||
request_path, _, query = flow.request.path.partition("?")
|
||||
|
||||
# Introspection: requests to the magic `_egress.local`
|
||||
# host are answered locally with a synthetic response. Check
|
||||
# before the strip-auth + route logic — these requests aren't
|
||||
# real upstream traffic, the agent isn't injecting auth, and
|
||||
# the addon's own decide() would 403 the magic host (it's
|
||||
# never in the routes table).
|
||||
if flow.request.pretty_host == INTROSPECT_HOST:
|
||||
self._serve_introspection(flow, request_path)
|
||||
return
|
||||
|
||||
# Inbound Authorization is always stripped — the agent cannot
|
||||
# smuggle a stolen token through the proxy. If the matched
|
||||
# route declares an auth pair, a fresh header is injected
|
||||
# below.
|
||||
flow.request.headers.pop("authorization", None)
|
||||
# DLP outbound scan BEFORE stripping auth — catches tokens the
|
||||
# agent tried to smuggle in any header, path, query param, or body.
|
||||
# Hostname is included to catch DNS-tunnelling exfiltration attempts.
|
||||
route = match_route(self.config.routes, flow.request.pretty_host)
|
||||
if route is not None:
|
||||
body = flow.request.get_text(strict=False) or ""
|
||||
scan_text = build_outbound_scan_text(
|
||||
flow.request.pretty_host,
|
||||
request_path,
|
||||
query,
|
||||
outbound_scan_headers(route, dict(flow.request.headers)),
|
||||
body,
|
||||
)
|
||||
dlp_result = scan_outbound(route, scan_text, os.environ)
|
||||
if dlp_result is not None and dlp_result.severity == "block":
|
||||
ctx = self._req_ctx(flow)
|
||||
if dlp_result.context:
|
||||
ctx = {**ctx, "context": dlp_result.context}
|
||||
self._block(flow, f"egress DLP: {dlp_result.reason}", ctx=ctx)
|
||||
return
|
||||
|
||||
# Universal HTTPS git-push block. Defense-in-depth: git-gate
|
||||
# (PRD 0008) is the only sanctioned outbound path for git
|
||||
# writes — its pre-receive runs gitleaks. Letting HTTPS push
|
||||
# through egress + auth injection would route around
|
||||
# that scan, so we 403 before any route logic.
|
||||
if is_git_push_request(request_path, query):
|
||||
flow.response = http.Response.make(
|
||||
403,
|
||||
(
|
||||
b"egress: git push over HTTPS is not supported; "
|
||||
b"use the bottle.git SSH path (gitleaks-scanned by "
|
||||
b"git-gate's pre-receive hook)."
|
||||
),
|
||||
{"Content-Type": "text/plain; charset=utf-8"},
|
||||
self._block(
|
||||
flow,
|
||||
"egress: git push over HTTPS is not supported; "
|
||||
"use the bottle.git SSH path (gitleaks-scanned by "
|
||||
"git-gate's pre-receive hook).",
|
||||
ctx=self._req_ctx(flow),
|
||||
)
|
||||
return
|
||||
|
||||
if is_git_fetch_request(request_path, query):
|
||||
git_decision = decide_git_fetch(
|
||||
self.config.routes, flow.request.pretty_host,
|
||||
)
|
||||
if git_decision.action == "block":
|
||||
self._block(
|
||||
flow,
|
||||
git_decision.reason,
|
||||
ctx=self._req_ctx(flow),
|
||||
)
|
||||
return
|
||||
|
||||
# Strip agent-set Authorization after DLP scan so smuggled tokens
|
||||
# are caught above; the route may inject sidecar-owned auth below.
|
||||
flow.request.headers.pop("authorization", None)
|
||||
|
||||
# Build headers mapping for match evaluation
|
||||
req_headers = {k.lower(): v for k, v in flow.request.headers.items()}
|
||||
|
||||
decision = decide(
|
||||
self.routes,
|
||||
self.config.routes,
|
||||
flow.request.pretty_host,
|
||||
request_path,
|
||||
os.environ,
|
||||
request_method=flow.request.method,
|
||||
request_headers=req_headers,
|
||||
)
|
||||
|
||||
if decision.action == "block":
|
||||
flow.response = http.Response.make(
|
||||
403,
|
||||
decision.reason.encode("utf-8"),
|
||||
{"Content-Type": "text/plain; charset=utf-8"},
|
||||
)
|
||||
self._block(flow, decision.reason, ctx=self._req_ctx(flow))
|
||||
return
|
||||
|
||||
if decision.inject_authorization is not None:
|
||||
flow.request.headers["authorization"] = decision.inject_authorization
|
||||
|
||||
if self.config.log >= LOG_FULL:
|
||||
self._log_request(flow)
|
||||
|
||||
def response(self, flow: http.HTTPFlow) -> None:
|
||||
"""DLP inbound scan on response headers and body."""
|
||||
route = match_route(self.config.routes, flow.request.pretty_host)
|
||||
if route is None:
|
||||
return
|
||||
if flow.response is None:
|
||||
return
|
||||
if self.config.log >= LOG_FULL:
|
||||
self._log_response(flow)
|
||||
resp_headers = {k.lower(): v for k, v in flow.response.headers.items()}
|
||||
body = flow.response.get_text(strict=False) or ""
|
||||
scan_text = build_inbound_scan_text(resp_headers, body)
|
||||
if not scan_text:
|
||||
return
|
||||
result = scan_inbound(route, scan_text)
|
||||
if result is None:
|
||||
return
|
||||
resp_ctx: dict[str, object] = {
|
||||
**self._req_ctx(flow),
|
||||
"response_status": flow.response.status_code,
|
||||
}
|
||||
if result.context:
|
||||
resp_ctx = {**resp_ctx, "context": result.context}
|
||||
if result.severity == "block":
|
||||
self._block(flow, f"egress DLP: {result.reason}", ctx=resp_ctx)
|
||||
elif result.severity == "warn" and self.config.log >= LOG_BLOCKS:
|
||||
sys.stderr.write(
|
||||
json.dumps({
|
||||
"event": "egress_warn",
|
||||
"reason": f"egress DLP: {result.reason}",
|
||||
**resp_ctx,
|
||||
})
|
||||
+ "\n"
|
||||
)
|
||||
|
||||
def websocket_message(self, flow: http.HTTPFlow) -> None:
|
||||
"""DLP scan on WebSocket frames.
|
||||
|
||||
Outbound frames (from_client) are scanned for credential leakage;
|
||||
inbound frames are scanned for prompt injection. On a block the
|
||||
entire connection is killed — there is no HTTP response surface to
|
||||
write to after the upgrade.
|
||||
"""
|
||||
if flow.websocket is None: # type: ignore[union-attr]
|
||||
return
|
||||
route = match_route(self.config.routes, flow.request.pretty_host)
|
||||
if route is None:
|
||||
return
|
||||
message = flow.websocket.messages[-1] # type: ignore[union-attr]
|
||||
content = message.content.decode("utf-8", errors="replace")
|
||||
if message.from_client:
|
||||
result = scan_outbound(route, content, os.environ)
|
||||
if result is not None and result.severity == "block":
|
||||
sys.stderr.write(f"egress DLP: {result.reason}\n")
|
||||
flow.kill() # type: ignore[union-attr]
|
||||
else:
|
||||
result = scan_inbound(route, content)
|
||||
if result is not None:
|
||||
if result.severity == "block":
|
||||
sys.stderr.write(f"egress DLP: {result.reason}\n")
|
||||
flow.kill() # type: ignore[union-attr]
|
||||
elif result.severity == "warn":
|
||||
sys.stderr.write(f"egress DLP warn: {result.reason}\n")
|
||||
|
||||
|
||||
addons = [EgressAddon()]
|
||||
|
||||
+571
-119
@@ -1,4 +1,4 @@
|
||||
"""Pure logic for the egress mitmproxy addon (PRD 0017).
|
||||
"""Pure logic for the egress mitmproxy addon (PRD 0017, PRD 0053).
|
||||
|
||||
Split out of `egress_addon.py` so the host's unit tests can
|
||||
exercise the parse + decision functions without depending on the
|
||||
@@ -8,74 +8,268 @@ container.
|
||||
|
||||
Imports: stdlib + `yaml_subset` (which is itself stdlib-only and
|
||||
ships flat into the sidecar bundle image alongside this file —
|
||||
see `Dockerfile.sidecars`).
|
||||
"""
|
||||
see `Dockerfile.sidecars`)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
import typing
|
||||
from dataclasses import dataclass
|
||||
|
||||
# Absolute import — `yaml_subset.py` is copied flat into the bundle
|
||||
# image's `/app/` next to this file (via `Dockerfile.sidecars`).
|
||||
# The host-side unit tests run with the repo on sys.path, where the
|
||||
# import resolves under the `bot_bottle` package. The try/except
|
||||
# shim picks whichever import works.
|
||||
try:
|
||||
from yaml_subset import YamlSubsetError, parse_yaml_subset # type: ignore[import-not-found]
|
||||
except ImportError: # pragma: no cover - host-side path
|
||||
from .yaml_subset import YamlSubsetError, parse_yaml_subset
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Match types (Gateway API HTTPRoute vocabulary, PRD 0053)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
PATH_MATCH_TYPES = ("exact", "prefix", "regex")
|
||||
HEADER_MATCH_TYPES = ("exact", "regex")
|
||||
|
||||
VALID_METHODS = frozenset({
|
||||
"GET", "HEAD", "POST", "PUT", "DELETE", "PATCH", "OPTIONS", "TRACE",
|
||||
"CONNECT",
|
||||
})
|
||||
|
||||
OUTBOUND_DETECTOR_NAMES = frozenset({"token_patterns", "known_secrets"})
|
||||
INBOUND_DETECTOR_NAMES = frozenset({"naive_injection_detection"})
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class PathMatch:
|
||||
type: str # "exact" | "prefix" | "regex"
|
||||
value: str
|
||||
compiled: re.Pattern[str] | None = None
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class HeaderMatch:
|
||||
name: str
|
||||
value: str
|
||||
type: str = "exact" # "exact" | "regex"
|
||||
compiled: re.Pattern[str] | None = None
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class MatchEntry:
|
||||
paths: tuple[PathMatch, ...] = ()
|
||||
methods: tuple[str, ...] = ()
|
||||
headers: tuple[HeaderMatch, ...] = ()
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Route:
|
||||
"""One row of the egress route table.
|
||||
|
||||
`host` is the request's `Host` header (or SNI hostname) to match
|
||||
against. `path_allowlist` is an optional tuple of absolute path
|
||||
prefixes the request path must start with; empty tuple means no
|
||||
path constraint. `auth_scheme` and `token_env` together form the
|
||||
credential-injection pair (both set or both empty); a non-empty
|
||||
pair tells the addon to overwrite the inbound Authorization with
|
||||
`<auth_scheme> <value-of-environ[token_env]>`.
|
||||
"""
|
||||
|
||||
host: str
|
||||
path_allowlist: tuple[str, ...] = ()
|
||||
matches: tuple[MatchEntry, ...] = ()
|
||||
auth_scheme: str = ""
|
||||
token_env: str = ""
|
||||
git_fetch: bool = False
|
||||
outbound_detectors: tuple[str, ...] | None = None
|
||||
inbound_detectors: tuple[str, ...] | None = None
|
||||
|
||||
|
||||
LOG_OFF = 0 # no logging
|
||||
LOG_BLOCKS = 1 # log block/warn events with request context
|
||||
LOG_FULL = 2 # log block/warn events + full request and response bodies
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Config:
|
||||
routes: tuple[Route, ...]
|
||||
log: int = LOG_OFF
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Decision:
|
||||
"""The result of `decide()`. Either forward (with optional
|
||||
`inject_authorization` header) or block (with a `reason` to surface
|
||||
to the agent)."""
|
||||
|
||||
action: str # "forward" or "block"
|
||||
reason: str = ""
|
||||
inject_authorization: str | None = None
|
||||
|
||||
|
||||
def parse_routes(payload: object) -> tuple[Route, ...]:
|
||||
"""Parse the routes-file payload (already JSON-decoded) into a
|
||||
tuple of `Route`s. Raises `ValueError` on any malformed entry —
|
||||
the caller decides whether to keep the old table or refuse to
|
||||
start.
|
||||
@dataclass(frozen=True)
|
||||
class ScanResult:
|
||||
severity: str # "block" or "warn"
|
||||
reason: str
|
||||
location: str = "" # where the match was found, e.g. "body", "authorization header"
|
||||
context: str = "" # surrounding text with the match replaced by REDACT
|
||||
|
||||
Schema:
|
||||
{
|
||||
"routes": [
|
||||
{
|
||||
"host": "api.github.com",
|
||||
"path_allowlist": ["/repos/x/", "/users/x"], # optional
|
||||
"auth_scheme": "Bearer", # optional
|
||||
"token_env": "EGRESS_TOKEN_0" # optional
|
||||
},
|
||||
...
|
||||
]
|
||||
}
|
||||
"""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Parsing
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _parse_path_match(idx: int, j: int, raw: object) -> PathMatch:
|
||||
label = f"route[{idx}] matches paths[{j}]"
|
||||
if not isinstance(raw, dict):
|
||||
raise ValueError(f"{label}: must be an object")
|
||||
raw_dict: dict[str, object] = typing.cast(dict[str, object], raw)
|
||||
ptype = raw_dict.get("type", "prefix")
|
||||
if not isinstance(ptype, str) or ptype not in PATH_MATCH_TYPES:
|
||||
raise ValueError(
|
||||
f"{label}: 'type' must be one of {', '.join(PATH_MATCH_TYPES)} "
|
||||
f"(got {ptype!r})"
|
||||
)
|
||||
value = raw_dict.get("value")
|
||||
if not isinstance(value, str) or not value:
|
||||
raise ValueError(f"{label}: 'value' must be a non-empty string")
|
||||
if ptype in ("exact", "prefix") and not value.startswith("/"):
|
||||
raise ValueError(
|
||||
f"{label}: value {value!r} must start with '/' for "
|
||||
f"type {ptype!r}"
|
||||
)
|
||||
compiled: re.Pattern[str] | None = None
|
||||
if ptype == "regex":
|
||||
try:
|
||||
compiled = re.compile(value)
|
||||
except re.error as e:
|
||||
raise ValueError(
|
||||
f"{label}: regex {value!r} failed to compile: {e}"
|
||||
) from e
|
||||
for k in raw_dict:
|
||||
if k not in ("type", "value"):
|
||||
raise ValueError(f"{label}: unknown key {k!r}")
|
||||
return PathMatch(type=ptype, value=value, compiled=compiled)
|
||||
|
||||
|
||||
def _parse_header_match(idx: int, j: int, raw: object) -> HeaderMatch:
|
||||
label = f"route[{idx}] matches headers[{j}]"
|
||||
if not isinstance(raw, dict):
|
||||
raise ValueError(f"{label}: must be an object")
|
||||
raw_dict: dict[str, object] = typing.cast(dict[str, object], raw)
|
||||
name = raw_dict.get("name")
|
||||
if not isinstance(name, str) or not name:
|
||||
raise ValueError(f"{label}: 'name' must be a non-empty string")
|
||||
value = raw_dict.get("value")
|
||||
if not isinstance(value, str):
|
||||
raise ValueError(f"{label}: 'value' must be a string")
|
||||
htype = raw_dict.get("type", "exact")
|
||||
if not isinstance(htype, str) or htype not in HEADER_MATCH_TYPES:
|
||||
raise ValueError(
|
||||
f"{label}: 'type' must be one of {', '.join(HEADER_MATCH_TYPES)} "
|
||||
f"(got {htype!r})"
|
||||
)
|
||||
compiled: re.Pattern[str] | None = None
|
||||
if htype == "regex":
|
||||
try:
|
||||
compiled = re.compile(value)
|
||||
except re.error as e:
|
||||
raise ValueError(
|
||||
f"{label}: regex {value!r} failed to compile: {e}"
|
||||
) from e
|
||||
for k in raw_dict:
|
||||
if k not in ("name", "value", "type"):
|
||||
raise ValueError(f"{label}: unknown key {k!r}")
|
||||
return HeaderMatch(name=name, value=value, type=htype, compiled=compiled)
|
||||
|
||||
|
||||
def _parse_match_entry(idx: int, k: int, raw: object) -> MatchEntry:
|
||||
label = f"route[{idx}] matches[{k}]"
|
||||
if not isinstance(raw, dict):
|
||||
raise ValueError(f"{label}: must be an object")
|
||||
raw_dict: dict[str, object] = typing.cast(dict[str, object], raw)
|
||||
|
||||
paths: tuple[PathMatch, ...] = ()
|
||||
paths_raw = raw_dict.get("paths")
|
||||
if paths_raw is not None:
|
||||
if not isinstance(paths_raw, list):
|
||||
raise ValueError(f"{label}: 'paths' must be a list")
|
||||
paths_list = typing.cast(list[object], paths_raw)
|
||||
paths = tuple(_parse_path_match(idx, j, p) for j, p in enumerate(paths_list))
|
||||
|
||||
methods: tuple[str, ...] = ()
|
||||
methods_raw = raw_dict.get("methods")
|
||||
if methods_raw is not None:
|
||||
if not isinstance(methods_raw, list):
|
||||
raise ValueError(f"{label}: 'methods' must be a list")
|
||||
methods_list = typing.cast(list[object], methods_raw)
|
||||
normalised: list[str] = []
|
||||
for j, m in enumerate(methods_list):
|
||||
if not isinstance(m, str):
|
||||
raise ValueError(f"{label}: methods[{j}] must be a string")
|
||||
upper = m.upper()
|
||||
if upper not in VALID_METHODS:
|
||||
raise ValueError(
|
||||
f"{label}: methods[{j}] {m!r} is not a valid HTTP method"
|
||||
)
|
||||
normalised.append(upper)
|
||||
methods = tuple(normalised)
|
||||
|
||||
headers: tuple[HeaderMatch, ...] = ()
|
||||
headers_raw = raw_dict.get("headers")
|
||||
if headers_raw is not None:
|
||||
if not isinstance(headers_raw, list):
|
||||
raise ValueError(f"{label}: 'headers' must be a list")
|
||||
headers_list = typing.cast(list[object], headers_raw)
|
||||
headers = tuple(
|
||||
_parse_header_match(idx, j, h) for j, h in enumerate(headers_list)
|
||||
)
|
||||
|
||||
for key in raw_dict:
|
||||
if key not in ("paths", "methods", "headers"):
|
||||
raise ValueError(f"{label}: unknown key {key!r}")
|
||||
|
||||
return MatchEntry(paths=paths, methods=methods, headers=headers)
|
||||
|
||||
|
||||
def _parse_detectors(
|
||||
idx: int,
|
||||
host: str,
|
||||
raw_dict: dict[str, object],
|
||||
) -> tuple[tuple[str, ...] | None, tuple[str, ...] | None]:
|
||||
"""Parse the optional `dlp` block on a route, returning
|
||||
(outbound_detectors, inbound_detectors)."""
|
||||
dlp_raw = raw_dict.get("dlp")
|
||||
if dlp_raw is None:
|
||||
return None, None
|
||||
label = f"route[{idx}] ({host})"
|
||||
if not isinstance(dlp_raw, dict):
|
||||
raise ValueError(f"{label}: 'dlp' must be an object")
|
||||
dlp = typing.cast(dict[str, object], dlp_raw)
|
||||
|
||||
def _parse_detector_field(
|
||||
field: str,
|
||||
valid_names: frozenset[str],
|
||||
) -> tuple[str, ...] | None:
|
||||
val = dlp.get(field)
|
||||
if val is None:
|
||||
return None
|
||||
if val is False:
|
||||
return ()
|
||||
if not isinstance(val, list):
|
||||
raise ValueError(
|
||||
f"{label}: dlp.{field} must be false, a list, or omitted"
|
||||
)
|
||||
items = typing.cast(list[object], val)
|
||||
names: list[str] = []
|
||||
for j, item in enumerate(items):
|
||||
if not isinstance(item, str):
|
||||
raise ValueError(
|
||||
f"{label}: dlp.{field}[{j}] must be a string"
|
||||
)
|
||||
if item not in valid_names:
|
||||
raise ValueError(
|
||||
f"{label}: dlp.{field}[{j}] {item!r} is not a valid "
|
||||
f"detector name; valid names: {', '.join(sorted(valid_names))}"
|
||||
)
|
||||
names.append(item)
|
||||
return tuple(names)
|
||||
|
||||
outbound = _parse_detector_field("outbound_detectors", OUTBOUND_DETECTOR_NAMES)
|
||||
inbound = _parse_detector_field("inbound_detectors", INBOUND_DETECTOR_NAMES)
|
||||
|
||||
for k in dlp:
|
||||
if k not in ("outbound_detectors", "inbound_detectors"):
|
||||
raise ValueError(
|
||||
f"{label}: dlp has unknown key {k!r}; accepted keys "
|
||||
f"are 'outbound_detectors', 'inbound_detectors'"
|
||||
)
|
||||
return outbound, inbound
|
||||
|
||||
|
||||
def parse_routes(payload: object) -> tuple[Route, ...]:
|
||||
if not isinstance(payload, dict):
|
||||
raise ValueError("routes payload: top-level must be an object")
|
||||
payload_dict: dict[str, object] = typing.cast(dict[str, object], payload)
|
||||
@@ -98,32 +292,24 @@ def _parse_one(idx: int, raw: object) -> Route:
|
||||
if not isinstance(host, str) or not host:
|
||||
raise ValueError(f"{label}: 'host' must be a non-empty string")
|
||||
|
||||
path_allow_raw: object = raw_dict.get("path_allowlist", [])
|
||||
if not isinstance(path_allow_raw, list):
|
||||
raise ValueError(f"{label} ({host}): 'path_allowlist' must be a list")
|
||||
path_allow_list: list[object] = typing.cast(list[object], path_allow_raw)
|
||||
prefixes: list[str] = []
|
||||
for j, p in enumerate(path_allow_list):
|
||||
if not isinstance(p, str):
|
||||
raise ValueError(
|
||||
f"{label} ({host}): path_allowlist[{j}] must be a string"
|
||||
)
|
||||
if not p.startswith("/"):
|
||||
raise ValueError(
|
||||
f"{label} ({host}): path_allowlist[{j}] {p!r} must be an "
|
||||
f"absolute path prefix starting with '/'"
|
||||
)
|
||||
prefixes.append(p)
|
||||
# matches
|
||||
matches: tuple[MatchEntry, ...] = ()
|
||||
matches_raw = raw_dict.get("matches")
|
||||
if matches_raw is not None:
|
||||
if not isinstance(matches_raw, list):
|
||||
raise ValueError(f"{label} ({host}): 'matches' must be a list")
|
||||
matches_list = typing.cast(list[object], matches_raw)
|
||||
matches = tuple(
|
||||
_parse_match_entry(idx, k, m) for k, m in enumerate(matches_list)
|
||||
)
|
||||
|
||||
# auth (unchanged wire format)
|
||||
auth_scheme: object = raw_dict.get("auth_scheme", "")
|
||||
token_env: object = raw_dict.get("token_env", "")
|
||||
if not isinstance(auth_scheme, str):
|
||||
raise ValueError(f"{label} ({host}): 'auth_scheme' must be a string")
|
||||
if not isinstance(token_env, str):
|
||||
raise ValueError(f"{label} ({host}): 'token_env' must be a string")
|
||||
# Both-or-neither: 'auth' on the manifest side renders to this
|
||||
# pair atomically. A partial pair here means the renderer or a
|
||||
# hand-edited file is broken.
|
||||
if bool(auth_scheme) != bool(token_env):
|
||||
raise ValueError(
|
||||
f"{label} ({host}): 'auth_scheme' and 'token_env' must be both "
|
||||
@@ -131,19 +317,50 @@ def _parse_one(idx: int, raw: object) -> Route:
|
||||
f"token_env={token_env!r})"
|
||||
)
|
||||
|
||||
# git-over-HTTPS policy
|
||||
git_fetch = False
|
||||
git_raw = raw_dict.get("git")
|
||||
if git_raw is not None:
|
||||
if not isinstance(git_raw, dict):
|
||||
raise ValueError(f"{label} ({host}): 'git' must be an object")
|
||||
git_dict: dict[str, object] = typing.cast(dict[str, object], git_raw)
|
||||
fetch_raw = git_dict.get("fetch", False)
|
||||
if fetch_raw is True or fetch_raw is False:
|
||||
git_fetch = fetch_raw
|
||||
else:
|
||||
raise ValueError(f"{label} ({host}): 'git.fetch' must be a boolean")
|
||||
for k in git_dict:
|
||||
if k != "fetch":
|
||||
raise ValueError(
|
||||
f"{label} ({host}): git has unknown key {k!r}; "
|
||||
"accepted key is 'fetch'"
|
||||
)
|
||||
|
||||
# dlp detectors
|
||||
outbound_detectors, inbound_detectors = _parse_detectors(
|
||||
idx, host, raw_dict,
|
||||
)
|
||||
|
||||
for k in raw_dict:
|
||||
if k not in ("host", "matches", "auth_scheme", "token_env", "dlp", "git"):
|
||||
raise ValueError(
|
||||
f"{label} ({host}): unknown key {k!r}; accepted keys "
|
||||
f"are 'host', 'matches', 'auth_scheme', 'token_env', 'dlp', 'git'"
|
||||
)
|
||||
|
||||
return Route(
|
||||
host=host,
|
||||
path_allowlist=tuple(prefixes),
|
||||
matches=matches,
|
||||
auth_scheme=auth_scheme,
|
||||
token_env=token_env,
|
||||
git_fetch=git_fetch,
|
||||
outbound_detectors=outbound_detectors,
|
||||
inbound_detectors=inbound_detectors,
|
||||
)
|
||||
|
||||
|
||||
def load_routes(text: str) -> tuple[Route, ...]:
|
||||
"""Parse YAML text → routes. Raises `ValueError` for both
|
||||
decode and shape errors so callers handle them uniformly.
|
||||
`YamlSubsetError` from the parser is a `ValueError` subclass so
|
||||
it already satisfies the same surface; we let it propagate."""
|
||||
"""Parse YAML text → routes."""
|
||||
try:
|
||||
payload = parse_yaml_subset(text)
|
||||
except YamlSubsetError as e:
|
||||
@@ -151,29 +368,102 @@ def load_routes(text: str) -> tuple[Route, ...]:
|
||||
return parse_routes(payload)
|
||||
|
||||
|
||||
def parse_config(payload: object) -> "Config":
|
||||
"""Parse a full egress config payload (top-level log level + routes)."""
|
||||
if not isinstance(payload, dict):
|
||||
raise ValueError("routes payload: top-level must be an object")
|
||||
payload_dict: dict[str, object] = typing.cast(dict[str, object], payload)
|
||||
|
||||
log_raw: object = payload_dict.get("log", LOG_OFF)
|
||||
if log_raw is True or log_raw is False or not isinstance(log_raw, int) \
|
||||
or log_raw not in (LOG_OFF, LOG_BLOCKS, LOG_FULL):
|
||||
raise ValueError(
|
||||
f"routes payload: 'log' must be {LOG_OFF}, {LOG_BLOCKS}, or {LOG_FULL}"
|
||||
)
|
||||
|
||||
routes = parse_routes(payload)
|
||||
return Config(routes=routes, log=log_raw)
|
||||
|
||||
|
||||
def load_config(text: str) -> "Config":
|
||||
"""Parse YAML text → Config (routes + log flag)."""
|
||||
try:
|
||||
payload = parse_yaml_subset(text)
|
||||
except YamlSubsetError as e:
|
||||
raise ValueError(f"routes payload: invalid YAML: {e}") from e
|
||||
return parse_config(payload)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Match evaluation
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _path_matches(pm: PathMatch, request_path: str) -> bool:
|
||||
if pm.type == "exact":
|
||||
return request_path == pm.value
|
||||
if pm.type == "prefix":
|
||||
if request_path == pm.value:
|
||||
return True
|
||||
if not pm.value.endswith("/"):
|
||||
return request_path.startswith(pm.value + "/")
|
||||
return request_path.startswith(pm.value)
|
||||
if pm.type == "regex" and pm.compiled is not None:
|
||||
return pm.compiled.search(request_path) is not None
|
||||
return False
|
||||
|
||||
|
||||
def _entry_matches(
|
||||
entry: MatchEntry,
|
||||
request_path: str,
|
||||
request_method: str,
|
||||
request_headers: typing.Mapping[str, str],
|
||||
) -> bool:
|
||||
"""All predicates within a MatchEntry are ANDed."""
|
||||
if entry.paths:
|
||||
if not any(_path_matches(pm, request_path) for pm in entry.paths):
|
||||
return False
|
||||
if entry.methods:
|
||||
if request_method.upper() not in entry.methods:
|
||||
return False
|
||||
if entry.headers:
|
||||
for hm in entry.headers:
|
||||
header_val = request_headers.get(hm.name.lower())
|
||||
if header_val is None:
|
||||
return False
|
||||
if hm.type == "exact":
|
||||
if header_val != hm.value:
|
||||
return False
|
||||
elif hm.type == "regex" and hm.compiled is not None:
|
||||
if not hm.compiled.search(header_val):
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
def evaluate_matches(
|
||||
route: Route,
|
||||
request_path: str,
|
||||
request_method: str = "GET",
|
||||
request_headers: typing.Mapping[str, str] | None = None,
|
||||
) -> bool:
|
||||
"""Return True if the request matches this route's match entries.
|
||||
Empty matches tuple means all requests match (bare-pass route)."""
|
||||
if not route.matches:
|
||||
return True
|
||||
hdrs: typing.Mapping[str, str] = request_headers or {}
|
||||
return any(
|
||||
_entry_matches(entry, request_path, request_method, hdrs)
|
||||
for entry in route.matches
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Git push detection (unchanged)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def is_git_push_request(path: str, query: str) -> bool:
|
||||
"""Return True if the request is a git smart-HTTP push.
|
||||
|
||||
git push over HTTPS hits two endpoints:
|
||||
GET <repo>/info/refs?service=git-receive-pack (capabilities)
|
||||
POST <repo>/git-receive-pack (the push)
|
||||
|
||||
Fetches use `service=git-upload-pack` / `/git-upload-pack` and
|
||||
are unaffected. Egress-proxy refuses HTTPS push because git-gate's
|
||||
pre-receive gitleaks scan is the gate for outbound git data;
|
||||
routing push through egress would bypass that. Use the
|
||||
bottle.git SSH path if you need to push.
|
||||
|
||||
Universal across routes — the block fires even when no
|
||||
egress route matches the host. A bare-pass route (host with
|
||||
no auth, no path_allowlist) would otherwise let push through to
|
||||
pipelock + upstream untouched.
|
||||
"""
|
||||
if path.endswith("/git-receive-pack"):
|
||||
return True
|
||||
if path.endswith("/info/refs"):
|
||||
# Query string is parsed leniently — `service=git-receive-pack`
|
||||
# may appear with other params in any order.
|
||||
for pair in query.split("&"):
|
||||
k, _, v = pair.partition("=")
|
||||
if k == "service" and v == "git-receive-pack":
|
||||
@@ -181,18 +471,25 @@ def is_git_push_request(path: str, query: str) -> bool:
|
||||
return False
|
||||
|
||||
|
||||
def is_git_fetch_request(path: str, query: str) -> bool:
|
||||
if path.endswith("/git-upload-pack"):
|
||||
return True
|
||||
if path.endswith("/info/refs"):
|
||||
for pair in query.split("&"):
|
||||
k, _, v = pair.partition("=")
|
||||
if k == "service" and v == "git-upload-pack":
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Route lookup + decision
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def match_route(
|
||||
routes: typing.Sequence[Route],
|
||||
request_host: str,
|
||||
) -> Route | None:
|
||||
"""Return the first route whose `host` matches `request_host`
|
||||
exactly (case-insensitive). DNS names are case-insensitive.
|
||||
|
||||
Wildcard hosts (`*.foo.com`) are NOT supported — they caused
|
||||
too many edge cases (apex match? cert validation? pipelock
|
||||
mirror mismatch?) for too little payoff. Operators that need
|
||||
multiple subdomains declare them individually (or one common
|
||||
parent host as a bare-pass route)."""
|
||||
target = request_host.lower()
|
||||
for r in routes:
|
||||
if r.host.lower() == target:
|
||||
@@ -205,24 +502,10 @@ def decide(
|
||||
request_host: str,
|
||||
request_path: str,
|
||||
environ: typing.Mapping[str, str],
|
||||
*,
|
||||
request_method: str = "GET",
|
||||
request_headers: typing.Mapping[str, str] | None = None,
|
||||
) -> Decision:
|
||||
"""Pure decision: given a route table + request host + path + env,
|
||||
return what the addon should do with the request.
|
||||
|
||||
- No matching route → BLOCK. The route table is the bottle's
|
||||
egress allowlist; defense-in-depth complements pipelock's
|
||||
hostname gate on the downstream leg. A bottle that wants a
|
||||
host reachable from the agent must declare a route for it
|
||||
(bare-pass route — no `auth`, no `path_allowlist` — is fine
|
||||
for hosts that just need passthrough).
|
||||
- Matching route with `path_allowlist` set, request path doesn't
|
||||
start with any of the allowed prefixes → block with a clear
|
||||
reason.
|
||||
- Matching route with an auth pair → forward + inject
|
||||
Authorization. Token comes from `environ[route.token_env]`;
|
||||
missing/empty values block (route declared auth but the secret
|
||||
isn't here — operator misconfig).
|
||||
"""
|
||||
route = match_route(routes, request_host)
|
||||
if route is None:
|
||||
return Decision(
|
||||
@@ -234,15 +517,15 @@ def decide(
|
||||
),
|
||||
)
|
||||
|
||||
if route.path_allowlist:
|
||||
if not any(request_path.startswith(p) for p in route.path_allowlist):
|
||||
return Decision(
|
||||
action="block",
|
||||
reason=(
|
||||
f"egress: path {request_path!r} not in "
|
||||
f"path_allowlist for {route.host!r}"
|
||||
),
|
||||
)
|
||||
if not evaluate_matches(route, request_path, request_method, request_headers):
|
||||
return Decision(
|
||||
action="block",
|
||||
reason=(
|
||||
f"egress: request {request_method} {request_path!r} "
|
||||
f"does not match any entry in matches for "
|
||||
f"{route.host!r}"
|
||||
),
|
||||
)
|
||||
|
||||
if route.auth_scheme and route.token_env:
|
||||
token = environ.get(route.token_env, "")
|
||||
@@ -262,12 +545,181 @@ def decide(
|
||||
return Decision(action="forward")
|
||||
|
||||
|
||||
def decide_git_fetch(
|
||||
routes: typing.Sequence[Route],
|
||||
request_host: str,
|
||||
) -> Decision:
|
||||
route = match_route(routes, request_host)
|
||||
if route is not None and route.git_fetch:
|
||||
return Decision(action="forward")
|
||||
return Decision(
|
||||
action="block",
|
||||
reason=(
|
||||
"egress: git fetch/clone over HTTPS is not allowed by default; "
|
||||
"use git-gate for declared repos or set "
|
||||
"egress.routes[].git.fetch=true for explicit read-only "
|
||||
"HTTPS Git access."
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# DLP scan dispatch (PRD 0053)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def build_outbound_scan_text(
|
||||
host: str,
|
||||
path: str,
|
||||
query: str,
|
||||
headers: typing.Mapping[str, str],
|
||||
body: str,
|
||||
) -> str:
|
||||
"""Assemble all outbound request surfaces into one string for DLP scanning.
|
||||
|
||||
Covers hostname (DNS tunnelling), path, query params, all headers, body.
|
||||
"""
|
||||
parts: list[str] = [host, path]
|
||||
if query:
|
||||
parts.append(query)
|
||||
for name, value in headers.items():
|
||||
parts.append(f"{name}: {value}")
|
||||
if body:
|
||||
parts.append(body)
|
||||
return "\n".join(parts)
|
||||
|
||||
|
||||
def outbound_scan_headers(
|
||||
route: Route,
|
||||
headers: typing.Mapping[str, str],
|
||||
) -> dict[str, str]:
|
||||
"""Return request headers that should be included in outbound DLP.
|
||||
|
||||
Routes that inject sidecar-owned auth always strip the agent's
|
||||
Authorization header before forwarding. Scanning that header first
|
||||
creates false positives for provider clients that insist on sending
|
||||
their own bearer-shaped placeholder, while still not changing what
|
||||
reaches the upstream.
|
||||
"""
|
||||
out: dict[str, str] = {}
|
||||
skip_auth = bool(route.auth_scheme and route.token_env)
|
||||
for name, value in headers.items():
|
||||
if skip_auth and name.lower() == "authorization":
|
||||
continue
|
||||
out[name] = value
|
||||
return out
|
||||
|
||||
|
||||
def build_inbound_scan_text(
|
||||
headers: typing.Mapping[str, str],
|
||||
body: str,
|
||||
) -> str:
|
||||
"""Assemble inbound response surfaces into one string for DLP scanning.
|
||||
|
||||
Covers all response headers plus body.
|
||||
"""
|
||||
parts: list[str] = []
|
||||
for name, value in headers.items():
|
||||
parts.append(f"{name}: {value}")
|
||||
if body:
|
||||
parts.append(body)
|
||||
return "\n".join(parts)
|
||||
|
||||
|
||||
def _detector_enabled(
|
||||
configured: tuple[str, ...] | None,
|
||||
name: str,
|
||||
) -> bool:
|
||||
"""Check if a named detector is enabled for a route direction.
|
||||
None means all enabled; empty tuple means all disabled."""
|
||||
if configured is None:
|
||||
return True
|
||||
return name in configured
|
||||
|
||||
|
||||
def scan_outbound(
|
||||
route: Route,
|
||||
body: str | bytes,
|
||||
environ: typing.Mapping[str, str],
|
||||
) -> ScanResult | None:
|
||||
# Lazy import to avoid circular deps and keep dlp_detectors optional
|
||||
# at import time (the sidecar copies it flat alongside this file).
|
||||
try:
|
||||
from dlp_detectors import ( # type: ignore[import-not-found]
|
||||
scan_crlf_injection,
|
||||
scan_known_secrets,
|
||||
scan_token_patterns,
|
||||
)
|
||||
except ImportError: # pragma: no cover - host-side path
|
||||
from .dlp_detectors import ( # type: ignore[import-not-found]
|
||||
scan_crlf_injection,
|
||||
scan_known_secrets,
|
||||
scan_token_patterns,
|
||||
)
|
||||
|
||||
text = body if isinstance(body, str) else body.decode("utf-8", errors="replace")
|
||||
|
||||
# CRLF injection is never legitimate — runs unconditionally, not gated
|
||||
# by outbound_detectors config.
|
||||
result = scan_crlf_injection(text)
|
||||
if result is not None:
|
||||
return result
|
||||
|
||||
if _detector_enabled(route.outbound_detectors, "token_patterns"):
|
||||
result = scan_token_patterns(text, location="body")
|
||||
if result is not None:
|
||||
return result
|
||||
|
||||
if _detector_enabled(route.outbound_detectors, "known_secrets"):
|
||||
result = scan_known_secrets(text, location="body", env=environ)
|
||||
if result is not None:
|
||||
return result
|
||||
|
||||
return None
|
||||
|
||||
|
||||
def scan_inbound(
|
||||
route: Route,
|
||||
body: str | bytes,
|
||||
) -> ScanResult | None:
|
||||
try:
|
||||
from dlp_detectors import scan_naive_injection # type: ignore[import-not-found]
|
||||
except ImportError: # pragma: no cover - host-side path
|
||||
from .dlp_detectors import scan_naive_injection # type: ignore[import-not-found]
|
||||
|
||||
text = body if isinstance(body, str) else body.decode("utf-8", errors="replace")
|
||||
|
||||
if _detector_enabled(route.inbound_detectors, "naive_injection_detection"):
|
||||
result = scan_naive_injection(text)
|
||||
if result is not None:
|
||||
return result
|
||||
|
||||
return None
|
||||
|
||||
|
||||
__all__ = [
|
||||
"LOG_BLOCKS",
|
||||
"LOG_FULL",
|
||||
"LOG_OFF",
|
||||
"Config",
|
||||
"Decision",
|
||||
"HeaderMatch",
|
||||
"MatchEntry",
|
||||
"PathMatch",
|
||||
"Route",
|
||||
"ScanResult",
|
||||
"build_inbound_scan_text",
|
||||
"build_outbound_scan_text",
|
||||
"decide",
|
||||
"decide_git_fetch",
|
||||
"evaluate_matches",
|
||||
"is_git_push_request",
|
||||
"is_git_fetch_request",
|
||||
"load_config",
|
||||
"load_routes",
|
||||
"match_route",
|
||||
"outbound_scan_headers",
|
||||
"parse_config",
|
||||
"parse_routes",
|
||||
"scan_inbound",
|
||||
"scan_outbound",
|
||||
]
|
||||
|
||||
@@ -6,15 +6,15 @@
|
||||
# call it as a normal child. Behavior is unchanged:
|
||||
#
|
||||
# * Upstream proxy: when EGRESS_UPSTREAM_PROXY is set, switch
|
||||
# to `--mode upstream:URL` to forward all post-MITM traffic
|
||||
# through pipelock. mitmproxy does NOT honor HTTPS_PROXY on
|
||||
# its outbound side, so the upstream wiring has to be the
|
||||
# mitmproxy mode flag, not env.
|
||||
# to `--mode upstream:URL` to chain through an upstream proxy.
|
||||
# mitmproxy does NOT honor HTTPS_PROXY on its outbound side,
|
||||
# so the upstream wiring has to be the mitmproxy mode flag,
|
||||
# not env.
|
||||
# * Upstream trust: when EGRESS_UPSTREAM_CA is set, build a
|
||||
# combined trust bundle (system roots + pipelock CA) and point
|
||||
# combined trust bundle (system roots + upstream CA) and point
|
||||
# mitmproxy at it. The option REPLACES mitmproxy's default
|
||||
# trust store, so passing pipelock's CA alone would break
|
||||
# route-configured pipelock passthrough hosts.
|
||||
# trust store, so passing the upstream CA alone would break
|
||||
# non-chained hosts.
|
||||
# * `-s /app/egress_addon.py` loads the addon that reads
|
||||
# /etc/egress/routes.yaml.
|
||||
|
||||
@@ -38,11 +38,7 @@ fi
|
||||
|
||||
# Bind address. Docker backend wants `0.0.0.0` (agent dials egress
|
||||
# directly via the docker network alias). Smolmachines backend
|
||||
# wants `127.0.0.1` because the agent dials pipelock — not egress
|
||||
# — and egress is pipelock's localhost-only upstream inside the
|
||||
# bundle. TSI's IP-only allowlist would otherwise let the agent
|
||||
# reach `<bundle-ip>:9099` and bypass pipelock's DLP; binding
|
||||
# 127.0.0.1 inside the bundle closes that gap (PRD 0023 chunk 3).
|
||||
# uses EGRESS_LISTEN_HOST when a non-default binding is needed.
|
||||
LISTEN_HOST_FLAG=""
|
||||
if [ -n "$EGRESS_LISTEN_HOST" ]; then
|
||||
LISTEN_HOST_FLAG="--listen-host $EGRESS_LISTEN_HOST"
|
||||
@@ -56,13 +52,10 @@ if [ -n "$EGRESS_UPSTREAM_CA" ] && [ -f "$EGRESS_UPSTREAM_CA" ]; then
|
||||
fi
|
||||
|
||||
# Scope the proxy env to this process tree only. In the bundle
|
||||
# image (PRD 0024) the four daemons share one container — setting
|
||||
# image (PRD 0024) multiple daemons share one container — setting
|
||||
# HTTPS_PROXY at the container level would route git-gate's git
|
||||
# pushes through pipelock, which is wrong (pipelock doesn't proxy
|
||||
# SSH and would block public git repos). Setting them here means
|
||||
# only mitmdump's subprocess inherits them. In the legacy
|
||||
# four-sidecar setup these env vars are also set in compose; here
|
||||
# they're additionally defensive.
|
||||
# pushes through an upstream proxy unintentionally. Setting them
|
||||
# here means only mitmdump's subprocess inherits them.
|
||||
if [ -n "$EGRESS_UPSTREAM_PROXY" ]; then
|
||||
export HTTPS_PROXY="$EGRESS_UPSTREAM_PROXY"
|
||||
export HTTP_PROXY="$EGRESS_UPSTREAM_PROXY"
|
||||
|
||||
+2
-2
@@ -114,7 +114,7 @@ def _read_secret_silent(name: str, prompt_body: str) -> str:
|
||||
return value
|
||||
|
||||
|
||||
def resolve_env(manifest: Manifest, agent: str) -> ResolvedEnv:
|
||||
def resolve_env(manifest: Manifest) -> ResolvedEnv:
|
||||
"""Iterate the agent's env entries:
|
||||
- secret: prompt at runtime; carry value in forwarded
|
||||
- interpolated: read $HOST_VAR from os.environ; carry value in forwarded
|
||||
@@ -124,7 +124,7 @@ def resolve_env(manifest: Manifest, agent: str) -> ResolvedEnv:
|
||||
backend injects forwarded values via its launcher's env parameter."""
|
||||
forwarded: dict[str, str] = {}
|
||||
literals: dict[str, str] = {}
|
||||
bottle = manifest.bottle_for(agent)
|
||||
bottle = manifest.bottle
|
||||
for name, raw in bottle.env.items():
|
||||
if not name:
|
||||
continue
|
||||
|
||||
+50
-26
@@ -15,9 +15,9 @@ a bare repo on the gate; `git daemon` serves the bare repos over
|
||||
|
||||
The agent never sees the upstream credential under either path.
|
||||
|
||||
Why a third sidecar (not folded into pipelock or ssh-gate): the
|
||||
Why a separate sidecar (not folded into egress or ssh-gate): the
|
||||
gate is the only one of the three that holds upstream push
|
||||
credentials. Mixing it with pipelock would put push creds in the
|
||||
credentials. Mixing it with egress would put push creds in the
|
||||
same blast radius as internet-facing TLS interception; mixing it
|
||||
with ssh-gate would force ssh-gate above L4 and into git-protocol
|
||||
land. See `docs/prds/0008-git-gate.md`.
|
||||
@@ -37,7 +37,7 @@ from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
|
||||
from .log import info
|
||||
from .manifest import Bottle, GitEntry
|
||||
from .manifest import ManifestBottle, ManifestGitEntry
|
||||
|
||||
|
||||
# Short network alias for git-gate inside the sidecar bundle. The
|
||||
@@ -96,9 +96,9 @@ class GitGatePlan:
|
||||
egress_network: str = ""
|
||||
|
||||
|
||||
def git_gate_upstreams_for_bottle(bottle: Bottle) -> tuple[GitGateUpstream, ...]:
|
||||
def git_gate_upstreams_for_bottle(bottle: ManifestBottle) -> tuple[GitGateUpstream, ...]:
|
||||
"""Lift each `bottle.git` entry into a GitGateUpstream. Unique-Name
|
||||
validation already ran in `manifest.Bottle.from_dict`."""
|
||||
validation already ran in `manifest.ManifestBottle.from_dict`."""
|
||||
return tuple(
|
||||
GitGateUpstream(
|
||||
name=e.Name,
|
||||
@@ -113,7 +113,7 @@ def git_gate_upstreams_for_bottle(bottle: Bottle) -> tuple[GitGateUpstream, ...]
|
||||
|
||||
|
||||
def git_gate_render_gitconfig(
|
||||
entries: tuple[GitEntry, ...], gate_host: str, *, scheme: str = "git",
|
||||
entries: tuple[ManifestGitEntry, ...], gate_host: str, *, scheme: str = "git",
|
||||
) -> str:
|
||||
"""Render the agent's ~/.gitconfig content for git-gate
|
||||
`insteadOf` rewrites. Pure host-side, no docker / smolvm;
|
||||
@@ -204,6 +204,7 @@ def git_gate_render_entrypoint(upstreams: tuple[GitGateUpstream, ...]) -> str:
|
||||
" git -C \"$repo\" config git-gate.identityFile \"$keyfile\"",
|
||||
" git -C \"$repo\" config git-gate.knownHosts \"$hostsfile\"",
|
||||
" git -C \"$repo\" config receive.denyCurrentBranch ignore",
|
||||
" git -C \"$repo\" config receive.advertisePushOptions true",
|
||||
" git -C \"$repo\" config http.receivepack true",
|
||||
" install -m 755 /etc/git-gate/pre-receive \"$repo/hooks/pre-receive\"",
|
||||
"}",
|
||||
@@ -280,15 +281,32 @@ if [ ! -f "$hostsfile" ]; then
|
||||
fi
|
||||
ssh_cmd="ssh -i $keyfile -o UserKnownHostsFile=$hostsfile -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o BatchMode=yes -o ConnectTimeout=10"
|
||||
|
||||
push_option_count=${GIT_PUSH_OPTION_COUNT:-0}
|
||||
case "$push_option_count" in
|
||||
''|*[!0-9]*)
|
||||
echo "git-gate: invalid GIT_PUSH_OPTION_COUNT=$push_option_count" >&2
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
set --
|
||||
i=0
|
||||
while [ "$i" -lt "$push_option_count" ]; do
|
||||
opt=$(printenv "GIT_PUSH_OPTION_$i" || :)
|
||||
set -- "$@" --push-option="$opt"
|
||||
i=$((i + 1))
|
||||
done
|
||||
|
||||
while IFS=' ' read -r old new ref; do
|
||||
[ -z "$ref" ] && continue
|
||||
if [ "$new" = "$zero" ]; then
|
||||
refspec=":$ref"
|
||||
elif [ "$old" != "$zero" ] && ! git merge-base --is-ancestor "$old" "$new" 2>/dev/null; then
|
||||
refspec="+$new:$ref"
|
||||
else
|
||||
refspec="$new:$ref"
|
||||
fi
|
||||
echo "git-gate: forwarding $ref to origin" >&2
|
||||
if ! GIT_SSH_COMMAND="$ssh_cmd" git push origin "$refspec" 1>&2; then
|
||||
if ! GIT_SSH_COMMAND="$ssh_cmd" git push "$@" origin "$refspec" 1>&2; then
|
||||
echo "git-gate: upstream push failed for $ref" >&2
|
||||
exit 1
|
||||
fi
|
||||
@@ -361,7 +379,7 @@ exit 0
|
||||
|
||||
|
||||
def _provision_dynamic_key(
|
||||
entry: GitEntry,
|
||||
entry: ManifestGitEntry,
|
||||
slug: str,
|
||||
stage_dir: Path,
|
||||
) -> str:
|
||||
@@ -371,13 +389,12 @@ def _provision_dynamic_key(
|
||||
Returns the host-side path to the private key file so the caller
|
||||
can inject it into the GitGateUpstream as `identity_file`."""
|
||||
from .deploy_key_provisioner import get_provisioner
|
||||
pk = entry.ProvisionedKey
|
||||
assert pk is not None
|
||||
token = os.environ.get(pk.token_env)
|
||||
pk = entry.Key
|
||||
token = os.environ.get(pk.forge_token_env)
|
||||
if token is None:
|
||||
raise RuntimeError(
|
||||
f"git-gate.repos[{entry.Name!r}] provisioned_key.token_env"
|
||||
f" = {pk.token_env!r}: env var is not set"
|
||||
f"git-gate.repos[{entry.Name!r}] key.forge_token_env"
|
||||
f" = {pk.forge_token_env!r}: env var is not set"
|
||||
)
|
||||
api_url = pk.api_url or f"https://{entry.UpstreamHost}"
|
||||
provisioner = get_provisioner(pk.provider, token, api_url)
|
||||
@@ -402,7 +419,7 @@ def _provision_dynamic_key(
|
||||
return str(key_file)
|
||||
|
||||
|
||||
def revoke_git_gate_provisioned_keys(bottle: Bottle, stage_dir: Path) -> None:
|
||||
def revoke_git_gate_provisioned_keys(bottle: ManifestBottle, stage_dir: Path) -> None:
|
||||
"""Revoke all deploy keys provisioned for `bottle` during prepare.
|
||||
|
||||
Called at teardown after containers stop. Raises if any revocation
|
||||
@@ -410,18 +427,18 @@ def revoke_git_gate_provisioned_keys(bottle: Bottle, stage_dir: Path) -> None:
|
||||
address manually."""
|
||||
from .deploy_key_provisioner import get_provisioner
|
||||
for entry in bottle.git:
|
||||
if entry.ProvisionedKey is None:
|
||||
if entry.Key.provider != "gitea":
|
||||
continue
|
||||
pk = entry.ProvisionedKey
|
||||
pk = entry.Key
|
||||
id_file = stage_dir / f"{entry.Name}-deploy-key-id"
|
||||
if not id_file.exists():
|
||||
continue
|
||||
key_id = id_file.read_text().strip()
|
||||
token = os.environ.get(pk.token_env)
|
||||
token = os.environ.get(pk.forge_token_env)
|
||||
if token is None:
|
||||
raise RuntimeError(
|
||||
f"git-gate.repos[{entry.Name!r}] provisioned_key.token_env"
|
||||
f" = {pk.token_env!r}: env var is not set;"
|
||||
f"git-gate.repos[{entry.Name!r}] key.forge_token_env"
|
||||
f" = {pk.forge_token_env!r}: env var is not set;"
|
||||
f" cannot revoke deploy key {key_id}"
|
||||
)
|
||||
api_url = pk.api_url or f"https://{entry.UpstreamHost}"
|
||||
@@ -434,18 +451,26 @@ def revoke_git_gate_provisioned_keys(bottle: Bottle, stage_dir: Path) -> None:
|
||||
info(f"revoked deploy key {key_id} for git-gate.repos[{entry.Name!r}]")
|
||||
|
||||
|
||||
def _resolve_identity_file(entry: ManifestGitEntry, slug: str, stage_dir: Path) -> str:
|
||||
"""Return the host-side SSH identity file path for this entry.
|
||||
For gitea entries, provisions a fresh deploy key first."""
|
||||
if entry.Key.provider == "gitea":
|
||||
return _provision_dynamic_key(entry, slug, stage_dir)
|
||||
return entry.IdentityFile
|
||||
|
||||
|
||||
class GitGate(ABC):
|
||||
"""The per-agent git-gate. Encapsulates the host-side prepare
|
||||
(upstream lift + entrypoint/hook render); the sidecar's
|
||||
start/stop lifecycle is backend-specific and lives on concrete
|
||||
subclasses."""
|
||||
|
||||
def prepare(self, bottle: Bottle, slug: str, stage_dir: Path) -> GitGatePlan:
|
||||
def prepare(self, bottle: ManifestBottle, slug: str, stage_dir: Path) -> GitGatePlan:
|
||||
"""Compute the upstream table from `bottle.git` and write the
|
||||
entrypoint, pre-receive hook, and access-hook scripts (mode
|
||||
600) under `stage_dir`. Pure host-side, no docker subprocess.
|
||||
|
||||
For `provisioned_key` entries, also generates and registers
|
||||
For `gitea` key entries, also generates and registers
|
||||
a fresh deploy key via the forge API and writes the private key
|
||||
+ key ID to `stage_dir`.
|
||||
|
||||
@@ -454,11 +479,10 @@ class GitGate(ABC):
|
||||
before passing the plan to `.start`."""
|
||||
upstreams_list = list(git_gate_upstreams_for_bottle(bottle))
|
||||
for i, entry in enumerate(bottle.git):
|
||||
if entry.ProvisionedKey is not None:
|
||||
key_file = _provision_dynamic_key(entry, slug, stage_dir)
|
||||
upstreams_list[i] = dataclasses.replace(
|
||||
upstreams_list[i], identity_file=key_file
|
||||
)
|
||||
upstreams_list[i] = dataclasses.replace(
|
||||
upstreams_list[i],
|
||||
identity_file=_resolve_identity_file(entry, slug, stage_dir),
|
||||
)
|
||||
upstreams = tuple(upstreams_list)
|
||||
entrypoint = stage_dir / "git_gate_entrypoint.sh"
|
||||
entrypoint.write_text(git_gate_render_entrypoint(upstreams))
|
||||
|
||||
@@ -19,8 +19,8 @@ from urllib.parse import urlsplit
|
||||
|
||||
DEFAULT_PORT = 9420
|
||||
|
||||
# Body-size cap matching supervise_server.py's 1 MiB limit.
|
||||
MAX_BODY_BYTES = 1 * 1024 * 1024
|
||||
# Bound memory use while still allowing ordinary git push packfiles.
|
||||
MAX_BODY_BYTES = 100 * 1024 * 1024
|
||||
|
||||
|
||||
class GitHttpHandler(BaseHTTPRequestHandler):
|
||||
|
||||
+226
-134
@@ -18,8 +18,7 @@ Bottle schema (frontmatter):
|
||||
user: { name: <str>, email: <str> } # optional
|
||||
repos: { <name>: <git-gate-entry>, ... } # optional
|
||||
egress: { routes: [ <egress-route>, ... ] }
|
||||
# route keys: host, path_allowlist, auth, role, pipelock
|
||||
# pipelock: { tls_passthrough: <bool>, ssrf_ip_allowlist: [<cidr>, ...] }
|
||||
# route keys: host, matches, auth, role, dlp
|
||||
supervise: <bool> # optional
|
||||
|
||||
Agent schema (frontmatter):
|
||||
@@ -37,10 +36,23 @@ Bottles can ONLY live under $HOME. A bottles/ dir under $CWD is a
|
||||
warn at load time and contributes nothing. The trust boundary is
|
||||
expressed as filesystem layout rather than resolver logic.
|
||||
|
||||
Validation runs once at load. Manifest.from_json_obj is preserved
|
||||
as a programmatic entry point (used by tests) that takes a dict
|
||||
with the same field names — useful for building manifests without
|
||||
on-disk files.
|
||||
Two types are exported:
|
||||
|
||||
ManifestIndex — the multi-agent/bottle collection returned by
|
||||
resolve() and from_json_obj(). Used for agent
|
||||
selection (all_agent_names), validation
|
||||
(require_agent), and lazy loading (load_for_agent).
|
||||
This is the pre-preflight form.
|
||||
|
||||
Manifest — a single-agent/bottle value type holding exactly
|
||||
one agent: ManifestAgent and one bottle:
|
||||
ManifestBottle (with the agent's git-gate.user
|
||||
already overlaid). Returned by load_for_agent().
|
||||
This is the post-preflight form passed to backends.
|
||||
|
||||
ManifestIndex.from_json_obj is preserved as a programmatic entry
|
||||
point (used by tests) that takes a dict with the same field names —
|
||||
useful for building manifests without on-disk files.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
@@ -51,28 +63,28 @@ from pathlib import Path
|
||||
from typing import Mapping
|
||||
|
||||
from .manifest_util import ManifestError, as_json_object
|
||||
from .manifest_agent import Agent, AgentProvider
|
||||
from .manifest_agent import ManifestAgent, ManifestAgentProvider
|
||||
from .manifest_egress import (
|
||||
EGRESS_AUTH_SCHEMES,
|
||||
EgressConfig,
|
||||
EgressRoute,
|
||||
PipelockRoutePolicy,
|
||||
ManifestEgressConfig,
|
||||
ManifestEgressRoute,
|
||||
)
|
||||
from .manifest_git import GitEntry, GitUser, parse_git_gate_config
|
||||
from .manifest_git import ManifestGitEntry, ManifestGitUser, ManifestKeyConfig, parse_git_gate_config
|
||||
from .manifest_schema import BOTTLE_KEYS
|
||||
|
||||
# Re-export everything that callers currently import from this module.
|
||||
__all__ = [
|
||||
"ManifestError",
|
||||
"GitEntry",
|
||||
"GitUser",
|
||||
"AgentProvider",
|
||||
"ManifestGitEntry",
|
||||
"ManifestGitUser",
|
||||
"ManifestKeyConfig",
|
||||
"ManifestAgentProvider",
|
||||
"EGRESS_AUTH_SCHEMES",
|
||||
"PipelockRoutePolicy",
|
||||
"EgressRoute",
|
||||
"EgressConfig",
|
||||
"Agent",
|
||||
"Bottle",
|
||||
"ManifestEgressRoute",
|
||||
"ManifestEgressConfig",
|
||||
"ManifestAgent",
|
||||
"ManifestBottle",
|
||||
"ManifestIndex",
|
||||
"Manifest",
|
||||
]
|
||||
|
||||
@@ -89,27 +101,26 @@ def _section_dict(value: object, label: str) -> dict[str, object]:
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Bottle:
|
||||
class ManifestBottle:
|
||||
env: Mapping[str, str] = field(default_factory=_empty_str_dict)
|
||||
agent_provider: AgentProvider = field(default_factory=AgentProvider)
|
||||
git: tuple[GitEntry, ...] = ()
|
||||
agent_provider: ManifestAgentProvider = field(default_factory=ManifestAgentProvider)
|
||||
git: tuple[ManifestGitEntry, ...] = ()
|
||||
# Per-bottle git identity (issue #86). Empty default — bottles
|
||||
# that don't set `git-gate.user:` in the manifest skip the
|
||||
# `git config --global` step entirely. A bottle can declare a user
|
||||
# identity without any git-gate.repos upstreams, and vice versa.
|
||||
git_user: GitUser = field(default_factory=GitUser)
|
||||
egress: EgressConfig = field(default_factory=EgressConfig)
|
||||
git_user: ManifestGitUser = field(default_factory=ManifestGitUser)
|
||||
egress: ManifestEgressConfig = field(default_factory=ManifestEgressConfig)
|
||||
# Opt-in per-bottle stuck-recovery sidecar (PRD 0013). When true,
|
||||
# the launch step brings up a supervise sidecar that exposes three
|
||||
# MCP tools to the agent (cred-proxy-block, pipelock-block,
|
||||
# capability-block; the cred-proxy-block tool is renamed and
|
||||
# retargeted at egress in PRD 0017 chunk 3) plus mounts the
|
||||
# current-config dir read-only into the agent at /etc/bot-bottle/
|
||||
# current-config. False (the default) skips the sidecar and mount.
|
||||
# the launch step brings up a supervise sidecar that exposes MCP
|
||||
# tools to the agent (egress-block, capability-block) plus mounts
|
||||
# the current-config dir read-only into the agent at
|
||||
# /etc/bot-bottle/current-config. False (the default) skips the
|
||||
# sidecar and mount.
|
||||
supervise: bool = False
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, name: str, raw: object) -> "Bottle":
|
||||
def from_dict(cls, name: str, raw: object) -> "ManifestBottle":
|
||||
d = as_json_object(raw, f"bottle '{name}'")
|
||||
|
||||
if "runtime" in d:
|
||||
@@ -161,22 +172,22 @@ class Bottle:
|
||||
)
|
||||
env[var] = value
|
||||
|
||||
git: tuple[GitEntry, ...] = ()
|
||||
git_user = GitUser()
|
||||
git: tuple[ManifestGitEntry, ...] = ()
|
||||
git_user = ManifestGitUser()
|
||||
git_raw = d.get("git-gate")
|
||||
if git_raw is not None:
|
||||
git, git_user = parse_git_gate_config(name, git_raw)
|
||||
|
||||
agent_provider = (
|
||||
AgentProvider.from_dict(name, d["agent_provider"])
|
||||
ManifestAgentProvider.from_dict(name, d["agent_provider"])
|
||||
if "agent_provider" in d
|
||||
else AgentProvider()
|
||||
else ManifestAgentProvider()
|
||||
)
|
||||
|
||||
egress = (
|
||||
EgressConfig.from_dict(name, d["egress"])
|
||||
ManifestEgressConfig.from_dict(name, d["egress"])
|
||||
if "egress" in d
|
||||
else EgressConfig()
|
||||
else ManifestEgressConfig()
|
||||
)
|
||||
|
||||
supervise_raw = d.get("supervise", False)
|
||||
@@ -192,14 +203,64 @@ class Bottle:
|
||||
)
|
||||
|
||||
|
||||
def _merge_git_user(
|
||||
agent_user: ManifestGitUser, base_user: ManifestGitUser
|
||||
) -> ManifestGitUser:
|
||||
"""Merge the agent's git.user over the bottle's, agent-wins-on-non-empty."""
|
||||
if agent_user.is_empty():
|
||||
return base_user
|
||||
return ManifestGitUser(
|
||||
name=agent_user.name or base_user.name,
|
||||
email=agent_user.email or base_user.email,
|
||||
)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Manifest:
|
||||
bottles: Mapping[str, Bottle]
|
||||
agents: Mapping[str, Agent]
|
||||
"""Single-agent/bottle value type. Returned by ManifestIndex.load_for_agent().
|
||||
|
||||
`bottle` is the effective bottle with the agent's git-gate.user already
|
||||
overlaid per-field (agent wins on non-empty). Backends and provisioners
|
||||
use this directly — no agent_name lookup needed."""
|
||||
|
||||
agent: ManifestAgent
|
||||
bottle: ManifestBottle
|
||||
|
||||
def git_identity_summary(self) -> str | None:
|
||||
"""One-line effective git identity with per-field provenance, e.g.
|
||||
`name=claude (agent), email=eric@dideric.is (bottle)`.
|
||||
Returns None when neither agent nor bottle sets an identity."""
|
||||
over = self.agent.git_user # agent's declared git_user (pre-merge)
|
||||
merged = self.bottle.git_user # effective git_user (post-merge)
|
||||
if merged.is_empty():
|
||||
return None
|
||||
parts: list[str] = []
|
||||
if merged.name:
|
||||
parts.append(f"name={merged.name} ({'agent' if over.name else 'bottle'})")
|
||||
if merged.email:
|
||||
parts.append(f"email={merged.email} ({'agent' if over.email else 'bottle'})")
|
||||
return ", ".join(parts)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class ManifestIndex:
|
||||
"""Multi-agent/bottle collection. The pre-preflight form.
|
||||
|
||||
In lazy mode (from resolve()/from_md_dirs()) only filenames are scanned;
|
||||
no file content is read. In eager mode (from from_json_obj()) all agents
|
||||
and bottles are pre-parsed. Call load_for_agent() to get a single-value
|
||||
Manifest ready for backend use."""
|
||||
|
||||
bottles: Mapping[str, ManifestBottle]
|
||||
agents: Mapping[str, ManifestAgent]
|
||||
# Set by from_md_dirs; None in from_json_obj (test/programmatic) mode.
|
||||
# Stores the manifest root dirs so load_for_agent can locate files later.
|
||||
home_md: Path | None = field(default=None)
|
||||
cwd_md: Path | None = field(default=None)
|
||||
|
||||
@classmethod
|
||||
def resolve(cls, cwd: str, *, missing_ok: bool = False) -> "Manifest":
|
||||
"""Walk the per-file manifest tree and build a Manifest.
|
||||
def resolve(cls, cwd: str, *, missing_ok: bool = False) -> "ManifestIndex":
|
||||
"""Walk the per-file manifest tree and build a ManifestIndex.
|
||||
|
||||
Layout (PRD 0011):
|
||||
$HOME/.bot-bottle/bottles/<name>.md — bottles (home-only)
|
||||
@@ -212,7 +273,7 @@ class Manifest:
|
||||
boundary.
|
||||
|
||||
If `missing_ok` is true, a missing `$HOME/.bot-bottle/`
|
||||
returns an empty manifest instead of dying. This is for
|
||||
returns an empty index instead of dying. This is for
|
||||
passive UI surfaces like the dashboard, which can still
|
||||
monitor already-running agents without launch config.
|
||||
|
||||
@@ -251,25 +312,16 @@ class Manifest:
|
||||
cls,
|
||||
home_dir: Path,
|
||||
cwd_dir: Path | None,
|
||||
) -> "Manifest":
|
||||
"""Programmatic entry point. Loads bottles from
|
||||
`<home_dir>/bottles/`, home agents from `<home_dir>/agents/`,
|
||||
and (if `cwd_dir` is passed) cwd agents from
|
||||
`<cwd_dir>/agents/`. Cwd agents override home agents on
|
||||
name collision. A `bottles/` subdir under `cwd_dir` is
|
||||
logged as a warning and ignored.
|
||||
) -> "ManifestIndex":
|
||||
"""Return a names-only ManifestIndex. No file content is read; only
|
||||
filenames are scanned for the agent selector. Full parsing happens
|
||||
later, per-agent, via `load_for_agent`.
|
||||
|
||||
Used by tests to build a Manifest from fixture directories
|
||||
A `bottles/` subdir under `cwd_dir` is logged as a warning and
|
||||
ignored — the filesystem layout IS the trust boundary.
|
||||
|
||||
Used by tests to build a ManifestIndex from fixture directories
|
||||
without touching `os.environ`."""
|
||||
bottles_dir = home_dir / "bottles"
|
||||
from .manifest_loader import load_agents_from_dir, load_bottles_from_dir
|
||||
|
||||
bottles = load_bottles_from_dir(bottles_dir)
|
||||
|
||||
bottle_names = set(bottles.keys())
|
||||
agents_dir = home_dir / "agents"
|
||||
agents = load_agents_from_dir(agents_dir, bottle_names, source="$HOME")
|
||||
|
||||
if cwd_dir is not None:
|
||||
stale_bottles = cwd_dir / "bottles"
|
||||
if stale_bottles.is_dir():
|
||||
@@ -283,17 +335,11 @@ class Manifest:
|
||||
f"live under $HOME/.bot-bottle/bottles/ "
|
||||
f"(PRD 0011). Move them or delete."
|
||||
)
|
||||
cwd_agents_dir = cwd_dir / "agents"
|
||||
cwd_agents = load_agents_from_dir(
|
||||
cwd_agents_dir, bottle_names, source="$CWD"
|
||||
)
|
||||
agents = {**agents, **cwd_agents}
|
||||
|
||||
return cls(bottles=bottles, agents=agents)
|
||||
return cls(bottles={}, agents={}, home_md=home_dir, cwd_md=cwd_dir)
|
||||
|
||||
@classmethod
|
||||
def from_json_obj(cls, obj: object) -> "Manifest":
|
||||
"""Validate and build a Manifest from a raw JSON-like dict."""
|
||||
def from_json_obj(cls, obj: object) -> "ManifestIndex":
|
||||
"""Validate and build a ManifestIndex from a raw JSON-like dict."""
|
||||
d = as_json_object(obj, "manifest")
|
||||
raw_bottles_obj = _section_dict(d.get("bottles"), "manifest 'bottles'")
|
||||
raw_agents = _section_dict(d.get("agents"), "manifest 'agents'")
|
||||
@@ -309,80 +355,126 @@ class Manifest:
|
||||
bottles = resolve_bottles(raw_bottles)
|
||||
|
||||
bottle_names = set(bottles.keys())
|
||||
agents: dict[str, Agent] = {
|
||||
n: Agent.from_dict(n, a, bottle_names) for n, a in raw_agents.items()
|
||||
agents: dict[str, ManifestAgent] = {
|
||||
n: ManifestAgent.from_dict(n, a, bottle_names) for n, a in raw_agents.items()
|
||||
}
|
||||
return cls(bottles=bottles, agents=agents)
|
||||
|
||||
@property
|
||||
def all_agent_names(self) -> list[str]:
|
||||
"""Sorted list of all discoverable agent names.
|
||||
|
||||
In names-only mode (from resolve/from_md_dirs) this scans agent
|
||||
filenames without reading their content. In eager mode (from
|
||||
from_json_obj) it returns the pre-parsed agents' names."""
|
||||
if self.home_md is not None:
|
||||
from .manifest_loader import scan_agent_names
|
||||
home_names = set(scan_agent_names(self.home_md / "agents").keys())
|
||||
cwd_names: set[str] = set()
|
||||
if self.cwd_md is not None:
|
||||
cwd_names = set(scan_agent_names(self.cwd_md / "agents").keys())
|
||||
return sorted(home_names | cwd_names)
|
||||
return sorted(self.agents.keys())
|
||||
|
||||
def load_for_agent(self, agent_name: str) -> "Manifest":
|
||||
"""Parse the named agent and its bottle; return a single-value Manifest.
|
||||
|
||||
In lazy mode (from resolve/from_md_dirs) the agent file and its
|
||||
bottle chain are read from disk for the first time here. In eager
|
||||
mode (from_json_obj) the data is already parsed; this just filters
|
||||
down to the requested agent and its bottle.
|
||||
|
||||
The returned Manifest.bottle has the agent's git-gate.user already
|
||||
overlaid (agent wins on non-empty, per-field).
|
||||
|
||||
Always raises ManifestError if the agent is unknown or invalid.
|
||||
Backends call this at preflight inside _validate."""
|
||||
if self.home_md is None:
|
||||
# Eager manifest (from_json_obj): data already parsed; filter to
|
||||
# the one requested agent and its bottle so the returned Manifest
|
||||
# always holds exactly one agent and one bottle regardless of path.
|
||||
if agent_name not in self.agents:
|
||||
available = ", ".join(sorted(self.agents.keys())) or "(none)"
|
||||
raise ManifestError(
|
||||
f"agent '{agent_name}' not defined. Available: {available}"
|
||||
)
|
||||
agent = self.agents[agent_name]
|
||||
raw_bottle = self.bottles[agent.bottle]
|
||||
merged = _merge_git_user(agent.git_user, raw_bottle.git_user)
|
||||
bottle = raw_bottle if merged == raw_bottle.git_user else replace(raw_bottle, git_user=merged)
|
||||
return Manifest(agent=agent, bottle=bottle)
|
||||
|
||||
from .manifest_loader import load_bottle_chain_from_dir, scan_agent_names
|
||||
from .manifest_schema import validate_agent_frontmatter_keys
|
||||
from .yaml_subset import YamlSubsetError, parse_frontmatter
|
||||
|
||||
# Locate the agent file; cwd wins over home on name collision.
|
||||
home_agents = scan_agent_names(self.home_md / "agents")
|
||||
cwd_agents: dict[str, Path] = {}
|
||||
if self.cwd_md is not None:
|
||||
cwd_agents = scan_agent_names(self.cwd_md / "agents")
|
||||
merged_agents = {**home_agents, **cwd_agents}
|
||||
|
||||
if agent_name not in merged_agents:
|
||||
available = ", ".join(sorted(merged_agents.keys())) or "(none)"
|
||||
raise ManifestError(
|
||||
f"agent '{agent_name}' not defined. Available: {available}"
|
||||
)
|
||||
|
||||
agent_path = merged_agents[agent_name]
|
||||
try:
|
||||
fm, body = parse_frontmatter(agent_path.read_text())
|
||||
except OSError as e:
|
||||
raise ManifestError(f"could not read {agent_path}: {e}") from e
|
||||
except YamlSubsetError as e:
|
||||
raise ManifestError(f"{agent_path}: {e}") from e
|
||||
|
||||
validate_agent_frontmatter_keys(agent_path, fm.keys())
|
||||
|
||||
bottle_name = fm.get("bottle")
|
||||
if not isinstance(bottle_name, str) or not bottle_name:
|
||||
raise ManifestError(
|
||||
f"agent '{agent_name}' must declare a 'bottle' field "
|
||||
f"naming a defined bottle"
|
||||
)
|
||||
|
||||
# Load the bottle chain (may raise ManifestError).
|
||||
bottles_dir = self.home_md / "bottles"
|
||||
raw_bottle = load_bottle_chain_from_dir(bottle_name, bottles_dir)
|
||||
|
||||
# Build and validate the full ManifestAgent.
|
||||
agent_dict: dict[str, object] = {
|
||||
"bottle": bottle_name,
|
||||
"skills": fm.get("skills", []),
|
||||
"prompt": body.strip(),
|
||||
}
|
||||
if "git-gate" in fm:
|
||||
agent_dict["git-gate"] = fm["git-gate"]
|
||||
agent = ManifestAgent.from_dict(agent_name, agent_dict, {bottle_name})
|
||||
|
||||
merged_user = _merge_git_user(agent.git_user, raw_bottle.git_user)
|
||||
bottle = raw_bottle if merged_user == raw_bottle.git_user else replace(raw_bottle, git_user=merged_user)
|
||||
return Manifest(agent=agent, bottle=bottle)
|
||||
|
||||
def has_agent(self, name: str) -> bool:
|
||||
return name in self.agents
|
||||
|
||||
def require_agent(self, name: str) -> None:
|
||||
"""Check that `name` is a discoverable agent. In names-only mode
|
||||
this checks whether the .md file exists; in eager mode it checks
|
||||
the pre-parsed agents dict. Does NOT parse file content."""
|
||||
if self.has_agent(name):
|
||||
return
|
||||
available = ", ".join(self.agents.keys())
|
||||
if available:
|
||||
msg = f"agent '{name}' not defined in bot-bottle.json. Available: {available}"
|
||||
raise ManifestError(msg)
|
||||
raise ManifestError(
|
||||
f"agent '{name}' not defined in bot-bottle.json (manifest is empty)."
|
||||
)
|
||||
|
||||
def has_bottle(self, name: str) -> bool:
|
||||
return name in self.bottles
|
||||
|
||||
def require_bottle(self, name: str) -> None:
|
||||
if self.has_bottle(name):
|
||||
return
|
||||
available = ", ".join(self.bottles.keys())
|
||||
if available:
|
||||
raise ManifestError(
|
||||
f"bottle '{name}' not defined in bot-bottle.json. "
|
||||
f"Available bottles: {available}"
|
||||
if self.home_md is not None:
|
||||
# Names-only mode: check file existence without parsing.
|
||||
home_path = self.home_md / "agents" / f"{name}.md"
|
||||
cwd_path = (
|
||||
self.cwd_md / "agents" / f"{name}.md"
|
||||
if self.cwd_md else None
|
||||
)
|
||||
raise ManifestError(f"bottle '{name}' not defined in bot-bottle.json (no bottles defined).")
|
||||
|
||||
def _effective_git_user(self, agent_name: str) -> GitUser:
|
||||
"""Merge the agent's git.user over the referenced bottle's,
|
||||
per-field, agent-wins-on-non-empty (issue #94). Same overlay
|
||||
the `extends:` resolver applies between bottles
|
||||
(`_merge_bottles`)."""
|
||||
agent = self.agents[agent_name]
|
||||
base = self.bottles[agent.bottle].git_user
|
||||
over = agent.git_user
|
||||
if over.is_empty():
|
||||
return base
|
||||
return GitUser(
|
||||
name=over.name or base.name,
|
||||
email=over.email or base.email,
|
||||
if home_path.is_file() or (cwd_path and cwd_path.is_file()):
|
||||
return
|
||||
available = ", ".join(self.all_agent_names) or "(none)"
|
||||
raise ManifestError(
|
||||
f"agent '{name}' not defined. Available: {available}"
|
||||
)
|
||||
|
||||
def bottle_for(self, agent_name: str) -> Bottle:
|
||||
"""Resolve the Bottle the named agent references, with the
|
||||
agent's git.user overlaid on top. The validator guarantees both
|
||||
lookups succeed for a manifest built via from_json_obj.
|
||||
|
||||
The overlay lives here, the single point both backends call to
|
||||
resolve an agent's bottle, so the docker / smolmachines git
|
||||
provisioners pick up the merged identity unchanged."""
|
||||
bottle = self.bottles[self.agents[agent_name].bottle]
|
||||
merged = self._effective_git_user(agent_name)
|
||||
if merged == bottle.git_user:
|
||||
return bottle
|
||||
return replace(bottle, git_user=merged)
|
||||
|
||||
def git_identity_summary(self, agent_name: str) -> str | None:
|
||||
"""One-line effective git identity with per-field provenance
|
||||
for launch summaries, e.g.
|
||||
`name=claude (agent), email=eric@dideric.is (bottle)`.
|
||||
Returns None when neither agent nor bottle sets an identity."""
|
||||
over = self.agents[agent_name].git_user
|
||||
merged = self._effective_git_user(agent_name)
|
||||
if merged.is_empty():
|
||||
return None
|
||||
parts: list[str] = []
|
||||
if merged.name:
|
||||
parts.append(f"name={merged.name} ({'agent' if over.name else 'bottle'})")
|
||||
if merged.email:
|
||||
parts.append(f"email={merged.email} ({'agent' if over.email else 'bottle'})")
|
||||
return ", ".join(parts)
|
||||
|
||||
+118
-17
@@ -2,17 +2,17 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
from dataclasses import dataclass, field
|
||||
from typing import cast
|
||||
|
||||
from .agent_provider import PROVIDER_TEMPLATES
|
||||
from .manifest_util import ManifestError, as_json_object
|
||||
from .manifest_git import GitUser
|
||||
from .manifest_git import ManifestGitUser
|
||||
from .manifest_schema import AGENT_MODEL_KEYS
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class AgentProvider:
|
||||
class ManifestAgentProvider:
|
||||
"""Provider/template for the agent process inside a bottle.
|
||||
|
||||
`template` selects a built-in launch/runtime contract. `dockerfile`
|
||||
@@ -33,15 +33,23 @@ class AgentProvider:
|
||||
dockerfile: str = ""
|
||||
auth_token: str = ""
|
||||
forward_host_credentials: bool = False
|
||||
settings: dict[str, object] = field(default_factory=dict)
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, bottle_name: str, raw: object) -> "AgentProvider":
|
||||
def from_dict(cls, bottle_name: str, raw: object) -> "ManifestAgentProvider":
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' agent_provider")
|
||||
for k in d:
|
||||
if k not in {"template", "dockerfile", "auth_token", "forward_host_credentials"}:
|
||||
if k not in {
|
||||
"template",
|
||||
"dockerfile",
|
||||
"auth_token",
|
||||
"forward_host_credentials",
|
||||
"settings",
|
||||
}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider has unknown key {k!r}; "
|
||||
f"allowed: template, dockerfile, auth_token, forward_host_credentials"
|
||||
"allowed: template, dockerfile, auth_token, "
|
||||
"forward_host_credentials, settings"
|
||||
)
|
||||
template = d.get("template", "claude")
|
||||
if not isinstance(template, str) or not template:
|
||||
@@ -49,11 +57,6 @@ class AgentProvider:
|
||||
f"bottle '{bottle_name}' agent_provider.template must be a "
|
||||
f"non-empty string"
|
||||
)
|
||||
if template not in PROVIDER_TEMPLATES:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.template {template!r} "
|
||||
f"is not one of {', '.join(sorted(PROVIDER_TEMPLATES))}"
|
||||
)
|
||||
dockerfile = d.get("dockerfile", "")
|
||||
if not isinstance(dockerfile, str):
|
||||
raise ManifestError(
|
||||
@@ -66,6 +69,12 @@ class AgentProvider:
|
||||
f"bottle '{bottle_name}' agent_provider.auth_token must be a "
|
||||
f"string (was {type(auth_token).__name__})"
|
||||
)
|
||||
if auth_token and template not in PROVIDER_TEMPLATES:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.auth_token is only "
|
||||
f"supported for built-in templates "
|
||||
f"({', '.join(sorted(PROVIDER_TEMPLATES))})"
|
||||
)
|
||||
if auth_token and template != "claude":
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.auth_token is only "
|
||||
@@ -77,21 +86,29 @@ class AgentProvider:
|
||||
f"bottle '{bottle_name}' agent_provider.forward_host_credentials "
|
||||
f"must be a boolean (was {type(forward_host_credentials).__name__})"
|
||||
)
|
||||
if forward_host_credentials and template not in PROVIDER_TEMPLATES:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.forward_host_credentials "
|
||||
f"is only supported for built-in templates "
|
||||
f"({', '.join(sorted(PROVIDER_TEMPLATES))})"
|
||||
)
|
||||
if forward_host_credentials and template != "codex":
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.forward_host_credentials "
|
||||
"is currently only supported for template 'codex'"
|
||||
)
|
||||
settings = _parse_provider_settings(bottle_name, template, d.get("settings"))
|
||||
return cls(
|
||||
template=template,
|
||||
dockerfile=dockerfile,
|
||||
auth_token=auth_token,
|
||||
forward_host_credentials=forward_host_credentials,
|
||||
settings=settings,
|
||||
)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Agent:
|
||||
class ManifestAgent:
|
||||
bottle: str
|
||||
skills: tuple[str, ...] = ()
|
||||
prompt: str = ""
|
||||
@@ -99,10 +116,10 @@ class Agent:
|
||||
# bottle's git-gate.user per-field at `Manifest.bottle_for`. Only
|
||||
# `user` is allowed at the agent level; `repos` stays bottle-only
|
||||
# because it carries credentials and host trust.
|
||||
git_user: GitUser = GitUser()
|
||||
git_user: ManifestGitUser = ManifestGitUser()
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, name: str, raw: object, bottle_names: set[str]) -> "Agent":
|
||||
def from_dict(cls, name: str, raw: object, bottle_names: set[str]) -> "ManifestAgent":
|
||||
d = as_json_object(raw, f"agent '{name}'")
|
||||
unknown = set(d.keys()) - AGENT_MODEL_KEYS
|
||||
if unknown:
|
||||
@@ -157,11 +174,11 @@ class Agent:
|
||||
|
||||
# git-gate: agents may declare only `git-gate.user` (name/email).
|
||||
# `git-gate.repos` is bottle-only — it carries credentials and host trust.
|
||||
git_user = GitUser()
|
||||
git_user = ManifestGitUser()
|
||||
git_raw = d.get("git-gate")
|
||||
if git_raw is not None:
|
||||
gd = as_json_object(git_raw, f"agent '{name}' git-gate")
|
||||
for k in gd.keys():
|
||||
for k in gd:
|
||||
if k != "user":
|
||||
raise ManifestError(
|
||||
f"agent '{name}' git-gate.{k} is not allowed at the "
|
||||
@@ -170,6 +187,90 @@ class Agent:
|
||||
f"(it carries credentials and host trust)."
|
||||
)
|
||||
if "user" in gd:
|
||||
git_user = GitUser.from_dict(name, gd["user"])
|
||||
git_user = ManifestGitUser.from_dict(name, gd["user"])
|
||||
|
||||
return cls(bottle=bottle, skills=skills, prompt=prompt, git_user=git_user)
|
||||
|
||||
|
||||
def _parse_provider_settings(
|
||||
bottle_name: str,
|
||||
template: str,
|
||||
raw: object,
|
||||
) -> dict[str, object]:
|
||||
if raw is None:
|
||||
return {}
|
||||
if template != "pi":
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings is only "
|
||||
"supported for template 'pi'"
|
||||
)
|
||||
settings = as_json_object(raw, f"bottle '{bottle_name}' agent_provider.settings")
|
||||
allowed = {
|
||||
"provider",
|
||||
"base_url",
|
||||
"api",
|
||||
"api_key",
|
||||
"api_key_env",
|
||||
"models",
|
||||
"context_window",
|
||||
"max_tokens_field",
|
||||
"max_tokens",
|
||||
"supports_developer_role",
|
||||
"supports_reasoning_effort",
|
||||
}
|
||||
for key in settings:
|
||||
if key not in allowed:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings has unknown "
|
||||
f"key {key!r}; allowed: {', '.join(sorted(allowed))}"
|
||||
)
|
||||
for key in ("provider", "base_url", "api", "api_key", "api_key_env"):
|
||||
value = settings.get(key)
|
||||
if value is not None and (not isinstance(value, str) or not value):
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings.{key} must "
|
||||
"be a non-empty string"
|
||||
)
|
||||
max_tokens_field = settings.get("max_tokens_field")
|
||||
if max_tokens_field is not None and max_tokens_field not in (
|
||||
"max_tokens", "max_completion_tokens",
|
||||
):
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings.max_tokens_field "
|
||||
"must be 'max_tokens' or 'max_completion_tokens'"
|
||||
)
|
||||
if settings.get("api_key") is not None and settings.get("api_key_env") is not None:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings may set either "
|
||||
"api_key or api_key_env, not both"
|
||||
)
|
||||
models = settings.get("models")
|
||||
if models is not None:
|
||||
if not isinstance(models, list) or not models:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings.models must "
|
||||
"be a non-empty array of strings"
|
||||
)
|
||||
for i, model in enumerate(models):
|
||||
if not isinstance(model, str) or not model:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings.models[{i}] "
|
||||
"must be a non-empty string"
|
||||
)
|
||||
for key in ("supports_developer_role", "supports_reasoning_effort"):
|
||||
value = settings.get(key)
|
||||
if value is not None and not isinstance(value, bool):
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings.{key} must "
|
||||
f"be a boolean (was {type(value).__name__})"
|
||||
)
|
||||
for key in ("context_window", "max_tokens"):
|
||||
value = settings.get(key)
|
||||
if value is not None and (
|
||||
not isinstance(value, int) or isinstance(value, bool) or value <= 0
|
||||
):
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.settings.{key} must "
|
||||
f"be a positive integer (was {type(value).__name__})"
|
||||
)
|
||||
return dict(settings)
|
||||
|
||||
+263
-143
@@ -1,33 +1,31 @@
|
||||
"""Egress routing manifest dataclasses and helpers."""
|
||||
"""Egress routing manifest dataclasses and helpers (PRD 0017, PRD 0053)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import ipaddress
|
||||
from dataclasses import dataclass, field
|
||||
import re
|
||||
from dataclasses import dataclass
|
||||
from typing import cast
|
||||
|
||||
from .manifest_util import ManifestError, as_json_object
|
||||
|
||||
|
||||
# Auth schemes for the egress route's optional `auth` block.
|
||||
# Same values cred-proxy accepts today; `token` sidesteps the Gitea
|
||||
# token-not-Bearer quirk (go-gitea/gitea#16734).
|
||||
EGRESS_AUTH_SCHEMES = ("Bearer", "token")
|
||||
|
||||
PATH_MATCH_TYPES = ("exact", "prefix", "regex")
|
||||
HEADER_MATCH_TYPES = ("exact", "regex")
|
||||
|
||||
VALID_METHODS = frozenset({
|
||||
"GET", "HEAD", "POST", "PUT", "DELETE", "PATCH", "OPTIONS", "TRACE",
|
||||
"CONNECT",
|
||||
})
|
||||
|
||||
OUTBOUND_DETECTOR_NAMES = frozenset({"token_patterns", "known_secrets"})
|
||||
INBOUND_DETECTOR_NAMES = frozenset({"naive_injection_detection"})
|
||||
|
||||
|
||||
def validate_egress_routes(
|
||||
bottle_name: str,
|
||||
routes: tuple[EgressRoute, ...],
|
||||
routes: tuple[ManifestEgressRoute, ...],
|
||||
) -> None:
|
||||
"""Cross-validation for `bottle.egress.routes`: hosts must be unique.
|
||||
|
||||
The proxy matches by exact-host (v1); duplicate hosts leave the
|
||||
route choice ambiguous so we reject them up front.
|
||||
|
||||
No cross-validation against `bottle.git-gate.repos` is performed.
|
||||
git-gate (SSH push/fetch) and egress (HTTPS) broker different
|
||||
protocols; declaring both for the same host is a legitimate dev
|
||||
setup."""
|
||||
seen_hosts: dict[str, None] = {}
|
||||
for r in routes:
|
||||
key = r.Host.lower()
|
||||
@@ -40,132 +38,62 @@ def validate_egress_routes(
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class PipelockRoutePolicy:
|
||||
"""Per-route pipelock policy overrides.
|
||||
|
||||
`TlsPassthrough` adds the route host to pipelock's
|
||||
`tls_interception.passthrough_domains`, so pipelock still enforces
|
||||
the hostname allowlist but does not MITM/decrypt request bodies or
|
||||
headers for that host.
|
||||
|
||||
`SsrfIpAllowlist` adds explicit IPs/CIDRs to pipelock's SSRF
|
||||
allowlist for private/internal destinations behind this route.
|
||||
"""
|
||||
|
||||
TlsPassthrough: bool = False
|
||||
SsrfIpAllowlist: tuple[str, ...] = ()
|
||||
|
||||
@classmethod
|
||||
def from_dict(
|
||||
cls, bottle_name: str, idx: int, raw: object,
|
||||
) -> "PipelockRoutePolicy":
|
||||
label = f"bottle '{bottle_name}' egress.routes[{idx}] pipelock"
|
||||
d = as_json_object(raw, label)
|
||||
for k in d:
|
||||
if k not in ("tls_passthrough", "ssrf_ip_allowlist"):
|
||||
raise ManifestError(
|
||||
f"{label} has unknown key {k!r}; "
|
||||
f"only 'tls_passthrough' and 'ssrf_ip_allowlist' "
|
||||
f"are accepted"
|
||||
)
|
||||
tls_passthrough_raw = d.get("tls_passthrough", False)
|
||||
if not isinstance(tls_passthrough_raw, bool):
|
||||
raise ManifestError(
|
||||
f"{label}.tls_passthrough must be a boolean "
|
||||
f"(was {type(tls_passthrough_raw).__name__})"
|
||||
)
|
||||
ssrf_raw = d.get("ssrf_ip_allowlist", [])
|
||||
if not isinstance(ssrf_raw, list):
|
||||
raise ManifestError(
|
||||
f"{label}.ssrf_ip_allowlist must be an array "
|
||||
f"(was {type(ssrf_raw).__name__})"
|
||||
)
|
||||
ssrf_ip_allowlist: list[str] = []
|
||||
for j, item in enumerate(ssrf_raw):
|
||||
if not isinstance(item, str) or not item:
|
||||
raise ManifestError(
|
||||
f"{label}.ssrf_ip_allowlist[{j}] must be a non-empty "
|
||||
f"string (was {type(item).__name__})"
|
||||
)
|
||||
try:
|
||||
ipaddress.ip_network(item, strict=False)
|
||||
except ValueError as e:
|
||||
raise ManifestError(
|
||||
f"{label}.ssrf_ip_allowlist[{j}] must be an IP address "
|
||||
f"or CIDR (was {item!r}): {e}"
|
||||
) from e
|
||||
ssrf_ip_allowlist.append(item)
|
||||
return cls(
|
||||
TlsPassthrough=tls_passthrough_raw,
|
||||
SsrfIpAllowlist=tuple(ssrf_ip_allowlist),
|
||||
)
|
||||
class ManifestPathMatch:
|
||||
Type: str = "prefix"
|
||||
Value: str = ""
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class EgressRoute:
|
||||
"""One route on the per-bottle egress sidecar (PRD 0017).
|
||||
class ManifestHeaderMatch:
|
||||
Name: str = ""
|
||||
Value: str = ""
|
||||
Type: str = "exact"
|
||||
|
||||
`Host` matches the request's hostname (case-insensitive). The
|
||||
optional `PathAllowlist` constrains the URL path to a set of
|
||||
prefixes; empty tuple means no path-level filtering. The optional
|
||||
`AuthScheme` / `TokenRef` pair drives credential injection:
|
||||
when set, the proxy strips any inbound Authorization and injects
|
||||
`<AuthScheme> <value-of-host-env-named-by-TokenRef>`. When the
|
||||
manifest's `auth` block is omitted both fields are empty strings —
|
||||
no Authorization is written, no token forwarded.
|
||||
|
||||
`Role` is reserved for future use; all role strings are currently
|
||||
rejected by the validator.
|
||||
@dataclass(frozen=True)
|
||||
class ManifestMatchEntry:
|
||||
Paths: tuple[ManifestPathMatch, ...] = ()
|
||||
Methods: tuple[str, ...] = ()
|
||||
Headers: tuple[ManifestHeaderMatch, ...] = ()
|
||||
|
||||
Validation rules (enforced in `from_dict`):
|
||||
- `host` required, non-empty.
|
||||
- `path_allowlist` optional, list of absolute path prefixes.
|
||||
- `auth` optional. If present, MUST carry both `scheme` and
|
||||
`token_ref` as non-empty strings; an empty `auth: {}` is an
|
||||
error rather than a synonym for "no auth" (omit `auth` for
|
||||
that case).
|
||||
- `role` optional, reserved — any non-empty value is rejected.
|
||||
"""
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class ManifestEgressRoute:
|
||||
Host: str
|
||||
PathAllowlist: tuple[str, ...] = ()
|
||||
Matches: tuple[ManifestMatchEntry, ...] = ()
|
||||
AuthScheme: str = ""
|
||||
TokenRef: str = ""
|
||||
Role: tuple[str, ...] = ()
|
||||
Pipelock: PipelockRoutePolicy = field(default_factory=PipelockRoutePolicy)
|
||||
GitFetch: bool = False
|
||||
OutboundDetectors: tuple[str, ...] | None = None
|
||||
InboundDetectors: tuple[str, ...] | None = None
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, bottle_name: str, idx: int, raw: object) -> "EgressRoute":
|
||||
def from_dict(cls, bottle_name: str, idx: int, raw: object) -> "ManifestEgressRoute":
|
||||
label = f"bottle '{bottle_name}' egress.routes[{idx}]"
|
||||
d = as_json_object(raw, label)
|
||||
host = d.get("host")
|
||||
if not isinstance(host, str) or not host:
|
||||
raise ManifestError(f"{label} missing required string field 'host'")
|
||||
|
||||
path_allow_raw = d.get("path_allowlist")
|
||||
prefixes: tuple[str, ...] = ()
|
||||
if path_allow_raw is not None:
|
||||
if not isinstance(path_allow_raw, list):
|
||||
# --- matches ---
|
||||
matches: tuple[ManifestMatchEntry, ...] = ()
|
||||
matches_raw = d.get("matches")
|
||||
if matches_raw is not None:
|
||||
if not isinstance(matches_raw, list):
|
||||
raise ManifestError(
|
||||
f"{label} path_allowlist must be an array "
|
||||
f"(was {type(path_allow_raw).__name__})"
|
||||
f"{label} matches must be an array "
|
||||
f"(was {type(matches_raw).__name__})"
|
||||
)
|
||||
path_list = cast(list[object], path_allow_raw)
|
||||
collected: list[str] = []
|
||||
for j, p in enumerate(path_list):
|
||||
if not isinstance(p, str):
|
||||
raise ManifestError(
|
||||
f"{label} path_allowlist[{j}] must be a string "
|
||||
f"(was {type(p).__name__})"
|
||||
)
|
||||
if not p.startswith("/"):
|
||||
raise ManifestError(
|
||||
f"{label} path_allowlist[{j}] {p!r} must be an "
|
||||
f"absolute path prefix starting with '/'"
|
||||
)
|
||||
collected.append(p)
|
||||
prefixes = tuple(collected)
|
||||
matches_list = cast(list[object], matches_raw)
|
||||
entries: list[ManifestMatchEntry] = []
|
||||
for k, entry_raw in enumerate(matches_list):
|
||||
entries.append(
|
||||
_parse_match_entry(label, k, entry_raw)
|
||||
)
|
||||
matches = tuple(entries)
|
||||
|
||||
# --- auth ---
|
||||
auth_scheme = ""
|
||||
token_ref = ""
|
||||
if "auth" in d:
|
||||
@@ -203,6 +131,7 @@ class EgressRoute:
|
||||
auth_scheme = auth_scheme_raw
|
||||
token_ref = token_ref_raw
|
||||
|
||||
# --- role (reserved) ---
|
||||
role_raw = d.get("role")
|
||||
roles: tuple[str, ...] = ()
|
||||
if role_raw is None:
|
||||
@@ -229,43 +158,228 @@ class EgressRoute:
|
||||
f"the 'role' field is reserved for future use"
|
||||
)
|
||||
|
||||
pipelock = (
|
||||
PipelockRoutePolicy.from_dict(bottle_name, idx, d["pipelock"])
|
||||
if "pipelock" in d
|
||||
else PipelockRoutePolicy()
|
||||
)
|
||||
# --- dlp ---
|
||||
outbound_detectors: tuple[str, ...] | None = None
|
||||
inbound_detectors: tuple[str, ...] | None = None
|
||||
if "dlp" in d:
|
||||
outbound_detectors, inbound_detectors = _parse_dlp_block(
|
||||
label, d.get("dlp"),
|
||||
)
|
||||
|
||||
# --- git-over-HTTPS policy ---
|
||||
git_fetch = False
|
||||
if "git" in d:
|
||||
git_d = as_json_object(d.get("git"), f"{label} git")
|
||||
raw_fetch = git_d.get("fetch", False)
|
||||
if isinstance(raw_fetch, bool):
|
||||
git_fetch = raw_fetch
|
||||
else:
|
||||
raise ManifestError(
|
||||
f"{label} git.fetch must be a boolean "
|
||||
f"(was {type(raw_fetch).__name__})"
|
||||
)
|
||||
for k in git_d:
|
||||
if k != "fetch":
|
||||
raise ManifestError(
|
||||
f"{label} git has unknown key {k!r}; "
|
||||
f"only 'fetch' is accepted"
|
||||
)
|
||||
|
||||
for k in d:
|
||||
if k not in ("host", "path_allowlist", "auth", "role", "pipelock"):
|
||||
if k not in ("host", "matches", "auth", "role", "dlp", "git"):
|
||||
raise ManifestError(
|
||||
f"{label} has unknown key {k!r}; accepted keys are "
|
||||
f"'host', 'path_allowlist', 'auth', 'role', 'pipelock'"
|
||||
f"'host', 'matches', 'auth', 'role', 'dlp', 'git'"
|
||||
)
|
||||
|
||||
return cls(
|
||||
Host=host,
|
||||
PathAllowlist=prefixes,
|
||||
Matches=matches,
|
||||
AuthScheme=auth_scheme,
|
||||
TokenRef=token_ref,
|
||||
Role=roles,
|
||||
Pipelock=pipelock,
|
||||
GitFetch=git_fetch,
|
||||
OutboundDetectors=outbound_detectors,
|
||||
InboundDetectors=inbound_detectors,
|
||||
)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class EgressConfig:
|
||||
"""Per-bottle egress configuration. Today this is just the
|
||||
route table; the nesting under `egress:` leaves room for
|
||||
per-bottle proxy settings (port override, log level, etc.) in
|
||||
follow-ups."""
|
||||
def _parse_match_entry(
|
||||
route_label: str, k: int, raw: object,
|
||||
) -> ManifestMatchEntry:
|
||||
label = f"{route_label} matches[{k}]"
|
||||
d = as_json_object(raw, label)
|
||||
|
||||
routes: tuple[EgressRoute, ...] = ()
|
||||
paths: tuple[ManifestPathMatch, ...] = ()
|
||||
paths_raw = d.get("paths")
|
||||
if paths_raw is not None:
|
||||
if not isinstance(paths_raw, list):
|
||||
raise ManifestError(f"{label} paths must be an array")
|
||||
paths_list = cast(list[object], paths_raw)
|
||||
parsed_paths: list[ManifestPathMatch] = []
|
||||
for j, p_raw in enumerate(paths_list):
|
||||
parsed_paths.append(_parse_path_match(label, j, p_raw))
|
||||
paths = tuple(parsed_paths)
|
||||
|
||||
methods: tuple[str, ...] = ()
|
||||
methods_raw = d.get("methods")
|
||||
if methods_raw is not None:
|
||||
if not isinstance(methods_raw, list):
|
||||
raise ManifestError(f"{label} methods must be an array")
|
||||
methods_list = cast(list[object], methods_raw)
|
||||
normalised: list[str] = []
|
||||
for j, m in enumerate(methods_list):
|
||||
if not isinstance(m, str):
|
||||
raise ManifestError(
|
||||
f"{label} methods[{j}] must be a string"
|
||||
)
|
||||
upper = m.upper()
|
||||
if upper not in VALID_METHODS:
|
||||
raise ManifestError(
|
||||
f"{label} methods[{j}] {m!r} is not a valid HTTP method"
|
||||
)
|
||||
normalised.append(upper)
|
||||
methods = tuple(normalised)
|
||||
|
||||
headers: tuple[ManifestHeaderMatch, ...] = ()
|
||||
headers_raw = d.get("headers")
|
||||
if headers_raw is not None:
|
||||
if not isinstance(headers_raw, list):
|
||||
raise ManifestError(f"{label} headers must be an array")
|
||||
headers_list = cast(list[object], headers_raw)
|
||||
parsed_headers: list[ManifestHeaderMatch] = []
|
||||
for j, h_raw in enumerate(headers_list):
|
||||
parsed_headers.append(_parse_header_match(label, j, h_raw))
|
||||
headers = tuple(parsed_headers)
|
||||
|
||||
for key in d:
|
||||
if key not in ("paths", "methods", "headers"):
|
||||
raise ManifestError(f"{label} has unknown key {key!r}")
|
||||
|
||||
return ManifestMatchEntry(Paths=paths, Methods=methods, Headers=headers)
|
||||
|
||||
|
||||
def _parse_path_match(
|
||||
entry_label: str, j: int, raw: object,
|
||||
) -> ManifestPathMatch:
|
||||
label = f"{entry_label} paths[{j}]"
|
||||
d = as_json_object(raw, label)
|
||||
ptype = d.get("type", "prefix")
|
||||
if not isinstance(ptype, str) or ptype not in PATH_MATCH_TYPES:
|
||||
raise ManifestError(
|
||||
f"{label} type must be one of {', '.join(PATH_MATCH_TYPES)} "
|
||||
f"(got {ptype!r})"
|
||||
)
|
||||
value = d.get("value")
|
||||
if not isinstance(value, str) or not value:
|
||||
raise ManifestError(f"{label} value must be a non-empty string")
|
||||
if ptype in ("exact", "prefix") and not value.startswith("/"):
|
||||
raise ManifestError(
|
||||
f"{label} value {value!r} must start with '/' for type {ptype!r}"
|
||||
)
|
||||
if ptype == "regex":
|
||||
try:
|
||||
re.compile(value)
|
||||
except re.error as e:
|
||||
raise ManifestError(
|
||||
f"{label} regex {value!r} failed to compile: {e}"
|
||||
) from e
|
||||
for k in d:
|
||||
if k not in ("type", "value"):
|
||||
raise ManifestError(f"{label} has unknown key {k!r}")
|
||||
return ManifestPathMatch(Type=ptype, Value=value)
|
||||
|
||||
|
||||
def _parse_header_match(
|
||||
entry_label: str, j: int, raw: object,
|
||||
) -> ManifestHeaderMatch:
|
||||
label = f"{entry_label} headers[{j}]"
|
||||
d = as_json_object(raw, label)
|
||||
name = d.get("name")
|
||||
if not isinstance(name, str) or not name:
|
||||
raise ManifestError(f"{label} name must be a non-empty string")
|
||||
value = d.get("value")
|
||||
if not isinstance(value, str):
|
||||
raise ManifestError(f"{label} value must be a string")
|
||||
htype = d.get("type", "exact")
|
||||
if not isinstance(htype, str) or htype not in HEADER_MATCH_TYPES:
|
||||
raise ManifestError(
|
||||
f"{label} type must be one of {', '.join(HEADER_MATCH_TYPES)} "
|
||||
f"(got {htype!r})"
|
||||
)
|
||||
if htype == "regex":
|
||||
try:
|
||||
re.compile(value)
|
||||
except re.error as e:
|
||||
raise ManifestError(
|
||||
f"{label} regex {value!r} failed to compile: {e}"
|
||||
) from e
|
||||
for k in d:
|
||||
if k not in ("name", "value", "type"):
|
||||
raise ManifestError(f"{label} has unknown key {k!r}")
|
||||
return ManifestHeaderMatch(Name=name, Value=value, Type=htype)
|
||||
|
||||
|
||||
def _parse_dlp_block(
|
||||
route_label: str,
|
||||
raw: object,
|
||||
) -> tuple[tuple[str, ...] | None, tuple[str, ...] | None]:
|
||||
label = f"{route_label} dlp"
|
||||
d = as_json_object(raw, label)
|
||||
|
||||
def _parse_field(
|
||||
field: str,
|
||||
valid_names: frozenset[str],
|
||||
) -> tuple[str, ...] | None:
|
||||
val = d.get(field)
|
||||
if val is None:
|
||||
return None
|
||||
if val is False:
|
||||
return ()
|
||||
if not isinstance(val, list):
|
||||
raise ManifestError(
|
||||
f"{label} {field} must be false, a list, or omitted"
|
||||
)
|
||||
items = cast(list[object], val)
|
||||
names: list[str] = []
|
||||
for j, item in enumerate(items):
|
||||
if not isinstance(item, str):
|
||||
raise ManifestError(
|
||||
f"{label} {field}[{j}] must be a string"
|
||||
)
|
||||
if item not in valid_names:
|
||||
raise ManifestError(
|
||||
f"{label} {field}[{j}] {item!r} is not a valid "
|
||||
f"detector; valid: {', '.join(sorted(valid_names))}"
|
||||
)
|
||||
names.append(item)
|
||||
return tuple(names)
|
||||
|
||||
outbound = _parse_field("outbound_detectors", OUTBOUND_DETECTOR_NAMES)
|
||||
inbound = _parse_field("inbound_detectors", INBOUND_DETECTOR_NAMES)
|
||||
|
||||
for k in d:
|
||||
if k not in ("outbound_detectors", "inbound_detectors"):
|
||||
raise ManifestError(
|
||||
f"{label} has unknown key {k!r}; accepted keys are "
|
||||
f"'outbound_detectors', 'inbound_detectors'"
|
||||
)
|
||||
return outbound, inbound
|
||||
|
||||
|
||||
LOG_LEVELS = frozenset({0, 1, 2})
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class ManifestEgressConfig:
|
||||
routes: tuple[ManifestEgressRoute, ...] = ()
|
||||
Log: int = 0
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, bottle_name: str, raw: object) -> "EgressConfig":
|
||||
def from_dict(cls, bottle_name: str, raw: object) -> "ManifestEgressConfig":
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' egress")
|
||||
routes_raw = d.get("routes")
|
||||
routes: tuple[EgressRoute, ...] = ()
|
||||
routes: tuple[ManifestEgressRoute, ...] = ()
|
||||
if routes_raw is not None:
|
||||
if not isinstance(routes_raw, list):
|
||||
raise ManifestError(
|
||||
@@ -274,14 +388,20 @@ class EgressConfig:
|
||||
)
|
||||
routes_list = cast(list[object], routes_raw)
|
||||
routes = tuple(
|
||||
EgressRoute.from_dict(bottle_name, i, entry)
|
||||
ManifestEgressRoute.from_dict(bottle_name, i, entry)
|
||||
for i, entry in enumerate(routes_list)
|
||||
)
|
||||
validate_egress_routes(bottle_name, routes)
|
||||
log_raw = d.get("log", 0)
|
||||
if isinstance(log_raw, bool) or not isinstance(log_raw, int) \
|
||||
or log_raw not in LOG_LEVELS:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' egress.log must be 0, 1, or 2"
|
||||
)
|
||||
for k in d:
|
||||
if k != "routes":
|
||||
if k not in ("routes", "log"):
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' egress has unknown key {k!r}; "
|
||||
f"only 'routes' is accepted"
|
||||
f"accepted keys are 'routes', 'log'"
|
||||
)
|
||||
return cls(routes=routes)
|
||||
return cls(routes=routes, Log=log_raw)
|
||||
|
||||
+106
-36
@@ -5,25 +5,31 @@ from __future__ import annotations
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .manifest import Bottle, GitEntry
|
||||
from .manifest import ManifestBottle
|
||||
from .manifest_egress import ManifestEgressConfig
|
||||
|
||||
|
||||
def resolve_bottles(raws: dict[str, dict[str, object]]) -> dict[str, Bottle]:
|
||||
"""Apply `extends:` chains and return resolved Bottle objects."""
|
||||
cache: dict[str, Bottle] = {}
|
||||
def resolve_bottles(raws: dict[str, dict[str, object]]) -> dict[str, ManifestBottle]:
|
||||
"""Apply `extends:` chains and return resolved ManifestBottle objects."""
|
||||
cache: dict[str, ManifestBottle] = {}
|
||||
# Per-bottle effective git-gate.repos, as raw dicts keyed by repo name.
|
||||
# Threaded alongside `cache` so a child can field-merge against its
|
||||
# parent's repos without reconstructing them from parsed entries.
|
||||
repos_cache: dict[str, dict[str, object]] = {}
|
||||
for name in raws:
|
||||
if name not in cache:
|
||||
_resolve_one_bottle(name, raws, cache, ())
|
||||
_resolve_one_bottle(name, raws, cache, repos_cache, ())
|
||||
return cache
|
||||
|
||||
|
||||
def _resolve_one_bottle(
|
||||
name: str,
|
||||
raws: dict[str, dict[str, object]],
|
||||
cache: dict[str, Bottle],
|
||||
cache: dict[str, ManifestBottle],
|
||||
repos_cache: dict[str, dict[str, object]],
|
||||
seen: tuple[str, ...],
|
||||
) -> Bottle:
|
||||
from .manifest import Bottle, ManifestError
|
||||
) -> ManifestBottle:
|
||||
from .manifest import ManifestBottle, ManifestError
|
||||
|
||||
if name in cache:
|
||||
return cache[name]
|
||||
@@ -32,14 +38,15 @@ def _resolve_one_bottle(
|
||||
raise ManifestError(f"bottle '{name}' is in an extends cycle: {chain}")
|
||||
raw = raws[name]
|
||||
parent_name_raw = raw.get("extends")
|
||||
# Strip `extends:` before passing to Bottle.from_dict so it
|
||||
# is not accidentally treated as a real Bottle field by future
|
||||
# Strip `extends:` before passing to ManifestBottle.from_dict so it
|
||||
# is not accidentally treated as a real ManifestBottle field by future
|
||||
# schema additions. It is only meaningful here.
|
||||
child_raw = {k: v for k, v in raw.items() if k != "extends"}
|
||||
|
||||
if parent_name_raw is None:
|
||||
bottle = Bottle.from_dict(name, child_raw)
|
||||
bottle = ManifestBottle.from_dict(name, child_raw)
|
||||
cache[name] = bottle
|
||||
repos_cache[name] = _resolve_repos_raw({}, child_raw)
|
||||
return bottle
|
||||
|
||||
if not isinstance(parent_name_raw, str):
|
||||
@@ -59,49 +66,69 @@ def _resolve_one_bottle(
|
||||
f"bottle '{name}' extends '{parent_name}' which is not "
|
||||
f"defined. Available bottles: {avail}"
|
||||
)
|
||||
parent = _resolve_one_bottle(parent_name, raws, cache, seen + (name,))
|
||||
bottle = _merge_bottles(parent, child_raw, name)
|
||||
parent = _resolve_one_bottle(
|
||||
parent_name, raws, cache, repos_cache, seen + (name,)
|
||||
)
|
||||
merged_repos_raw = _resolve_repos_raw(repos_cache[parent_name], child_raw)
|
||||
bottle = _merge_bottles(parent, child_raw, merged_repos_raw, name)
|
||||
cache[name] = bottle
|
||||
repos_cache[name] = merged_repos_raw
|
||||
return bottle
|
||||
|
||||
|
||||
def _merge_bottles(
|
||||
parent: Bottle,
|
||||
parent: ManifestBottle,
|
||||
child_raw: dict[str, object],
|
||||
merged_repos_raw: dict[str, object],
|
||||
name: str,
|
||||
) -> Bottle:
|
||||
) -> ManifestBottle:
|
||||
"""Apply PRD 0025 merge rules."""
|
||||
from .manifest import Bottle, GitUser
|
||||
from .manifest import ManifestBottle, ManifestGitUser
|
||||
from .manifest_egress import validate_egress_routes
|
||||
from .manifest_util import as_json_object
|
||||
|
||||
# Parse the child's declared fields into a Bottle (with the
|
||||
# git-gate.repos: when the child declares repos, inject the already
|
||||
# name-merged repo set (computed by _resolve_repos_raw) so the child
|
||||
# parses with the full inherited+overridden list (issue #237).
|
||||
if _child_declares_git_gate_repos(child_raw):
|
||||
git_raw = as_json_object(child_raw.get("git-gate", {}), "child git-gate")
|
||||
child_raw = {**child_raw, "git-gate": {**git_raw, "repos": merged_repos_raw}}
|
||||
|
||||
# Parse the child's declared fields into a ManifestBottle (with the
|
||||
# usual defaults for anything missing). Validation runs the same
|
||||
# way it would for a leaf bottle: typos / wrong types die here.
|
||||
child = Bottle.from_dict(name, child_raw)
|
||||
child = ManifestBottle.from_dict(name, child_raw)
|
||||
|
||||
# env: dict merge, child wins on collision.
|
||||
merged_env = {**parent.env, **child.env}
|
||||
|
||||
# git-gate.user: per-field overlay. Each non-empty field on child
|
||||
# wins; empties fall through to parent. The default GitUser()
|
||||
# wins; empties fall through to parent. The default ManifestGitUser()
|
||||
# is two empty strings, so a child that omits git-gate.user
|
||||
# inherits the parent's user verbatim.
|
||||
merged_git_user = GitUser(
|
||||
merged_git_user = ManifestGitUser(
|
||||
name=child.git_user.name or parent.git_user.name,
|
||||
email=child.git_user.email or parent.git_user.email,
|
||||
)
|
||||
|
||||
# git-gate.repos: missing means inherit; an explicit empty object
|
||||
# clears; otherwise parent and child merge by UpstreamHost with
|
||||
# child entries replacing duplicate hosts.
|
||||
# git-gate.repos: when declared, child.git already holds the merged
|
||||
# set (an explicit empty dict clears parent, leaving child.git empty).
|
||||
# When omitted, the parent's entries are inherited verbatim.
|
||||
if _child_declares_git_gate_repos(child_raw):
|
||||
merged_git = _merge_git_remotes(parent.git, child.git) if child.git else ()
|
||||
merged_git = child.git
|
||||
else:
|
||||
merged_git = parent.git
|
||||
|
||||
# Presence-driven full-replace for the remaining list-valued +
|
||||
# scalar fields.
|
||||
merged_egress = child.egress if "egress" in child_raw else parent.egress
|
||||
# egress.routes: missing means inherit; otherwise parent and child
|
||||
# route lists concatenate. Other egress scalar fields remain
|
||||
# presence-driven overlays.
|
||||
merged_egress = (
|
||||
_merge_egress(parent.egress, child.egress, child_raw)
|
||||
if "egress" in child_raw
|
||||
else parent.egress
|
||||
)
|
||||
|
||||
# Presence-driven full-replace for the remaining scalar fields.
|
||||
merged_agent_provider = (
|
||||
child.agent_provider
|
||||
if "agent_provider" in child_raw
|
||||
@@ -112,7 +139,7 @@ def _merge_bottles(
|
||||
)
|
||||
validate_egress_routes(name, merged_egress.routes)
|
||||
|
||||
return Bottle(
|
||||
return ManifestBottle(
|
||||
env=merged_env,
|
||||
agent_provider=merged_agent_provider,
|
||||
git=merged_git,
|
||||
@@ -122,6 +149,45 @@ def _merge_bottles(
|
||||
)
|
||||
|
||||
|
||||
def _resolve_repos_raw(
|
||||
parent_repos: dict[str, object],
|
||||
child_raw: dict[str, object],
|
||||
) -> dict[str, object]:
|
||||
"""Compute a bottle's effective git-gate.repos as raw dicts.
|
||||
|
||||
Repos are keyed by name. When the child omits git-gate.repos it
|
||||
inherits the parent's set verbatim; an explicit empty dict clears it.
|
||||
Otherwise parent and child unite by name, with same-name entries
|
||||
field-merged (parent fields are defaults, child fields win)."""
|
||||
from .manifest_util import as_json_object
|
||||
|
||||
if not _child_declares_git_gate_repos(child_raw):
|
||||
return parent_repos
|
||||
child_repos = _declared_repos_raw(child_raw)
|
||||
if not child_repos:
|
||||
return {}
|
||||
# Parent entries keep their order; child-only names are appended.
|
||||
names = list(parent_repos) + [n for n in child_repos if n not in parent_repos]
|
||||
return {
|
||||
name: {
|
||||
**as_json_object(parent_repos.get(name, {}), "parent git-gate repo"),
|
||||
**as_json_object(child_repos.get(name, {}), "child git-gate repo"),
|
||||
}
|
||||
for name in names
|
||||
}
|
||||
|
||||
|
||||
def _declared_repos_raw(child_raw: dict[str, object]) -> dict[str, object]:
|
||||
"""Return the child's explicitly declared git-gate.repos as raw dicts,
|
||||
or an empty dict when none are declared."""
|
||||
from .manifest_util import as_json_object
|
||||
|
||||
if not _child_declares_git_gate_repos(child_raw):
|
||||
return {}
|
||||
git_raw = as_json_object(child_raw.get("git-gate", {}), "child git-gate")
|
||||
return as_json_object(git_raw.get("repos", {}), "child git-gate.repos")
|
||||
|
||||
|
||||
def _child_declares_git_gate_repos(child_raw: dict[str, object]) -> bool:
|
||||
from .manifest_util import as_json_object
|
||||
|
||||
@@ -132,11 +198,15 @@ def _child_declares_git_gate_repos(child_raw: dict[str, object]) -> bool:
|
||||
return "repos" in git_obj
|
||||
|
||||
|
||||
def _merge_git_remotes(
|
||||
parent: tuple[GitEntry, ...],
|
||||
child: tuple[GitEntry, ...],
|
||||
) -> tuple[GitEntry, ...]:
|
||||
by_host = {entry.UpstreamHost: entry for entry in parent}
|
||||
for entry in child:
|
||||
by_host[entry.UpstreamHost] = entry
|
||||
return tuple(by_host.values())
|
||||
def _merge_egress(
|
||||
parent: ManifestEgressConfig,
|
||||
child: ManifestEgressConfig,
|
||||
child_raw: dict[str, object],
|
||||
) -> ManifestEgressConfig:
|
||||
from .manifest_egress import ManifestEgressConfig
|
||||
from .manifest_util import as_json_object
|
||||
|
||||
child_egress_raw = as_json_object(child_raw.get("egress"), "child egress")
|
||||
routes = parent.routes + child.routes
|
||||
log = child.Log if "log" in child_egress_raw else parent.Log
|
||||
return ManifestEgressConfig(routes=routes, Log=log)
|
||||
|
||||
+85
-78
@@ -4,7 +4,6 @@ from __future__ import annotations
|
||||
|
||||
import re
|
||||
from dataclasses import dataclass
|
||||
from typing import Optional
|
||||
|
||||
from .manifest_util import ManifestError, as_json_object
|
||||
|
||||
@@ -13,6 +12,8 @@ from .manifest_util import ManifestError, as_json_object
|
||||
# defence; this regex is belt-and-suspenders and documents intent).
|
||||
_GIT_NAME_RE = re.compile(r"^[A-Za-z0-9._-]+$")
|
||||
|
||||
_KEY_PROVIDERS = {"static", "gitea"}
|
||||
|
||||
|
||||
def _opt_str(value: object, label: str) -> str:
|
||||
if value is None:
|
||||
@@ -57,7 +58,7 @@ def parse_git_upstream(url: str, label: str) -> tuple[str, str, str, str]:
|
||||
return (user, host, port, path)
|
||||
|
||||
|
||||
def validate_unique_git_names(bottle_name: str, git: tuple[GitEntry, ...]) -> None:
|
||||
def validate_unique_git_names(bottle_name: str, git: tuple[ManifestGitEntry, ...]) -> None:
|
||||
seen: dict[str, None] = {}
|
||||
for g in git:
|
||||
if g.Name in seen:
|
||||
@@ -69,25 +70,27 @@ def validate_unique_git_names(bottle_name: str, git: tuple[GitEntry, ...]) -> No
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class ProvisionedKeyConfig:
|
||||
"""Configuration for automatic deploy-key lifecycle management
|
||||
(PRD 0048). Used when a git-gate.repos entry opts out of a
|
||||
static identity file and instead wants a fresh SSH keypair
|
||||
generated at spin-up and revoked at teardown.
|
||||
class ManifestKeyConfig:
|
||||
"""Configuration for a repo's SSH key in git-gate.repos.
|
||||
|
||||
`provider` names the contrib sub-package to load (e.g. `gitea`).
|
||||
`token_env` is the name of a host-side env var carrying the API
|
||||
token; the value is read at provision time, never stored on the
|
||||
plan. `api_url` is the forge's HTTP API root; if empty, it is
|
||||
derived from the upstream URL's host at provision time."""
|
||||
`provider` is either `"static"` (a pre-existing key on the host) or
|
||||
`"gitea"` (automatic deploy-key lifecycle via the Gitea API).
|
||||
|
||||
For `static`: `path` is the host-side absolute path to the SSH private key.
|
||||
|
||||
For `gitea`: `forge_token_env` is the name of a host-side env var
|
||||
carrying the Gitea API token; the value is read at provision time,
|
||||
never stored on the plan. `api_url` is the forge's HTTP API root; if
|
||||
empty, it is derived from the upstream URL's host at provision time."""
|
||||
|
||||
provider: str
|
||||
token_env: str
|
||||
path: str = ""
|
||||
forge_token_env: str = ""
|
||||
api_url: str = ""
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class GitEntry:
|
||||
class ManifestGitEntry:
|
||||
"""One upstream the per-agent git-gate (PRD 0008) is allowed to
|
||||
talk to. `Upstream` is the real remote URL the agent would push to
|
||||
if there were no gate; the gate hosts a bare repo at /git/<Name>.git
|
||||
@@ -99,15 +102,16 @@ class GitEntry:
|
||||
stashed in the `Upstream*` fields so the git-gate render step
|
||||
doesn't have to re-parse.
|
||||
|
||||
Manifest source: `git-gate.repos.<Name>` (PRD 0047/0048). Exactly
|
||||
one of `identity` (static key path) or `provisioned_key` (automatic
|
||||
lifecycle) must be present. The internal field names are stable."""
|
||||
Manifest source: `git-gate.repos.<Name>` (PRD 0047/0048). A `key`
|
||||
block is required; `key.provider` is `"static"` or `"gitea"`. For
|
||||
`static`, `IdentityFile` is populated at parse time from `key.path`.
|
||||
For `gitea`, `IdentityFile` is populated at provision time."""
|
||||
|
||||
Name: str
|
||||
Upstream: str
|
||||
Key: ManifestKeyConfig = ManifestKeyConfig(provider="")
|
||||
IdentityFile: str = ""
|
||||
KnownHostKey: str = ""
|
||||
ProvisionedKey: Optional[ProvisionedKeyConfig] = None
|
||||
RemoteKey: str = ""
|
||||
UpstreamUser: str = ""
|
||||
UpstreamHost: str = ""
|
||||
@@ -117,11 +121,11 @@ class GitEntry:
|
||||
@classmethod
|
||||
def from_repos_entry(
|
||||
cls, bottle_name: str, repo_name: str, raw: object
|
||||
) -> "GitEntry":
|
||||
) -> "ManifestGitEntry":
|
||||
"""Parse one entry from `git-gate.repos.<repo_name>`.
|
||||
|
||||
YAML keys: `url` (required), exactly one of `identity` or
|
||||
`provisioned_key` (required), `host_key` (optional).
|
||||
YAML keys: `url` (required), `key` (required object with
|
||||
`provider`, and provider-specific fields), `host_key` (optional).
|
||||
The repo_name becomes `Name`."""
|
||||
if not repo_name:
|
||||
raise ManifestError(
|
||||
@@ -135,10 +139,10 @@ class GitEntry:
|
||||
label = f"git-gate.repos[{repo_name!r}]"
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' {label}")
|
||||
for k in d:
|
||||
if k not in {"url", "identity", "provisioned_key", "host_key"}:
|
||||
if k not in {"url", "key", "host_key"}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label} has unknown key {k!r}; "
|
||||
f"allowed: url, identity, provisioned_key, host_key"
|
||||
f"allowed: url, key, host_key"
|
||||
)
|
||||
upstream = d.get("url")
|
||||
if not isinstance(upstream, str) or not upstream:
|
||||
@@ -146,32 +150,13 @@ class GitEntry:
|
||||
f"bottle '{bottle_name}' {label} missing required string field 'url'"
|
||||
)
|
||||
|
||||
has_identity = "identity" in d
|
||||
has_provisioned = "provisioned_key" in d
|
||||
if has_identity and has_provisioned:
|
||||
if "key" not in d:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label} must set exactly one of "
|
||||
f"'identity' or 'provisioned_key'; got both."
|
||||
)
|
||||
if not has_identity and not has_provisioned:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label} must set exactly one of "
|
||||
f"'identity' or 'provisioned_key'; got neither."
|
||||
f"bottle '{bottle_name}' {label} missing required 'key' block"
|
||||
)
|
||||
key_config = _parse_key_config(bottle_name, label, d["key"])
|
||||
|
||||
ident = ""
|
||||
provisioned_key: Optional[ProvisionedKeyConfig] = None
|
||||
if has_identity:
|
||||
raw_ident = d.get("identity")
|
||||
if not isinstance(raw_ident, str) or not raw_ident:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label} 'identity' must be a non-empty string"
|
||||
)
|
||||
ident = raw_ident
|
||||
else:
|
||||
provisioned_key = _parse_provisioned_key_config(
|
||||
bottle_name, label, d["provisioned_key"]
|
||||
)
|
||||
ident = key_config.path if key_config.provider == "static" else ""
|
||||
|
||||
khk = _opt_str(
|
||||
d.get("host_key"),
|
||||
@@ -183,9 +168,9 @@ class GitEntry:
|
||||
return cls(
|
||||
Name=repo_name,
|
||||
Upstream=upstream,
|
||||
Key=key_config,
|
||||
IdentityFile=ident,
|
||||
KnownHostKey=khk,
|
||||
ProvisionedKey=provisioned_key,
|
||||
RemoteKey=host,
|
||||
UpstreamUser=user,
|
||||
UpstreamHost=host,
|
||||
@@ -194,42 +179,64 @@ class GitEntry:
|
||||
)
|
||||
|
||||
|
||||
def _parse_provisioned_key_config(
|
||||
def _parse_key_config(
|
||||
bottle_name: str, label: str, raw: object
|
||||
) -> ProvisionedKeyConfig:
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' {label}.provisioned_key")
|
||||
for k in d:
|
||||
if k not in {"provider", "token_env", "api_url"}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.provisioned_key has unknown key {k!r}; "
|
||||
f"allowed: provider, token_env, api_url"
|
||||
)
|
||||
) -> ManifestKeyConfig:
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' {label}.key")
|
||||
provider = d.get("provider")
|
||||
if not isinstance(provider, str) or not provider:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.provisioned_key missing required "
|
||||
f"bottle '{bottle_name}' {label}.key missing required "
|
||||
f"string field 'provider'"
|
||||
)
|
||||
token_env = d.get("token_env")
|
||||
if not isinstance(token_env, str) or not token_env:
|
||||
if provider not in _KEY_PROVIDERS:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.provisioned_key missing required "
|
||||
f"string field 'token_env'"
|
||||
f"bottle '{bottle_name}' {label}.key provider {provider!r} is unknown; "
|
||||
f"allowed: {', '.join(sorted(_KEY_PROVIDERS))}"
|
||||
)
|
||||
api_url_raw = d.get("api_url", "")
|
||||
if not isinstance(api_url_raw, str):
|
||||
|
||||
if provider == "gitea":
|
||||
for k in d:
|
||||
if k not in {"provider", "forge_token_env", "api_url"}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.key has unknown key {k!r} "
|
||||
f"for provider 'gitea'; allowed: provider, forge_token_env, api_url"
|
||||
)
|
||||
forge_token_env = d.get("forge_token_env")
|
||||
if not isinstance(forge_token_env, str) or not forge_token_env:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.key missing required "
|
||||
f"string field 'forge_token_env' for provider 'gitea'"
|
||||
)
|
||||
api_url_raw = d.get("api_url", "")
|
||||
if not isinstance(api_url_raw, str):
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.key 'api_url' must be a string"
|
||||
)
|
||||
return ManifestKeyConfig(
|
||||
provider=provider,
|
||||
forge_token_env=forge_token_env,
|
||||
api_url=api_url_raw,
|
||||
)
|
||||
|
||||
# provider == "static"
|
||||
for k in d:
|
||||
if k not in {"provider", "path"}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.key has unknown key {k!r} "
|
||||
f"for provider 'static'; allowed: provider, path"
|
||||
)
|
||||
path = d.get("path")
|
||||
if not isinstance(path, str) or not path:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' {label}.provisioned_key 'api_url' must be a string"
|
||||
f"bottle '{bottle_name}' {label}.key missing required "
|
||||
f"string field 'path' for provider 'static'"
|
||||
)
|
||||
return ProvisionedKeyConfig(
|
||||
provider=provider,
|
||||
token_env=token_env,
|
||||
api_url=api_url_raw,
|
||||
)
|
||||
return ManifestKeyConfig(provider=provider, path=path)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class GitUser:
|
||||
class ManifestGitUser:
|
||||
"""Per-bottle `git config --global user.name` / `user.email`
|
||||
pair (issue #86). The agent's commits inside the bottle are
|
||||
attributed to this identity rather than the agent image's
|
||||
@@ -244,9 +251,9 @@ class GitUser:
|
||||
email: str = ""
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, bottle_name: str, raw: object) -> "GitUser":
|
||||
def from_dict(cls, bottle_name: str, raw: object) -> "ManifestGitUser":
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' git-gate.user")
|
||||
for k in d.keys():
|
||||
for k in d:
|
||||
if k not in {"name", "email"}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' git-gate.user has unknown key {k!r}; "
|
||||
@@ -279,9 +286,9 @@ class GitUser:
|
||||
def parse_git_gate_config(
|
||||
bottle_name: str,
|
||||
raw: object,
|
||||
) -> tuple[tuple[GitEntry, ...], GitUser]:
|
||||
) -> tuple[tuple[ManifestGitEntry, ...], ManifestGitUser]:
|
||||
d = as_json_object(raw, f"bottle '{bottle_name}' git-gate")
|
||||
for k in d.keys():
|
||||
for k in d:
|
||||
if k not in {"user", "repos"}:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' git-gate has unknown key {k!r}; "
|
||||
@@ -289,17 +296,17 @@ def parse_git_gate_config(
|
||||
)
|
||||
|
||||
git_user = (
|
||||
GitUser.from_dict(bottle_name, d["user"])
|
||||
ManifestGitUser.from_dict(bottle_name, d["user"])
|
||||
if "user" in d
|
||||
else GitUser()
|
||||
else ManifestGitUser()
|
||||
)
|
||||
|
||||
git: tuple[GitEntry, ...] = ()
|
||||
git: tuple[ManifestGitEntry, ...] = ()
|
||||
repos_raw = d.get("repos")
|
||||
if repos_raw is not None:
|
||||
repos = as_json_object(repos_raw, f"bottle '{bottle_name}' git-gate.repos")
|
||||
git = tuple(
|
||||
GitEntry.from_repos_entry(bottle_name, name, entry)
|
||||
ManifestGitEntry.from_repos_entry(bottle_name, name, entry)
|
||||
for name, entry in repos.items()
|
||||
)
|
||||
validate_unique_git_names(bottle_name, git)
|
||||
|
||||
@@ -8,21 +8,19 @@ from typing import TYPE_CHECKING
|
||||
from .log import warn
|
||||
from .manifest_schema import (
|
||||
entity_name_from_path,
|
||||
validate_agent_frontmatter_keys,
|
||||
validate_bottle_frontmatter_keys,
|
||||
)
|
||||
from .manifest_util import ManifestError
|
||||
from .yaml_subset import YamlSubsetError, parse_frontmatter
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .manifest import Agent, Bottle
|
||||
from .manifest import ManifestBottle
|
||||
|
||||
|
||||
def check_stale_json(dir_path: Path, md_dir: Path, label: str) -> None:
|
||||
"""Die if `<dir_path>/bot-bottle.json` exists but `md_dir` does
|
||||
not. The manifest format changed in PRD 0011 and we do not want
|
||||
to silently leave the JSON content unused."""
|
||||
from .manifest import ManifestError
|
||||
|
||||
legacy = dir_path / "bot-bottle.json"
|
||||
if legacy.is_file() and not md_dir.exists():
|
||||
raise ManifestError(
|
||||
@@ -34,48 +32,13 @@ def check_stale_json(dir_path: Path, md_dir: Path, label: str) -> None:
|
||||
)
|
||||
|
||||
|
||||
def load_bottles_from_dir(bottles_dir: Path) -> dict[str, Bottle]:
|
||||
"""Walk `<bottles_dir>/*.md`, parse each as a bottle, and return
|
||||
`{name: Bottle}`. Missing dir returns an empty dict."""
|
||||
from .manifest import ManifestError
|
||||
from .manifest_extends import resolve_bottles
|
||||
def scan_agent_names(agents_dir: Path) -> dict[str, Path]:
|
||||
"""Scan `<agents_dir>/*.md` for valid filenames and return `{name: path}`.
|
||||
|
||||
raws: dict[str, dict[str, object]] = {}
|
||||
if not bottles_dir.is_dir():
|
||||
return {}
|
||||
for path in sorted(bottles_dir.glob("*.md")):
|
||||
name = entity_name_from_path(path)
|
||||
if name is None:
|
||||
warn(
|
||||
f"skipping {path}: filename must match "
|
||||
f"[a-z][a-z0-9-]*.md (got {path.name!r})"
|
||||
)
|
||||
continue
|
||||
try:
|
||||
fm, _body = parse_frontmatter(path.read_text())
|
||||
except OSError as e:
|
||||
raise ManifestError(f"could not read {path}: {e}") from e
|
||||
except YamlSubsetError as e:
|
||||
raise ManifestError(f"{path}: {e}") from e
|
||||
validate_bottle_frontmatter_keys(path, fm.keys())
|
||||
raws[name] = fm
|
||||
return resolve_bottles(raws)
|
||||
|
||||
|
||||
def load_agents_from_dir(
|
||||
agents_dir: Path,
|
||||
bottle_names: set[str],
|
||||
*,
|
||||
source: str, # noqa: F841 — unused, but required by interface
|
||||
) -> dict[str, Agent]:
|
||||
"""Walk `<agents_dir>/*.md`, parse each as an agent, and return
|
||||
`{name: Agent}`. The Markdown body becomes the agent's prompt.
|
||||
Missing dir returns an empty dict."""
|
||||
from .manifest import Agent, ManifestError
|
||||
|
||||
out: dict[str, Agent] = {}
|
||||
No file content is read. Invalid filenames are skipped with a warning."""
|
||||
result: dict[str, Path] = {}
|
||||
if not agents_dir.is_dir():
|
||||
return out
|
||||
return result
|
||||
for path in sorted(agents_dir.glob("*.md")):
|
||||
name = entity_name_from_path(path)
|
||||
if name is None:
|
||||
@@ -84,22 +47,45 @@ def load_agents_from_dir(
|
||||
f"[a-z][a-z0-9-]*.md (got {path.name!r})"
|
||||
)
|
||||
continue
|
||||
result[name] = path
|
||||
return result
|
||||
|
||||
|
||||
def load_bottle_chain_from_dir(
|
||||
bottle_name: str, bottles_dir: Path
|
||||
) -> ManifestBottle:
|
||||
"""Load `bottle_name` and its full `extends:` chain from `bottles_dir`,
|
||||
returning the resolved ManifestBottle.
|
||||
|
||||
Only the files in the extends chain are read — unrelated bottle files
|
||||
are never touched. Raises ManifestError on parse or validation failure."""
|
||||
from .manifest_extends import resolve_bottles
|
||||
|
||||
raws: dict[str, dict[str, object]] = {}
|
||||
to_load = [bottle_name]
|
||||
while to_load:
|
||||
name = to_load.pop()
|
||||
if name in raws:
|
||||
continue
|
||||
path = bottles_dir / f"{name}.md"
|
||||
if not path.is_file():
|
||||
avail = ", ".join(
|
||||
p.stem for p in sorted(bottles_dir.glob("*.md")) if p.is_file()
|
||||
) or "(none)"
|
||||
raise ManifestError(
|
||||
f"bottle '{name}' not found at {path}. "
|
||||
f"Available: {avail}"
|
||||
)
|
||||
try:
|
||||
fm, body = parse_frontmatter(path.read_text())
|
||||
fm, _body = parse_frontmatter(path.read_text())
|
||||
except OSError as e:
|
||||
raise ManifestError(f"could not read {path}: {e}") from e
|
||||
except YamlSubsetError as e:
|
||||
raise ManifestError(f"{path}: {e}") from e
|
||||
validate_agent_frontmatter_keys(path, fm.keys())
|
||||
# Build the dict Agent.from_dict expects. The body becomes
|
||||
# prompt; Claude Code passthrough fields stay in fm and get
|
||||
# ignored by Agent.from_dict (reads bottle/skills/git-gate/prompt).
|
||||
agent_dict: dict[str, object] = {
|
||||
"bottle": fm.get("bottle"),
|
||||
"skills": fm.get("skills", []),
|
||||
"prompt": body.strip(),
|
||||
}
|
||||
if "git-gate" in fm:
|
||||
agent_dict["git-gate"] = fm["git-gate"]
|
||||
out[name] = Agent.from_dict(name, agent_dict, bottle_names)
|
||||
return out
|
||||
validate_bottle_frontmatter_keys(path, fm.keys())
|
||||
raws[name] = dict(fm)
|
||||
parent = fm.get("extends")
|
||||
if isinstance(parent, str):
|
||||
to_load.append(parent)
|
||||
|
||||
return resolve_bottles(raws)[bottle_name]
|
||||
|
||||
@@ -1,541 +0,0 @@
|
||||
"""Pipelock sidecar lifecycle for the per-agent egress topology.
|
||||
|
||||
Pipelock (https://github.com/luckyPipewrench/pipelock) is an HTTP
|
||||
forward proxy with hostname allowlisting + DLP scanning + URL-entropy
|
||||
checks. One sidecar per agent, attached to the agent's --internal
|
||||
network and a per-agent user-defined egress bridge.
|
||||
|
||||
Post-PRD-0017 topology: the agent's HTTP_PROXY points at egress
|
||||
(not pipelock); egress sets `HTTPS_PROXY=pipelock` on its
|
||||
outbound leg. So pipelock no longer sees the agent's connections
|
||||
directly — it sees the egress → upstream leg, applies the
|
||||
hostname allowlist + DLP body scan there, and forwards to the real
|
||||
upstream.
|
||||
|
||||
Image pin: ghcr.io/luckypipewrench/pipelock@sha256:<digest> for tag 2.3.0.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import cast
|
||||
|
||||
from .egress import EgressRoute, egress_routes_for_bottle
|
||||
from .supervise import SUPERVISE_HOSTNAME
|
||||
from .manifest import Bottle
|
||||
|
||||
# Hosts pipelock should NOT TLS-MITM, even when tls_interception is
|
||||
# enabled. This is now route-owned manifest policy via
|
||||
# `egress.routes[].pipelock.tls_passthrough`; no provider hosts are
|
||||
# injected implicitly.
|
||||
DEFAULT_TLS_PASSTHROUGH: tuple[str, ...] = ()
|
||||
|
||||
|
||||
# In-container paths the rendered pipelock YAML references under
|
||||
# `tls_interception`. The pipelock binary expects the per-bottle CA
|
||||
# cert + key at these exact paths inside its container — independent
|
||||
# of how the daemon is wrapped (own container, sidecar bundle, etc.),
|
||||
# which is why they live in the platform-neutral module.
|
||||
PIPELOCK_CA_CERT_IN_CONTAINER = "/etc/pipelock-ca.pem"
|
||||
PIPELOCK_CA_KEY_IN_CONTAINER = "/etc/pipelock-ca-key.pem"
|
||||
|
||||
|
||||
# Short network alias for pipelock inside the sidecar bundle. The
|
||||
# agent's HTTP_PROXY (when no egress is declared) and any in-bundle
|
||||
# consumer's URL both reference this name.
|
||||
PIPELOCK_HOSTNAME = "pipelock"
|
||||
|
||||
|
||||
# --- Allowlist resolution --------------------------------------------------
|
||||
|
||||
|
||||
def pipelock_effective_allowlist(
|
||||
bottle: Bottle,
|
||||
provider_routes: tuple[EgressRoute, ...] = (),
|
||||
) -> list[str]:
|
||||
"""Hostnames pipelock allows. Sorted for stability.
|
||||
|
||||
Always mirrors `egress_routes_for_bottle(bottle, provider_routes)` —
|
||||
egress is the single allowlist surface, and pipelock's allowlist is
|
||||
the downstream copy for defense-in-depth + DLP body scanning. For
|
||||
bottles without any `egress.routes[]` declared, this is empty except
|
||||
for supervise sidecar traffic when `supervise: true`.
|
||||
|
||||
The supervise sidecar's hostname is auto-added when supervise
|
||||
is enabled (sibling-sidecar traffic that flows through pipelock
|
||||
would otherwise be 403'd). Git upstreams declared in
|
||||
`bottle.git` do NOT contribute here — git traffic flows
|
||||
through git-gate (PRD 0008), not pipelock."""
|
||||
seen: dict[str, None] = {}
|
||||
for r in egress_routes_for_bottle(bottle, provider_routes):
|
||||
if r.host:
|
||||
seen.setdefault(r.host, None)
|
||||
if bottle.supervise:
|
||||
seen.setdefault(SUPERVISE_HOSTNAME, None)
|
||||
return sorted(seen.keys())
|
||||
|
||||
|
||||
def pipelock_seed_phrase_detection_enabled(bottle: Bottle) -> bool:
|
||||
"""Whether pipelock's BIP-39 seed-phrase detector stays on.
|
||||
|
||||
LLM conversation bodies legitimately trip the detector — any 12+
|
||||
English words that pass the BIP-39 checksum match — so agents can
|
||||
get blocked on ordinary prompts/responses regardless of provider
|
||||
(Claude, Codex/OpenAI, or future harnesses). We tried two narrower
|
||||
knobs first:
|
||||
|
||||
- `suppress: [{rule, path}]` — pipelock accepts the schema
|
||||
but the entry only silences the alert; the body_dlp block
|
||||
still fires.
|
||||
- `rules.disabled: ["dlp:BIP-39 Seed Phrase"]` — same shape,
|
||||
same outcome: 403 still returned.
|
||||
|
||||
Empirically only `seed_phrase_detection.enabled: false`
|
||||
actually stops the block (verified by sending a 12-word BIP-39
|
||||
body through three pipelock instances). It is a global toggle —
|
||||
no per-path / per-host knob in pipelock 2.3.0 — so we turn off
|
||||
only this detector for every bottle. The rest of pipelock's DLP
|
||||
defaults and request-body/header scanning remain enabled."""
|
||||
del bottle # kept for call-site stability and future policy knobs.
|
||||
return False
|
||||
|
||||
|
||||
def pipelock_effective_tls_passthrough(
|
||||
bottle: Bottle,
|
||||
provider_routes: tuple[EgressRoute, ...] = (),
|
||||
) -> list[str]:
|
||||
"""Hostnames pipelock should pass through (no TLS MITM).
|
||||
|
||||
A manifest route opts in with `pipelock.tls_passthrough: true`
|
||||
(lifted into `EgressRoute.tls_passthrough` in `egress_manifest_routes`).
|
||||
Provider routes that set `tls_passthrough=True` (e.g. Codex credential
|
||||
routes where egress injects the host bearer after the agent boundary)
|
||||
are also included. Both arrive via `egress_routes_for_bottle` — no
|
||||
provider-specific branching needed here.
|
||||
"""
|
||||
seen: dict[str, None] = {host: None for host in DEFAULT_TLS_PASSTHROUGH}
|
||||
for route in egress_routes_for_bottle(bottle, provider_routes):
|
||||
if route.tls_passthrough:
|
||||
seen.setdefault(route.host, None)
|
||||
return sorted(seen.keys())
|
||||
|
||||
|
||||
def pipelock_effective_ssrf_ip_allowlist(
|
||||
bottle: Bottle,
|
||||
extra: tuple[str, ...] = (),
|
||||
) -> list[str]:
|
||||
"""IP/CIDR entries that bypass pipelock's SSRF destination guard.
|
||||
|
||||
Launch code can pass backend-owned entries through `extra`, while
|
||||
route-owned entries come from `pipelock.ssrf_ip_allowlist`.
|
||||
"""
|
||||
seen: dict[str, None] = {ip: None for ip in extra}
|
||||
for route in bottle.egress.routes:
|
||||
for ip in route.Pipelock.SsrfIpAllowlist:
|
||||
seen.setdefault(ip, None)
|
||||
return sorted(seen.keys())
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
# --- Config build + YAML render --------------------------------------------
|
||||
|
||||
|
||||
def pipelock_build_config(
|
||||
bottle: Bottle,
|
||||
*,
|
||||
ca_cert_path: str = "",
|
||||
ca_key_path: str = "",
|
||||
ssrf_ip_allowlist: tuple[str, ...] = (),
|
||||
provider_routes: tuple[EgressRoute, ...] = (),
|
||||
) -> dict[str, object]:
|
||||
"""Build the structured pipelock config dict the sidecar will load.
|
||||
|
||||
Deliberately carries no env values, no secrets, no per-agent
|
||||
customization beyond the resolved hostname list. The shape mirrors
|
||||
the YAML pipelock expects on disk; `pipelock_render_yaml` serializes
|
||||
it. Tests assert on this dict; production code renders it.
|
||||
|
||||
`ca_cert_path` / `ca_key_path` are the **in-container** paths the
|
||||
pipelock sidecar will read its CA from at runtime (they're
|
||||
populated into the container at start time via `docker cp`).
|
||||
Pass both or neither: both → emit `tls_interception` block with
|
||||
`enabled: true`; neither → omit the block entirely (pipelock
|
||||
falls back to its built-in default of `enabled: false`). Used
|
||||
by PRD 0006 to turn on pipelock's native TLS interception.
|
||||
|
||||
`ssrf_ip_allowlist` is the list of IPs / CIDRs that bypass
|
||||
pipelock's SSRF guard. Pipelock blocks RFC1918-resolved
|
||||
destinations by default, which would catch sibling-sidecar
|
||||
traffic on the bottle's internal Docker network in 172.x space
|
||||
(e.g. egress → pipelock on the upstream leg). Pass the
|
||||
bottle's internal network CIDR here so internal-network requests
|
||||
pass through pipelock while api_allowlist + body-scanning still
|
||||
apply. Empty by default; omitted from the rendered yaml when
|
||||
empty so pipelock keeps its built-in SSRF defaults."""
|
||||
cfg: dict[str, object] = {
|
||||
"version": 1,
|
||||
"mode": "strict",
|
||||
"enforce": True,
|
||||
"api_allowlist": pipelock_effective_allowlist(bottle, provider_routes),
|
||||
"forward_proxy": {"enabled": True},
|
||||
}
|
||||
if not pipelock_seed_phrase_detection_enabled(bottle):
|
||||
cfg["seed_phrase_detection"] = {"enabled": False}
|
||||
cfg["dlp"] = {"include_defaults": True, "scan_env": True}
|
||||
# Body-scan enforcement is a separate pipelock section (each DLP
|
||||
# "surface" — body, MCP, response — has its own action). Pipelock's
|
||||
# built-in default for request_body_scanning is "warn" (forward
|
||||
# with a log line); bot-bottle hard-codes "block" so a hit
|
||||
# actually stops the request from leaving the egress network.
|
||||
#
|
||||
# `scan_headers: true` + `header_mode: all` extends the scan to
|
||||
# every request header — pipelock's default `header_mode:
|
||||
# sensitive` only checks Authorization / Cookie / X-Api-Key /
|
||||
# X-Token / Proxy-Authorization / X-Goog-Api-Key, which an
|
||||
# agent attempting to exfil could trivially avoid by picking
|
||||
# a non-sensitive header name. "all" closes the gap; pipelock
|
||||
# caps it at the same max_body_bytes the body scan uses.
|
||||
cfg["request_body_scanning"] = {
|
||||
"action": "block",
|
||||
"scan_headers": True,
|
||||
"header_mode": "all",
|
||||
}
|
||||
if ca_cert_path or ca_key_path:
|
||||
if not (ca_cert_path and ca_key_path):
|
||||
raise ValueError(
|
||||
"pipelock_build_config: pass both ca_cert_path and ca_key_path "
|
||||
"to enable tls_interception, or neither to leave it off"
|
||||
)
|
||||
cfg["tls_interception"] = {
|
||||
"enabled": True,
|
||||
"ca_cert": ca_cert_path,
|
||||
"ca_key": ca_key_path,
|
||||
"passthrough_domains": pipelock_effective_tls_passthrough(bottle, provider_routes),
|
||||
}
|
||||
effective_ssrf_ip_allowlist = pipelock_effective_ssrf_ip_allowlist(
|
||||
bottle, ssrf_ip_allowlist,
|
||||
)
|
||||
if effective_ssrf_ip_allowlist:
|
||||
cfg["ssrf"] = {"ip_allowlist": effective_ssrf_ip_allowlist}
|
||||
return cfg
|
||||
|
||||
|
||||
_PIPELOCK_TOP_LEVEL_KEYS = {
|
||||
"version",
|
||||
"mode",
|
||||
"enforce",
|
||||
"api_allowlist",
|
||||
"seed_phrase_detection",
|
||||
"forward_proxy",
|
||||
"dlp",
|
||||
"request_body_scanning",
|
||||
"tls_interception",
|
||||
"ssrf",
|
||||
}
|
||||
|
||||
|
||||
def _pipelock_render_error(section: str, key: str, expected: str) -> ValueError:
|
||||
return ValueError(
|
||||
f"pipelock_render_yaml: {section}.{key} must be {expected}"
|
||||
)
|
||||
|
||||
|
||||
def _reject_unknown_keys(
|
||||
section: str,
|
||||
obj: dict[str, object],
|
||||
allowed: set[str],
|
||||
) -> None:
|
||||
for key in sorted(set(obj) - allowed):
|
||||
raise ValueError(f"pipelock_render_yaml: {section}.{key} is unsupported")
|
||||
|
||||
|
||||
def _required_dict(
|
||||
obj: dict[str, object],
|
||||
section: str,
|
||||
key: str,
|
||||
) -> dict[str, object]:
|
||||
value = obj.get(key)
|
||||
if not isinstance(value, dict):
|
||||
raise _pipelock_render_error(section, key, "a mapping")
|
||||
return cast(dict[str, object], value)
|
||||
|
||||
|
||||
def _required_bool(obj: dict[str, object], section: str, key: str) -> bool:
|
||||
value = obj.get(key)
|
||||
if not isinstance(value, bool):
|
||||
raise _pipelock_render_error(section, key, "a boolean")
|
||||
return value
|
||||
|
||||
|
||||
def _required_int(obj: dict[str, object], section: str, key: str) -> int:
|
||||
value = obj.get(key)
|
||||
if isinstance(value, bool) or not isinstance(value, int):
|
||||
raise _pipelock_render_error(section, key, "an integer")
|
||||
return value
|
||||
|
||||
|
||||
def _required_str(obj: dict[str, object], section: str, key: str) -> str:
|
||||
value = obj.get(key)
|
||||
if not isinstance(value, str):
|
||||
raise _pipelock_render_error(section, key, "a string")
|
||||
return value
|
||||
|
||||
|
||||
def _required_str_list(
|
||||
obj: dict[str, object],
|
||||
section: str,
|
||||
key: str,
|
||||
) -> list[str]:
|
||||
value = obj.get(key)
|
||||
if not isinstance(value, list):
|
||||
raise _pipelock_render_error(section, key, "a list of strings")
|
||||
value_list = cast(list[object], value)
|
||||
if not all(isinstance(v, str) for v in value_list):
|
||||
raise _pipelock_render_error(section, key, "a list of strings")
|
||||
return cast(list[str], value)
|
||||
|
||||
|
||||
def _optional_str_list(
|
||||
obj: dict[str, object],
|
||||
section: str,
|
||||
key: str,
|
||||
) -> list[str]:
|
||||
if key not in obj:
|
||||
return []
|
||||
return _required_str_list(obj, section, key)
|
||||
|
||||
|
||||
def _optional_bool(
|
||||
obj: dict[str, object],
|
||||
section: str,
|
||||
key: str,
|
||||
) -> bool | None:
|
||||
if key not in obj:
|
||||
return None
|
||||
return _required_bool(obj, section, key)
|
||||
|
||||
|
||||
def _optional_str(
|
||||
obj: dict[str, object],
|
||||
section: str,
|
||||
key: str,
|
||||
) -> str | None:
|
||||
if key not in obj:
|
||||
return None
|
||||
return _required_str(obj, section, key)
|
||||
|
||||
|
||||
def _validate_pipelock_render_config(cfg: dict[str, object]) -> dict[str, object]:
|
||||
_reject_unknown_keys("config", cfg, _PIPELOCK_TOP_LEVEL_KEYS)
|
||||
normalized: dict[str, object] = {
|
||||
"version": _required_int(cfg, "config", "version"),
|
||||
"mode": _required_str(cfg, "config", "mode"),
|
||||
"enforce": _required_bool(cfg, "config", "enforce"),
|
||||
"api_allowlist": _required_str_list(cfg, "config", "api_allowlist"),
|
||||
}
|
||||
|
||||
if "seed_phrase_detection" in cfg:
|
||||
spd = _required_dict(cfg, "config", "seed_phrase_detection")
|
||||
_reject_unknown_keys("seed_phrase_detection", spd, {"enabled"})
|
||||
normalized["seed_phrase_detection"] = {
|
||||
"enabled": _required_bool(spd, "seed_phrase_detection", "enabled"),
|
||||
}
|
||||
|
||||
fp = _required_dict(cfg, "config", "forward_proxy")
|
||||
_reject_unknown_keys("forward_proxy", fp, {"enabled"})
|
||||
normalized["forward_proxy"] = {
|
||||
"enabled": _required_bool(fp, "forward_proxy", "enabled"),
|
||||
}
|
||||
|
||||
dlp = _required_dict(cfg, "config", "dlp")
|
||||
_reject_unknown_keys("dlp", dlp, {"include_defaults", "scan_env"})
|
||||
normalized["dlp"] = {
|
||||
"include_defaults": _required_bool(dlp, "dlp", "include_defaults"),
|
||||
"scan_env": _required_bool(dlp, "dlp", "scan_env"),
|
||||
}
|
||||
|
||||
rbs = _required_dict(cfg, "config", "request_body_scanning")
|
||||
_reject_unknown_keys(
|
||||
"request_body_scanning",
|
||||
rbs,
|
||||
{"action", "scan_headers", "header_mode"},
|
||||
)
|
||||
normalized_rbs: dict[str, object] = {
|
||||
"action": _required_str(rbs, "request_body_scanning", "action"),
|
||||
}
|
||||
scan_headers = _optional_bool(rbs, "request_body_scanning", "scan_headers")
|
||||
if scan_headers is not None:
|
||||
normalized_rbs["scan_headers"] = scan_headers
|
||||
header_mode = _optional_str(rbs, "request_body_scanning", "header_mode")
|
||||
if header_mode is not None:
|
||||
normalized_rbs["header_mode"] = header_mode
|
||||
normalized["request_body_scanning"] = normalized_rbs
|
||||
|
||||
if "tls_interception" in cfg:
|
||||
tls = _required_dict(cfg, "config", "tls_interception")
|
||||
_reject_unknown_keys(
|
||||
"tls_interception",
|
||||
tls,
|
||||
{"enabled", "ca_cert", "ca_key", "passthrough_domains"},
|
||||
)
|
||||
normalized["tls_interception"] = {
|
||||
"enabled": _required_bool(tls, "tls_interception", "enabled"),
|
||||
"ca_cert": _required_str(tls, "tls_interception", "ca_cert"),
|
||||
"ca_key": _required_str(tls, "tls_interception", "ca_key"),
|
||||
"passthrough_domains": _optional_str_list(
|
||||
tls, "tls_interception", "passthrough_domains",
|
||||
),
|
||||
}
|
||||
|
||||
if "ssrf" in cfg:
|
||||
ssrf = _required_dict(cfg, "config", "ssrf")
|
||||
_reject_unknown_keys("ssrf", ssrf, {"ip_allowlist"})
|
||||
normalized["ssrf"] = {
|
||||
"ip_allowlist": _required_str_list(ssrf, "ssrf", "ip_allowlist"),
|
||||
}
|
||||
|
||||
return normalized
|
||||
|
||||
|
||||
def pipelock_render_yaml(cfg: dict[str, object]) -> str:
|
||||
"""Render a pipelock config dict (as produced by
|
||||
`pipelock_build_config`) as YAML. Hand-rolled so we don't take a
|
||||
YAML-parser dependency for a fixed, narrow shape."""
|
||||
def _bool(b: object) -> str:
|
||||
return "true" if b else "false"
|
||||
|
||||
cfg = _validate_pipelock_render_config(cfg)
|
||||
lines: list[str] = []
|
||||
lines.append(f"version: {cfg['version']}")
|
||||
lines.append(f"mode: {cfg['mode']}")
|
||||
lines.append(f"enforce: {_bool(cast(bool, cfg['enforce']))}")
|
||||
lines.append("")
|
||||
lines.append("api_allowlist:")
|
||||
api_allowlist = cast(list[str], cfg["api_allowlist"])
|
||||
for h in api_allowlist:
|
||||
lines.append(f' - "{h}"')
|
||||
lines.append("")
|
||||
if "seed_phrase_detection" in cfg:
|
||||
lines.append("seed_phrase_detection:")
|
||||
spd = cast(dict[str, object], cfg["seed_phrase_detection"])
|
||||
lines.append(f" enabled: {_bool(cast(bool, spd['enabled']))}")
|
||||
lines.append("")
|
||||
lines.append("forward_proxy:")
|
||||
fp = cast(dict[str, object], cfg["forward_proxy"])
|
||||
lines.append(f" enabled: {_bool(cast(bool, fp['enabled']))}")
|
||||
lines.append("")
|
||||
lines.append("dlp:")
|
||||
dlp = cast(dict[str, object], cfg["dlp"])
|
||||
lines.append(f" include_defaults: {_bool(cast(bool, dlp['include_defaults']))}")
|
||||
lines.append(f" scan_env: {_bool(cast(bool, dlp['scan_env']))}")
|
||||
lines.append("")
|
||||
lines.append("request_body_scanning:")
|
||||
rbs = cast(dict[str, object], cfg["request_body_scanning"])
|
||||
lines.append(f' action: "{cast(str, rbs["action"])}"')
|
||||
if "scan_headers" in rbs:
|
||||
lines.append(f" scan_headers: {_bool(cast(bool, rbs['scan_headers']))}")
|
||||
if "header_mode" in rbs:
|
||||
lines.append(f' header_mode: "{cast(str, rbs["header_mode"])}"')
|
||||
if "tls_interception" in cfg:
|
||||
lines.append("")
|
||||
lines.append("tls_interception:")
|
||||
tls = cast(dict[str, object], cfg["tls_interception"])
|
||||
lines.append(f" enabled: {_bool(cast(bool, tls['enabled']))}")
|
||||
lines.append(f' ca_cert: "{cast(str, tls["ca_cert"])}"')
|
||||
lines.append(f' ca_key: "{cast(str, tls["ca_key"])}"')
|
||||
passthrough = cast(list[str], tls["passthrough_domains"])
|
||||
if passthrough:
|
||||
lines.append(" passthrough_domains:")
|
||||
for d in passthrough:
|
||||
lines.append(f' - "{d}"')
|
||||
if "ssrf" in cfg:
|
||||
lines.append("")
|
||||
lines.append("ssrf:")
|
||||
ssrf = cast(dict[str, object], cfg["ssrf"])
|
||||
lines.append(" ip_allowlist:")
|
||||
ip_allowlist = cast(list[str], ssrf["ip_allowlist"])
|
||||
for ip in ip_allowlist:
|
||||
lines.append(f' - "{ip}"')
|
||||
return "\n".join(lines) + "\n"
|
||||
|
||||
|
||||
# --- Proxy class -----------------------------------------------------------
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class PipelockProxyPlan:
|
||||
"""Output of PipelockProxy.prepare; consumed by .start when the
|
||||
sidecar needs to be brought up.
|
||||
|
||||
yaml_path + slug are filled in at prepare time (host-side, side-
|
||||
effect-free; the YAML references the in-container CA paths
|
||||
already so it doesn't need the host paths to be valid). The
|
||||
remaining fields are populated by the backend's launch step
|
||||
via `dataclasses.replace`: internal/egress networks once
|
||||
those networks exist, the CA host paths once the one-shot
|
||||
`pipelock tls init` has run, and `internal_network_cidr` once
|
||||
Docker has assigned a subnet to the internal network. Empty
|
||||
defaults are sentinels meaning "not yet set"; `.start` validates
|
||||
that they are populated.
|
||||
|
||||
`internal_network_cidr` ends up on pipelock's `ssrf.ip_allowlist`
|
||||
so traffic from sibling sidecars (egress → pipelock on the
|
||||
upstream leg, etc.) bypasses pipelock's RFC1918 SSRF guard while
|
||||
api_allowlist and body-scanning still apply."""
|
||||
|
||||
yaml_path: Path
|
||||
slug: str
|
||||
internal_network: str = ""
|
||||
internal_network_cidr: str = ""
|
||||
egress_network: str = ""
|
||||
ca_cert_host_path: Path = Path()
|
||||
ca_key_host_path: Path = Path()
|
||||
|
||||
|
||||
class PipelockProxy:
|
||||
"""The pipelock egress proxy. Encapsulates the YAML-config
|
||||
generation; the container lifecycle is owned by whatever
|
||||
wraps the daemon (compose-managed pipelock container on docker,
|
||||
sidecar-bundle PID 1 on smolmachines).
|
||||
|
||||
Backends instantiate the class directly — there are no
|
||||
platform-specific subclasses; the in-container CA paths are
|
||||
universal module-level constants
|
||||
(`PIPELOCK_CA_CERT_IN_CONTAINER` / `PIPELOCK_CA_KEY_IN_CONTAINER`)."""
|
||||
|
||||
def prepare(
|
||||
self,
|
||||
bottle: Bottle,
|
||||
slug: str,
|
||||
stage_dir: Path,
|
||||
provider_routes: tuple[EgressRoute, ...] = (),
|
||||
) -> PipelockProxyPlan:
|
||||
"""Write the pipelock yaml config (mode 600) under `stage_dir`
|
||||
and return the plan for launch. Pure host-side, no docker
|
||||
subprocess.
|
||||
|
||||
`slug` is the agent-derived identifier (lowercased,
|
||||
hyphen-normalized) used as the suffix in every per-agent
|
||||
resource name — the agent container, the sidecar bundle
|
||||
container, the internal/egress networks. It's stored on the
|
||||
returned plan so the backend's launch step can derive those
|
||||
names.
|
||||
|
||||
The CA paths the YAML references are the module-level
|
||||
in-container constants. The host-side counterparts are
|
||||
generated by the launch step (not here, so prepare stays
|
||||
side-effect-free on docker) and added to the plan via
|
||||
`dataclasses.replace` before the daemon starts."""
|
||||
yaml_path = stage_dir / "pipelock.yaml"
|
||||
cfg = pipelock_build_config(
|
||||
bottle,
|
||||
ca_cert_path=PIPELOCK_CA_CERT_IN_CONTAINER,
|
||||
ca_key_path=PIPELOCK_CA_KEY_IN_CONTAINER,
|
||||
provider_routes=provider_routes,
|
||||
)
|
||||
yaml_path.write_text(pipelock_render_yaml(cfg))
|
||||
yaml_path.chmod(0o600)
|
||||
return PipelockProxyPlan(yaml_path=yaml_path, slug=slug)
|
||||
+27
-39
@@ -1,7 +1,7 @@
|
||||
"""Per-bottle sidecar supervisor (PRD 0024 chunk 1).
|
||||
|
||||
PID 1 inside the `bot-bottle-sidecars` bundle image. Spawns
|
||||
the configured daemons (egress, pipelock, git-gate, supervise),
|
||||
the configured daemons (egress, git-gate, supervise),
|
||||
forwards SIGTERM/SIGINT to each child, and propagates per-daemon
|
||||
stdout+stderr to the container log with a `[name] ` prefix.
|
||||
|
||||
@@ -19,7 +19,7 @@ PR; the interim policy is "don't take the bundle down for one
|
||||
sick daemon."
|
||||
|
||||
Daemon subset is env-driven. The compose renderer narrows it via
|
||||
`BOT_BOTTLE_SIDECAR_DAEMONS=egress,pipelock` for bottles that
|
||||
`BOT_BOTTLE_SIDECAR_DAEMONS=egress` for bottles that
|
||||
don't use git-gate or supervise. Default: all daemons.
|
||||
|
||||
Stdlib-only by design — adding supervisord/s6/runit for four
|
||||
@@ -57,15 +57,9 @@ class _DaemonSpec:
|
||||
# Env-var name prefixes that carry egress-only credentials.
|
||||
# `egress_apply.py` assigns `EGRESS_TOKEN_<n>` slots that egress
|
||||
# reads to inject `Authorization` headers on configured routes;
|
||||
# every other daemon in the bundle (especially pipelock with
|
||||
# `scan_env: true`) MUST NOT see these values or it'll match the
|
||||
# injected token in the request egress just sent and 403-block
|
||||
# the legitimate traffic (issue #84). The agent itself runs in a
|
||||
# different machine and never has access to these slots in the
|
||||
# first place, so stripping them from non-egress daemons loses no
|
||||
# DLP coverage — pipelock can't catch the exfil of a value the
|
||||
# agent doesn't have.
|
||||
# no other daemon in the bundle should see these values.
|
||||
_EGRESS_ONLY_ENV_PREFIXES: tuple[str, ...] = ("EGRESS_TOKEN_",)
|
||||
_READY_GATED_DAEMONS: tuple[str, ...] = ("git-gate", "git-http")
|
||||
|
||||
|
||||
def _env_for_daemon(name: str, base_env: dict[str, str]) -> dict[str, str]:
|
||||
@@ -81,28 +75,30 @@ def _env_for_daemon(name: str, base_env: dict[str, str]) -> dict[str, str]:
|
||||
}
|
||||
|
||||
|
||||
# Order matters only for first-launch race-window reasons: egress
|
||||
# starts first so pipelock's upstream connect succeeds during
|
||||
# pipelock's own startup. git-gate and supervise are independent.
|
||||
# Pipelock binds 0.0.0.0:8888 explicitly. Without `--listen` it
|
||||
# defaults to 127.0.0.1 which would be unreachable from sibling
|
||||
# services on the docker network. The legacy four-sidecar
|
||||
# compose renderer passed the same flag; the bundle keeps the
|
||||
# explicit binding.
|
||||
_DAEMONS: tuple[_DaemonSpec, ...] = (
|
||||
_DaemonSpec("egress", ("/bin/sh", "/app/egress-entrypoint.sh")),
|
||||
_DaemonSpec(
|
||||
"pipelock",
|
||||
("/usr/local/bin/pipelock", "run",
|
||||
"--config", "/etc/pipelock.yaml",
|
||||
"--listen", "0.0.0.0:8888"),
|
||||
),
|
||||
_DaemonSpec("git-gate", ("/bin/sh", "/git-gate-entrypoint.sh")),
|
||||
_DaemonSpec("git-http", ("python3", "/app/git_http_backend.py")),
|
||||
_DaemonSpec("supervise", ("python3", "/app/supervise_server.py")),
|
||||
)
|
||||
|
||||
|
||||
def _argv_for_daemon(name: str, argv: Sequence[str], env: dict[str, str]) -> list[str]:
|
||||
ready_file = env.get("BOT_BOTTLE_GIT_GATE_READY_FILE", "").strip()
|
||||
if name not in _READY_GATED_DAEMONS or not ready_file:
|
||||
return list(argv)
|
||||
return [
|
||||
"/bin/sh",
|
||||
"-c",
|
||||
"while [ ! -f \"$BOT_BOTTLE_GIT_GATE_READY_FILE\" ]; do "
|
||||
"sleep 0.1; "
|
||||
"done; "
|
||||
"exec \"$@\"",
|
||||
name,
|
||||
*argv,
|
||||
]
|
||||
|
||||
|
||||
def _selected_daemons(
|
||||
env: dict[str, str],
|
||||
all_daemons: Sequence[_DaemonSpec] | None = None,
|
||||
@@ -139,12 +135,13 @@ def _pump(name: str, stream: IO[bytes]) -> None:
|
||||
|
||||
|
||||
def _spawn(spec: _DaemonSpec) -> subprocess.Popen[bytes]:
|
||||
env = _env_for_daemon(spec.name, dict(os.environ))
|
||||
proc = subprocess.Popen(
|
||||
list(spec.argv),
|
||||
_argv_for_daemon(spec.name, spec.argv, env),
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.STDOUT,
|
||||
bufsize=0,
|
||||
env=_env_for_daemon(spec.name, dict(os.environ)),
|
||||
env=env,
|
||||
)
|
||||
threading.Thread(
|
||||
target=_pump, args=(spec.name, proc.stdout), daemon=True
|
||||
@@ -303,10 +300,8 @@ class _Supervisor:
|
||||
|
||||
def restart_daemon(self, daemon_name: str, *, grace: float = 5.0) -> bool:
|
||||
"""Terminate one named child and spawn a fresh one, leaving
|
||||
the other daemons running. Used by the pipelock-apply path:
|
||||
pipelock has no in-process reload, so apply_allowlist_change
|
||||
runs `docker kill --signal USR1 <bundle>` after writing the
|
||||
new yaml; the supervisor catches SIGUSR1 and calls this.
|
||||
the other daemons running. A daemon that has no in-process
|
||||
reload can be restarted this way after its config file changes.
|
||||
|
||||
Behavior: SIGTERM → wait up to `grace` seconds → SIGKILL if
|
||||
still alive → spawn a replacement under the same DaemonSpec.
|
||||
@@ -314,8 +309,8 @@ class _Supervisor:
|
||||
forward_signal / shutdown calls reach the new pid.
|
||||
|
||||
Returns True iff a daemon by that name was running and a
|
||||
replacement spawned; False if no such daemon (the
|
||||
compose-renderer subset said this bottle doesn't run it)."""
|
||||
replacement spawned; False if no such daemon (not wired
|
||||
for this bottle)."""
|
||||
if self.shutdown_at is not None:
|
||||
_log(f"restart {daemon_name} skipped; supervisor is shutting down")
|
||||
return False
|
||||
@@ -367,13 +362,6 @@ def main(argv: Sequence[str] | None = None) -> int:
|
||||
# delivers SIGHUP to PID 1 (this supervisor); forward it to
|
||||
# mitmdump so it reloads its addon.
|
||||
signal.signal(signal.SIGHUP, lambda *_: sup.forward_signal(signal.SIGHUP, "egress")) # type: ignore
|
||||
# SIGUSR1 pipelock-restart path: pipelock_apply.py runs
|
||||
# `docker kill --signal USR1 <bundle>` after writing
|
||||
# pipelock.yaml. Pipelock has no in-process reload, so the
|
||||
# supervisor restarts the pipelock daemon in place (other
|
||||
# daemons keep running — specifically supervise, whose MCP
|
||||
# socket would drop on a whole-container `docker restart`).
|
||||
signal.signal(signal.SIGUSR1, lambda *_: sup.request_restart("pipelock")) # type: ignore
|
||||
|
||||
while not sup.tick():
|
||||
time.sleep(_POLL_INTERVAL)
|
||||
|
||||
+4
-20
@@ -6,8 +6,7 @@ sits on the bottle's internal network and exposes three MCP tools the
|
||||
agent calls when it hits a stuck-recovery category:
|
||||
|
||||
* egress-block — agent proposes a new routes.yaml
|
||||
* pipelock-block — agent proposes a new pipelock allowlist
|
||||
* capability-block — agent proposes a new agent Dockerfile
|
||||
* capability-block — agent proposes a new agent Dockerfile
|
||||
|
||||
Each tool call: the agent passes the full proposed file plus a
|
||||
justification text. The sidecar validates the proposal syntactically,
|
||||
@@ -49,13 +48,9 @@ from pathlib import Path
|
||||
SUPERVISE_HOSTNAME = "supervise"
|
||||
SUPERVISE_PORT = 9100
|
||||
|
||||
TOOL_EGRESS_BLOCK = "egress-block"
|
||||
TOOL_PIPELOCK_BLOCK = "pipelock-block"
|
||||
TOOL_CAPABILITY_BLOCK = "capability-block"
|
||||
TOOL_LIST_EGRESS_ROUTES = "list-egress-routes"
|
||||
TOOLS: tuple[str, ...] = (
|
||||
TOOL_EGRESS_BLOCK,
|
||||
TOOL_PIPELOCK_BLOCK,
|
||||
TOOL_CAPABILITY_BLOCK,
|
||||
TOOL_LIST_EGRESS_ROUTES,
|
||||
)
|
||||
@@ -73,11 +68,8 @@ EGRESS_INTROSPECT_URL = "http://_egress.local/allowlist"
|
||||
# capability-block has no on-disk config the operator edits in place
|
||||
# (the Dockerfile is rebuilt, not patched), so it has no audit log
|
||||
# here — those changes are captured by git history + the rebuild
|
||||
# record laid down in PRD 0016.
|
||||
COMPONENT_FOR_TOOL: dict[str, str] = {
|
||||
TOOL_EGRESS_BLOCK: "egress",
|
||||
TOOL_PIPELOCK_BLOCK: "pipelock",
|
||||
}
|
||||
# record laid down in PRD 0016. egress-block was removed in issue #198.
|
||||
COMPONENT_FOR_TOOL: dict[str, str] = {}
|
||||
|
||||
STATUS_APPROVED = "approved"
|
||||
STATUS_MODIFIED = "modified"
|
||||
@@ -85,8 +77,7 @@ STATUS_REJECTED = "rejected"
|
||||
STATUSES: tuple[str, ...] = (STATUS_APPROVED, STATUS_MODIFIED, STATUS_REJECTED)
|
||||
|
||||
# Operator-initiated audit entries (no tool call). PRD 0014's
|
||||
# `routes edit <bottle>` and PRD 0015's `pipelock edit <bottle>`
|
||||
# verbs write entries with this action.
|
||||
# `routes edit <bottle>` verb writes entries with this action.
|
||||
ACTION_OPERATOR_EDIT = "operator-edit"
|
||||
|
||||
QUEUE_DIR_IN_CONTAINER = "/run/supervise/queue"
|
||||
@@ -474,8 +465,6 @@ class Supervise(ABC):
|
||||
self,
|
||||
slug: str,
|
||||
stage_dir: Path,
|
||||
*,
|
||||
dockerfile_content: str = "",
|
||||
) -> SupervisePlan:
|
||||
"""Stage the per-bottle queue dir on the host and the
|
||||
current-config dir under `stage_dir`. Returns the plan;
|
||||
@@ -485,9 +474,6 @@ class Supervise(ABC):
|
||||
queue_dir.mkdir(parents=True, exist_ok=True)
|
||||
current_config_dir = stage_dir / "current-config"
|
||||
current_config_dir.mkdir(parents=True, exist_ok=True)
|
||||
dockerfile_path = current_config_dir / CURRENT_CONFIG_DOCKERFILE
|
||||
dockerfile_path.write_text(dockerfile_content)
|
||||
dockerfile_path.chmod(0o644)
|
||||
return SupervisePlan(
|
||||
slug=slug,
|
||||
queue_dir=queue_dir,
|
||||
@@ -560,9 +546,7 @@ __all__ = [
|
||||
"EGRESS_FORWARD_PROXY",
|
||||
"EGRESS_INTROSPECT_URL",
|
||||
"TOOL_CAPABILITY_BLOCK",
|
||||
"TOOL_EGRESS_BLOCK",
|
||||
"TOOL_LIST_EGRESS_ROUTES",
|
||||
"TOOL_PIPELOCK_BLOCK",
|
||||
"archive_proposal",
|
||||
"audit_dir",
|
||||
"audit_log_path",
|
||||
|
||||
+17
-229
@@ -1,8 +1,10 @@
|
||||
"""Supervise sidecar HTTP server (PRD 0013).
|
||||
|
||||
Per-bottle MCP server exposing three tools — `egress-block`,
|
||||
`pipelock-block`, `capability-block` — that the agent calls to
|
||||
propose config changes when stuck. Each tool call:
|
||||
Per-bottle MCP server exposing tools the agent calls to propose config
|
||||
changes when stuck. The egress-block tool was removed in issue #198;
|
||||
the remaining tools are `capability-block` and `list-egress-routes`.
|
||||
|
||||
Each queued tool call:
|
||||
|
||||
1. Validates the proposed file syntactically.
|
||||
2. Writes a Proposal to /run/supervise/queue/ (bind-mounted from
|
||||
@@ -18,7 +20,7 @@ Speaks MCP over HTTP+JSON-RPC. Methods handled:
|
||||
|
||||
* `initialize` — handshake; returns server info + caps.
|
||||
* `notifications/initialized` — ack-only.
|
||||
* `tools/list` — returns the three tool definitions.
|
||||
* `tools/list` — returns the tool definitions.
|
||||
* `tools/call` — validates, queues, blocks, returns.
|
||||
|
||||
Everything else returns JSON-RPC error -32601 (method not found).
|
||||
@@ -38,7 +40,6 @@ import sys
|
||||
import time
|
||||
import typing
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
@@ -134,84 +135,15 @@ def jsonrpc_error(request_id: object, code: int, message: str) -> bytes:
|
||||
|
||||
|
||||
TOOL_DEFINITIONS: list[dict[str, object]] = [
|
||||
{
|
||||
"name": _sv.TOOL_EGRESS_BLOCK,
|
||||
"description": (
|
||||
"Call when egress refused your HTTPS request — host "
|
||||
"without a matching route, or a path outside the route's "
|
||||
"path_allowlist (typically a 403 from the proxy). Propose "
|
||||
"a SINGLE route to add: the host you need + (optionally) "
|
||||
"a path_allowlist + (optionally) an auth block. The "
|
||||
"supervisor merges the route into the live table at "
|
||||
"approval time — you do NOT need to see or reproduce the "
|
||||
"existing routes, and you do not pass a full routes file. "
|
||||
"If the host already has a route, the proposed "
|
||||
"path_allowlist entries are unioned with the existing "
|
||||
"ones (host stays single-route). The operator approves "
|
||||
"or rejects in the supervise TUI. On approval the "
|
||||
"supervisor writes the merged routes.yaml, SIGHUPs "
|
||||
"egress (atomic swap, no dropped connections), and "
|
||||
"mirrors the host onto pipelock's allowlist for the "
|
||||
"downstream gate."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"host": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The hostname to allow (e.g. 'api.github.com'). "
|
||||
"Case-insensitive on match."
|
||||
),
|
||||
},
|
||||
"path_allowlist": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": (
|
||||
"Optional URL path prefixes the route permits. "
|
||||
"Each must start with '/'. Omit to allow all "
|
||||
"paths under this host (bare-pass route)."
|
||||
),
|
||||
},
|
||||
"auth": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"Optional credential injection. {scheme, "
|
||||
"token_ref}: scheme is 'Bearer' or 'token'; "
|
||||
"token_ref names the host env var holding the "
|
||||
"secret value. Omit to add a host without "
|
||||
"credential injection. Ignored if the host "
|
||||
"already has a route (operator decides auth "
|
||||
"changes, not the agent)."
|
||||
),
|
||||
"properties": {
|
||||
"scheme": {"type": "string"},
|
||||
"token_ref": {"type": "string"},
|
||||
},
|
||||
"required": ["scheme", "token_ref"],
|
||||
"additionalProperties": False,
|
||||
},
|
||||
"justification": {
|
||||
"type": "string",
|
||||
"description": "Why this host needs to be allowed.",
|
||||
},
|
||||
},
|
||||
"required": ["host", "justification"],
|
||||
},
|
||||
},
|
||||
{
|
||||
"name": _sv.TOOL_LIST_EGRESS_ROUTES,
|
||||
"description": (
|
||||
"List the current egress route table — the bottle's "
|
||||
"primary egress allowlist. Returns JSON with one entry "
|
||||
"per allowed host, each carrying its path_allowlist (if "
|
||||
"any) and whether the proxy injects Authorization for "
|
||||
"the route. Use this before composing an "
|
||||
"`egress-block` proposal so the new routes file "
|
||||
"extends the live one rather than replacing it. "
|
||||
"Pipelock's allowlist is a mirror of this set — every "
|
||||
"host listed here is also reachable through pipelock's "
|
||||
"downstream hostname gate."
|
||||
"allowlist. Returns JSON with one entry per allowed host, "
|
||||
"each carrying its matches rules (if any) and whether "
|
||||
"the proxy injects Authorization for the route. Use this "
|
||||
"before composing an `egress-block` proposal so the new "
|
||||
"routes file extends the live one rather than replacing it."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
@@ -219,48 +151,12 @@ TOOL_DEFINITIONS: list[dict[str, object]] = [
|
||||
"additionalProperties": False,
|
||||
},
|
||||
},
|
||||
{
|
||||
"name": _sv.TOOL_PIPELOCK_BLOCK,
|
||||
"description": (
|
||||
"Call when pipelock refused your outbound request and "
|
||||
"the failing host is genuinely missing from the bottle's "
|
||||
"allowlist (vs. blocked for DLP reasons — those need a "
|
||||
"different remediation). In practice pipelock's allowlist "
|
||||
"is now a mirror of the egress routes set by "
|
||||
"`egress-block`, so prefer that tool when you want "
|
||||
"to add a host. This tool stays available for the rare "
|
||||
"case where pipelock and egress have diverged. "
|
||||
"Pass the full URL you tried to hit (scheme + host + "
|
||||
"path); the supervisor extracts the hostname and merges "
|
||||
"it into pipelock's allowlist. On approval the "
|
||||
"supervisor restarts pipelock."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"failed_url": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The full URL pipelock blocked, e.g. "
|
||||
"https://api.github.com/repos/foo/bar. Scheme "
|
||||
"and hostname are required; path is recorded "
|
||||
"as operator context."
|
||||
),
|
||||
},
|
||||
"justification": {
|
||||
"type": "string",
|
||||
"description": "Why the new host should be allowed.",
|
||||
},
|
||||
},
|
||||
"required": ["failed_url", "justification"],
|
||||
},
|
||||
},
|
||||
{
|
||||
"name": _sv.TOOL_CAPABILITY_BLOCK,
|
||||
"description": (
|
||||
"Call when the bottle is missing a tool, skill, permission, "
|
||||
"or env var you need — something that lives in the agent "
|
||||
"Dockerfile rather than in routes or the pipelock allowlist. "
|
||||
"Dockerfile rather than in the egress routes. "
|
||||
"Read the current Dockerfile from "
|
||||
"/etc/bot-bottle/current-config/Dockerfile, compose a "
|
||||
"modified version, and pass the full new file plus a "
|
||||
@@ -286,27 +182,10 @@ TOOL_DEFINITIONS: list[dict[str, object]] = [
|
||||
]
|
||||
|
||||
|
||||
# Map each tool to the input field that carries the agent's
|
||||
# tool-specific payload (stored in Proposal.proposed_file as
|
||||
# free-form text the apply path interprets per tool).
|
||||
#
|
||||
# egress-block: JSON object describing a SINGLE route to
|
||||
# add — `{host, path_allowlist?, auth?}`. The
|
||||
# supervisor merges this into the live routes
|
||||
# file at approval time.
|
||||
# pipelock-block: the full failed URL (scheme + host + path) —
|
||||
# supervisor extracts the host, merges into the
|
||||
# bottle's current allowlist; the path is shown
|
||||
# to the operator for context (pipelock doesn't
|
||||
# do path-level matching).
|
||||
# capability-block: full proposed Dockerfile
|
||||
#
|
||||
# Egress-proxy-block doesn't use a single "field name" → the JSON
|
||||
# payload is constructed from multiple structured input fields in
|
||||
# `handle_egress_block`. The mapping stays one-entry-per-tool
|
||||
# so the generic dispatch keeps working for the other two.
|
||||
# Map each non-egress tool to the input field that carries the agent's
|
||||
# payload (stored in Proposal.proposed_file). egress-block builds its
|
||||
# payload from structured input fields in `handle_egress_block`.
|
||||
PROPOSED_FILE_FIELD: dict[str, str] = {
|
||||
_sv.TOOL_PIPELOCK_BLOCK: "failed_url",
|
||||
_sv.TOOL_CAPABILITY_BLOCK: "dockerfile",
|
||||
}
|
||||
|
||||
@@ -314,34 +193,13 @@ PROPOSED_FILE_FIELD: dict[str, str] = {
|
||||
# --- Validation ------------------------------------------------------------
|
||||
|
||||
|
||||
# Auth schemes accepted on egress-block proposals — match the
|
||||
# manifest-side EGRESS_AUTH_SCHEMES.
|
||||
_AUTH_SCHEMES = ("Bearer", "token")
|
||||
|
||||
|
||||
def validate_proposed_file(tool: str, content: str) -> None:
|
||||
"""Syntactic validation. The operator is the real gate; this just
|
||||
catches obvious paste-errors / wrong-tool selections before they
|
||||
enter the queue."""
|
||||
if not content.strip():
|
||||
raise _RpcError(ERR_INVALID_PARAMS, f"{tool}: proposed file is empty")
|
||||
if tool == _sv.TOOL_PIPELOCK_BLOCK:
|
||||
# `content` is the full failed URL. Require scheme + host so
|
||||
# the supervisor can extract a hostname for the allowlist
|
||||
# merge; the path is preserved for operator context.
|
||||
parsed = urllib.parse.urlsplit(content.strip())
|
||||
if parsed.scheme not in ("http", "https"):
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: failed_url must start with http:// or https:// "
|
||||
f"(got {content!r})",
|
||||
)
|
||||
if not parsed.hostname:
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: failed_url is missing a hostname (got {content!r})",
|
||||
)
|
||||
elif tool == _sv.TOOL_CAPABILITY_BLOCK:
|
||||
if tool == _sv.TOOL_CAPABILITY_BLOCK:
|
||||
# Dockerfiles are too varied to validate syntactically beyond
|
||||
# non-empty. The operator reads the diff in the TUI.
|
||||
pass
|
||||
@@ -349,70 +207,6 @@ def validate_proposed_file(tool: str, content: str) -> None:
|
||||
raise _RpcError(ERR_INVALID_PARAMS, f"unknown tool {tool!r}")
|
||||
|
||||
|
||||
def _validate_and_bundle_egress_route(
|
||||
args: dict[str, object],
|
||||
) -> str:
|
||||
"""Validate egress-block input fields and bundle them into
|
||||
a JSON string that becomes the Proposal.proposed_file. Raises
|
||||
_RpcError on bad input — the agent retries with a fixed shape."""
|
||||
tool = _sv.TOOL_EGRESS_BLOCK
|
||||
host = args.get("host")
|
||||
if not isinstance(host, str) or not host.strip():
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: 'host' is required and must be a non-empty string",
|
||||
)
|
||||
payload: dict[str, object] = {"host": host}
|
||||
|
||||
path_allow_raw = args.get("path_allowlist")
|
||||
if path_allow_raw is not None:
|
||||
if not isinstance(path_allow_raw, list):
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: 'path_allowlist' must be an array of strings",
|
||||
)
|
||||
prefixes: list[str] = []
|
||||
for i, p in enumerate(path_allow_raw):
|
||||
if not isinstance(p, str):
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: path_allowlist[{i}] must be a string",
|
||||
)
|
||||
if not p.startswith("/"):
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: path_allowlist[{i}] {p!r} must start with '/'",
|
||||
)
|
||||
prefixes.append(p)
|
||||
if prefixes:
|
||||
payload["path_allowlist"] = prefixes
|
||||
|
||||
auth_raw = args.get("auth")
|
||||
if auth_raw is not None:
|
||||
if not isinstance(auth_raw, dict):
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: 'auth' must be an object with 'scheme' and 'token_ref'",
|
||||
)
|
||||
scheme = auth_raw.get("scheme")
|
||||
token_ref = auth_raw.get("token_ref")
|
||||
if not isinstance(scheme, str) or scheme not in _AUTH_SCHEMES:
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: auth.scheme must be one of "
|
||||
f"{', '.join(_AUTH_SCHEMES)} (got {scheme!r})",
|
||||
)
|
||||
if not isinstance(token_ref, str) or not token_ref:
|
||||
raise _RpcError(
|
||||
ERR_INVALID_PARAMS,
|
||||
f"{tool}: auth.token_ref must be a non-empty string "
|
||||
f"naming the host env var holding the token",
|
||||
)
|
||||
payload["auth"] = {"scheme": scheme, "token_ref": token_ref}
|
||||
|
||||
return json.dumps(payload, indent=2) + "\n"
|
||||
|
||||
|
||||
# --- MCP handlers ----------------------------------------------------------
|
||||
|
||||
|
||||
@@ -498,13 +292,7 @@ def handle_tools_call(
|
||||
f"{name}: 'justification' is required and must be a non-empty string",
|
||||
)
|
||||
|
||||
if name == _sv.TOOL_EGRESS_BLOCK:
|
||||
# Structured input → JSON bundle on Proposal.proposed_file.
|
||||
# The dashboard's apply step (egress_apply.add_route)
|
||||
# parses this JSON, fetches the current routes, merges in
|
||||
# the new one, and writes the merged file.
|
||||
proposed_file = _validate_and_bundle_egress_route(args_raw)
|
||||
elif name in PROPOSED_FILE_FIELD:
|
||||
if name in PROPOSED_FILE_FIELD:
|
||||
file_field = PROPOSED_FILE_FIELD[name]
|
||||
proposed_file = args_raw.get(file_field)
|
||||
if not isinstance(proposed_file, str):
|
||||
|
||||
+31
-37
@@ -63,18 +63,12 @@ from typing import cast
|
||||
|
||||
class YamlSubsetError(ValueError):
|
||||
"""Raised when input violates the YAML subset's rules. Callers
|
||||
that want fatal-exit semantics (manifest loader, pipelock-apply,
|
||||
that want fatal-exit semantics (manifest loader, egress-apply,
|
||||
etc.) catch this at their own boundary and forward to `die`;
|
||||
callers running outside the bot-bottle CLI process (the
|
||||
egress sidecar's addon) handle it as a normal exception."""
|
||||
|
||||
|
||||
def die(msg: str) -> None:
|
||||
"""Module-local helper so the parser body reads cleanly. Just
|
||||
raises YamlSubsetError — the `bot-bottle: error: ` prefix
|
||||
is added by the boundary `die` in `bot_bottle.log`."""
|
||||
raise YamlSubsetError(msg)
|
||||
|
||||
|
||||
# --- Tokenizer / line preprocessing ----------------------------------------
|
||||
|
||||
@@ -119,7 +113,7 @@ def _tokenize(text: str) -> list[_Line]:
|
||||
# editors render them differently and the spec says spaces.
|
||||
leading = len(raw) - len(raw.lstrip(" \t"))
|
||||
if "\t" in raw[:leading]:
|
||||
die(f"yaml-subset: tab character in indent on line {n}")
|
||||
raise YamlSubsetError(f"yaml-subset: tab character in indent on line {n}")
|
||||
stripped = raw.strip()
|
||||
if not stripped:
|
||||
continue
|
||||
@@ -169,14 +163,14 @@ def _parse_scalar(s: str, lineno: int) -> object:
|
||||
s.startswith("'") and s.endswith("'")
|
||||
):
|
||||
if len(s) < 2:
|
||||
die(f"yaml-subset: unterminated quoted string on line {lineno}")
|
||||
raise YamlSubsetError(f"yaml-subset: unterminated quoted string on line {lineno}")
|
||||
body = s[1:-1]
|
||||
if s.startswith('"'):
|
||||
# JSON-style escapes for double quotes.
|
||||
try:
|
||||
return body.encode("utf-8").decode("unicode_escape")
|
||||
except UnicodeDecodeError as e:
|
||||
die(f"yaml-subset: bad escape on line {lineno}: {e}")
|
||||
raise YamlSubsetError(f"yaml-subset: bad escape on line {lineno}: {e}")
|
||||
else:
|
||||
# Single quotes: only '' → ' (standard YAML); no other escapes.
|
||||
return body.replace("''", "'")
|
||||
@@ -186,7 +180,7 @@ def _parse_scalar(s: str, lineno: int) -> object:
|
||||
if s in _RESERVED_BOOL_LIKE:
|
||||
if s in ("true", "false"):
|
||||
return s == "true"
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: bare {s!r} on line {lineno} is ambiguous "
|
||||
f"(use literal `true` / `false`, or quote it as a string)"
|
||||
)
|
||||
@@ -203,22 +197,22 @@ def _parse_scalar(s: str, lineno: int) -> object:
|
||||
|
||||
# Look-alikes that we reject to keep the user in control.
|
||||
if _DATE_RX.match(s):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: bare {s!r} on line {lineno} looks like a "
|
||||
f"date — quote it as a string or use an explicit int"
|
||||
)
|
||||
if _OCTAL_RX.match(s):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: bare {s!r} on line {lineno} looks like an "
|
||||
f"octal/0-prefixed integer — quote it as a string"
|
||||
)
|
||||
if _HEX_RX.match(s):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: bare {s!r} on line {lineno} looks like a "
|
||||
f"hex integer — quote it as a string"
|
||||
)
|
||||
if _FLOAT_RX.match(s):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: floats not supported (line {lineno}, "
|
||||
f"value {s!r}); use an int or quote as a string"
|
||||
)
|
||||
@@ -241,7 +235,7 @@ def _parse_inline(s: str, lineno: int) -> object:
|
||||
s = s.strip()
|
||||
if s.startswith("["):
|
||||
if not s.endswith("]"):
|
||||
die(f"yaml-subset: unterminated `[` on line {lineno}")
|
||||
raise YamlSubsetError(f"yaml-subset: unterminated `[` on line {lineno}")
|
||||
body = s[1:-1].strip()
|
||||
if not body:
|
||||
return []
|
||||
@@ -252,21 +246,21 @@ def _parse_inline(s: str, lineno: int) -> object:
|
||||
return items
|
||||
if s.startswith("{"):
|
||||
if not s.endswith("}"):
|
||||
die(f"yaml-subset: unterminated `{{` on line {lineno}")
|
||||
raise YamlSubsetError(f"yaml-subset: unterminated `{{` on line {lineno}")
|
||||
body = s[1:-1].strip()
|
||||
if not body:
|
||||
return {}
|
||||
out: dict[str, object] = {}
|
||||
for raw in _split_flow(body, lineno, "dict"):
|
||||
if ":" not in raw:
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: inline dict entry on line {lineno} "
|
||||
f"missing `:` ({raw!r})"
|
||||
)
|
||||
k, _, v = raw.partition(":")
|
||||
k = k.strip()
|
||||
if not _BARE_RX.match(k):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: inline dict key on line {lineno} "
|
||||
f"must be a bare identifier ({k!r})"
|
||||
)
|
||||
@@ -296,7 +290,7 @@ def _split_flow(body: str, lineno: int, kind: str) -> list[str]:
|
||||
elif ch in "]}":
|
||||
depth_b -= 1
|
||||
if depth_b > 0:
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: nested flow {kind} on line "
|
||||
f"{lineno} (only one level of flow allowed)"
|
||||
)
|
||||
@@ -330,7 +324,7 @@ def _split_key_value(content: str, lineno: int) -> tuple[str, str]:
|
||||
# ambiguous with URLs etc.).
|
||||
if i + 1 >= len(content) or content[i + 1] in (" ", "\t"):
|
||||
return content[:i].strip(), content[i + 1:].lstrip()
|
||||
die(f"yaml-subset: line {lineno} missing `: ` separator: {content!r}")
|
||||
raise YamlSubsetError(f"yaml-subset: line {lineno} missing `: ` separator: {content!r}")
|
||||
return "", "" # unreachable, but needed for type checker
|
||||
|
||||
|
||||
@@ -341,15 +335,15 @@ def _parse_block(
|
||||
to live at `base_indent`. Returns (value, new_idx) where
|
||||
`new_idx` is the index of the first unconsumed line."""
|
||||
if idx >= len(lines):
|
||||
die("yaml-subset: unexpected end of document")
|
||||
raise YamlSubsetError("yaml-subset: unexpected end of document")
|
||||
first = lines[idx]
|
||||
if first.indent < base_indent:
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {first.lineno} indented less than "
|
||||
f"expected (got {first.indent}, expected >= {base_indent})"
|
||||
)
|
||||
if first.indent > base_indent:
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {first.lineno} indented more than "
|
||||
f"expected (got {first.indent}, expected {base_indent})"
|
||||
)
|
||||
@@ -366,18 +360,18 @@ def _parse_block_mapping(
|
||||
while idx < len(lines) and lines[idx].indent == base_indent:
|
||||
line = lines[idx]
|
||||
if line.content.startswith("- "):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {line.lineno} unexpected list "
|
||||
f"item at mapping indent (got `-`, expected `key:`)"
|
||||
)
|
||||
key, value_text = _split_key_value(line.content, line.lineno)
|
||||
if not _BARE_RX.match(key):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {line.lineno} key {key!r} is not "
|
||||
f"a bare identifier"
|
||||
)
|
||||
if key in out:
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {line.lineno} duplicate key {key!r}"
|
||||
)
|
||||
if value_text:
|
||||
@@ -417,7 +411,7 @@ def _parse_block_list(
|
||||
content_col = base_indent + 2
|
||||
first_key, first_value_text = _split_key_value(rest, line.lineno)
|
||||
if not _BARE_RX.match(first_key):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {line.lineno} key {first_key!r} "
|
||||
f"is not a bare identifier"
|
||||
)
|
||||
@@ -440,12 +434,12 @@ def _parse_block_list(
|
||||
break # next list item, not a sibling key
|
||||
k, v_text = _split_key_value(ln.content, ln.lineno)
|
||||
if not _BARE_RX.match(k):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {ln.lineno} key {k!r} is "
|
||||
f"not a bare identifier"
|
||||
)
|
||||
if k in item:
|
||||
die(f"yaml-subset: line {ln.lineno} duplicate key {k!r}")
|
||||
raise YamlSubsetError(f"yaml-subset: line {ln.lineno} duplicate key {k!r}")
|
||||
if v_text:
|
||||
item[k] = _parse_inline(v_text, ln.lineno)
|
||||
idx += 1
|
||||
@@ -501,7 +495,7 @@ def parse_yaml_subset(text: str) -> dict[str, object]:
|
||||
for n, raw in enumerate(text.splitlines(), start=1):
|
||||
s = raw.strip()
|
||||
if s.startswith("|") or s.startswith(">") or s.startswith("- |") or s.startswith("- >"):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {n} uses a multi-line block "
|
||||
f"scalar (`|` / `>`) — not supported. Use a quoted "
|
||||
f"single-line string instead."
|
||||
@@ -511,12 +505,12 @@ def parse_yaml_subset(text: str) -> dict[str, object]:
|
||||
# not when it's inside a quoted string. Cheap check: any
|
||||
# bare `&foo:` / `*foo` at the start of a value position.
|
||||
if re.search(r"(^|\s)[&*][A-Za-z0-9_]+", s):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {n} uses anchors / aliases "
|
||||
f"(`&` / `*`) — not supported."
|
||||
)
|
||||
if "!!" in s and not (s.count("'") % 2 or s.count('"') % 2):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: line {n} uses a YAML tag (`!!`) — not "
|
||||
f"supported."
|
||||
)
|
||||
@@ -526,18 +520,18 @@ def parse_yaml_subset(text: str) -> dict[str, object]:
|
||||
return {}
|
||||
base_indent = lines[0].indent
|
||||
if base_indent != 0:
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: top-level content must start in column 0 "
|
||||
f"(got column {base_indent} on line {lines[0].lineno})"
|
||||
)
|
||||
value, consumed = _parse_block(lines, 0, 0)
|
||||
if consumed < len(lines):
|
||||
die(
|
||||
raise YamlSubsetError(
|
||||
f"yaml-subset: trailing content starting on line "
|
||||
f"{lines[consumed].lineno}"
|
||||
)
|
||||
if not isinstance(value, dict):
|
||||
die("yaml-subset: top-level value must be a mapping")
|
||||
raise YamlSubsetError("yaml-subset: top-level value must be a mapping")
|
||||
return cast(dict[str, object], value)
|
||||
|
||||
|
||||
@@ -576,7 +570,7 @@ def parse_frontmatter(text: str) -> tuple[dict[str, object], str]:
|
||||
fm_end_lineno = line_idx
|
||||
break
|
||||
if body_start < 0:
|
||||
die("frontmatter: opening `---` has no matching closing `---`")
|
||||
raise YamlSubsetError("frontmatter: opening `---` has no matching closing `---`")
|
||||
|
||||
fm_text = text[line_starts[1]:line_starts[fm_end_lineno]] if fm_end_lineno > 1 else ""
|
||||
fm = parse_yaml_subset(fm_text)
|
||||
|
||||
+6
-4
@@ -22,7 +22,9 @@ mounted in. That topology breaks two assumptions those tests make:
|
||||
`http://127.0.0.1:<host_port>` from inside the job time out.
|
||||
|
||||
The affected tests (`test_orphan_cleanup.test_create_and_remove`,
|
||||
`test_pipelock_sidecar_smoke.test_smoke`) still run locally where the
|
||||
test process and Docker daemon share a host. Making them work in CI
|
||||
is a follow-up: either re-write them to discover container IPs via
|
||||
`docker inspect`, or reconfigure the runner with host networking.
|
||||
`test_sidecar_bundle_image.TestSidecarBundleImage`,
|
||||
`test_sidecar_bundle_compose.TestSidecarBundleCompose`) still run
|
||||
locally where the test process and Docker daemon share a host.
|
||||
Making them work in CI is a follow-up: either re-write them to
|
||||
discover container IPs via `docker inspect`, or reconfigure the
|
||||
runner with host networking.
|
||||
|
||||
@@ -13,13 +13,13 @@ Add Content-Length validation and a body-size cap to `git_http_backend.py` so ma
|
||||
|
||||
`bot_bottle/git_http_backend.py` calls `int(self.headers.get("Content-Length", 0))` without catching `ValueError`. A request with a non-numeric Content-Length raises an unhandled exception in the request handler.
|
||||
|
||||
The handler reads the full declared length into memory before passing the body to `git http-backend` with no upper bound. A local or compromised client can force arbitrarily high memory use. For comparison, `supervise_server.py` caps request bodies at 1 MiB.
|
||||
The handler reads the full declared length into memory before passing the body to `git http-backend` with no upper bound. A local or compromised client can force arbitrarily high memory use.
|
||||
|
||||
## Goals / Success Criteria
|
||||
|
||||
- A missing or non-numeric Content-Length returns HTTP 400.
|
||||
- A negative Content-Length returns HTTP 400.
|
||||
- A body larger than the cap (1 MiB, matching `supervise_server.py`) returns HTTP 413.
|
||||
- A body larger than the cap (100 MiB) returns HTTP 413.
|
||||
- Valid Git smart-HTTP pushes and fetches continue to work.
|
||||
- Unit tests cover: missing length, non-numeric length, negative length, over-cap length, and a valid push/fetch passthrough.
|
||||
|
||||
@@ -43,12 +43,12 @@ Out of scope:
|
||||
|
||||
## Design
|
||||
|
||||
Wrap the Content-Length parse in a try/except and return 400 on `ValueError`. Add an explicit check for negative values. After parsing, compare the declared length against a module-level `MAX_BODY_BYTES` constant (default 1 MiB) and return 413 if exceeded. Read exactly `min(content_length, MAX_BODY_BYTES)` bytes.
|
||||
Wrap the Content-Length parse in a try/except and return 400 on `ValueError`. Add an explicit check for negative values. After parsing, compare the declared length against a module-level `MAX_BODY_BYTES` constant (default 100 MiB) and return 413 if exceeded. Read exactly `min(content_length, MAX_BODY_BYTES)` bytes.
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
- Unit tests using `unittest.mock` to drive the handler with crafted headers.
|
||||
- Test cases: no Content-Length header, `Content-Length: abc`, `Content-Length: -1`, `Content-Length: 2097152` (over cap), and a normal small POST body.
|
||||
- Test cases: no Content-Length header, `Content-Length: abc`, `Content-Length: -1`, a declared length above `MAX_BODY_BYTES`, and a normal small POST body.
|
||||
|
||||
Run:
|
||||
|
||||
|
||||
@@ -0,0 +1,436 @@
|
||||
# PRD 0052: Egress DLP addon
|
||||
|
||||
- **Status:** Active
|
||||
- **Author:** claude
|
||||
- **Created:** 2026-06-05
|
||||
- **Issue:** #195
|
||||
|
||||
## Summary
|
||||
|
||||
With pipelock removed (PR #193), the egress proxy no longer performs DLP
|
||||
scanning on traffic to or from the agent. This PRD implements a replacement
|
||||
directly inside the mitmproxy egress addon: per-route DLP detectors that
|
||||
scan outbound requests for credential leakage and inbound responses for
|
||||
prompt injection attempts.
|
||||
|
||||
The manifest route schema is also upgraded in this PRD from the flat
|
||||
`path_allowlist` field to a structured `matches` block modelled on the
|
||||
[Kubernetes Gateway API `HTTPRoute`](https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io/v1.HTTPRouteMatch)
|
||||
match vocabulary. This upgrade is a hard cutover — no compatibility shim
|
||||
for the old format. The rationale and format survey are in the
|
||||
[YAML route matching formats research doc](https://gitea.dideric.is/didericis/bot-bottle/src/branch/main/docs/research/yaml-route-matching-formats.md).
|
||||
DLP detectors attach to the new `matches`-based routes directly.
|
||||
|
||||
The design follows the recommendation in the
|
||||
[DLP research document (PR #192)](https://gitea.dideric.is/didericis/bot-bottle/pulls/192)
|
||||
and covers all three remaining implementation phases from that plan:
|
||||
|
||||
1. Token pattern detection (Phase 1a)
|
||||
2. Known-secrets detection (Phase 1b)
|
||||
3. Naive prompt injection detection (Phase 2)
|
||||
|
||||
## Problem
|
||||
|
||||
Pipelock was removed because it could not support per-route response
|
||||
scanning, blocking selective DLP policies (e.g., skip scanning `.whl`
|
||||
downloads while keeping scanning on API calls). Removing it left the egress
|
||||
proxy with no DLP capability at all. The egress addon already holds per-route
|
||||
logic for path allowlisting and credential injection; DLP rules belong in the
|
||||
same place.
|
||||
|
||||
The existing `path_allowlist` field is also limiting: it only supports path
|
||||
prefixes, with no way to express exact-path, regex, method, or header
|
||||
constraints. The Gateway API match vocabulary is a well-specified, widely
|
||||
deployed standard that covers all of these without inventing new syntax.
|
||||
|
||||
## Goals / Success Criteria
|
||||
|
||||
1. Outbound request bodies and headers are scanned for known token patterns
|
||||
(AWS, GitHub, Anthropic, etc.) before the request reaches the upstream.
|
||||
Matches are blocked immediately.
|
||||
2. Outbound request bodies are scanned for provisioned secrets that the
|
||||
agent should not have direct access to. Matches are blocked immediately.
|
||||
3. Inbound response bodies are scanned for prompt disclosure and jailbreak
|
||||
signals. High-confidence matches are blocked; medium-confidence matches
|
||||
emit a log warning and are forwarded.
|
||||
4. DLP scanning is enabled by default on every route. Individual routes can
|
||||
selectively disable outbound detectors, inbound detectors, or both via a
|
||||
`dlp` block in the manifest.
|
||||
5. All detector logic lives in `egress_addon_core.py` (pure Python, no
|
||||
mitmproxy dependency) and is covered by unit tests on the host.
|
||||
6. Each route's `matches` block supports path (exact/prefix/regex), HTTP
|
||||
method, and header predicates using Gateway API match semantics.
|
||||
7. The manifest change is a hard cutover: `path_allowlist` is removed with
|
||||
no fallback, no deprecation alias, and no loud exception for old-format
|
||||
manifests. Old manifests that use `path_allowlist` will fail validation
|
||||
at load time with an unknown-key error (same as any other unrecognised
|
||||
key today).
|
||||
|
||||
## Non-goals
|
||||
|
||||
- LLM-based semantic prompt injection detection (explicitly deferred to a
|
||||
potential Phase 2b per the research doc).
|
||||
- Entropy-based secret detection (excluded from scope; too many false
|
||||
positives on binary API responses and compressed payloads).
|
||||
- BIP-39 seed-phrase detection.
|
||||
- Generic DLP (credit cards, SSNs, PII) — scope is narrow: AI/credential
|
||||
exfil relevant to agent containment.
|
||||
- Changes to the cred-proxy sidecar.
|
||||
- Streaming response scanning (scan buffered response body only).
|
||||
- Glob-style path matching — regex covers every case glob would handle
|
||||
without adding a third path-matching language.
|
||||
|
||||
## Design
|
||||
|
||||
### Route matching: Gateway API `matches` vocabulary
|
||||
|
||||
The existing `path_allowlist` field is replaced by a `matches` list. The
|
||||
vocabulary mirrors Kubernetes Gateway API `HTTPRouteMatch` (see the
|
||||
[route matching research doc](https://gitea.dideric.is/didericis/bot-bottle/src/branch/main/docs/research/yaml-route-matching-formats.md)
|
||||
for a full format survey and rationale). Gateway API was chosen because it
|
||||
is spec-backed, implementation-tested across multiple proxies, and its
|
||||
`{type, value}` pattern is consistent and schema-validatable.
|
||||
|
||||
**AND/OR semantics** (same as Gateway API):
|
||||
- Predicates *within* a single `matches` entry are ANDed.
|
||||
- Multiple entries in the `matches` list are ORed — the route matches if
|
||||
any entry matches.
|
||||
|
||||
```yaml
|
||||
egress:
|
||||
routes:
|
||||
# Bare route — all traffic to this host is forwarded (no path/method/header
|
||||
# constraints). Equivalent to the old path_allowlist-omitted case.
|
||||
- host: api.anthropic.com
|
||||
auth:
|
||||
scheme: Bearer
|
||||
token_ref: EGRESS_TOKEN_0
|
||||
|
||||
# Two match entries (OR): GET/HEAD on /packages/** OR POST on /upload
|
||||
- host: files.pythonhosted.org
|
||||
matches:
|
||||
- paths:
|
||||
- type: prefix
|
||||
value: /packages/
|
||||
methods: [GET, HEAD]
|
||||
- paths:
|
||||
- type: exact
|
||||
value: /upload
|
||||
methods: [POST]
|
||||
dlp:
|
||||
inbound_detectors: false # skip response scanning (binary downloads)
|
||||
|
||||
# Header + regex path — only JSON API responses on versioned endpoints
|
||||
- host: internal-api.corp
|
||||
matches:
|
||||
- paths:
|
||||
- type: regex
|
||||
value: "^/v[0-9]+/"
|
||||
headers:
|
||||
- name: Content-Type
|
||||
type: exact
|
||||
value: application/json
|
||||
dlp:
|
||||
outbound_detectors: false
|
||||
inbound_detectors: false
|
||||
```
|
||||
|
||||
#### Path matching types
|
||||
|
||||
| `type` | Semantics |
|
||||
|--------|-----------|
|
||||
| `exact` | Full path must equal `value` exactly |
|
||||
| `prefix` | Path must start with `value` at a segment boundary (matches `/api/v1` for value `/api/v1`, rejects `/api/v10`) |
|
||||
| `regex` | RE2 regex; rejected at load time if pattern fails to compile. Use for wildcard needs: `/api/[^/]+/data` instead of glob |
|
||||
|
||||
`type` defaults to `prefix` when omitted (preserves the semantic of the
|
||||
old `path_allowlist`).
|
||||
|
||||
#### Method matching
|
||||
|
||||
`methods` is a list of HTTP method names, case-insensitive at parse time —
|
||||
`get`, `GET`, and `Get` are all accepted and stored as uppercase internally.
|
||||
An absent or empty `methods` list means all methods are permitted.
|
||||
|
||||
#### Header matching
|
||||
|
||||
`headers` is a list of `{name, value, type}` objects. ALL listed headers
|
||||
must match (AND semantics). To OR on header values, use multiple `matches`
|
||||
entries.
|
||||
|
||||
| `type` | Semantics |
|
||||
|--------|-----------|
|
||||
| `exact` | Header value equals `value` (default when `type` omitted) |
|
||||
| `regex` | Header value matches RE2 regex |
|
||||
|
||||
### Manifest schema — `dlp` block
|
||||
|
||||
Each `egress.routes` entry gains an optional `dlp` key alongside `matches`
|
||||
and `auth`:
|
||||
|
||||
```yaml
|
||||
egress:
|
||||
routes:
|
||||
- host: api.anthropic.com
|
||||
# dlp omitted → all detectors on (default)
|
||||
|
||||
- host: files.pythonhosted.org
|
||||
dlp:
|
||||
inbound_detectors: false # skip response scanning (binary downloads)
|
||||
|
||||
- host: internal-docs.corp
|
||||
dlp:
|
||||
outbound_detectors: false
|
||||
inbound_detectors: false # trusted internal, no scanning
|
||||
```
|
||||
|
||||
`outbound_detectors` controls scanning of the *request* body + headers
|
||||
leaving the agent. `inbound_detectors` controls scanning of the *response*
|
||||
body arriving from the upstream.
|
||||
|
||||
Valid values per field:
|
||||
- Omitted (or `null`) — default: all detectors active.
|
||||
- `false` — scanning disabled for this direction on this route.
|
||||
- A list of detector names — only the listed detectors run.
|
||||
|
||||
Named outbound detectors: `token_patterns`, `known_secrets`.
|
||||
Named inbound detectors: `naive_injection_detection`.
|
||||
|
||||
The manifest parser (`manifest_egress.py`) validates the `dlp` block and
|
||||
rejects unknown detector names.
|
||||
|
||||
### Manifest schema — `git` block
|
||||
|
||||
HTTPS Git clone/fetch traffic is not implied by a host-level egress route.
|
||||
Smart HTTP Git fetch uses `git-upload-pack`, which can transfer large repo
|
||||
packfiles and bypass the git-gate mirror path. It is therefore blocked by
|
||||
default and must be explicitly enabled per route:
|
||||
|
||||
```yaml
|
||||
egress:
|
||||
routes:
|
||||
- host: github.com
|
||||
git:
|
||||
fetch: true
|
||||
```
|
||||
|
||||
`git.fetch: true` permits read-only smart HTTP clone/fetch requests
|
||||
(`git-upload-pack`) after the normal host and `matches` checks pass. HTTPS
|
||||
Git push (`git-receive-pack`) remains blocked by the egress addon.
|
||||
|
||||
### `EgressRoute` changes
|
||||
|
||||
`EgressRoute` replaces `PathAllowlist` with `Matches` and gains two new
|
||||
DLP fields. `MatchEntry` captures one AND-predicate block:
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class PathMatch:
|
||||
type: str # "exact" | "prefix" | "regex"
|
||||
value: str
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class HeaderMatch:
|
||||
name: str
|
||||
value: str
|
||||
type: str = "exact" # "exact" | "regex"
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class MatchEntry:
|
||||
paths: tuple[PathMatch, ...] = () # empty = match any path
|
||||
methods: tuple[str, ...] = () # empty = match any method (uppercase)
|
||||
headers: tuple[HeaderMatch, ...] = () # empty = match any headers
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class EgressRoute:
|
||||
Host: str
|
||||
Matches: tuple[MatchEntry, ...] = () # empty = match all requests
|
||||
AuthScheme: str = ""
|
||||
TokenRef: str = ""
|
||||
Role: tuple[str, ...] = ()
|
||||
GitFetch: bool = False
|
||||
OutboundDetectors: tuple[str, ...] | None = None # None = all enabled
|
||||
InboundDetectors: tuple[str, ...] | None = None # None = all enabled
|
||||
```
|
||||
|
||||
`manifest_egress.py`'s `from_dict` parses the new `matches` block and `dlp`
|
||||
block; `path_allowlist` is no longer a recognised key and will be rejected
|
||||
by the unknown-key check.
|
||||
|
||||
### `Route` changes in `egress_addon_core.py`
|
||||
|
||||
The addon-side `Route` and its helper types mirror the manifest-side changes.
|
||||
`match_route` is extended to evaluate the `Matches` list:
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class Route:
|
||||
host: str
|
||||
matches: tuple[MatchEntry, ...] = ()
|
||||
auth_scheme: str = ""
|
||||
token_env: str = ""
|
||||
git_fetch: bool = False
|
||||
outbound_detectors: tuple[str, ...] | None = None
|
||||
inbound_detectors: tuple[str, ...] | None = None
|
||||
```
|
||||
|
||||
`decide()` feeds through `match_route` (unchanged host lookup) then
|
||||
evaluates the match entries in order; if the route has no `matches` entries
|
||||
all requests pass. Path `prefix` type uses segment-boundary checking
|
||||
(`/api/v1` matches `/api/v1/foo` but not `/api/v10`).
|
||||
|
||||
### Detector interface
|
||||
|
||||
Each detector is a pure function:
|
||||
|
||||
```python
|
||||
def scan(body: str | bytes, *, env: Mapping[str, str] = {}) -> ScanResult | None:
|
||||
...
|
||||
```
|
||||
|
||||
`ScanResult` carries:
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class ScanResult:
|
||||
severity: str # "block" or "warn"
|
||||
reason: str
|
||||
```
|
||||
|
||||
`scan` returns `None` if the body is clean, `ScanResult` otherwise.
|
||||
|
||||
### Detector: `token_patterns`
|
||||
|
||||
Regex patterns for well-known credential formats, applied to the outbound
|
||||
request body and `Authorization` header (before the addon strips it — the
|
||||
strip happens after DLP scanning so that the scan sees any credential the
|
||||
agent tried to smuggle):
|
||||
|
||||
| Token type | Pattern |
|
||||
|------------|---------|
|
||||
| AWS access key | `AKIA[0-9A-Z]{16}` |
|
||||
| GitHub token (classic) | `ghp_[A-Za-z0-9_]{36}` |
|
||||
| GitHub fine-grained | `github_pat_[A-Za-z0-9_]{82}` |
|
||||
| Anthropic API key | `sk-ant-[A-Za-z0-9\-_]{93}` |
|
||||
| OpenAI API key | `sk-[A-Za-z0-9]{48}` |
|
||||
| Stripe live key | `sk_live_[A-Za-z0-9]{24}` |
|
||||
| Generic Bearer JWT | `Bearer\s+[A-Za-z0-9._\-]{50,}` |
|
||||
|
||||
Action: `"block"` on any match. No tolerance — a credential in an outbound
|
||||
request is always a violation.
|
||||
|
||||
### Detector: `known_secrets`
|
||||
|
||||
At request time the egress addon has access to `os.environ`, which includes
|
||||
all `token_env` values declared by route auth blocks. The detector:
|
||||
|
||||
1. Collects all `EGRESS_TOKEN_*` values from the environment (the naming
|
||||
contract established by `manifest_egress.py`'s `TokenRef` rendering).
|
||||
2. For each secret value, derives encoded variants: raw, base64, URL-encoded,
|
||||
hex.
|
||||
3. Scans the outbound request body for any variant.
|
||||
|
||||
Action: `"block"` on match.
|
||||
|
||||
This detector does **not** accept a custom detector name in the YAML — it
|
||||
is always named `known_secrets`. The environment is passed in via the `env`
|
||||
keyword argument to `scan`.
|
||||
|
||||
### Detector: `naive_injection_detection`
|
||||
|
||||
Pattern-based inbound response scanner. Uses two tiers:
|
||||
|
||||
**Tier 1 — BLOCK (credential + disclosure together):**
|
||||
- Response contains a token-pattern match (reuses `token_patterns` regex
|
||||
set) AND a prompt-disclosure phrase (e.g., `system prompt`, `my instructions
|
||||
are`, `hidden rules`).
|
||||
|
||||
**Tier 2 — WARN (multiple jailbreak signals):**
|
||||
- Two or more jailbreak phrases detected (e.g., `ignore previous`,
|
||||
`forget everything`, `pretend you are`, `act as`).
|
||||
- OR explicit prompt disclosure (`system prompt:`) without a credential.
|
||||
|
||||
**Tier 3 — ALLOW:**
|
||||
- Single jailbreak keyword without additional context.
|
||||
- Common documentation phrases.
|
||||
|
||||
See the DLP research doc for the full phrase lists and pseudocode.
|
||||
|
||||
### Wiring into `egress_addon.py`
|
||||
|
||||
Two new mitmproxy hooks are added alongside the existing `request` hook:
|
||||
|
||||
```python
|
||||
def request(self, flow: http.HTTPFlow) -> None:
|
||||
# ... existing match + auth-injection logic ...
|
||||
# After route decision, if action == "forward":
|
||||
result = scan_outbound(route, flow.request, os.environ)
|
||||
if result and result.severity == "block":
|
||||
flow.response = http.Response.make(403, result.reason.encode(), ...)
|
||||
return
|
||||
|
||||
def response(self, flow: http.HTTPFlow) -> None:
|
||||
route = match_route(self.routes, flow.request.pretty_host)
|
||||
if route is None:
|
||||
return # already blocked at request time
|
||||
result = scan_inbound(route, flow.response)
|
||||
if result and result.severity == "block":
|
||||
flow.response = http.Response.make(403, result.reason.encode(), ...)
|
||||
elif result and result.severity == "warn":
|
||||
sys.stderr.write(f"egress DLP warn: {result.reason}\n")
|
||||
```
|
||||
|
||||
`scan_outbound` and `scan_inbound` are pure functions in
|
||||
`egress_addon_core.py` that dispatch to the per-route detector list.
|
||||
|
||||
### Ordering: auth strip vs. DLP scan
|
||||
|
||||
The DLP outbound scan sees the *agent's original* `Authorization` header
|
||||
before the addon strips it. This ensures that a token the agent smuggled
|
||||
in the header is caught. The strip + optional re-injection still happens
|
||||
afterward, preserving the existing credential-injection security model.
|
||||
|
||||
## Implementation chunks
|
||||
|
||||
1. **New `matches` block + `EgressRoute` / `Route` restructure.**
|
||||
Remove `path_allowlist` from `manifest_egress.py` and `egress_addon_core.py`.
|
||||
Add `MatchEntry`, `PathMatch`, `HeaderMatch` types. Parse `matches` in
|
||||
`EgressRoute.from_dict` and `_parse_one`; unknown-key rejection handles
|
||||
old `path_allowlist` manifests. Add `OutboundDetectors` / `InboundDetectors`
|
||||
to `EgressRoute` and `Route`; parse `dlp` block. Extend
|
||||
`tests/unit/test_manifest_egress.py` and `tests/unit/test_egress_addon_core.py`
|
||||
with match and dlp valid/invalid cases.
|
||||
|
||||
2. **Token-patterns detector (Phase 1a).**
|
||||
New module `bot_bottle/dlp_detectors.py` (host-importable) and
|
||||
companion flat copy for the sidecar bundle. Add `TokenPatternsDetector`
|
||||
with the regex set above. Wire `scan_outbound` into the `request` hook
|
||||
in `egress_addon.py`. Unit tests in `tests/unit/test_dlp_detectors.py`.
|
||||
|
||||
3. **Known-secrets detector (Phase 1b).**
|
||||
Add `KnownSecretsDetector` to `dlp_detectors.py`. Collect
|
||||
`EGRESS_TOKEN_*` from env; derive encoded variants; scan request body.
|
||||
Extend unit tests. Wire into `scan_outbound`.
|
||||
|
||||
4. **Naive prompt injection detector (Phase 2).**
|
||||
Add `NaiveInjectionDetector` to `dlp_detectors.py`. Wire
|
||||
`scan_inbound` into the new `response` hook in `egress_addon.py`.
|
||||
Extend unit tests. Activate PRD 0052 (`Status: Draft → Active`) in
|
||||
this commit.
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **Response body buffering:** mitmproxy's `response` hook already has
|
||||
the full body for non-streaming responses. For streaming (chunked)
|
||||
responses the body may be empty or incomplete at hook time. Scope for
|
||||
now: log a warning and skip scanning on streaming responses; revisit
|
||||
if needed.
|
||||
2. **Encoding breadth for `known_secrets`:** Start with raw + base64 +
|
||||
URL-encoded + hex. Add GZIP / base32 if real-world evasion attempts
|
||||
appear.
|
||||
3. **`EGRESS_TOKEN_*` naming contract:** The detector relies on the
|
||||
env-var naming convention from `manifest_egress.py`. If that contract
|
||||
changes, the detector must be updated in lock-step.
|
||||
@@ -0,0 +1,269 @@
|
||||
# PRD 0053: User-defined agent provider plugins
|
||||
|
||||
- **Status:** Active
|
||||
- **Author:** claude
|
||||
- **Created:** 2026-06-04
|
||||
|
||||
## Summary
|
||||
|
||||
The `get_provider()` registry in `bot_bottle/agent_provider.py` is a closed list —
|
||||
only `"claude"` and `"codex"` are valid templates, validated at manifest-load time and
|
||||
again at launch. Users who want to run a different agent (Gemini, Aider, a custom
|
||||
local model wrapper) cannot add a provider without forking the package.
|
||||
|
||||
This PRD opens the registry to user-defined plugins. A plugin placed at
|
||||
`~/.bot-bottle/contrib/<name>/` is discovered and loaded at launch time. The manifest
|
||||
accepts any non-empty template string that names a built-in or resolves to a user
|
||||
plugin at that path.
|
||||
|
||||
Alongside discovery, this PRD moves CA and git provisioning out of the Docker backend
|
||||
and into the `AgentProvider` ABC as overridable methods. The current standalone
|
||||
`provision/ca.py` and `provision/git.py` files in the Docker backend are deleted;
|
||||
their logic becomes the default implementations on the ABC. This lets exotic provider
|
||||
images (different base OS, different user, non-standard trust mechanism) override
|
||||
provisioning freely without the abstraction fighting them.
|
||||
|
||||
The preceding commit on this PR moves `codex_auth.py` from `bot_bottle/` into
|
||||
`bot_bottle/contrib/codex/` — a clean-up that fits naturally here since this PR
|
||||
also clarifies that `contrib/` is the per-provider home.
|
||||
|
||||
## Problem
|
||||
|
||||
Users building unconventional setups hit a hard wall: the template validation in
|
||||
`manifest_agent.AgentProvider.from_dict` rejects any string not in `PROVIDER_TEMPLATES`.
|
||||
There is no escape hatch short of editing bot-bottle's source.
|
||||
|
||||
PRD 0050 moved provider logic into `contrib/` specifically so a third provider would
|
||||
be "cheap to add" — but "cheap" today still means a pull request against the bot-bottle
|
||||
repo, not a drop-in file in the user's home directory. The filesystem layout is already
|
||||
the right shape; the discovery step is missing.
|
||||
|
||||
Beyond discovery, the Docker backend's `provision_ca` and `provision_git` functions
|
||||
bake in Debian-specific commands (`update-ca-certificates`) and a hardcoded container
|
||||
user (`node`). A user plugin that runs as a different user, or on a different base OS,
|
||||
silently gets the wrong provisioning with no way to correct it short of forking.
|
||||
|
||||
## Goals / Success Criteria
|
||||
|
||||
1. A user places `~/.bot-bottle/contrib/<name>/agent_provider.py` — a file that exports
|
||||
a class inheriting `AgentProvider` — sets `agent_provider.template: <name>` in a
|
||||
bottle's frontmatter, and launches a bottle using that provider with no changes to
|
||||
the bot-bottle source.
|
||||
2. The plugin directory may also contain a `Dockerfile` at
|
||||
`~/.bot-bottle/contrib/<name>/Dockerfile`; the existing three-tier Dockerfile cascade
|
||||
(per-bottle override → manifest `dockerfile:` field → provider default) uses this
|
||||
path as the provider default for user plugins.
|
||||
3. The manifest validator accepts any non-empty template string. Unknown templates that
|
||||
resolve to no user plugin still raise a clear error, but at launch (via `get_provider`)
|
||||
rather than at manifest-load time.
|
||||
4. Built-in provider knobs (`auth_token` → claude only; `forward_host_credentials` →
|
||||
codex only) are guarded to built-in template names. Bottles using a user provider
|
||||
may set neither knob.
|
||||
5. `get_provider(template)` checks `~/.bot-bottle/contrib/<template>/agent_provider.py`
|
||||
before the built-ins, so a user can shadow a built-in for local testing.
|
||||
6. A clear `ValueError` is raised if the user plugin file exists but contains no
|
||||
`AgentProvider` subclass.
|
||||
7. `AgentProvider` gains `provision_ca(self, bottle, plan)` and
|
||||
`provision_git(self, bottle, plan)` with default implementations that reproduce
|
||||
current Docker/Debian/node behavior. Built-in providers inherit the defaults
|
||||
unchanged. User plugins override either method when their image diverges.
|
||||
8. `bot_bottle/backend/docker/provision/ca.py` and
|
||||
`bot_bottle/backend/docker/provision/git.py` are deleted. The Docker backend base
|
||||
class calls `provider.provision_ca(bottle, plan)` and
|
||||
`provider.provision_git(bottle, plan)` directly.
|
||||
|
||||
## Non-goals
|
||||
|
||||
- Packaging or distributing user plugins as installable Python packages.
|
||||
- A plugin registry, index, or discovery beyond the filesystem path convention.
|
||||
- Adding a third built-in provider.
|
||||
- Validating that user plugin images, Dockerfiles, or commands exist before launch
|
||||
(same policy as built-ins).
|
||||
- Sandboxing user plugin code — plugins run with full Python interpreter access.
|
||||
- Per-provider opt-out of the egress sidecar or network provisioning (follow-on).
|
||||
|
||||
## Scope
|
||||
|
||||
### In scope
|
||||
|
||||
- `get_provider(template: str) -> AgentProvider` gains a `_load_user_plugin(template)`
|
||||
step that checks `~/.bot-bottle/contrib/<template>/agent_provider.py` first, then
|
||||
falls through to the built-in look-ups.
|
||||
- `_load_user_plugin` uses `importlib.util.spec_from_file_location` to load the module
|
||||
and returns the first `AgentProvider` subclass found in its `__dict__`. Raises
|
||||
`ValueError` if the file exists but exports no subclass.
|
||||
- The Dockerfile cascade in the Docker backend's `resolve_plan()` uses
|
||||
`~/.bot-bottle/contrib/<template>/Dockerfile` as the provider default for user
|
||||
plugins (the same slot currently occupied by `Dockerfile.claude` / `Dockerfile.codex`
|
||||
for built-ins).
|
||||
- `manifest_agent.AgentProvider.from_dict`: the `template not in PROVIDER_TEMPLATES`
|
||||
check is removed; the two built-in-specific knob guards (`auth_token` → claude,
|
||||
`forward_host_credentials` → codex) are tightened to `template in PROVIDER_TEMPLATES`
|
||||
so they are skipped for user-defined names.
|
||||
- `PROVIDER_TEMPLATES` remains in `agent_provider.py` as the set of built-in names for
|
||||
use by tests and any enumeration callers.
|
||||
- `AgentProvider` ABC gains:
|
||||
```python
|
||||
def provision_ca(self, bottle: Bottle, plan: BottlePlan) -> None: ...
|
||||
def provision_git(self, bottle: Bottle, plan: BottlePlan) -> None: ...
|
||||
```
|
||||
Default implementations reproduce the current `provision/ca.py` and
|
||||
`provision/git.py` logic exactly (Debian `update-ca-certificates`, `node` user,
|
||||
`/home/node` home).
|
||||
- `bot_bottle/backend/docker/provision/ca.py` and
|
||||
`bot_bottle/backend/docker/provision/git.py` deleted. The Docker backend base
|
||||
class substitutes direct calls to the provider methods.
|
||||
- Unit tests for the discovery path:
|
||||
- Plugin found and loaded → correct `AgentProvider` instance returned.
|
||||
- Plugin file exists but exports no subclass → `ValueError`.
|
||||
- Unknown template with no user plugin → `ValueError` from `get_provider`.
|
||||
- Built-in template name still works normally even when no user plugin exists.
|
||||
- Unit tests for the provisioning delegation:
|
||||
- A provider subclass that overrides `provision_ca` has its override called.
|
||||
- A provider subclass that overrides `provision_git` has its override called.
|
||||
- One paragraph added to `README.md` under a new "Custom providers" section describing
|
||||
the `~/.bot-bottle/contrib/<name>/` convention (both `agent_provider.py` and
|
||||
`Dockerfile`), the `provision_ca` / `provision_git` override points, and pointing at
|
||||
the existing contrib providers as reference implementations.
|
||||
|
||||
### Out of scope
|
||||
|
||||
- Hot-reloading plugins during a running session.
|
||||
- Plugin versioning or dependency declaration.
|
||||
- Changes to the smolmachines backend provisioning path.
|
||||
|
||||
## Proposed design
|
||||
|
||||
### Discovery in `get_provider`
|
||||
|
||||
```python
|
||||
import importlib.util
|
||||
|
||||
def get_provider(template: str) -> AgentProvider:
|
||||
user_plugin = _load_user_plugin(template)
|
||||
if user_plugin is not None:
|
||||
return user_plugin
|
||||
if template == PROVIDER_CLAUDE:
|
||||
from .contrib.claude.agent_provider import ClaudeAgentProvider
|
||||
return ClaudeAgentProvider()
|
||||
if template == PROVIDER_CODEX:
|
||||
from .contrib.codex.agent_provider import CodexAgentProvider
|
||||
return CodexAgentProvider()
|
||||
raise ValueError(f"unknown agent provider template: {template!r}")
|
||||
|
||||
|
||||
def _load_user_plugin(template: str) -> AgentProvider | None:
|
||||
plugin_path = (
|
||||
Path.home() / ".bot-bottle" / "contrib" / template / "agent_provider.py"
|
||||
)
|
||||
if not plugin_path.exists():
|
||||
return None
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
f"_user_contrib_{template}.agent_provider", plugin_path
|
||||
)
|
||||
if spec is None or spec.loader is None:
|
||||
raise ValueError(f"user plugin at {plugin_path} could not be loaded")
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(mod) # type: ignore[union-attr]
|
||||
for obj in vars(mod).values():
|
||||
if (
|
||||
isinstance(obj, type)
|
||||
and issubclass(obj, AgentProvider)
|
||||
and obj is not AgentProvider
|
||||
):
|
||||
return obj()
|
||||
raise ValueError(
|
||||
f"user plugin at {plugin_path} defines no AgentProvider subclass"
|
||||
)
|
||||
```
|
||||
|
||||
### Dockerfile convention for user plugins
|
||||
|
||||
`resolve_plan()` in the Docker backend already has a three-tier cascade. For user
|
||||
plugins the provider-default slot is filled by:
|
||||
|
||||
```python
|
||||
Path.home() / ".bot-bottle" / "contrib" / template / "Dockerfile"
|
||||
```
|
||||
|
||||
Per-bottle overrides and manifest `dockerfile:` fields continue to take precedence.
|
||||
|
||||
### Provisioning methods on `AgentProvider`
|
||||
|
||||
```python
|
||||
class AgentProvider(ABC):
|
||||
...
|
||||
def provision_ca(self, bottle: Bottle, plan: BottlePlan) -> None:
|
||||
"""Install the egress MITM CA into the agent container's trust store.
|
||||
Override for non-Debian base images or non-standard trust mechanisms."""
|
||||
cert_host_path, label = select_ca_cert(plan.egress_plan)
|
||||
bottle.cp_in(str(cert_host_path), AGENT_CA_PATH)
|
||||
bottle.exec(
|
||||
f"chmod 644 {AGENT_CA_PATH} && update-ca-certificates",
|
||||
user="root",
|
||||
)
|
||||
log_ca_fingerprint(cert_host_path, label)
|
||||
|
||||
def provision_git(self, bottle: Bottle, plan: BottlePlan) -> None:
|
||||
"""Configure git inside the agent container.
|
||||
Override for images that run as a different user or use a non-standard home."""
|
||||
_provision_cwd_git(plan, bottle)
|
||||
_provision_git_gate_config(plan, bottle)
|
||||
_provision_git_user(plan, bottle)
|
||||
```
|
||||
|
||||
The Docker backend base class replaces the direct calls to the old standalone
|
||||
functions with:
|
||||
|
||||
```python
|
||||
provider.provision_ca(bottle, plan)
|
||||
provider.provision_git(bottle, plan)
|
||||
```
|
||||
|
||||
### Manifest validation change
|
||||
|
||||
In `manifest_agent.AgentProvider.from_dict`, remove the hard rejection:
|
||||
|
||||
```python
|
||||
# Before
|
||||
if template not in PROVIDER_TEMPLATES:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.template {template!r} "
|
||||
f"is not one of {', '.join(sorted(PROVIDER_TEMPLATES))}"
|
||||
)
|
||||
|
||||
# After — removed entirely; get_provider() raises at launch for unknown names
|
||||
```
|
||||
|
||||
Guard the built-in knob checks with `template in PROVIDER_TEMPLATES`:
|
||||
|
||||
```python
|
||||
if auth_token and template == "claude": # unchanged
|
||||
...
|
||||
if auth_token and template not in PROVIDER_TEMPLATES:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.auth_token is only "
|
||||
f"supported for built-in templates ({', '.join(sorted(PROVIDER_TEMPLATES))})"
|
||||
)
|
||||
if forward_host_credentials and template == "codex": # unchanged
|
||||
...
|
||||
if forward_host_credentials and template not in PROVIDER_TEMPLATES:
|
||||
raise ManifestError(
|
||||
f"bottle '{bottle_name}' agent_provider.forward_host_credentials "
|
||||
f"is only supported for built-in templates"
|
||||
)
|
||||
```
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **`BOT_BOTTLE_CONTRIB_DIR` env var.** Omitted for now — `~/.bot-bottle/contrib/`
|
||||
is consistent with the rest of the user config layout. Revisit if the need surfaces.
|
||||
|
||||
## References
|
||||
|
||||
- PRD 0050 — agent provider contrib (established `contrib/` as the per-provider home)
|
||||
- PRD 0048 — SSH deploy key provisioning (the `contrib/` convention)
|
||||
- `bot_bottle/agent_provider.py` — `get_provider`, `PROVIDER_TEMPLATES`, `AgentProvider` ABC
|
||||
- `bot_bottle/manifest_agent.py` — template validation that this PRD relaxes
|
||||
- `bot_bottle/backend/docker/provision/ca.py` — current CA provisioner (to be deleted)
|
||||
- `bot_bottle/backend/docker/provision/git.py` — current git provisioner (to be deleted)
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user