From ebad90bfa9b1de3c9426d8919544d26c8c1d6a71 Mon Sep 17 00:00:00 2001 From: didericis Date: Tue, 30 Jun 2026 17:39:33 -0400 Subject: [PATCH] docs(prd): adopt forge sidecar (option 3) for native integration Flip the forge-native-integration PRD from option 2 (agent calls the Gitea API directly via cred-proxy; done signal parsed from comments) to option 3 per issue #317 comment 2715: a forge sidecar backed by a Forge abstract class. - signal_done(status, summary) replaces comment-parsing as the done signal - semantic audit trail from the sidecar feeds provenance directly - read-anywhere / write-scoped enforcement, tighter than repo-wide API keys - forge-agnostic agent prompts and sidecar protocol - DeployKeyProvisioner subsumption deferred; share the HTTP client only Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_01WL77TgFxKbs3cidGMG9dz7 --- docs/prds/prd-new-forge-native-integration.md | 282 +++++++++++++----- 1 file changed, 200 insertions(+), 82 deletions(-) diff --git a/docs/prds/prd-new-forge-native-integration.md b/docs/prds/prd-new-forge-native-integration.md index 764da95..ee12845 100644 --- a/docs/prds/prd-new-forge-native-integration.md +++ b/docs/prds/prd-new-forge-native-integration.md @@ -11,17 +11,29 @@ Add a webhook-driven orchestration layer that lets Gitea issues and PR comments drive bot-bottle sessions end-to-end with no operator in the loop for the happy path. An issue assigned to a member of the configured agent org and labelled with an agent name triggers a headless bottle launch; the bottle processes the -issue, opens a PR, and posts a done-comment via the Gitea API (through -cred-proxy) before exiting. The orchestrator detects the done-comment, freezes -the bottle, and attaches a provenance footer. Subsequent PR comments rehydrate -the frozen bottle. The bottle is destroyed when the PR closes. 
+issue, opens a PR, and interacts with the forge through a **forge sidecar** — +the agent never touches the Gitea API or its credentials directly. The agent +calls `signal_done(status, summary)` on the sidecar when a work unit is +complete; the sidecar relays that to the orchestrator over a queue dir (the same +pattern as the supervise sidecar), so completion is an unambiguous in-band +signal rather than a comment the orchestrator has to parse. The orchestrator +freezes the bottle and attaches a provenance footer. Subsequent PR comments +rehydrate the frozen bottle. The bottle is destroyed when the PR closes. + +The forge sidecar is backed by a `Forge` abstract class with per-provider +implementations (Gitea first), so the agent's prompts and the sidecar protocol +stay forge-agnostic. The sidecar logs forge operations semantically ("read PR +description", "posted comment", "signalled done"), giving richer provenance than +post-hoc egress-byte parsing, and enforces a **read-anywhere / write-scoped** +permission model: the agent may read for context but may only write to the +issue and PRs it was assigned. The separation of concerns across the two layers: bot-bottle owns the headless -launch primitives, forge state, Gitea client, and provenance builder. -`bot-bottle-orchestrator` (separate binary) owns the webhook listener, bottle -lifecycle loop, and monitoring dashboard; it calls into bot-bottle via -`./cli.py orchestrate`, a thin wrapper command. This PRD covers bot-bottle's -side of that contract. +launch primitives, the forge sidecar + `Forge` abstraction, forge state, and the +provenance builder. `bot-bottle-orchestrator` (separate binary) owns the webhook +listener, bottle lifecycle loop, and monitoring dashboard; it calls into +bot-bottle via `./cli.py orchestrate`, a thin wrapper command. This PRD covers +bot-bottle's side of that contract. 
## Problem @@ -29,9 +41,22 @@ Today an operator must open the TUI, select an agent and bottle, confirm the preflight, and type prompts interactively. This blocks "issue → PR" automation and produces no durable audit record of what the agent did. The security model already provides the right isolation and egress controls; the missing pieces are -the headless launch primitive that `bot-bottle-orchestrator` can call, the -in-bottle Gitea API access the agent uses to signal completion, and the -provenance trail that makes the audit story legible to reviewers on every PR. +the headless launch primitive that `bot-bottle-orchestrator` can call, a +forge-interaction surface the agent uses to read context, post comments, and +signal completion, and the provenance trail that makes the audit story legible +to reviewers on every PR. + +That forge-interaction surface could be built two ways: (2) give the agent the +Gitea API directly with cred-proxy injecting the token, or (3) put a forge +sidecar between the agent and the forge. This PRD takes **option 3**. The +deciding factors: a sidecar `signal_done` call is an unambiguous completion +signal where comment-parsing is a correctness risk that surfaces in production; +the sidecar produces a semantic audit trail rather than HTTP bytes, which is +load-bearing for provenance (the stated product priority); and the sidecar can +enforce scope tighter than repo-wide API-key permissions, reducing blast radius +for a prompt-injected agent. The costs — a second sidecar process per forge run, +a new failure mode if it crashes, and per-forge implementation cost — are +accepted as the price of those properties. ## Goals / Success Criteria @@ -42,16 +67,20 @@ provenance trail that makes the audit story legible to reviewers on every PR. 2. An issue assigned to a member of the configured org (`FORGE_ORG`, default `bot-bottle`) and labelled `bot-bottle:` is the trigger convention. Org membership is verified via the Gitea API at event time. -3. 
Forge-targeted bottles receive a set of env vars at launch - (`FORGE_GITEA_API`, `FORGE_OWNER`, `FORGE_REPO`, `FORGE_ISSUE_NUMBER`) so - the agent knows where to post its done-comment without hardcoding forge - context in the agent manifest. -4. The agent's egress policy for forge runs includes `gitea.` with Bearer - auth injected by cred-proxy, enabling direct Gitea API calls from inside the - bottle. -5. The done-comment the agent posts is the done signal. A watchdog timeout - (configurable, default 30 min) causes the orchestrator to post the - done-comment on the agent's behalf if the agent exits without posting one. +3. Forge-targeted bottles run a **forge sidecar** that exposes a small, + forge-agnostic API (comment/issue/PR CRUD plus `signal_done`) over the same + queue-dir + HTTP/JSON-RPC machinery as the supervise sidecar. The agent calls + the sidecar; it never sees the forge token or forge-specific endpoints. +4. The sidecar is backed by a `Forge` abstract class. Gitea is the first + concrete implementation; adding a forge means a new subclass, not changes to + the agent prompt or sidecar protocol. The sidecar enforces a read-anywhere / + write-scoped model: writes are limited to the assigned issue and its PRs; + reads are unrestricted for context. +5. The agent calls `signal_done(status, summary)` on the sidecar when a work + unit is complete; the sidecar relays it to the orchestrator over a queue dir. + This is the done signal — no comment parsing. A watchdog timeout + (configurable, default 30 min) causes the orchestrator to treat the run as + done-without-self-report if the agent exits without signalling. 6. Every orchestrator-posted comment ends with a provenance footer: agent name, bottle name(s), slug, start time, duration, exit code, gitleaks result, and egress summary. @@ -61,7 +90,9 @@ provenance trail that makes the audit story legible to reviewers on every PR. issue/PR URLs. 9. 
Unit tests cover: label parsing, org-membership check path, forge state read/write, provenance footer rendering, headless launch arg construction, - forge env var injection, echo-loop guard. + forge env var injection, sidecar request dispatch through the `Forge` + abstraction, write-scope enforcement (reject writes outside the assigned + issue/PRs), and `signal_done` queue relay. ## Non-goals @@ -74,6 +105,11 @@ provenance trail that makes the audit story legible to reviewers on every PR. - Bottle destruction on issue close (PR close only; issue close is ambiguous). - Concurrent multi-issue handling (one blocking run per orchestrator process). - A monitoring dashboard (orchestrator-side concern). +- Folding `DeployKeyProvisioner` into the `Forge` abstraction. Deploy-key + provisioning runs at bottle-provision time on the host; the forge sidecar runs + inside the bottle at agent time. The two have different lifecycles and actors, + so coupling them into one class is deferred to a follow-up. This PRD only + shares the Gitea HTTP client between them. ## Design @@ -151,9 +187,9 @@ def start_headless( """Non-interactive bottle launch. Returns (slug, exit_code).""" ``` -`forge_env` is merged into the bottle's `guest_env` so the agent receives the -forge context as env vars (see below). The caller freezes the bottle after -`start_headless` returns. +`forge_env` carries the forge context and token to the forge sidecar launched +alongside the agent (see below); the agent process itself does not receive the +token. The caller freezes the bottle after `start_headless` returns. **`resume_headless`** — new function in `bot_bottle/cli/resume.py`: @@ -162,61 +198,124 @@ def resume_headless(slug: str, *, prompt: str, backend_name: str | None = None) """Rehydrate a frozen bottle and run one headless prompt. 
Returns exit_code.""" ``` +### Forge sidecar + +Forge-targeted bottles run a forge sidecar alongside the agent, mirroring the +supervise sidecar: a per-bottle process that exposes an HTTP/JSON-RPC endpoint +over a Unix socket and relays events to the orchestrator through a queue dir. +The agent calls the sidecar; the sidecar holds the forge token and makes the +actual forge API calls. The agent never receives the credential and never sees a +forge-specific endpoint — swapping Gitea for another forge does not change the +agent prompt or the sidecar protocol. + +The sidecar is configured at launch from the forge context (owner, repo, issue, +PR) and the token, supplied by the orchestrator — not baked into the agent +manifest. Because the sidecar owns the token, forge traffic does not need a +cred-proxy egress route on the agent; the agent's egress policy is unchanged by +forge targeting. + +**Sidecar protocol** (forge-agnostic; each method maps to a `Forge` call): + +| Method | Scope | Purpose | +|---|---|---| +| `read_issue(number)` | read-anywhere | Read issue/PR body for context | +| `read_comments(number)` | read-anywhere | Read a thread for context | +| `post_comment(number, body)` | write-scoped | Post to the assigned issue/PR | +| `update_description(number, body)` | write-scoped | Edit the assigned issue/PR body | +| `signal_done(status, summary)` | — | Relay completion to the orchestrator | + +**Scope enforcement** is read-anywhere / write-scoped: read methods accept any +issue/PR number for context; write methods are rejected unless the target is the +assigned issue or one of its PRs. This is tighter than Gitea's repo-wide API-key +permissions and bounds the blast radius of a prompt-injected agent. Rejections +are logged semantically (operation, target, reason) so the audit trail records +attempted out-of-scope writes, not just allowed ones. 
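The read-anywhere / write-scoped rule described above can be sketched as a small guard that the sidecar consults before dispatching to the `Forge`. This is an illustrative sketch only, not the implementation in this patch: the method names come from the sidecar protocol table, while `WriteScope`, `ScopeGuard`, and the audit record shape are hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# Method names are taken from the sidecar protocol table in the PRD.
READ_METHODS = frozenset({"read_issue", "read_comments"})
WRITE_METHODS = frozenset({"post_comment", "update_description"})


@dataclass
class WriteScope:
    """Hypothetical holder for the targets a bottle may write to."""

    issue_number: int
    pr_numbers: set[int] = field(default_factory=set)

    def allows(self, number: int) -> bool:
        # Writes are allowed only on the assigned issue or one of its PRs.
        return number == self.issue_number or number in self.pr_numbers


@dataclass
class ScopeGuard:
    """Hypothetical guard: decides each call and records the decision."""

    scope: WriteScope
    audit_log: list[dict] = field(default_factory=list)

    def check(self, method: str, number: int) -> bool:
        if method in READ_METHODS:
            allowed, reason = True, "read-anywhere"
        elif method in WRITE_METHODS:
            allowed = self.scope.allows(number)
            reason = "in write scope" if allowed else "outside assigned issue/PRs"
        else:
            allowed, reason = False, "unknown method"
        # Log allowed and rejected calls alike (operation, target, reason).
        self.audit_log.append(
            {"op": method, "target": number, "allowed": allowed, "reason": reason}
        )
        return allowed


guard = ScopeGuard(WriteScope(issue_number=317, pr_numbers={318}))
results = [
    guard.check("read_issue", 42),     # read outside scope: allowed
    guard.check("post_comment", 318),  # write to the assigned PR: allowed
    guard.check("post_comment", 42),   # write outside scope: rejected
]
```

Recording the rejected call in the same log as the allowed ones is what lets the audit trail show attempted out-of-scope writes; how the real sidecar stores that log is not specified here.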
+ +**Semantic audit**: every sidecar call is logged as a structured operation +("read PR #318 description", "posted comment to #317", "signalled done: +success") rather than as opaque HTTP bytes. This log feeds provenance directly, +with no post-hoc egress-log parsing. + +### `Forge` abstraction — `bot_bottle/contrib/forge/` + +The sidecar dispatches to a `Forge` abstract class. Each provider implements the +operations behind the sidecar protocol: + +```python +class Forge(abc.ABC): + @abc.abstractmethod + def read_issue(self, number: int) -> Issue: ... + @abc.abstractmethod + def read_comments(self, number: int) -> list[Comment]: ... + @abc.abstractmethod + def post_comment(self, number: int, body: str) -> None: ... + @abc.abstractmethod + def update_description(self, number: int, body: str) -> None: ... + @abc.abstractmethod + def is_org_member(self, org: str, username: str) -> bool: ... + @abc.abstractmethod + def get_pr_for_issue(self, number: int) -> int | None: ... + @abc.abstractmethod + def is_pr_open(self, number: int) -> bool: ... +``` + +`GiteaForge` is the first and only concrete implementation in this PRD. It wraps +the Gitea HTTP client (below). Adding GitHub or GitLab later is a new subclass; +the sidecar, protocol, and agent prompt are untouched. + +> **Deferred:** `DeployKeyProvisioner` is *not* folded into `Forge` here. +> Deploy-key provisioning runs on the host at provision time; the sidecar runs +> in the bottle at agent time. They have different lifecycles and actors, so a +> shared abstract base would couple two unrelated auth contexts. For now they +> only share the Gitea HTTP client; a later PRD can revisit unification. + ### Forge env vars -The orchestrator builds this dict and passes it to `start_headless` as -`forge_env`: +The orchestrator passes forge context to the **sidecar** (not the agent) at +launch. 
The agent does not need owner/repo/issue env vars to construct API +calls, since it only names issue/PR numbers to the sidecar: | Var | Example | Purpose | |---|---|---| -| `FORGE_GITEA_API` | `https://gitea.dideric.is/api/v1` | Base URL for Gitea API calls | +| `FORGE_GITEA_API` | `https://gitea.dideric.is/api/v1` | Base URL the sidecar calls | | `FORGE_OWNER` | `didericis` | Repo owner | | `FORGE_REPO` | `bot-bottle` | Repo name | -| `FORGE_ISSUE_NUMBER` | `317` | Issue that triggered the run | -| `FORGE_PR_NUMBER` | `318` | PR to comment on (empty until PR exists) | +| `FORGE_ISSUE_NUMBER` | `317` | Assigned issue (defines write scope) | +| `FORGE_PR_NUMBER` | `318` | Assigned PR (empty until PR exists) | -The agent's system prompt (from the manifest) instructs it to post a comment to -`$FORGE_GITEA_API/repos/$FORGE_OWNER/$FORGE_REPO/issues/$FORGE_ISSUE_NUMBER/comments` -when it finishes a work unit. The instruction is part of the forge-specific -agent prompt, not the base agent manifest, so non-forge runs are unaffected. - -### Gitea egress for forge-targeted bottles - -Forge-targeted bottles get an additional egress route injected by the -orchestrator at launch time. This is passed as an extra `EgressRoute` in the -`BottleSpec` (or via the forge env and bottle manifest) rather than requiring -operators to add it to every agent manifest: - -```yaml -host: gitea.dideric.is -auth: - scheme: Bearer - token_env: GITEA_TOKEN -``` - -The cred-proxy injects the token; the agent never sees the raw credential. +The agent's forge-specific prompt instructs it to call `signal_done` on the +sidecar when a work unit is complete, and to use the sidecar for any +comment/description writes. The instruction is forge-agnostic and is part of the +forge prompt overlay, not the base agent manifest, so non-forge runs are +unaffected. ### Done signal and watchdog -The agent posts a Gitea comment when it finishes a work unit. 
The orchestrator -webhook listener receives the `issue_comment` event and: +The agent calls `signal_done(status, summary)` on the sidecar when it finishes a +work unit. The sidecar writes the event to its queue dir; the orchestrator reads +it and: -1. Verifies the commenter is a member of `FORGE_ORG`. -2. Reads the forge state for `(owner, repo, issue_number)`. -3. If `status == "running"`, treats the comment as the done signal: freezes the - bottle, appends the provenance footer to the same comment thread, sets +1. Reads the forge state for `(owner, repo, issue_number)`. +2. If `status == "running"`, treats the event as the done signal: freezes the + bottle, posts a summary comment with the provenance footer, sets `status = "frozen"`. -**Watchdog**: the orchestrator tracks `last_checkin_at` in forge state. A -background thread wakes every minute. If `now - last_checkin_at > FORGE_WATCHDOG_TIMEOUT` -(default 30 min, configurable via env) and `status == "running"`, the -orchestrator posts the provenance footer comment on behalf of the agent and -freezes the bottle. +Because completion is an explicit `signal_done` call, the orchestrator does not +parse comment text to detect "done", and intermediate comments the agent posts +mid-run cannot be mistaken for completion. -Echo-loop guard: comments from members of `FORGE_ORG` that are not the -currently-running slug's agent user are still dispatched as resume triggers, not -as done signals. The comment-is-done-signal path checks that -`comment.user.login == agent_git_user` (read from forge state). +**Watchdog**: the orchestrator tracks `last_checkin_at` in forge state, updated +on each sidecar event. A background thread wakes every minute. If +`now - last_checkin_at > FORGE_WATCHDOG_TIMEOUT` (default 30 min, configurable +via env) and `status == "running"`, the orchestrator treats the run as +done-without-self-report: it posts the provenance footer (with `watchdog_fired` +set) and freezes the bottle. 
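The watchdog condition above reduces to a pure predicate over forge state, which keeps it easy to unit test apart from the background thread. A minimal sketch, assuming forge state is a dict whose `last_checkin_at` is an ISO 8601 string with a UTC offset; that storage format and the function name `watchdog_should_fire` are assumptions, not part of this patch. Parsing `FORGE_WATCHDOG_TIMEOUT` from the environment is left out because the PRD does not state its units.

```python
from datetime import datetime, timedelta, timezone

# Default from the PRD: 30 minutes.
DEFAULT_TIMEOUT = timedelta(minutes=30)


def watchdog_should_fire(
    state: dict,
    now: datetime,
    timeout: timedelta = DEFAULT_TIMEOUT,
) -> bool:
    """Return True when a running bottle has not checked in within `timeout`."""
    # Only runs still marked "running" are eligible; frozen runs are ignored.
    if state.get("status") != "running":
        return False
    # Assumption: last_checkin_at is stored as a timezone-aware ISO 8601 string.
    last_checkin = datetime.fromisoformat(state["last_checkin_at"])
    return now - last_checkin > timeout


now = datetime(2026, 6, 30, 18, 0, tzinfo=timezone.utc)
stalled = {"status": "running", "last_checkin_at": "2026-06-30T17:20:00+00:00"}
fresh = {"status": "running", "last_checkin_at": "2026-06-30T17:45:00+00:00"}
frozen = {"status": "frozen", "last_checkin_at": "2026-06-30T17:00:00+00:00"}

fired = [watchdog_should_fire(s, now) for s in (stalled, fresh, frozen)]
```

When the predicate is true, the orchestrator would take the done-without-self-report path described above: post the footer with `watchdog_fired` set and freeze the bottle.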
+ +**Sidecar-death failure mode**: if the forge sidecar crashes mid-run the agent +loses forge access while the bottle is otherwise healthy. The orchestrator +detects a dead sidecar (socket/queue gone) the same way it detects a stalled +agent and falls back to the watchdog path, posting a footer that flags the +incomplete run. ### Forge state — `bot_bottle/contrib/gitea/forge_state.py` @@ -289,13 +388,16 @@ Output (collapsed by default): | duration | 4m 12s | | exit | 0 ✓ | | gitleaks | ✓ no secrets detected | -| done signal | agent comment *(or: watchdog — agent did not check in)* | +| done signal | sidecar `signal_done` *(or: watchdog — agent did not signal)* | -**Egress** (deny-by-default; 3 routes allowed) +**Egress** (deny-by-default; 2 routes allowed) - `api.anthropic.com` — Bearer auth -- `gitea.dideric.is` — Bearer auth - `pypi.org` — unauthenticated +Forge traffic is not an agent egress route — the forge sidecar holds the token +and makes those calls out of band. The provenance footer's forge operations come +from the sidecar's semantic audit log. + ``` @@ -303,19 +405,26 @@ The egress summary is read from `~/.bot-bottle/state//egress/`. When unavailable the section is omitted. `watchdog_fired=True` changes the "done signal" row to warn reviewers. -### Gitea client — `bot_bottle/contrib/gitea/client.py` +### Gitea HTTP client — `bot_bottle/contrib/gitea/client.py` + +`GiteaForge` (and the existing `GiteaDeployKeyProvisioner`) share one thin HTTP +client. Unlike the option-2 design, the token is held by the sidecar process and +passed to the client directly — there is no agent-side cred-proxy route to +inject it, because the agent never makes forge calls. ```python class GiteaClient: - def __init__(self, *, api_url: str) -> None: ... + def __init__(self, *, api_url: str, owner: str, repo: str, token: str) -> None: ... def is_org_member(self, org: str, username: str) -> bool: ... 
- def post_comment(self, owner: str, repo: str, issue_number: int, body: str) -> None: ... - def get_pr_for_issue(self, owner: str, repo: str, issue_number: int) -> int | None: ... - def is_pr_open(self, owner: str, repo: str, pr_number: int) -> bool: ... + def post_comment(self, issue_number: int, body: str) -> None: ... + def update_comment_body(self, issue_number: int, body: str) -> None: ... + def get_pr_for_issue(self, issue_number: int) -> int | None: ... + def is_pr_open(self, pr_number: int) -> bool: ... ``` -Auth is not configured in the client — the egress layer injects the token on -the way out, matching the existing `GiteaDeployKeyProvisioner` pattern. +Sharing only the HTTP client (not an abstract base) is the deliberate boundary +between the sidecar and the deploy-key provisioner — see the deferral note under +the `Forge` abstraction. ### Implementation chunks @@ -327,16 +436,25 @@ the way out, matching the existing `GiteaDeployKeyProvisioner` pattern. read/write/delete/all helpers, atomic rename. Tests: round-trip JSON, missing file → None, atomic write. -3. **Gitea client** — `contrib/gitea/client.py`: `is_org_member`, - `post_comment`, `get_pr_for_issue`, `is_pr_open`. Tests: mock - `urllib.request.urlopen`, assert payloads and 404-as-false for membership. +3. **`Forge` abstraction + Gitea client** — `contrib/forge/base.py` (`Forge` + ABC) and `contrib/gitea/client.py` + `GiteaForge`: `is_org_member`, + `read_issue`, `read_comments`, `post_comment`, `update_description`, + `get_pr_for_issue`, `is_pr_open`. Tests: mock `urllib.request.urlopen`, + assert payloads and 404-as-false for membership. -4. **Provenance** — `contrib/gitea/provenance.py`: `build_provenance_footer`. +4. **Forge sidecar** — sidecar process exposing the protocol over a Unix socket, + queue-dir relay, write-scope enforcement, semantic op log, `signal_done`. + Reuses the supervise sidecar bundle machinery. 
Tests: dispatch each method to + the `Forge`, reject out-of-scope writes, `signal_done` writes a queue event, + scope-rejection is logged. + +5. **Provenance** — `contrib/gitea/provenance.py`: `build_provenance_footer`. Tests: required fields present, watchdog row text, egress omitted when log absent. -5. **`./cli.py orchestrate`** — `cli/orchestrate.py` with `start`, `resume`, - `status` subcommands wired into `cli.py`. Tests: arg parsing, `start` +6. **`./cli.py orchestrate`** — `cli/orchestrate.py` with `start`, `resume`, + `status` subcommands wired into `cli.py`; `start` launches the forge sidecar + alongside the agent for forge-targeted runs. Tests: arg parsing, `start` delegates to `start_headless`, `resume` delegates to `resume_headless`. ## Provenance as the product