PRD prd-new: Forge native integration

  • Status: Draft
  • Author: claude
  • Created: 2026-06-29
  • Issue: #317

Summary

Add a webhook-driven orchestration layer that lets Gitea issues and PR comments drive bot-bottle sessions end-to-end with no operator in the loop for the happy path. An issue assigned to a member of the configured agent org and labelled with an agent name triggers a headless bottle launch; the bottle processes the issue, opens a PR, and posts a done-comment via the Gitea API (through cred-proxy) before exiting. The orchestrator detects the done-comment, freezes the bottle, and attaches a provenance footer. Subsequent PR comments rehydrate the frozen bottle. The bottle is destroyed when the PR closes.

The separation of concerns across the two layers: bot-bottle owns the headless launch primitives, forge state, Gitea client, and provenance builder. bot-bottle-orchestrator (separate binary) owns the webhook listener, bottle lifecycle loop, and monitoring dashboard; it calls into bot-bottle via ./cli.py orchestrate, a thin wrapper command. This PRD covers bot-bottle's side of that contract.

Problem

Today an operator must open the TUI, select an agent and bottle, confirm the preflight, and type prompts interactively. This blocks "issue → PR" automation and produces no durable audit record of what the agent did. The security model already provides the right isolation and egress controls; the missing pieces are the headless launch primitive that bot-bottle-orchestrator can call, the in-bottle Gitea API access the agent uses to signal completion, and the provenance trail that makes the audit story legible to reviewers on every PR.

Goals / Success Criteria

  1. ./cli.py orchestrate start and ./cli.py orchestrate resume are the non-interactive counterparts to start and resume. They accept agent, bottle, and prompt via flags rather than TUI pickers, and exit when the agent process exits.
  2. An issue assigned to a member of the configured org (FORGE_ORG, default bot-bottle) and labelled bot-bottle:<agent-name> is the trigger convention. Org membership is verified via the Gitea API at event time.
  3. Forge-targeted bottles receive a set of env vars at launch (FORGE_GITEA_API, FORGE_OWNER, FORGE_REPO, FORGE_ISSUE_NUMBER, plus FORGE_PR_NUMBER, which stays empty until a PR exists) so the agent knows where to post its done-comment without hardcoding forge context in the agent manifest.
  4. The agent's egress policy for forge runs includes gitea.<host> with Bearer auth injected by cred-proxy, enabling direct Gitea API calls from inside the bottle.
  5. The done-comment the agent posts is the done signal. If no done-comment arrives within the watchdog timeout (configurable, default 30 min), the orchestrator posts one on the agent's behalf.
  6. Every orchestrator-posted comment ends with a provenance footer: agent name, bottle name(s), slug, start time, duration, exit code, gitleaks result, and egress summary.
  7. Forge state (issue → slug, status) is persisted to disk and survives orchestrator restarts.
  8. ./cli.py orchestrate status lists active forge-managed bottles and their issue/PR URLs.
  9. Unit tests cover: label parsing, org-membership check path, forge state read/write, provenance footer rendering, headless launch arg construction, forge env var injection, echo-loop guard.

Non-goals

  • Webhook signature verification (HMAC-SHA256). Added as a follow-up.
  • The bot-bottle-orchestrator binary itself — this PRD covers bot-bottle's side of the interface only. The orchestrator is a separate project.
  • GitHub or GitLab support.
  • Multiple simultaneous forge bottles per issue.
  • Automatic retry on agent error exit.
  • Bottle destruction on issue close (PR close only; issue close is ambiguous).
  • Concurrent multi-issue handling (one blocking run per orchestrator process).
  • A monitoring dashboard (orchestrator-side concern).

Design

Targeting convention

An issue is forge-targeted when both hold:

  • At least one assignee is a member of the Gitea org named by FORGE_ORG (default bot-bottle). Checked via GET /api/v1/orgs/{org}/members/{user}.
  • At least one label has the prefix bot-bottle:. The suffix names the agent manifest, e.g. bot-bottle:implementer → agent implementer.

FORGE_ORG is read at orchestrate-command startup. It is not embedded in manifests or state files; the orchestrator stamps its value into log output for auditability.

An optional label bot-bottle-bottle:<name> overrides bottle selection. When absent the agent's default bottle is used.
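The label half of the targeting convention can be sketched as a small pure function. This is a minimal illustration, not the planned implementation: the names ForgeTarget and parse_forge_labels are hypothetical, and the "first matching label wins" rule is an assumption the PRD does not state.

```python
from __future__ import annotations

from dataclasses import dataclass

AGENT_PREFIX = "bot-bottle:"
BOTTLE_PREFIX = "bot-bottle-bottle:"


@dataclass(frozen=True)
class ForgeTarget:
    """Hypothetical result type: which agent (and bottle) a set of labels selects."""
    agent_name: str
    bottle_name: str | None  # None means "use the agent's default bottle"


def parse_forge_labels(labels: list[str]) -> ForgeTarget | None:
    """Return the targeted agent and optional bottle override, or None if untargeted."""
    agent: str | None = None
    bottle: str | None = None
    for label in labels:
        if label.startswith(BOTTLE_PREFIX):
            # Assumption: the first bottle override wins; an empty suffix is ignored.
            bottle = bottle or label[len(BOTTLE_PREFIX):] or None
        elif label.startswith(AGENT_PREFIX):
            # Assumption: the first agent label wins; an empty suffix is ignored.
            agent = agent or label[len(AGENT_PREFIX):] or None
    if agent is None:
        return None
    return ForgeTarget(agent_name=agent, bottle_name=bottle)
```

The org-membership half of the check is deliberately left out here because it needs a network call; keeping label parsing pure makes it easy to unit-test on its own.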

./cli.py orchestrate — the thin wrapper

./cli.py orchestrate start  --agent AGENT [--bottle BOTTLE ...] --prompt PROMPT
                            [--label LABEL] [--backend BACKEND]
./cli.py orchestrate resume --slug SLUG --prompt PROMPT [--backend BACKEND]
./cli.py orchestrate status

orchestrate start is start_headless exposed as a subcommand. It prepares the bottle non-interactively, launches the agent in print mode, and exits with the agent's exit code. The caller (bot-bottle-orchestrator) manages freeze, state, and Gitea comments around it.

orchestrate resume is resume_headless exposed as a subcommand.

orchestrate status prints the forge state table.
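As a rough sketch of how the three subcommands might be parsed, assuming argparse and assuming that --bottle takes one or more space-separated names (the usage line above does not settle that), something like the following would work. The function name build_orchestrate_parser is hypothetical.

```python
import argparse


def build_orchestrate_parser() -> argparse.ArgumentParser:
    """Build a parser for the start, resume, and status subcommands."""
    parser = argparse.ArgumentParser(prog="cli.py orchestrate")
    sub = parser.add_subparsers(dest="command", required=True)

    start = sub.add_parser("start", help="non-interactive counterpart to start")
    start.add_argument("--agent", required=True)
    # Assumption: one flag followed by one or more bottle names.
    start.add_argument("--bottle", nargs="+", default=[])
    start.add_argument("--prompt", required=True)
    start.add_argument("--label", default=None)
    start.add_argument("--backend", default=None)

    resume = sub.add_parser("resume", help="non-interactive counterpart to resume")
    resume.add_argument("--slug", required=True)
    resume.add_argument("--prompt", required=True)
    resume.add_argument("--backend", default=None)

    sub.add_parser("status", help="print the forge state table")
    return parser
```

A caller would dispatch on args.command and pass the parsed values through to start_headless or resume_headless.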

Headless primitives

attach_agent_headless — new function in bot_bottle/cli/start.py:

def attach_agent_headless(
    bottle: Bottle,
    *,
    prompt: str,
    resume: bool = False,
    agent_provider_template: str = "claude",
    startup_args: tuple[str, ...] = (),
) -> int:
    runtime = runtime_for(agent_provider_template)
    agent_args = list(runtime.bypass_args)   # --dangerously-skip-permissions
    agent_args.extend(startup_args)
    agent_args.append("--no-interactive")
    if resume:
        agent_args.extend(runtime.resume_args)  # --continue
    agent_args.extend(["-p", prompt])
    return bottle.exec_agent(agent_args, tty=False)

start_headless — new function in bot_bottle/cli/start.py that mirrors _launch_bottle without any TUI steps:

def start_headless(
    manifest: ManifestIndex,
    *,
    agent_name: str,
    bottle_names: tuple[str, ...],
    label: str,
    prompt: str,
    forge_env: dict[str, str] | None = None,
    backend_name: str | None = None,
) -> tuple[str, int]:
    """Non-interactive bottle launch. Returns (slug, exit_code)."""

forge_env is merged into the bottle's guest_env so the agent receives the forge context as env vars (see below). The caller freezes the bottle after start_headless returns.
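The two behaviours the tests target here, argument order and forge_env injection, can be isolated as pure helpers. The sketch below is illustrative only: build_headless_args and merge_forge_env are hypothetical names, the runtime's bypass and resume arguments are passed in as plain tuples, and letting forge values win on key collisions is an assumption.

```python
from __future__ import annotations


def build_headless_args(
    bypass_args: tuple[str, ...],
    resume_args: tuple[str, ...],
    *,
    prompt: str,
    resume: bool = False,
    startup_args: tuple[str, ...] = (),
) -> list[str]:
    """Mirror the argument order used by attach_agent_headless."""
    args = list(bypass_args)
    args.extend(startup_args)
    args.append("--no-interactive")
    if resume:
        args.extend(resume_args)
    args.extend(["-p", prompt])  # the prompt always comes last
    return args


def merge_forge_env(
    guest_env: dict[str, str], forge_env: dict[str, str] | None
) -> dict[str, str]:
    """Return a new env dict; forge values win on key collisions (assumption)."""
    merged = dict(guest_env)
    merged.update(forge_env or {})
    return merged
```

Returning a new dict rather than mutating guest_env keeps the helper safe to call more than once for the same bottle.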

resume_headless — new function in bot_bottle/cli/resume.py:

def resume_headless(slug: str, *, prompt: str, backend_name: str | None = None) -> int:
    """Rehydrate a frozen bottle and run one headless prompt. Returns exit_code."""

Forge env vars

The orchestrator builds this dict and passes it to start_headless as forge_env:

| Var | Example | Purpose |
|---|---|---|
| FORGE_GITEA_API | https://gitea.dideric.is/api/v1 | Base URL for Gitea API calls |
| FORGE_OWNER | didericis | Repo owner |
| FORGE_REPO | bot-bottle | Repo name |
| FORGE_ISSUE_NUMBER | 317 | Issue that triggered the run |
| FORGE_PR_NUMBER | 318 | PR to comment on (empty until PR exists) |

The agent's system prompt (from the manifest) instructs it to post a comment to $FORGE_GITEA_API/repos/$FORGE_OWNER/$FORGE_REPO/issues/$FORGE_ISSUE_NUMBER/comments when it finishes a work unit. The instruction is part of the forge-specific agent prompt, not the base agent manifest, so non-forge runs are unaffected.
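For illustration, the env dict and the comment URL the agent derives from it could be built as follows. The helper names build_forge_env and done_comment_url are hypothetical; the variable names and the URL shape come from the table and paragraph above.

```python
from __future__ import annotations


def build_forge_env(
    api_url: str,
    owner: str,
    repo: str,
    issue_number: int,
    pr_number: int | None = None,
) -> dict[str, str]:
    """Build the forge context passed to start_headless as forge_env."""
    return {
        "FORGE_GITEA_API": api_url.rstrip("/"),
        "FORGE_OWNER": owner,
        "FORGE_REPO": repo,
        "FORGE_ISSUE_NUMBER": str(issue_number),
        # Empty string until a PR exists, as described in the table.
        "FORGE_PR_NUMBER": "" if pr_number is None else str(pr_number),
    }


def done_comment_url(env: dict[str, str]) -> str:
    """Expand the comment endpoint the agent is told to post to."""
    return (
        f"{env['FORGE_GITEA_API']}/repos/{env['FORGE_OWNER']}/"
        f"{env['FORGE_REPO']}/issues/{env['FORGE_ISSUE_NUMBER']}/comments"
    )
```

All values are strings because they end up in a process environment.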

Gitea egress for forge-targeted bottles

Forge-targeted bottles get an additional egress route injected by the orchestrator at launch time. This is passed as an extra EgressRoute in the BottleSpec (or via the forge env and bottle manifest) rather than requiring operators to add it to every agent manifest:

host: gitea.dideric.is
auth:
  scheme: Bearer
  token_env: GITEA_TOKEN

The cred-proxy injects the token; the agent never sees the raw credential.

Done signal and watchdog

The agent posts a Gitea comment when it finishes a work unit. The orchestrator webhook listener receives the issue_comment event and:

  1. Verifies the commenter is a member of FORGE_ORG.
  2. Reads the forge state for (owner, repo, issue_number).
  3. If status == "running", treats the comment as the done signal: freezes the bottle, appends the provenance footer to the same comment thread, sets status = "frozen".

Watchdog: the orchestrator tracks last_checkin_at in forge state. A background thread wakes every minute. If now - last_checkin_at > FORGE_WATCHDOG_TIMEOUT (default 30 min, configurable via env) and status == "running", the orchestrator posts the provenance footer comment on behalf of the agent and freezes the bottle.

Echo-loop guard: a comment counts as the done signal only when comment.user.login == agent_git_user (read from forge state). Comments from other members of FORGE_ORG are dispatched as resume triggers, never as done signals.
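The comment dispatch and the watchdog check can be sketched as two pure functions over the forge state. This is a minimal sketch under stated assumptions: classify_comment and watchdog_expired are hypothetical names, state is passed as a plain dict, and ignoring the agent's own comments outside the running state (and any comment on a destroyed bottle) is an assumption rather than something the PRD specifies.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def classify_comment(commenter: str, is_org_member: bool, state: dict | None) -> str:
    """Return "ignore", "done", or "resume" for an issue_comment event."""
    if not is_org_member or state is None:
        return "ignore"
    if commenter == state["agent_git_user"]:
        # Only the agent's own comment on a running bottle is a done signal.
        # Assumption: its comments in any other state are dropped to avoid echo loops.
        return "done" if state["status"] == "running" else "ignore"
    if state["status"] == "destroyed":
        return "ignore"  # assumption: nothing left to rehydrate
    return "resume"


def watchdog_expired(
    state: dict, now: datetime, timeout: timedelta = timedelta(minutes=30)
) -> bool:
    """True when a running bottle has not checked in within the timeout."""
    if state["status"] != "running":
        return False
    last_checkin = datetime.fromisoformat(state["last_checkin_at"])
    return now - last_checkin > timeout


if __name__ == "__main__":
    example = {
        "status": "running",
        "agent_git_user": "didericis-claude",
        "last_checkin_at": "2026-06-29T12:04:12-04:00",
    }
    print(classify_comment("didericis-claude", True, example))
    print(watchdog_expired(example, datetime.now(timezone.utc)))
```

The now argument must be timezone-aware, since last_checkin_at carries a UTC offset and Python refuses to subtract naive and aware datetimes.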

Forge state — bot_bottle/contrib/gitea/forge_state.py

~/.bot-bottle/forge/
    <owner>/
        <repo>/
            issue-<n>.json

Schema:

{
  "slug": "implementer-abc12",
  "pr_number": 42,
  "agent_name": "implementer",
  "bottle_names": ["claude"],
  "backend_name": "docker",
  "agent_git_user": "didericis-claude",
  "issue_number": 17,
  "owner": "didericis",
  "repo": "bot-bottle",
  "status": "frozen",
  "last_checkin_at": "2026-06-29T12:04:12-04:00"
}

status: "running" | "frozen" | "destroyed".

Public API:

def write_forge_state(state: ForgeState) -> None: ...
def read_forge_state(owner: str, repo: str, issue_number: int) -> ForgeState | None: ...
def delete_forge_state(owner: str, repo: str, issue_number: int) -> None: ...
def all_forge_states() -> list[ForgeState]: ...

Writes use atomic rename (os.replace) for crash safety.
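One way the atomic write might look is shown below, as a sketch rather than the final module. Two things differ from the public API above and are assumptions: the functions take an explicit root directory (so the example does not touch ~/.bot-bottle/forge/), and pr_number is typed as optional.

```python
from __future__ import annotations

import json
import os
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ForgeState:
    slug: str
    pr_number: int | None  # assumption: None until a PR exists
    agent_name: str
    bottle_names: list[str]
    backend_name: str
    agent_git_user: str
    issue_number: int
    owner: str
    repo: str
    status: str  # "running" | "frozen" | "destroyed"
    last_checkin_at: str


def state_path(root: Path, owner: str, repo: str, issue_number: int) -> Path:
    return root / owner / repo / f"issue-{issue_number}.json"


def write_forge_state(state: ForgeState, root: Path) -> None:
    """Write to a temp file in the target directory, then rename over the target."""
    path = state_path(root, state.owner, state.repo, state.issue_number)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(asdict(state), fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def read_forge_state(
    root: Path, owner: str, repo: str, issue_number: int
) -> ForgeState | None:
    try:
        raw = state_path(root, owner, repo, issue_number).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    return ForgeState(**json.loads(raw))
```

Creating the temp file in the same directory as the target matters: os.replace is only atomic when both paths are on the same filesystem.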

Provenance — bot_bottle/contrib/gitea/provenance.py

def build_provenance_footer(
    slug: str,
    *,
    agent_name: str,
    bottle_names: tuple[str, ...],
    started_at: str,
    finished_at: str,
    exit_code: int,
    watchdog_fired: bool = False,
    egress_log_path: Path | None = None,
) -> str:
    """Return a markdown string for appending to a Gitea comment body."""

Output (collapsed by default):

<details><summary>🔬 Run provenance</summary>

| Field | Value |
|---|---|
| agent | `implementer` |
| bottle | `claude` |
| slug | `implementer-abc12` |
| started | 2026-06-29T12:00:00-04:00 |
| duration | 4m 12s |
| exit | 0 ✓ |
| gitleaks | ✓ no secrets detected |
| done signal | agent comment *(or: watchdog — agent did not check in)* |

**Egress** (deny-by-default; 3 routes allowed)
- `api.anthropic.com` — Bearer auth
- `gitea.dideric.is` — Bearer auth
- `pypi.org` — unauthenticated

</details>

The egress summary is read from ~/.bot-bottle/state/<slug>/egress/. When unavailable the section is omitted. watchdog_fired=True changes the "done signal" row to warn reviewers.
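The table portion of the footer could be rendered roughly as follows. This sketch covers only the rows derivable from the function's arguments; the gitleaks row and the egress section are omitted because their inputs are not specified here. The helper names format_duration and provenance_rows, the "✗" marker for non-zero exits, and the exact watchdog wording are assumptions.

```python
from datetime import datetime


def format_duration(started_at: str, finished_at: str) -> str:
    """Render the gap between two ISO 8601 timestamps as e.g. '4m 12s'."""
    delta = datetime.fromisoformat(finished_at) - datetime.fromisoformat(started_at)
    minutes, seconds = divmod(max(int(delta.total_seconds()), 0), 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}h {minutes}m {seconds}s"
    return f"{minutes}m {seconds}s"


def provenance_rows(
    *,
    agent_name: str,
    bottle_names: tuple,
    slug: str,
    started_at: str,
    finished_at: str,
    exit_code: int,
    watchdog_fired: bool = False,
) -> list:
    """Return the markdown table lines for the provenance footer."""
    done = "watchdog (agent did not check in)" if watchdog_fired else "agent comment"
    exit_cell = f"{exit_code} ✓" if exit_code == 0 else f"{exit_code} ✗"
    bottles = ", ".join(f"`{name}`" for name in bottle_names)
    return [
        "| Field | Value |",
        "|---|---|",
        f"| agent | `{agent_name}` |",
        f"| bottle | {bottles} |",
        f"| slug | `{slug}` |",
        f"| started | {started_at} |",
        f"| duration | {format_duration(started_at, finished_at)} |",
        f"| exit | {exit_cell} |",
        f"| done signal | {done} |",
    ]
```

build_provenance_footer would then wrap these lines in the details element and append the egress section when a log is available.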

Gitea client — bot_bottle/contrib/gitea/client.py

class GiteaClient:
    def __init__(self, *, api_url: str) -> None: ...
    def is_org_member(self, org: str, username: str) -> bool: ...
    def post_comment(self, owner: str, repo: str, issue_number: int, body: str) -> None: ...
    def get_pr_for_issue(self, owner: str, repo: str, issue_number: int) -> int | None: ...
    def is_pr_open(self, owner: str, repo: str, pr_number: int) -> bool: ...

Auth is not configured in the client — the egress layer injects the token on the way out, matching the existing GiteaDeployKeyProvisioner pattern.
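A membership check in this style, with no auth header set by the client and 404 treated as "not a member", might look like the sketch below. It is written as a free function with an injectable opener so it can be tested without a network; that parameter, the 10-second timeout, and treating any 2xx status as membership are assumptions.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def is_org_member(
    api_url: str,
    org: str,
    username: str,
    *,
    opener=urllib.request.urlopen,  # injectable for tests (assumption)
) -> bool:
    """Check GET {api_url}/orgs/{org}/members/{username}; 404 means not a member."""
    url = (
        f"{api_url.rstrip('/')}/orgs/{quote(org, safe='')}"
        f"/members/{quote(username, safe='')}"
    )
    # No Authorization header here: the egress layer is expected to inject it.
    request = urllib.request.Request(url, method="GET")
    try:
        with opener(request, timeout=10) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return False
        raise  # other HTTP errors are real failures, not "not a member"
```

Re-raising anything other than 404 keeps an outage or a rejected token from being silently read as "not a member".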

Implementation chunks

  1. Headless primitives: attach_agent_headless + start_headless (with forge_env param) in cli/start.py; resume_headless in cli/resume.py. Tests: no tty, correct arg order, forge_env appears in guest_env.

  2. Forge state: contrib/gitea/forge_state.py with the ForgeState dataclass, read/write/delete/all helpers, atomic rename. Tests: round-trip JSON, missing file → None, atomic write.

  3. Gitea client: contrib/gitea/client.py with is_org_member, post_comment, get_pr_for_issue, is_pr_open. Tests: mock urllib.request.urlopen, assert payloads and 404-as-false for membership.

  4. Provenance: contrib/gitea/provenance.py with build_provenance_footer. Tests: required fields present, watchdog row text, egress omitted when log absent.

  5. ./cli.py orchestrate: cli/orchestrate.py with start, resume, status subcommands wired into cli.py. Tests: arg parsing, start delegates to start_headless, resume delegates to resume_headless.

Provenance as the product

Every orchestrator-posted comment ends with the provenance footer; it is mandatory and cannot be disabled. PRs that land without a footer were not produced by this integration. The watchdog_fired flag in the footer marks runs where the agent did not self-report completion, so reviewers know the audit trail may be incomplete.

The footer links to the bot-bottle repo pinned to the commit SHA active during the run (not main), so the policy that governed the run is permanently anchored in the PR history.