PRD prd-new: Forge native integration

  • Status: Draft
  • Author: claude
  • Created: 2026-06-29
  • Issue: #317

Summary

Add a webhook-driven orchestration layer that lets Gitea issues and PR comments drive bot-bottle sessions end-to-end, with no operator in the loop for the happy path. An issue assigned to the agent user and carrying a bot-bottle:<agent-name> label triggers a headless bottle launch; the bottle processes the issue, opens a PR, and is frozen. Subsequent PR comments rehydrate the bottle with the comment as input. The bottle is destroyed when the PR is closed. Every run emits a provenance footer recording the agent identity, bottle, model, egress activity, and gitleaks outcome, so each PR carries a verifiable audit trail of how it was produced.

Problem

Today an operator must open the TUI, select an agent and bottle, confirm the preflight, and type prompts interactively. This loop is fine for exploratory work but blocks "issue → PR" automation: nothing triggers a bottle from a forge event, and nothing captures what the agent did in a durable, PR-visible record. The security model already produces the right isolation and egress controls; the missing piece is the orchestration layer that closes the loop between forge events and running bottles, plus a provenance trail that makes the audit story legible to reviewers.

Goals / Success Criteria

  1. ./cli.py forge listen starts a webhook listener. Gitea delivers issue and PR events to it.
  2. An issue opened with assignee matching the configured agent Gitea username and at least one bot-bottle:<agent-name> label launches a headless bottle. The issue title + body is the initial prompt.
  3. The bottle runs claude --dangerously-skip-permissions --no-interactive -p "<prompt>" (non-interactive print mode). When it exits, the orchestrator freezes the bottle and posts a comment with the provenance footer.
  4. A new comment on the PR associated with the issue rehydrates the bottle with claude ... --continue -p "<comment body>" and re-freezes on exit.
  5. Closing the PR destroys the bottle and cleans up forge state.
  6. Every comment the orchestrator posts includes a provenance footer: agent name, bottle name(s), model, egress summary, gitleaks pass/fail, start time, and duration.
  7. Forge state (issue → slug mapping) survives orchestrator restarts: a new listen process picks up in-flight bottles from the forge state directory.
  8. ./cli.py forge status lists active forge-managed bottles and their associated issue/PR URLs.
  9. Unit tests cover: label parsing, forge state read/write, provenance footer rendering, headless launch path (no TUI calls), orchestrator event dispatch.

Non-goals

  • Webhook signature verification (HMAC-SHA256 of the request payload, carried in the X-Gitea-Signature header). Can be added as a follow-up; the listener accepts all POSTs for now.
  • GitHub or GitLab event support.
  • Multiple simultaneous forge bottles per issue.
  • Automatic retry on agent error exit.
  • Bottle destruction on issue close (only PR close is in scope; issue close is ambiguous — the issue may close before the PR does).
  • Auto-discovery of repos to watch; the operator configures the Gitea webhook URL manually.
  • Parallelism between the orchestrator and the running bottle (one active bottle per issue at a time; a comment that arrives while the bottle is running is skipped, since queueing is out of scope).

Design

Label convention

An issue is forge-targeted when both of the following are true:

  • Assignee login matches FORGE_AGENT_USER env var (default: didericis-claude).
  • At least one label has the prefix bot-bottle:. The suffix names the agent manifest, e.g. bot-bottle:implementer → agent implementer.

If the label suffix matches no known agent, the orchestrator posts an error comment and does not launch a bottle.

Optionally, a second label bot-bottle-bottle:<bottle-name> overrides the bottle selection (analogous to multi-bottle selection in PRD 0066). When absent, the agent's default bottle is used.
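
The label convention above can be sketched as a small pure function. This is an illustrative sketch only: parse_forge_labels and is_forge_targeted are hypothetical names, and taking the first bot-bottle: label when several are present is an assumption the PRD does not state.

```python
from __future__ import annotations

AGENT_PREFIX = "bot-bottle:"
BOTTLE_PREFIX = "bot-bottle-bottle:"


def parse_forge_labels(labels: list[str]) -> tuple[str | None, tuple[str, ...]]:
    """Return (agent_name, bottle_names) parsed from issue label names.

    agent_name is None when no bot-bottle:<agent-name> label is present.
    bottle_names is empty when no override label is present.
    """
    agent_name: str | None = None
    bottle_names: list[str] = []
    for label in labels:
        if label.startswith(BOTTLE_PREFIX):
            suffix = label[len(BOTTLE_PREFIX):].strip()
            if suffix:
                bottle_names.append(suffix)
        elif label.startswith(AGENT_PREFIX) and agent_name is None:
            # Assumption: the first agent label wins if several are present.
            suffix = label[len(AGENT_PREFIX):].strip()
            if suffix:
                agent_name = suffix
    return agent_name, tuple(bottle_names)


def is_forge_targeted(assignees: list[str], labels: list[str], agent_user: str) -> bool:
    """True when the agent user is assigned and an agent label is present."""
    agent_name, _ = parse_forge_labels(labels)
    return agent_user in assignees and agent_name is not None
```

Note that the two prefixes do not collide: bot-bottle-bottle:claude does not start with bot-bottle:, so the order of the two checks is not load-bearing.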

Headless launch — attach_agent_headless

A new function in bot_bottle/cli/start.py:

def attach_agent_headless(
    bottle: Bottle,
    *,
    prompt: str,
    resume: bool = False,
    agent_provider_template: str = "claude",
    startup_args: tuple[str, ...] = (),
) -> int:
    """Run the provider CLI inside bottle in non-interactive print mode.

    Blocks until the agent exits; returns the exit code. No tty.
    resume=True adds --continue so the agent resumes its last session
    before processing prompt."""
    runtime = runtime_for(agent_provider_template)
    agent_args = list(runtime.bypass_args)   # --dangerously-skip-permissions
    agent_args.extend(startup_args)
    agent_args.append("--no-interactive")
    if resume:
        agent_args.extend(runtime.resume_args)  # --continue
    agent_args.extend(["-p", prompt])
    return bottle.exec_agent(agent_args, tty=False)

The system prompt from the agent's manifest .md file is still applied via --append-system-prompt-file in startup_args (provisioned by ClaudeAgentProvider.provision_prompt). The -p arg is the user-visible prompt the issue or comment supplies.

Headless start — start_headless

A new function in bot_bottle/cli/start.py that mirrors _launch_bottle but skips all TUI steps:

def start_headless(
    manifest: ManifestIndex,
    *,
    agent_name: str,
    bottle_names: tuple[str, ...],
    label: str,
    prompt: str,
    backend_name: str | None = None,
) -> tuple[str, int]:
    """Non-interactive bottle launch for forge-driven runs.

    Prepares the bottle and runs attach_agent_headless; the caller freezes it afterwards.
    Returns (slug, exit_code). Does not prompt the operator or open a tty.
    Raises on backend errors."""

start_headless:

  1. Builds a BottleSpec with copy_cwd=False, color="".
  2. Calls backend.prepare directly (no preflight render, no y/N prompt).
  3. Enters backend.launch(plan) and calls attach_agent_headless(bottle, prompt=prompt).
  4. Captures session state and returns (slug, exit_code).

The caller (orchestrator) is responsible for calling get_freezer(backend_name).commit_slug(slug) after the bottle exits.

Headless resume — resume_headless

A new function in bot_bottle/cli/resume.py that mirrors cmd_resume but non-interactively:

def resume_headless(
    slug: str,
    *,
    prompt: str,
    backend_name: str | None = None,
) -> int:
    """Rehydrate a frozen bottle and run one headless prompt. Returns exit_code."""

Forge state — bot_bottle/contrib/gitea/forge_state.py

Per-issue tracking persisted to disk:

~/.bot-bottle/forge/
    <owner>/
        <repo>/
            issue-<n>.json

Schema:

{
  "slug": "implementer-abc12",
  "pr_number": 42,
  "agent_name": "implementer",
  "bottle_names": ["claude"],
  "backend_name": "docker",
  "issue_number": 17,
  "owner": "didericis",
  "repo": "bot-bottle",
  "status": "frozen"
}

status is one of "running" | "frozen" | "destroyed".

Public API:

def write_forge_state(state: ForgeState) -> None: ...
def read_forge_state(owner: str, repo: str, issue_number: int) -> ForgeState | None: ...
def delete_forge_state(owner: str, repo: str, issue_number: int) -> None: ...
def all_forge_states() -> list[ForgeState]: ...
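
A minimal sketch of how the read and write halves of this API might look, assuming a ForgeState dataclass whose fields mirror the schema above. The root parameter is a hypothetical addition (not in the signatures above) so the functions can be pointed at a temporary directory in tests; the atomic write uses a temp file plus os.replace.

```python
from __future__ import annotations

import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path

FORGE_ROOT = Path.home() / ".bot-bottle" / "forge"


@dataclass
class ForgeState:
    slug: str
    agent_name: str
    backend_name: str
    issue_number: int
    owner: str
    repo: str
    status: str = "running"  # "running" | "frozen" | "destroyed"
    pr_number: int | None = None
    bottle_names: list[str] = field(default_factory=list)


def _state_path(root: Path, owner: str, repo: str, issue_number: int) -> Path:
    return root / owner / repo / f"issue-{issue_number}.json"


def write_forge_state(state: ForgeState, root: Path = FORGE_ROOT) -> None:
    path = _state_path(root, state.owner, state.repo, state.issue_number)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target
    # so a reader never observes a half-written JSON file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(asdict(state), handle, indent=2)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def read_forge_state(
    owner: str, repo: str, issue_number: int, root: Path = FORGE_ROOT
) -> ForgeState | None:
    try:
        raw = _state_path(root, owner, repo, issue_number).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None  # no state recorded for this issue
    return ForgeState(**json.loads(raw))
```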

Provenance — bot_bottle/contrib/gitea/provenance.py

Reads bottle metadata and egress log summary, produces a markdown section:

def build_provenance_footer(
    slug: str,
    *,
    started_at: str,
    finished_at: str,
    exit_code: int,
    egress_log_path: Path | None = None,
) -> str:
    """Return a markdown string suitable for appending to a PR/comment body."""

Output format (collapsed by default via <details>):

<details><summary>🔬 Run provenance</summary>

| Field | Value |
|---|---|
| agent | `implementer` |
| bottle | `claude` |
| slug | `implementer-abc12` |
| started | 2026-06-29T12:00:00-04:00 |
| duration | 4m 12s |
| exit | 0 ✓ |
| gitleaks | ✓ no secrets detected |

**Egress summary** (deny-by-default; routes allowed: 2)
- `api.anthropic.com` — Bearer auth
- `gitea.dideric.is` — unauthenticated

</details>

The egress summary is read from the egress log written by the egress proxy sidecar in ~/.bot-bottle/state/<slug>/egress/. When unavailable (backend has no egress log), the section is omitted rather than erroring.
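
The rendering half of the footer can be sketched independently of where the metadata comes from. The helpers below are hypothetical (format_duration and render_footer are not named in this PRD) and assume ISO 8601 timestamps like the one in the example; passing None for the egress routes models the "log unavailable" case.

```python
from __future__ import annotations

from datetime import datetime


def format_duration(started_at: str, finished_at: str) -> str:
    """Render the gap between two ISO 8601 timestamps, e.g. '4m 12s'."""
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(finished_at)
    total = max(0, int((end - start).total_seconds()))
    minutes, seconds = divmod(total, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}h {minutes}m {seconds}s"
    if minutes:
        return f"{minutes}m {seconds}s"
    return f"{seconds}s"


def render_footer(rows: dict[str, str], egress_routes: list[str] | None = None) -> str:
    """Build the collapsed <details> block from field/value rows."""
    lines = [
        "<details><summary>🔬 Run provenance</summary>",
        "",
        "| Field | Value |",
        "|---|---|",
    ]
    for name, value in rows.items():
        lines.append(f"| {name} | {value} |")
    # Omit the egress section entirely when no log is available.
    if egress_routes is not None:
        lines.append("")
        lines.append(
            f"**Egress summary** (deny-by-default; routes allowed: {len(egress_routes)})"
        )
        lines.extend(f"- {route}" for route in egress_routes)
    lines.extend(["", "</details>"])
    return "\n".join(lines)
```

Keeping the renderer free of file I/O makes the "all required fields present" and "graceful omission" tests straightforward string assertions.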

Gitea API client — bot_bottle/contrib/gitea/client.py

Thin stdlib-only HTTP wrapper used by the orchestrator:

class GiteaClient:
    def __init__(self, *, api_url: str) -> None: ...
    def post_comment(self, owner: str, repo: str, issue_number: int, body: str) -> None: ...
    def get_pr_for_issue(self, owner: str, repo: str, issue_number: int) -> int | None:
        """Return the PR number whose body references issue_number, or None."""
    def is_pr_open(self, owner: str, repo: str, pr_number: int) -> bool: ...

Authentication is not configured in the client — the egress layer injects the Gitea token on the way out (same pattern as GiteaDeployKeyProvisioner).
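
As a rough illustration of the stdlib-only approach, post_comment could be built on urllib.request. This sketch assumes api_url already includes the /api/v1 prefix and uses Gitea's issue-comment endpoint; build_comment_request is a hypothetical helper split out so the payload can be asserted without network access. No Authorization header is set, in line with the egress-injection pattern described above.

```python
import json
import urllib.request


class GiteaClient:
    def __init__(self, *, api_url: str) -> None:
        # Assumption: api_url looks like "https://<host>/api/v1".
        self._api_url = api_url.rstrip("/")

    def build_comment_request(
        self, owner: str, repo: str, issue_number: int, body: str
    ) -> urllib.request.Request:
        """Hypothetical helper: construct the POST without sending it."""
        url = f"{self._api_url}/repos/{owner}/{repo}/issues/{issue_number}/comments"
        payload = json.dumps({"body": body}).encode("utf-8")
        return urllib.request.Request(
            url,
            data=payload,
            method="POST",
            headers={"Content-Type": "application/json"},
        )

    def post_comment(self, owner: str, repo: str, issue_number: int, body: str) -> None:
        request = self.build_comment_request(owner, repo, issue_number, body)
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urllib.request.urlopen(request, timeout=30) as response:
            response.read()
```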

Orchestrator — bot_bottle/contrib/gitea/orchestrator.py

class ForgeOrchestrator:
    def __init__(
        self,
        *,
        manifest: ManifestIndex,
        gitea_client: GiteaClient,
        agent_user: str,
        backend_name: str | None = None,
    ) -> None: ...

    def on_issue_opened(self, event: dict) -> None: ...
    def on_issue_comment_created(self, event: dict) -> None: ...
    def on_pull_request_closed(self, event: dict) -> None: ...

on_issue_opened:

  1. Extract owner, repo, issue_number, assignees, labels, title, body.
  2. Verify that assignees contains agent_user. Bail silently if not.
  3. Parse bot-bottle:<agent-name> label. Post error comment + return if absent or unknown.
  4. Parse optional bot-bottle-bottle:<bottle-name> label; else bottle_names = ().
  5. Build prompt: f"Issue #{issue_number}: {title}\n\n{body}".
  6. Call start_headless(manifest, agent_name=..., bottle_names=..., label=..., prompt=...).
  7. Write forge state (status="running").
  8. On bottle exit: get_freezer(backend).commit_slug(slug).
  9. Update forge state status="frozen", set pr_number by querying Gitea for a PR referencing the issue.
  10. Post provenance comment on the PR (or the issue if no PR found).

on_issue_comment_created:

  1. Look up forge state by (owner, repo, issue_number). Skip if not found or destroyed.
  2. Skip if comment author is agent_user (prevents echo loops).
  3. Skip if forge state status == "running" (already active; queue is out of scope).
  4. Update forge state status="running".
  5. Call resume_headless(slug, prompt=comment_body).
  6. Re-freeze: get_freezer(backend).commit_slug(slug).
  7. Update forge state status="frozen".
  8. Post provenance comment.
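
The three skip conditions in steps 1 to 3 can be isolated into one predicate, which is easy to unit-test for the echo-loop and status=running cases. A minimal sketch; should_resume is a hypothetical name, and state_status is None when no forge state exists for the issue.

```python
from __future__ import annotations


def should_resume(state_status: str | None, comment_author: str, agent_user: str) -> bool:
    """Decide whether a new comment should rehydrate the bottle."""
    # Step 1: no forge state, or the bottle was already destroyed.
    if state_status is None or state_status == "destroyed":
        return False
    # Step 2: the agent's own comments must not retrigger it (echo loop).
    if comment_author == agent_user:
        return False
    # Step 3: a run is already in progress; queueing is out of scope.
    if state_status == "running":
        return False
    return True
```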

on_pull_request_closed:

  1. Match pr_number against all forge states for (owner, repo).
  2. Destroy the bottle: call the backend's teardown for slug and delete the image.
  3. Set status="destroyed".

Webhook listener — bot_bottle/contrib/gitea/webhook_server.py

Small http.server.BaseHTTPRequestHandler that:

  • Accepts POST /webhook.
  • Reads the X-Gitea-Event header to select the handler.
  • Deserializes the JSON body and calls the orchestrator's matching on_* method.
  • Returns HTTP 200 for known events, 204 for unknown (no-op).

Runs in the same thread as the CLI (blocking serve_forever). The orchestrator handlers are synchronous; long-running launches block the listener thread for the duration. (Concurrent multi-issue handling is out of scope for the MVP.)
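
A minimal sketch of the listener, under stated assumptions: the event names (issues, issue_comment, pull_request) and the action field in the payload follow Gitea's webhook conventions as I understand them, and keying the dispatch on the (event, action) pair is a choice this PRD does not spell out. dispatch and make_handler are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler

# Assumption: (X-Gitea-Event value, payload "action") -> orchestrator method.
HANDLERS = {
    ("issues", "opened"): "on_issue_opened",
    ("issue_comment", "created"): "on_issue_comment_created",
    ("pull_request", "closed"): "on_pull_request_closed",
}


def dispatch(orchestrator, event_name: str, payload: dict) -> int:
    """Call the matching on_* method and return the HTTP status to send."""
    method_name = HANDLERS.get((event_name, payload.get("action", "")))
    if method_name is None:
        return 204  # unknown event: no-op
    getattr(orchestrator, method_name)(payload)
    return 200


def make_handler(orchestrator):
    """Build a request handler class bound to one orchestrator instance."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            if self.path != "/webhook":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", "0"))
            try:
                payload = json.loads(self.rfile.read(length) or b"{}")
            except ValueError:
                self.send_error(400)
                return
            if not isinstance(payload, dict):
                self.send_error(400)
                return
            event_name = self.headers.get("X-Gitea-Event", "")
            self.send_response(dispatch(orchestrator, event_name, payload))
            self.end_headers()

    return WebhookHandler
```

Usage would be along the lines of HTTPServer((host, port), make_handler(orchestrator)).serve_forever(), which blocks the calling thread as described above.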

CLI — bot_bottle/cli/forge.py

./cli.py forge listen [--host HOST] [--port PORT] [--agent-user USER]
./cli.py forge status

listen defaults: --host 0.0.0.0 --port 8765 --agent-user $FORGE_AGENT_USER.

status prints a table of active forge bottles (slug, issue URL, PR URL, status).

forge is registered in cli.py alongside start, resume, commit, etc.
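
The argument surface above might be wired with argparse roughly as follows. This is a sketch: build_forge_parser is a hypothetical name, and falling back to didericis-claude when FORGE_AGENT_USER is unset follows the default given under the label convention.

```python
import argparse
import os


def build_forge_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cli.py forge")
    sub = parser.add_subparsers(dest="forge_command", required=True)

    listen = sub.add_parser("listen", help="start the Gitea webhook listener")
    listen.add_argument("--host", default="0.0.0.0")
    listen.add_argument("--port", type=int, default=8765)
    listen.add_argument(
        "--agent-user",
        # Assumption: same fallback as the FORGE_AGENT_USER default.
        default=os.environ.get("FORGE_AGENT_USER", "didericis-claude"),
    )

    sub.add_parser("status", help="list active forge-managed bottles")
    return parser
```

For example, build_forge_parser().parse_args(["listen", "--port", "9000"]) yields a namespace with forge_command="listen", host="0.0.0.0", and port=9000.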

Provenance as the product

Every comment the orchestrator posts ends with the provenance footer. The footer is not optional and not configurable off. This is load-bearing: it is the audit trail that lets human reviewers verify what the agent did, what credentials it had access to, what it called out to, and whether gitleaks caught anything. PRs that land without a provenance footer were not opened by the forge integration.

The footer also links back to the bot-bottle repo (anchored to the commit SHA used for the run, not main) so the policy that governed the run is pinned in the PR history.

Implementation chunks

  1. Headless primitives: attach_agent_headless + start_headless in cli/start.py; resume_headless in cli/resume.py. Tests: assert no tty, correct arg construction with and without resume=True.

  2. Forge state: contrib/gitea/forge_state.py: ForgeState dataclass, write_forge_state, read_forge_state, delete_forge_state, all_forge_states. Tests: round-trip JSON, missing file returns None, concurrent-write safety via atomic rename.

  3. Gitea client: contrib/gitea/client.py: post_comment, get_pr_for_issue. Tests: mock urllib.request.urlopen and assert payloads.

  4. Provenance: contrib/gitea/provenance.py: build_provenance_footer. Tests: verify footer contains all required fields; verify graceful omission when egress log is absent.

  5. Orchestrator: contrib/gitea/orchestrator.py: ForgeOrchestrator with the three on_* handlers. Tests: mock start_headless, resume_headless, get_freezer, GiteaClient, forge_state.*; assert correct calls for each event path (happy path, unknown label, echo-loop prevention, status=running guard).

  6. Webhook listener: contrib/gitea/webhook_server.py. Tests: mock orchestrator methods; assert correct dispatch per X-Gitea-Event value and correct HTTP status codes.

  7. CLI wiring: cli/forge.py + registration in cli.py. Tests: cmd_forge_status tabular output, cmd_forge_listen argument parsing.

Open questions

None.