bot-bottle/docs/prds/prd-new-forge-native-integration.md

# PRD prd-new: Forge native integration

- **Status:** Draft
- **Author:** claude
- **Created:** 2026-06-29
- **Issue:** #317

## Summary

Add a webhook-driven orchestration layer that lets Gitea issues and PR comments
drive bot-bottle sessions end-to-end — no operator in the loop for the happy
path. An issue assigned to the agent user and labelled with a bottle name
triggers a headless bottle launch; the bottle processes the issue, opens a PR,
and is frozen. Subsequent PR comments rehydrate the bottle with the comment as
input. The bottle is destroyed when the PR is closed. Every run emits a provenance
footer recording the agent identity, bottle, model, egress activity, and
gitleaks outcome so each PR carries a verifiable audit trail of how it was
produced.

## Problem

Today an operator must open the TUI, select an agent and bottle, confirm the
preflight, and type prompts interactively. This loop is fine for exploratory
work but blocks "issue → PR" automation: nothing triggers a bottle from a forge
event, and nothing captures what the agent did in a durable, PR-visible record.
The security model already produces the right isolation and egress controls; the
missing piece is the orchestration layer that closes the loop between forge
events and running bottles, plus a provenance trail that makes the audit story
legible to reviewers.

## Goals / Success Criteria

1. `./cli.py forge listen` starts a webhook listener. Gitea delivers issue and
   PR events to it.
2. An issue opened with assignee matching the configured agent Gitea username
   and at least one `bot-bottle:<agent-name>` label launches a headless bottle.
   The issue title + body is the initial prompt.
3. The bottle runs `claude --dangerously-skip-permissions --no-interactive -p
   "<prompt>"` (non-interactive print mode). When it exits, the orchestrator
   freezes the bottle and posts a comment with the provenance footer.
4. A new comment on the PR associated with the issue rehydrates the bottle with
   `claude ... --continue -p "<comment body>"` and re-freezes on exit.
5. Closing the PR destroys the bottle and cleans up forge state.
6. Every comment the orchestrator posts includes a provenance footer: agent
   name, bottle name(s), model, egress summary, gitleaks pass/fail, start time,
   and duration.
7. Forge state (issue → slug mapping) survives orchestrator restarts: a new
   `listen` process picks up in-flight bottles from the forge state directory.
8. `./cli.py forge status` lists active forge-managed bottles and their
   associated issue/PR URLs.
9. Unit tests cover: label parsing, forge state read/write, provenance footer
   rendering, headless launch path (no TUI calls), orchestrator event dispatch.

## Non-goals

- Webhook signature verification (checking the HMAC-SHA256 digest of the payload
  carried in the `X-Gitea-Signature` header). Can be added as a follow-up; the
  listener accepts all POSTs for now.
- GitHub or GitLab event support.
- Multiple simultaneous forge bottles per issue.
- Automatic retry on agent error exit.
- Bottle destruction on issue close (only PR close is in scope; issue close is
  ambiguous — the issue may close before the PR does).
- Auto-discovery of repos to watch; the operator configures the Gitea webhook
  URL manually.
- Parallelism between the orchestrator and the running bottle (one active
  bottle per issue at a time; a comment that arrives while the bottle is
  running is skipped, not queued).

## Design

### Label convention

An issue is forge-targeted when **both** of the following are true:

- Assignee login matches `FORGE_AGENT_USER` env var (default: `didericis-claude`).
- At least one label has the prefix `bot-bottle:`. The suffix names the agent
  manifest, e.g. `bot-bottle:implementer` → agent `implementer`.

If the label suffix matches no known agent, the orchestrator posts an error
comment and takes no further action.

Optionally, a second label `bot-bottle-bottle:<bottle-name>` overrides the
bottle selection (analogous to multi-bottle selection in PRD 0066). When absent,
the agent's default bottle is used.
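
The label convention above could be parsed roughly as follows. This is a minimal sketch, not the planned implementation: `parse_forge_labels` and the two prefix constants are hypothetical names, the input is assumed to be a plain list of label names already extracted from the webhook payload, and "first agent label wins" is an assumed tie-break the PRD does not specify.

```python
from __future__ import annotations

AGENT_PREFIX = "bot-bottle:"
BOTTLE_PREFIX = "bot-bottle-bottle:"


def parse_forge_labels(labels: list[str]) -> tuple[str | None, tuple[str, ...]]:
    """Return (agent_name, bottle_names) parsed from issue label names.

    agent_name is None when no `bot-bottle:` label is present. bottle_names
    is empty when no override label is present (use the agent's default).
    """
    agent_name: str | None = None
    bottle_names: list[str] = []
    for label in labels:
        if label.startswith(BOTTLE_PREFIX):
            name = label[len(BOTTLE_PREFIX):].strip()
            if name:
                bottle_names.append(name)
        elif label.startswith(AGENT_PREFIX):
            name = label[len(AGENT_PREFIX):].strip()
            # Assumption: when several agent labels exist, the first one wins.
            if name and agent_name is None:
                agent_name = name
    return agent_name, tuple(bottle_names)
```

Checking whether the returned agent name matches a known manifest is left to the caller, since that lookup depends on `ManifestIndex`.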

### Headless launch — `attach_agent_headless`

A new function in `bot_bottle/cli/start.py`:

```python
def attach_agent_headless(
    bottle: Bottle,
    *,
    prompt: str,
    resume: bool = False,
    agent_provider_template: str = "claude",
    startup_args: tuple[str, ...] = (),
) -> int:
    """Run the provider CLI inside bottle in non-interactive print mode.

    Blocks until the agent exits; returns the exit code. No tty.
    resume=True adds --continue so the agent resumes its last session
    before processing prompt."""
    runtime = runtime_for(agent_provider_template)
    agent_args = list(runtime.bypass_args)   # --dangerously-skip-permissions
    agent_args.extend(startup_args)
    agent_args.append("--no-interactive")
    if resume:
        agent_args.extend(runtime.resume_args)  # --continue
    agent_args.extend(["-p", prompt])
    return bottle.exec_agent(agent_args, tty=False)
```

The system prompt from the agent's manifest `.md` file is still applied via
`--append-system-prompt-file` in `startup_args` (provisioned by
`ClaudeAgentProvider.provision_prompt`). The `-p` arg is the user-visible
prompt the issue or comment supplies.

### Headless start — `start_headless`

A new function in `bot_bottle/cli/start.py` that mirrors `_launch_bottle` but
skips all TUI steps:

```python
def start_headless(
    manifest: ManifestIndex,
    *,
    agent_name: str,
    bottle_names: tuple[str, ...],
    label: str,
    prompt: str,
    backend_name: str | None = None,
) -> tuple[str, int]:
    """Non-interactive bottle launch for forge-driven runs.

    Prepares the bottle and runs attach_agent_headless. The caller freezes
    the bottle after this returns. Returns (slug, exit_code). Does not prompt
    the operator or open a tty. Raises on backend errors."""
```

`start_headless`:
1. Builds a `BottleSpec` with `copy_cwd=False`, `color=""`.
2. Calls `backend.prepare` directly (no preflight render, no y/N prompt).
3. Enters `backend.launch(plan)` and calls `attach_agent_headless(bottle, prompt=prompt)`.
4. Captures session state and returns `(slug, exit_code)`.

The caller (orchestrator) is responsible for calling
`get_freezer(backend_name).commit_slug(slug)` after the bottle exits.

### Headless resume — `resume_headless`

A new function in `bot_bottle/cli/resume.py` that mirrors `cmd_resume` but
non-interactively:

```python
def resume_headless(
    slug: str,
    *,
    prompt: str,
    backend_name: str | None = None,
) -> int:
    """Rehydrate a frozen bottle and run one headless prompt. Returns exit_code."""
```

### Forge state — `bot_bottle/contrib/gitea/forge_state.py`

Per-issue tracking persisted to disk:

```
~/.bot-bottle/forge/
    <owner>/
        <repo>/
            issue-<n>.json
```

Schema:

```json
{
  "slug": "implementer-abc12",
  "pr_number": 42,
  "agent_name": "implementer",
  "bottle_names": ["claude"],
  "backend_name": "docker",
  "issue_number": 17,
  "owner": "didericis",
  "repo": "bot-bottle",
  "status": "frozen"
}
```

`status` is one of `"running"` | `"frozen"` | `"destroyed"`.

Public API:

```python
def write_forge_state(state: ForgeState) -> None: ...
def read_forge_state(owner: str, repo: str, issue_number: int) -> ForgeState | None: ...
def delete_forge_state(owner: str, repo: str, issue_number: int) -> None: ...
def all_forge_states() -> list[ForgeState]: ...
```
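
One way the read/write half of this API might look, with the atomic-rename write that the test plan calls for. This is a sketch under stated assumptions: the `ForgeState` field defaults, the `state_path` helper, and the extra `root` parameter (added here only so the functions can be pointed at a temporary directory) are illustrative and not part of the API above.

```python
from __future__ import annotations

import json
import os
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

# Assumed root, following the directory layout shown above.
FORGE_ROOT = Path.home() / ".bot-bottle" / "forge"


@dataclass
class ForgeState:
    slug: str
    agent_name: str
    bottle_names: list[str]
    backend_name: str
    issue_number: int
    owner: str
    repo: str
    status: str = "running"  # "running" | "frozen" | "destroyed"
    pr_number: int | None = None


def state_path(owner: str, repo: str, issue_number: int, root: Path = FORGE_ROOT) -> Path:
    return root / owner / repo / f"issue-{issue_number}.json"


def write_forge_state(state: ForgeState, root: Path = FORGE_ROOT) -> None:
    path = state_path(state.owner, state.repo, state.issue_number, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the
    # target so readers never observe a half-written JSON document.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(asdict(state), fh, indent=2)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise


def read_forge_state(
    owner: str, repo: str, issue_number: int, root: Path = FORGE_ROOT
) -> ForgeState | None:
    try:
        raw = state_path(owner, repo, issue_number, root).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    return ForgeState(**json.loads(raw))
```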

### Provenance — `bot_bottle/contrib/gitea/provenance.py`

Reads bottle metadata and egress log summary, produces a markdown section:

```python
def build_provenance_footer(
    slug: str,
    *,
    started_at: str,
    finished_at: str,
    exit_code: int,
    egress_log_path: Path | None = None,
) -> str:
    """Return a markdown string suitable for appending to a PR/comment body."""
```

Output format (collapsed by default via `<details>`):

```markdown
<details><summary>🔬 Run provenance</summary>

| Field | Value |
|---|---|
| agent | `implementer` |
| bottle | `claude` |
| slug | `implementer-abc12` |
| started | 2026-06-29T12:00:00-04:00 |
| duration | 4m 12s |
| exit | 0 ✓ |
| gitleaks | ✓ no secrets detected |

**Egress summary** (deny-by-default; routes allowed: 2)
- `api.anthropic.com` — Bearer auth
- `gitea.dideric.is` — unauthenticated

</details>
```

The egress summary is read from the egress log written by the egress proxy
sidecar in `~/.bot-bottle/state/<slug>/egress/`. When unavailable (backend
has no egress log), the section is omitted rather than erroring.
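
The rendering half of the footer could be sketched like this. The helper names `format_duration` and `render_footer` are hypothetical, and the egress routes are assumed to arrive as an already-parsed list of display strings, because the egress log format is not specified here. Passing `None` for the routes models the "log unavailable" case and omits the section.

```python
from __future__ import annotations

from datetime import datetime


def format_duration(started_at: str, finished_at: str) -> str:
    """Render the gap between two ISO-8601 timestamps, e.g. '4m 12s'."""
    delta = datetime.fromisoformat(finished_at) - datetime.fromisoformat(started_at)
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes}m {seconds}s" if minutes else f"{seconds}s"


def render_footer(rows: dict[str, str], egress_routes: list[str] | None) -> str:
    """Build the collapsed markdown footer; skip egress when routes is None."""
    lines = [
        "<details><summary>🔬 Run provenance</summary>",
        "",
        "| Field | Value |",
        "|---|---|",
    ]
    lines += [f"| {field} | {value} |" for field, value in rows.items()]
    if egress_routes is not None:
        count = len(egress_routes)
        lines += ["", f"**Egress summary** (deny-by-default; routes allowed: {count})"]
        lines += [f"- {route}" for route in egress_routes]
    lines += ["", "</details>"]
    return "\n".join(lines)
```

Keeping the renderer free of file access makes the "all required fields" and "graceful omission" tests straightforward to write against plain inputs.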

### Gitea API client — `bot_bottle/contrib/gitea/client.py`

Thin stdlib-only HTTP wrapper used by the orchestrator:

```python
class GiteaClient:
    def __init__(self, *, api_url: str) -> None: ...
    def post_comment(self, owner: str, repo: str, issue_number: int, body: str) -> None: ...
    def get_pr_for_issue(self, owner: str, repo: str, issue_number: int) -> int | None:
        """Return the PR number whose body references issue_number, or None."""
    def is_pr_open(self, owner: str, repo: str, pr_number: int) -> bool: ...
```

Authentication is not configured in the client — the egress layer injects the
Gitea token on the way out (same pattern as `GiteaDeployKeyProvisioner`).
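
As an illustration of the stdlib-only approach, the request behind `post_comment` might be built as below. This sketch assumes `api_url` already includes the `/api/v1` prefix and uses Gitea's issue-comment endpoint; `build_comment_request` is a hypothetical helper, split out so the payload can be asserted without touching the network. No `Authorization` header is set, matching the egress-injection design.

```python
import json
import urllib.request


def build_comment_request(
    api_url: str, owner: str, repo: str, issue_number: int, body: str
) -> urllib.request.Request:
    """Build (but do not send) the POST that creates an issue/PR comment."""
    # Assumption: api_url looks like "https://gitea.example/api/v1".
    url = f"{api_url.rstrip('/')}/repos/{owner}/{repo}/issues/{issue_number}/comments"
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

A `post_comment` method would then pass the result to `urllib.request.urlopen`, which is the call the unit tests mock.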

### Orchestrator — `bot_bottle/contrib/gitea/orchestrator.py`

```python
class ForgeOrchestrator:
    def __init__(
        self,
        *,
        manifest: ManifestIndex,
        gitea_client: GiteaClient,
        agent_user: str,
        backend_name: str | None = None,
    ) -> None: ...

    def on_issue_opened(self, event: dict) -> None: ...
    def on_issue_comment_created(self, event: dict) -> None: ...
    def on_pull_request_closed(self, event: dict) -> None: ...
```

`on_issue_opened`:
1. Extract `owner`, `repo`, `issue_number`, `assignees`, `labels`, `title`, `body`.
2. Verify assignee contains `agent_user`. Bail silently if not.
3. Parse `bot-bottle:<agent-name>` label. Post error comment + return if absent or unknown.
4. Parse optional `bot-bottle-bottle:<bottle-name>` label; else `bottle_names = ()`.
5. Build prompt: `f"Issue #{issue_number}: {title}\n\n{body}"`.
6. Call `start_headless(manifest, agent_name=..., bottle_names=..., label=..., prompt=...)`.
7. Write forge state (status=`"running"`).
8. On bottle exit: `get_freezer(backend).commit_slug(slug)`.
9. Update forge state `status="frozen"`, set `pr_number` by querying Gitea for a PR
   referencing the issue.
10. Post provenance comment on the PR (or the issue if no PR found).

`on_issue_comment_created`:
1. Look up forge state by `(owner, repo, issue_number)`. Skip if not found or destroyed.
2. Skip if comment author is `agent_user` (prevents echo loops).
3. Skip if forge state `status == "running"` (already active; queue is out of scope).
4. Update forge state `status="running"`.
5. Call `resume_headless(slug, prompt=comment_body)`.
6. Re-freeze: `get_freezer(backend).commit_slug(slug)`.
7. Update forge state `status="frozen"`.
8. Post provenance comment.
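
The three skip checks at the top of `on_issue_comment_created` can be isolated into a pure function, which keeps the echo-loop and `status == "running"` guards easy to unit test. A minimal sketch; `comment_action` and its return strings are hypothetical, and `status=None` stands in for "no forge state found".

```python
from __future__ import annotations


def comment_action(status: str | None, author: str, agent_user: str) -> str:
    """Return 'resume' or a 'skip:<reason>' string for an incoming comment."""
    if status is None or status == "destroyed":
        return "skip:no-active-bottle"  # step 1: unknown or torn-down issue
    if author == agent_user:
        return "skip:echo"  # step 2: the agent's own comment
    if status == "running":
        return "skip:busy"  # step 3: bottle already active, no queue
    return "resume"
```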

`on_pull_request_closed`:
1. Match `pr_number` against all forge states for `(owner, repo)`.
2. Destroy the bottle: call the backend's teardown for `slug` and delete the image.
3. Set `status="destroyed"`.

### Webhook listener — `bot_bottle/contrib/gitea/webhook_server.py`

Small `http.server.BaseHTTPRequestHandler` that:
- Accepts `POST /webhook`.
- Reads the `X-Gitea-Event` header to select the handler.
- Deserializes the JSON body and calls the orchestrator's matching `on_*` method.
- Returns HTTP 200 for known events, 204 for unknown (no-op).

Runs in the same thread as the CLI (blocking `serve_forever`). The orchestrator
handlers are synchronous; long-running launches block the listener thread for
the duration. (Concurrent multi-issue handling is out of scope for the MVP.)
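
A rough sketch of the listener's dispatch path follows. The event names and `action` values in `ROUTES` are assumptions about Gitea's webhook payloads and should be checked against the Gitea version in use; `dispatch` and `make_handler` are hypothetical names. Malformed JSON is answered with 400, which the description above does not specify.

```python
from __future__ import annotations

import json
from http.server import BaseHTTPRequestHandler

# (X-Gitea-Event value, payload "action") -> orchestrator method name.
# Assumed pairs; verify against real Gitea deliveries.
ROUTES = {
    ("issues", "opened"): "on_issue_opened",
    ("issue_comment", "created"): "on_issue_comment_created",
    ("pull_request", "closed"): "on_pull_request_closed",
}


def dispatch(orchestrator: object, event_name: str, payload: dict) -> int:
    """Call the matching handler and return the HTTP status to send."""
    method = ROUTES.get((event_name, payload.get("action", "")))
    if method is None:
        return 204  # unknown event: no-op
    getattr(orchestrator, method)(payload)
    return 200


def make_handler(orchestrator: object) -> type[BaseHTTPRequestHandler]:
    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            if self.path != "/webhook":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length) or b"{}")
            except json.JSONDecodeError:
                payload = None
            if not isinstance(payload, dict):
                self.send_error(400)
                return
            event_name = self.headers.get("X-Gitea-Event", "")
            self.send_response(dispatch(orchestrator, event_name, payload))
            self.end_headers()

    return WebhookHandler
```

The handler class would be passed to `http.server.HTTPServer((host, port), make_handler(orchestrator))` followed by `serve_forever()`. Because `dispatch` runs inline, the response is only sent after the handler returns, which is the blocking behaviour described above.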

### CLI — `bot_bottle/cli/forge.py`

```
./cli.py forge listen [--host HOST] [--port PORT] [--agent-user USER]
./cli.py forge status
```

`listen` defaults: `--host 0.0.0.0 --port 8765 --agent-user $FORGE_AGENT_USER`.

`status` prints a table of active forge bottles (slug, issue URL, PR URL, status).
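
The argument surface could be expressed with `argparse` along these lines. This is only a sketch: `build_forge_parser` is a hypothetical name, the way `cli.py` actually registers subcommands may differ, and falling back to `didericis-claude` when `FORGE_AGENT_USER` is unset is inferred from the default stated under the label convention.

```python
import argparse
import os


def build_forge_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cli.py forge")
    sub = parser.add_subparsers(dest="command", required=True)

    listen = sub.add_parser("listen", help="start the webhook listener")
    listen.add_argument("--host", default="0.0.0.0")
    listen.add_argument("--port", type=int, default=8765)
    listen.add_argument(
        "--agent-user",
        # Assumed fallback when the env var is unset.
        default=os.environ.get("FORGE_AGENT_USER", "didericis-claude"),
    )

    sub.add_parser("status", help="list forge-managed bottles")
    return parser
```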

`forge` is registered in `cli.py` alongside `start`, `resume`, `commit`, etc.

## Provenance as the product

Every comment the orchestrator posts ends with the provenance footer. The footer
is not optional and not configurable off. This is load-bearing: it is the audit
trail that lets human reviewers verify what the agent did, what credentials it
had access to, what it called out to, and whether gitleaks caught anything.
PRs that land without a provenance footer were not opened by the forge
integration.

The footer also links back to the bot-bottle repo (anchored to the commit SHA
used for the run, not `main`) so the policy that governed the run is pinned in
the PR history.

## Implementation chunks

1. **Headless primitives** — `attach_agent_headless` + `start_headless` in
   `cli/start.py`; `resume_headless` in `cli/resume.py`. Tests: assert no tty,
   correct arg construction with and without `resume=True`.

2. **Forge state** — `contrib/gitea/forge_state.py`: `ForgeState` dataclass,
   `write_forge_state`, `read_forge_state`, `delete_forge_state`,
   `all_forge_states`. Tests: round-trip JSON, missing file returns None,
   concurrent-write safety via atomic rename.

3. **Gitea client** — `contrib/gitea/client.py`: `post_comment`,
   `get_pr_for_issue`. Tests: mock `urllib.request.urlopen` and assert payloads.

4. **Provenance** — `contrib/gitea/provenance.py`: `build_provenance_footer`.
   Tests: verify footer contains all required fields; verify graceful omission
   when egress log is absent.

5. **Orchestrator** — `contrib/gitea/orchestrator.py`: `ForgeOrchestrator`
   with the three `on_*` handlers. Tests: mock `start_headless`,
   `resume_headless`, `get_freezer`, `GiteaClient`, `forge_state.*`; assert
   correct calls for each event path (happy path, unknown label, echo-loop
   prevention, status=running guard).

6. **Webhook listener** — `contrib/gitea/webhook_server.py`. Tests: mock
   orchestrator methods; assert correct dispatch per `X-Gitea-Event` value and
   correct HTTP status codes.

7. **CLI wiring** — `cli/forge.py` + registration in `cli.py`. Tests:
   `cmd_forge_status` tabular output, `cmd_forge_listen` argument parsing.

## Open questions

None.