Compare commits

..

7 Commits

Author SHA1 Message Date
didericis 4cb106b48d docs(prd): reconcile headless primitives with shipped start --headless
#315 already merged `start --headless` (assume_yes on _launch_bottle +
AgentProvider.headless_prompt). The PRD's proposed start_headless /
attach_agent_headless helpers were redundant with it, and the latter
diverged by hand-rolling --no-interactive/-p instead of using the
headless_prompt provider abstraction. Drop them.

Scope the remaining headless work to what's actually new: a forge_env
hook threaded into the existing _launch_bottle core, and a `resume
--headless` path (resume has no non-interactive entry point today).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01WL77TgFxKbs3cidGMG9dz7
2026-06-30 17:46:59 -04:00
didericis ebad90bfa9 docs(prd): adopt forge sidecar (option 3) for native integration
Flip the forge-native-integration PRD from option 2 (agent calls the
Gitea API directly via cred-proxy; done signal parsed from comments) to
option 3 per issue #317 comment 2715: a forge sidecar backed by a Forge
abstract class.

- signal_done(status, summary) replaces comment-parsing as the done signal
- semantic audit trail from the sidecar feeds provenance directly
- read-anywhere / write-scoped enforcement, tighter than repo-wide API keys
- forge-agnostic agent prompts and sidecar protocol
- DeployKeyProvisioner subsumption deferred; share the HTTP client only

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01WL77TgFxKbs3cidGMG9dz7
2026-06-30 17:39:33 -04:00
didericis-claude 1789deaf73 docs: update forge PRD — orchestrator split, done signal, org targeting, forge env vars 2026-06-29 12:39:02 -04:00
didericis-claude b607d68a0e docs: add PRD for forge native integration 2026-06-29 12:10:41 -04:00
didericis 94eca35b4f fix(skills): validate skill names and quote provisioning paths
test / unit (push) Successful in 55s
test / integration (push) Successful in 23s
test / coverage (push) Successful in 1m11s
Update Quality Badges / update-badges (push) Successful in 1m3s
lint / lint (push) Successful in 2m18s
Skill names become host/guest path segments interpolated into the
`bottle.exec` shell strings in each contrib provider's provision_skills.
They were validated only as strings, so a name with shell metacharacters
or path traversal could reach the command.

Layer two defenses:
  - Primary: reject any skill name that isn't kebab-case
    ([a-z][a-z0-9-]*) at manifest load, reusing the convention already
    enforced on bottle/agent filenames (new is_valid_entity_name helper
    in manifest_schema). Fails loud and early, protecting every consumer
    of the name — not just the exec call sites.
  - Failsafe: shlex.quote the interpolated skills_dir / dst paths in the
    claude, codex, and pi providers, so a future unvalidated field can't
    inject shell metacharacters even if it bypasses the load-time check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-27 02:15:30 -04:00
didericis f787764364 refactor(manifest): break import cycle by extracting ManifestBottle to a leaf module
test / unit (pull_request) Successful in 57s
test / integration (pull_request) Successful in 27s
test / coverage (pull_request) Successful in 1m23s
lint / lint (push) Successful in 2m24s
test / unit (push) Successful in 59s
test / integration (push) Successful in 26s
test / coverage (push) Successful in 1m17s
Update Quality Badges / update-badges (push) Successful in 1m13s
manifest.py imported the extends/loader resolvers, while those resolvers
needed ManifestBottle back from manifest.py — a true bidirectional cycle
papered over with in-function imports and TYPE_CHECKING guards (not clear
dependency inversion).

Extract ManifestBottle into a new leaf module manifest_bottle.py that depends
only on the other leaf modules (manifest_util/agent/egress/git/schema).
manifest.py re-exports ManifestBottle, so `from .manifest import ManifestBottle`
callers are unaffected. With the cycle gone:

- manifest_extends and manifest_loader import ManifestBottle from
  manifest_bottle and their other deps from the real source modules, all at
  top level (TYPE_CHECKING block removed).
- manifest.py imports the extends/loader/schema/yaml_subset/log helpers at
  module top; all per-function lazy imports in the cluster are removed.

No behavior change; full unit suite green, pyright clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-26 23:42:03 -04:00
didericis-claude a256e5762a Merge pull request 'DLP injection-check perf, bounded variant cache, dedup supervise schema' (#312) from dlp-supervise-quality-fixes into main
lint / lint (push) Successful in 2m22s
test / unit (push) Successful in 50s
test / integration (push) Successful in 18s
test / coverage (push) Successful in 1m2s
Update Quality Badges / update-badges (push) Successful in 1m9s
2026-06-26 23:30:16 -04:00
11 changed files with 663 additions and 167 deletions
+7 -3
View File
@@ -217,7 +217,7 @@ class ClaudeAgentProvider(AgentProvider):
if not agent.skills:
return
skills_dir = _skills_dir(plan.guest_home)
bottle.exec(f"mkdir -p {skills_dir}", user="root")
bottle.exec(f"mkdir -p {shlex.quote(skills_dir)}", user="root")
for name in agent.skills:
src = host_skill_dir(name)
if not os.path.isdir(src):
@@ -227,9 +227,13 @@ class ClaudeAgentProvider(AgentProvider):
)
dst = f"{skills_dir}/{name}"
info(f"copying skill {name} into {bottle.name}:{dst}")
bottle.exec(f"rm -rf {dst} && mkdir -p {dst}", user="root")
# Defense in depth: skill names are validated kebab-case at
# manifest load, but quote the path so a future unvalidated
# field can't inject shell metacharacters here either.
dst_q = shlex.quote(dst)
bottle.exec(f"rm -rf {dst_q} && mkdir -p {dst_q}", user="root")
bottle.cp_in(f"{src}/.", f"{dst}/")
bottle.exec(f"chown -R node:node {dst}", user="root")
bottle.exec(f"chown -R node:node {dst_q}", user="root")
def provision_prompt(self, plan: "BottlePlan", bottle: "Bottle") -> str | None:
"""Copy the prompt file into the guest, fix ownership/mode.
+7 -3
View File
@@ -183,7 +183,7 @@ class CodexAgentProvider(AgentProvider):
if not agent.skills:
return
skills_dir = _skills_dir(plan.guest_home)
bottle.exec(f"mkdir -p {skills_dir}", user="root")
bottle.exec(f"mkdir -p {shlex.quote(skills_dir)}", user="root")
for name in agent.skills:
src = host_skill_dir(name)
if not os.path.isdir(src):
@@ -193,9 +193,13 @@ class CodexAgentProvider(AgentProvider):
)
dst = f"{skills_dir}/{name}"
info(f"copying skill {name} into {bottle.name}:{dst}")
bottle.exec(f"rm -rf {dst} && mkdir -p {dst}", user="root")
# Defense in depth: skill names are validated kebab-case at
# manifest load, but quote the path so a future unvalidated
# field can't inject shell metacharacters here either.
dst_q = shlex.quote(dst)
bottle.exec(f"rm -rf {dst_q} && mkdir -p {dst_q}", user="root")
bottle.cp_in(f"{src}/.", f"{dst}/")
bottle.exec(f"chown -R node:node {dst}", user="root")
bottle.exec(f"chown -R node:node {dst_q}", user="root")
def provision_prompt(self, plan: "BottlePlan", bottle: "Bottle") -> str | None:
"""Copy the prompt file into the guest, fix ownership/mode.
+7 -3
View File
@@ -238,7 +238,7 @@ class PiAgentProvider(AgentProvider):
if not agent.skills:
return
skills_dir = _skills_dir(plan.guest_home)
bottle.exec(f"mkdir -p {skills_dir}", user="root")
bottle.exec(f"mkdir -p {shlex.quote(skills_dir)}", user="root")
for name in agent.skills:
src = host_skill_dir(name)
if not os.path.isdir(src):
@@ -248,9 +248,13 @@ class PiAgentProvider(AgentProvider):
)
dst = f"{skills_dir}/{name}"
info(f"copying skill {name} into {bottle.name}:{dst}")
bottle.exec(f"rm -rf {dst} && mkdir -p {dst}", user="root")
# Defense in depth: skill names are validated kebab-case at
# manifest load, but quote the path so a future unvalidated
# field can't inject shell metacharacters here either.
dst_q = shlex.quote(dst)
bottle.exec(f"rm -rf {dst_q} && mkdir -p {dst_q}", user="root")
bottle.cp_in(f"{src}/.", f"{dst}/")
bottle.exec(f"chown -R node:node {dst}", user="root")
bottle.exec(f"chown -R node:node {dst_q}", user="root")
def provision_prompt(self, plan: "BottlePlan", bottle: "Bottle") -> str | None:
prompt_path = _prompt_path(plan.guest_home)
+12 -122
View File
@@ -62,15 +62,25 @@ from dataclasses import dataclass, field, replace
from pathlib import Path
from typing import Mapping
from .log import warn
from .manifest_util import ManifestError, as_json_object
from .manifest_agent import ManifestAgent, ManifestAgentProvider
from .manifest_bottle import ManifestBottle
from .manifest_egress import (
EGRESS_AUTH_SCHEMES,
ManifestEgressConfig,
ManifestEgressRoute,
)
from .manifest_git import ManifestGitEntry, ManifestGitUser, ManifestKeyConfig, parse_git_gate_config
from .manifest_schema import BOTTLE_KEYS
from .manifest_extends import merge_bottles_runtime, resolve_bottles
from .manifest_git import ManifestGitEntry, ManifestGitUser, ManifestKeyConfig
from .manifest_loader import (
check_stale_json,
load_bottle_chain_from_dir,
scan_agent_names,
scan_bottle_names,
)
from .manifest_schema import validate_agent_frontmatter_keys
from .yaml_subset import YamlSubsetError, parse_frontmatter
# Re-export everything that callers currently import from this module.
__all__ = [
@@ -89,10 +99,6 @@ __all__ = [
]
def _empty_str_dict() -> dict[str, str]:
return {}
def _section_dict(value: object, label: str) -> dict[str, object]:
"""Like as_json_object but treats absent/null as an empty section."""
if value is None:
@@ -100,107 +106,6 @@ def _section_dict(value: object, label: str) -> dict[str, object]:
return as_json_object(value, label)
@dataclass(frozen=True)
class ManifestBottle:
env: Mapping[str, str] = field(default_factory=_empty_str_dict)
agent_provider: ManifestAgentProvider = field(default_factory=ManifestAgentProvider)
git: tuple[ManifestGitEntry, ...] = ()
# Per-bottle git identity (issue #86). Empty default — bottles
# that don't set `git-gate.user:` in the manifest skip the
# `git config --global` step entirely. A bottle can declare a user
# identity without any git-gate.repos upstreams, and vice versa.
git_user: ManifestGitUser = field(default_factory=ManifestGitUser)
egress: ManifestEgressConfig = field(default_factory=ManifestEgressConfig)
# Per-bottle stuck-recovery sidecar (PRD 0013). When true (the
# default, issue #249), the launch step brings up a supervise
# sidecar that exposes egress MCP tools to the agent. Set
# `supervise: false` to skip the sidecar.
supervise: bool = True
@classmethod
def from_dict(cls, name: str, raw: object) -> "ManifestBottle":
d = as_json_object(raw, f"bottle '{name}'")
if "runtime" in d:
raise ManifestError(
f"bottle '{name}' has a 'runtime' field, which is no longer "
f"supported. gVisor (runsc) is now auto-detected by the "
f"backend; remove the 'runtime' field from the bottle "
f"definition."
)
if "ssh" in d:
raise ManifestError(
f"bottle '{name}' has an 'ssh' field, which has been removed "
f"(PRD 0009). Declare upstreams under 'git-gate.repos' with "
f"url + identity + host_key; the git-gate sidecar (PRD 0008) "
f"holds the credential and gitleaks-scans pushes."
)
if "git" in d:
raise ManifestError(
f"bottle '{name}' uses 'git' which has been replaced by "
f"'git-gate' (PRD 0047). Move git.user → git-gate.user "
f"and git.remotes → git-gate.repos (fields: url, identity, host_key)."
)
if "git_user" in d:
raise ManifestError(
f"bottle '{name}' has a 'git_user' field, which has been "
f"removed. Move it under 'git-gate.user'."
)
unknown = set(d.keys()) - BOTTLE_KEYS
if unknown:
allowed = ", ".join(sorted(BOTTLE_KEYS))
raise ManifestError(
f"bottle '{name}' has unknown key(s) {sorted(unknown)}; "
f"allowed keys are {allowed}."
)
env: dict[str, str] = {}
env_raw = d.get("env")
if env_raw is not None:
env_dict = as_json_object(env_raw, f"bottle '{name}' env")
for var, value in env_dict.items():
if not isinstance(value, str):
raise ManifestError(
f"env entry {var} in bottle '{name}' must be a JSON string "
f"(was {type(value).__name__}). Use \"?<message>\" for prompt-at-runtime."
)
env[var] = value
git: tuple[ManifestGitEntry, ...] = ()
git_user = ManifestGitUser()
git_raw = d.get("git-gate")
if git_raw is not None:
git, git_user = parse_git_gate_config(name, git_raw)
agent_provider = (
ManifestAgentProvider.from_dict(name, d["agent_provider"])
if "agent_provider" in d
else ManifestAgentProvider()
)
egress = (
ManifestEgressConfig.from_dict(name, d["egress"])
if "egress" in d
else ManifestEgressConfig()
)
supervise_raw = d.get("supervise", True)
if not isinstance(supervise_raw, bool):
raise ManifestError(
f"bottle '{name}' supervise must be a boolean "
f"(was {type(supervise_raw).__name__})"
)
return cls(
env=env, agent_provider=agent_provider, git=git,
git_user=git_user, egress=egress, supervise=supervise_raw,
)
def _merge_git_user(
agent_user: ManifestGitUser, base_user: ManifestGitUser
) -> ManifestGitUser:
@@ -237,8 +142,6 @@ def _resolve_effective_bottle_eager(
When bottle_names is non-empty they are merged in order. When empty, falls
back to agent.bottle. Raises ManifestError when neither is set."""
from .manifest_extends import merge_bottles_runtime
if bottle_names:
resolved: list[ManifestBottle] = []
for bn in bottle_names:
@@ -270,9 +173,6 @@ def _resolve_effective_bottle_lazy(
When bottle_names is non-empty they are resolved from disk and merged in
order. When empty, falls back to agent_bottle. Raises ManifestError when
neither is set."""
from .manifest_extends import merge_bottles_runtime
from .manifest_loader import load_bottle_chain_from_dir
if bottle_names:
resolved = [load_bottle_chain_from_dir(bn, bottles_dir) for bn in bottle_names]
return merge_bottles_runtime(resolved)
@@ -358,8 +258,6 @@ class ManifestIndex:
home_md = home_dir / ".bot-bottle"
cwd_md = cwd_dir / ".bot-bottle"
from .manifest_loader import check_stale_json
check_stale_json(home_dir, home_md, "$HOME")
if cwd_dir.resolve() != home_dir.resolve():
check_stale_json(cwd_dir, cwd_md, "$CWD")
@@ -399,7 +297,6 @@ class ManifestIndex:
files = sorted(stale_bottles.glob("*.md"))
if files:
names = ", ".join(p.name for p in files)
from .log import warn
warn(
f"ignoring bottle file(s) under "
f"{stale_bottles}: {names}. Bottles can only "
@@ -421,7 +318,6 @@ class ManifestIndex:
raw_bottles: dict[str, dict[str, object]] = {}
for n, b in raw_bottles_obj.items():
raw_bottles[n] = as_json_object(b, f"bottle '{n}'")
from .manifest_extends import resolve_bottles
bottles = resolve_bottles(raw_bottles)
@@ -439,7 +335,6 @@ class ManifestIndex:
filenames without reading their content. In eager mode (from
from_json_obj) it returns the pre-parsed bottles' names."""
if self.home_md is not None:
from .manifest_loader import scan_bottle_names
return scan_bottle_names(self.home_md / "bottles")
return sorted(self.bottles.keys())
@@ -451,7 +346,6 @@ class ManifestIndex:
filenames without reading their content. In eager mode (from
from_json_obj) it returns the pre-parsed agents' names."""
if self.home_md is not None:
from .manifest_loader import scan_agent_names
home_names = set(scan_agent_names(self.home_md / "agents").keys())
cwd_names: set[str] = set()
if self.cwd_md is not None:
@@ -509,10 +403,6 @@ class ManifestIndex:
"""Lazy path (resolve/from_md_dirs): read and parse the agent file and
its bottle chain from disk for the first time here."""
assert self.home_md is not None # guaranteed by load_for_agent dispatch
from .manifest_loader import scan_agent_names
from .manifest_schema import validate_agent_frontmatter_keys
from .yaml_subset import YamlSubsetError, parse_frontmatter
# Locate the agent file; cwd wins over home on name collision.
home_agents = scan_agent_names(self.home_md / "agents")
cwd_agents: dict[str, Path] = {}
+11 -1
View File
@@ -8,7 +8,7 @@ from typing import cast
from .agent_provider import PROVIDER_TEMPLATES
from .manifest_util import ManifestError, as_json_object
from .manifest_git import ManifestGitUser
from .manifest_schema import AGENT_MODEL_KEYS
from .manifest_schema import AGENT_MODEL_KEYS, is_valid_entity_name
@dataclass(frozen=True)
@@ -161,6 +161,16 @@ class ManifestAgent:
f"agent '{name}' skills[{i}] must be a string "
f"(was {type(skill).__name__})"
)
# Skill names become host/guest path segments and are
# interpolated into provisioning shell commands, so they
# must fit the same kebab-case convention as bottle/agent
# filenames — rejecting anything that could break out of a
# path segment or inject shell metacharacters.
if not is_valid_entity_name(skill):
raise ManifestError(
f"agent '{name}' skills[{i}] {skill!r} is not a valid "
f"skill name; must match [a-z][a-z0-9-]*"
)
collected.append(skill)
skills = tuple(collected)
+129
View File
@@ -0,0 +1,129 @@
"""The `ManifestBottle` value type.
Split out of `manifest.py` so the `extends:`/loader resolvers can import it
without a circular dependency: `manifest.py` imports those resolvers, while
they only need this value type. Everything here depends on leaf modules
(`manifest_util`, `manifest_agent`, `manifest_egress`, `manifest_git`,
`manifest_schema`), so this module sits at the bottom of the manifest layer.
`manifest.py` re-exports `ManifestBottle`, so existing
`from .manifest import ManifestBottle` callers are unaffected.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Mapping
from .manifest_util import ManifestError, as_json_object
from .manifest_agent import ManifestAgentProvider
from .manifest_egress import ManifestEgressConfig
from .manifest_git import ManifestGitEntry, ManifestGitUser, parse_git_gate_config
from .manifest_schema import BOTTLE_KEYS
__all__ = ["ManifestBottle"]
def _empty_str_dict() -> dict[str, str]:
return {}
@dataclass(frozen=True)
class ManifestBottle:
env: Mapping[str, str] = field(default_factory=_empty_str_dict)
agent_provider: ManifestAgentProvider = field(default_factory=ManifestAgentProvider)
git: tuple[ManifestGitEntry, ...] = ()
# Per-bottle git identity (issue #86). Empty default — bottles
# that don't set `git-gate.user:` in the manifest skip the
# `git config --global` step entirely. A bottle can declare a user
# identity without any git-gate.repos upstreams, and vice versa.
git_user: ManifestGitUser = field(default_factory=ManifestGitUser)
egress: ManifestEgressConfig = field(default_factory=ManifestEgressConfig)
# Per-bottle stuck-recovery sidecar (PRD 0013). When true (the
# default, issue #249), the launch step brings up a supervise
# sidecar that exposes egress MCP tools to the agent. Set
# `supervise: false` to skip the sidecar.
supervise: bool = True
@classmethod
def from_dict(cls, name: str, raw: object) -> "ManifestBottle":
d = as_json_object(raw, f"bottle '{name}'")
if "runtime" in d:
raise ManifestError(
f"bottle '{name}' has a 'runtime' field, which is no longer "
f"supported. gVisor (runsc) is now auto-detected by the "
f"backend; remove the 'runtime' field from the bottle "
f"definition."
)
if "ssh" in d:
raise ManifestError(
f"bottle '{name}' has an 'ssh' field, which has been removed "
f"(PRD 0009). Declare upstreams under 'git-gate.repos' with "
f"url + identity + host_key; the git-gate sidecar (PRD 0008) "
f"holds the credential and gitleaks-scans pushes."
)
if "git" in d:
raise ManifestError(
f"bottle '{name}' uses 'git' which has been replaced by "
f"'git-gate' (PRD 0047). Move git.user → git-gate.user "
f"and git.remotes → git-gate.repos (fields: url, identity, host_key)."
)
if "git_user" in d:
raise ManifestError(
f"bottle '{name}' has a 'git_user' field, which has been "
f"removed. Move it under 'git-gate.user'."
)
unknown = set(d.keys()) - BOTTLE_KEYS
if unknown:
allowed = ", ".join(sorted(BOTTLE_KEYS))
raise ManifestError(
f"bottle '{name}' has unknown key(s) {sorted(unknown)}; "
f"allowed keys are {allowed}."
)
env: dict[str, str] = {}
env_raw = d.get("env")
if env_raw is not None:
env_dict = as_json_object(env_raw, f"bottle '{name}' env")
for var, value in env_dict.items():
if not isinstance(value, str):
raise ManifestError(
f"env entry {var} in bottle '{name}' must be a JSON string "
f"(was {type(value).__name__}). Use \"?<message>\" for prompt-at-runtime."
)
env[var] = value
git: tuple[ManifestGitEntry, ...] = ()
git_user = ManifestGitUser()
git_raw = d.get("git-gate")
if git_raw is not None:
git, git_user = parse_git_gate_config(name, git_raw)
agent_provider = (
ManifestAgentProvider.from_dict(name, d["agent_provider"])
if "agent_provider" in d
else ManifestAgentProvider()
)
egress = (
ManifestEgressConfig.from_dict(name, d["egress"])
if "egress" in d
else ManifestEgressConfig()
)
supervise_raw = d.get("supervise", True)
if not isinstance(supervise_raw, bool):
raise ManifestError(
f"bottle '{name}' supervise must be a boolean "
f"(was {type(supervise_raw).__name__})"
)
return cls(
env=env, agent_provider=agent_provider, git=git,
git_user=git_user, egress=egress, supervise=supervise_raw,
)
+4 -28
View File
@@ -2,11 +2,10 @@
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .manifest import ManifestBottle
from .manifest_egress import ManifestEgressConfig
from .manifest_bottle import ManifestBottle
from .manifest_egress import ManifestEgressConfig, validate_egress_routes
from .manifest_git import ManifestGitUser, parse_git_gate_config
from .manifest_util import ManifestError, as_json_object
def merge_bottles_runtime(bottles: "list[ManifestBottle]") -> "ManifestBottle":
@@ -27,9 +26,6 @@ def merge_bottles_runtime(bottles: "list[ManifestBottle]") -> "ManifestBottle":
def _merge_two_bottles_runtime(base: "ManifestBottle", override: "ManifestBottle") -> "ManifestBottle":
from .manifest import ManifestBottle, ManifestGitUser
from .manifest_egress import ManifestEgressConfig
merged_env = {**base.env, **override.env}
merged_git_user = ManifestGitUser(
@@ -81,8 +77,6 @@ def _resolve_one_bottle(
repos_cache: dict[str, dict[str, object]],
seen: tuple[str, ...],
) -> ManifestBottle:
from .manifest import ManifestBottle, ManifestError
if name in cache:
return cache[name]
if name in seen:
@@ -174,11 +168,6 @@ def _fold_two_bottles(
later_repos_raw: dict[str, object],
) -> tuple[ManifestBottle, dict[str, object]]:
"""Combine two resolved parent bottles; later wins over earlier."""
from .manifest import ManifestBottle, ManifestGitUser
from .manifest_egress import ManifestEgressConfig
from .manifest_git import parse_git_gate_config
from .manifest_util import as_json_object
merged_env = {**earlier.env, **later.env}
merged_git_user = ManifestGitUser(
@@ -227,10 +216,6 @@ def _merge_bottles(
name: str,
) -> ManifestBottle:
"""Apply PRD 0025 merge rules."""
from .manifest import ManifestBottle, ManifestGitUser
from .manifest_egress import validate_egress_routes
from .manifest_util import as_json_object
# git-gate.repos: when the child declares repos, inject the already
# name-merged repo set (computed by _resolve_repos_raw) so the child
# parses with the full inherited+overridden list (issue #237).
@@ -303,8 +288,6 @@ def _resolve_repos_raw(
inherits the parent's set verbatim; an explicit empty dict clears it.
Otherwise parent and child unite by name, with same-name entries
field-merged (parent fields are defaults, child fields win)."""
from .manifest_util import as_json_object
if not _child_declares_git_gate_repos(child_raw):
return parent_repos
child_repos = _declared_repos_raw(child_raw)
@@ -324,8 +307,6 @@ def _resolve_repos_raw(
def _declared_repos_raw(child_raw: dict[str, object]) -> dict[str, object]:
"""Return the child's explicitly declared git-gate.repos as raw dicts,
or an empty dict when none are declared."""
from .manifest_util import as_json_object
if not _child_declares_git_gate_repos(child_raw):
return {}
git_raw = as_json_object(child_raw.get("git-gate", {}), "child git-gate")
@@ -333,8 +314,6 @@ def _declared_repos_raw(child_raw: dict[str, object]) -> dict[str, object]:
def _child_declares_git_gate_repos(child_raw: dict[str, object]) -> bool:
from .manifest_util import as_json_object
git_raw = child_raw.get("git-gate")
if git_raw is None:
return False
@@ -347,9 +326,6 @@ def _merge_egress(
child: ManifestEgressConfig,
child_raw: dict[str, object],
) -> ManifestEgressConfig:
from .manifest_egress import ManifestEgressConfig
from .manifest_util import as_json_object
child_egress_raw = as_json_object(child_raw.get("egress"), "child egress")
routes = parent.routes + child.routes
log = child.Log if "log" in child_egress_raw else parent.Log
+2 -6
View File
@@ -3,9 +3,10 @@
from __future__ import annotations
from pathlib import Path
from typing import TYPE_CHECKING
from .log import warn
from .manifest_bottle import ManifestBottle
from .manifest_extends import resolve_bottles
from .manifest_schema import (
entity_name_from_path,
validate_bottle_frontmatter_keys,
@@ -13,9 +14,6 @@ from .manifest_schema import (
from .manifest_util import ManifestError
from .yaml_subset import YamlSubsetError, parse_frontmatter
if TYPE_CHECKING:
from .manifest import ManifestBottle
def check_stale_json(dir_path: Path, md_dir: Path, label: str) -> None:
"""Die if `<dir_path>/bot-bottle.json` exists but `md_dir` does
@@ -78,8 +76,6 @@ def load_bottle_chain_from_dir(
Only the files in the extends chain are read — unrelated bottle files
are never touched. Raises ManifestError on parse or validation failure."""
from .manifest_extends import resolve_bottles
raws: dict[str, dict[str, object]] = {}
to_load = [bottle_name]
while to_load:
+8 -1
View File
@@ -33,13 +33,20 @@ AGENT_KEYS = (
AGENT_MODEL_KEYS = AGENT_KEYS | frozenset({"prompt"})
def is_valid_entity_name(name: str) -> bool:
"""True if `name` fits the kebab-case `[a-z][a-z0-9-]*` convention
shared by bottle/agent filenames and skill names. Names that satisfy
this are also safe to interpolate into a host/guest path segment."""
return bool(_FILENAME_RX.match(name))
def entity_name_from_path(path: Path) -> str | None:
"""Return the entity name implied by the filename, or None if the
filename does not fit the [a-z][a-z0-9-]* convention."""
if path.suffix != ".md":
return None
stem = path.stem
if not _FILENAME_RX.match(stem):
if not is_valid_entity_name(stem):
return None
return stem
@@ -0,0 +1,460 @@
# PRD prd-new: Forge native integration
- **Status:** Draft
- **Author:** claude
- **Created:** 2026-06-29
- **Issue:** #317
## Summary
Add a webhook-driven orchestration layer that lets Gitea issues and PR comments
drive bot-bottle sessions end-to-end with no operator in the loop for the happy
path. An issue assigned to a member of the configured agent org and labelled
with an agent name triggers a headless bottle launch; the bottle processes the
issue, opens a PR, and interacts with the forge through a **forge sidecar**
the agent never touches the Gitea API or its credentials directly. The agent
calls `signal_done(status, summary)` on the sidecar when a work unit is
complete; the sidecar relays that to the orchestrator over a queue dir (the same
pattern as the supervise sidecar), so completion is an unambiguous in-band
signal rather than a comment the orchestrator has to parse. The orchestrator
freezes the bottle and attaches a provenance footer. Subsequent PR comments
rehydrate the frozen bottle. The bottle is destroyed when the PR closes.
The forge sidecar is backed by a `Forge` abstract class with per-provider
implementations (Gitea first), so the agent's prompts and the sidecar protocol
stay forge-agnostic. The sidecar logs forge operations semantically ("read PR
description", "posted comment", "signalled done"), giving richer provenance than
post-hoc egress-byte parsing, and enforces a **read-anywhere / write-scoped**
permission model: the agent may read for context but may only write to the
issue and PRs it was assigned.
The separation of concerns across the two layers: bot-bottle owns the headless
launch primitives, the forge sidecar + `Forge` abstraction, forge state, and the
provenance builder. `bot-bottle-orchestrator` (separate binary) owns the webhook
listener, bottle lifecycle loop, and monitoring dashboard; it calls into
bot-bottle via `./cli.py orchestrate`, a thin wrapper command. This PRD covers
bot-bottle's side of that contract.
## Problem
Today an operator must open the TUI, select an agent and bottle, confirm the
preflight, and type prompts interactively. This blocks "issue → PR" automation
and produces no durable audit record of what the agent did. The security model
already provides the right isolation and egress controls, and `start --headless`
(#315) already gives `bot-bottle-orchestrator` a non-interactive launch path.
The missing pieces are a headless `resume` counterpart for rehydrating frozen
bottles, a forge-interaction surface the agent uses to read context, post
comments, and signal completion, and the provenance trail that makes the audit
story legible to reviewers on every PR.
That forge-interaction surface could be built two ways: (2) give the agent the
Gitea API directly with cred-proxy injecting the token, or (3) put a forge
sidecar between the agent and the forge. This PRD takes **option 3**. The
deciding factors: a sidecar `signal_done` call is an unambiguous completion
signal where comment-parsing is a correctness risk that surfaces in production;
the sidecar produces a semantic audit trail rather than HTTP bytes, which is
load-bearing for provenance (the stated product priority); and the sidecar can
enforce scope tighter than repo-wide API-key permissions, reducing blast radius
for a prompt-injected agent. The costs — a second sidecar process per forge run,
a new failure mode if it crashes, and per-forge implementation cost — are
accepted as the price of those properties.
## Goals / Success Criteria
1. Headless launch already exists: `./cli.py start <agent> --headless --prompt`
(#315) runs non-interactively with no TUI selectors or y/N preflight. This
PRD builds on it rather than re-introducing it. The remaining gap is a
matching headless `resume` path (`./cli.py resume --headless`), since
rehydrating a frozen bottle for a new prompt is required by the freeze /
rehydrate loop and `resume` has no non-interactive entry point today.
2. An issue assigned to a member of the configured org (`FORGE_ORG`, default
`bot-bottle`) and labelled `bot-bottle:<agent-name>` is the trigger
convention. Org membership is verified via the Gitea API at event time.
3. Forge-targeted bottles run a **forge sidecar** that exposes a small,
forge-agnostic API (comment/issue/PR CRUD plus `signal_done`) over the same
queue-dir + HTTP/JSON-RPC machinery as the supervise sidecar. The agent calls
the sidecar; it never sees the forge token or forge-specific endpoints.
4. The sidecar is backed by a `Forge` abstract class. Gitea is the first
concrete implementation; adding a forge means a new subclass, not changes to
the agent prompt or sidecar protocol. The sidecar enforces a read-anywhere /
write-scoped model: writes are limited to the assigned issue and its PRs;
reads are unrestricted for context.
5. The agent calls `signal_done(status, summary)` on the sidecar when a work
unit is complete; the sidecar relays it to the orchestrator over a queue dir.
This is the done signal — no comment parsing. A watchdog timeout
(configurable, default 30 min) causes the orchestrator to treat the run as
done-without-self-report if the agent exits without signalling.
6. Every orchestrator-posted comment ends with a provenance footer: agent name,
bottle name(s), slug, start time, duration, exit code, gitleaks result, and
egress summary.
7. Forge state (issue → slug, status) is persisted to disk and survives
orchestrator restarts.
8. `./cli.py orchestrate status` lists active forge-managed bottles and their
issue/PR URLs.
9. Unit tests cover: label parsing, org-membership check path, forge state
read/write, provenance footer rendering, headless launch arg construction,
forge env var injection, sidecar request dispatch through the `Forge`
abstraction, write-scope enforcement (reject writes outside the assigned
issue/PRs), and `signal_done` queue relay.
## Non-goals
- Webhook signature verification (HMAC-SHA256). Added as a follow-up.
- The `bot-bottle-orchestrator` binary itself — this PRD covers bot-bottle's
side of the interface only. The orchestrator is a separate project.
- GitHub or GitLab support.
- Multiple simultaneous forge bottles per issue.
- Automatic retry on agent error exit.
- Bottle destruction on issue close (PR close only; issue close is ambiguous).
- Concurrent multi-issue handling (one blocking run per orchestrator process).
- A monitoring dashboard (orchestrator-side concern).
- Folding `DeployKeyProvisioner` into the `Forge` abstraction. Deploy-key
provisioning runs at bottle-provision time on the host; the forge sidecar runs
inside the bottle at agent time. The two have different lifecycles and actors,
so coupling them into one class is deferred to a follow-up. This PRD only
shares the Gitea HTTP client between them.
## Design
### Targeting convention
An issue is forge-targeted when **both** hold:
- At least one assignee is a member of the Gitea org named by `FORGE_ORG`
(default `bot-bottle`). Checked via `GET /api/v1/orgs/{org}/members/{user}`.
- At least one label has the prefix `bot-bottle:`. The suffix names the agent
manifest, e.g. `bot-bottle:implementer` → agent `implementer`.
`FORGE_ORG` is read at orchestrate-command startup. It is not embedded in
manifests or state files; the orchestrator stamps its value into log output for
auditability.
An optional label `bot-bottle-bottle:<name>` overrides bottle selection. When
absent the agent's default bottle is used.
### `./cli.py orchestrate` — the thin wrapper
```
./cli.py orchestrate start --agent AGENT [--bottle BOTTLE ...] --prompt PROMPT
[--label LABEL] [--backend BACKEND]
./cli.py orchestrate resume --slug SLUG --prompt PROMPT [--backend BACKEND]
./cli.py orchestrate status
```
`orchestrate start` is a thin shim over the already-shipped `start --headless`
(#315): it forwards agent / bottle / label / prompt and adds the forge-specific
wiring (`forge_env`, sidecar launch). It does not re-implement headless launch.
The caller (`bot-bottle-orchestrator`) manages freeze, state, and the forge
sidecar's done signal around it.
`orchestrate resume` is the shim over the new `resume --headless` (below).
`orchestrate status` prints the forge state table.
### Headless primitives — what exists vs. what's new
Headless **start** already shipped in #315 and this PRD reuses it as-is:
- `./cli.py start <agent> --headless --prompt TEXT` — no TUI selectors, no y/N
preflight. Internally `_start_headless()` calls the shared `_launch_bottle()`
with `assume_yes=True` and `headless_prompt_text=prompt`.
- The prompt is delivered through `AgentProvider.headless_prompt(prompt)`
claude `-p`, codex positional, pi `-p`. The orchestrator does **not** hand-roll
agent args; it relies on this provider abstraction. (An earlier draft proposed
`start_headless` / `attach_agent_headless` helpers that constructed
`--no-interactive`/`-p` directly — those are dropped as redundant with, and
divergent from, what #315 merged.)
Two additions are needed on top of #315:
**1. A `forge_env` hook on the headless launch path.** The orchestrator needs to
pass forge context + token through to the forge sidecar launched alongside the
agent. This is a parameter threaded into `_launch_bottle` (the same core
`start --headless` already uses), not a parallel launch function. The agent
process itself does not receive the token.
**2. `resume --headless`** — new in `bot_bottle/cli/resume.py`, mirroring the
`--headless` flag on `start`:
```
./cli.py resume <slug> --headless --prompt TEXT
```
It rehydrates a frozen bottle and runs one headless prompt via the same
`assume_yes` + `headless_prompt` path, returning the agent's exit code. `resume`
has no non-interactive entry point today, so this is genuinely new work rather
than a rename of an existing helper.
### Forge sidecar
Forge-targeted bottles run a forge sidecar alongside the agent, mirroring the
supervise sidecar: a per-bottle process that exposes an HTTP/JSON-RPC endpoint
over a Unix socket and relays events to the orchestrator through a queue dir.
The agent calls the sidecar; the sidecar holds the forge token and makes the
actual forge API calls. The agent never receives the credential and never sees a
forge-specific endpoint — swapping Gitea for another forge does not change the
agent prompt or the sidecar protocol.
The sidecar is configured at launch from the forge context (owner, repo, issue,
PR) and the token, supplied by the orchestrator — not baked into the agent
manifest. Because the sidecar owns the token, forge traffic does not need a
cred-proxy egress route on the agent; the agent's egress policy is unchanged by
forge targeting.
**Sidecar protocol** (forge-agnostic; each method maps to a `Forge` call):
| Method | Scope | Purpose |
|---|---|---|
| `read_issue(number)` | read-anywhere | Read issue/PR body for context |
| `read_comments(number)` | read-anywhere | Read a thread for context |
| `post_comment(number, body)` | write-scoped | Post to the assigned issue/PR |
| `update_description(number, body)` | write-scoped | Edit the assigned issue/PR body |
| `signal_done(status, summary)` | — | Relay completion to the orchestrator |
**Scope enforcement** is read-anywhere / write-scoped: read methods accept any
issue/PR number for context; write methods are rejected unless the target is the
assigned issue or one of its PRs. This is tighter than Gitea's repo-wide API-key
permissions and bounds the blast radius of a prompt-injected agent. Rejections
are logged semantically (operation, target, reason) so the audit trail records
attempted out-of-scope writes, not just allowed ones.
**Semantic audit**: every sidecar call is logged as a structured operation
("read PR #318 description", "posted comment to #317", "signalled done:
success") rather than as opaque HTTP bytes. This log feeds provenance directly,
with no post-hoc egress-log parsing.
### `Forge` abstraction — `bot_bottle/contrib/forge/`
The sidecar dispatches to a `Forge` abstract class. Each provider implements the
operations behind the sidecar protocol:
```python
class Forge(abc.ABC):
@abc.abstractmethod
def read_issue(self, number: int) -> Issue: ...
@abc.abstractmethod
def read_comments(self, number: int) -> list[Comment]: ...
@abc.abstractmethod
def post_comment(self, number: int, body: str) -> None: ...
@abc.abstractmethod
def update_description(self, number: int, body: str) -> None: ...
@abc.abstractmethod
def is_org_member(self, org: str, username: str) -> bool: ...
@abc.abstractmethod
def get_pr_for_issue(self, number: int) -> int | None: ...
@abc.abstractmethod
def is_pr_open(self, number: int) -> bool: ...
```
`GiteaForge` is the first and only concrete implementation in this PRD. It wraps
the Gitea HTTP client (below). Adding GitHub or GitLab later is a new subclass;
the sidecar, protocol, and agent prompt are untouched.
> **Deferred:** `DeployKeyProvisioner` is *not* folded into `Forge` here.
> Deploy-key provisioning runs on the host at provision time; the sidecar runs
> in the bottle at agent time. They have different lifecycles and actors, so a
> shared abstract base would couple two unrelated auth contexts. For now they
> only share the Gitea HTTP client; a later PRD can revisit unification.
### Forge env vars
The orchestrator passes forge context to the **sidecar** (not the agent) at
launch. The agent does not need owner/repo/issue env vars to construct API
calls, since it only names issue/PR numbers to the sidecar:
| Var | Example | Purpose |
|---|---|---|
| `FORGE_GITEA_API` | `https://gitea.dideric.is/api/v1` | Base URL the sidecar calls |
| `FORGE_OWNER` | `didericis` | Repo owner |
| `FORGE_REPO` | `bot-bottle` | Repo name |
| `FORGE_ISSUE_NUMBER` | `317` | Assigned issue (defines write scope) |
| `FORGE_PR_NUMBER` | `318` | Assigned PR (empty until PR exists) |
The agent's forge-specific prompt instructs it to call `signal_done` on the
sidecar when a work unit is complete, and to use the sidecar for any
comment/description writes. The instruction is forge-agnostic and is part of the
forge prompt overlay, not the base agent manifest, so non-forge runs are
unaffected.
### Done signal and watchdog
The agent calls `signal_done(status, summary)` on the sidecar when it finishes a
work unit. The sidecar writes the event to its queue dir; the orchestrator reads
it and:
1. Reads the forge state for `(owner, repo, issue_number)`.
2. If `status == "running"`, treats the event as the done signal: freezes the
bottle, posts a summary comment with the provenance footer, sets
`status = "frozen"`.
Because completion is an explicit `signal_done` call, the orchestrator does not
parse comment text to detect "done", and intermediate comments the agent posts
mid-run cannot be mistaken for completion.
**Watchdog**: the orchestrator tracks `last_checkin_at` in forge state, updated
on each sidecar event. A background thread wakes every minute. If
`now - last_checkin_at > FORGE_WATCHDOG_TIMEOUT` (default 30 min, configurable
via env) and `status == "running"`, the orchestrator treats the run as
done-without-self-report: it posts the provenance footer (with `watchdog_fired`
set) and freezes the bottle.
**Sidecar-death failure mode**: if the forge sidecar crashes mid-run the agent
loses forge access while the bottle is otherwise healthy. The orchestrator
detects a dead sidecar (socket/queue gone) the same way it detects a stalled
agent and falls back to the watchdog path, posting a footer that flags the
incomplete run.
### Forge state — `bot_bottle/contrib/gitea/forge_state.py`
```
~/.bot-bottle/forge/
<owner>/
<repo>/
issue-<n>.json
```
Schema:
```json
{
"slug": "implementer-abc12",
"pr_number": 42,
"agent_name": "implementer",
"bottle_names": ["claude"],
"backend_name": "docker",
"agent_git_user": "didericis-claude",
"issue_number": 17,
"owner": "didericis",
"repo": "bot-bottle",
"status": "frozen",
"last_checkin_at": "2026-06-29T12:04:12-04:00"
}
```
`status`: `"running"` | `"frozen"` | `"destroyed"`.
Public API:
```python
def write_forge_state(state: ForgeState) -> None: ...
def read_forge_state(owner: str, repo: str, issue_number: int) -> ForgeState | None: ...
def delete_forge_state(owner: str, repo: str, issue_number: int) -> None: ...
def all_forge_states() -> list[ForgeState]: ...
```
Writes use atomic rename (`os.replace`) for crash safety.
### Provenance — `bot_bottle/contrib/gitea/provenance.py`
```python
def build_provenance_footer(
slug: str,
*,
agent_name: str,
bottle_names: tuple[str, ...],
started_at: str,
finished_at: str,
exit_code: int,
watchdog_fired: bool = False,
egress_log_path: Path | None = None,
) -> str:
"""Return a markdown string for appending to a Gitea comment body."""
```
Output (collapsed by default):
```markdown
<details><summary>🔬 Run provenance</summary>
| Field | Value |
|---|---|
| agent | `implementer` |
| bottle | `claude` |
| slug | `implementer-abc12` |
| started | 2026-06-29T12:00:00-04:00 |
| duration | 4m 12s |
| exit | 0 ✓ |
| gitleaks | ✓ no secrets detected |
| done signal | sidecar `signal_done` *(or: watchdog — agent did not signal)* |
**Egress** (deny-by-default; 2 routes allowed)
- `api.anthropic.com` — Bearer auth
- `pypi.org` — unauthenticated
Forge traffic is not an agent egress route — the forge sidecar holds the token
and makes those calls out of band. The provenance footer's forge operations come
from the sidecar's semantic audit log.
</details>
```
The egress summary is read from `~/.bot-bottle/state/<slug>/egress/`. When
unavailable the section is omitted. `watchdog_fired=True` changes the
"done signal" row to warn reviewers.
### Gitea HTTP client — `bot_bottle/contrib/gitea/client.py`
`GiteaForge` (and the existing `GiteaDeployKeyProvisioner`) share one thin HTTP
client. Unlike the option-2 design, the token is held by the sidecar process and
passed to the client directly — there is no agent-side cred-proxy route to
inject it, because the agent never makes forge calls.
```python
class GiteaClient:
def __init__(self, *, api_url: str, owner: str, repo: str, token: str) -> None: ...
def is_org_member(self, org: str, username: str) -> bool: ...
def post_comment(self, issue_number: int, body: str) -> None: ...
def update_comment_body(self, issue_number: int, body: str) -> None: ...
def get_pr_for_issue(self, issue_number: int) -> int | None: ...
def is_pr_open(self, pr_number: int) -> bool: ...
```
Sharing only the HTTP client (not an abstract base) is the deliberate boundary
between the sidecar and the deploy-key provisioner — see the deferral note under
the `Forge` abstraction.
### Implementation chunks
1. **Headless additions on top of #315** — thread a `forge_env` parameter into
the existing `_launch_bottle` core (the one `start --headless` already uses);
add a `--headless` path to `cli/resume.py` reusing `assume_yes` +
`headless_prompt`. No new `start_headless`/`attach_agent_headless` helpers.
Tests: `forge_env` reaches the sidecar/`guest_env`; `resume --headless` skips
the TUI and y/N preflight and returns the agent exit code.
2. **Forge state**`contrib/gitea/forge_state.py`: `ForgeState` dataclass,
read/write/delete/all helpers, atomic rename. Tests: round-trip JSON, missing
file → None, atomic write.
3. **`Forge` abstraction + Gitea client** — `contrib/forge/base.py` (`Forge`
ABC) and `contrib/gitea/client.py` + `GiteaForge`: `is_org_member`,
`read_issue`, `read_comments`, `post_comment`, `update_description`,
`get_pr_for_issue`, `is_pr_open`. Tests: mock `urllib.request.urlopen`,
assert payloads and 404-as-false for membership.
4. **Forge sidecar** — sidecar process exposing the protocol over a Unix socket,
queue-dir relay, write-scope enforcement, semantic op log, `signal_done`.
Reuses the supervise sidecar bundle machinery. Tests: dispatch each method to
the `Forge`, reject out-of-scope writes, `signal_done` writes a queue event,
scope-rejection is logged.
5. **Provenance**`contrib/gitea/provenance.py`: `build_provenance_footer`.
Tests: required fields present, watchdog row text, egress omitted when log
absent.
6. **`./cli.py orchestrate`** — `cli/orchestrate.py` with `start`, `resume`,
`status` subcommands wired into `cli.py`; `start` launches the forge sidecar
alongside the agent for forge-targeted runs. Tests: arg parsing, `start`
delegates to `start --headless`, `resume` delegates to `resume --headless`.
## Provenance as the product
Every orchestrator-posted comment ends with the provenance footer — non-optional
and not configurable off. PRs that land without a footer were not produced by
this integration. The `watchdog_fired` flag in the footer flags runs where the
agent did not self-report completion, so reviewers know the audit trail may be
incomplete.
The footer links to the bot-bottle repo pinned to the commit SHA active during
the run (not `main`), so the policy that governed the run is permanently
anchored in the PR history.
+16
View File
@@ -165,6 +165,22 @@ class TestAgentValidation(unittest.TestCase):
with self.assertRaises(ManifestError):
ManifestAgent.from_dict("a", {"skills": [5]}, set())
def test_skill_name_rejects_shell_metacharacters(self) -> None:
# Skill names become host/guest path segments interpolated into
# provisioning shell commands; anything outside kebab-case is
# rejected at load so it can never reach a `bottle.exec` string.
for bad in ("foo; rm -rf /", "../escape", "foo bar", "Foo", "-leading"):
with self.assertRaises(ManifestError):
ManifestAgent.from_dict("a", {"skills": [bad]}, set())
def test_skill_name_accepts_kebab_case(self) -> None:
agent = ManifestAgent.from_dict(
"a", {"skills": ["init-entry", "quality-eval", "skill0"]}, set()
)
self.assertEqual(
agent.skills, ("init-entry", "quality-eval", "skill0")
)
def test_prompt_not_string(self) -> None:
with self.assertRaises(ManifestError):
ManifestAgent.from_dict("a", {"prompt": 5}, set())