Commit Graph

380 Commits

Author SHA1 Message Date
didericis 8caa79ee76 test(supervise): ratchet supervise coverage to >=90%
test / unit (pull_request) Successful in 46s
test / integration (pull_request) Successful in 17s
test / coverage (pull_request) Successful in 58s
lint / lint (push) Successful in 1m54s
Sixth per-module ratchet under ADR 0004. Cover the queue/audit
malformed-input and fallback branches:

- path helpers (bot_bottle_root, queue_dir_for_slug,
  _id_from_proposal_filename non-match)
- read_proposal / read_response reject non-object JSON
- list_pending_proposals skips unreadable/non-dict/incomplete
  proposals and ones with a response already present
- wait_for_response tolerates a malformed or incomplete response file
  and then times out at the deadline
- read_audit_entries returns [] for a missing log and skips blank /
  non-JSON / non-dict / missing-field lines
- the fcntl flock helpers swallow OSError on a bad fd

supervise.py: 89% -> 99%. The one remaining line is an unreachable
`continue` (glob already guarantees the .proposal.json suffix).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 22:19:37 -04:00
didericis 74060192e0 test(manifest): ratchet manifest + manifest_agent to >=90%
test / unit (pull_request) Successful in 46s
test / integration (pull_request) Successful in 17s
test / coverage (pull_request) Successful in 56s
lint / lint (push) Successful in 1m54s
Fifth per-module ratchet under ADR 0004. Drive the validation
rejection and edge paths:

- ManifestBottle.from_dict: unknown key, non-string env value,
  non-bool supervise, removed `runtime` field.
- ManifestAgentProvider.from_dict: unknown key, empty template,
  non-string dockerfile, auth_token / forward_host_credentials
  template constraints.
- _parse_provider_settings: pass-through for non-built-in templates,
  startup_args shape, and the pi-specific string/int/bool/models/
  max_tokens_field/api-key-conflict checks.
- ManifestAgent.from_dict: bottle empty/undefined, skills shape, prompt
  type, agent-level git-gate.repos rejection, empty git-gate allowed.
- Eager ManifestIndex: empty bottles section, unknown-agent load,
  has_agent / require_agent, git_identity_summary (set and empty).

manifest_agent.py: 84% -> 99%; manifest.py: 86% -> 94%. Remaining
manifest.py misses are the lazy on-disk loader paths exercised by the
integration suite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 22:15:07 -04:00
didericis 5365a7a852 test(git-gate): ratchet git_gate coverage to >=90%
test / unit (pull_request) Successful in 43s
test / integration (pull_request) Successful in 17s
test / coverage (pull_request) Successful in 58s
lint / lint (push) Successful in 1m53s
Fourth per-module ratchet under ADR 0004. Cover the pure
`git_gate_render_gitconfig` renderer (empty entries, insteadOf URL,
scheme override, RemoteKey ssh alias with/without non-default port,
newline-injection rejection) and the dynamic gitea deploy-key
lifecycle with the forge provisioner mocked:

- `_provision_dynamic_key`: writes key + key-id files, strips `.git`
  from owner/repo, builds the proposal title; missing token raises.
- `revoke_git_gate_provisioned_keys`: revokes a gitea key when the
  id-file is present, skips static-provider entries and missing
  id-files, raises on a missing token.

bot_bottle/git_gate.py: 70% -> 99% (unit only). Two remaining partial
branches are inner conditionals on the alias/owner-repo paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 22:11:19 -04:00
didericis f289b6382c test(egress): ratchet egress_addon_core coverage to >=90%
lint / lint (push) Successful in 1m51s
test / unit (pull_request) Successful in 44s
test / integration (pull_request) Successful in 16s
test / coverage (pull_request) Successful in 57s
Third per-module ratchet under ADR 0004. Add a parsing/serialization
suite for the egress engine's core:

- route validation rejections: payload/route shape, host, auth pairing,
  git block, every matches sub-field (paths/methods/headers type +
  regex-compile + unknown-key), and the dlp block (detector type/name,
  outbound_on_match, unknown key)
- a full valid route round-trips; detectors:false disables
- parse_config log-level validation + load_config invalid-YAML
- route_to_yaml_dict: minimal/auth/git/dlp/matches with default-omission
- evaluate_matches: exact/prefix/regex paths, method filter, exact +
  regex header matching (match and non-match)

egress_addon_core.py: 84% -> 99%. The two remaining missed statements
are defensive guards (an unreachable separator-return and a
no-matching-path-type fallthrough).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 22:04:27 -04:00
didericis 3073230f58 test(yaml): ratchet yaml_subset coverage to >=90%
lint / lint (push) Successful in 1m51s
test / unit (pull_request) Successful in 45s
test / integration (pull_request) Successful in 16s
test / coverage (pull_request) Successful in 58s
Second per-module ratchet under ADR 0004. Add a branch-coverage suite
for the YAML-subset parser's reachable error/edge cases: literal `#`,
blank-line skipping, unterminated/empty/bad inline list+dict, quoted
commas in flow, missing `:` separators, non-bare keys, empty block ->
None, bare-dash nested lists, quoted-colon list scalars, nested/empty
list-item mappings, duplicate keys, document-level rejections
(block scalars, anchors, tags, non-column-0, top-level list), and
empty frontmatter.

yaml_subset.py: 82% -> 95%. The remaining misses are dead/defensive
guards (e.g. the unreachable bool branch, indent-mismatch raises that
the callers never trigger).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 22:00:17 -04:00
didericis 18059f2a78 test(egress): ratchet egress_addon coverage to >=90%
lint / lint (push) Successful in 1m52s
test / unit (pull_request) Successful in 44s
test / integration (pull_request) Successful in 16s
test / coverage (pull_request) Successful in 58s
First per-module ratchet under ADR 0004. Extend the adapter flow suite
to cover the remaining behavioural gaps:

- inbound response DLP: injection block (403), warn (logged, forwarded),
  and LOG_FULL response logging
- WebSocket inbound (server->client) scanning: injection kills the
  connection; warn does not; no-websocket is a no-op
- redaction scrubs the token in a header and the request path, not just
  the body
- supervise queue-write OSError fails closed (403)
- _token_allow_timeout_from_env: unset/valid/non-numeric/non-positive
- SIGHUP handler reloads routes; a reload failure keeps the last good
  config
- LOG_FULL logs the forwarded request

egress_addon.py: 76% -> 94%. The remaining misses are the low-value
edges (no-SIGHUP platform, hostname-redaction-fails-closed) called out
in the egress adapter PR.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 21:54:36 -04:00
didericis 632ab002ed ci(coverage): risk-weighted coverage policy + diff-coverage gate
lint / lint (push) Successful in 1m52s
test / unit (pull_request) Successful in 46s
test / integration (pull_request) Successful in 16s
test / coverage (pull_request) Successful in 1m2s
Adopt ADR 0004: stop chasing a single global coverage number and
measure what matters instead.

- Omit the genuinely-interactive `cli/init.py` shell (read_tty_line
  prompt loops) alongside the existing `cli/tui.py`, with a rationale
  comment in .coveragerc. Subprocess/backend orchestration is NOT
  omitted — it stays visible and is scored via the integration suite.
- scripts/coverage.sh runs unit + integration under one coverage
  measurement (the policy's yardstick) and can report the critical
  security/logic core held to the >=90% target.
- scripts/diff_coverage.py is a stdlib-only gate (no diff-cover dep):
  new/changed executable lines must be >=90% covered. This is the
  enforced regression guard; the global number is informational.
- CI gains a `coverage` job: combined report + the diff-coverage gate.
- Unit-test `cli/__init__.py` dispatch/exit-code mapping (it's logic,
  not I/O, so it earns tests rather than an omit).

Combined unit+integration coverage now reports 83% global / 87% across
the critical modules; per-module ratcheting toward 90% is the ongoing
work this policy frames.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 21:29:08 -04:00
didericis af7f74dc32 test(egress): cover egress_addon adapter; drop coverage omit
lint / lint (push) Successful in 1m50s
test / unit (pull_request) Successful in 46s
test / integration (pull_request) Successful in 17s
The mitmproxy adapter `egress_addon.py` was omitted from coverage
because it can't import on the host (mitmproxy is sidecar-only) and
only its log-redaction helpers were exercised. Add a request/response
flow suite that stubs mitmproxy and drives the adapter glue:
introspection, allowlist enforcement, auth strip+inject, git
push/fetch blocking, the outbound-DLP block/redact/supervise policy
branches (including the operator approval round-trip), inbound
response scanning, and WebSocket frame scanning.

Removes the `bot_bottle/egress_addon.py` omit from `.coveragerc`;
the adapter now reports ~76% covered.

Closes #286

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-25 19:31:21 -04:00
didericis ca910f8f4f fix(start): show bottle lineage root-first with -> arrows
lint / lint (push) Successful in 1m51s
test / unit (push) Successful in 43s
test / integration (push) Successful in 17s
test / unit (pull_request) Successful in 43s
test / integration (pull_request) Successful in 18s
prd-number / assign-numbers (push) Successful in 21s
Update Quality Badges / update-badges (push) Failing after 1m47s
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 16:13:19 -04:00
didericis-codex 338c08a243 test: fix cli selector typing 2026-06-25 16:13:19 -04:00
didericis-claude 6faa6f67aa feat(tui,start): space/enter split, bottle lineage, YAML preflight
Three UX improvements requested in #270 review:

- filter_multiselect: Space toggles selection, Enter confirms (was both)
- bottle picker: bottles with extends chains show ancestry labels
  (e.g. 'claude-dev <- bot-bottle-dev <- dev') for at-a-glance lineage
- preflight: replaces key-value summary with YAML of the resolved manifest
2026-06-25 16:13:19 -04:00
didericis-claude b6ae6af63a fix(types): resolve pyright errors introduced in #269 changes
- manifest.py: remove unused load_bottle_chain_from_dir import
- manifest_extends.py: drop redundant ManifestEgressRoute annotation
- test_cli_start_selector.py: remove unused call import
- test_cli_tui.py: move Optional/constants to top, annotate FakeScreen,
  remove unused curses import
- test_manifest_bottle_merge.py: add type args to dict, annotate **kwargs
2026-06-25 16:13:19 -04:00
didericis-claude ad72eeddc1 feat(tui): add reordering to filter_multiselect
Tab switches focus to the selected-order panel; K/J shift the
highlighted item up/down; Space/Enter removes it. The filter list dims
while the order panel is active. Help line updates per focus mode.
2026-06-25 16:13:19 -04:00
didericis-claude 1ba185d1e0 feat(#269): separate agent and bottle selection at launch time
- `bottle:` in agent frontmatter is now optional; agents without it
  are portable and require bottles to be selected at launch.
- Adds `filter_multiselect` to `tui.py`: multi-select picker with
  ordered selection list, Space/Enter to toggle, Ctrl-D to confirm.
- `ManifestIndex` gains `all_bottle_names` and `load_for_agent` accepts
  `bottle_names: tuple[str, ...]` to merge bottles in order at runtime.
- `merge_bottles_runtime` in `manifest_extends.py` applies the same
  field-merge rules as `extends:` to pre-resolved bottle objects.
- `BottleSpec` gains `bottle_names`; `_validate` and `write_launch_metadata`
  thread it through so `resume` replays the same bottle configuration.
- `cmd_start` shows the bottle multiselect after agent selection,
  pre-populated from the agent's `bottle:` field when present.
- Existing agents with `bottle:` declared continue to work unchanged.
2026-06-25 16:13:19 -04:00
didericis-claude 45a096413f fix: add type annotations to __exit__ context manager (pyright)
lint / lint (push) Successful in 1m47s
test / unit (pull_request) Successful in 43s
test / integration (pull_request) Successful in 16s
2026-06-25 15:03:06 -04:00
didericis c6479d62e4 test: add coverage for git gate and supervise server 2026-06-25 15:03:06 -04:00
didericis-claude 302920e290 feat: support multiple parents in bottle extends:
Allow extends: to accept a list of bottle names in addition to a plain
string. Parents are resolved independently and folded left-to-right
into a single combined parent before the child is merged on top, so
orthogonal concerns (base env, networking, agent provider) can live in
separate bottles without forcing a linear chain.

Merge rules for the parent fold: env dict-merge with later winning on
collision; git-gate.user per-field overlay; git-gate.repos union by
name with later winning per-field on same name; egress.routes
concatenated; all scalar fields (supervise, agent_provider, egress.log)
use last-wins. The existing child-wins-over-all-parents rule is
unchanged. Cycle detection, diamond deduplication, and missing/invalid
parent errors all work across multi-parent graphs.

Closes #268
2026-06-25 05:10:03 -04:00
didericis-codex d2072b13be feat!: remove capability apply
test / unit (pull_request) Successful in 36s
test / integration (pull_request) Successful in 18s
lint / lint (push) Failing after 1m53s
test / unit (push) Successful in 40s
test / integration (push) Successful in 20s
Update Quality Badges / update-badges (push) Successful in 1m37s
2026-06-25 08:58:28 +00:00
didericis-claude 515a95a79d fix: escape quotes/newlines in YAML and gitconfig emitters
test / unit (pull_request) Successful in 35s
test / integration (pull_request) Successful in 17s
lint / lint (push) Successful in 1m48s
test / unit (push) Successful in 32s
test / integration (push) Successful in 16s
Update Quality Badges / update-badges (push) Successful in 1m18s
Closes #258.

`egress_render_routes` and `_render_match_entry` now pass all manifest
strings (host, auth_scheme, token_env, path/header values) through
`_yaml_str_escape` before interpolating into double-quoted YAML scalars,
preventing stray `"` or newlines from corrupting routes.yaml.

`git_gate_render_gitconfig` now calls `_gitconfig_validate_value` on
each Upstream value (and the derived alias) before writing the
`insteadOf` line, rejecting any value containing a newline that would
inject arbitrary gitconfig keys.
2026-06-25 04:23:13 -04:00
didericis-claude 0bace7615a refactor: rename GIT_GATE_DAEMON_TIMEOUT_SECS to GIT_GATE_TIMEOUT_SECS
test / unit (pull_request) Successful in 33s
test / integration (pull_request) Successful in 17s
lint / lint (push) Successful in 1m48s
test / unit (push) Successful in 35s
test / integration (push) Successful in 17s
Update Quality Badges / update-badges (push) Successful in 1m20s
The constant now covers the daemon path, the HTTP backend access-hook,
and the git http-backend CGI subprocess, so 'daemon' in the name was
too narrow. Updated the comment to list all three current uses.
2026-06-25 04:12:43 -04:00
didericis-claude c0d3f16519 refactor: import GIT_GATE_DAEMON_TIMEOUT_SECS instead of duplicating the value 2026-06-25 04:12:43 -04:00
didericis-claude 508c537deb fix: add explicit timeouts to subprocess and HTTP calls in git-gate paths
Closes #255. Without timeouts, a hung upstream during the access-hook
or git http-backend CGI call (git_http_backend.py) and a stalled Gitea
API during deploy-key provisioning (contrib/gitea/deploy_key_provisioner.py)
could wedge a sidecar indefinitely. Adds GIT_HTTP_BACKEND_TIMEOUT_SECS
(30s) to both subprocess.run calls in the HTTP backend, mirroring the
existing GIT_GATE_DAEMON_TIMEOUT_SECS on the daemon path. Adds
_API_TIMEOUT_SECS (30s) and _KEYGEN_TIMEOUT_SECS (10s) to the Gitea
provisioner's urlopen and ssh-keygen calls. Tests verify the timeout
values are forwarded in all four call sites.
2026-06-25 04:12:43 -04:00
didericis-claude d99dba037c feat(supervise): typed RPC error taxonomy for dispatch
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 17s
lint / lint (push) Successful in 1m48s
test / unit (push) Successful in 32s
test / integration (push) Successful in 18s
Update Quality Badges / update-badges (push) Successful in 1m20s
Introduce _RpcClientError and _RpcInternalError as distinct subclasses
of _RpcError so the dispatcher can handle bad requests and server-side
faults differently — returning client errors verbatim and logging
internal faults with their cause before replying ERR_INTERNAL.

Wrap write_proposal and archive_proposal IO with _RpcInternalError
so OS failures surface through the typed path instead of the bare
Exception fallback. All existing raise _RpcError(...) call sites
converted to _RpcClientError.

Closes #253
2026-06-25 04:02:39 -04:00
didericis-claude 9a878bd885 fix: guard CGI Status-line parse in _write_cgi_response
test / unit (pull_request) Successful in 32s
test / integration (pull_request) Successful in 16s
lint / lint (push) Successful in 1m47s
test / unit (push) Successful in 32s
test / integration (push) Successful in 16s
Update Quality Badges / update-badges (push) Successful in 1m19s
An empty or non-numeric Status: header from git http-backend raised
ValueError/IndexError that escaped the handler thread. Wrap the parse
in a try/except and fall back to HTTP 500 instead.

Closes #254
2026-06-25 03:47:05 -04:00
didericis 0f72843150 fix(macos-container): anchor relative Dockerfile path to build context
test / unit (pull_request) Successful in 33s
test / integration (pull_request) Successful in 17s
lint / lint (push) Successful in 1m49s
test / unit (push) Successful in 33s
test / integration (push) Successful in 18s
Update Quality Badges / update-badges (push) Successful in 1m19s
`container build` resolves -f relative to the current working directory,
not the build context, so builds failed from any cwd other than the repo
root. Anchor a relative Dockerfile to the context before passing it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 03:27:46 -04:00
didericis fd6b14fb32 fix: route remote control through provider startup args
test / unit (pull_request) Successful in 32s
test / integration (pull_request) Successful in 18s
lint / lint (push) Successful in 1m46s
test / unit (push) Successful in 30s
test / integration (push) Successful in 17s
Update Quality Badges / update-badges (push) Successful in 1m23s
2026-06-25 03:08:47 -04:00
didericis-claude 9f9aa2e762 refactor: remove load_routes, use load_config(...).routes in tests
test / unit (pull_request) Successful in 48s
test / integration (pull_request) Successful in 26s
lint / lint (push) Successful in 1m45s
test / unit (push) Successful in 32s
test / integration (push) Successful in 17s
Update Quality Badges / update-badges (push) Successful in 1m21s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-25 06:07:47 +00:00
didericis-codex 454baaf3a1 fix(egress): validate proposed full config
lint / lint (push) Successful in 2m23s
test / unit (pull_request) Successful in 47s
test / integration (pull_request) Successful in 28s
2026-06-25 05:25:42 +00:00
didericis-codex e7dacf7d86 fix: satisfy pyright for log redaction tests
test / integration (pull_request) Successful in 29s
prd-number / assign-numbers (push) Successful in 1m6s
Update Quality Badges / update-badges (push) Successful in 1m40s
test / unit (pull_request) Successful in 52s
lint / lint (push) Successful in 2m20s
test / unit (push) Successful in 50s
test / integration (push) Successful in 27s
2026-06-25 00:32:42 -04:00
didericis-claude 9b929d0684 fix(egress): strip injected Authorization and redact bodies in LOG_FULL path
_log_request and _log_response wrote headers and bodies to stderr verbatim.
_log_request also included the sidecar-injected upstream Authorization value,
exposing live bearer tokens on every allowed request under LOG_FULL.

Apply redact_tokens to all header values and bodies in both log functions;
exclude the authorization header from _log_request entirely since its value
is always a live sidecar-injected credential by the time _log_request runs.

Closes #257
2026-06-25 00:32:42 -04:00
didericis-codex d9a9eef276 docs: remove prd-new code citations
test / integration (pull_request) Successful in 46s
test / unit (pull_request) Successful in 1m4s
lint / lint (push) Successful in 2m36s
prd-number / assign-numbers (push) Successful in 1m24s
test / integration (push) Successful in 34s
test / unit (push) Successful in 52s
Update Quality Badges / update-badges (push) Successful in 2m11s
2026-06-25 03:57:41 +00:00
didericis-codex 5204b98777 refactor(egress): centralize launch env entries
lint / lint (push) Successful in 2m12s
test / unit (pull_request) Successful in 43s
test / integration (pull_request) Successful in 25s
2026-06-25 03:35:24 +00:00
didericis-codex 14ae89580a fix(egress): wire canary env for smolmachines
lint / lint (push) Successful in 2m16s
test / unit (pull_request) Successful in 42s
test / integration (pull_request) Successful in 23s
2026-06-25 03:31:51 +00:00
didericis-codex 4808ef557a fix(egress): randomize canary secret env name
lint / lint (push) Successful in 2m15s
test / unit (pull_request) Successful in 45s
test / integration (pull_request) Successful in 26s
2026-06-25 03:25:37 +00:00
didericis-codex 0a7e166b35 fix(tests): remove unused dlp entropy import
lint / lint (push) Successful in 2m8s
test / unit (pull_request) Successful in 40s
test / integration (pull_request) Successful in 23s
2026-06-24 23:09:11 -04:00
didericis-claude 11cf12188d feat(egress): inject per-session canary token into sidecar and agent environments
EgressPlan gains a `canary: str` field (default "") populated in Egress.prepare()
using secrets.token_urlsafe(32).  Each launched bottle:

  - sidecar receives EGRESS_TOKEN_CANARY=<value> (literal env entry, scanned by
    existing known-secrets detector without any detector code changes)
  - agent receives BOT_BOTTLE_CANARY=<value> (visible fake secret that signals
    exfiltration with zero false positives if it appears in outbound traffic)

Docker compose and macos-container backends updated; smolmachines shares docker
compose and so picks this up automatically.  Unit tests cover canary uniqueness,
detection via scan_known_secrets, and EgressPlan backward-compat default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-24 23:09:11 -04:00
didericis-claude 701df6cb2f feat(dlp): fragmentation resistance, entropy detector, broadened known-value scan
- _alnum_projection(): strip non-alphanumeric chars for separator-injection detection
- scan_known_secrets() gains two extra passes per secret after exact-variant matching:
  alnum-projection exact match (catches hyphens/spaces between secret chars) and a
  sliding-window partial-match scan (catches chunked substrings ≥ PARTIAL_MATCH_MIN_LEN)
- scan_known_secrets() accepts sensitive_prefixes param (default ("EGRESS_TOKEN_",))
  so redact_tokens and call-sites can extend the scanned env-var prefix set
- scan_entropy() warn-only detector flagging windows with Shannon entropy ≥ 5.5 bits/char
- "entropy" added to OUTBOUND_DETECTOR_NAMES; scan_outbound opts it in only when
  explicitly listed in dlp.outbound_detectors (never part of the default "all" set)
- scan_outbound reads BOT_BOTTLE_SENSITIVE_PREFIXES from environ to extend
  scan_known_secrets beyond EGRESS_TOKEN_* without schema changes
- Binary bodies decoded via latin-1 fallback (bijective byte↔codepoint) instead
  of utf-8 errors=replace, preserving ASCII secret strings in binary payloads

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-24 23:09:11 -04:00
didericis ecaae708f7 feat(provider): support startup args settings
test / unit (pull_request) Successful in 41s
test / integration (pull_request) Successful in 26s
lint / lint (push) Successful in 2m12s
test / unit (push) Successful in 41s
test / integration (push) Successful in 26s
Update Quality Badges / update-badges (push) Successful in 2m9s
2026-06-24 22:51:27 -04:00
didericis-claude 2e790268b0 fix(deploy-key): raise DeployKeyCollisionError on 422 key conflicts
lint / lint (push) Successful in 2m7s
test / unit (push) Successful in 46s
test / integration (push) Successful in 25s
Update Quality Badges / update-badges (push) Successful in 2m7s
Gitea returns HTTP 422 when a deploy key title or public key content
already exists on the repo. The provisioner previously surfaced this
as a generic RuntimeError with the raw status code. Introduce
DeployKeyCollisionError (a RuntimeError subclass) in the base module
and detect 422 in GiteaDeployKeyProvisioner.create so callers can
catch collisions explicitly and the error message names the repo and
title involved.
2026-06-25 02:23:12 +00:00
didericis-claude a421d1d688 Rename TOOL_ALLOW to TOOL_EGRESS_ALLOW
lint / lint (push) Successful in 3m50s
test / unit (push) Successful in 38s
test / integration (push) Successful in 23s
Update Quality Badges / update-badges (push) Successful in 1m38s
The constant and its MCP tool name ("allow" → "egress-allow") were the
only supervise tools without an egress-scoped identifier, despite the
tool being egress-only (routes.yaml payload, COMPONENT_FOR_TOOL maps
it to "egress", always grouped with TOOL_EGRESS_BLOCK). The rename
brings it in line with TOOL_EGRESS_BLOCK and TOOL_EGRESS_TOKEN_ALLOW,
and adds TOOL_EGRESS_ALLOW and TOOL_EGRESS_BLOCK to __all__ (both were
previously absent).
2026-06-25 01:23:10 +00:00
didericis 1ad710a041 Default agent-provider routes to the redact on-match policy
lint / lint (push) Successful in 1m42s
test / unit (pull_request) Successful in 34s
test / integration (pull_request) Successful in 16s
Provider routes (the agent talking to its own LLM API — api.anthropic.com,
the Codex backend, etc.) carry the whole conversation payload, which is the
worst source of token-shaped false positives. egress_routes_for_bottle now
fills outbound_on_match=redact on any provider route that doesn't set it
explicitly, so a match there is scrubbed and forwarded rather than blocked
or queued for the operator. A provider that sets the policy keeps its
choice; manifest routes still default to supervise.

Tests: provider route gets redact default, explicit provider policy
preserved, manifest route unaffected. README + PRD 0062 updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
2026-06-24 20:40:36 -04:00
didericis b411577e76 Stop scanning the request body for CRLF injection
lint / lint (push) Successful in 1m41s
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 18s
A 403 "egress DLP: URL-encoded CRLF (%0d%0a)" was firing on legitimate
requests (e.g. the Claude Code login flow) and bypassing the on-match
policy entirely, because CRLF blocks carry no matched value and were
routed straight to a hard 403.

Root cause: CRLF injection is only an attack in the request line and
headers. An HTTP body is delimited by Content-Length, so CRLF bytes in
the body cannot split the request — but the scan flattened the body into
the same blob it checked, so form-encoded / multi-line body content
(which legitimately contains %0d%0a) tripped it.

Fix:
- scan_outbound takes a crlf_text param; the addon scans CRLF only over
  the body-excluded request line + headers. crlf_text=None keeps the
  old full-blob behavior for host-side callers/tests; the websocket path
  passes "" since a data frame is not a request line.
- The redact policy now also scrubs CRLF (new strip_crlf helper) from the
  path and headers, so redact is a complete escape hatch and structural
  CRLF in the URL/headers can be forwarded when a route opts into it.

Tests: strip_crlf unit tests; scan_outbound crlf_text body-exclusion and
backward-compat tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
2026-06-24 20:37:26 -04:00
didericis cdfaaa3de8 Add dlp.outbound_on_match policy (block | redact | supervise)
lint / lint (push) Successful in 1m41s
test / unit (pull_request) Successful in 30s
test / integration (pull_request) Successful in 18s
Give each egress route a policy for what the proxy does when an outbound
DLP detector matches a token, defaulting to the supervise flow added in
the previous commit. The goal is cutting false-positive friction without
weakening default-deny.

- redact: scrub the matched value(s) from the body, non-host headers, and
  path/query via redact_tokens, then re-scan. Forward if clean; fail
  closed with a 403 if a match remains on a surface redaction can't
  rewrite (the hostname, or a unicode-evasion token). For routes where a
  token-shaped value is noise the upstream doesn't need.
- block: the original hard 403, never overridable.
- supervise (default, unset): hold the request for operator approval.

Structural blocks (CRLF, no safelist-able value) stay hard 403s under
every policy.

Threads outbound_on_match from the bottle manifest (manifest_egress)
through the resolved EgressRoute and rendered routes.yaml (egress.py) to
the addon's Route (egress_addon_core), and round-trips it via the
list-egress-routes introspection endpoint. The allow/egress-block tool
descriptions document the new key.

Tests: manifest parse/validation, core parse/validation, full
manifest->render->addon round-trip for redact. README + PRD 0062 updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
2026-06-24 16:50:13 -04:00
didericis 7f2352287e PRD 0062: supervisor override for egress token blocks
lint / lint (push) Successful in 1m42s
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 16s
When the outbound DLP catches a token, route the block through the
existing supervisor approval queue instead of returning 403 outright.
The egress proxy holds the request open until the operator answers, then
remembers an approved value for the life of the proxy so the request --
and later ones carrying it -- flow through. Fails closed on rejection,
timeout, malformed response, or when supervise is disabled.

- ScanResult.matched carries the raw matched substring (sidecar-only;
  never logged or written to the proposal). scan_outbound and the token
  detectors take a safe_tokens set and skip approved values, continuing
  past a safelisted match so a second secret in the same request is
  still caught.
- New egress-token-allow proposal tool, written directly to the queue by
  the addon (the gitleaks-allow pattern from PRD 0061). build_token_allow
  _payload renders host/method/path/detector reason + redacted context.
- Async request hook polls the queue without stalling the proxy event
  loop; EGRESS_TOKEN_ALLOW_TIMEOUT_SECONDS (default 300) bounds the wait.
- Supervisor TUI renders egress-token-allow like gitleaks-allow: report
  only, modify unavailable, approval requires a recorded reason.
- Unit tests for the matched/safe-tokens plumbing, payload builder, tool
  constant round-trip, and TUI paths; README + PRD 0062.

Closes #261.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
2026-06-24 16:12:50 -04:00
didericis 7cb967770e feat(log): add leveled severity and structured context to log wrappers
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 16s
lint / lint (push) Successful in 1m39s
test / unit (push) Successful in 31s
test / integration (push) Successful in 16s
Update Quality Badges / update-badges (push) Successful in 1m32s
log.py was bare print-to-stderr wrappers with no levels or attributable
context (issue #252). Add:

- Ordered severities (debug/info/warn/error) gated by
  BOT_BOTTLE_LOG_LEVEL (default info). debug is silent by default;
  error always surfaces (nothing sits above it), so the fatal die path
  stays visible regardless of configured level.
- An optional `context` mapping on every wrapper, rendered as a
  parseable ` [k=v ...]` suffix (keys sorted; whitespace/quoted values
  quoted) so failures can be filtered and correlated.

Default output with no context is byte-identical to the original lines,
so the 100+ existing single-string call sites are unaffected. Wires the
supervise crash path (the example the issue names) to attach error_type
and crash_log context. Adds test_log.py (backward-compat, context
rendering, level gating, die surfacing).

Closes #252.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01YcU7nerbg8cVj9R4EkpfLJ
2026-06-24 15:37:57 -04:00
didericis-claude 88c4f61901 fix: don't archive gitleaks-allow response before gate reads it
test / unit (pull_request) Successful in 41s
test / integration (pull_request) Successful in 18s
lint / lint (push) Successful in 1m52s
prd-number / assign-numbers (push) Successful in 45s
test / unit (push) Successful in 36s
test / integration (push) Successful in 21s
Update Quality Badges / update-badges (push) Successful in 1m19s
The TUI was calling archive_proposal for gitleaks-allow immediately
after write_response, moving the response file to processed/ within
microseconds. The git-gate shell loop polls queue_dir for the response
file every second — it never sees it and hangs until timeout.

capability-block is handled by the MCP sidecar which archives after
reading; gitleaks-allow is handled by the shell gate which archives
after processing. Let the gate own the archive step.
2026-06-23 17:37:01 -04:00
didericis-codex 33333ac4d9 Supervise gitleaks inline allow exceptions 2026-06-23 17:36:08 -04:00
didericis-claude eb64a52ffa fix(smolmachines): commit via exec-tar instead of stop→pack
smolvm pack create --from-vm requires the VM to be stopped, and stopping
a smolmachines VM terminates any running interactive session.

Instead, mirror the macos-container approach: exec into the running VM as
root and stream the root filesystem via tar (smolvm machine exec -- tar),
build a Docker image from the archive, push to an ephemeral local registry,
and run smolvm pack create --image to produce the .smolmachine artifact.
The VM stays running throughout the commit.

Remove the stop-confirm prompt and machine_is_running check that were
added in the previous commit — neither is needed when we no longer stop.
2026-06-23 16:53:41 -04:00
didericis-claude d11e3940fa fix(smolmachines): stop VM before pack commit, with confirm prompt
smolvm pack create --from-vm requires the VM to be stopped. Add
machine_is_running() to smolvm.py (via machine ls --json state field),
and add the same confirm-stop flow to SmolmachinesFreezer that was
originally designed for macos-container: if running, prompt the user,
stop the VM, then pack. Already-stopped VMs are packed directly.
2026-06-23 16:53:41 -04:00
didericis-claude a32c0c7865 test: update macos-container tests for exec-tar commit approach
- Rename export test to reflect new exec-tar mechanism; update argv
  assertions to match the new `container exec ... tar` command shape
- Change mock stderr from str to bytes (subprocess.PIPE without text=True)
- Add type annotation to capture_freeze closure to satisfy pyright
2026-06-23 16:53:41 -04:00