Commit Graph

575 Commits

Author SHA1 Message Date
didericis-claude 70c9f7254c docs: add PRD 0040 2026-06-02 11:16:17 -04:00
didericis-claude b9108339e7 docs: mark PRD 0039 Active
test / unit (pull_request) Successful in 33s
test / integration (pull_request) Successful in 43s
test / unit (push) Successful in 30s
test / integration (push) Successful in 41s
2026-06-02 11:15:27 -04:00
didericis-claude e5b5dd16f1 feat(dashboard): guard capability-block approval for smolmachines bottles (PRD 0039)
apply_capability_change is Docker-only teardown/apply code. Before this
change it was called regardless of backend, so approving a capability-block
proposal from a smolmachines agent would run Docker commands against a
slug that has no Docker container.

After this change approve() reads the bottle's metadata: if compose_project
is empty (the smolmachines indicator) it raises CapabilityApplyError with
a clear operator message before any teardown runs. Docker bottles (non-empty
compose_project) and unknown bottles (no metadata) fall through to the
existing Docker path unchanged.

Closes #136
2026-06-02 11:15:27 -04:00
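The guard described in e5b5dd16f1 can be sketched roughly as below. This is a minimal illustration, not the project's code: `BottleMetadata` and `check_backend_supports_apply` are hypothetical names, and only `CapabilityApplyError` and `compose_project` come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


class CapabilityApplyError(Exception):
    """Raised when a capability change cannot be applied to a bottle."""


@dataclass
class BottleMetadata:
    # Hypothetical shape: only the field the commit message mentions.
    compose_project: str = ""


def check_backend_supports_apply(slug: str, metadata: Optional[BottleMetadata]) -> None:
    """Raise before any teardown runs if the bottle is not Docker-backed."""
    if metadata is None:
        # Unknown bottle (no metadata): fall through to the existing Docker path.
        return
    if not metadata.compose_project:
        # Empty compose_project is the smolmachines indicator.
        raise CapabilityApplyError(
            f"bottle {slug!r} runs on smolmachines; capability changes "
            "cannot be applied through the Docker path"
        )
```

Raising before teardown keeps the failure side-effect free: no Docker command is issued for a slug that has no container.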
didericis-claude cf76d1a245 docs: add PRD 0039 2026-06-02 11:15:27 -04:00
didericis-claude 717a9126e1 docs: mark PRD 0038 Active
test / integration (pull_request) Successful in 56s
test / unit (pull_request) Successful in 38s
test / unit (push) Successful in 31s
test / integration (push) Successful in 42s
2026-06-02 14:38:44 +00:00
didericis-claude 8830306101 feat(smolmachines): resolve manifest env through resolve_env() (PRD 0038)
Before this change smolmachines prepare.py spliced bottle.env directly
into guest_env, so ?prompt and ${HOST_VAR} entries reached the VM as
raw sentinels rather than being prompted or interpolated.

After this change prepare.py calls resolve_env(), matching the Docker
backend's contract. Forwarded (secret/interpolated) values still flow
through smolvm -e K=V argv — the known exposure gap documented in PRD
0038's open question.

Closes #135
2026-06-02 14:38:36 +00:00
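To make the difference in 8830306101 concrete, here is a small sketch of what resolving `?prompt` and `${HOST_VAR}` entries could look like. The real `resolve_env()` signature is not shown in this log, so `resolve_env_sketch`, its parameters, and the choice to raise `KeyError` on a missing host variable are all assumptions.

```python
import re
from typing import Callable, Mapping

# Matches ${NAME} references to host environment variables.
_HOST_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_sketch(
    manifest_env: Mapping[str, str],
    host_env: Mapping[str, str],
    prompt: Callable[[str], str],
) -> dict[str, str]:
    """Turn raw manifest values into the values a guest should receive."""
    resolved: dict[str, str] = {}
    for name, raw in manifest_env.items():
        if raw == "?prompt":
            # Ask the operator instead of forwarding the sentinel itself.
            resolved[name] = prompt(name)
        else:
            # Interpolate ${HOST_VAR}; a missing variable raises KeyError
            # (an assumption made for this sketch).
            resolved[name] = _HOST_VAR.sub(lambda m: host_env[m.group(1)], raw)
    return resolved
```

Splicing `bottle.env` directly, as the old code did, corresponds to skipping this function entirely, which is why the sentinels reached the VM unchanged.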
didericis-claude 1c242b0ad9 docs: add PRD 0038
test / unit (pull_request) Successful in 52s
test / integration (pull_request) Successful in 1m2s
2026-06-02 10:28:04 -04:00
didericis-codex f95ef0c446 complete(prd): mark PRD 0037 active
test / unit (pull_request) Successful in 33s
test / integration (pull_request) Successful in 44s
test / unit (push) Successful in 29s
test / integration (push) Successful in 47s
2026-06-02 08:15:20 +00:00
didericis-codex 6e954da9b7 fix(pipelock): validate yaml render config 2026-06-02 08:15:20 +00:00
didericis-codex 9185c145a1 docs(prd): add pipelock yaml contract
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 42s
2026-06-02 04:14:45 -04:00
didericis-codex a79ef61b62 complete(prd): mark PRD 0036 active
test / unit (pull_request) Successful in 32s
test / integration (pull_request) Successful in 44s
test / unit (push) Successful in 31s
test / integration (push) Successful in 45s
2026-06-02 08:10:34 +00:00
didericis-codex 0a8bba58c7 fix(codex): harden auth redaction 2026-06-02 08:10:34 +00:00
didericis-codex 2247d730cd docs(prd): add codex auth redaction policy
test / unit (pull_request) Successful in 35s
test / integration (pull_request) Successful in 42s
2026-06-02 04:09:18 -04:00
didericis-codex 3472e06efb complete(prd): mark PRD 0035 active
test / integration (pull_request) Successful in 1m4s
test / unit (pull_request) Successful in 45s
test / unit (push) Successful in 36s
test / integration (push) Successful in 46s
2026-06-02 08:06:53 +00:00
didericis-codex 82ce5d3034 fix(supervise): bound response waits 2026-06-02 08:06:45 +00:00
didericis-codex 7c260eeff9 docs(prd): add supervise wait bounds
test / unit (pull_request) Successful in 36s
test / integration (pull_request) Successful in 54s
2026-06-02 07:58:39 +00:00
didericis-codex fe6059e4a6 complete(prd): mark PRD 0034 active
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 52s
test / unit (push) Successful in 34s
test / integration (push) Successful in 50s
2026-06-02 07:52:38 +00:00
didericis-codex 31708abfad fix(sidecar): queue restart signals 2026-06-02 07:52:19 +00:00
didericis-codex 1b34b1df85 docs(prd): add sidecar restart semantics
test / unit (pull_request) Successful in 42s
test / integration (pull_request) Successful in 59s
2026-06-02 07:43:34 +00:00
didericis-codex 51831bf9c0 complete(prd): mark PRD 0033 active
test / unit (pull_request) Successful in 36s
test / integration (pull_request) Successful in 57s
test / unit (push) Successful in 39s
test / integration (push) Successful in 56s
2026-06-02 07:32:29 +00:00
didericis-codex 8f28bd81a7 refactor(manifest): split schema boundaries 2026-06-02 07:32:06 +00:00
didericis-codex 662e3e1f95 docs(prd): point manifest boundaries to issue 125
test / unit (pull_request) Successful in 41s
test / integration (pull_request) Successful in 57s
2026-06-02 07:31:29 +00:00
didericis-codex 6315456a59 docs(prd): add manifest schema boundaries
test / unit (pull_request) Successful in 48s
test / integration (pull_request) Successful in 1m4s
2026-06-02 07:23:04 +00:00
didericis-claude a81f0ffa49 fix(smolmachines): raise SmolvmError instead of die() on wait_exec_ready timeout
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 58s
test / unit (push) Successful in 38s
test / integration (push) Successful in 55s
die() raises Die(SystemExit), which implies a process exit. A timeout in
wait_exec_ready is a bringup failure — raising SmolvmError lets the caller
decide whether it's fatal, consistent with how machine_start failures propagate.
2026-06-02 06:29:05 +00:00
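The reasoning in a81f0ffa49 rests on a Python detail: `SystemExit` derives from `BaseException`, so an ordinary `except Exception` handler never sees it. The sketch below illustrates that; `bringup` is a hypothetical caller, while `Die(SystemExit)` and `SmolvmError` follow the commit message.

```python
class Die(SystemExit):
    """Mirrors the commit's description: die() raises Die(SystemExit)."""


class SmolvmError(Exception):
    """An ordinary exception the caller can choose to handle."""


def bringup(failure: type) -> str:
    """Hypothetical caller that wants to decide whether a failure is fatal."""
    try:
        raise failure("exec channel not ready")
    except Exception as exc:
        # SmolvmError lands here; Die does not, because SystemExit
        # is not a subclass of Exception.
        return f"handled: {exc}"
```

With `SmolvmError` the caller gets a chance to recover or add context; with `Die` the exception passes straight through the handler and ends the process unless something catches `SystemExit` explicitly.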
didericis-claude c39bbe265b complete(prd): mark PRD 0032 active
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 58s
All three issues implemented and 805 tests passing.
2026-06-02 06:23:46 +00:00
didericis-claude 0d922371b0 refactor(smolmachines): decompose launch(), add wait_exec_ready, file-lock allocate() (PRD 0032)
Decompose the 207-line launch() into six named helpers: _allocate_resources,
_mint_certs, _start_bundle, _discover_urls, _launch_vm, _init_vm. Each has
explicit inputs/outputs and is independently testable.

Replace time.sleep(1.5) with smolvm.wait_exec_ready(), which polls
`machine exec true` with exponential backoff. Exits as soon as the exec
channel is ready; dies loudly with a timeout message instead of silently
leaving the VM in an unknown state.

File-lock loopback_alias.allocate() with fcntl.flock(LOCK_EX) so concurrent
bottle launches can't race on docker state and claim the same alias.
2026-06-02 06:23:39 +00:00
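A readiness poll with exponential backoff, as described for `wait_exec_ready` in 0d922371b0, might look like the sketch below. The function name, parameters, and default values are assumptions; the probe is passed in as a callable standing in for `machine exec true`, and the error type follows the later fix that raises `SmolvmError`.

```python
import time
from typing import Callable


class SmolvmError(Exception):
    """Raised when the exec channel never becomes ready."""


def wait_ready(
    probe: Callable[[], bool],
    timeout: float = 30.0,
    initial_delay: float = 0.05,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll probe() until it succeeds, doubling the wait between attempts."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        if probe():
            return  # ready: stop as soon as the probe succeeds
        remaining = deadline - clock()
        if remaining <= 0:
            raise SmolvmError(f"exec channel not ready after {timeout:.1f}s")
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

Compared with a fixed `time.sleep(1.5)`, this returns early on fast hosts and fails with an explicit message on slow ones. Injecting `sleep` and `clock` keeps the loop testable without real waiting.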
didericis-claude fe97b6014d docs(prd): PRD 0032 — smolmachines launch decomposition
test / unit (pull_request) Successful in 33s
test / integration (pull_request) Successful in 44s
Split launch() into named per-step helpers, replace time.sleep(1.5) with
a readiness poll, and file-lock loopback alias allocation. Addresses the
three actionable items from the #117 hotspot review of smolmachines/launch.py.
2026-06-02 06:14:16 +00:00
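The file-locked alias allocation that PRD 0032 calls for (implemented in 0d922371b0 with `fcntl.flock(LOCK_EX)`) can be sketched as a context manager. The helper name and lock-file handling are assumptions; only the use of `fcntl.flock` with `LOCK_EX` comes from the commit message. This is POSIX-only.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def exclusive_lock(path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock on path for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

An allocator would wrap its whole read-then-claim sequence in `with exclusive_lock(lock_path):`, so two concurrent launches cannot both observe the same alias as free.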
didericis-claude 07c8593999 refactor(egress): EgressRoute inherits Route from egress_addon_core
test / unit (pull_request) Successful in 32s
test / unit (push) Successful in 31s
test / integration (push) Successful in 38s
test / integration (pull_request) Successful in 47s
EgressRoute now extends egress_addon_core.Route, which holds the four
wire-visible fields (host, path_allowlist, auth_scheme, token_env).
EgressRoute adds only the three host-side fields (token_ref, roles,
tls_passthrough) that are never serialised to the sidecar.

_route_to_yaml_fields is typed as Route -> dict, making the host→wire
boundary explicit: only fields declared on the base class cross into the
YAML the addon reads.
2026-06-02 05:58:59 +00:00
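The host-to-wire boundary from 07c8593999 can be illustrated with two dataclasses. The field names follow the commit message; the field types, defaults, and the use of `dataclasses.fields` to build the mapping are assumptions made for this sketch.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Route:
    """Wire-visible fields: the only ones serialised to the sidecar."""
    host: str
    path_allowlist: tuple[str, ...] = ()
    auth_scheme: str = ""
    token_env: str = ""


@dataclass(frozen=True)
class EgressRoute(Route):
    """Host-side additions that never cross into the YAML."""
    token_ref: str = ""
    roles: tuple[str, ...] = ()
    tls_passthrough: bool = False


def route_to_yaml_fields(route: Route) -> dict:
    # Iterating over the base class's fields means a field added only to
    # EgressRoute cannot leak onto the wire by accident.
    return {f.name: getattr(route, f.name) for f in fields(Route)}
```

Typing the parameter as `Route` rather than `EgressRoute` documents the same intent for a type checker: the function may only rely on what the base class declares.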
didericis-claude f15721b424 complete(prd): mark PRD 0031 active
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 46s
Provisioned-wins merge and _route_to_yaml_fields are implemented and all
tests pass.
2026-06-02 05:45:28 +00:00
didericis-claude 10d0872043 refactor(egress): provisioned-wins merge + _route_to_yaml_fields (PRD 0031)
Replace _merge_provider_route's five-case nested conditional with a flat
provisioned-wins merge: provider routes claim their hosts outright, manifest
routes for unclaimed hosts append unchanged. Token slot assignment moves to a
single _assign_token_slots pass over the merged list.

Add _route_to_yaml_fields as the single authoritative EgressRoute→YAML mapping,
eliminating the risk of EgressRoute and egress_addon_core.Route silently
drifting apart when new fields are added.

egress_manifest_routes is now a pure lifter with no slot assignment.
_merge_provider_route and _find_or_alloc_token_env are removed.

Tests updated: conflict-die case removed, upgrade-bare replaced with
provider-wins semantics, slot-assignment tests moved to TestSlotAssignment.
2026-06-02 05:45:20 +00:00
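A flat provisioned-wins merge followed by a single slot-assignment pass, as 10d0872043 describes, could be sketched like this. The `R` dataclass, the function names, and the `EGRESS_TOKEN_<n>` slot naming are hypothetical; skipping routes without a `token_ref` mirrors the earlier fix that skips token slots for unauthenticated provider routes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class R:
    # Hypothetical stand-in for EgressRoute with only the fields used here.
    host: str
    token_ref: str = ""
    token_env: str = ""


def merge_provisioned_wins(provider: list[R], manifest: list[R]) -> list[R]:
    """Provider routes claim their hosts; manifest routes fill the gaps."""
    claimed = {r.host for r in provider}
    return list(provider) + [r for r in manifest if r.host not in claimed]


def assign_token_slots(routes: list[R]) -> list[R]:
    """Single pass over the merged list; routes without a token_ref get no slot."""
    out: list[R] = []
    slot = 0
    for r in routes:
        if r.token_ref:
            out.append(replace(r, token_env=f"EGRESS_TOKEN_{slot}"))
            slot += 1
        else:
            out.append(r)
    return out
```

Because the merge has no upgrade or conflict branch, its behaviour is decided by host membership alone, which is what lets slot assignment move to one separate pass.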
didericis-claude ae33d1abfb docs(prd): revise PRD 0031 — provisioned-wins merge + Route type consolidation
test / unit (pull_request) Successful in 42s
test / integration (pull_request) Successful in 1m0s
Expands scope to cover both remaining egress hotspot tasks from #117:
- Replaces the named-helper design with a flat provisioned-wins merge
  (provider routes own their hosts; manifest fills gaps; no upgrade or
  conflict-detection logic needed).
- Adds _route_to_yaml_fields as the single authoritative EgressRoute→Route
  mapping to prevent silent type drift between host and addon.
- Notes that the mitmproxy pure-function split is already clean (decide +
  is_git_push_request) and requires no structural change.
2026-06-02 05:26:15 +00:00
didericis-claude f596464f3f docs(prd): add PRD 0031 — split _merge_provider_route into named case helpers
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 41s
2026-06-02 05:08:59 +00:00
didericis-claude e528d5c5af docs(prd): update PRD 0030 design to reflect provisioned_env approach
test / unit (pull_request) Successful in 33s
test / integration (pull_request) Successful in 45s
test / unit (push) Successful in 31s
test / integration (push) Successful in 43s
Revises the Design section to describe the implemented solution:
provisioned_env on AgentProvisionPlan rather than an intermediate
egress_resolve_token_values_with_provider function. Drops the old
sentinel/lazy-import design narrative.
2026-06-02 04:54:09 +00:00
didericis-claude 0e29bcc829 refactor(egress): use provisioned_env instead of sentinel for Codex token (PRD 0030)
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 45s
Add `provisioned_env: dict[str, str]` to `AgentProvisionPlan`. When
`forward_host_credentials=True`, `agent_provision_plan` reads the host
Codex access token at prepare time and stores it under
`CODEX_HOST_CREDENTIAL_TOKEN_REF`. Both backends merge `provisioned_env`
over `os.environ` before calling `egress_resolve_token_values`, so the
token slot resolves like any other manifest-declared token ref.

Removes `egress_resolve_token_values_with_provider` and the sentinel
`continue` skip from `egress_resolve_token_values`. The function is now
fully generic — it neither knows nor cares about provider identity.
2026-06-02 04:53:23 +00:00
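The `provisioned_env` approach from 0e29bcc829 amounts to layering one mapping over another before a generic lookup. In the sketch below, the value of `CODEX_HOST_CREDENTIAL_TOKEN_REF`, the `launch_env` helper, and the slot names are assumptions; the actual `egress_resolve_token_values` signature is not shown in this log.

```python
import os
from typing import Mapping, Optional

# Hypothetical value; the source names the constant but not its contents.
CODEX_HOST_CREDENTIAL_TOKEN_REF = "CODEX_HOST_CREDENTIAL_TOKEN"


def launch_env(
    provisioned_env: Mapping[str, str],
    host_env: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Merge provisioned values over the host environment (provisioned wins)."""
    base = dict(os.environ if host_env is None else host_env)
    base.update(provisioned_env)
    return base


def resolve_token_values(
    slot_to_ref: Mapping[str, str], env: Mapping[str, str]
) -> dict[str, str]:
    """Generic lookup: no provider checks, no sentinel skip."""
    return {slot: env[ref] for slot, ref in slot_to_ref.items()}
```

Once the provider's token is just another key in the merged environment, the resolver needs no special case for Codex.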
didericis-claude 8c2b59ca94 complete(prd): mark PRD 0030 active
test / unit (push) Successful in 45s
test / integration (push) Successful in 1m0s
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 58s
2026-06-02 04:22:52 +00:00
didericis-claude 75f0f9d907 refactor(egress): deduplicate token resolution across backends (PRD 0030)
Extract egress_resolve_token_values_with_provider into bot_bottle/egress.py.
Both docker and smolmachines launch paths now call the shared function
instead of duplicating the forward_host_credentials / CODEX_HOST_CREDENTIAL_TOKEN_REF
resolution block.

Also fixes the host_env: object annotation on smolmachines._resolve_token_env
to the correct dict[str, str].

Closes #118.
2026-06-02 04:22:43 +00:00
didericis-claude 6682357fbb docs(prd): add PRD 0030 — deduplicate egress token resolution
Extracts the forward_host_credentials / CODEX_HOST_CREDENTIAL_TOKEN_REF
resolution block, currently copy-pasted in both docker and smolmachines
launch files, into a single shared function in bot_bottle/egress.py.

Closes #118. Found via #117 hotspot review.
2026-06-02 04:17:39 +00:00
didericis 2dd8113f7c fix(smolmachines): retry CA install after exec SIGKILL
test / unit (push) Successful in 38s
test / integration (push) Successful in 54s
2026-06-01 23:28:33 -04:00
didericis-codex 36e3443d2e fix(codex): defer workspace trust handling
test / unit (pull_request) Successful in 29s
test / integration (pull_request) Successful in 42s
test / unit (push) Successful in 30s
test / integration (push) Successful in 44s
2026-06-02 03:11:51 +00:00
didericis-codex d6ebd0d2eb fix(egress): skip token slots for unauth provider routes
test / unit (pull_request) Successful in 30s
test / integration (pull_request) Successful in 43s
2026-06-02 03:06:10 +00:00
didericis-codex eb6bace84f complete(prd): mark PRD 0029 active
test / unit (pull_request) Successful in 35s
test / integration (pull_request) Successful in 42s
2026-06-02 03:00:47 +00:00
didericis-claude f8fc29ce87 refactor(manifest): remove empty EGRESS_ROLES and related plumbing
test / unit (pull_request) Successful in 36s
test / integration (pull_request) Successful in 53s
EGRESS_ROLES, EGRESS_SINGLETON_ROLES, and PROVIDER_EGRESS_ROLES were
all empty frozensets after the codex_auth and claude_code_oauth roles
were removed. Delete the constants and all validation code that iterated
over them (the singleton-role loop and provider-role check in
_validate_egress_routes, the EGRESS_ROLES membership test in
EgressRoute.from_dict). EgressRoute.from_dict now rejects any role
string unconditionally; _validate_egress_routes loses its
agent_provider_template parameter entirely.

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
didericis-claude 938a0e05d6 refactor(manifest): remove codex_auth egress role
Both provider-owned roles are now gone. Provider auth routes are
provisioner-owned (claude: auth_token, codex: forward_host_credentials);
the role field and validation plumbing stay for future use but EGRESS_ROLES
is empty. Any manifest declaring a role now fails at parse time.

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
didericis f768d3a853 fix(agent): move default claude env vars to the right location 2026-06-01 22:24:17 -04:00
didericis-claude f32b7eb299 fix(agent): always emit passthrough egress route for api.anthropic.com
Mirrors the Codex pattern: Claude always gets a tls_passthrough route
for api.anthropic.com so user-set tokens aren't stripped by pipelock,
whether or not auth_token is declared. Auth injection (scheme + token_ref)
and the placeholder env only apply when auth_token is set.

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
didericis-claude de9bd7eb83 feat(manifest): add agent_provider.auth_token for Claude OAuth via egress
Operators can now declare:

  agent_provider:
    template: claude
    auth_token: BOT_BOTTLE_CLAUDE_OAUTH_TOKEN

and the provisioner injects a provider-owned api.anthropic.com egress
route (Bearer, tls_passthrough) rather than requiring a manually
declared route with the former claude_code_oauth role.

Changes:
- Add auth_token field to AgentProvider; validate claude-only.
- Remove claude_code_oauth from EGRESS_ROLES / PROVIDER_EGRESS_ROLES.
  Manifests that declare the role now fail at parse time with "unknown
  role" — the provisioner owns the route.
- agent_provision_plan: replace manifest_egress_routes/has_provider_auth
  with auth_token; Claude branch injects the api.anthropic.com route,
  placeholder env, and nonessential-traffic flags when auth_token is set.
- Add hidden_env_names: frozenset[str] to AgentProvisionPlan; Claude
  branch populates it with CLAUDE_CODE_OAUTH_TOKEN.
- Remove auth_role from AgentProviderRuntime and placeholder_env_for().
- print_util.visible_agent_env_names: accept hidden_env_names from the
  plan instead of dispatching on agent_provider_template.
- Both backends: drop manifest_egress_routes call, pass auth_token.
- PRD 0029 rescoped to cover both Codex and Claude provider auth.

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
didericis-claude 952dcd7eec refactor(agent): move placeholder env injection into agent_provision_plan
The has_provider_auth check and egress-placeholder injection were
duplicated in both backends. Move them into agent_provision_plan so
the provisioner owns that decision entirely:

- Replace has_provider_auth: bool param with manifest_egress_routes,
  compute has_provider_auth internally from the route roles.
- Inject CLAUDE_CODE_OAUTH_TOKEN=egress-placeholder inside the plan
  when has_provider_auth, alongside the existing nonessential-traffic
  vars. Backends no longer touch the placeholder env.
- Remove placeholder_env from AgentProviderRuntime; expose
  placeholder_env_for() for print_util's hide-from-summary logic.

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
didericis-claude 59df0b0f0f fix(codex): emit passthrough egress routes when not forwarding host credentials
When forward_host_credentials is false, Codex bottles should still get
tls_passthrough routes for the OpenAI/ChatGPT hosts so that tokens a
user sets via `codex login` after launch aren't stripped by pipelock's
header DLP. Previously no routes were emitted, which would have blocked
those requests entirely once pipelock enforcement tightens.

Rename the test to reflect the new expected behavior.

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
didericis-claude c0219dddd5 fix(egress): break circular import with manifest via TYPE_CHECKING
manifest → agent_provider → egress → manifest created a cycle that
caused ImportError on any module import. With from __future__ import
annotations already present, Bottle is only needed at type-check time
(annotations are lazy strings under PEP 563).

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
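The cycle-breaking pattern in c0219dddd5 can be sketched as follows. The module path `manifest` and the function body are hypothetical. The source relies on `from __future__ import annotations`; this sketch quotes the annotation instead, which has the same effect for this one name without that import.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers follow this import. At runtime TYPE_CHECKING is
    # False, so importing this module never pulls in manifest and the
    # cycle cannot form.
    from manifest import Bottle  # hypothetical module path


def egress_hosts_for_bottle(bottle: "Bottle") -> list[str]:
    """Use the bottle by duck typing; the class itself is never needed at runtime."""
    return [route["host"] for route in bottle.egress_routes]
```

This only works when the imported name is used purely in annotations. Any runtime use, such as `isinstance(bottle, Bottle)`, would still need a real import.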
didericis-claude 884cedc160 refactor: provision egress routes via AgentProvisionPlan
Remove provider-specific branching from egress.py and pipelock.py.
Previously, `egress_routes_for_bottle` and `pipelock_effective_tls_passthrough`
both contained `template == "codex"` checks — the same pattern the rest
of the PR moved out of the backends.

Root cause: `EgressRoute` had no `tls_passthrough` field, so pipelock
couldn't learn from the synthesised Codex routes that they needed
passthrough. Fix:

- Add `EgressRoute.tls_passthrough: bool`. `egress_manifest_routes` lifts
  the existing `pipelock.tls_passthrough` manifest flag here; provider
  routes set it directly.
- Add `AgentProvisionPlan.egress_routes`. `agent_provision_plan` populates
  it for Codex + `forward_host_credentials`, including `tls_passthrough=True`.
- Replace Codex-specific `egress_routes_for_bottle` logic with a generic
  `_merge_provider_route` helper. Backends call `egress_routes_for_bottle(bottle,
  plan.egress_routes)`; no provider type checks inside egress or pipelock.
- Rewrite `pipelock_effective_tls_passthrough` to read `route.tls_passthrough`
  from the merged route set instead of re-implementing the provider check.
- Both backends now call `agent_provision_plan` before `Egress.prepare` and
  `PipelockProxy.prepare`, threading `plan.egress_routes` to both. `has_provider_auth`
  is derived from `egress_manifest_routes` (manifest routes only — provider
  routes carry no auth roles, so the result is identical).

Assisted-by: Claude Code
2026-06-01 22:24:17 -04:00
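The rewrite of `pipelock_effective_tls_passthrough` in 884cedc160 reads a per-route flag instead of checking the provider template. A rough sketch of that idea is below; the `RouteFlag` type, the function name, and the sorted-list return shape are assumptions, since the real return type is not shown in this log.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RouteFlag:
    # Hypothetical stand-in for EgressRoute with only the fields used here.
    host: str
    tls_passthrough: bool = False


def effective_tls_passthrough(routes: Iterable[RouteFlag]) -> list[str]:
    """Collect passthrough hosts from the merged route set.

    No provider check: a route asks for passthrough through its own flag,
    whether it came from the manifest or from the provision plan.
    """
    return sorted({r.host for r in routes if r.tls_passthrough})
```

Putting the decision on the route itself is what removes the `template == "codex"` branch: any future provider can request passthrough by setting the flag on the routes it emits.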