e2231f46a3
End-to-end test that brings up a real bottle with allowlisted
egress + git-gate + three planted secrets, then runs five
attacks from inside the agent container.
Chunks 1-5 implemented in one pass against the Docker backend:
Attack 1 — non-allowlisted hostname (curl evil.example.com)
✓ blocked by egress
Attack 2 — non-allowlisted IP literal (198.51.100.1) + host-
header spoof via curl --resolve
✓ both blocked by egress
Attack 3 — HTTP exfil to allowlisted destination via path /
query / body / header
✗ ALL FOUR LEAK — request reaches api.anthropic.com
with the secret embedded. Pipelock's DLP doesn't
catch the anthropic-key shape in the body, and
nothing scans path / query / headers.
Attack 4 — DNS exfil via crafted subdomain + direct
dig @8.8.8.8 query
✓ both blocked (egress rejects subdomain, internal
network has no path to 8.8.8.8)
Attack 5 — README push through git-gate with secret-bearing
attacker URL (parameterized over anthropic / AWS /
generic shapes); ordering check that gitleaks fires
BEFORE any upstream attempt
✓ all three secret shapes blocked by gitleaks
Per PRD 0022 Q1 the assertion in attack 3 is authoritative —
HTTP 403 with an egress/pipelock marker in the body is the only
acceptable outcome. Any 4xx from upstream means the secret
reached the network. The four failing sub-tests are real
sandbox gaps that need their own remediation PRDs before this
test merges green.
Also adds `dnsutils` (dig) to the base agent image so attack 4's
direct-DNS check has a tool to run.
CI: no changes needed — `.gitea/workflows/test.yml` already runs
`tests/integration/` and the suite skip_unless_dockers cleanly
when the runner has no Docker socket.
Tests
Plain-Python test suite using stdlib unittest. No external
dependencies. Unit tests run anywhere Python 3 is present; integration
tests need Docker and skip cleanly otherwise.
Layout
tests/
fixtures.py # JSON manifest builders (shared)
_docker.py # docker-availability skip helper (shared)
unit/
test_pipelock_classify.py
test_pipelock_allowlist.py
test_pipelock_yaml.py
test_manifest_runtime.py
integration/
test_pipelock_sidecar_smoke.py
test_dry_run_plan.py
test_orphan_cleanup.py
canaries/
test_pipelock_image.py # opt-in; see below
Classification falls out of the directory — no hand-maintained list to keep in sync.
Running
python -m unittest discover -t . -s tests/unit -v # unit only
python -m unittest discover -t . -s tests/integration -v # integration only
python -m unittest discover -t . -s tests -v # both (recursive)
python -m unittest tests.unit.test_pipelock_yaml # one file
Discovery is invoked with -t . (top-level dir = repo root) so the
claude_bottle package on sys.path resolves correctly.
What the integration tests cover
test_pipelock_sidecar_smoke.py— drivesDockerPipelockProxy.prepare.start(the production code path) against a real Docker daemon and probes the sidecar's/healthfrom an in-network curl container.
test_dry_run_plan.py—cli.py start --dry-run --format=jsonemits a structured plan that contains the resolved egress allowlist and the bottle's runtime, and creates zero Docker resources.test_orphan_cleanup.py—network_removeandPipelockProxy.stopare idempotent against missing resources, so the EXIT trap can call them unconditionally.
Canaries
tests/canaries/ holds upstream-regression checks (e.g. the pinned
pipelock digest's binary still runs). These are gated on
CLAUDE_BOTTLE_RUN_CANARIES=1 and not part of the per-push suite.
They're invoked by the scheduled canaries workflow.
CLAUDE_BOTTLE_RUN_CANARIES=1 python -m unittest discover -t . -s tests/canaries -v
What's NOT covered
claude_bottle/ssh.pyend-to-end (would need a fake SSH host inside the container).- A live SSH-through-pipelock tunnel against a real Tailscale-style IP.
- DLP false-positive measurements.
- TLS handling / cert pinning behavior.
Adding a test
- Pick the directory:
tests/unit/for a pure unit test,tests/integration/for one that needs Docker. - Filename:
test_<topic>.py. - Boilerplate:
import unittest from claude_bottle.<module> import <symbol> class TestThing(unittest.TestCase): def test_x(self): ... if __name__ == "__main__": unittest.main() - For Docker-dependent tests, decorate the class with
@skip_unless_docker()fromtests._docker.