didericis/bot-bottle

didericis 23f50f7720

test / unit (pull_request) Successful in 19s

test / integration (pull_request) Failing after 49s

fix(pipelock): scan all request headers + fix attack-3 destination

Two related changes the PRD 0022 sandbox-escape test surfaced:

  1. `pipelock_build_config` now emits
     `request_body_scanning.scan_headers: true` and
     `header_mode: all`. Pipelock's default `header_mode:
     sensitive` only checks Authorization / Cookie / X-Api-Key
     / X-Token / Proxy-Authorization / X-Goog-Api-Key — an
     agent attempting exfil could trivially pick a
     non-sensitive header (`X-Custom: $SECRET`) and slip
     through. `all` closes the gap; pipelock caps it by the
     same max_body_bytes the body scan uses.

  2. Test 3 (HTTP exfil shapes) now targets
     raw.githubusercontent.com instead of api.anthropic.com.
     api.anthropic.com is in `DEFAULT_TLS_PASSTHROUGH` —
     pipelock can't MITM it because real LLM conversation
     bodies false-positive on DLP scanners (BIP-39 etc.). The
     trade-off is documented in `pipelock.DEFAULT_TLS_PASSTHROUGH`;
     the test now exercises a host where the sandbox is
     actually supposed to block.
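
To illustrate the gap that change 1 closes, here is a minimal sketch of how the two header modes differ. The list of sensitive headers is taken from the description above; the function name `headers_to_scan` and its structure are hypothetical and are not pipelock's actual implementation.

```python
# Illustrative only: not pipelock's real code. The header names come from
# the commit description; the function itself is an assumption.
SENSITIVE_HEADERS = frozenset({
    "authorization", "cookie", "x-api-key",
    "x-token", "proxy-authorization", "x-goog-api-key",
})


def headers_to_scan(headers, header_mode):
    """Return the subset of request headers a DLP scan would inspect."""
    if header_mode == "all":
        return dict(headers)
    if header_mode == "sensitive":
        # Header names are case-insensitive, so compare in lowercase.
        return {k: v for k, v in headers.items() if k.lower() in SENSITIVE_HEADERS}
    raise ValueError(f"unknown header_mode: {header_mode!r}")


request_headers = {"Authorization": "Bearer abc", "X-Custom": "leaked-secret"}
missed = headers_to_scan(request_headers, "sensitive")  # X-Custom is never inspected
caught = headers_to_scan(request_headers, "all")        # X-Custom is inspected
```

Under `sensitive`, a secret placed in `X-Custom` never reaches the scanner at all, which is why the attack-3d case only fails closed once `header_mode: all` is set.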

All 5 sandbox-escape attacks now produce HTTP 403 with the
expected sandbox marker (`egress:`, `pipelock`, or `blocked:`):

  - Attack 1 (non-allowlisted host)        ✓ egress
  - Attack 2 (non-allowlisted IP + spoof)  ✓ egress
  - Attack 3a (URL path)                   ✓ pipelock DLP
  - Attack 3b (URL query)                  ✓ pipelock DLP
  - Attack 3c (request body)               ✓ pipelock DLP
  - Attack 3d (request header)             ✓ pipelock DLP (scan_headers)
  - Attack 4a (crafted subdomain)          ✓ egress
  - Attack 4b (direct dig @8.8.8.8)        ✓ network isolation
  - Attack 5 (README push, 3 secret shapes) ✓ gitleaks (pre-upstream)
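
As a rough sketch of the check described above, a test helper could require both the 403 status and one of the three markers before counting a response as a sandbox block. The helper name `is_sandbox_block` and the assumption that the marker appears in the response body are hypothetical; only the status code and the marker strings come from this message.

```python
# Hypothetical helper: the real test suite may structure this differently.
SANDBOX_MARKERS = ("egress:", "pipelock", "blocked:")


def is_sandbox_block(status_code, body):
    """True when a response looks like it was rejected by the sandbox itself."""
    return status_code == 403 and any(marker in body for marker in SANDBOX_MARKERS)
```

Requiring a marker in addition to the status code distinguishes a sandbox rejection from a 403 returned by the upstream host.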

489 unit tests pass (1 updated for the new request_body_scanning
shape). Full integration suite passes in ~6s.

2026-05-26 22:38:38 -04:00

Name         Last commit                                                         Date
canaries     style: pass explicit check= to every subprocess.run call            2026-05-12 10:13:56 -04:00
integration  fix(pipelock): scan all request headers + fix attack-3 destination  2026-05-26 22:38:38 -04:00
unit         fix(pipelock): scan all request headers + fix attack-3 destination  2026-05-26 22:38:38 -04:00
__init__.py  refactor: convert project from bash to Python                       2026-05-08 15:26:58 +00:00
_docker.py   style: pass explicit check= to every subprocess.run call            2026-05-12 10:13:56 -04:00
fixtures.py  test: drop ssh-gate suites and shadow-route assertions (PRD 0009)   2026-05-12 23:54:22 -04:00
README.md    test: reorganize suite into unit/integration/canaries directories   2026-05-11 16:23:02 -04:00

README.md

Tests

Plain-Python test suite using stdlib unittest. No external dependencies. Unit tests run anywhere Python 3 is present; integration tests need Docker and skip cleanly otherwise.

Layout

tests/
  fixtures.py                       # JSON manifest builders (shared)
  _docker.py                        # docker-availability skip helper (shared)
  unit/
    test_pipelock_classify.py
    test_pipelock_allowlist.py
    test_pipelock_yaml.py
    test_manifest_runtime.py
  integration/
    test_pipelock_sidecar_smoke.py
    test_dry_run_plan.py
    test_orphan_cleanup.py
  canaries/
    test_pipelock_image.py          # opt-in; see below

Classification falls out of the directory — no hand-maintained list to keep in sync.

Running

python -m unittest discover -t . -s tests/unit -v         # unit only
python -m unittest discover -t . -s tests/integration -v  # integration only
python -m unittest discover -t . -s tests -v              # both (recursive)
python -m unittest tests.unit.test_pipelock_yaml          # one file

Discovery is invoked with -t . (top-level directory = repo root) so that the repo root is placed on sys.path and the claude_bottle package imports correctly.

What the integration tests cover

test_pipelock_sidecar_smoke.py: drives DockerPipelockProxy.prepare + .start (the production code path) against a real Docker daemon and probes the sidecar's /health from an in-network curl container.
test_dry_run_plan.py: cli.py start --dry-run --format=json emits a structured plan that contains the resolved egress allowlist and the bottle's runtime, and creates zero Docker resources.
test_orphan_cleanup.py: network_remove and PipelockProxy.stop are idempotent against missing resources, so the EXIT trap can call them unconditionally.
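
The idempotency property that test_orphan_cleanup.py checks can be sketched as follows. This is not the repo's network_remove; the function name `remove_network`, the injectable `run` parameter, and the assumption that the Docker CLI reports a missing network with "not found" on stderr are all illustrative.

```python
import subprocess


def remove_network(name, run=subprocess.run):
    """Remove a Docker network; return True if removed, False if already gone."""
    result = run(
        ["docker", "network", "rm", name],
        check=False,  # a missing network is an expected outcome, not an error
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True
    # Assumption: a missing network is reported with "not found" in stderr.
    if "not found" in result.stderr.lower():
        return False
    raise RuntimeError(f"docker network rm {name} failed: {result.stderr.strip()}")
```

Because a missing network is treated as success rather than failure, a cleanup handler can call the function unconditionally, even when setup aborted before the network was created.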

Canaries

tests/canaries/ holds upstream-regression checks (e.g. that the binary in the pinned pipelock digest still runs). These are gated on CLAUDE_BOTTLE_RUN_CANARIES=1 and are not part of the per-push suite. They're invoked by the scheduled canaries workflow.

CLAUDE_BOTTLE_RUN_CANARIES=1 python -m unittest discover -t . -s tests/canaries -v
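
One way such an opt-in gate can be written with stdlib unittest is sketched below. Only the environment variable name comes from this README; the helper names and the placeholder test class are hypothetical, and the real canary may gate differently.

```python
import os
import unittest

CANARY_ENV = "CLAUDE_BOTTLE_RUN_CANARIES"


def canaries_enabled(environ=None):
    """True only when the opt-in variable is set to exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get(CANARY_ENV) == "1"


def skip_unless_canaries():
    """Class or method decorator that skips unless canaries are opted in."""
    return unittest.skipUnless(canaries_enabled(), f"set {CANARY_ENV}=1 to run canaries")


@skip_unless_canaries()
class TestCanaryPlaceholder(unittest.TestCase):
    def test_runs_only_when_opted_in(self):
        self.assertTrue(canaries_enabled())
```

With the variable unset, discovery still imports the module but reports the class as skipped, so running the whole tests/ tree stays safe on every push.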

What's NOT covered

claude_bottle/ssh.py end-to-end (would need a fake SSH host inside the container).
A live SSH-through-pipelock tunnel against a real Tailscale-style IP.
DLP false-positive measurements.
TLS handling / cert pinning behavior.

Adding a test

Pick the directory: tests/unit/ for a pure unit test, tests/integration/ for one that needs Docker.
Filename: test_<topic>.py.

Boilerplate:

import unittest

from claude_bottle.<module> import <symbol>

class TestThing(unittest.TestCase):
    def test_x(self):
        ...

if __name__ == "__main__":
    unittest.main()

For Docker-dependent tests, decorate the class with @skip_unless_docker() from tests._docker.
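
For orientation, a skip helper of this kind could look roughly like the sketch below. The actual implementation in tests/_docker.py is not shown in this README and may differ; the `docker_available` function and its probe via `docker info` are assumptions made for illustration.

```python
import shutil
import subprocess
import unittest


def docker_available(which=shutil.which, run=subprocess.run):
    """Best-effort check that a docker CLI exists and the daemon responds."""
    if which("docker") is None:
        return False
    try:
        result = run(["docker", "info"], check=False, capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def skip_unless_docker():
    # Hypothetical stand-in for the helper exported by tests._docker.
    return unittest.skipUnless(docker_available(), "Docker daemon not available")


@skip_unless_docker()
class TestNeedsDocker(unittest.TestCase):
    def test_x(self):
        ...
```

Decorating the class rather than individual methods means the whole suite is reported as skipped on machines without Docker, instead of failing.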