From 7644da428000afc70dc5fe58a681d2e2f08c8d85 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 10 Jun 2026 19:19:52 -0400 Subject: [PATCH] docs: add Apple Container transparent egress spike --- ...pple-container-transparent-egress-spike.md | 476 ++++++++++++++++++ 1 file changed, 476 insertions(+) create mode 100644 docs/research/apple-container-transparent-egress-spike.md diff --git a/docs/research/apple-container-transparent-egress-spike.md b/docs/research/apple-container-transparent-egress-spike.md new file mode 100644 index 0000000..a047799 --- /dev/null +++ b/docs/research/apple-container-transparent-egress-spike.md @@ -0,0 +1,476 @@ +# Apple Container transparent egress spike + +Issue: https://gitea.dideric.is/didericis/bot-bottle/issues/230#issuecomment-1994 + +## Summary + +Transparent egress is mechanically possible on Apple Container 1.0.0, +but it is not a free property of the platform and it is not a drop-in +replacement for `HTTP_PROXY` yet. + +The spike proved three separate things: + +- Plain routing/NAT works if the sidecar has `CAP_NET_ADMIN`, IP + forwarding, and masquerade rules, and if the agent default route is + changed to the sidecar's host-only-network IP. +- Transparent mitmproxy interception works if the sidecar redirects + agent-facing TCP 80/443 traffic to `mitmdump --mode transparent`. + Direct HTTP was logged by mitmproxy. Direct HTTPS reached mitmproxy; + it failed with normal certificate verification until the client + skipped verification, which is consistent with bot-bottle's existing + requirement that agents trust the sidecar CA. +- Running DNS on the sidecar and pointing the agent at the sidecar's + host-only IP also works. This is cleaner than relying on forwarded + UDP DNS to a public resolver and gives the backend a natural place to + enforce or observe DNS policy. + +The hard blocker is agent routing. Apple Container 1.0.0 exposes no +documented `--network` gateway option. 
An ordinary agent container +cannot replace its default route: + +```console +$ container exec bb-spike-230t-agent sh -c \ + 'ip route replace default via 192.168.128.2 dev eth0; ip route' +default via 192.168.128.1 dev eth0 +192.168.128.0/24 dev eth0 scope link src 192.168.128.3 +ip: RTNETLINK answers: Operation not permitted +``` + +The successful route-through-sidecar tests used `--cap-add +CAP_NET_ADMIN` on the agent so the route could be changed after start. +That is not an acceptable final design by itself: it expands the +agent's kernel-facing privilege and lets the agent mutate its own +network namespace. A production design needs either a backend-owned +init/shim that sets the route then drops privilege in a way the agent +cannot regain, a platform-supported gateway option, or a different +network attachment layer. + +## Environment + +Tested on 2026-06-10: + +```console +$ sw_vers +ProductName: macOS +ProductVersion: 26.5.1 +BuildVersion: 25F80 + +$ uname -m +arm64 + +$ container --version +container CLI version 1.0.0 (build: release, commit: ee848e3) +``` + +Apple Container system status: + +```json +{ + "apiServerAppName": "container-apiserver", + "apiServerBuild": "release", + "apiServerCommit": "ee848e3ebfd7c73b04dd419683be54fb450b8779", + "apiServerVersion": "container-apiserver version 1.0.0 (build: release, commit: ee848e3)", + "appRoot": "/Users/didericis/Library/Application Support/com.apple.container/", + "installRoot": "/usr/local/", + "status": "running" +} +``` + +## Baseline + +Networks: + +```bash +container network create bb-spike-230t-agent \ + --internal \ + --label bot-bottle.spike=transparent-egress + +container network create bb-spike-230t-egress \ + --label bot-bottle.spike=transparent-egress +``` + +Sidecar, dual-homed with NAT first: + +```bash +container run --name bb-spike-230t-sidecar \ + --label bot-bottle.spike=transparent-egress \ + --network bb-spike-230t-egress \ + --network bb-spike-230t-agent \ + --dns 1.1.1.1 \ + 
--detach docker.io/alpine:latest sleep 1800 +``` + +Agent, host-only network: + +```bash +container run --name bb-spike-230t-agent \ + --label bot-bottle.spike=transparent-egress \ + --network bb-spike-230t-agent \ + --detach docker.io/alpine:latest sleep 1800 +``` + +Observed sidecar addresses: + +```console +eth0 192.168.66.2/24 # NAT egress network +eth1 192.168.128.2/24 # host-only agent network +default via 192.168.66.1 dev eth0 +nameserver 1.1.1.1 +``` + +Observed agent baseline: + +```console +eth0 192.168.128.3/24 +default via 192.168.128.1 dev eth0 +nameserver 192.168.128.1 +wget: bad address 'pypi.org' +``` + +That confirms the previous spike's baseline: sidecar can egress, agent +cannot egress directly. + +## Plain NAT Test + +Relaunch sidecar and agent with `CAP_NET_ADMIN`: + +```bash +container run --name bb-spike-230t-sidecar \ + --label bot-bottle.spike=transparent-egress \ + --network bb-spike-230t-egress \ + --network bb-spike-230t-agent \ + --dns 1.1.1.1 \ + --cap-add CAP_NET_ADMIN \ + --detach docker.io/alpine:latest sleep 1800 + +container run --name bb-spike-230t-agent \ + --label bot-bottle.spike=transparent-egress \ + --network bb-spike-230t-agent \ + --cap-add CAP_NET_ADMIN \ + --detach docker.io/alpine:latest sleep 1800 +``` + +Configure sidecar forwarding: + +```bash +container exec bb-spike-230t-sidecar sh -c ' + apk add --no-cache iptables iproute2 + sysctl -w net.ipv4.ip_forward=1 + iptables -t nat -A POSTROUTING -s 192.168.128.0/24 -o eth0 -j MASQUERADE + iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT + iptables -A FORWARD -i eth0 -o eth1 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT +' +``` + +Point the agent at the sidecar: + +```bash +container exec bb-spike-230t-agent sh -c ' + ip route replace default via 192.168.128.4 dev eth0 + printf "nameserver 1.1.1.1\n" > /etc/resolv.conf +' +``` + +Normal direct PyPI fetch from the agent, with no proxy variables set: + +```bash +container exec bb-spike-230t-agent sh -c ' + for v in 
HTTP_PROXY HTTPS_PROXY http_proxy https_proxy ALL_PROXY all_proxy; do + if [ -n "$(printenv "$v")" ]; then echo "$v=SET"; fi + done + wget -T 10 -O- https://pypi.org/simple/pip/ | head -c 120 +' +``` + +Observed: + +```console +Connecting to pypi.org (151.101.0.223:443) + + + + +``` + +Sidecar NAT counters increased: + +```console +POSTROUTING MASQUERADE 3 packets / 168 bytes +FORWARD eth1 -> eth0 22 packets / 2806 bytes +FORWARD eth0 -> eth1 29 packets / 54781 bytes +``` + +Verdict: plain transparent routing through the sidecar works, but this +is only NAT. It does not apply bot-bottle's existing route allowlist, +authorization stripping/injection, or DLP logic. + +## Transparent Mitmproxy Test + +The current sidecar launcher uses explicit proxy mode: + +```sh +MODE="--mode regular@9099" +exec mitmdump $CONFDIR_FLAG $MODE $LISTEN_HOST_FLAG $TRUST_FLAG -s /app/egress_addon.py +``` + +So transparent egress needs a launcher mode change plus iptables +redirects. + +Run a test mitmproxy container: + +```bash +container run --name bb-spike-230t-mitm \ + --label bot-bottle.spike=transparent-egress \ + --network bb-spike-230t-egress \ + --network bb-spike-230t-agent \ + --dns 1.1.1.1 \ + --cap-add CAP_NET_ADMIN \ + --detach mitmproxy/mitmproxy:11.1.3 \ + sh -c 'apt-get update >/tmp/apt.log && + apt-get install -y --no-install-recommends iptables iproute2 >>/tmp/apt.log && + echo 1 > /proc/sys/net/ipv4/ip_forward && + iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-port 8080 && + iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 443 -j REDIRECT --to-port 8080 && + mitmdump --mode transparent@8080 --set showhost=true --set ssl_insecure=true --set confdir=/tmp/mitm -v' +``` + +The container listened successfully: + +```console +Transparent Proxy listening at *:8080. +``` + +It had an agent-facing address of `192.168.128.7`. 
Point the agent at +it and set DNS: + +```bash +container exec bb-spike-230t-agent sh -c ' + ip route replace default via 192.168.128.7 dev eth0 + printf "nameserver 1.1.1.1\n" > /etc/resolv.conf +' +``` + +DNS also needs NAT/forwarding because only TCP 80/443 is redirected: + +```bash +container exec bb-spike-230t-mitm sh -c ' + iptables -t nat -A POSTROUTING -s 192.168.128.0/24 -o eth0 -j MASQUERADE + iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT + iptables -A FORWARD -i eth0 -o eth1 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT +' +``` + +An alternative, and likely better, DNS shape is to run a DNS forwarder on +the sidecar's host-only IP and point the agent at it. This was tested +with `dnsmasq`: + +```bash +container exec bb-spike-230t-mitm sh -c ' + apt-get install -y --no-install-recommends dnsmasq + cat >/tmp/dnsmasq.conf </tmp/dnsmasq.log 2>&1 &) + sleep 1 + ss -lunp | grep :53 +' +``` + +Observed: + +```console +UNCONN 0 0 192.168.128.7:53 0.0.0.0:* users:(("dnsmasq",pid=515,fd=4)) +``` + +Point the agent to sidecar DNS: + +```bash +container exec bb-spike-230t-agent sh -c ' + printf "nameserver 192.168.128.7\n" > /etc/resolv.conf + nslookup pypi.org +' +``` + +Observed: + +```console +Server: 192.168.128.7 +Address: 192.168.128.7:53 + +Non-authoritative answer: +Name: pypi.org +Address: 151.101.128.223 +Name: pypi.org +Address: 151.101.192.223 +Name: pypi.org +Address: 151.101.64.223 +Name: pypi.org +Address: 151.101.0.223 +``` + +Direct HTTP from the agent worked and mitmproxy logged the request: + +```console +$ container exec bb-spike-230t-agent sh -c \ + 'wget -T 10 -O- http://example.com | head -c 100' +Connecting to example.com (172.66.147.243:80) +Example Domain +``` + +Mitmproxy log: + +```console +192.168.128.5:39742: GET http://example.com/ + Host: example.com + User-Agent: Wget + << 200 OK 559b +``` + +After switching the agent to sidecar DNS, direct HTTP still hit +mitmproxy: + +```console +192.168.128.5:50784: GET http://example.com/ 
+ Host: example.com + User-Agent: Wget + << 200 OK 559b +``` + +Direct HTTPS from the agent reached mitmproxy but failed certificate +verification, as expected when the client does not trust the mitmproxy +CA: + +```console +$ container exec bb-spike-230t-agent sh -c \ + 'wget -T 10 -O- https://pypi.org/simple/pip/ | head -c 100' +Connecting to pypi.org (151.101.128.223:443) +... certificate verify failed ... +``` + +Mitmproxy log: + +```console +Client TLS handshake failed. The client does not trust the proxy's +certificate for pypi.org (tlsv1 alert unknown ca) +``` + +With verification disabled, the same direct URL succeeded and mitmproxy +logged the full HTTPS request: + +```console +$ container exec bb-spike-230t-agent sh -c \ + 'wget --no-check-certificate -T 10 -O- https://pypi.org/simple/pip/ | head -c 100' +Connecting to pypi.org (151.101.128.223:443) + + + + +``` + +Mitmproxy log: + +```console +192.168.128.5:32802: GET https://pypi.org/simple/pip/ + Host: pypi.org + User-Agent: Wget + << 200 OK 103k +``` + +After switching the agent to sidecar DNS, direct HTTPS still hit +mitmproxy: + +```console +192.168.128.5:50254: GET https://pypi.org/simple/pip/ + Host: pypi.org + User-Agent: Wget + << 200 OK 103k +``` + +Verdict: transparent mitmproxy mode works in this topology. The bot +agent would still need the egress CA installed, which bot-bottle already +does for explicit proxy mode. + +## Answers + +### Can the sidecar become the agent network's default gateway? + +Not directly through Apple Container's documented CLI. The installed +`container run --help` documents `--network +[,mac=XX:XX:XX:XX:XX:XX][,mtu=VALUE]`; it does not document a +gateway option. + +The route can be changed after container start only if the agent has +`CAP_NET_ADMIN`. Without it, `ip route replace default via ` +fails with `Operation not permitted`. + +### Can Apple Container support sidecar forwarding/NAT/transparent proxying? + +Yes. 
A dual-homed sidecar with `CAP_NET_ADMIN` can enable IP forwarding, +set iptables NAT/forwarding rules, and route agent traffic out through +the NAT network. + +Transparent mitmproxy interception also works with `PREROUTING` +redirects to `mitmdump --mode transparent`. + +### What capabilities/custom image are required? + +At minimum: + +- sidecar needs `CAP_NET_ADMIN`; +- sidecar image needs `iptables`/`iproute2` or equivalent nftables + tooling; +- sidecar should run a DNS listener on its host-only IP, or otherwise + provide a controlled resolver path for the agent; +- sidecar launcher needs a transparent mode variant; +- agent route must be changed to the sidecar's host-only IP; +- agent DNS should point to the sidecar DNS listener; +- agent must trust the sidecar CA for HTTPS interception. + +The tested agent route mutation required agent `CAP_NET_ADMIN`, which +should not be accepted as the final design without a privilege-dropping +init/shim story. + +### Can host-level `pf` or vmnet rules replace agent route mutation? + +Not tested. The successful transparent paths did not use host `pf`; +they used container-local routing and iptables. Host-level `pf` remains +a possible escape hatch if Apple Container cannot set a custom gateway +and we reject agent `CAP_NET_ADMIN`. + +### Can existing route policy and DLP semantics be preserved? + +Likely, but not fully validated in this spike. Mitmproxy transparent +mode produced normal HTTP flows with correct `Host` values for both +HTTP and HTTPS. The existing `egress_addon.py` hooks should still see +`flow.request.pretty_host`, method, path, headers, and response bodies. + +But the current sidecar entrypoint only starts `mitmdump` in regular +explicit-proxy mode. A real implementation must add a transparent mode +launcher and then run the existing egress addon test suite against +transparent flows. 
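The launcher change described above could be as small as a mode switch. The following is a minimal sketch, not the existing entrypoint: `select_mode` and the `EGRESS_MODE` variable are hypothetical names, while the two `--mode` values are the ones used in this spike (`regular@9099` from the current launcher, `transparent@8080` from the test container). Transparent mode would still need the `PREROUTING` redirects shown earlier.

```sh
#!/bin/sh
# Sketch only: pick the mitmdump --mode flag for the sidecar launcher.
# select_mode and EGRESS_MODE are hypothetical; they are not part of the
# current sidecar entrypoint.

# Print the --mode flag for the requested egress mode.
# Falls back to the existing explicit-proxy mode when no argument is given.
select_mode() {
  case "${1:-regular}" in
    regular)     echo "--mode regular@9099" ;;
    transparent) echo "--mode transparent@8080" ;;
    *)
      echo "unknown egress mode: $1" >&2
      return 1
      ;;
  esac
}

# Possible use inside the launcher (left commented so the sketch has no
# side effects):
#   MODE=$(select_mode "${EGRESS_MODE:-regular}") || exit 1
#   exec mitmdump $CONFDIR_FLAG $MODE $LISTEN_HOST_FLAG $TRUST_FLAG -s /app/egress_addon.py
```

Keeping `regular` as the default would leave the explicit-proxy path unchanged while the addon test suite is run against transparent flows.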
+ +## Recommendation + +Do not switch `macos-container` to transparent egress yet, but keep it +as a plausible implementation path. + +The next implementation spike should focus on removing the agent +`CAP_NET_ADMIN` requirement. Acceptable options: + +- find or add an Apple Container-supported default-gateway setting; +- start the agent through a tiny root init that sets route/DNS, drops + capabilities, and then execs the agent as the normal user; +- include a sidecar DNS service and set the agent resolver to the + sidecar's host-only IP as part of that init/setup path; +- avoid routing mutation by using host/vmnet-level packet redirection; +- explicitly decide that route mutation is only a convenience layer and + keep explicit proxy env vars for v1. + +Bluntly: transparent egress is feasible, but not production-ready until +the agent route can be controlled without leaving network-admin power in +the agent runtime.
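For the root-init option above, a rough and untested sketch of the shape such a shim might take follows. Everything in it is an assumption rather than something this spike validated: the `agent_init` function, the `DRY_RUN` switch, an unprivileged account with a same-named group, an agent image that ships `setpriv` from util-linux, and the agent container starting as root with `CAP_NET_ADMIN`. The route and resolver commands are the ones used manually in the tests.

```sh
#!/bin/sh
# Untested sketch of a backend-owned agent init shim.
# Assumptions: runs as root with CAP_NET_ADMIN, setpriv (util-linux) exists
# in the image, and the given user has a group of the same name.

# agent_init SIDECAR_IP AGENT_USER COMMAND [ARGS...]
# With DRY_RUN=1 the commands are printed instead of executed.
agent_init() {
  sidecar_ip=$1   # sidecar's host-only IP, e.g. 192.168.128.7 in the spike
  agent_user=$2   # hypothetical unprivileged account
  shift 2

  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi

  # Route all agent traffic through the sidecar.
  $run ip route replace default via "$sidecar_ip" dev eth0

  # Point the resolver at the sidecar DNS listener (skipped in dry-run).
  if [ -z "$run" ]; then
    printf 'nameserver %s\n' "$sidecar_ip" > /etc/resolv.conf
  fi

  # Switch to the unprivileged user, remove CAP_NET_ADMIN from the bounding
  # set, and set no_new_privs before handing control to the agent command.
  $run exec setpriv --reuid="$agent_user" --regid="$agent_user" --init-groups \
    --inh-caps=-all --bounding-set=-net_admin --no-new-privs -- "$@"
}

# Example (dry run):
#   DRY_RUN=1 agent_init 192.168.128.7 agent sh -c 'id'
```

Whether this actually prevents the agent from regaining network-admin power on Apple Container would need its own test, for example by repeating the `ip route replace` attempt from inside the agent process after the shim has run.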