Apple Container transparent egress spike

Issue: #230 (comment)

Summary

Transparent egress is mechanically possible on Apple Container 1.0.0, but the platform does not provide it out of the box, and it is not yet a drop-in replacement for HTTP_PROXY.

The spike proved three separate things:

  • Plain routing/NAT works if the sidecar has CAP_NET_ADMIN, IP forwarding, and masquerade rules, and if the agent default route is changed to the sidecar's host-only-network IP.
  • Transparent mitmproxy interception works if the sidecar redirects agent-facing TCP 80/443 traffic to mitmdump --mode transparent. Direct HTTP was logged by mitmproxy. Direct HTTPS also reached mitmproxy, but the client rejected the proxy's certificate until verification was disabled. That is consistent with bot-bottle's existing requirement that agents trust the sidecar CA.
  • Running DNS on the sidecar and pointing the agent at the sidecar's host-only IP also works. This is cleaner than relying on forwarded UDP DNS to a public resolver and gives the backend a natural place to enforce or observe DNS policy.

The hard blocker is agent routing. Apple Container 1.0.0 exposes no documented --network gateway option. An ordinary agent container cannot replace its default route:

$ container exec bb-spike-230t-agent sh -c \
  'ip route replace default via 192.168.128.2 dev eth0; ip route'
default via 192.168.128.1 dev eth0
192.168.128.0/24 dev eth0 scope link  src 192.168.128.3
ip: RTNETLINK answers: Operation not permitted

The successful route-through-sidecar tests used --cap-add CAP_NET_ADMIN on the agent so the route could be changed after start. That is not an acceptable final design by itself: it expands the agent's kernel-facing privilege and lets the agent mutate its own network namespace. A production design needs either a backend-owned init/shim that sets the route then drops privilege in a way the agent cannot regain, a platform-supported gateway option, or a different network attachment layer.
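Whichever mechanism ends up owning the route, the backend will probably want to confirm the agent's default gateway before starting the workload. The following is a minimal sketch, assuming `ip route` output in the format shown above. The function names `default_gateway` and `assert_gateway` are hypothetical and not part of bot-bottle.

```shell
#!/bin/sh
# Print the gateway of the first default route found in `ip route`
# output read from stdin. Prints nothing if there is no default route.
default_gateway() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Succeed only if the default route read from stdin points at the
# expected gateway ($1). Otherwise report the mismatch on stderr.
assert_gateway() {
  expected=$1
  actual=$(default_gateway)
  if [ "$actual" = "$expected" ]; then
    return 0
  fi
  echo "default gateway is '${actual:-none}', expected '$expected'" >&2
  return 1
}
```

For example, `container exec bb-spike-230t-agent ip route | assert_gateway 192.168.128.2` would fail against the unprivileged agent shown above, because its default route still points at 192.168.128.1.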

Environment

Tested on 2026-06-10:

$ sw_vers
ProductName:		macOS
ProductVersion:		26.5.1
BuildVersion:		25F80

$ uname -m
arm64

$ container --version
container CLI version 1.0.0 (build: release, commit: ee848e3)

Apple Container system status:

{
  "apiServerAppName": "container-apiserver",
  "apiServerBuild": "release",
  "apiServerCommit": "ee848e3ebfd7c73b04dd419683be54fb450b8779",
  "apiServerVersion": "container-apiserver version 1.0.0 (build: release, commit: ee848e3)",
  "appRoot": "/Users/didericis/Library/Application Support/com.apple.container/",
  "installRoot": "/usr/local/",
  "status": "running"
}

Baseline

Networks:

container network create bb-spike-230t-agent \
  --internal \
  --label bot-bottle.spike=transparent-egress

container network create bb-spike-230t-egress \
  --label bot-bottle.spike=transparent-egress

Sidecar, dual-homed, with the NAT network attached first:

container run --name bb-spike-230t-sidecar \
  --label bot-bottle.spike=transparent-egress \
  --network bb-spike-230t-egress \
  --network bb-spike-230t-agent \
  --dns 1.1.1.1 \
  --detach docker.io/alpine:latest sleep 1800

Agent, host-only network:

container run --name bb-spike-230t-agent \
  --label bot-bottle.spike=transparent-egress \
  --network bb-spike-230t-agent \
  --detach docker.io/alpine:latest sleep 1800

Observed sidecar addresses:

eth0 192.168.66.2/24    # NAT egress network
eth1 192.168.128.2/24   # host-only agent network
default via 192.168.66.1 dev eth0
nameserver 1.1.1.1

Observed agent baseline:

eth0 192.168.128.3/24
default via 192.168.128.1 dev eth0
nameserver 192.168.128.1
wget: bad address 'pypi.org'

That confirms the previous spike's baseline: sidecar can egress, agent cannot egress directly.

Plain NAT Test

Relaunch sidecar and agent with CAP_NET_ADMIN:

container run --name bb-spike-230t-sidecar \
  --label bot-bottle.spike=transparent-egress \
  --network bb-spike-230t-egress \
  --network bb-spike-230t-agent \
  --dns 1.1.1.1 \
  --cap-add CAP_NET_ADMIN \
  --detach docker.io/alpine:latest sleep 1800

container run --name bb-spike-230t-agent \
  --label bot-bottle.spike=transparent-egress \
  --network bb-spike-230t-agent \
  --cap-add CAP_NET_ADMIN \
  --detach docker.io/alpine:latest sleep 1800

Configure sidecar forwarding:

container exec bb-spike-230t-sidecar sh -c '
  apk add --no-cache iptables iproute2
  sysctl -w net.ipv4.ip_forward=1
  iptables -t nat -A POSTROUTING -s 192.168.128.0/24 -o eth0 -j MASQUERADE
  iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
  iptables -A FORWARD -i eth0 -o eth1 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
'
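The rules above use `-A`, so re-running the setup appends duplicate rules. A backend that may configure the same sidecar more than once could check for each rule first with `iptables -C`. The following is a sketch under that assumption; `ensure_rule` and `setup_nat` are hypothetical helper names, and the interface roles match the spike's sidecar (eth0 egress, eth1 agent-facing).

```shell
#!/bin/sh
# Append an iptables rule only if an identical rule is not already present.
# Arguments: optional "-t <table>", then the chain, then the rule spec.
ensure_rule() {
  table=filter
  if [ "$1" = "-t" ]; then
    table=$2
    shift 2
  fi
  chain=$1
  shift
  # -C exits non-zero when no matching rule exists.
  iptables -t "$table" -C "$chain" "$@" 2>/dev/null ||
    iptables -t "$table" -A "$chain" "$@"
}

# Apply the spike's forwarding rules for one agent subnet.
# $1 = agent subnet, $2 = egress interface, $3 = agent-facing interface.
setup_nat() {
  subnet=$1 egress_if=$2 agent_if=$3
  sysctl -w net.ipv4.ip_forward=1 >/dev/null
  ensure_rule -t nat POSTROUTING -s "$subnet" -o "$egress_if" -j MASQUERADE
  ensure_rule FORWARD -i "$agent_if" -o "$egress_if" -j ACCEPT
  ensure_rule FORWARD -i "$egress_if" -o "$agent_if" \
    -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
}
```

Inside the sidecar this would be called as `setup_nat 192.168.128.0/24 eth0 eth1`. Calling it a second time should leave the rule set unchanged.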

Point the agent at the sidecar:

container exec bb-spike-230t-agent sh -c '
  ip route replace default via 192.168.128.4 dev eth0
  printf "nameserver 1.1.1.1\n" > /etc/resolv.conf
'

Normal direct PyPI fetch from the agent, with no proxy variables set:

container exec bb-spike-230t-agent sh -c '
  for v in HTTP_PROXY HTTPS_PROXY http_proxy https_proxy ALL_PROXY all_proxy; do
    if [ -n "$(printenv "$v")" ]; then echo "$v=SET"; fi
  done
  wget -T 10 -O- https://pypi.org/simple/pip/ | head -c 120
'

Observed:

Connecting to pypi.org (151.101.0.223:443)
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta name="pypi:repository-version" content="1.4">

Sidecar NAT counters increased:

POSTROUTING MASQUERADE 3 packets / 168 bytes
FORWARD eth1 -> eth0 22 packets / 2806 bytes
FORWARD eth0 -> eth1 29 packets / 54781 bytes
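The counters above were read by hand. As a rough sketch, a script could read them from `iptables -L -v -n -x`, assuming the usual column layout in which the first three columns are pkts, bytes, and target. `rule_counters` is a hypothetical helper name.

```shell
#!/bin/sh
# Sum packet and byte counters for every rule with the given target ($1)
# in `iptables -L <chain> -v -n -x` output read from stdin.
# Prints "<packets> <bytes>", or "0 0" if no rule matches.
rule_counters() {
  awk -v target="$1" '
    $3 == target { pkts += $1; bytes += $2 }
    END { print pkts + 0, bytes + 0 }
  '
}
```

For example, `container exec bb-spike-230t-sidecar iptables -t nat -L POSTROUTING -v -n -x | rule_counters MASQUERADE` prints the masquerade totals. A non-zero packet count after an agent fetch suggests traffic took the NAT path.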

Verdict: plain transparent routing through the sidecar works, but this is only NAT. It does not apply bot-bottle's existing route allowlist, authorization stripping/injection, or DLP logic.

Transparent Mitmproxy Test

The current sidecar launcher uses explicit proxy mode:

MODE="--mode regular@9099"
exec mitmdump $CONFDIR_FLAG $MODE $LISTEN_HOST_FLAG $TRUST_FLAG -s /app/egress_addon.py

So transparent egress needs a launcher mode change plus iptables redirects.
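One possible shape for that launcher change is sketched below. It assumes a hypothetical `EGRESS_MODE` setting that selects between the existing explicit-proxy mode and the transparent mode tested in this spike. The port numbers are the ones that appear in this document; the function names are illustrative only.

```shell
#!/bin/sh
# Map a mode name to the mitmdump --mode argument.
# "regular" matches the current launcher; "transparent" matches the spike.
mitm_mode_flag() {
  case "${1:-regular}" in
    regular)     echo "--mode regular@9099" ;;
    transparent) echo "--mode transparent@8080" ;;
    *) echo "unknown egress mode: $1" >&2; return 1 ;;
  esac
}

# Print the PREROUTING redirects that transparent mode needs for TCP 80/443.
# $1 = agent-facing interface, $2 = local mitmdump port.
transparent_redirects() {
  agent_if=$1 port=$2
  for dport in 80 443; do
    echo "iptables -t nat -A PREROUTING -i $agent_if -p tcp --dport $dport -j REDIRECT --to-port $port"
  done
}

# In the launcher, this could replace the hard-coded MODE assignment:
#   MODE=$(mitm_mode_flag "${EGRESS_MODE:-regular}") || exit 1
```

Defaulting to `regular` keeps today's behaviour unless transparent mode is requested explicitly.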

Run a test mitmproxy container:

container run --name bb-spike-230t-mitm \
  --label bot-bottle.spike=transparent-egress \
  --network bb-spike-230t-egress \
  --network bb-spike-230t-agent \
  --dns 1.1.1.1 \
  --cap-add CAP_NET_ADMIN \
  --detach mitmproxy/mitmproxy:11.1.3 \
  sh -c 'apt-get update >/tmp/apt.log &&
    apt-get install -y --no-install-recommends iptables iproute2 >>/tmp/apt.log &&
    echo 1 > /proc/sys/net/ipv4/ip_forward &&
    iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-port 8080 &&
    iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 443 -j REDIRECT --to-port 8080 &&
    mitmdump --mode transparent@8080 --set showhost=true --set ssl_insecure=true --set confdir=/tmp/mitm -v'

The container listened successfully:

Transparent Proxy listening at *:8080.

It had an agent-facing address of 192.168.128.7. Point the agent at it and set DNS:

container exec bb-spike-230t-agent sh -c '
  ip route replace default via 192.168.128.7 dev eth0
  printf "nameserver 1.1.1.1\n" > /etc/resolv.conf
'
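The sidecar's agent-facing address differed between runs in this spike (192.168.128.2, 192.168.128.4, 192.168.128.7), so hard-coding it is fragile. A small sketch of looking it up instead, assuming the one-line output format of `ip -4 -o addr show`; `first_ipv4` is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the first IPv4 address, without the prefix length, from
# `ip -4 -o addr show dev <iface>` output read from stdin.
first_ipv4() {
  awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}
```

The gateway could then be resolved with `gw=$(container exec bb-spike-230t-mitm ip -4 -o addr show dev eth1 | first_ipv4)` and passed to `ip route replace default via "$gw" dev eth0` on the agent.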

DNS also needs NAT/forwarding because only TCP 80/443 is redirected:

container exec bb-spike-230t-mitm sh -c '
  iptables -t nat -A POSTROUTING -s 192.168.128.0/24 -o eth0 -j MASQUERADE
  iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
  iptables -A FORWARD -i eth0 -o eth1 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
'

An alternative, and likely better, DNS shape is to run a DNS forwarder on the sidecar's host-only IP and point the agent at it. This was tested with dnsmasq:

container exec bb-spike-230t-mitm sh -c '
  apt-get install -y --no-install-recommends dnsmasq
  cat >/tmp/dnsmasq.conf <<EOF
no-daemon
listen-address=192.168.128.7
bind-interfaces
server=1.1.1.1
log-queries
log-facility=-
EOF
  (dnsmasq --conf-file=/tmp/dnsmasq.conf >/tmp/dnsmasq.log 2>&1 &)
  sleep 1
  ss -lunp | grep :53
'

Observed:

UNCONN 0 0 192.168.128.7:53 0.0.0.0:* users:(("dnsmasq",pid=515,fd=4))
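The dnsmasq config above hard-codes the sidecar address and upstream resolver. A sketch of rendering the same tested config from parameters follows; `render_dnsmasq_conf` is a hypothetical helper name, and the options are exactly the ones used in the spike.

```shell
#!/bin/sh
# Print a dnsmasq config equivalent to the one tested in the spike.
# $1 = sidecar host-only IP to listen on, $2 = upstream resolver.
render_dnsmasq_conf() {
  listen_ip=$1 upstream=$2
  cat <<EOF
no-daemon
listen-address=$listen_ip
bind-interfaces
server=$upstream
log-queries
log-facility=-
EOF
}
```

Usage would look like `render_dnsmasq_conf 192.168.128.7 1.1.1.1 > /tmp/dnsmasq.conf`. Adding dnsmasq's `no-resolv` option would stop it from also reading upstreams from /etc/resolv.conf, but that variant was not part of the tested config.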

Point the agent to sidecar DNS:

container exec bb-spike-230t-agent sh -c '
  printf "nameserver 192.168.128.7\n" > /etc/resolv.conf
  nslookup pypi.org
'

Observed:

Server:		192.168.128.7
Address:	192.168.128.7:53

Non-authoritative answer:
Name:	pypi.org
Address: 151.101.128.223
Name:	pypi.org
Address: 151.101.192.223
Name:	pypi.org
Address: 151.101.64.223
Name:	pypi.org
Address: 151.101.0.223

Direct HTTP from the agent worked and mitmproxy logged the request:

$ container exec bb-spike-230t-agent sh -c \
  'wget -T 10 -O- http://example.com | head -c 100'
Connecting to example.com (172.66.147.243:80)
<!doctype html><html lang="en"><head><title>Example Domain</title>

Mitmproxy log:

192.168.128.5:39742: GET http://example.com/
    Host: example.com
    User-Agent: Wget
 << 200 OK 559b

After switching the agent to sidecar DNS, direct HTTP still hit mitmproxy:

192.168.128.5:50784: GET http://example.com/
    Host: example.com
    User-Agent: Wget
 << 200 OK 559b

Direct HTTPS from the agent reached mitmproxy but failed certificate verification, as expected when the client does not trust the mitmproxy CA:

$ container exec bb-spike-230t-agent sh -c \
  'wget -T 10 -O- https://pypi.org/simple/pip/ | head -c 100'
Connecting to pypi.org (151.101.128.223:443)
... certificate verify failed ...

Mitmproxy log:

Client TLS handshake failed. The client does not trust the proxy's
certificate for pypi.org (tlsv1 alert unknown ca)

With verification disabled, the same direct URL succeeded and mitmproxy logged the full HTTPS request:

$ container exec bb-spike-230t-agent sh -c \
  'wget --no-check-certificate -T 10 -O- https://pypi.org/simple/pip/ | head -c 100'
Connecting to pypi.org (151.101.128.223:443)
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta name="pypi:repository-version" content="1.4">

Mitmproxy log:

192.168.128.5:32802: GET https://pypi.org/simple/pip/
    Host: pypi.org
    User-Agent: Wget
 << 200 OK 103k

After switching the agent to sidecar DNS, direct HTTPS still hit mitmproxy:

192.168.128.5:50254: GET https://pypi.org/simple/pip/
    Host: pypi.org
    User-Agent: Wget
 << 200 OK 103k

Verdict: transparent mitmproxy mode works in this topology. The bot agent would still need the egress CA installed, which bot-bottle already does for explicit proxy mode.

Answers

Can the sidecar become the agent network's default gateway?

Not directly through Apple Container's documented CLI. The installed container run --help documents --network <name>[,mac=XX:XX:XX:XX:XX:XX][,mtu=VALUE]; it does not document a gateway option.

The route can be changed after container start only if the agent has CAP_NET_ADMIN. Without it, ip route replace default via <sidecar> fails with Operation not permitted.

Can Apple Container support sidecar forwarding/NAT/transparent proxying?

Yes. A dual-homed sidecar with CAP_NET_ADMIN can enable IP forwarding, set iptables NAT/forwarding rules, and route agent traffic out through the NAT network.

Transparent mitmproxy interception also works with PREROUTING redirects to mitmdump --mode transparent.

What capabilities/custom image are required?

At minimum:

  • sidecar needs CAP_NET_ADMIN;
  • sidecar image needs iptables/iproute2 or equivalent nftables tooling;
  • sidecar should run a DNS listener on its host-only IP, or otherwise provide a controlled resolver path for the agent;
  • sidecar launcher needs a transparent mode variant;
  • agent route must be changed to the sidecar's host-only IP;
  • agent DNS should point to the sidecar DNS listener;
  • agent must trust the sidecar CA for HTTPS interception.

The tested agent route mutation required agent CAP_NET_ADMIN, which should not be accepted as the final design without a privilege-dropping init/shim story.

Can host-level pf or vmnet rules replace agent route mutation?

Not tested. The successful transparent paths did not use host pf; they used container-local routing and iptables. Host-level pf remains a possible escape hatch if Apple Container cannot set a custom gateway and we reject agent CAP_NET_ADMIN.

Can existing route policy and DLP semantics be preserved?

Likely, but not fully validated in this spike. Mitmproxy transparent mode produced normal HTTP flows with correct Host values for both HTTP and HTTPS. The existing egress_addon.py hooks should still see flow.request.pretty_host, method, path, headers, and response bodies.

But the current sidecar entrypoint only starts mitmdump in regular explicit-proxy mode. A real implementation must add a transparent mode launcher and then run the existing egress addon test suite against transparent flows.

Recommendation

Do not switch macos-container to transparent egress yet, but keep it as a plausible implementation path.

The next implementation spike should focus on removing the agent CAP_NET_ADMIN requirement. Acceptable options:

  • find or add an Apple Container-supported default-gateway setting;
  • start the agent through a tiny root init that sets route/DNS, drops capabilities, and then execs the agent as the normal user;
  • include a sidecar DNS service and set the agent resolver to the sidecar's host-only IP as part of that init/setup path;
  • avoid routing mutation by using host/vmnet-level packet redirection;
  • explicitly decide that route mutation is only a convenience layer and keep explicit proxy env vars for v1.
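The root-init option could look roughly like the sketch below. It was not tested in this spike. It assumes the agent image contains iproute2 and util-linux `setpriv`, that the container starts as root with CAP_NET_ADMIN, and that eth0 is the agent's only interface. `agent_init` and the `RESOLV_CONF` override are hypothetical names.

```shell
#!/bin/sh
# Hypothetical agent entrypoint shim. It starts as root with CAP_NET_ADMIN,
# sets the route and resolver, then drops privilege before running the agent.
# Usage: agent_init <gateway-ip> <dns-ip> <user> <group> <command> [args...]
agent_init() {
  gateway=$1 dns=$2 user=$3 group=$4
  shift 4
  ip route replace default via "$gateway" dev eth0 || return 1
  # RESOLV_CONF is an illustrative override so this step can be exercised
  # outside a container; it defaults to the real resolver file.
  printf 'nameserver %s\n' "$dns" > "${RESOLV_CONF:-/etc/resolv.conf}" || return 1
  # Switch to the unprivileged user, empty the inheritable and bounding
  # capability sets, and set no_new_privs so exec cannot regain them.
  exec setpriv --reuid "$user" --regid "$group" --init-groups \
    --inh-caps=-all --bounding-set=-all --no-new-privs -- "$@"
}
```

Because the shim ends with `exec`, no privileged parent process is left running alongside the agent. Whether Apple Container permits this capability drop inside its guest, and whether the agent can be prevented from bypassing the shim, would need to be confirmed by the next spike.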

Bluntly: transparent egress is feasible, but not production-ready until the agent route can be controlled without leaving network-admin power in the agent runtime.