docs: add Fly Machines case study to remote-docker-vm-isolation note
test / run tests/run_tests.py (push) Successful in 13s
Concrete worked example covering image strategy (with the bake-the-claude-bottle-image-in optimization that elides 30-90s of in-VM build), cold/warm/hot boot-to-prompt timing, standby vs ephemeral cost breakdown, three workflow patterns, and Fly-specific gotchas (DinD kernel requirements, the y/N preflight blocking automated launch, pricing-may-have-moved hedge).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -139,6 +139,112 @@ For the "VPN pivot" failure mode, see
remote VM back to your LAN. If the agent needs LAN resources, expose
those through a narrow API instead.

## Case study: Fly Machines

Fly.io's Machines product is a useful concrete worked example because
it satisfies all the provider requirements (root, Firecracker-backed
isolation, scriptable lifecycle, per-second billing) and surfaces the
gotchas the abstract pattern leaves implicit.

### Image strategy

Build a custom OCI image `FROM docker:dind` that bakes in:

- The claude-bottle repository checkout.
- A pre-built `claude-bottle:latest` image, saved via `docker save` on
  your laptop and loaded in at image-build time
  (`RUN docker load < claude-bottle.tar`) or pushed as a layer into
  the dind storage. Without this step, the first in-VM `docker build`
  runs `apt-get` and a global
  `npm install -g @anthropic-ai/claude-code`, which adds 30–90 s to
  every cold start.
- A `flyctl secrets`-injected `CLAUDE_BOTTLE_OAUTH_TOKEN`, exposed to
  the VM's PID 1 as an env var.
- An entrypoint that starts dockerd, waits for it to be healthy, then
  either drops into a shell or directly runs `cli.py start <agent>`.

Deploy with `flyctl deploy` or `flyctl machine run --image …`.

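The "waits for it to be healthy" step of the entrypoint can be sketched as a small polling loop. This is a minimal illustration, not the project's actual entrypoint: the function names, the 30 s timeout, and the choice of `docker info` as the health probe are all assumptions made for the example.

```python
import subprocess
import time
from typing import Callable


def docker_is_ready() -> bool:
    """Return True when `docker info` exits 0, i.e. the daemon answers."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # The docker CLI is not installed in this environment.
        return False
    return result.returncode == 0


def wait_until_ready(
    probe: Callable[[], bool] = docker_is_ready,
    timeout_s: float = 30.0,  # assumed budget, not a measured value
    interval_s: float = 0.5,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

An entrypoint built this way would start dockerd in the background, call `wait_until_ready()`, and only then hand off to the shell or to `cli.py start <agent>`. Taking the probe as a parameter keeps the loop testable without a running daemon.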
### Boot-to-first-prompt timing

Three scenarios, all assuming the custom image above (claude-bottle
image baked in, token injected, no in-VM rebuild):

| Phase | Cold (image not cached on Fly host) | Warm (image cached, `machine run` fresh) | Hot (`machine stop`ped, `machine start`) |
| --- | --- | --- | --- |
| Fly schedule + image fetch | 10–30 s | 2–3 s | ~1 s |
| Firecracker kernel boot | ~1 s | ~1 s | ~1 s (resume) |
| dockerd-in-VM startup | 2–4 s | 2–4 s | 0 s (already running) |
| `cli.py start <agent>` housekeeping (network creates, pipelock sidecar, agent container, skill copy) | 4–6 s | 4–6 s | 4–6 s |
| Claude prints first prompt | 1–3 s | 1–3 s | 1–3 s |
| **End-to-end** | **~20–45 s** | **~10–17 s** | **~7–11 s** |

For interactive sessions the warm path is the realistic baseline once
the custom image is registered. The hot path trims only a few extra
seconds, so the question of whether to keep stopped Machines on standby
is mostly about cost, not speed.

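As a sanity check on the end-to-end row, the per-phase ranges in the table can be summed directly. The snippet below only re-adds the table's own figures (treating each "~1 s" entry as exactly 1 s); the variable and function names are illustrative.

```python
# (low, high) seconds per phase, copied from the timing table:
# schedule + fetch, kernel boot, dockerd startup, housekeeping, first prompt.
PHASES = {
    "cold": [(10, 30), (1, 1), (2, 4), (4, 6), (1, 3)],
    "warm": [(2, 3), (1, 1), (2, 4), (4, 6), (1, 3)],
    "hot": [(1, 1), (1, 1), (0, 0), (4, 6), (1, 3)],
}


def end_to_end(phases):
    """Sum the lower and upper bounds of every phase."""
    low = sum(lo for lo, _ in phases)
    high = sum(hi for _, hi in phases)
    return low, high


for name, phases in PHASES.items():
    low, high = end_to_end(phases)
    print(f"{name}: {low}-{high} s")
# prints:
# cold: 18-44 s
# warm: 10-17 s
# hot: 7-11 s
```

The warm and hot sums match the table exactly; the cold sum of 18–44 s is what the table rounds to ~20–45 s.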
### Cost of standby vs. create-per-session

Stopped Fly Machines stop billing CPU/RAM but continue to bill for
storage and any allocated IPv4. A reasonable claude-bottle Machine
size (2 vCPU / 2 GB / ~3 GB rootfs) costs roughly:

| Item | Rate while stopped | Monthly cost |
| --- | --- | --- |
| CPU + RAM | not billed | $0 |
| Rootfs storage | ~$0.15/GB-month | ~$0.45 |
| Dedicated IPv4 (if allocated) | $2/month flat | $2.00 |
| Dedicated IPv6 | free | $0 |
| Bandwidth | usage-based | $0 |

So **roughly $0.50–$2.50/month per standby Machine**, with the IPv4
line dominating. Drop the dedicated v4 (use IPv6 or Fly's shared v4
via WireGuard) and standby falls under $1/month.

For comparison, running the same Machine 24/7 lands in the
$15–$40/month range depending on size, and the create-and-destroy
pattern (one Machine per session, destroyed on exit) costs effectively
$0 while idle, since you pay only for the seconds it ran.

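The standby figure follows from the table's two non-zero lines. The sketch below reproduces that arithmetic in integer cents so no floating-point rounding creeps in. The rates are the approximate ones quoted above and may be out of date, and the names are illustrative.

```python
# Approximate rates from the table above; verify against current pricing.
STORAGE_CENTS_PER_GB_MONTH = 15   # ~$0.15/GB-month
DEDICATED_IPV4_CENTS_PER_MONTH = 200  # $2/month flat


def standby_monthly_cents(rootfs_gb: int, dedicated_ipv4: bool) -> int:
    """Monthly cost of one stopped Machine: storage plus optional IPv4."""
    storage = rootfs_gb * STORAGE_CENTS_PER_GB_MONTH
    ipv4 = DEDICATED_IPV4_CENTS_PER_MONTH if dedicated_ipv4 else 0
    return storage + ipv4


for with_v4 in (True, False):
    cents = standby_monthly_cents(rootfs_gb=3, dedicated_ipv4=with_v4)
    print(f"dedicated IPv4={with_v4}: ${cents / 100:.2f}/month")
# prints:
# dedicated IPv4=True: $2.45/month
# dedicated IPv4=False: $0.45/month
```

For the ~3 GB rootfs this gives $2.45 with a dedicated IPv4 and $0.45 without, which is where the rounded $0.50–$2.50 range comes from.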
### Practical pattern

Two reasonable workflows, plus one that's tempting but worse:

1. **Pure ephemeral.** `flyctl machine run` at session start,
   `flyctl machine destroy` on exit. ~20–45 s cold start, $0 idle.
   Maximally isolated; nothing persists between sessions. Best fit
   when sessions are infrequent or when state continuity across
   sessions is itself a concern.
2. **Standby pool.** A small fleet of pre-built Machines that get
   `start`ed fresh and `destroy`ed (or wiped) per session. The
   *Machine identity* is short-lived but the image is pre-cached on
   Fly's hosts, keeping warm-path latency at ~10–17 s.
   ~$0.50–$1/month per Machine in the pool without dedicated v4.
3. **Permanently stopped Machine, just `start`/`stop`.** Saves a few
   extra seconds (~7–11 s hot) but is the weakest of the three on
   the isolation axis: the rootfs persists across sessions, so
   anything a previous session wrote is still there. Avoid unless
   the saved seconds matter more than the state-continuity concern.

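The pure-ephemeral workflow depends on the destroy step running even when the session crashes. One way to guarantee that from a wrapper script is a context manager. This is a sketch under assumptions: `ephemeral_machine`, `run_checked`, and the `Runner` type are hypothetical names, and the actual `flyctl machine run` and `flyctl machine destroy` argument lists are left to the caller because they depend on the app and the Machine ID.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(argv: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(argv), check=True)


@contextmanager
def ephemeral_machine(
    create_argv: Sequence[str],
    destroy_argv: Sequence[str],
    run: Runner = run_checked,
) -> Iterator[None]:
    """Create a Machine, yield for the session, and always destroy it."""
    # If creation fails, nothing was created, so no destroy is attempted.
    run(create_argv)
    try:
        yield
    finally:
        # Runs on normal exit and on any exception inside the `with` body.
        run(destroy_argv)
```

The session itself goes inside the `with ephemeral_machine(...)` body. Because the destroy command sits in `finally`, an interrupted or failed session does not leave a billable Machine behind.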
### Fly-specific caveats

- **DinD requires kernel features.** Fly Machines historically had
  some namespacing quirks for nested Docker; verify on a smoke-test
  Machine before committing. The pattern is supported (Fly's own
  Remote Builders use it), but kernel/runtime updates have shifted
  the requirements over time.
- **The launcher's interactive y/N preflight blocks automated remote
  start.** `cli.py start` waits on `/dev/tty`. For an automated entry
  point you need to pipe `y\n` into stdin, drive it from a pty, or
  add a `--yes`/`--non-interactive` flag (a small patch). The
  `--remote=user@host` ergonomics direction below would handle this
  in passing.
- **Pricing has been re-tariffed multiple times.** The structure
  (per-second compute, GB-month storage, $2/v4) has been stable;
  specific rates may have moved. Verify against
  [fly.io/docs/about/pricing](https://fly.io/docs/about/pricing)
  before committing numbers to any planning doc.

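The simplest of the three preflight workarounds, piping `y\n` into stdin, might look like the sketch below. It only helps if the launcher reads its answer from stdin. Since `cli.py start` is described as waiting on `/dev/tty`, a pty-driven variant or the `--yes` patch may be needed instead. `run_with_auto_yes` and `STAND_IN_PROMPT` are hypothetical names, and the stand-in command merely imitates a y/N prompt so the example runs without the real launcher.

```python
import subprocess
import sys
from typing import Sequence


def run_with_auto_yes(argv: Sequence[str], timeout_s: float = 60.0) -> int:
    """Run `argv`, feeding a single "y" line on stdin; return the exit code."""
    completed = subprocess.run(
        list(argv),
        input="y\n",
        text=True,
        timeout=timeout_s,
        check=False,
    )
    return completed.returncode


# Stand-in for a y/N prompt: exits 0 only if the first stdin line is "y".
STAND_IN_PROMPT = [
    sys.executable,
    "-c",
    "import sys; sys.exit(0 if sys.stdin.readline().strip() == 'y' else 1)",
]
```

Calling `run_with_auto_yes(STAND_IN_PROMPT)` returns 0 because the stand-in accepts the piped answer. Substituting the real launcher command is left to the reader, subject to the `/dev/tty` caveat above.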
## Optional ergonomics direction

A future addon, not architecturally necessary, just nicer:
