refactor(demo): drive recording through real cli.py instead of a harness

The previous demo harness called the backend Python API directly,
which didn't match what a user typing `./cli.py start <agent>` would
actually see. The recording now goes through the real CLI surface:

- claude-bottle.demo.json + scripts/demo-setup.sh stage a demo
  manifest (one bottle, FAKE_TOKEN env, one unreachable git upstream)
  alongside a dummy SSH identity at ~/.cache/claude-bottle-demo/.
- docs/demo.tape types `./cli.py start demo`, answers the y/N
  preflight, and runs four bash probes via claude's `!` prefix
  (curl x3 + git push), so the recording shows real preflight output
  and real probe results.
- scripts/demo.sh wraps setup -> cli.py -> teardown for human use;
  scripts/demo-record.sh does the same around `vhs docs/demo.tape`.
- .gitignore now lists claude-bottle.json so a user's local manifest
  doesn't get tracked alongside its .example / .demo siblings.
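
A minimal sketch of the setup -> run -> teardown pattern described above,
assuming teardown is registered as an EXIT trap so it runs even when the
launched command fails. `stage_demo`, `launch_cli`, and `restore_state`
are hypothetical stand-ins for scripts/demo-setup.sh, `./cli.py start demo`,
and scripts/demo-teardown.sh; they only echo a marker here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the real scripts; each just prints a marker.
stage_demo()    { echo "setup"; }
launch_cli()    { echo "run"; return "$1"; }   # $1 = exit status to simulate
restore_state() { echo "teardown"; }

# The function body is a subshell, so the EXIT trap fires when it ends,
# whether launch_cli succeeded or not.
run_demo() (
  stage_demo
  trap restore_state EXIT
  launch_cli "$1"
)

# Successful run: prints setup, run, teardown (one per line).
run_demo 0
```

Registering the trap only after setup succeeds means a failed setup does
not trigger a teardown of state that was never staged.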

scripts/demo_harness.py is removed -- its behavior is fully replaced
by the cli.py + `!` flow.
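
For readers who want the old PASS/BLOCK style verdicts from the `!` flow,
a small helper along these lines could map a probe's HTTP status to a
verdict. This is a sketch, not part of the commit: `verdict_for_status`
is a hypothetical name, and treating every 2xx/3xx as allowed is an
assumption. Only the 403-means-blocked case comes from the demo itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map an HTTP status code to a demo verdict.
verdict_for_status() {
  local status="$1"
  case "$status" in
    403)     echo "BLOCK" ;;    # proxy refused: host not on the allowlist
    2??|3??) echo "PASS" ;;     # assumption: any 2xx/3xx went through
    *)       echo "UNKNOWN" ;;  # e.g. 000 when curl got no HTTP response
  esac
}

# Literal codes for illustration; in the bottle the code would come from
#   curl --proxy "$HTTPS_PROXY" -s -o /dev/null -w '%{http_code}' <url>
for code in 200 403 000; do
  printf '%s -> %s\n' "$code" "$(verdict_for_status "$code")"
done
```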

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 01:26:41 -04:00
parent 4ef1cc58df
commit 030a6bc793
9 changed files with 176 additions and 314 deletions
+17 -44
@@ -1,56 +1,29 @@
 #!/usr/bin/env bash
-# Demo runner: builds the image graph if needed, then runs the four-scenario
-# harness against a real bottle. Designed to produce screen-recordable
-# output — paced banners, color, no Python tracebacks unless something
-# actually breaks.
+# Human-runnable demo wrapper. Stages the demo manifest and dummy
+# identity (see scripts/demo-setup.sh), launches `./cli.py start demo`
+# interactively, then restores prior state. The recorded GIF
+# (docs/demo.gif) goes through the same flow via docs/demo.tape.
 #
-# Usage:
-# bash scripts/demo.sh # run live
-# vhs docs/demo.tape # record to docs/demo.gif
+# Once attached to claude inside the bottle, use the `!` prefix to run
+# bash directly — e.g.
+# ! curl --proxy "$HTTPS_PROXY" -sw 'status=%{http_code}\n' \
+# -o /dev/null http://example.com/
+# returns 403 because example.com is not on the bottle's allowlist.
 set -euo pipefail
 cd "$(dirname "$0")/.."
-verbose=0
-for arg in "$@"; do
-case "$arg" in
--v|--verbose) verbose=1 ;;
--h|--help)
-cat <<EOF
-Usage: bash scripts/demo.sh [--verbose]
-Runs four pipelock + git-gate probes against a real bottle and prints
-PASS/BLOCK verdicts. Without --verbose, Docker build chatter and
-backend log lines are suppressed so the output is recordable.
+if [ -z "${CLAUDE_BOTTLE_OAUTH_TOKEN:-}" ]; then
+cat <<'EOF' >&2
+demo: CLAUDE_BOTTLE_OAUTH_TOKEN is unset. The bottle launches claude,
+which needs the token to authenticate. Set it in your shell env (e.g.
+~/.zshrc) — see README §Auth — then re-run.
 EOF
-exit 0 ;;
-esac
-done
-if ! command -v docker >/dev/null 2>&1; then
-echo "docker not found on PATH — install Docker Desktop or equivalent first" >&2
 exit 1
 fi
-if ! docker info >/dev/null 2>&1; then
-echo "docker daemon not reachable — start Docker and re-run" >&2
-exit 1
-fi
+bash scripts/demo-setup.sh
+trap 'bash scripts/demo-teardown.sh' EXIT
-# Pre-warm the image graph quietly so the recorded run shows only the
-# four scenario blocks, not BuildKit progress. The backend rebuilds
-# (cache-hit) on launch regardless; doing it once up front keeps the
-# launch-time chatter short.
-if [ "$verbose" = 0 ]; then
-docker build -q -t claude-bottle:latest . >/dev/null 2>&1 || true
-docker build -q -f Dockerfile.git-gate -t claude-bottle-git-gate:latest . >/dev/null 2>&1 || true
-fi
-if [ "$verbose" = 1 ]; then
-exec python3 -u scripts/demo_harness.py
-else
-# Stderr carries backend info() lines and BuildKit chatter; drop it.
-# The harness writes all scenario output (banners, results) to stdout.
-exec python3 -u scripts/demo_harness.py 2>/dev/null
-fi
+./cli.py start demo