Three UX improvements requested in #270 review:
- filter_multiselect: Space toggles selection, Enter confirms (was both)
- bottle picker: bottles with extends chains show ancestry labels
(e.g. 'claude-dev <- bot-bottle-dev <- dev') for at-a-glance lineage
- preflight: replaces key-value summary with YAML of the resolved manifest
Tab switches focus to the selected-order panel; K/J shift the
highlighted item up/down; Space/Enter removes it. The filter list dims
while the order panel is active. Help line updates per focus mode.
- `bottle:` in agent frontmatter is now optional; agents without it
are portable and require bottles to be selected at launch.
- Adds `filter_multiselect` to `tui.py`: multi-select picker with
ordered selection list, Space/Enter to toggle, Ctrl-D to confirm.
- `ManifestIndex` gains `all_bottle_names` and `load_for_agent` accepts
`bottle_names: tuple[str, ...]` to merge bottles in order at runtime.
- `merge_bottles_runtime` in `manifest_extends.py` applies the same
field-merge rules as `extends:` to pre-resolved bottle objects.
- `BottleSpec` gains `bottle_names`; `_validate` and `write_launch_metadata`
thread it through so `resume` replays the same bottle configuration.
- `cmd_start` shows the bottle multiselect after agent selection,
pre-populated from the agent's `bottle:` field when present.
- Existing agents with `bottle:` declared continue to work unchanged.
The constant and its MCP tool name ("allow" → "egress-allow") were the
only supervise tools without an egress-scoped identifier, despite the
tool being egress-only (routes.yaml payload, COMPONENT_FOR_TOOL maps
it to "egress", always grouped with TOOL_EGRESS_BLOCK). The rename
brings it in line with TOOL_EGRESS_BLOCK and TOOL_EGRESS_TOKEN_ALLOW,
and adds TOOL_EGRESS_ALLOW and TOOL_EGRESS_BLOCK to __all__ (both were
previously absent).
When the outbound DLP catches a token, route the block through the
existing supervisor approval queue instead of returning 403 outright.
The egress proxy holds the request open until the operator answers, then
remembers an approved value for the life of the proxy so the request --
and later ones carrying it -- flow through. Fails closed on rejection,
timeout, malformed response, or when supervise is disabled.
- ScanResult.matched carries the raw matched substring (sidecar-only;
never logged or written to the proposal). scan_outbound and the token
detectors take a safe_tokens set and skip approved values, continuing
past a safelisted match so a second secret in the same request is
still caught.
- New egress-token-allow proposal tool, written directly to the queue by
the addon (the gitleaks-allow pattern from PRD 0061). build_token_allow
_payload renders host/method/path/detector reason + redacted context.
- Async request hook polls the queue without stalling the proxy event
loop; EGRESS_TOKEN_ALLOW_TIMEOUT_SECONDS (default 300) bounds the wait.
- Supervisor TUI renders egress-token-allow like gitleaks-allow: report
only, modify unavailable, approval requires a recorded reason.
- Unit tests for the matched/safe-tokens plumbing, payload builder, tool
constant round-trip, and TUI paths; README + PRD 0062.
Closes#261.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
log.py was bare print-to-stderr wrappers with no levels or attributable
context (issue #252). Add:
- Ordered severities (debug/info/warn/error) gated by
BOT_BOTTLE_LOG_LEVEL (default info). debug is silent by default;
error always surfaces (nothing sits above it), so the fatal die path
stays visible regardless of configured level.
- An optional `context` mapping on every wrapper, rendered as a
parseable ` [k=v ...]` suffix (keys sorted; whitespace/quoted values
quoted) so failures can be filtered and correlated.
Default output with no context is byte-identical to the original lines,
so the 100+ existing single-string call sites are unaffected. Wires the
supervise crash path (the example the issue names) to attach error_type
and crash_log context. Adds test_log.py (backward-compat, context
rendering, level gating, die surfacing).
Closes#252.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01YcU7nerbg8cVj9R4EkpfLJ
The TUI was calling archive_proposal for gitleaks-allow immediately
after write_response, moving the response file to processed/ within
microseconds. The git-gate shell loop polls queue_dir for the response
file every second — it never sees it and hangs until timeout.
capability-block is handled by the MCP sidecar which archives after
reading; gitleaks-allow is handled by the shell gate which archives
after processing. Let the gate own the archive step.
Adds a Freezer ABC (backend/freeze.py) that encapsulates the
stop-commit-mark-preserved flow for all backends, following the same
pattern as BottleBackend. Each backend gets its own Freezer subclass:
DockerFreezer — docker commit
MacosContainerFreezer — container export + image rebuild; prompts
to stop if the container is running
SmolmachinesFreezer — smolvm pack create --from-vm
The base class owns write_committed_image, mark_preserved, and the
resume hint. Subclasses implement _freeze() and optionally override
_export_hint() for migration instructions.
Freezer.commit(agent, bottle) is the primary entry point for use
within a live launch context. Freezer.commit_slug(slug) is a
convenience wrapper for cmd_commit, which no longer branches on
backend names itself.
get_freezer(backend_name) is the factory, analogous to
get_bottle_backend(). CommitCancelled is raised by MacosContainerFreezer
when the user declines the stop prompt; cmd_commit catches it and
returns 0.
`container export` requires the container to be stopped first. When a
running bottle is detected, prompt the user to confirm, stop the
container, then commit. Adds `container_is_running` and
`stop_container` helpers to the macos-container util.
Addresses #240 (comment)
Adds `./cli.py commit [<slug>]` which runs `docker commit` on the
active agent container and stores the resulting image tag in per-bottle
state. The next `./cli.py resume <slug>` automatically boots from the
committed snapshot instead of rebuilding from the Dockerfile, preserving
all in-container state across restarts and migrations.
- bottle_state: add write_committed_image / read_committed_image helpers
- docker/util: add commit_container wrapper around `docker commit`
- docker/launch: check for a committed image before the Dockerfile build
step; fall back to normal build if the image is absent from the daemon
- cli/commit: new command with interactive slug picker; errors clearly on
non-Docker backends
- 50 new unit tests covering all paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace module-level apply_routes_change wrappers with a public
applicator singleton in each backend. Callers now work with the
EgressApplicator instance directly (applicator.apply_routes_change)
rather than through a function shim.
Manifest now holds exactly one agent and one effective bottle (with
git_user overlay already applied). The old multi-agent/bottle
collection is renamed ManifestIndex. BottleSpec.manifest starts as
ManifestIndex from the CLI and becomes Manifest after _validate()
calls load_for_agent(); all provisioning code downstream reads
spec.manifest.agent / spec.manifest.bottle instead of indexing by name.
Manifest.resolve() now returns an empty-dict manifest with only directory
paths recorded (home_md, cwd_md). No content is read from any .md file
until load_for_agent() is called for a specific agent at preflight.
- Manifest.from_md_dirs: scan-only, no frontmatter parsing
- Manifest.load_for_agent: parses the selected agent file and its bottle
chain; works on eager (from_json_obj) manifests too by returning self
- Manifest.all_agent_names: scans filenames in lazy mode
- backend._validate: calls load_for_agent and propagates upgraded spec
- cli/info.py, cli/list.py, cli/start.py: use load_for_agent / all_agent_names
- manifest_extends.py: reverted to original (no partial-resolve helpers)
- manifest_loader.py: only scan_agent_names + load_bottle_chain_from_dir
- Tests updated to call load_for_agent before accessing agents/bottles;
test_md_agent_repos_deferred renamed to test_md_agent_repos_fails_at_preflight
Broken bottle/agent files no longer block the agent selector or prevent
unrelated agents from loading. Per-file parse errors are collected in
`Manifest.broken_agents`; the CLI selector includes them via
`all_agent_names`, and the error surfaces only when the specific agent
is selected and launch is attempted (in `require_agent`/`bottle_for`).
Closes#236
When a label is set (e.g. "bob"), the display becomes "bob (claude-implementer)"
so the agent type is always visible. Affects all three backends (docker,
macos-container, smolmachines) and the `cli.py list active` output.
Closes#243
When a label is given it is now used verbatim as the slug (no random
suffix), so two launches with the same label collide by design. The
CLI re-prompts via the TUI name modal with a disclaimer when the
candidate slug is already in use among running bottles.
Remove the 8 non-bright and 1 bright-black colors from all color maps.
Rename the remaining 7 bright-* colors to their base names (e.g.
bright-green → green) so the palette is smaller and always vibrant.
Update _init_color_pairs in tui.py to always apply A_BOLD (all palette
entries are now bright variants), and fix all tests to match.
The recent refactor partially removed workspace planning and
capability-apply logic. This commit finishes the cleanup so the
test suite imports cleanly:
- Comment out workspace_plan field/property on BottlePlan and the
provision_workspace dispatch.
- Comment out workspace usages in docker.util (build_image_with_cwd),
smolmachines.provision.workspace, agent_provider.provision_git,
smolmachines.backend.
- Comment out capability_apply imports in cli.start and cli.supervise;
add a local CapabilityApplyError placeholder so the supervise CLI
module still imports.
- Break the bottle_state → backend.docker → backend circular import
by lazy-loading docker_mod inside bottle_identity, and by moving the
resolve_common import inside BottleBackend.prepare.
- Delete tests for workspace and capability_apply (unit + integration).
- Update test fixtures to drop removed kwargs (container_name_pinned,
derived_image, env_file, workspace_plan, agent_image_ref) from
DockerBottlePlan / SmolmachinesBottlePlan constructors.
- Delete the obsolete test_smolmachines_prepare.py (tested the old
resolve_plan signature; the shared prepare flow now lives in
BottleBackend.prepare).
- Adjust test_supervise.py for the new Supervise.prepare signature
(dockerfile_content arg removed).
925 → 897 tests, all passing.
Both docker and smolmachines backends use bottle state helpers.
Moving to bot_bottle/ makes the sharing explicit and removes the
cross-backend dependency (smolmachines importing from ..docker).
All callers updated: docker backend, smolmachines backend, cli
modules, and tests.
Chunk 1 (schema + storage): BottleSpec, ActiveAgent, and BottleMetadata
gain label and color fields. Both docker and smolmachines backends
persist them to metadata.json on prepare and surface them in
enumerate_active_agents(). AgentProvider.provision_plan() passes
label/color through to the Claude provider, which injects them into
claude.json so claude-code displays the session name and color in its
header. Codex provider accepts and ignores the knobs.
Chunk 2 (curses modal + display): cmd_start presents a two-step curses
modal — first edit the label (first keystroke replaces the pre-fill),
then optionally pick a color. cli list active renders label with ANSI
escape codes when the terminal supports it, falling back to agent_name
when no label is set.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drops `egress-block` from the supervise sidecar, removes
`_merge_single_route`, `add_route`, and `apply_routes_change` from
egress_apply.py, and strips the proposal/approve/reject flow for egress
from the supervise CLI. The list-egress-routes and capability-block tools
are unaffected. Tests updated throughout.
Closes#198
- Remove .keys() iteration in favor of direct dictionary iteration
- Remove redundant os module reimport in tui.py
- Disable unnecessary-ellipsis rule in pylintrc to avoid conflict with pyright's
Protocol type requirements
pyright: 0 errors
pylint: 9.93/10
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The issue: Both the original file object (tty_fd) and the FileIO object
created in _run_picker() were managing the same file descriptor. When
both tried to close it (or during garbage collection), we got
'Bad file descriptor' errors.
The solution: Use os.dup() to create an independent copy of the fd that
FileIO can own exclusively. The original file object closes its copy,
and FileIO closes its independent copy, preventing conflicts.
This properly separates fd ownership between the two objects.
Fixes the 'Exception ignored while finalizing file' errors on agent startup.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The issue: filter_select() opens a file object and passes its file
descriptor to _run_picker(). Inside _run_picker(), a FileIO object is
created from that same fd number. When filter_select() then calls
tty_fd.close(), it closes the underlying fd. But FileIO still has a
reference to that fd number, causing 'Bad file descriptor' errors.
Solution: Don't explicitly close tty_fd. Let it be garbage collected,
which naturally closes the fd. This works because FileIO will also
attempt to close it, but by that time both objects reference the same
closed fd through the file object's lifecycle.
The fd is properly closed by the time the function returns.
Fixes agent startup failure.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Main code fixes:
- Remove unused Iterator import from local_registry.py
- Fix signal handler signature in pty_resize.py (correct parameters for signal.signal)
- Add type annotations for screen parameters in tui.py (use Any for curses types)
- Fix missing tty_fd type annotation in tui.py
- Remove unused old_term variable in tui.py
- Fix tty_fd FileIO wrapping for TextIOWrapper initialization
- Add type: ignore for curses._CursesWindow attributes in supervise.py
- Add type: ignore for BaseServer attributes in git_http_backend.py
- Fix HTTPRequestHandler.log_message parameter name mismatch
- Cast _agent_prompt_mode to PromptMode in bottle.py files
- Fix Popen[bytes] generic type annotations in sidecar_init.py
- Add type: ignore for dynamic prompt_file attribute access in agent_provider.py
Configuration:
- pyrightconfig.json now suppresses third-party library unknowns
- Remaining test errors are mostly in test suites
Fixes 23 errors in main code, reduces total from 985 → 240 (75% reduction from initial ~1,200)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Remove 35+ unused imports across 20+ files (W0611). Wrap 19 lines
to fit under 100 character limit (C0301). Add type casts and
annotations in egress_addon_core.py to resolve pyright errors
caused by JSON parsing of untyped objects.
Key changes:
- Remove unused imports (abstractmethod, mock utilities, etc)
- Split long lines at logical breaks (method calls, error messages)
- Add typing.cast() for proper type inference in JSON parsing
- Explicit type annotations for dict/list accesses
Results:
- Pylint rating: 8.73/10
- egress_addon_core.py: 0 pyright errors (was 15)
- All W0611 and C0301 issues fixed
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Add bot_bottle/cli/tui.py: curses filter-select picker that opens
/dev/tty directly so it works with redirected stdout/stdin
- Make `name` positional optional (nargs="?") in cmd_start; show agent
picker when absent
- Show backend picker when no --backend flag and BOT_BOTTLE_BACKEND is
unset; skip when either is explicit or the env var is present
- Add tests/unit/test_cli_tui.py covering _filter_items logic and
short-circuit paths (empty list, unavailable tty)
- Add tests/unit/test_cli_start_selector.py covering all four dispatch
combinations (both explicit, agent-absent, backend-absent, both-absent)
and cancel semantics
- Activate PRD 0051
Splits the 2103-line dashboard.py into two modules. Pure data
structures (QueuedProposal), discovery helpers (discover_pending,
discover_active_agents), derived-value helpers (_is_recent,
_approval_status, _format_agent_row, _detail_lines, etc.), and
argv-builder helpers (_build_split_pane_argv, _build_respawn_pane_argv,
_build_resume_argv_with_fallback, _agent_runtime_args) all move to
dashboard_model.py. The curses TUI, $EDITOR integration, tmux
subprocess flows, and action handlers (approve, reject,
operator_edit_routes, operator_edit_allowlist) remain in dashboard.py,
which re-imports everything from dashboard_model so existing callers and
tests are unaffected.
Adds tests/unit/test_dashboard_model.py covering _approval_status,
_proposed_payload_label, and _suffix_for_tool — three helpers that had
no prior coverage. All 894 unit tests pass.
Closes#158