
# Implementation language: bash vs. Python vs. Go

Research into which runtime bot-bottle should be implemented in, given
where the project is today (~1250 lines, Python, mostly orchestration of
`docker` / `flyctl` / `ssh`). The project started in bash and was rewritten
to Python; this note evaluates whether either of the other two options
would be a better fit going forward.

## Summary

Stay on Python. Switch to Go if and when distribution friction becomes the
dominant pain — i.e., when bug reports about Python interpreter / venv
behavior start outweighing bug reports about bot-bottle itself. Bash is
not the right tool at the project's current size; reverting would be a
regression.

The single thing worth doing *now* regardless of language is keeping the
backend abstraction (local-docker / generic-remote / fly) clean enough that
a future Go rewrite would be mechanical. If the abstraction is right, the
language choice is reversible. If it isn't, the cost of switching balloons
because you're rewriting *and* redesigning at once.

## Axes that matter for this project

The relevant criteria, in roughly the order they bite:

1. **Distribution friction** — how easy is it to install the tool on a
   new dev machine.
2. **Orchestration ergonomics** — 90% of the work is shelling out to
   `docker`, `flyctl`, and `ssh`, so impedance match with subprocess
   invocation matters.
3. **JSON manifest handling** — the manifest is structured config with
   nested fields, validation rules, and per-agent overrides.
4. **Cross-platform behavior** — must work the same on macOS and Linux,
   ideally without per-platform shims.
5. **Test-matrix burden** — local-mac × local-linux × generic-remote × fly
   is already a wide test surface. The language choice should minimize
   what it adds to that matrix, not expand it.
6. **Onboarding new contributors** — single maintainer today, but the
   project is published and may attract drive-by PRs.

## Comparison

| | Bash | Python | Go |
|---|---|---|---|
| Distribution | Zero runtime — `chmod +x && run` | `uv run` is good; bare `python3` is "which one" | Single static binary, `go install` |
| Orchestration ergonomics | Native — pipes, `set -e`, no marshaling | Verbose (`subprocess.run(...)` × 50) | Verbose (`if err != nil { return err }` × 50) |
| JSON manifest | Painful past trivial — `jq` for nested writes is ugly | Excellent — `json` + dataclasses/pydantic | Excellent — struct tags, compile-time schema |
| Cross-platform | macOS bash 3.2 vs Linux 5.x + BSD/GNU userland = real hazard | Mostly a non-issue with `uv` | Best — same binary everywhere |
| Testability | Hard. No good unit story | Mature (pytest, subprocess mocking) | Mature (table-driven tests; `exec.Command` is mockable behind a small interface) |
| Startup time | ~5ms | ~100-200ms cold | ~10ms |
| Onboarding new contributors | High barrier past ~500 lines | Largest pool | Smaller but technical pool |
| Rewrite cost from current state | Already done once, regretted | Sunk cost — zero | ~1 focused week for ~1200 lines |

## Bash

Right tool *if the project stays under ~500 lines*. bot-bottle has
already crossed that threshold (~1250 lines), and the orchestration is no
longer "stitch CLIs together" — it has manifest validation, env-var
resolution, network and sidecar lifecycle, and SSH provisioning. Bash
scales badly on all four, for the following reasons:

- **macOS bash 3.2 ceiling.** No associative arrays, no `wait -n`, no
  `${var,,}`, no namerefs. Anything written for Linux bash 5.x has to be
  back-ported by hand.
- **BSD vs GNU userland divergence.** `sed -i`, `date`, `readlink`, `mktemp`
  flags all behave differently. Every script grows portability shims.
- **JSON via `jq` is workable but ugly.** Nested writes and per-agent
  override merges become unreadable.
- **No real test story.** Black-box integration tests only.
- **Silent failure modes.** `set -euo pipefail` does not catch every case
  (e.g. `null` propagating through a pipeline into a `docker run` flag),
  and command substitutions can lose `set -e`. Subtle, hard to audit.

The original project was written in bash and rewritten to Python for
exactly these reasons. Reverting would not be a portability win; it would
be a portability loss.
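
To make the "silent failure" point concrete from the Python side: a missing manifest key that `jq` would print as `null` can instead be made to fail at lookup time. This is a minimal sketch under assumptions; the manifest shape (`agents` → name → `image`) and the function name `image_for` are hypothetical and chosen only for illustration.

```python
import json


def image_for(manifest_text: str, agent: str) -> str:
    """Return the image for an agent, failing loudly if it is missing.

    The manifest layout used here is an assumed example, not the real schema.
    """
    manifest = json.loads(manifest_text)
    try:
        image = manifest["agents"][agent]["image"]
    except KeyError as exc:
        # A missing key stops here instead of flowing on as the string "null".
        raise ValueError(f"manifest has no image for agent {agent!r}") from exc
    if not isinstance(image, str) or not image:
        # An explicit JSON null (None) or empty string is rejected as well.
        raise ValueError(f"image for agent {agent!r} must be a non-empty string")
    return image
```

The value is checked before it can reach a `docker run` argument list, which is the class of bug that is hard to audit for in a shell pipeline.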

## Python

Right tool *for where the project is now*. The rewrite is already paid
for and it delivered: the code is testable, the orchestration reads
cleanly, the manifest layer benefits from structured types, and `uv` plus
PEP 723 inline metadata makes the old "which python3 with which deps"
complaint mostly moot.
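
As a rough illustration of the two points above, the sketch below combines a PEP 723 inline metadata header with a typed per-agent override merge. The field names (`image`, `memory_mb`), the `resolve_agent` helper, and the `>=3.10` floor are assumptions made for the example, not the project's real schema or requirements.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///
from dataclasses import dataclass, fields, replace
from typing import Any


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical per-agent settings; field names are illustrative."""
    image: str
    memory_mb: int = 512


def resolve_agent(defaults: AgentConfig, overrides: dict[str, Any]) -> AgentConfig:
    """Apply per-agent overrides on top of defaults, rejecting unknown keys."""
    known = {f.name for f in fields(AgentConfig)}
    unknown = set(overrides) - known
    if unknown:
        raise ValueError(f"unknown manifest keys: {sorted(unknown)}")
    # replace() returns a new frozen instance with the overridden fields.
    return replace(defaults, **overrides)


if __name__ == "__main__":
    base = AgentConfig(image="base:1")
    print(resolve_agent(base, {"memory_mb": 1024}))
```

A file with this header can be started with `uv run` followed by the script path, and `uv` reads the interpreter constraint and dependency list from the comment block.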

Real costs that remain:

- **Startup latency** — ~100-200ms cold is noticeable but not bad for an
  interactive tool that runs a single command and hands off to `docker
  exec -it`.
- **Distribution to non-developer audiences** — has rough edges if the
  user doesn't have `uv`. Acceptable for the current audience (developers
  who already have a Python interpreter installed).
- **Interpreter version drift** — Ubuntu 22.04 ships 3.10, fresh distros
  ship 3.13. Behavior deltas between minor versions exist but are rare in
  the standard library surface this project uses.

## Go

Right tool *if and when distribution becomes the dominant pain*. Single
static binary works identically across macOS arm64, macOS amd64, Linux
amd64, Linux arm64 — which neutralizes most of the cross-platform leg of
the test matrix. Startup is fast enough that the tool feels native.

Costs:

- **Rewrite cost** — roughly one focused week for ~1200 lines of mostly
  mechanical orchestration. Not interesting work.
- **Verbosity** — `if err != nil { return err }` is similar in volume to
  Python's `subprocess.run(..., check=True)` plumbing. No win on terseness.
- **Smaller AI-tooling ecosystem** — most Claude Code-adjacent helpers
  and skills are Python or JS. Drive-by contributors are a smaller pool.
  Any future "import a third-party Python skill package" idea gets harder.
- **Iteration loop** — no "edit the script, rerun" — you build, then run.
  Minor; not load-bearing for a single maintainer.

## Recommendation

Stay on Python. The signal to watch for, before reconsidering, is bug
reports about Python interpreter or venv behavior outnumbering bug reports
about bot-bottle's actual logic. Until that pattern shows up, the Go
rewrite isn't paying for itself.

Independent of language: invest in the backend abstraction now. A clean
`Backend` interface (with `run`, `exec`, `cp`, `build`, `network_*`)
makes the language choice reversible. A leaky abstraction makes it
expensive in any direction.
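
One possible shape for that interface, sketched with `typing.Protocol`. Only the method names `run`, `exec`, `cp`, and `build` and the `network_` prefix come from this note; every signature, the `network_create` / `network_remove` split, the `RecordingBackend` test double, and the `launch` function are assumptions made for illustration.

```python
from typing import Protocol, Sequence


class Backend(Protocol):
    """Hypothetical backend contract; signatures are illustrative only."""

    def build(self, context_dir: str, tag: str) -> None: ...
    def run(self, image: str, name: str, network: str) -> None: ...
    def exec(self, name: str, argv: Sequence[str]) -> int: ...
    def cp(self, src: str, dest: str) -> None: ...
    def network_create(self, name: str) -> None: ...
    def network_remove(self, name: str) -> None: ...


class RecordingBackend:
    """Test double that records calls instead of touching docker/fly/ssh."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    def build(self, context_dir: str, tag: str) -> None:
        self.calls.append(("build", context_dir, tag))

    def run(self, image: str, name: str, network: str) -> None:
        self.calls.append(("run", image, name, network))

    def exec(self, name: str, argv: Sequence[str]) -> int:
        self.calls.append(("exec", name, *argv))
        return 0  # pretend the command succeeded

    def cp(self, src: str, dest: str) -> None:
        self.calls.append(("cp", src, dest))

    def network_create(self, name: str) -> None:
        self.calls.append(("network_create", name))

    def network_remove(self, name: str) -> None:
        self.calls.append(("network_remove", name))


def launch(backend: Backend, image: str, name: str) -> int:
    """Backend-agnostic orchestration: no docker/fly/ssh details leak in."""
    network = f"{name}-net"
    backend.network_create(network)
    backend.run(image, name, network)
    return backend.exec(name, ["true"])
```

If orchestration code like `launch` depends only on the protocol, a port to another language amounts to re-implementing each backend behind the same method set, and the recording double keeps most tests off the real infrastructure.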