diff --git a/docs/research/bash-vs-python-vs-go.md b/docs/research/bash-vs-python-vs-go.md new file mode 100644 index 0000000..292e8d4 --- /dev/null +++ b/docs/research/bash-vs-python-vs-go.md @@ -0,0 +1,128 @@ +# Implementation language: bash vs. Python vs. Go + +Research into which runtime claude-bottle should be implemented in, given +where the project is today (~1250 lines, Python, mostly orchestration of +`docker` / `flyctl` / `ssh`). The project started in bash and was rewritten +to Python; this note evaluates whether either of the other two options +would be a better fit going forward. + +## Summary + +Stay on Python. Switch to Go if and when distribution friction becomes the +dominant pain — i.e., when bug reports about Python interpreter / venv +behavior start outweighing bug reports about claude-bottle itself. Bash is +not the right tool at the project's current size; reverting would be a +regression. + +The single thing worth doing *now* regardless of language is keeping the +backend abstraction (local-docker / generic-remote / fly) clean enough that +a future Go rewrite would be mechanical. If the abstraction is right, the +language choice is reversible. If it isn't, the cost of switching balloons +because you're rewriting *and* redesigning at once. + +## Axes that matter for this project + +The relevant criteria, in roughly the order they bite: + +1. **Distribution friction** — how easy is it to install the tool on a + new dev machine. +2. **Orchestration ergonomics** — 90% of the work is shelling out to + `docker`, `flyctl`, and `ssh`, so impedance match with subprocess + invocation matters. +3. **JSON manifest handling** — the manifest is structured config with + nested fields, validation rules, and per-agent overrides. +4. **Cross-platform behavior** — must work the same on macOS and Linux, + ideally without per-platform shims. +5. **Test-matrix burden** — local-mac × local-linux × generic-remote × fly + is already a wide test surface. The language choice should minimize + what it adds to that matrix, not expand it. +6. **Onboarding new contributors** — single maintainer today, but the + project is published and may attract drive-by PRs. + +## Comparison + +| | Bash | Python | Go | +|---|---|---|---| +| Distribution | Zero runtime — `chmod +x && run` | `uv run` is good; bare `python3` is "which one" | Single static binary, `go install` | +| Orchestration ergonomics | Native — pipes, `set -e`, no marshaling | Verbose (`subprocess.run(...)` × 50) | Verbose (`if err != nil { return err }` × 50) | +| JSON manifest | Painful past trivial — `jq` for nested writes is ugly | Excellent — `json` + dataclasses/pydantic | Excellent — struct tags, compile-time schema | +| Cross-platform | macOS bash 3.2 vs Linux 5.x + BSD/GNU userland = real hazard | Mostly a non-issue with `uv` | Best — same binary everywhere | +| Testability | Hard. No good unit story | Mature (pytest, subprocess mocking) | Mature (table-driven, `exec.Command` is mockable) | +| Startup time | ~5ms | ~100-200ms cold | ~10ms | +| Onboarding new contributors | High barrier past ~500 lines | Largest pool | Smaller but technical pool | +| Rewrite cost from current state | Already done once, regretted | Sunk cost — zero | ~1 focused week for ~1200 lines | + +## Bash + +Right tool *if the project stays under ~500 lines*. claude-bottle has +already crossed that threshold (~1250 lines), and the orchestration is no +longer "stitch CLIs together" — it has manifest validation, env-var +resolution, network and sidecar lifecycle, and SSH provisioning. Bash +scales badly for all four: + +- **macOS bash 3.2 ceiling.** No associative arrays, no `wait -n`, no + `${var,,}`, no namerefs. Anything written for Linux bash 5.x has to be + back-ported by hand. +- **BSD vs GNU userland divergence.** `sed -i`, `date`, `readlink`, `mktemp` + flags all behave differently. Every script grows portability shims. +- **JSON via `jq` is workable but ugly.** Nested writes and per-agent + override merges become unreadable. +- **No real test story.** Black-box integration tests only. +- **Silent failure modes.** `set -euo pipefail` does not catch every case + (e.g. `null` propagating through a pipeline into a `docker run` flag), + and command substitutions can lose `set -e`. Subtle, hard to audit. + +The original project was written in bash and rewritten to Python for +exactly these reasons. Reverting would not be a portability win; it would +be a portability loss. + +## Python + +Right tool *for where the project is now*. The sunk cost is real and +accurate: it's testable, the orchestration code reads cleanly, the +manifest layer benefits from structured types, and `uv` plus PEP 723 +inline metadata makes the historical "which python3 with which deps" +complaint mostly historical. + +Real costs that remain: + +- **Startup latency** — ~100-200ms cold is noticeable but not bad for an + interactive tool that runs a single command and hands off to `docker + exec -it`. +- **Distribution to non-developer audiences** — has rough edges if the + user doesn't have `uv`. Acceptable for the current audience (developers + who already have a Python). +- **Interpreter version drift** — Ubuntu 22.04 ships 3.10, fresh distros + ship 3.13. Behavior deltas between minor versions exist but are rare in + the standard library surface this project uses. + +## Go + +Right tool *if and when distribution becomes the dominant pain*. Single +static binary works identically across macOS arm64, macOS amd64, Linux +amd64, Linux arm64 — which neutralizes most of the cross-platform leg of +the test matrix. Startup is fast enough that the tool feels native. + +Costs: + +- **Rewrite cost** — roughly one focused week for ~1200 lines of mostly + mechanical orchestration. Not interesting work. +- **Verbosity** — `if err != nil { return err }` is similar in volume to + Python's `subprocess.run(..., check=True)` plumbing. No win on terseness. +- **Smaller AI-tooling ecosystem** — most Claude Code-adjacent helpers + and skills are Python or JS. Drive-by contributors are a smaller pool. + Any future "import a third-party Python skill package" idea gets harder. +- **Iteration loop** — no "edit the script, rerun" — you build, then run. + Minor; not load-bearing for a single maintainer. + +## Recommendation + +Stay on Python. The signal to watch for, before reconsidering, is bug +reports about Python interpreter or venv behavior outnumbering bug reports +about claude-bottle's actual logic. Until that pattern shows up, the Go +rewrite isn't paying for itself. + +Independent of language: invest in the backend abstraction now. A clean +`Backend` interface (with `run`, `exec`, `cp`, `build`, `network_*`) +makes the language choice reversible. A leaky abstraction makes it +expensive in any direction.