Files
bot-bottle/docs/research/monetization-positioning.md
T
didericis d2081839c9 docs(research): add forge-native orchestration as the delivery vehicle
Fold in the forge-native angle: the git forge (GitHub/GitLab/Gitea) as
the orchestrator, with bot-bottle as the safe runtime it launches into.
Same moat (custody + audit + policy), better vehicle — the forge supplies
identity, state, triggers, review, audit, and permissions for free, and
lands the product where teams already live.

Adds: the crowding map (generic 50-100+ vs forge-native ~10-30 vs
self-hostable-least-priv-audited single digits); the GitHub/GitLab
first-party trap and why to lead Gitea + sovereignty buyers; the
buyer reconciliation (self-hosted-forge compliance orgs); a moat-vs-cost
split of the "hard parts"; run-provenance-on-every-PR as the killer
feature; the `@bot-bottle fix this` MVP riding the headless primitive;
and two forge-specific risks. Sources for the forge landscape noted as
conversation-provided, not independently re-verified.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
2026-06-29 12:02:23 -04:00

403 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Monetization & competitive positioning
Where, if anywhere, bot-bottle has a paid wedge — given a 2026
competitive field that has largely commoditized "sandbox a coding
agent." Folds together the agent-provider-agnostic framing, the Fly
remote-backend idea, the supervisor/egress-audit play, and the
solo-dev/Linux brand instinct, then asks the only question that
matters: is there a viable path to revenue that the competition does
not already foreclose?
Companion to
[`agent-sandbox-landscape.md`](agent-sandbox-landscape.md) (the
isolation-tech survey),
[`built-in-supervisor-design.md`](built-in-supervisor-design.md) (the
supervise surface this would extend), and
[`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md)
(why custody, not detection, is the real moat).
Market data current as of June 2026.
## Summary
**Verdict: a path exists, but it is narrow, and it is not the path the
project is currently shaped for.** Every individual property bot-bottle
leans on — isolation, BYO-image, egress filtering, OSS, self-hosting —
is matched by some competitor, and several are now *free* from the agent
vendors themselves. There is exactly one defensible position left: the
**bundle** that no single competitor occupies —
> uniform egress audit + secret custody + policy, across *heterogeneous
> coding agents you don't trust*, on your infra or a managed pool.
Monetization is viable **only** if the product is sold as cross-vendor
**fleet governance + egress audit for teams**, not as solo-dev agent
safety (which the labs give away free). The solo-dev/Linux/anti-corporate
energy is real and worth using — but as a *distribution and trust*
engine that drives bottom-up adoption into teams, never as the revenue
positioning itself. Get those two wires crossed and the business dies:
you'd be courting the lowest-willingness-to-pay audience on earth while
repelling the only buyer who pays.
Net: **viable, conditional, and unforgiving of positioning error.** Do
Phase 1 (self-hostable egress-audit dashboard) regardless — it's
low-risk and it's the demo that makes everything else legible. Gate the
go/no-go on whether 510 teams confirm they'd pay for cross-vendor
egress audit *before* building the hosted tier.
## The two axes of "agnostic"
bot-bottle differentiates on two orthogonal axes, and conflating them
muddies the pitch:
1. **Agent-provider agnostic** — run Claude Code, Codex, Aider, a local
model, behind one control layer. Already real in the code
(`agent_provider.py`, Claude/Codex templates, BYO Dockerfile). This
is the axis the labs *structurally cannot* match — Anthropic only
runs Claude, OpenAI only their models. Durable.
2. **Compute backend** — local (docker / Apple Container / smolmachines)
today; a remote **Fly** backend would add a managed pool. This is the
axis that makes "fleet" literal for orgs and opens metered billing.
Fly is a strong first remote backend because it also subsumes remote
spin-up (Machines API) and the tunnel problem (6PN/WireGuard) — but
"provider-agnostic compute" should be *earned* after backend #2, not
designed up front (premature generalization trap).
## Competitive field, by capability
The field doesn't have one competitor; it has a different set on each
capability bot-bottle touches. Five dimensions:
| Capability | Who has it | bot-bottle's standing |
| :-- | :-- | :-- |
| **Isolation / sandbox** | Anthropic & OpenAI **native, free**; OSS devcontainer wrappers; E2B/Modal/Daytona/Northflank | Commoditized. Not a wedge. |
| **Arbitrary BYO Docker image** | Sandbox PaaS (E2B/Modal/Daytona/Northflank) yes; **managed agents: ~none** (Codex = fixed `codex-universal` + setup scripts; Copilot "not supported"; Devin/Jules constrained) | Wedge **vs. managed agents** (structural: it's their infra). Table stakes vs. PaaS. |
| **Egress audit + alerts** | LLM-observability tools (Braintrust/Langfuse/Phoenix/Helicone/Datadog) — but on *model calls*, wrong layer. Network-egress security (DeepInspect, AI gateways) — right layer, but decoupled from the agent, not cross-vendor. Sandbox PaaS = gateway/filter, not an audit surface. | **~Nobody in bot-bottle's exact shape** (per-agent egress, tied to the sandbox, with DLP context, cross-vendor). This is the wedge. |
| **OSS / self-hosting** | Managed agents: ~none. Sandbox PaaS: ~half (E2B OSS+self-host; Northflank BYOC; Modal closed; **Daytona leaving OSS**). Devcontainer wrappers: ~all. Observability: several. | Real wedge **vs. managed agents only**. Table stakes vs. PaaS, zero differentiation vs. wrappers. |
| **Cross-vendor uniformity** | Nobody — the labs won't, PaaS is agent-neutral infra not agent-aware control, wrappers are single-tool | Wedge. The connective tissue of the whole position. |
The pattern: **isolation and OSS/self-host are commodity; BYO-image and
cross-vendor are wedges only against the managed agents; egress-audit in
the integrated form is the one thing genuinely unoccupied.**
## Where bot-bottle is alone vs. where it's table stakes
- **Alone (the moat):** egress audit + secret custody + policy, *tied to
the agent sandbox*, *with DLP context* (which secret, which host,
which agent/task), *uniform across vendors*. No competitor bundles
these. An enterprise *could* bolt DeepInspect-style egress monitoring
onto a sandbox, so the defensibility is the **integration and
per-agent context**, not "we can see egress."
- **Table stakes (do not lead with these):** "we sandbox agents" (free
from the labs), "we're open source" (E2B is; the wrapper crowd all
is), "we self-host" (Northflank BYOC, E2B, every wrapper).
## The two existential competitive facts
1. **The agent vendors ship good-enough sandboxing for free.** Claude
Code now has Seatbelt/bubblewrap + a network proxy natively; Codex
has its own sandbox + approvals. This compresses the *single-vendor,
single-dev* market to ~zero willingness-to-pay. It is *why* the
product must be cross-vendor fleet governance, not local agent
safety.
2. **Northflank is converging from the infra side.** It already ships
dedicated egress gateways + proxy-based secret injection + BYOC.
It is the nearest thing to bot-bottle's differentiator as a managed
platform — but infra-first and agent-neutral, not agent-aware,
cross-vendor, or audit-first. Watch it.
## Monetization path (sequenced)
Open-core: **give away the sandbox, charge for the control plane.**
- **Phase 0 — validate (12 wks, parallel).** Ask 510 teams running 2+
agents: would you pay for one egress-audit + policy plane across
Claude *and* Codex? Gate the rest on a yes.
- **Phase 1 — the wedge (self-hostable, OSS).** Multi-bottle egress
dashboard + web approval queue + exportable audit log, built over the
existing `supervise_server.py` JSON-RPC and the egress event levels
(`LOG_BLOCKS` / `LOG_FULL`). Low risk, half-built, and the 30-second
demo that sells everything. The compliance hook (75% of enterprises
rank auditability #1) lives here.
- **Phase 2 — the paywall (hosted team tier).** Multi-tenant supervisor:
SSO/RBAC, audit retention, alerting, **centralized policy push**
(define egress allowlist + DLP once, enforce across all agents —
the moat made concrete). Gate on team/compliance features, *never* on
the core security.
- **Phase 3 — Fly remote backend.** Managed agent pool → "fleet" becomes
literal; metered (agent-hours) billing; subsumes remote spin-up +
tunnel.
- **Phase 4 — deepen.** Second agent provider done deeply (lean
open-source/open-weight for rug-pull resistance); egress anomaly
detection (the DLP stream becomes a product); SOC2/audit-export for
larger buyers.
**Do not build first:** the p2p mobile app (least monetizable, 6PN
gives the tunnel free), a generic multi-cloud abstraction (premature),
or the hosted SaaS before Phase 0.
## Brand vs. revenue: the solo-dev / Linux instinct
The instinct to court Linux/hacker/solo-dev users and stay "not too
corporate" is **right for distribution, dangerous as strategy.**
- **Right:** it's how OSS infra gets discovered and trusted (HN, stars,
word-of-mouth, security-circle vouching); authenticity is a real moat
vs. the corporate players *because the architecture sincerely embodies
it* (local-first, `$HOME` trust boundary, no phone-home); and it fits
the founder.
- **Dangerous:** that audience is the lowest-WTP cohort that exists
(self-hosts the free thing, forks rather than pays), and "not too
corporate" reads to a VP of Eng as "not enterprise-ready." Building an
anti-SaaS brand and then shipping a paid tier invites the sell-out /
rug-pull backlash — which **Daytona just triggered** going closed.
**Resolution — be Tailscale, not a manifesto.** Use the developer-first,
respects-you energy as the *funnel*; sell *through* the solo advocate,
bottom-up, into the team that pays. Two guardrails:
1. "Anti-corporate" must not mean "anti-team-features." SSO/RBAC/audit
retention *are* the monetization; build them in a developer-respecting
way (Tailscale has SSO and is still beloved). Tone is the brand; team
features are the product.
2. Set the open-core social contract publicly **on day one** — core
sandbox open and self-hostable forever; hosted control plane is how
the lights stay on. The communities that don't revolt are the ones
told the deal upfront.
Concrete: the README frames the Docker/**Linux** backend as "legacy."
If courting the Linux crowd, make the Linux path (Docker+gVisor,
libkrun/smolmachines) first-class in the docs, not the fallback.
## Individuals, mobile, and the Pi-ecosystem reality check
"Individual devs won't pay" (above) is too blunt and needs refining.
The accurate claim: individuals won't pay for **safety-as-insurance**
(abstract risk reduction the labs give away free), but they *do* pay for
**capability/convenience felt daily** — Claude Pro, Cursor, Tailscale
Personal. "Drive my self-hosted agent from my phone" is capability, not
insurance, so it has a real (low-priced, high-churn) WTP profile. The
self-hoster/Linux crowd specifically pays for **sovereignty/control**,
just not for enterprise insurance. So an individual "sovereign remote
agent access" tier is *not* unreasonable in principle.
**But the market has already run that experiment, in public, for free.**
The Pi ecosystem (pi.dev) has commoditized every convenience layer an
individual product would charge for:
| Capability | Already free/OSS | bot-bottle differentiates? |
| :-- | :-- | :-- |
| Remote control from mobile | remote-pi, Paseo, TelePi | ❌ commoditized |
| Multi-agent orchestration from mobile | Paseo, pi-agent-dashboard | ❌ commoditized |
| **Launch** new agents from mobile | Paseo (`paseo run`) | ❌ commoditized |
| Launch into a **sandboxed, egress-audited** env | nobody | ✅ the moat |
Paseo (`getpaseo/paseo`, on the App Store) does the full thing an
individual remote-control tier would charge for — launch *and* attach
agents on a laptop/VM/dev-server, driven from mobile over an E2E relay —
free and open source. It *orchestrates* agents; it does **not** sandbox them, run
an egress chokepoint, DLP-scan, or audit. None of the Pi-ecosystem tools
do. So the residue, yet again, is **isolation + governance**, not
remote/launch convenience.
Two takeaways:
1. **Don't compete on orchestration/launch/remote UX** — it's a solved,
free, fast-moving, App-Store-shipping space around Pi. You won't win
it and it isn't the moat.
2. **Be the safe runtime orchestrators launch *into*.** Launch-from-mobile
is table stakes; *launch-into-a-sealed-egress-audited-bottle* is the
differentiator. bot-bottle is the sandbox an orchestrator like Paseo
would target, or that you wrap thin orchestration around — never the
orchestrator itself.
Capability layers commoditize fast: every individual/mobile angle
probed in this analysis collapsed back to the same cross-vendor +
sandbox + egress-audit + custody bundle. Mobile remote belongs as a
*funnel delighter* on top of the team product, not a standalone paid
line.
## Forge-native orchestration as the delivery vehicle
The strongest concrete *product shape* for the moat is not a bespoke
dashboard and not a Paseo competitor — it is **the git forge as the
orchestrator, with bot-bottle as the safe runtime it launches into.**
The forge already provides, for free, everything an orchestrator would
otherwise have to build: identity (agent/bot users, signed commits),
state (issues, labels, PRs/MRs, comments), triggers (webhooks, CI,
comment commands), review (diffs, approvals, status checks), audit
(commits/comments/reviews), and permissions (repo access, protected
branches, token scopes). bot-bottle supplies the one thing the forge
doesn't: **least-privilege, secret-isolated, audited execution of
untrusted agents.** Same moat (custody + audit + policy), better
vehicle — and it lands the product where teams already live, so it
avoids building an agent dashboard before one is needed.
The flow is essentially free to assemble:
```
issue/PR/MR event → webhook → policy/router → assign agent user +
branch/worktree → run agent in an isolated bottle (no ambient secrets)
→ commit as agent identity → open PR/MR → CI + human review + merge
```
**Crowding (why this is less saturated than it looks):**
| Layer | How crowded |
| :-- | :-- |
| Generic multi-agent orchestrators (worktree/TUI/dashboard) | very — 50100+ |
| Forge-native issue/PR/MR orchestration | moderate — ~1030 serious |
| Self-hostable, least-privilege, audited, forge-portable | **single digits** |
The deeper you go toward *untrusted-agent safety + auditability +
self-hostable + forge-portable*, the emptier it gets.
**The GitHub/GitLab first-party trap → lead Gitea + sovereignty.**
GitHub (Agentic Workflows, Copilot coding agent) and GitLab (Duo Agent
Platform) are the forge *vendors* building native issue-to-PR agent
orchestration with native identity/permissions/audit. On their turf you
lose the integration-depth battle the same way single-vendor agent
safety loses to Anthropic/OpenAI — the same "incumbent ships it free,
deeper" dynamic, one layer up. So the durable opening is **Gitea +
self-hosted** (no first-party agent platform exists — the open Gitea
feature request for an AI code agent confirms the vacuum) plus
**cross-forge *untrusted-agent* safety**, which no forge vendor will
build because they want you running *their* agent, not arbitrary ones
under uniform least-privilege across competitors' forges. Cross-vendor
neutrality, applied to forges.
**Buyer reconciliation.** The least-crowded opening (self-hosted Gitea)
overlaps the lowest-WTP crowd (indie self-hosters), while the paying
teams sit on GitHub/GitLab where first-party competition is fiercest.
The intersection that resolves it: **orgs running self-hosted forges for
sovereignty/compliance reasons** (regulated, air-gapped, security-
conscious, on-prem). They have budget, they run self-hosted GitLab/Gitea,
*and* shipping code to a cloud agent vendor is a non-starter — so "run
untrusted agents sandboxed, least-privilege, fully audited, inside our
forge, on our infra" is a procurement checkbox, not a nicety. That is
where "least-crowded" finally meets "has money."
**Separate moat-hard-parts from cost-hard-parts.** The orchestration
"hard parts" are two different things, and conflating them oversells the
fit:
| Moat (your differentiated strength) | Undifferentiated cost (everyone faces) |
| :-- | :-- |
| permission isolation | idempotency / dedupe / run ledger |
| secret handling under malicious prompts | concurrency, locks, cancellation |
| run provenance | queueing / scheduling / cleanup |
| policy language | merge-conflict handling (~27% agent-PR conflict rate) |
The right column is generic distributed-systems plumbing that wins you
nothing and that merge-conflict resolution especially is a *different
competency* from sandbox/custody. Keep it thin in the MVP; do not build a
policy DSL + durable ledger + conflict resolver before one org pays.
**The killer feature: run provenance on every agent PR.** A check/comment
answering — which agent, which model, which prompt, which base commit,
which policy, which tools, which network egress, which test results —
attached at the moment a human reviews. It renders the (invisible)
custody + egress-audit work as a PR artifact the buyer sees at the exact
trust-decision point. No forge vendor's first-party agent will show you
"here is everything the untrusted agent could reach." Build this first.
**MVP** (`@bot-bottle fix this`): create an isolated worktree/bottle →
check out the issue branch → run the selected harness as a named agent
user → deny ambient secrets by default → record prompt/model/tools/policy
→ commit with bot identity → open PR/MR → attach the run-provenance
footer (log + tests + permission/egress summary) → require human merge.
The security model *is* the product. This rides the headless launch
primitive directly: webhook → `start --headless` into an isolated bottle
→ commit as agent identity → PR with provenance.
Open-core line is unchanged: the webhook/comment trigger stays free
(adoption); the sandboxed-execution + provenance + policy layer is the
paid governance.
## Risks to the thesis
- **Lab encroachment.** If Anthropic/OpenAI add cross-agent governance
or open their managed egress logs, the wedge narrows. Mitigate by
going deep on cross-vendor + custody + audit *now*, while they're
single-vendor.
- **Rug-pull dependency.** You run the labs' agents; they can restrict
their agent to their own sandbox via ToS/tech. Hedge toward
open-source/open-weight agents for durability.
- **Northflank (or E2B) ships agent-aware audit.** Plausible from the
infra side. Your defense is agent-awareness + the supervise approval
loop + cross-vendor, not raw egress visibility.
- **WTP may simply not be there.** The honest failure mode: teams like
the audit but won't pay because "we already sandbox in CI." Phase 0
exists to find this out cheaply before building Phase 2/3.
- **Forge-vendor encroachment (forge-native path).** GitHub Agentic
Workflows / Copilot and GitLab Duo are first-party and deepening.
Defense: aim at self-hosted Gitea + sovereignty buyers where no
first-party agent platform exists, and at cross-forge untrusted-agent
neutrality the vendors won't build. Don't fight them GitHub-native.
- **Orchestration-reliability scope creep.** The forge-native build
drags in idempotency, queueing, concurrency, and merge-conflict
handling — undifferentiated plumbing that isn't the moat. Keep it thin
until a paying org forces it.
## Recommendation
Build Phase 1 now — it's low-risk, half-built, and the proof artifact.
Run Phase 0 in parallel. Treat a clear yes from 510 teams as the
green light for the hosted tier; treat a soft maybe as a signal to stay
an excellent OSS tool with a tip-jar/support model rather than a
venture-shaped SaaS. The technology is not the risk — the codebase is
exemplary and the architecture already supports the pivot. The risk is
**positioning discipline**: sell cross-vendor fleet governance to teams,
use the indie brand as the funnel, and never let the anti-corporate
aesthetic veto the features that pay.
## Sources
- Anthropic — Claude Code sandboxing:
https://www.anthropic.com/engineering/claude-code-sandboxing
- OpenAI Codex — cloud environments:
https://developers.openai.com/codex/cloud/environments ;
custom-image feature request:
https://community.openai.com/t/feature-request-custom-docker-images/1265333
- GitHub Copilot — custom container image (not supported), discussion
#194105: https://github.com/orgs/community/discussions/194105
- DeepInspect — AI egress monitoring:
https://www.deepinspect.ai/blog/ai-egress-monitoring
- Braintrust — AI agent observability/alerting:
https://www.braintrust.dev/articles/best-ai-agent-observability-tools-2026
- E2B (OSS, Apache-2.0): https://github.com/e2b-dev/e2b ;
infra/self-host: https://github.com/e2b-dev/infra
- Daytona going closed source:
https://www.daytona.io/dotfiles/updates/daytona-is-going-closed-source
- Northflank — BYOC / egress gateways:
https://northflank.com/blog/what-is-byoc-in-cloud-computing ;
https://northflank.com/blog/self-hostable-alternatives-to-e2b-for-ai-agents
- Modal Sandboxes: https://modal.com/products/sandboxes
- AI agent orchestration / enterprise governance (75% cite
auditability):
https://viston.tech/ai-agent-orchestration-in-2026-moving-from-pilots-to-enterprise-wide-execution/
- Pi harness (provider-agnostic CLI): https://pi.dev/packages/remote-pi ;
https://github.com/earendil-works/pi
- Paseo (launch + attach agents from desktop/mobile, OSS):
https://github.com/getpaseo/paseo ;
https://apps.apple.com/us/app/paseo-remote-coding-agents/id6758887924
- pi-agent-dashboard (mobile-first remote control via mDNS/zrok):
https://github.com/BlackBeltTechnology/pi-agent-dashboard
- TelePi (Telegram remote control for Pi):
https://futurelab.studio/blog/telepi-telegram-remote-control-for-pi/
- Forge-native landscape (provided via conversation, not independently
re-verified):
- awesome-agent-orchestrators (50+ generic orchestrators):
https://github.com/andyrewlee/awesome-agent-orchestrators
- GitHub Agentic Workflows (first-party repo automation):
https://github.blog/ai-and-ml/automate-repository-tasks-with-github-agentic-workflows/
- GitLab Duo Agent Platform GA:
https://ir.gitlab.com/news/news-details/2026/GitLab-Announces-the-General-Availability-of-GitLab-Duo-Agent-Platform/default.aspx
- ai-review (cross-forge review incl. Gitea):
https://github.com/Nikita-Filonov/ai-review
- Gitea feature request — AI code agent (the vacuum):
https://github.com/go-gitea/gitea/issues/34527
- Phoenix — safe GitHub issue resolution (label-based webhook state
machine): https://arxiv.org/abs/2606.20243
- AgenticFlict — ~27% merge-conflict rate in agent PRs:
https://arxiv.org/abs/2604.03551