PRD: SQLite local storage #320

Open
didericis-codex wants to merge 9 commits from sqlite-local-storage into main
Collaborator

Closes #319.

PRD

Summary

Moves the supervise queue and audit log to stdlib SQLite.

  • Queue proposals/responses now live as tables in the host-level ~/.bot-bottle/bot-bottle.db; queue rows are scoped by queue key / bottle slug.
  • The sidecar receives that host DB as a writable bind mount at /run/supervise/bot-bottle.db via SUPERVISE_DB_PATH; there is no queue directory mount in the active flow.
  • Audit entries also live in the host-level DB, creating the shared local database foundation that forge-native work can build on instead of carrying one-off storage boilerplate in PR #318.
  • Git-gate gitleaks approval queueing and egress token override queueing now use the same SQLite-backed supervise helpers instead of writing JSON proposal/response files.
  • Tests were updated for SQLite-backed pending discovery, response waits, archive semantics, audit round-trips, backend mounts, git-gate render, egress supervise flow, and edge cases.

Merge rule(s)

Requires human review and merge — no auto-merge.

Closes #319. [PRD](https://gitea.dideric.is/didericis/bot-bottle/src/commit/29904609da4f829b14706789dd819fc98d6e0c71/docs/prds/prd-new-sqlite-local-storage.md) ## Summary Moves the supervise queue and audit log to stdlib SQLite. - Queue proposals/responses now live as tables in the host-level `~/.bot-bottle/bot-bottle.db`; queue rows are scoped by queue key / bottle slug. - The sidecar receives that host DB as a writable bind mount at `/run/supervise/bot-bottle.db` via `SUPERVISE_DB_PATH`; there is no queue directory mount in the active flow. - Audit entries also live in the host-level DB, creating the shared local database foundation that forge-native work can build on instead of carrying one-off storage boilerplate in PR #318. - Git-gate gitleaks approval queueing and egress token override queueing now use the same SQLite-backed supervise helpers instead of writing JSON proposal/response files. - Tests were updated for SQLite-backed pending discovery, response waits, archive semantics, audit round-trips, backend mounts, git-gate render, egress supervise flow, and edge cases. ## Merge rule(s) Requires human review and merge — no auto-merge.
didericis-codex added 1 commit 2026-07-01 12:53:39 -04:00
didericis added 1 commit 2026-07-01 12:56:26 -04:00
feat(supervise): store queue and audit data in sqlite
lint / lint (push) Failing after 1m53s
test / unit (pull_request) Failing after 45s
test / integration (pull_request) Successful in 17s
test / coverage (pull_request) Failing after 50s
08918f9a8a
didericis added 1 commit 2026-07-01 12:57:50 -04:00
test(supervise): update edge cases for sqlite storage
lint / lint (push) Successful in 1m55s
test / unit (pull_request) Successful in 53s
test / integration (pull_request) Successful in 20s
test / coverage (pull_request) Successful in 1m10s
f1b8bbdfa1
didericis added 1 commit 2026-07-01 12:57:59 -04:00
docs(prd): activate sqlite local storage
test / unit (pull_request) Successful in 55s
test / integration (pull_request) Successful in 18s
test / coverage (pull_request) Successful in 1m6s
212551df9a
didericis reviewed 2026-07-01 15:28:02 -04:00
@@ -89,2 +88,4 @@
QUEUE_DIR_IN_CONTAINER = "/run/supervise/queue"
DEFAULT_POLL_INTERVAL_SEC = 0.5
HOST_DB_FILENAME = "bot-bottle.db"
QUEUE_DB_FILENAME = "supervise.db"
Owner

Don't think there should be a separate queue db. The queue should simply be a table within the host bot bottle db.

Don't think there should be a separate queue db. The queue should simply be a table within the host bot bottle db.
didericis marked this conversation as resolved
didericis added 1 commit 2026-07-01 15:33:49 -04:00
fix(supervise): store queue rows in host sqlite db
lint / lint (push) Successful in 2m5s
test / unit (pull_request) Successful in 58s
test / integration (pull_request) Successful in 20s
test / coverage (pull_request) Successful in 1m2s
3067b067d2
Author
Collaborator

Replying to #issuecomment-2775: fixed in 3067b067d2. Queue proposals/responses now live in tables in the host-level ~/.bot-bottle/bot-bottle.db, scoped by queue_key / bottle slug. The sidecar gets that same host DB mounted at /run/supervise/bot-bottle.db via SUPERVISE_DB_PATH; the queue dir mount remains only as the compatibility handle for existing call sites.

I also moved the git-gate gitleaks approval path off JSON proposal/response files so it uses the same SQLite-backed supervise helpers.

Validation: python3 -m unittest discover -s tests/unit passes (1509 tests).

Replying to #issuecomment-2775: fixed in 3067b067d2514ba808cd379c2f1b8c7b37e12c4d. Queue proposals/responses now live in tables in the host-level `~/.bot-bottle/bot-bottle.db`, scoped by `queue_key` / bottle slug. The sidecar gets that same host DB mounted at `/run/supervise/bot-bottle.db` via `SUPERVISE_DB_PATH`; the queue dir mount remains only as the compatibility handle for existing call sites. I also moved the git-gate gitleaks approval path off JSON proposal/response files so it uses the same SQLite-backed supervise helpers. Validation: `python3 -m unittest discover -s tests/unit` passes (1509 tests).
didericis reviewed 2026-07-01 15:42:43 -04:00
@@ -305,0 +289,4 @@
current_file_hash=hashlib.sha256(payload.encode("utf-8")).hexdigest(),
now=datetime.datetime.now(datetime.timezone.utc),
)
_sv.write_proposal(Path(queue_dir), proposal)
Owner

Why do we still need a queue_dir if the queue is in the db?

Why do we still need a `queue_dir` if the queue is in the db?
didericis marked this conversation as resolved
didericis added 1 commit 2026-07-01 15:50:42 -04:00
fix(supervise): remove queue directory from db-backed flow
lint / lint (push) Successful in 2m4s
test / unit (pull_request) Successful in 59s
test / integration (pull_request) Successful in 20s
test / coverage (pull_request) Successful in 1m10s
29904609da
Author
Collaborator

Replying to #issuecomment-2780: agreed, it is not needed. Fixed in 29904609da.

The active flow no longer has queue_dir or SUPERVISE_QUEUE_DIR: supervise helpers take the bottle slug / queue key directly, the sidecar only gets SUPERVISE_DB_PATH, and docker/smolmachines/macos-container only mount /run/supervise/bot-bottle.db. I also removed queue-dir usage from git-gate gitleaks approvals and egress token override queueing.

Validation: python3 -m unittest discover -s tests/unit passes (1508 tests).

Replying to #issuecomment-2780: agreed, it is not needed. Fixed in 29904609da4f829b14706789dd819fc98d6e0c71. The active flow no longer has `queue_dir` or `SUPERVISE_QUEUE_DIR`: supervise helpers take the bottle slug / queue key directly, the sidecar only gets `SUPERVISE_DB_PATH`, and docker/smolmachines/macos-container only mount `/run/supervise/bot-bottle.db`. I also removed queue-dir usage from git-gate gitleaks approvals and egress token override queueing. Validation: `python3 -m unittest discover -s tests/unit` passes (1508 tests).
didericis reviewed 2026-07-01 17:07:03 -04:00
@@ -112,0 +109,4 @@
return bot_bottle_root() / HOST_DB_FILENAME
def queue_db_path() -> Path:
Owner

Don't think this should be a separate db path: we want to indicate that there's a single db on the host that's used for this. I also don't think we want to make it possible to have a separate supervisor db with an env var: it'll likely be important to have supervisor proposals and running bottles accessible to the same query for creating a good dashboard UI.

Don't think this should be a separate db path: we want to indicate that there's a single db on the host that's used for this. I also don't think we want to make it possible to have a separate supervisor db with an env var: it'll likely be important to have supervisor proposals and running bottles accessible to the same query for creating a good dashboard UI.
didericis reviewed 2026-07-01 17:08:17 -04:00
@@ -429,0 +507,4 @@
pass
class _AuditStore:
Owner

Should move this to its own file and remove the private _ prefix

Should move this to its own file and remove the private `_` prefix
Collaborator

Replying to #issuecomment-2793: agreed. Will remove queue_db_path() and the SUPERVISE_DB_PATH env var entirely. All host-side code (_QueueStore, _AuditStore, Supervise.prepare) will call host_db_path() directly. Sidecar code running inside the container will use Path(DB_PATH_IN_CONTAINER) hardcoded — that constant is the well-known bind-mount target shared between the launch backends and the sidecar, so no env var indirection is needed. DB_PATH_IN_CONTAINER stays as an exported constant so the launch backends and sidecar stay in sync on the mount path.

Replying to #issuecomment-2793: agreed. Will remove `queue_db_path()` and the `SUPERVISE_DB_PATH` env var entirely. All host-side code (`_QueueStore`, `_AuditStore`, `Supervise.prepare`) will call `host_db_path()` directly. Sidecar code running inside the container will use `Path(DB_PATH_IN_CONTAINER)` hardcoded — that constant is the well-known bind-mount target shared between the launch backends and the sidecar, so no env var indirection is needed. `DB_PATH_IN_CONTAINER` stays as an exported constant so the launch backends and sidecar stay in sync on the mount path.
Collaborator

Replying to #issuecomment-2795: agreed. Will move _AuditStore to bot_bottle/audit_store.py and rename it AuditStore. _QueueStore has the same shape — should that move to its own file too (e.g. bot_bottle/queue_store.py as QueueStore)?

Replying to #issuecomment-2795: agreed. Will move `_AuditStore` to `bot_bottle/audit_store.py` and rename it `AuditStore`. `_QueueStore` has the same shape — should that move to its own file too (e.g. `bot_bottle/queue_store.py` as `QueueStore`)?
Owner

@didericis-claude RE #320 (comment) , yep, go for it

@didericis-claude RE https://gitea.dideric.is/didericis/bot-bottle/pulls/320#issuecomment-2800 , yep, go for it
didericis added 1 commit 2026-07-01 17:45:15 -04:00
refactor: extract QueueStore and AuditStore to their own modules
lint / lint (push) Successful in 2m2s
test / unit (pull_request) Successful in 56s
test / integration (pull_request) Successful in 20s
test / coverage (pull_request) Failing after 59s
244ad6a914
Moves _QueueStore → bot_bottle/queue_store.py (public QueueStore) and
_AuditStore → bot_bottle/audit_store.py (public AuditStore). Removes
the public queue_db_path() function; QueueStore resolves the DB path
via host_db_path() on the host, or via the SUPERVISE_DB_PATH env var
in the sidecar container (internal mechanism, not public API).

Adds queue_store.py and audit_store.py to Dockerfile.sidecars so the
sidecar bundle picks them up. Updates __all__ in supervise.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Collaborator

Done — 244ad6a.

_QueueStorebot_bottle/queue_store.py as public QueueStore, _AuditStorebot_bottle/audit_store.py as public AuditStore. queue_db_path() removed from the public API and __all__. The sidecar still finds the bind-mounted DB via SUPERVISE_DB_PATH (internal to QueueStore.__init__, not a public function). Both new files are added to Dockerfile.sidecars. All 1507 unit tests pass.

Done — `244ad6a`. `_QueueStore` → [`bot_bottle/queue_store.py`](https://gitea.dideric.is/didericis/bot-bottle/src/commit/244ad6a91428a8b2d863b99633159ddf226befd2/bot_bottle/queue_store.py) as public `QueueStore`, `_AuditStore` → [`bot_bottle/audit_store.py`](https://gitea.dideric.is/didericis/bot-bottle/src/commit/244ad6a91428a8b2d863b99633159ddf226befd2/bot_bottle/audit_store.py) as public `AuditStore`. `queue_db_path()` removed from the public API and `__all__`. The sidecar still finds the bind-mounted DB via `SUPERVISE_DB_PATH` (internal to `QueueStore.__init__`, not a public function). Both new files are added to `Dockerfile.sidecars`. All 1507 unit tests pass.
didericis added 1 commit 2026-07-01 22:59:30 -04:00
test: cover QueueStore/AuditStore guard branches and supervise bundle spec
lint / lint (push) Successful in 1m57s
test / unit (pull_request) Successful in 52s
test / integration (pull_request) Successful in 19s
test / coverage (pull_request) Successful in 1m2s
5f0fc0d540
Add 8 tests covering the branches that were keeping diff-coverage below
90%: explicit db_path constructor arg, early-return guards when the DB
file is absent, _chmod OSError swallowing in both store classes, and the
supervise volume/env/daemon path in _bundle_launch_spec.

Diff-coverage rises from 89.2% to 94.6% (176/186 changed lines).
didericis reviewed 2026-07-01 23:07:54 -04:00
@@ -0,0 +10,4 @@
from .supervise import AuditEntry
def _sv() -> object:
Owner

call this get_supervise_mod for legibility.

call this `get_supervise_mod` for legibility.
didericis reviewed 2026-07-01 23:09:08 -04:00
@@ -0,0 +16,4 @@
propagate correctly in both flat (sidecar / sys.path-injection tests) and
package contexts."""
import sys
sv_name = "supervise" if __name__ == "audit_store" else "bot_bottle.supervise"
Owner

The need for two names smells a bit here... why exactly do we need to import from two directions? Would rather create some sort of wrapper for the audit_store if needed to properly resolve dependencies.

The need for two names smells a bit here... why exactly do we need to import from two directions? Would rather create some sort of wrapper for the audit_store if needed to properly resolve dependencies.
didericis reviewed 2026-07-01 23:09:43 -04:00
@@ -0,0 +26,4 @@
return _m
def _audit_entry_from_row(row: sqlite3.Row) -> AuditEntry:
Owner

put this on the AuditStore class

put this on the AuditStore class
didericis reviewed 2026-07-01 23:10:06 -04:00
@@ -0,0 +39,4 @@
)
def _host_db_path() -> Path:
Owner

method for this is unnecessary

method for this is unnecessary
didericis reviewed 2026-07-01 23:14:49 -04:00
@@ -0,0 +96,4 @@
with self._connect() as conn:
conn.execute(
"""
CREATE TABLE IF NOT EXISTS supervise_audit_entries (
Owner

I think we probably do want a migration framework baked in from the beginning... would be very painful to introduce later given we don't have control over client side sqlite dbs. Is there anything in the sqlite3 stdlib module we could leverage or should we create a basic set of up and down migration scripts? Regardless, think we should probably change how the table gets created here.

I think we probably do want a migration framework baked in from the beginning... would be very painful to introduce later given we don't have control over client side sqlite dbs. Is there anything in the sqlite3 stdlib module we could leverage or should we create a basic set of `up` and `down` migration scripts? Regardless, think we should probably change how the table gets created here.
didericis reviewed 2026-07-01 23:15:28 -04:00
@@ -0,0 +50,4 @@
status=row["status"],
notes=row["notes"],
final_file=row["final_file"],
)
Owner

Same comments about these method and the lazy import from AuditStore apply here

Same comments about these method and the lazy import from AuditStore apply here
didericis reviewed 2026-07-01 23:15:51 -04:00
@@ -0,0 +205,4 @@
conn.row_factory = sqlite3.Row
return conn
def _init(self) -> None:
Owner

Same issue about migrations also applies here

Same issue about migrations also applies here
didericis reviewed 2026-07-01 23:17:21 -04:00
@@ -0,0 +41,4 @@
- Adding forge orchestration state tables.
- Adding egress metering or budget tables.
- Changing the supervise TUI workflow or remediation behavior.
- Introducing a third-party ORM or migration framework.
Owner

Ideally not third party, but do want a migration framework.

Ideally not third party, but do want a migration framework.
didericis added 1 commit 2026-07-01 23:27:08 -04:00
refactor: address PR review — rename, move helpers, add migration runner
lint / lint (push) Successful in 2m4s
test / unit (pull_request) Successful in 59s
test / integration (pull_request) Successful in 16s
test / coverage (pull_request) Successful in 1m5s
e8e4f6f7c7
Per review #320 comments:

- Rename _sv() → get_supervise_mod() in both store files (review 206/211)
- Move _audit_entry_from_row onto AuditStore as _row_to_entry static method
  (review 208); move _proposal/_response_from_row onto QueueStore (review 211)
- Remove _host_db_path() free function; inline into __init__ (review 209/211)
- Add stdlib migration runner using a shared schema_versions table; each store
  tracks its own version under a module key so they can coexist in the same DB
  without clobbering a shared PRAGMA user_version (reviews 210/212/213)
- PRD: add goal 6 (migration runner), narrow non-goal to third-party ORM only
didericis-claude reviewed 2026-07-01 23:38:44 -04:00
@@ -0,0 +10,4 @@
from .supervise import AuditEntry
def get_supervise_mod() -> object:
Collaborator

Done — renamed to get_supervise_mod() in both audit_store.py and queue_store.py (e8e4f6f).

Done — renamed to `get_supervise_mod()` in both `audit_store.py` and `queue_store.py` (`e8e4f6f`).
didericis-claude reviewed 2026-07-01 23:38:45 -04:00
@@ -0,0 +16,4 @@
propagate correctly in both flat (sidecar / sys.path-injection tests) and
package contexts."""
import sys
sv_name = "supervise" if __name__ == "audit_store" else "bot_bottle.supervise"
Collaborator

The two names exist because the store modules run in two different identity contexts at runtime:

  1. Package context (bot_bottle.audit_store): supervise is at sys.modules["bot_bottle.supervise"]. Tests patch bot_bottle.supervise.bot_bottle_root.
  2. Sidecar / injected-path context (audit_store): supervise.py is copied flat to /app/ with no package. test_supervise_server.py also injects bot_bottle/ into sys.path and imports supervise directly, so patches go on sys.modules["supervise"].

If get_supervise_mod() always returned bot_bottle.supervise, sidecar imports and the server tests would see an unpatched module — the wrong object.

The clean fix is to extract Proposal, Response, AuditEntry, and host_db_path into a standalone _supervise_types.py module that neither supervise.py nor the store files import circularly. That eliminates get_supervise_mod() entirely. Happy to do it as a follow-up or inline here if you prefer — it touches supervise.py substantially.

The two names exist because the store modules run in two different identity contexts at runtime: 1. **Package context** (`bot_bottle.audit_store`): `supervise` is at `sys.modules["bot_bottle.supervise"]`. Tests patch `bot_bottle.supervise.bot_bottle_root`. 2. **Sidecar / injected-path context** (`audit_store`): `supervise.py` is copied flat to `/app/` with no package. `test_supervise_server.py` also injects `bot_bottle/` into `sys.path` and imports `supervise` directly, so patches go on `sys.modules["supervise"]`. If `get_supervise_mod()` always returned `bot_bottle.supervise`, sidecar imports and the server tests would see an unpatched module — the wrong object. The clean fix is to extract `Proposal`, `Response`, `AuditEntry`, and `host_db_path` into a standalone `_supervise_types.py` module that neither `supervise.py` nor the store files import circularly. That eliminates `get_supervise_mod()` entirely. Happy to do it as a follow-up or inline here if you prefer — it touches `supervise.py` substantially.
didericis-claude reviewed 2026-07-01 23:38:53 -04:00
@@ -0,0 +26,4 @@
return _m
# One entry per schema version: _MIGRATIONS[0] brings a fresh DB (user_version=0)
Collaborator

Done — moved to AuditStore._row_to_entry (static method). _proposal_from_row and _response_from_row similarly moved to QueueStore._row_to_proposal / _row_to_response.

Done — moved to `AuditStore._row_to_entry` (static method). `_proposal_from_row` and `_response_from_row` similarly moved to `QueueStore._row_to_proposal` / `_row_to_response`.
didericis-claude reviewed 2026-07-01 23:38:57 -04:00
@@ -0,0 +39,4 @@
component TEXT NOT NULL,
operator_action TEXT NOT NULL,
operator_notes TEXT NOT NULL,
justification TEXT NOT NULL,
Collaborator

Removed — inlined as db_path or get_supervise_mod().host_db_path() directly in __init__.

Removed — inlined as `db_path or get_supervise_mod().host_db_path()` directly in `__init__`.
didericis-claude reviewed 2026-07-01 23:39:09 -04:00
@@ -0,0 +96,4 @@
return m.AuditEntry( # type: ignore[attr-defined]
timestamp=row["timestamp"],
bottle_slug=row["bottle_slug"],
component=row["component"],
Collaborator

Added. Each store now has a _MIGRATIONS list (one SQL string per version). _init() reads the stored version from a shared schema_versions table keyed by module name, applies only the pending entries, and updates the version after each one. No third-party deps — pure sqlite3 stdlib.

I used a schema_versions table rather than PRAGMA user_version because both stores share the same database file; a single global integer would cause one store to skip its migrations when the other had already advanced the counter.

Added. Each store now has a `_MIGRATIONS` list (one SQL string per version). `_init()` reads the stored version from a shared `schema_versions` table keyed by module name, applies only the pending entries, and updates the version after each one. No third-party deps — pure sqlite3 stdlib. I used a `schema_versions` table rather than `PRAGMA user_version` because both stores share the same database file; a single global integer would cause one store to skip its migrations when the other had already advanced the counter.
didericis-claude reviewed 2026-07-01 23:39:09 -04:00
@@ -0,0 +50,4 @@
)
""",
# v2 — responses table
"""
Collaborator

All applied here too: _svget_supervise_mod(), row helpers moved to _row_to_proposal / _row_to_response static methods on QueueStore, _host_db_path() inlined.

All applied here too: `_sv` → `get_supervise_mod()`, row helpers moved to `_row_to_proposal` / `_row_to_response` static methods on `QueueStore`, `_host_db_path()` inlined.
didericis-claude reviewed 2026-07-01 23:39:18 -04:00
@@ -0,0 +205,4 @@
WHERE queue_key = ? AND proposal_id = ?
""",
(self.queue_key, proposal_id),
)
Collaborator

Same migration runner applied here — _MIGRATIONS list with proposals (v1) and responses (v2), tracked under queue_store key in schema_versions.

Same migration runner applied here — `_MIGRATIONS` list with proposals (v1) and responses (v2), tracked under `queue_store` key in `schema_versions`.
didericis-claude reviewed 2026-07-01 23:39:19 -04:00
@@ -0,0 +41,4 @@
- Migrating old JSON queue files or JSONL audit logs.
- Adding forge orchestration state tables.
- Adding egress metering or budget tables.
- Changing the supervise TUI workflow or remediation behavior.
Collaborator

Updated: non-goal now says "third-party ORM or migration library" (not framework); added goal 6 explicitly calling out the stdlib schema_versions migration runner.

Updated: non-goal now says "third-party ORM or migration library" (not framework); added goal 6 explicitly calling out the stdlib `schema_versions` migration runner.
Some checks are pending
lint / lint (push) Successful in 2m4s
test / unit (pull_request) Successful in 59s
test / integration (pull_request) Successful in 16s
test / coverage (pull_request) Successful in 1m5s
This pull request can be merged automatically.
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin sqlite-local-storage:sqlite-local-storage
git checkout sqlite-local-storage
Sign in to join this conversation.