PRD 0041: Git HTTP request bounds #144

Merged
didericis merged 4 commits from prd-0041-git-http-request-bounds into main 2026-06-02 11:30:41 -04:00
2 changed files with 9 additions and 23 deletions
Showing only changes of commit 4319b4ef3b - Show all commits
+2 -2
View File
@@ -20,7 +20,7 @@ from urllib.parse import urlsplit
DEFAULT_PORT = 9420 DEFAULT_PORT = 9420
# Body-size cap matching supervise_server.py's 1 MiB limit. # Body-size cap matching supervise_server.py's 1 MiB limit.
_MAX_BODY_BYTES = 1 * 1024 * 1024 MAX_BODY_BYTES = 1 * 1024 * 1024
didericis marked this conversation as resolved Outdated
Outdated
Review

If the agent committed an image to the repo greater than 1 MiB, would it now fail? That's probably fine/good, and assets shouldn't be in a git repo generally, but could see this potentially being annoying for people if seo

If the agent committed an image to the repo greater than 1 MiB, would it now fail? That's probably fine/good, and assets shouldn't be in a git repo generally, but could see this potentially being annoying for people if seo
Outdated
Review

Yes — a push whose packfile body exceeds 1 MiB would 413. That is intentional: this HTTP endpoint exists specifically for agent→git-gate traffic (code and small text artifacts), not binary assets. 1 MiB matches the existing cap in supervise_server.py so the two channels are consistently bounded.

If it does become a pain point, _MAX_BODY_BYTES is a named constant and trivial to raise, but I would not widen it preemptively.

Yes — a push whose packfile body exceeds 1 MiB would 413. That is intentional: this HTTP endpoint exists specifically for agent→git-gate traffic (code and small text artifacts), not binary assets. 1 MiB matches the existing cap in `supervise_server.py` so the two channels are consistently bounded. If it does become a pain point, `_MAX_BODY_BYTES` is a named constant and trivial to raise, but I would not widen it preemptively.
class GitHttpHandler(BaseHTTPRequestHandler): class GitHttpHandler(BaseHTTPRequestHandler):
@@ -88,7 +88,7 @@ class GitHttpHandler(BaseHTTPRequestHandler):
if length < 0: if length < 0:
self.send_error(400, "Negative Content-Length") self.send_error(400, "Negative Content-Length")
return return
if length > _MAX_BODY_BYTES: if length > MAX_BODY_BYTES:
self.send_error(413, "Request body too large") self.send_error(413, "Request body too large")
return return
body = self.rfile.read(length) if length else b"" body = self.rfile.read(length) if length else b""
+7 -21
View File
@@ -7,30 +7,21 @@
## Summary ## Summary
Add Content-Length validation and a body-size cap to `git_http_backend.py` so Add Content-Length validation and a body-size cap to `git_http_backend.py` so malformed or oversized smart-HTTP requests fail cleanly rather than crashing the handler or exhausting memory.
malformed or oversized smart-HTTP requests fail cleanly rather than crashing
the handler or exhausting memory.
## Problem ## Problem
`bot_bottle/git_http_backend.py` calls `int(self.headers.get("Content-Length", `bot_bottle/git_http_backend.py` calls `int(self.headers.get("Content-Length", 0))` without catching `ValueError`. A request with a non-numeric Content-Length raises an unhandled exception in the request handler.
0))` without catching `ValueError`. A request with a non-numeric Content-Length
raises an unhandled exception in the request handler.
The handler reads the full declared length into memory before passing the body The handler reads the full declared length into memory before passing the body to `git http-backend` with no upper bound. A local or compromised client can force arbitrarily high memory use. For comparison, `supervise_server.py` caps request bodies at 1 MiB.
to `git http-backend` with no upper bound. A local or compromised client can
force arbitrarily high memory use. For comparison, `supervise_server.py` caps
request bodies at 1 MiB.
## Goals / Success Criteria ## Goals / Success Criteria
- A missing or non-numeric Content-Length returns HTTP 400. - A missing or non-numeric Content-Length returns HTTP 400.
- A negative Content-Length returns HTTP 400. - A negative Content-Length returns HTTP 400.
- A body larger than the cap (1 MiB, matching `supervise_server.py`) returns - A body larger than the cap (1 MiB, matching `supervise_server.py`) returns HTTP 413.
HTTP 413.
- Valid Git smart-HTTP pushes and fetches continue to work. - Valid Git smart-HTTP pushes and fetches continue to work.
- Unit tests cover: missing length, non-numeric length, negative length, - Unit tests cover: missing length, non-numeric length, negative length, over-cap length, and a valid push/fetch passthrough.
over-cap length, and a valid push/fetch passthrough.
## Non-goals ## Non-goals
@@ -52,17 +43,12 @@ Out of scope:
## Design ## Design
Wrap the Content-Length parse in a try/except and return 400 on `ValueError`. Wrap the Content-Length parse in a try/except and return 400 on `ValueError`. Add an explicit check for negative values. After parsing, compare the declared length against a module-level `MAX_BODY_BYTES` constant (default 1 MiB) and return 413 if exceeded. Read exactly `min(content_length, MAX_BODY_BYTES)` bytes.
Add an explicit check for negative values. After parsing, compare the declared
length against a module-level `_MAX_BODY_BYTES` constant (default 1 MiB) and
return 413 if exceeded. Read exactly `min(content_length, _MAX_BODY_BYTES)`
bytes.
## Testing Strategy ## Testing Strategy
- Unit tests using `unittest.mock` to drive the handler with crafted headers. - Unit tests using `unittest.mock` to drive the handler with crafted headers.
- Test cases: no Content-Length header, `Content-Length: abc`, `Content-Length: - Test cases: no Content-Length header, `Content-Length: abc`, `Content-Length: -1`, `Content-Length: 2097152` (over cap), and a normal small POST body.
-1`, `Content-Length: 2097152` (over cap), and a normal small POST body.
Run: Run: