feat: support pipelock skip_scan_for_extensions config #191

Closed
didericis-claude wants to merge 3 commits from feat/pipelock-skip-scan-extensions into main
Collaborator

Refactors PipelockRoutePolicy to pass through raw pipelock configuration instead of parsing individual fields. This enables bottles to configure any pipelock option without updating bot-bottle code.

Changes

  • Changed PipelockRoutePolicy to store a Config dict
  • Updated pipelock.py and egress.py to extract values from Config
  • Added generic pipelock config merging for future extensibility
  • Removed strict validation that limited configurability
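
As a rough illustration of the first two items above, the policy might look something like the following. This is a minimal sketch, not the code in this PR: only the names PipelockRoutePolicy, Config, tls_passthrough and ssrf_ip_allowlist come from the description and commits; the dataclass layout and the two helper functions are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class PipelockRoutePolicy:
    """Assumed shape: keep the route's raw pipelock mapping untouched."""

    # Attribute name follows the PR text ("Config dict"); the layout is a guess.
    Config: dict[str, Any] = field(default_factory=dict)


def tls_passthrough(policy: PipelockRoutePolicy) -> bool:
    # Hypothetical helper: consumers read values out of Config
    # instead of a typed TlsPassthrough attribute.
    return bool(policy.Config.get("tls_passthrough", False))


def ssrf_ip_allowlist(policy: PipelockRoutePolicy) -> list[str]:
    # Hypothetical helper: a missing key falls back to an empty allowlist.
    return list(policy.Config.get("ssrf_ip_allowlist", []))
```

Because nothing is validated at this layer, unknown keys simply travel along in Config, which is the trade-off the last item describes.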

Investigation: Response scanning limits

Initially explored using response_body_scanning.skip_scan_for_extensions to skip DLP scanning for binary packages (.whl, .tar.gz) while keeping request scanning enabled. However, pipelock does not support per-host or per-extension response scanning rules: response scanning is a single global setting, so it cannot be relaxed for individual hosts or file types.

Workaround

For now, use tls_passthrough: true on hosts with large binary downloads (PyPI, npm, etc.). This sacrifices response DLP scanning but unblocks pip installs. For trusted hosts like PyPI, the risk is acceptable.

egress:
  routes:
    - host: files.pythonhosted.org
      pipelock:
        tls_passthrough: true
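
For reference, here is one way a route like the one above could be folded into a global pipelock config, following the generic merging listed under Changes. It is a sketch under assumptions: the function name, the return shape, and the "later routes win" rule are hypothetical, and the real logic in pipelock.py and egress.py may differ.

```python
from typing import Any

# Keys the commits describe as special-cased because they are already aggregated.
AGGREGATED_KEYS = {"tls_passthrough", "ssrf_ip_allowlist"}


def merge_pipelock_routes(
    routes: list[dict[str, Any]],
) -> tuple[dict[str, Any], list[str]]:
    """Return (generic settings, hosts that requested TLS passthrough)."""
    merged: dict[str, Any] = {}
    passthrough_hosts: list[str] = []
    for route in routes:
        # Routes without a pipelock block contribute nothing.
        config = route.get("pipelock") or {}
        if config.get("tls_passthrough"):
            passthrough_hosts.append(route["host"])
        for key, value in config.items():
            if key not in AGGREGATED_KEYS:
                # Assumption: on conflicting keys, the later route wins.
                merged[key] = value
    return merged, passthrough_hosts
```

With the files.pythonhosted.org route as the only input, this would yield an empty settings dict and a single passthrough host.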

Test plan

  • pyright: 0 errors
  • pylint: 9.93/10
  • pipelock config validation: OK

Related

Fixes pip package downloads in the claude-dev bottle. Pipelock maintainers should consider adding a response_body_scanning config block to support per-host response size limits or skipping scans for specific file extensions.

didericis-claude added 1 commit 2026-06-04 13:04:58 -04:00
feat: forward pipelock config dict instead of parsing individual fields
lint / lint (push) Failing after 1m32s
test / unit (pull_request) Failing after 37s
test / integration (pull_request) Successful in 42s
8601c686f3
- Change PipelockRoutePolicy to store raw pipelock config dict instead
  of individual coerced fields (TlsPassthrough, SsrfIpAllowlist)
- Update pipelock.py and egress.py to extract values from Config dict
- Simplifies manifest validation: pipelock handles its own schema
- Enables new pipelock options like skip_scan_for_extensions without
  updating bot-bottle code

This allows bottles to configure pipelock directly, e.g.:

  pipelock:
    skip_scan_for_extensions: [".whl", ".tar.gz"]

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
didericis added 1 commit 2026-06-04 13:15:08 -04:00
feat: add generic pipelock config merging for future extensibility
lint / lint (push) Failing after 1m26s
test / unit (pull_request) Failing after 36s
test / integration (pull_request) Successful in 44s
d90b04d343
- Merge arbitrary pipelock settings from routes into global config
- Allows routes to configure new pipelock options without code changes
- Special-case tls_passthrough and ssrf_ip_allowlist (already aggregated)

Note: Pipelock doesn't currently support per-path/per-host response
scanning rules or response size limits, so response_body_scanning config
is not yet usable. For now, use tls_passthrough for binary download hosts.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
didericis added 1 commit 2026-06-04 13:22:52 -04:00
test: update PipelockRoutePolicy tests for Config dict design
lint / lint (push) Successful in 1m29s
test / unit (pull_request) Successful in 37s
test / integration (pull_request) Successful in 49s
dee3600400
Replace typed-attribute assertions (TlsPassthrough, SsrfIpAllowlist)
with Config dict lookups, drop the four strict-validation tests that
were intentionally removed in the refactor, and add a
skip_scan_for_extensions test to cover the PR's stated new feature.
Owner

Closing in favor of stripping pipelock and adding custom mitmproxy rules -> #192

didericis closed this pull request 2026-06-04 14:25:25 -04:00