From c74bd5cf26ec782fa497a9a91ad68619a21199f1 Mon Sep 17 00:00:00 2001
From: didericis
Date: Fri, 8 May 2026 00:00:51 -0400
Subject: [PATCH] docs: add research note on multi-encoding secret exfil tripwires

Co-Authored-By: Claude Opus 4.7
---
 .../secret-exfil-tripwire-encodings.md | 459 ++++++++++++++++++
 1 file changed, 459 insertions(+)
 create mode 100644 docs/research/secret-exfil-tripwire-encodings.md

diff --git a/docs/research/secret-exfil-tripwire-encodings.md b/docs/research/secret-exfil-tripwire-encodings.md
new file mode 100644
index 0000000..d5c4636
--- /dev/null
+++ b/docs/research/secret-exfil-tripwire-encodings.md
@@ -0,0 +1,459 @@
+# Secret exfiltration tripwires: detecting a known string in many encodings
+
+Research into tooling that, given a known secret string, generates its
+representation in many common encodings so that simple string or regex matchers
+can detect the secret in outbound traffic or logged output, regardless of
+which naive encoding a misbehaving agent uses.
+
+## Summary
+
+- **No off-the-shelf tool does this exactly.** A search of GitHub, PyPI, npm,
+  and security tool indexes (awesome-lists, awesome-honeypots, awesome-yara,
+  awesome-canaries) found no project whose stated purpose is "given a secret,
+  emit N encoded forms suitable for use as grep or regex patterns against
+  outbound traffic." The capability exists in fragments across several adjacent
+  categories, none of which compose into the complete picture.
+- **The closest existing answer is YARA's `base64` / `base64wide` string
+  modifiers** (added in YARA 4.0). A single string declaration with both
+  modifiers generates all three base64 offset permutations automatically, in
+  plain and UTF-16LE form. Combined with separate `ascii` / `wide` variants,
+  this covers five of the most common encoding forms with no Python needed.
+- **Secret scanners** (gitleaks, trufflehog, detect-secrets, ggshield) solve
+  the inverse problem: detecting *unknown* secrets matching known patterns.
+  They are not designed to scan for a specific known literal in N encodings.
+  Gitleaks (v8.20+) and TruffleHog do perform multi-encoding decodes before
+  running their detectors, but only as a preprocessing step, not as a way to
+  produce N encoded forms for downstream matchers.
+- **Canary token services** (Thinkst Canarytokens, OpenCanary) are callback
+  canaries: detection fires when the token itself is accessed and phones home.
+  They do not scan outbound streams for encoded representations of the canary
+  value. They address a different threat model.
+- **Enterprise DLP** products (Microsoft Purview, Symantec DLP/Broadcom,
+  Nightfall, Cyberhaven) perform encoding-aware matching internally as a
+  black-box feature. The capability is real but not exposed as an API and
+  not available for use in a self-hosted container sidecar. Symantec DLP
+  explicitly does not decode base64 or ROT13 in all inspection paths due
+  to processing overhead concerns.
+- Rolling this in under 200 lines of Python is feasible and is probably the
+  right path for claude-bottle v1. The limiting factor is not the encoding
+  logic, which is straightforward, but the false-positive rate from common
+  base64 alphabet collisions and the zero coverage against any re-encoding
+  that involves a key (encryption) or destroys byte boundaries (packet
+  splitting).
+
+---
+
+## Encoding catalog
+
+The following encodings are the realistic candidates for a naive agent
+performing unintentional or low-sophistication exfiltration. For each,
+the "FP risk" column notes the false-positive risk in a pattern matcher
+scanning general HTTP request bodies or log lines.
+
+| # | Encoding | Notes | FP risk |
+|---|----------|-------|---------|
+| 1 | Raw UTF-8 bytes (literal) | Baseline. Exact substring match. | Lowest |
+| 2 | Base64 standard (RFC 4648 §4) with `=` padding | Up to 3 offset variants depending on byte alignment at encode boundary; YARA covers all three with `base64` modifier. | Low–medium; base64 collisions are rare for 16+ char secrets |
+| 3 | Base64 without padding | Same byte content, trailing `=` stripped. Matches appear inside longer base64 blobs; needs anchor-free search. | Medium |
+| 4 | URL-safe base64 (RFC 4648 §5) | `+` → `-`, `/` → `_`. Required by many JWT and OAuth token formats. A separate encoding class from standard base64. | Low |
+| 5 | Base64 of UTF-16LE encoding | Re-encode the secret as UTF-16LE, then base64. Produces different output than base64 of the UTF-8 bytes. In YARA, `wide` combined with `base64` covers this (`base64wide` instead converts the base64 output to UTF-16LE). | Low |
+| 6 | Base64 of UTF-16BE encoding | Same as above but big-endian byte order. Rare in practice but trivial to generate; include for completeness. | Very low |
+| 7 | Double base64 | Base64 applied twice. Some log-shipping pipelines or config-encoding layers do this naively. | Very low |
+| 8 | Hex lowercase (`0-9a-f`) | Every byte becomes two hex digits; trivially detectable with a fixed-length hex string search. | Low |
+| 9 | Hex uppercase (`0-9A-F`) | Same encoded value, different case. Separate regex needed if your matcher is case-sensitive. | Low |
+| 10 | Percent/URL encoding of all bytes | `%xx` for every byte. Browsers and curl typically only encode special chars; a misbehaving agent might encode all bytes. | Low in raw bodies; medium in URL query strings |
+| 11 | Percent encoding of non-ASCII bytes only | The common case. Only high-bytes and special chars encoded; printable ASCII runs verbatim. | Medium; overlaps with normal URL encoding |
+| 12 | Double percent encoding | `%` → `%25`, so `%41` → `%2541`. Common WAF bypass; appears in logged URLs after server-side decode. | Very low |
+| 13 | JSON string escaping | `"` → `\"`, `\` → `\\`, non-ASCII → `\uXXXX`. Relevant when a secret is serialized into a JSON payload body. | Medium; very common in request bodies |
+| 14 | HTML/XML entity encoding | `&amp;`, `&#xXX;`, `&#DDD;`. Relevant for form POST bodies and SOAP/XML egress. | Medium in HTML context; low in JSON |
+| 15 | UTF-16LE raw bytes | Interleaved NUL bytes; `ABC` becomes `41 00 42 00 43 00`. Visible in PCAP or raw log hex dumps. YARA `wide` modifier covers this. | Low in text content; medium in binary/multipart streams |
+| 16 | UTF-32LE raw bytes | Four bytes per character. Unusual in web payloads but trivial for an agent to produce via Python's `encode('utf-32-le')`. | Very low |
+| 17 | ROT13 | Caesar cipher with shift 13, applied to ASCII letters only; digits and special chars are unchanged. Weak obfuscation, cheap to detect. | Medium; common English words collide |
+| 18 | ROT47 | Extends ROT13 over printable ASCII range 33–126. Transforms digits and symbols too. Less collision-prone than ROT13. | Low |
+| 19 | gzip + base64 | `gzip(plaintext)` then base64-encode the binary output. Output is always recognizable by the `H4sI` base64 prefix for RFC 1952 gzip magic bytes. | Low; `H4sI` prefix is a cheap anchor |
+| 20 | zlib/deflate + base64 | `zlib.compress()` (DEFLATE with zlib header) then base64. Output starts with `eJy` or similar zlib magic prefix in base64. | Low; magic prefix detectable |
+| 21 | Leetspeak / character substitution | `e` → `3`, `a` → `@`, `o` → `0`, `i` → `1`, etc. No fixed mapping; generates many variants. High FP against non-secret content. | High; impractical to enumerate exhaustively |
+| 22 | Reversed bytes | Secret reversed character-by-character. Trivial, occasionally used as a confusion layer. | Low |
+| 23 | Space-separated characters | `SECRET` → `S E C R E T`. Defeats substring search; requires regex `S\s+E\s+C\s+R\s+E\s+T`. | Very low |
+| 24 | Null-separated characters (wide variant) | Like space-separated but with literal `\x00` bytes. Same as UTF-16LE for ASCII-only secrets. | Very low |
+| 25 | Base32 (RFC 4648 §6) | Used in TOTP seeds, some DNS exfil channels. Alphabet is `A-Z2-7`. Longer output than base64. | Low for secrets ≥ 10 chars |
+
+**Diminishing returns note.** Encodings 1–13 cover the vast majority of
+realistic naive-exfil vectors for an agent using standard Python or shell
+tools. Encodings 14–25 are worth including in a comprehensive scanner but
+individually contribute little marginal risk reduction.
+
+---
+
+## Adjacent category survey
+
+### Secret scanners
+
+The major open-source secret scanners (gitleaks, TruffleHog, detect-secrets,
+and ggshield) all solve the *inverse* problem: given a body of text, find
+strings that look like secrets using entropy analysis or regular expressions
+written for known credential formats. They are not designed to answer "does
+this payload contain *this specific known value* in any encoding?"
+
+The distinction matters. These tools look for patterns that match the
+structural form of, e.g., an AWS access key (`AKIA...`). They are not
+designed to take a user-provided literal and report all byte-equivalent
+encodings of it.
+
+That said, two of these tools have encoding-aware *preprocessing* steps
+that are directly relevant:
+
+- **Gitleaks** ([github.com/gitleaks/gitleaks](https://github.com/gitleaks/gitleaks))
+  added a `--max-decode-depth` flag in v8.20.0. When set to a non-zero
+  value, it recursively decodes segments of the input before running
+  regex detectors, supporting three encodings: hex, percent-encoding
+  (URL encoding), and base64. The purpose is to find secrets that have
+  been naively encoded before committing. This is functionally what
+  a content-tripwire needs to do, but hard-coded to Gitleaks' own
+  detector ruleset rather than user-supplied literals. The flag is off
+  by default.
+
+- **TruffleHog** ([github.com/trufflesecurity/trufflehog](https://github.com/trufflesecurity/trufflehog))
+  decodes four encoding types before running its detectors: UTF-8, UTF-16,
+  Base64, and Escaped Unicode (`\uXXXX` form). It also detects secrets
+  hidden in archived and compressed files. A Truffle Security blog post
+  from October 2024 documents this in detail
+  ([trufflesecurity.com/blog/secret-scanning-encoded-and-archived-data](https://trufflesecurity.com/blog/secret-scanning-encoded-and-archived-data)).
+  Same caveat as gitleaks: the decode step feeds the existing pattern
+  detectors, not a user-supplied literal search.
+
+- **detect-secrets** ([github.com/Yelp/detect-secrets](https://github.com/Yelp/detect-secrets))
+  and **ggshield** ([github.com/GitGuardian/ggshield](https://github.com/GitGuardian/ggshield))
+  do not appear to have multi-encoding decode steps; they operate on
+  the input text as-is.
+
+None of these tools exposes a "match this literal in N encodings" API.
+The closest workflow would be to feed a custom gitleaks rule that matches
+pre-computed encoded variants, but that requires generating those variants
+externally (i.e., the exact gap this research note addresses).
+
+### Canary token services
+
+Canary token services operate on a fundamentally different detection model
+and should not be confused with matcher canaries.
+
+**Callback canaries** work by embedding a unique URL or resource reference
+in a document, credential file, or environment variable. When an agent
+(or attacker) reads and uses the credential, the canary service receives
+an HTTP callback. The detection signal is the *access of the resource*,
+not the presence of an encoded form in an outbound byte stream.
+
+- **Thinkst Canarytokens** ([canarytokens.org](https://canarytokens.org) /
+  [github.com/thinkst/canarytokens](https://github.com/thinkst/canarytokens))
+  offers AWS key canaries, Azure login canaries, PDF canaries, and
+  many others. All rely on callback detection. A Canarytokens bypass
+  issue ([github.com/thinkst/canarytokens/issues/36](https://github.com/thinkst/canarytokens/issues/36))
+  specifically documents that an attacker who extracts the canary value
+  and uses it without triggering the callback URL (e.g., by sending the
+  raw credential string to an external API over a non-canary channel) can
+  bypass the detection entirely. This is the exact gap that encoding-aware
+  content inspection would close.
+
+- **OpenCanary** ([github.com/thinkst/opencanary](https://github.com/thinkst/opencanary))
+  is Thinkst's self-hosted daemon that mimics network services (SSH,
+  FTP, Telnet, HTTP, SMB, etc.) and alerts when they are probed. It is
+  a network-layer honeypot, not an outbound content scanner. Detection
+  is interaction-based, not encoding-aware content matching.
+
+- **IndicatorOfCanary** by HackingLZ
+  ([github.com/HackingLZ/IndicatorOfCanary](https://github.com/HackingLZ/IndicatorOfCanary))
+  is conceptually the nearest to what is needed: it is a red-team tool
+  for *detecting the presence of canary tokens inside files* before using
+  those files, to avoid triggering callback alerts. It searches for known
+  canary IoCs (callback domain patterns) in file metadata and content.
+  It is the adversary-side mirror image (red team detecting canaries
+  before they can be tripped), but it shows the art of the possible for
+  encoding-aware document inspection.
+
+**The gap**: no canary service offers a "here is your secret; here are
+12 encoded forms of it; ingest these into your egress scanner" API.
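
To make the missing "encoded forms of a secret" step concrete, here is a minimal sketch using only the Python standard library. The function names `base64_offset_variants` and `encoded_forms` and the form labels are hypothetical, chosen for illustration, and the sketch covers only a subset of the catalog above. The base64 helper trims the characters that depend on neighbouring bytes, so each variant can be found as a substring of a longer base64 blob.

```python
import base64
import codecs
import json
import urllib.parse
from typing import Optional


def base64_offset_variants(data: bytes, altchars: Optional[bytes] = None) -> list[str]:
    """Return the alignment-stable base64 substring of data for byte offsets 0, 1, 2."""
    variants = []
    for offset in range(3):
        padded = b"\x00" * offset + data
        text = base64.b64encode(padded, altchars=altchars).decode("ascii").rstrip("=")
        # A trailing partial group mixes secret bits with whatever follows: drop it.
        if len(padded) % 3:
            text = text[:-1]
        # Leading characters that carry bits of the filler bytes: drop them.
        variants.append(text[(0, 2, 3)[offset]:])
    return variants


def encoded_forms(secret: str) -> dict[str, list[bytes]]:
    """Map a (hypothetical) form name to the byte patterns a matcher should look for."""
    raw = secret.encode("utf-8")
    wide = secret.encode("utf-16-le")
    text_forms = {
        "base64_utf8": base64_offset_variants(raw),
        "base64url_utf8": base64_offset_variants(raw, altchars=b"-_"),
        "base64_utf16le": base64_offset_variants(wide),
        "hex_lower": [raw.hex()],
        "hex_upper": [raw.hex().upper()],
        "percent_all": ["".join(f"%{b:02X}" for b in raw)],
        "percent_reserved": [urllib.parse.quote(secret, safe="")],
        "json_escaped": [json.dumps(secret)[1:-1]],
        "rot13": [codecs.encode(secret, "rot_13")],
        "reversed": [secret[::-1]],
    }
    forms = {"raw_utf8": [raw], "raw_utf16le": [wide]}
    for name, values in text_forms.items():
        forms[name] = [value.encode("utf-8") for value in values]
    return forms


if __name__ == "__main__":
    for name, patterns in encoded_forms("my-secret-value").items():
        print(name, patterns)
```

The output is a plain mapping, so it could be serialized to JSON or turned into explicit scanner patterns. Forms that are not deterministic for a given input (for example gzip output, whose header can embed a timestamp) are deliberately left out of this sketch.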
+
+### Enterprise DLP
+
+Enterprise DLP products do perform encoding-aware content matching, but
+as an internal, closed-source capability:
+
+- **Symantec DLP (Broadcom)**
+  ([broadcom.com](https://www.broadcom.com/products/cybersecurity/information-protection/data-loss-prevention))
+  has a documented gap: an official Broadcom knowledge base article
+  explicitly states that Symantec DLP is not able to inspect and alert on
+  base64 and ROT13 encoded files in all inspection paths, citing processing
+  overhead as the reason
+  ([knowledge.broadcom.com/external/article/184415](https://knowledge.broadcom.com/external/article/184415/is-symantec-data-loss-prevention-dlp-abl.html)).
+  This is a documented limitation, not marketing copy.
+
+- **Microsoft Purview DLP**
+  ([learn.microsoft.com/en-us/purview/dlp-policy-reference](https://learn.microsoft.com/en-us/purview/dlp-policy-reference))
+  supports custom sensitive information types and trainable classifiers,
+  but the encoding-awareness of its content matching engine is not
+  publicly documented at the rule-authoring level. No public API exists
+  for generating encoded variant patterns.
+
+- **Nightfall AI** ([nightfall.ai](https://www.nightfall.ai/))
+  uses deep-learning classifiers rather than regex, with 100+ AI-based
+  detectors. It offers a REST API that accepts arbitrary strings and files
+  and returns findings. Its encoding-awareness is model-dependent and
+  not configurable by the caller. No "user-supplied literal + encoding
+  sweep" mode is documented.
+
+- **Cyberhaven** ([cyberhaven.com](https://www.cyberhaven.com/))
+  is notable for its data-lineage approach: it tracks data transformations
+  (copy, compress, rename, convert) and ties exfiltration events to
+  original sensitive files even after transformation. This is a more
+  powerful model than pure byte matching but requires a full endpoint
+  agent and cloud backend. Not suitable for a container sidecar.
+
+The enterprise DLP space confirms that encoding-aware detection is a solved
+problem at enterprise scale, but the implementations are closed-source
+SaaS products, require endpoint agents, or are not configurable with
+user-supplied literals.
+
+### Pentest / red-team encoding generators
+
+Several red-team tools generate many encodings of a payload, treating
+encoding as a *generation* problem rather than a detection problem. They
+are directly useful for producing the encoding catalog needed to build
+tripwire patterns.
+
+- **hURL** ([github.com/fnord0/hURL](https://github.com/fnord0/hURL))
+  is a command-line encoder/decoder supporting URL encoding, double URL
+  encoding, base64, HTML entities, ASCII-to-hex, integer-to-hex, ROT13,
+  and SHA family hashes. It is packaged in Kali Linux (`apt install hurl`).
+  It does not produce an "all encodings of this string" output in one
+  command (each encoding is a separate invocation flag), but the
+  encoding catalog it covers aligns well with the practical catalog above.
+
+- **CyberChef** ([gchq.github.io/CyberChef](https://gchq.github.io/CyberChef) /
+  [github.com/gchq/CyberChef](https://github.com/gchq/CyberChef))
+  is the GCHQ "cyber Swiss Army Knife," a browser-based tool with 400+
+  encoding/decoding/transformation operations. It can be scripted via
+  the `cyberchef-node` npm package
+  ([github.com/nicowillis/cyberchef-node](https://github.com/nicowillis/cyberchef-node))
+  to generate many encodings programmatically. The community recipe list
+  ([github.com/mattnotmax/cyberchef-recipes](https://github.com/mattnotmax/cyberchef-recipes))
+  is a good reference for the encoding chains real attackers use. CyberChef
+  is the best single reference for what an exhaustive encoding catalog
+  looks like in practice.
+
+- **Burp Suite Intruder** (PortSwigger, commercial with community edition)
+  has a payload processing rule chain in its Intruder module that can apply
+  sequences of encoding transformations (URL, HTML, base64, ASCII hex,
+  built-in strings) to a wordlist. Not scriptable outside Burp; primarily
+  useful for interactive enumeration during a pentest.
+
+- **wfuzz** ([github.com/xmendez/wfuzz](https://github.com/xmendez/wfuzz))
+  supports encoder plugins (base64, urlencode, md5, sha1, double-urlencode,
+  html, etc.) that can be chained with the `@` syntax in payload specs.
+  It is a brute-force fuzzer, not a pattern generator, but its encoder
+  catalog is a useful reference list.
+
+None of these tools emit "N regex patterns for detecting this secret in
+any of its encoded forms in an outbound stream." They are all generation
+tools for attacks, not detection tools for defense.
+
+---
+
+## YARA string modifiers: the closest existing answer
+
+YARA ([virustotal.github.io/yara-x](https://virustotal.github.io/yara-x) /
+[yara.readthedocs.io](https://yara.readthedocs.io/en/stable/writingrules.html))
+has the most complete existing treatment of "match this string in multiple
+encoding forms" via its text-string modifier system. This was designed for
+malware detection in binary files and network captures, but the same logic
+applies to outbound traffic inspection.
+
+### Available modifiers
+
+Four modifiers apply directly to the encoding problem:
+
+- **`ascii`**: match the string as raw ASCII/UTF-8 bytes. This is the default
+  when no modifiers are specified.
+- **`wide`**: match the string in UTF-16LE form (each ASCII byte interleaved
+  with a NUL byte). Designed for detecting strings in Windows PE binaries.
+- **`base64`**: generate all three base64 offset permutations of the string
+  and search for any of them. The three permutations arise because base64
+  encodes 3 bytes at a time; depending on whether a string starts at byte
+  offset 0, 1, or 2 within a 3-byte group, its bits fall on different 6-bit
+  boundaries and produce a different character sequence.
+  YARA computes all three at compile time and emits patterns for each,
+  so the rule author does not need to pre-compute them.
+- **`base64wide`**: same as `base64`, but the resulting base64 text is then
+  converted to UTF-16LE (wide) form. This covers base64 output that is
+  itself stored as a wide string. To match base64 of the UTF-16LE form of
+  the secret, combine `wide` with `base64`: `ascii` and `wide` are applied
+  to the string first, then `base64` / `base64wide`.
+
+Modifiers can be combined on a single string declaration, with one caveat:
+when `base64` or `base64wide` is present, the plaintext `ascii` / `wide`
+forms are not searched, so they need a second declaration. A rule that
+covers all four of these forms looks like:
+
+```yara
+rule secret_tripwire {
+    strings:
+        // Plaintext forms: raw ASCII/UTF-8 and raw UTF-16LE.
+        $plain = "my-secret-value" ascii wide
+        // Encoded forms: ascii/wide are applied first, then base64/base64wide.
+        $b64 = "my-secret-value" ascii wide base64 base64wide
+    condition:
+        any of them
+}
+```
+
+YARA generates these patterns at compile time: raw UTF-8 and raw UTF-16LE
+from `$plain`, and from `$b64` three offset permutations for each base64
+combination (`base64` and `base64wide` of the UTF-8 bytes, and the same
+for the UTF-16LE bytes).
+
+A fifth modifier, **`xor`** (added in YARA 3.8), searches for single-byte XOR
+obfuscated variants of the string across all single-byte keys (0x00 through
+0xff by default, so the plaintext is included). The `xor`
+modifier cannot be combined with `base64` or `base64wide` in a single string
+declaration (it causes a compiler error). To cover both XOR and base64, two
+separate string declarations are required.
+
+**Custom base64 alphabets:** The `base64` and `base64wide` modifiers accept
+an optional 64-character custom alphabet string. This covers URL-safe
+base64 (`-_` substituted for `+/`) and any custom alphabets.
+
+### Limitations of YARA for this use case
+
+- YARA does not natively cover hex encoding, percent encoding, JSON string
+  escaping, gzip+base64, ROT13, or the other entries in the encoding catalog
+  above. Those would require pre-computing the encoded forms externally and
+  writing them as explicit hex-pattern strings in the rule.
+- YARA operates on files or byte buffers passed by the calling application;
+  it does not natively hook network streams. Integration with a proxy or
+  a log-scanning pipeline requires an application layer to call
+  `libyara` or the `yara-python` bindings on each captured request body.
+- YARA's `base64` modifier has a documented minimum-length constraint: strings
+  shorter than three characters cannot be base64-matched reliably due to
+  the offset permutation math. This is unlikely to matter for real secrets
+  but worth noting.
+
+### DissectMalware/base64_substring
+
+The tool `base64_substring`
+([github.com/DissectMalware/base64_substring](https://github.com/DissectMalware/base64_substring))
+generates a YARA rule to find base64-encoded files containing a specific
+keyword, by enumerating all three offset permutations and emitting them as
+a YARA rule. This predates YARA's built-in `base64` modifier and is largely
+superseded by it, but the repository is useful as a reference for the
+permutation math.
+
+---
+
+## DIY sketch
+
+There is no off-the-shelf tool that takes a known secret and emits N patterns
+for outbound stream matching. The remaining question is how much work it would
+be to write one.
+
+A minimal `tripwire-encode` script in Python (~80–120 lines) would:
+
+1. Accept a secret string on stdin or as a CLI argument.
+2. Emit one encoded form per line (or a JSON object mapping encoding names
+   to encoded values) for encodings 1–20 from the catalog above.
+3. The encoding logic for each form is 1–4 lines of Python using the
+   standard library (`base64`, `codecs`, `urllib.parse`, `gzip`, `io`);
+   no third-party dependencies are required.
+4. For the YARA output mode, emit a `.yar` rule with one string declaration
+   per encoding (or rely on the `ascii`, `wide`, `base64`, and `base64wide`
+   modifiers for the first four, then add explicit hex-string patterns for
+   the remaining forms).
+
+A companion `tripwire-grep` script (~30–50 lines) would:
+1. Accept the secret (or the pre-computed encoding list) and a stream on stdin.
+2. Compile the encodings into a single `re.search` call or `bytes.find` loop.
+3. Exit non-zero and print the matching line/offset if any form is found.
+
+For a proxy integration, the same encoding set can be compiled once at
+container startup and injected into a mitmproxy addon or a small filter
+script that wraps the proxy log output.
+
+**Rough size estimate:** the full encoding catalog (25 entries) in Python
+requires roughly 80 lines of encoding logic, 30 lines of CLI scaffolding,
+and 40 lines of grep/match logic. Total under 200 lines.
+
+**Important caveat on ROT13 and leetspeak.** These encodings have
+non-trivial false-positive rates for short secrets and for secrets
+containing common English substrings. If the secret contains only
+alphanumerics, ROT13 and ROT47 patterns should be included but flagged
+at warning level rather than error level in the output.
+
+**The honest limitation of this entire approach** is that it is effective
+only against naive, deterministic encodings where a single fixed plaintext
+always produces the same fixed output. Any exfiltration method that
+involves:
+- A secret key (AES, ChaCha, XOR with a non-trivial key),
+- Compression with a novel dictionary,
+- Splitting the secret across multiple messages or packets,
+- Steganography (hiding data in image pixels, audio samples, whitespace),
+
+...will produce output with no detectable relationship to the input plaintext.
+For claude-bottle's threat model (an untrusted but not adversarially
+sophisticated agent that might accidentally or naively encode secrets before
+logging or posting them), naive-encoding detection covers the realistic concern.
+A deliberately adversarial agent with knowledge of its own monitoring is a
+different threat model entirely.
+
+---
+
+## Integration point in claude-bottle
+
+The natural choke point for encoding-aware content inspection in claude-bottle
+is the body of outbound HTTP/HTTPS requests, visible at the egress proxy layer.
+The egress-guard note documents smokescreen (Stripe,
+[github.com/stripe/smokescreen](https://github.com/stripe/smokescreen))
+and squid as sidecar proxy options. Both can be configured to pass request
+bodies through a content-inspection hook. Smokescreen has an experimental
+MITM mode with detailed HTTP logging; squid supports `icap_service` ICAP
+integration for content inspection. Either could be extended with a small
+filter script that runs the encoded-form search against each captured body.
+
+**SSH is opaque to a proxy.** SSH connections carry encrypted payloads;
+the proxy cannot inspect their contents. Egress SSH (e.g., git push to a
+permitted remote) is outside the reach of this defense unless the container
+is configured to log git command arguments before they execute, which would
+catch a literal secret passed on the command line but not a secret embedded
+in a commit object or passed via an environment variable to a subprocess.
+For v1, this defense should be scoped explicitly to HTTP/HTTPS request bodies
+and to logged shell command output, and SSH egress should be called out as
+a gap.
+
+---
+
+## Limitations and bypass surface
+
+This defense catches secrets in transit only when the encoding is deterministic
+and pre-computable. The following techniques defeat it entirely:
+
+- **Symmetric encryption with a key** (AES-CBC, ChaCha20, XOR with a 16+
+  byte key). The ciphertext has no statistical relationship to the plaintext.
+  Detection requires key recovery, which is out of scope for a content scanner.
+
+- **Splitting across messages or packets.** If a secret is split as
+  `SECRET[0:4]` in one request and `SECRET[4:]` in another, no single-
+  message content scanner will fire. Stateful stream reassembly would be
+  required, which is a much harder problem.
+
+- **Compression with a novel or pre-shared dictionary** (LZ4, Brotli,
+  zstd). Standard gzip/zlib produce recognizable magic-byte prefixes in
+  base64, but a custom compression configuration can produce output that
+  is entropy-high and format-unrecognizable.
+
+- **Steganography.** Encoding data into image LSBs, audio samples,
+  whitespace-as-binary in a text file, or Unicode homoglyphs. The output
+  looks like a legitimate artifact; no regex will find it.
+
+- **Hash-chained exfiltration.** Sending the SHA-256 of secret fragments
+  in sequence; a receiver who knows the secret can verify, but no string
+  scanner can reconstruct the original.
+
+- **Protocol-level encoding not visible at the HTTP body layer.** DNS
+  exfiltration (encoding data in query subdomains), ICMP covert channels,
+  or TCP sequence number steganography all operate below HTTP and are not
+  intercepted by an HTTP proxy at all. The egress-guard note covers DNS
+  separately via a controlled resolver.
+
+In the claude-bottle context, the primary realistic concern is an agent
+that naively embeds a secret in a log line, a curl argument, a JSON body,
+or a shell heredoc without specifically intending to obfuscate. All of the
+above bypass techniques require deliberate, adversarially-motivated engineering
+from a process that is assumed to be untrusted but not intentionally malicious.
+The encoding-aware tripwire is an appropriate control for this threat model,
+provided its scope is communicated clearly.
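
The scope boundary described above can be illustrated with a minimal matcher sketch. It assumes the encoded forms have already been computed elsewhere and are supplied as a mapping from a form name to byte patterns; the names `build_matcher` and `scan` are hypothetical, and the two forms used in the demo (raw and lowercase hex) stand in for the full catalog. It shows one direct hit, one encoded hit, and the split-across-messages case that a per-message scanner cannot see.

```python
import re


def build_matcher(forms: dict[str, list[bytes]]) -> re.Pattern:
    """Compile all encoded forms into one alternation, one named group per form."""
    groups = []
    for name, variants in forms.items():
        # re.escape keeps characters such as '+', '%' and '\' literal.
        alternation = b"|".join(re.escape(variant) for variant in variants)
        groups.append(b"(?P<" + name.encode("ascii") + b">" + alternation + b")")
    return re.compile(b"|".join(groups))


def scan(body: bytes, matcher: re.Pattern) -> list[str]:
    """Return the names of the forms found in one captured body or log line."""
    return sorted({match.lastgroup for match in matcher.finditer(body)})


if __name__ == "__main__":
    # Illustrative patterns for the secret "my-secret-value".
    matcher = build_matcher({
        "raw": [b"my-secret-value"],
        "hex_lower": [b"6d792d7365637265742d76616c7565"],
    })
    print(scan(b'{"token": "my-secret-value"}', matcher))
    print(scan(b"debug=6d792d7365637265742d76616c7565", matcher))
    # Split across two requests: each body is scanned alone, so neither fires.
    print(scan(b"part1=my-secr", matcher), scan(b"part2=et-value", matcher))
```

Form names must be valid Python identifiers because they are used as regex group names. Reporting which form fired, rather than a bare boolean, is one way to support the warning-versus-error distinction suggested for collision-prone encodings such as ROT13.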