Varol Cagdas Tok

Personal notes and articles.

SSRF: Protocol Schemes Beyond HTTP

Many server-side HTTP clients are built on libraries that support multiple URL schemes. When user input controls the full URL, including the scheme, the attacker is not limited to HTTP. Each supported scheme exposes different capabilities and reaches different internal resources.


file://

The file:// scheme reads from the local filesystem. On Unix systems, file:///etc/passwd returns the list of local accounts. On Linux, file:///proc/self/environ returns the environment variables of the running process, which may include database credentials, API keys, or cloud provider credentials injected at container startup. file:///proc/self/net/tcp lists the TCP sockets visible to the process, including listening ones, so an attacker can enumerate services bound to the loopback interface without sending a single probe.
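To make the /proc/self/net/tcp point concrete, the sketch below decodes the table's hex-encoded fields and lists the sockets in the LISTEN state. The sample rows are invented for illustration, the function name listening_sockets is hypothetical, and the decoding assumes IPv4 entries written by a little-endian Linux kernel.

```python
import socket
import struct

# Invented sample in the layout of /proc/net/tcp (not taken from a real host).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:18EB 00000000:0000 0A 00000000:00000000 00:00000000 00000000   999        0 20001 1 0000000000000000 100 0 0 10 0
   1: 0100007F:2BCB 00000000:0000 0A 00000000:00000000 00:00000000 00000000   998        0 20002 1 0000000000000000 100 0 0 10 0
   2: 0100007F:C350 0100007F:18EB 01 00000000:00000000 00:00000000 00000000  1000        0 20003 1 0000000000000000 20 4 30 10 -1
"""

TCP_LISTEN = 0x0A  # state code the kernel prints for listening sockets


def listening_sockets(table: str) -> list[tuple[str, int]]:
    """Return (ip, port) pairs for every row in the LISTEN state."""
    result = []
    for line in table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        if int(fields[3], 16) != TCP_LISTEN:
            continue
        addr_hex, port_hex = fields[1].split(":")
        # The address is a 32-bit value printed in host byte order,
        # so repack it as little-endian before formatting it.
        ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
        result.append((ip, int(port_hex, 16)))
    return result


print(listening_sockets(SAMPLE))
```

With this sample, the two listening rows decode to ports 6379 and 11211 on 127.0.0.1, while the established connection in the third row is skipped.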

curl, Python's urllib, and Java's URL class all handle file:// by default unless it is explicitly disabled, as do many other general-purpose URL-fetching utilities.
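As a small demonstration of that default behavior in Python, the sketch below passes a file:// URL to urllib.request.urlopen, the same call an application might use for remote fetches. The fetch helper and the file contents are hypothetical stand-ins; the file is a throwaway created only for this example.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def fetch(url: str) -> bytes:
    """Stand-in for a server-side fetcher that trusts the caller's URL."""
    with urlopen(url) as response:
        return response.read()


# Write a throwaway file, then read it back through a file:// URL.
with tempfile.TemporaryDirectory() as tmp:
    secret = Path(tmp).resolve() / "secret.txt"
    secret.write_bytes(b"DB_PASSWORD=example\n")
    file_url = secret.as_uri()  # an absolute path rendered as file:///...
    leaked = fetch(file_url)

print(leaked)
```

No option had to be enabled for the local read to succeed, which is why the scheme has to be rejected before the URL reaches the client.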


gopher://

Gopher is a pre-web document retrieval protocol. Its URL format allows the client to send arbitrary bytes to a TCP port after the connection is established:

gopher://internal-host:6379/_%0d%0aFLUSHALL%0d%0a

The client connects to port 6379 (the Redis default), discards the underscore, which sits in the position of the gopher item-type character, and sends the percent-decoded remainder as the data stream. Redis accepts inline commands as newline-terminated ASCII text, so a gopher URL can carry a complete sequence of commands. SET, CONFIG SET dir, and CONFIG SET dbfilename, followed by SAVE, can together write an SSH public key into an authorized_keys file or a cron job into a cron directory, provided the Redis process has write access to that location (for example, when it runs as a privileged user).
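When reviewing logs or building filters, it helps to see what a gopher URL turns into on the wire. The sketch below decodes a URL into the host, port, and bytes a client would write. The function name gopher_payload is hypothetical, and the logic is a simplification: real clients may append a terminating CRLF or treat tabs and fragments differently.

```python
from urllib.parse import unquote_to_bytes, urlsplit


def gopher_payload(url: str) -> tuple[str, int, bytes]:
    """Return (host, port, data) that a gopher client would send for `url`."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL")
    # path[0] is "/" and path[1] is the item-type character; both are dropped.
    selector = parts.path[2:]
    if parts.query:
        # urlsplit separates the query, but gopher treats it as selector text.
        selector += "?" + parts.query
    port = parts.port or 70  # 70 is the registered gopher port
    return parts.hostname or "", port, unquote_to_bytes(selector)


host, port, data = gopher_payload("gopher://internal-host:6379/_%0d%0aFLUSHALL%0d%0a")
print(host, port, data)
```

For the example URL, the decoded data is a bare CRLF followed by FLUSHALL and another CRLF, which Redis reads as an inline command.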

The same primitive applies to other text-based protocols: Memcached, SMTP (sending email from the server), and HTTP itself (constructing arbitrary POST requests to internal APIs that cannot be reached via standard HTTP SSRF because they require specific headers or POST bodies).

curl supports gopher natively. Python 3's urllib does not, although an application can add support by installing a custom handler. Whether gopher is available therefore depends on how the application's HTTP client library was built and configured.


dict://

The dict:// scheme is used by the DICT protocol (RFC 2229) for dictionary lookups. In SSRF, it is used to send one line of text to a TCP port:

dict://internal-host:11211/stats

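This sends stats\r\n to port 11211 (Memcached), and the response contains the Memcached statistics output. Depending on the client, the command may be accompanied by protocol lines of its own; curl, for example, also sends a CLIENT line before it and QUIT after it. dict:// is more limited than gopher:// because it sends only a single command in a constrained format, but it can remain available in clients where gopher:// has been disabled.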


Scheme Restriction as Defense

The primary control against protocol scheme abuse is allowlisting accepted schemes at the point where the URL is validated. If the application only fetches remote images, only https:// should be accepted. Any URL with a different scheme should be rejected before the HTTP client processes it.
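A minimal sketch of such a check in Python, assuming the image-fetching case described above. The names ALLOWED_SCHEMES and is_fetchable are hypothetical, and the function covers only the scheme; it says nothing about which hosts the URL may point to.

```python
from urllib.parse import urlsplit

# Assumption: the application only ever needs to fetch remote images.
ALLOWED_SCHEMES = frozenset({"https"})


def is_fetchable(url: str) -> bool:
    """Accept a URL only if it parses to an allowlisted scheme with a host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed input, e.g. an unbalanced IPv6 bracket
    # urlsplit lowercases the scheme, so the comparison is case-insensitive.
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)


for candidate in (
    "https://example.com/cat.png",
    "file:///etc/passwd",
    "gopher://internal-host:6379/_x",
    "dict://internal-host:11211/stats",
):
    print(candidate, is_fetchable(candidate))
```

An allowlist is used rather than a blocklist so that schemes nobody thought to enumerate are rejected by default.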

This check must run on the parsed, normalized URL, not on the raw string. Schemes are case-insensitive, so GOPHER:// and gopher:// are equivalent; a parser that normalizes the scheme to lowercase, or a case-insensitive comparison, handles this. Percent-encoded scheme characters (%67opher) are another way past raw-string matching: if any layer between validation and the HTTP client decodes the URL, the check must run on the decoded form, which is the value the client actually receives.
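The difference between matching the raw string and checking the parsed scheme can be shown in a few lines. In this sketch, raw_check and parsed_check are hypothetical helpers, and unquote stands in for whatever layer might decode the URL before the client sees it.

```python
from urllib.parse import unquote, urlsplit

ALLOWED_SCHEMES = frozenset({"https"})


def raw_check(url: str) -> bool:
    """Fragile: case-sensitive prefix match against the raw string."""
    return not url.startswith(("gopher://", "file://", "dict://"))


def parsed_check(url: str) -> bool:
    """Sturdier: compare the parsed, lowercased scheme against an allowlist."""
    try:
        return urlsplit(url).scheme in ALLOWED_SCHEMES
    except ValueError:
        return False


for candidate in ("GOPHER://internal-host:6379/_x", "%67opher://internal-host:6379/_x"):
    decoded = unquote(candidate)  # what a later decoding layer would produce
    print(candidate, raw_check(candidate), parsed_check(candidate), parsed_check(decoded))
```

Both candidates slip past raw_check, while parsed_check rejects them in their original and decoded forms. The percent-encoded one is rejected because it has no recognizable scheme until it is decoded, which is the reason to validate the same form that is handed to the client.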