Varol Cagdas Tok

Personal notes and articles.

Application-Layer Denial of Service

Volumetric attacks are operationally straightforward: send more traffic than the target can receive. Protocol attacks exploit specific state machine vulnerabilities at L4. Application-layer denial of service is a different problem. The traffic volume may be modest. The packets complete valid protocol handshakes. The requests conform to application protocol syntax. From every angle below the application layer, the traffic looks legitimate. What exhausts the target is what happens inside the application when it processes the request.

Moving from volume to computation cost changes the attack surface fundamentally. A volumetric attack can be mitigated upstream without examining application semantics. An application-layer attack cannot be distinguished from legitimate traffic without understanding the application, which requires processing the traffic at the application layer, which is precisely what the attacker is trying to overload.


The Cost Asymmetry in Application Processing

Every application operation has a cost profile: how much CPU it requires, how much memory it allocates, how long it holds database connections, how many I/O operations it performs. These costs are not uniform across operations. A static file read may cost microseconds. A complex search across an unindexed table may cost seconds. A resource that renders a high-resolution image on request may cost several hundred milliseconds of CPU.

The attack surface is the gap between the cost of requesting an operation and the cost of performing it. If a request costs the attacker 500 bytes of network traffic and costs the server 500 milliseconds of CPU, the attacker can exhaust server CPU with a request rate of roughly 2 requests per second per core. This is available to any internet-connected host without special tools or infrastructure.
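The arithmetic of the gap is worth making concrete. A small sketch using the figures from the paragraph above (the function names are mine, for illustration only):

```python
def saturating_request_rate(server_cost_s, cores):
    """Requests per second needed to consume all CPU on `cores` cores,
    when each request costs `server_cost_s` seconds of server CPU."""
    return cores / server_cost_s

def attacker_bandwidth_bps(rate_rps, request_bytes):
    """Attacker-side bandwidth required to sustain that request rate."""
    return rate_rps * request_bytes * 8

# 500 ms of server CPU per request, 500-byte requests, 8-core target:
rate = saturating_request_rate(0.5, 8)   # 16 requests/second in total
bw = attacker_bandwidth_bps(rate, 500)   # 64,000 bits/second upstream
```

Sixteen requests per second, at 64 kbit/s of attacker bandwidth, fully occupies an eight-core server: less traffic than a single audio stream.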

Three factors determine the size of this gap:

Operation complexity: operations that require traversing large data structures, performing cryptographic computation, or calling external services are more expensive than simple memory reads. Applications that expose expensive operations without authentication or rate limiting maximize the gap.

Input-dependent cost: operations whose cost depends on input provided by the requester are particularly dangerous. A search query whose performance degrades with input length, a regex whose backtracking behavior is input-dependent, or a deserialization routine whose cost scales with the nesting depth of the provided structure all allow the attacker to maximize cost by crafting inputs specifically to trigger the expensive case.

Lack of per-client cost accounting: an application that does not track how much resource a given client has consumed in a time window cannot enforce per-client limits. Without per-client accounting, rate limiting is imprecise: you can limit global request rates, but you cannot prevent a single client from consuming a disproportionate share.
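Per-client accounting can be as simple as a token bucket keyed by client identity, with expensive operations charging a higher cost. A minimal sketch (the capacity, rate, and cost numbers are arbitrary placeholders):

```python
import time

class TokenBucket:
    """Per-client budget: up to `capacity` units in a burst, refilled at
    `rate` units per second. The optional `now` parameter lets callers
    inject a clock for testing."""

    def __init__(self, capacity, rate, now=None):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = time.monotonic() if now is None else now

    def allow(self, cost=1.0, now=None):
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# One bucket per client identity; a search request might charge cost=5
# while a static page charges cost=1, reflecting the server-side cost gap.
buckets = {}

def check(client_id, cost=1.0):
    bucket = buckets.setdefault(client_id, TokenBucket(capacity=20, rate=5.0))
    return bucket.allow(cost)
```

Charging per-operation cost rather than counting raw requests is the point: it converts the server's cost profile into the accounting unit.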


HTTP as the Dominant Attack Surface

HTTP is the most common application-layer attack vector because it is the most widely exposed application protocol and because HTTP request processing in typical web applications is computationally expensive. A request triggers routing, middleware execution, database queries, template rendering, and response serialization. The cost of this pipeline, measured in CPU cycles and I/O operations, is several orders of magnitude greater than the cost of transmitting the request.

HTTP request floods, sending many HTTP requests at high rate from distributed sources, are the blunt version of this attack. The sophistication of this approach is limited: if the requests are for expensive resources, they exhaust server computation; if for cheap resources (static files served by a CDN), the CDN absorbs the load and the origin is unaffected. Effective HTTP floods target the resources that bypass caching and reach the application server.

Resources that are deliberately not cached are the most valuable targets: search endpoints, authenticated user-specific pages, real-time data feeds, checkout flows. These cannot be served from cache and must be processed by the application for each request. A crawler or automated tool that systematically requests these resources at high rate achieves significant damage at modest traffic volume.

Search Endpoint Exhaustion

Search is a particularly exposed operation in most web applications. Full-text search requires inverted index traversal. Relational database search without index coverage requires sequential table scans. Either operation scales with dataset size, and the cost is borne entirely by the server regardless of how many results are found.

Wildcard queries, prefix queries, and queries involving boolean combinations across multiple fields are typically more expensive than simple exact-match lookups. An attacker who understands the target application can construct queries that trigger the expensive cases: a search for %a%b%c% in a LIKE clause causes a full table scan on most database engines. An Elasticsearch query with a deeply nested boolean structure forces multi-pass evaluation.

The defense at the application layer involves query cost limits (Elasticsearch's indices.breaker.fielddata.limit and query circuit breakers; query timeout settings in relational databases), strict input validation that rejects structurally expensive queries, and authentication requirements for search functionality.
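The input-validation side can be plain structural checks before the query ever reaches the search backend. A sketch (the specific rules are illustrative, not a complete policy):

```python
MAX_QUERY_LEN = 128
MAX_TERMS = 8

def validate_search_query(q):
    """Reject query shapes that force expensive evaluation: over-long
    input, too many terms, and leading wildcards that defeat index
    usage in LIKE and prefix searches."""
    if len(q) > MAX_QUERY_LEN:
        return False
    terms = q.split()
    if not terms or len(terms) > MAX_TERMS:
        return False
    for term in terms:
        if term.startswith(("%", "*", "_")):  # leading wildcard: full scan
            return False
    return True
```

Rejecting at this boundary costs microseconds; letting the query through costs whatever the backend's worst case is.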


Slow HTTP Attacks

Slow HTTP attacks are a distinct category that emphasizes state exhaustion over computation exhaustion. The attacker does not attempt to overload server CPU; they attempt to occupy server connection slots by holding connections open for as long as possible while sending as little data as possible.

Slowloris

Slowloris was demonstrated by Robert "RSnake" Hansen in 2009 and named for the slow loris, a primate with slow, deliberate movements. The attack targets HTTP servers that use a threaded or process-per-connection model, where each connection occupies a thread or process for its entire duration.

The attack procedure:

  1. Open a connection to the target server.
  2. Send a partial HTTP request header: enough to indicate a request is in progress, but not the final \r\n\r\n that terminates the headers.
  3. At regular intervals, send another partial header line (e.g., X-a: b\r\n) to prevent the connection from timing out.
  4. Repeat from step 1 until the server's connection pool is exhausted.

The server holds each connection open waiting for the complete request headers. A typical web server timeout for header completion is 30–60 seconds. By sending a partial header line every 10–15 seconds, the attacker keeps the connection alive indefinitely. Each connection occupies a thread in Apache's prefork or worker MPM. When all threads are occupied, new connection attempts queue or are rejected.

The attacker can maintain hundreds of such connections from a single IP address over a modest uplink. The total inbound traffic is minimal: a few bytes per connection every few seconds. The server's network bandwidth and CPU are largely idle; only its thread pool is exhausted.

Slowloris is ineffective against asynchronous or event-driven servers (nginx, Node.js with its event loop, modern Apache with the event MPM) that do not hold a thread per connection. These servers handle connections using non-blocking I/O; an idle connection waiting for headers consumes a file descriptor but not a thread. The file descriptor limit is typically much larger than the thread pool: a modern Linux system may allow several hundred thousand open file descriptors, while a threaded Apache installation may have a few hundred threads. However, event-driven servers are not immune to the general approach; they can be exhausted by connections that have completed headers but are sending request bodies slowly (see R.U.D.Y. below).
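The defensive principle on the server side is a hard deadline on header completion, independent of how often the client dribbles in another keep-alive byte. A sketch with asyncio streams (this is a skeleton, not a real HTTP server; the timeout value is illustrative):

```python
import asyncio

HEADER_TIMEOUT = 30.0  # seconds: the deadline Slowloris tries to outlast

async def handle(reader, writer, timeout=HEADER_TIMEOUT):
    try:
        # readuntil() only returns once the full header block has arrived;
        # wait_for() bounds the *total* time, no matter how many partial
        # header lines the client sends to reset idle timers.
        await asyncio.wait_for(reader.readuntil(b"\r\n\r\n"), timeout)
        writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
        await writer.drain()
    except (asyncio.TimeoutError, asyncio.IncompleteReadError,
            asyncio.LimitOverrunError):
        pass  # slow or malformed client: drop the connection
    finally:
        writer.close()
        await writer.wait_closed()
```

Apache's mod_reqtimeout and nginx's client_header_timeout implement the same idea: a budget for the whole header phase, not a per-byte idle timer.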

R.U.D.Y. (R-U-Dead-Yet?)

R.U.D.Y. targets the request body submission phase. The attacker sends a complete, valid HTTP POST request header, including a large Content-Length value, then transmits the body at one byte every few seconds. The server receives a syntactically valid request with a declared content length and must read the entire body before processing the request; this is standard HTTP behavior. As with Slowloris, the connection remains open, the server allocates a handler context, and legitimate connections are crowded out.

The distinction from Slowloris is the phase under attack: R.U.D.Y. works even against servers that accept headers asynchronously, provided they block (or hold a per-request handler context) while waiting for request body data.
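The corresponding defense is a deadline on the body as a whole, derived from the declared Content-Length and a minimum acceptable transfer rate. A sketch (function name and thresholds are mine, for illustration):

```python
def body_deadline(content_length, min_rate_bps=1024,
                  floor_s=5.0, ceiling_s=60.0):
    """Total time budget for receiving a request body: the declared
    Content-Length at a minimum acceptable transfer rate, clamped to
    sane bounds. A client sending one byte every few seconds blows
    the deadline long before the body completes."""
    return min(ceiling_s, max(floor_s, content_length / min_rate_bps))
```

The deadline would be enforced with an overall timeout wrapped around the body read loop, the same way header deadlines are enforced.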

Slow Read Attack

The slow read attack reverses the direction: instead of sending slowly, the attacker receives slowly. The attacker announces a small TCP receive window, causing the server to buffer the response and wait for the client to acknowledge received data before sending more. The server holds the connection open with the response queued, occupying memory and socket state.

This targets the server's outbound buffer. If the server uses a fixed pool of send buffers or thread contexts that block on socket writes, occupying these with slow-reading connections prevents the server from handling new requests.

Mitigations include server-side send progress tracking and timeout enforcement: if a client has not acknowledged data within a timeout, the connection is closed regardless of its advertised window. nginx's send_timeout directive controls this.


Asymmetric Computation Attacks

A class of application-layer attacks specifically targets computation asymmetry: finding operations where the server performs significantly more work than the attacker.

XML and JSON Processing

XML parsing is expensive. Deep nesting, large attribute sets, and extensive entity reference resolution all increase the parsing cost. The "billion laughs" attack (CVE-2003-1564 in the libxml context) exploits XML entity expansion:

    <?xml version="1.0"?>
    <!DOCTYPE lolz [
      <!ENTITY lol "lol">
      <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
      <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
      <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
      ...
      <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
    ]>
    <lolz>&lol9;</lolz>

Each entity reference expands to ten copies of the entity it references. With nine levels of nesting, the final expansion produces 10^9 copies of the string "lol", approximately 3 gigabytes of memory from a few hundred bytes of input. A parser that fully expands entity references without a depth or expansion size limit allocates this memory and typically crashes or exhausts available memory.

Modern XML parsers implement configurable limits on entity expansion depth and total expansion size. Applications that use XML parsing without configuring these limits remain vulnerable.
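The growth is easy to verify arithmetically: each of the nine entity levels multiplies the payload by ten.

```python
# One "lol" is 3 bytes; each entity level expands to ten copies of the
# level below it, so nine tenfold expansions multiply it by 10^9.
size = 3
for _ in range(9):
    size *= 10
print(size)  # 3000000000 bytes, roughly 3 GB, from a sub-kilobyte input
```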

JSON does not have entity expansion, but deeply nested structures create recursion depth that can overflow the parser's call stack in recursive-descent implementations. A JSON document with nesting depth of several thousand may trigger a stack overflow. CPython's json module is guarded by the interpreter's recursion limit (1,000 frames by default, adjustable via sys.setrecursionlimit) and raises RecursionError on deeply nested input rather than crashing; parsers without such a guard can overflow the native stack outright. Legitimate data rarely approaches these depths, but an attacker can generate them trivially.
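The behavior is easy to demonstrate against CPython's parser (the exact depth limit varies with interpreter settings, but the guard itself is standard):

```python
import json

# 100,000 levels of array nesting in roughly 200 KB of input.
payload = "[" * 100_000 + "]" * 100_000

try:
    json.loads(payload)
    outcome = "parsed"
except RecursionError:
    # CPython's parser hits the interpreter recursion guard and raises
    # a catchable exception instead of overflowing the C stack.
    outcome = "rejected"
```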

Zip Bombs

A zip bomb is a compressed file that expands to a much larger file upon decompression. The record case, a 42-kilobyte zip file that expands to approximately 4.5 petabytes, is achieved through nested compression: a zip file containing zip files, each containing zip files, with the innermost files consisting of repeated null bytes that compress extremely well. Most zip parsers detect this through depth limits or extraction size limits, but applications that decompress user-supplied files without these limits can be crashed by decompression.

The same principle applies to other compressed formats: gzip, bzip2, lz4. Any application that decompresses user input without size limits is vulnerable to decompression bomb exhaustion.
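A sketch of the size-limit check using Python's zipfile module: inspect declared uncompressed sizes and the compression ratio before extracting anything (the thresholds are illustrative):

```python
import zipfile

MAX_TOTAL_UNCOMPRESSED = 100 * 1024 * 1024  # 100 MB
MAX_RATIO = 100                             # uncompressed : compressed

def safe_to_extract(src):
    """Refuse archives whose declared expansion is too large or whose
    compression ratio suggests a bomb. `src` is a path or file object."""
    with zipfile.ZipFile(src) as zf:
        total = sum(info.file_size for info in zf.infolist())
        compressed = sum(info.compress_size for info in zf.infolist())
        if total > MAX_TOTAL_UNCOMPRESSED:
            return False
        if compressed and total / compressed > MAX_RATIO:
            return False
    return True
```

Declared sizes can lie, so a robust defense also caps the bytes actually written during streaming extraction, and nested archives additionally need a depth limit.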

Hash Collision Attacks (HashDoS)

Hash tables are ubiquitous in language runtimes for mapping string keys to values. The insertion and lookup performance of a hash table depends on the distribution of keys across buckets. If many keys hash to the same bucket, the table degrades to a linked list scan, with O(n) lookup instead of O(1).

In 2011, Alexander Klink and Julian Wälde demonstrated that most web frameworks of the time used hash functions without randomization, making it possible to construct a large set of strings that all hashed to the same value in a specific language's hash table implementation. An HTTP POST request body with thousands of form fields, all with names chosen to collide in the hash table, consumed CPU proportional to the square of the number of fields during the server's parsing of the POST body.

A request with 100,000 colliding keys could consume several minutes of CPU time to parse. The attack required knowledge of the hash function used by the target language, which is deterministic and publicly known for unseeded hash functions. This was a critical vulnerability because POST body parsing happens before request routing and authentication; the application cannot decide not to parse it.
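The quadratic blow-up can be reproduced without knowing any real hash function's collisions, by forcing every key into one bucket: each insertion must then compare against every key already there.

```python
class CollidingKey:
    """A key whose hash is constant, so every instance lands in the same
    hash table bucket; equality comparisons are counted globally."""
    eq_calls = 0

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 0  # all keys collide

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.n == other.n

def insert_cost(n):
    """Number of key comparisons needed to build a dict of n colliding keys."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CollidingKey(i)] = i
    return CollidingKey.eq_calls
```

Doubling the number of keys roughly quadruples the comparison count, which is the O(n^2) parsing cost the attack exploits.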

The fix was hash randomization: using a randomly seeded hash function for runtime hash tables (Java's HashMap adopted randomized string hashing starting with JDK 7u6; Python added hash randomization in 3.3, controllable via PYTHONHASHSEED; Perl and Ruby adopted similar changes). With a random seed, an attacker cannot precompute collisions: the seed is unknown and changes between process restarts.
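The effect of the seed is visible from the command line: the same string hashes identically under the same seed and differently under different seeds, each in a fresh interpreter.

```python
import os
import subprocess
import sys

def string_hash(seed):
    """Hash of 'collision' in a fresh Python process with a fixed seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('collision'))"],
        env=env, capture_output=True, text=True, check=True)
    return int(out.stdout)
```

An unseeded (or fixed-seed) deployment hands the attacker exactly this reproducibility.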


Database Query Exhaustion

Database queries are expensive relative to most other application operations. Applications that allow user input to influence query structure, not through SQL injection, but through legitimate query parameterization, can expose expensive operations.

Sort and pagination: an unbounded sort request against a large table may require the database to sort the full result set before returning the first page. An attacker requesting a very large page size on a sort that cannot use an index forces a full sort of the table. LIMIT clauses applied after sorting do not prevent the sort from completing.

Aggregation: queries involving GROUP BY, COUNT, SUM, or similar aggregation against large tables without appropriate indexes are expensive. If the application allows users to request aggregated reports without authentication or rate limiting, these queries can exhaust database CPU and connection pool capacity.

JOIN depth: a query joining many tables requires the database optimizer to evaluate join strategies and execute the plan, which scales non-linearly with the number of joins. Applications that allow users to construct complex queries by selecting filter criteria may allow construction of expensive multi-join queries.

Full-text search without limits: full-text search indexes are efficient for common queries, but wildcard prefix searches and searches for very common terms may require large index scans. Limiting search query complexity and enforcing per-user query rate limits are the primary defenses.
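For the sort-and-pagination case, the application-side defense is to clamp what a client may request before the query is built. A sketch (the bounds are illustrative):

```python
MAX_PAGE_SIZE = 100
MAX_OFFSET = 10_000

def clamp_pagination(page_size, offset):
    """Bound user-supplied pagination so no single request can demand an
    unbounded sort or an arbitrarily deep scan; returns (limit, offset)."""
    limit = max(1, min(int(page_size), MAX_PAGE_SIZE))
    offset = max(0, min(int(offset), MAX_OFFSET))
    return limit, offset
```

Capping the offset matters as much as capping the page size: OFFSET 1000000 still forces the database to produce and discard a million sorted rows.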


The Detection Problem

Application-layer attacks are hard to detect automatically for the reason stated at the outset: the traffic is syntactically legitimate. An HTTP request from an attacker looks like an HTTP request from a legitimate user. The distinguishing characteristics are behavioral:

• Request rate per IP higher than the distribution of legitimate users
• Requests concentrated on a small set of high-cost endpoints
• Absence of the supporting requests that accompany legitimate browsing (images, CSS, JS, analytics beacons)
• User agent strings inconsistent with claimed browser behavior
• No session state or cookie handling consistent with a real browser session

None of these characteristics are definitive individually. Rate limiting by IP fails against distributed attacks. User agent analysis fails against tools that send realistic user agent strings. Session analysis requires state maintenance across requests.
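Because no single signal is decisive, practical detectors combine them into a score and act on thresholds. A sketch (the weights and thresholds are invented for illustration; real systems tune them against labeled traffic):

```python
SIGNAL_WEIGHTS = {
    "high_request_rate": 0.30,
    "expensive_endpoint_focus": 0.25,
    "no_asset_requests": 0.20,
    "inconsistent_user_agent": 0.15,
    "no_session_continuity": 0.10,
}

def suspicion_score(signals):
    """Combine weak behavioral signals into a score in [0, 1].
    `signals` maps signal name -> bool."""
    return sum(w for name, w in SIGNAL_WEIGHTS.items() if signals.get(name))

def classify(signals, block_threshold=0.7, challenge_threshold=0.4):
    """Three-way decision: block outright, challenge, or allow."""
    score = suspicion_score(signals)
    if score >= block_threshold:
        return "block"
    if score >= challenge_threshold:
        return "challenge"
    return "allow"
```

The middle tier is where challenge-response mechanisms slot in: clients that are suspicious but not conclusively automated get a challenge rather than a block.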

Challenge-response mechanisms (JavaScript execution challenges, CAPTCHA, invisible behavioral biometrics) attempt to distinguish automated clients from human ones. These work because legitimate browsers execute JavaScript; simple HTTP libraries do not. Headless browsers execute JavaScript too, and sophisticated attackers use them. As challenge mechanisms improve, evasion follows. But challenges still raise the cost of attack for the majority of unsophisticated attackers, which is most of them.