Varol Cagdas Tok

Personal notes and articles.

SQL Injection: The Mechanism

SQL injection occurs when user-supplied data is interpreted as SQL syntax by the database parser. The root cause is not a complex vulnerability class. It is a boundary problem: the application constructs a SQL statement by concatenating strings, and user input becomes part of that string before the parser sees it. The parser cannot determine which portions were written by the developer and which were supplied by the user, because both arrive as a single string.


The Parser's Perspective

A SQL parser tokenizes its input into keywords, identifiers, operators, literals, and punctuation. It does not have a concept of "trusted" versus "untrusted" substrings. It processes whatever string it receives.

An application building a query with string concatenation:

query = "SELECT * FROM users WHERE username = '" + username + "'";

With a normal input of alice, the parser receives:

SELECT * FROM users WHERE username = 'alice'

With an input of alice' OR '1'='1, the parser receives:

SELECT * FROM users WHERE username = 'alice' OR '1'='1'

The single quote in the input closed the string literal the developer opened. Everything after it is parsed as SQL syntax. The OR '1'='1' clause is structurally valid SQL. The WHERE condition now evaluates to true for every row. The developer intended a single equality check; the parser executes a condition that matches all users.


The Trust Boundary

The application sits between the user and the database. It is supposed to translate user intent into safe database operations. When user input is concatenated directly into a SQL string, the application does not translate anything. It forwards the input to the database with the structure it imposes intact.

The correct model is that SQL structure belongs to the application and data belongs to the user. The two must be kept separate from construction through execution. Concatenation collapses that separation at construction time.


What Injection Enables

The effect depends on where the input lands in the query and what the database permits the application account to do:

WHERE clause injection: modifying filter conditions to match unintended rows, bypassing authentication checks, or reading data from unrelated users.

UNION injection: appending a second SELECT statement to retrieve data from tables not referenced in the original query.

Subquery and expression injection: inserting subqueries or function calls that execute against the database and return results in the response or through side channels.

Statement termination and chaining: in databases and drivers that support multiple statements in one call, terminating the original statement with a semicolon and appending a second statement. This enables arbitrary SQL execution, including DDL operations.

All of these follow from the same root cause: the parser received structure from the user.