Vulnerability hunting with Semmle QL, part 1

Previously on this blog, we’ve discussed how MSRC automates the root cause analysis of vulnerabilities that are reported and found. The next step is variant analysis: finding and investigating any variants of the vulnerability. It is important that we find all such variants and patch them simultaneously, otherwise we run the risk of them being exploited in the wild. In this post, I’d like to describe the automation we use in variant finding.

For the past year or so, we’ve been augmenting our manual code review processes with Semmle, a third-party static analysis environment. It compiles code to a relational database (the snapshot database, a combination of database and source code), which is queried using Semmle QL, a declarative, object-oriented query language designed for program analysis.

The basic workflow is that, after root cause analysis, we write queries to find code patterns that are semantically similar to the original vulnerability. Any results are triaged as usual and passed to our engineering teams so that a fix can be developed. The queries are also placed in a central repository, where MSRC and other security teams periodically re-run them. This way, we can scale our variant finding over time and across multiple codebases.

In addition to variant analysis, we have been using QL proactively in our security reviews of source code. That will be the topic of a future blog post. For now, let’s look at some real-world examples inspired by MSRC cases.

Incorrect integer overflow checks

This first case is a bug that is simple to define, but finding variants of it in a large codebase would be tedious.

The following code shows a common idiom for detecting overflow when adding unsigned integers:

if (x + y < x) {
    // handle integer overflow
}

Unfortunately, this does not work properly when the width of the integer type is small enough to be subject to integral promotion. For example, if x and y were both unsigned short, when compiled, the above code would end up being equivalent to (unsigned int)x + y < x, making this overflow check ineffective.
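To make the promotion issue concrete, here is a minimal C++ sketch that compares the naive idiom with a variant that casts the sum back to the narrow type. The function names are hypothetical and chosen only for illustration, and the sketch assumes a platform where int is wider than unsigned short (which the static_assert checks).

```cpp
#include <cassert>
#include <climits>

// Assumption: int is wider than unsigned short, so integral promotion applies.
static_assert(sizeof(unsigned short) < sizeof(int),
              "this sketch assumes unsigned short is promoted to int");

// Naive idiom: both operands are promoted to int, so the sum cannot wrap
// and the comparison is always false for unsigned short inputs.
bool naiveOverflowCheck(unsigned short x, unsigned short y) {
    return x + y < x;
}

// Variant with an explicit cast: the sum is truncated back to
// unsigned short before the comparison, so wrap-around is visible again.
bool castOverflowCheck(unsigned short x, unsigned short y) {
    return static_cast<unsigned short>(x + y) < x;
}

// Small self-check exercising both variants at the overflow boundary.
void demoOverflowChecks() {
    assert(!naiveOverflowCheck(USHRT_MAX, 1));  // overflow goes unnoticed
    assert(castOverflowCheck(USHRT_MAX, 1));    // overflow is detected
}
```

Calling demoOverflowChecks() should pass silently: the naive check misses the overflow of USHRT_MAX + 1, while the cast version catches it.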

Here is a QL query that matches this code pattern:

import cpp

from AddExpr a, Variable v, RelationalOperation r
where a.getAnOperand() = v.getAnAccess()
  and r.getAnOperand() = v.getAnAccess()
  and r.getAnOperand() = a
  and forall(Expr op | op = a.getAnOperand() | op.getType().getSize() < 4)
  and not a.getExplicitlyConverted().getType().getSize() < 4
select r, "Useless overflow check due to integral promotion"

In the from clause, we define the variables, and their types, to be used in the rest of the query. AddExpr, Variable, and RelationalOperation are QL classes representing various sets of entities in the snapshot database, e.g. RelationalOperation covers every expression with a relational operation (less than, greater than, etc.)

The where clause is the meat of the query, using logical connectives and quantifiers to define how to relate the variables. Breaking it down, this means that the addition expression and the relational operation need the same variable as one of their operands (x in the example code above):

a.getAnOperand() = v.getAnAccess() and r.getAnOperand() = v.getAnAccess()

The addition must be the relational operation’s other operand:

r.getAnOperand() = a

The width of both addition operands must be less than 32 bits (4 bytes):

forall(Expr op | op = a.getAnOperand() | op.getType().getSize() < 4)

However, if the addition expression has an explicit cast to a type narrower than 32 bits, we are not interested:

not a.getExplicitlyConverted().getType().getSize() < 4

(This is so a check like (unsigned short)(x + y) < x doesn’t get flagged by the query.)

Finally, the select clause defines the result set.

This vulnerability was originally reported in Chakra (Edge’s JavaScript engine), where the ineffective overflow check led to memory corruption. The query matched the original vulnerability but found no additional variants in Chakra. However, we did find several when we ran this exact query against other Windows components.

Unsafe use of SafeInt

An alternative to rolling your own integer overflow checks is to use a library that has them built in. SafeInt is a C++ template class that overrides arithmetic operators to throw an exception when overflow is detected.

Here’s an example of how to use it correctly:

int x, y, z;

// ...

z = SafeInt<int>(x) + y;

But here’s an example of how it can be inadvertently misused. The expression passed to the constructor may already have overflowed:

int x, y, z;

// ...

z = SafeInt<int>(x + y);
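The difference between the two call shapes can be sketched without the real library. CheckedU32 below is a hypothetical, heavily simplified stand-in for SafeInt (it is not the SafeInt API), and it uses an unsigned type so that the wrap-around in the misuse case is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for SafeInt, for illustration only.
// It checks addition and throws when the result would not fit.
class CheckedU32 {
public:
    explicit CheckedU32(std::uint32_t v) : value_(v) {}

    CheckedU32 operator+(std::uint32_t rhs) const {
        if (rhs > std::numeric_limits<std::uint32_t>::max() - value_) {
            throw std::overflow_error("addition overflow");
        }
        return CheckedU32(value_ + rhs);
    }

    std::uint32_t get() const { return value_; }

private:
    std::uint32_t value_;
};

// Correct shape: wrap one operand first, so the checked operator+ runs.
bool correctUseDetectsOverflow(std::uint32_t x, std::uint32_t y) {
    try {
        (void)(CheckedU32(x) + y);
        return false;
    } catch (const std::overflow_error&) {
        return true;
    }
}

// Misuse: x + y is evaluated with plain arithmetic and has already
// wrapped by the time the constructor sees it, so nothing is checked.
std::uint32_t misuseWraps(std::uint32_t x, std::uint32_t y) {
    return CheckedU32(x + y).get();
}

// Small self-check contrasting the two shapes.
void demoCheckedU32() {
    const std::uint32_t max = std::numeric_limits<std::uint32_t>::max();
    assert(correctUseDetectsOverflow(max, 1u));  // exception raised
    assert(misuseWraps(max, 1u) == 0u);          // silently wrapped to 0
}
```

The wrapper can only check the operations it actually performs, which is why the query in this section looks for potentially overflowing expressions that appear as constructor arguments.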

How to write a query to detect this? In the previous example, our query only used built-in QL classes. For this one, let’s start by defining our own. For this, we choose one or more QL classes to subclass from (with extends), and define a characteristic predicate which specifies those entities in the snapshot database that are matched by the class:

class SafeInt extends Type {
  SafeInt() {
    this.getUnspecifiedType().getName().matches("SafeInt<%")
  }
}

The QL class Type represents the set of all types in the snapshot database. For the QL class SafeInt, we subset this to just types with a name that begins with “SafeInt<”, thus indicating that they are instantiations of the SafeInt template class. The getUnspecifiedType() predicate is used to disregard typedefs and type specifiers such as const.

Next, we define the subset of expressions that could potentially overflow. Most arithmetic operations can overflow, but not all; this uses QL’s instanceof operator to define which ones. We use a recursive definition because we need expressions such as (x+1)/y to be included, even though x/y is not.

class PotentialOverflow extends Expr {
  PotentialOverflow() {
    (this instanceof BinaryArithmeticOperation    // match   x+y x-y x*y
       and not this instanceof DivExpr            // but not x/y
       and not this instanceof RemExpr)           //      or x%y

    or (this instanceof UnaryArithmeticOperation  // match   x++ ++x x-- --x -x
          and not this instanceof UnaryPlusExpr)  // but not +x

    // recursive definitions to capture potential overflow in
    // operands of the operations excluded above
    or this.(BinaryArithmeticOperation).getAnOperand() instanceof PotentialOverflow
    or this.(UnaryPlusExpr).getOperand() instanceof PotentialOverflow
  }
}

Finally, we write a query that relates these two classes:

from PotentialOverflow po, SafeInt si
where po.getParent().(Call).getTarget().(Constructor).getDeclaringType() = si
select po, po + " may overflow before being converted to " + si

.(Call) and .(Constructor) are examples of an inline cast, which, similar to instanceof, is another way of restricting which QL classes match. In this case we are saying that, given an expression that may overflow, we’re only interested if its parent expression is some sort of call. Furthermore, we only want to know if the target of that call is a constructor, and if it’s a constructor for some SafeInt.

This query, like the one before it, produced a number of actionable results across various codebases.

JavaScript re-entrancy to use-after-free

The next example was a vulnerability in Edge, caused by re-entrancy into JavaScript code.

Edge defines many functions that can be called from JavaScript. This model function illustrates the essence of the vulnerability:

HRESULT SomeClass::vulnerableFunction(Var* args, UINT argCount, Var* retVal)
{
    // get first argument:
    // from Chakra, acquire pointer to array
    BYTE* pBuffer;
    UINT bufferSize;
    hr = Jscript::GetTypedArrayBuffer(args[1], &pBuffer, &bufferSize);

    // get second argument:
    // from Chakra, obtain an integer value
    int someValue;
    hr = Jscript::VarToInt(args[2], &someValue);

    // perform some operation on the array acquired previously
    doSomething(pBuffer, bufferSize);

    …

The problem was that when Edge calls back into Chakra, e.g. during VarToInt, arbitrary JavaScript code may be executed. The above function could be exploited by passing it a JavaScript object that overrides valueOf to free the buffer, so when VarToInt returns, pBuffer is a dangling pointer:

var buf = new ArrayBuffer(length);
var arr = new Uint8Array(buf);

var param = {}
param.valueOf = function() {
    /* free `buf`
       (code to actually do this would be defined elsewhere) */
    neuter(buf);  // neuter `buf` by e.g. posting it to a web worker
    gc();         // trigger garbage collection
    return 0;
};

vulnerableFunction(arr, param);

The specific pattern we’re looking for with QL is therefore: acquisition of a pointer from GetTypedArrayBuffer, followed by a call to some Chakra function that may execute JavaScript, followed by some use of the pointer.

For the array buffer pointer, we match on the calls to GetTypedArrayBuffer, where the second argument (getArgument of Call is zero-indexed) is an address-of expression (&), and take its operand:

class TypedArrayBufferPointer extends Expr {
  TypedArrayBufferPointer() {
    exists(Call c | c.getTarget().getName() = "GetTypedArrayBuffer"
                    and this = c.getArgument(1).(AddressOfExpr).getOperand())
  }
}

The exists logical quantifier is used here to introduce a new variable (c) into the scope.

A number of Chakra API functions can cause JavaScript re-entrancy. Rather than listing them by name, we identify the internal Chakra function that performs this task and use QL to work out from the call graph which functions can reach it:

// examine call graph to match any function that may eventually call MethodCallToPrimitive
predicate mayExecJsFunction(Function f) {
  exists(Function g | f.calls+(g) and g.hasName("MethodCallToPrimitive"))
}

// this defines a call to any of the above functions
class MayExecJsCall extends FunctionCall {
  MayExecJsCall() {
    mayExecJsFunction(this.getTarget())
  }
}

The “+” suffix of the calls predicate specifies a transitive closure – it applies the predicate to itself until there is a match. This permits a concisely defined exploration of the function call graph.
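As a rough analogy for what calls+ computes, the C++ sketch below walks a toy call graph and reports whether one function can reach another through one or more call edges. The graph contents, the ConvertHelper node, and the helper names are hypothetical and exist only for illustration; QL evaluates transitive closure over the snapshot database rather than with an explicit traversal like this.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: each function name maps to the functions it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// True if `target` is reachable from `from` via one or more call edges,
// which is the relation that f.calls+(g) describes.
bool callsTransitively(const CallGraph& graph,
                       const std::string& from,
                       const std::string& target) {
    auto start = graph.find(from);
    if (start == graph.end()) return false;

    // Seed the worklist with direct callees so at least one edge is required.
    std::vector<std::string> worklist(start->second.begin(), start->second.end());
    std::set<std::string> visited;

    while (!worklist.empty()) {
        std::string current = worklist.back();
        worklist.pop_back();
        if (current == target) return true;
        if (!visited.insert(current).second) continue;  // already expanded
        auto it = graph.find(current);
        if (it != graph.end()) {
            worklist.insert(worklist.end(), it->second.begin(), it->second.end());
        }
    }
    return false;
}

// Hypothetical graph: VarToInt reaches MethodCallToPrimitive indirectly.
CallGraph sampleCallGraph() {
    return {
        {"VarToInt", {"ConvertHelper"}},
        {"ConvertHelper", {"MethodCallToPrimitive"}},
        {"GetTypedArrayBuffer", {}},
    };
}

// Small self-check on the sample graph.
void demoCallsTransitively() {
    CallGraph g = sampleCallGraph();
    assert(callsTransitively(g, "VarToInt", "MethodCallToPrimitive"));
    assert(!callsTransitively(g, "GetTypedArrayBuffer", "MethodCallToPrimitive"));
}
```

In this toy graph, VarToInt would satisfy mayExecJsFunction because it reaches MethodCallToPrimitive in two steps, while GetTypedArrayBuffer would not.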

Finally, this query ties these QL class definitions together using control flow:

from TypedArrayBufferPointer def, MayExecJsCall call, VariableAccess use, Variable v
where v.getAnAccess() = def
  and v.getAnAccess() = use
  and def.getASuccessor+() = call
  and call.getASuccessor+() = use
select use, "Call to " + call + " between definition " + def + " and use " + use

The predicate getASuccessor() specifies the next statement or expression in the control flow. Therefore, using e.g. call.getASuccessor+() = use will follow the control flow graph from call until there is a match to use. The diagram below illustrates this:

In addition to the vulnerability that was initially reported, this query also found four additional variants, all of which were given critical severity ratings.
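To make the definition, call, use ordering behind this query concrete, here is a small C++ sketch over a toy control flow graph. The node numbering and helper names are hypothetical; the sketch only mirrors the two getASuccessor+() conditions and is not how QL evaluates them.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy control flow graph: successors[n] lists the nodes that directly follow n.
using Cfg = std::vector<std::vector<std::size_t>>;

// True if `to` is reachable from `from` by one or more successor edges,
// analogous to from.getASuccessor+() = to.
bool reachesAfter(const Cfg& cfg, std::size_t from, std::size_t to) {
    assert(from < cfg.size());
    std::vector<std::size_t> worklist(cfg[from].begin(), cfg[from].end());
    std::set<std::size_t> visited;
    while (!worklist.empty()) {
        std::size_t node = worklist.back();
        worklist.pop_back();
        if (node == to) return true;
        if (!visited.insert(node).second) continue;  // already expanded
        worklist.insert(worklist.end(), cfg[node].begin(), cfg[node].end());
    }
    return false;
}

// The pattern the query looks for: def reaches call, and call reaches use.
bool callBetweenDefAndUse(const Cfg& cfg, std::size_t def,
                          std::size_t call, std::size_t use) {
    return reachesAfter(cfg, def, call) && reachesAfter(cfg, call, use);
}

// Hypothetical numbering for the model function:
// 0 = GetTypedArrayBuffer (def), 1 = VarToInt (call), 2 = use of pBuffer.
Cfg modelFunctionCfg() {
    return {{1}, {2}, {}};
}
```

With this numbering, callBetweenDefAndUse(modelFunctionCfg(), 0, 1, 2) holds, while swapping the call and use nodes does not, since nothing follows the final use.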

That’s all for now. The next installment will cover using QL for data flow analysis and taint tracking, with examples from our security review of an Azure firmware component.

Steven Hunter, MSRC Vulnerabilities & Mitigations team