Talos Vulnerability Report

TALOS-2019-0781

Yara Object Lookup Denial of Service Vulnerability

July 30, 2019

CVE Number

CVE-2019-5020

Summary

An exploitable denial-of-service vulnerability exists in the object lookup functionality of Yara 3.8.1. A specially crafted binary file can cause a value of -1 to be read and used as an array index, which trips an assert and results in denial of service. An attacker can create a malicious binary to trigger this vulnerability.

Tested Versions

Yara 3.8.1

Product URLs

http://virustotal.github.io/yara/

CVSSv3 Score

6.5 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H

CWE

CWE-617: Reachable Assertion

Details

Yara is a tool for finding patterns in files, targeted towards the malware research community. It allows automated scanning of samples to help classify them for easier analysis. Researchers can write rules in Yara's rule format that trigger when matching samples are scanned.

Yara can parse a variety of binary file formats, such as PE, MachO, ELF, and Dex. To parse a Dex file, Yara must decode a variety of fields from the binary itself; for encoded fields, this is done by calling load_encoded_field. During the initial parsing of a Dex file, the raw fields are stored in the Dex object under a variety of string keys. A few of these keys are shown below.

begin_struct_array("type_ids");
   declare_integer("descriptor_idx");
end_struct_array("type_ids");

begin_struct_array("proto_ids");
   declare_integer("shorty_idx");
   declare_integer("return_type_idx");
   declare_integer("parameters_offset");
end_struct_array("proto_ids");

begin_struct_array("field_ids");
   declare_integer("class_idx");
   declare_integer("type_idx");
   declare_integer("name_idx");
end_struct_array("field_ids");

To retrieve one of these values, an index into the wanted struct array is given along with a string key such as field_ids[%i].name_idx.

libyara/modules/dex.c:350
int32_t load_encoded_field(
    DEX* dex,
    size_t start_offset,
    uint32_t *previous_field_idx,
    int index_encoded_field,
    int static_field,
    int instance_field)
{
...
encoded_field.field_idx_diff = (uint32_t) read_uleb128(
    (dex->data + start_offset + current_size), &current_size); [0]

encoded_field.access_flags = (uint32_t) read_uleb128(
    (dex->data + start_offset + current_size), &current_size);

*previous_field_idx = encoded_field.field_idx_diff + *previous_field_idx;

int name_idx = (int) get_integer(
    dex->object, "field_ids[%i].name_idx", *previous_field_idx); [1]

The code above reads a value [0] from the Dex binary, adds it to the running previous_field_idx, and asks for the integer stored at that index [1] in the field_ids array. To retrieve the integer, a lookup through the Dex object is performed.
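To make the attacker's control over the index concrete, here is a minimal sketch of the arithmetic at [0] and [1]. The helpers decode_uleb128 and next_field_index are hypothetical stand-ins written for this illustration; they are not Yara's read_uleb128 or load_encoded_field. The sketch assumes a common two's-complement platform, where converting 0xFFFFFFFF to int yields -1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for read_uleb128: decodes an unsigned LEB128 value
 * from buf (at most 5 bytes) and stores the number of bytes consumed. */
static uint32_t decode_uleb128(const uint8_t* buf, size_t len, size_t* consumed)
{
    uint32_t result = 0;
    unsigned shift = 0;
    size_t i = 0;

    while (i < len && shift < 35)
    {
        uint8_t byte = buf[i++];

        /* Shift in 64 bits so no signed overflow occurs; bits above
         * bit 31 are discarded by the conversion back to uint32_t. */
        result |= (uint32_t) ((uint64_t) (byte & 0x7f) << shift);
        shift += 7;

        if ((byte & 0x80) == 0)  /* high bit clear: last byte */
            break;
    }

    *consumed = i;
    return result;
}

/* Mirrors the update of *previous_field_idx and its later use as an int.
 * The unsigned addition wraps modulo 2^32; the conversion to int is
 * implementation-defined and gives -1 for 0xFFFFFFFF on typical targets. */
static int next_field_index(uint32_t previous_idx, uint32_t idx_diff)
{
    uint32_t sum = previous_idx + idx_diff;
    return (int) sum;
}
```

Under these assumptions, the five bytes ff ff ff ff 0f decode to 0xFFFFFFFF, and with a previous index of 0 the resulting int index is -1. Any pair of values whose sum wraps to 0xFFFFFFFF has the same effect.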

libyara/object.c:896
int64_t yr_object_get_integer(
    YR_OBJECT* object,
    const char* field,
    ...)
{
YR_OBJECT* integer_obj;

va_list args;
va_start(args, field);

if (field != NULL)
    integer_obj = _yr_object_lookup(object, 0, field, args); [2]
else
    integer_obj = object;

At [2], object is dex->object, field is field_ids[%i].name_idx, and args holds previous_field_idx, which depends on a value read from the Dex file. The implementation of the generic object lookup is below.

libyara/object.c:422
static YR_OBJECT* _yr_object_lookup(
    YR_OBJECT* object,
    int flags,
    const char* pattern,
    va_list args)
{
    YR_OBJECT* obj = object;

    const char* p = pattern;
    const char* key = NULL;

    char str[256];

    int i;
    int index = -1;

    while (obj != NULL)
    {
        ...
        if (*p == '[')
        {
            p++;

            if (*p == '%')
            {
                p++;

                switch(*p++)
                {
                case 'i':
                    index = va_arg(args, int);
                    break;
                case 's':
                    key = va_arg(args, const char*);
                    break;

                default:
                    return NULL;
                }
            }
        ...

This function implements a minimal format string parser. In particular, the %i specifier reads the next argument from args as an integer and stores it in index, overwriting the initial value of -1. The function then attempts to retrieve the indexed value from the object.
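The sentinel behavior described above can be reduced to a few lines. The function parse_index below is a hypothetical simplification written for this report's explanation, not Yara code: it only recognizes the first "[%i" in the pattern and returns the resulting index.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical reduction of the "[%i]" handling in the lookup routine.
 * Returns the index taken from the variadic arguments, or the initial
 * sentinel -1 when the pattern contains no "[%i". */
static int parse_index(const char* pattern, ...)
{
    int index = -1;  /* sentinel meaning "no index supplied" */
    va_list args;

    va_start(args, pattern);

    for (const char* p = pattern; *p != '\0'; p++)
    {
        /* Short-circuit evaluation keeps p[1] and p[2] inside the string. */
        if (p[0] == '[' && p[1] == '%' && p[2] == 'i')
        {
            index = va_arg(args, int);  /* caller-supplied value */
            break;
        }
    }

    va_end(args);
    return index;
}
```

Calling parse_index("field_ids[%i].name_idx", 7) returns 7, while passing -1 as the argument returns -1, which is indistinguishable from a pattern that carried no index at all. That ambiguity is what the later assert relies on not happening.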

libyara/object.c:503
switch(obj->type)
{
    case OBJECT_TYPE_ARRAY:
        assert(index != -1);
        obj = yr_object_array_get_item(obj, flags, index);
        break;

    case OBJECT_TYPE_DICTIONARY:
        assert(key != NULL);
        obj = yr_object_dict_get_item(obj, flags, key);
        break;
}

Based on the current object's type, the matching accessor function is called. For arrays, the assert checks that index was overwritten and no longer holds its initial value of -1; if this check fails, the program aborts. If an attacker crafts the binary scanned by Yara so that the value read at [0] yields an index of -1, index is overwritten with -1, the assert fails, and the abort is triggered. If Yara scans a directory of samples containing this malicious binary, the denial of service prevents the other samples from being scanned, so they could be overlooked.
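One way to avoid this kind of sentinel collision is to track "an index was supplied" separately from the index value and to reject out-of-range indexes with an error return instead of an assert. The sketch below illustrates that idea only; the int_array type and array_get_checked function are hypothetical, and this is not a description of the fix the vendor shipped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical array type used only for this illustration. */
typedef struct
{
    const int* items;
    size_t count;
} int_array;

/* Stores arr->items[index] in *out and returns true only when an index
 * was supplied and lies within the array. Untrusted values such as -1
 * produce a false return rather than aborting the process. */
static bool array_get_checked(
    const int_array* arr,
    bool have_index,
    int index,
    int* out)
{
    if (!have_index)
        return false;  /* pattern carried no index */

    if (index < 0 || (size_t) index >= arr->count)
        return false;  /* attacker-controlled value out of range */

    *out = arr->items[index];
    return true;
}
```

With this shape, a caller scanning many samples can treat a false return as a malformed file and move on to the next sample.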

Timeline

2019-02-19 - Vendor Disclosure
2019-02-20 - Plain text copy of report issued per vendor request
2019-02-22 - Vendor patched

Credit

Discovered by Cory Duplantis of Cisco Talos.