CVE-2017-2793
An exploitable heap corruption vulnerability exists in the UnCompressUnicode functionality of AntennaHouse DMC HTMLFilter used by MarkLogic 8.0-6. A specially crafted XLS file can cause heap corruption resulting in arbitrary code execution. An attacker can send or provide a malicious XLS file to trigger this vulnerability.
AntennaHouse DMC HTMLFilter shipped with MarkLogic 8.0-6
fb1a22fa08c986ec3614284f4e912b0a /opt/MarkLogic/Converters/cvtofc/libdhf_rdoc.so
15b0acc464fba28335239f722a62037f /opt/MarkLogic/Converters/cvtofc/libdmc_comm.so
1eabb31236c675f9856a7d001b339334 /opt/MarkLogic/Converters/cvtofc/libdhf_rxls.so
1415cbc784f05db0e9db424636df581a /opt/MarkLogic/Converters/cvtofc/libdhf_comm.so
4ae366fbd4540dd4c750e6679eb63dd4 /opt/MarkLogic/Converters/cvtofc/libdmc_conf.so
81db1b55e18a0cb70a78410147f50b9c /opt/MarkLogic/Converters/cvtofc/libdhf_htmlif.so
d716dd77c8e9ee88df435e74fad687e6 /opt/MarkLogic/Converters/cvtofc/libdhf_whtml.so
e01d37392e2b2cea757a52ddb7873515 /opt/MarkLogic/Converters/cvtofc/convert
https://www.antennahouse.com/antenna1/
8.3 - CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:C/C:H/I:H/A:H
This vulnerability is present in the AntennaHouse DMC HTMLFilter, which is used, among other things, to convert XLS files to (X)HTML.
This product is mainly used by MarkLogic for XLS document conversions as part of their web-based document search and rendering engine.
A specially crafted XLS file can lead to heap corruption and ultimately to remote code execution.
Let’s investigate this vulnerability. Running the XLS-to-HTML converter under Valgrind with a malformed XLS file as input produces the following output:
icewall@ubuntu:~/bugs/cvtofc_86$ valgrind ./convert config_xls
==46749== Memcheck, a memory error detector
==46749== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==46749== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==46749== Command: ./convert config_xls
==46749==
input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
==46749== Source and destination overlap in strcpy(0x43e178d, 0x43e178d)
==46749== at 0x402D56F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==46749== by 0x635186F: DHF_WOpen (in /home/icewall/bugs/cvtofc_86/libdhf_whtml.so)
==46749== by 0x4039779: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749== by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749== by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==46749==
==46749== Invalid write of size 1
==46749== at 0x40409B3: UnCompressUnicode (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749== by 0x4045223: String (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749== by 0x40532B8: DHF_RGetObject (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749== by 0x403979E: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749== by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749== by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==46749== Address 0x5429bb2 is 0 bytes after a block of size 65,538 alloc'd
==46749== at 0x402C109: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==46749== by 0x42DCD24: DMC_calloc (in /home/icewall/bugs/cvtofc_86/libdmc_comm.so)
==46749== by 0x4041002: InitMem (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749== by 0x40526BB: DHF_ROpen (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749== by 0x4039765: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749== by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749== by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
The out-of-bounds write occurs in the UnCompressUnicode function, but the overflowed buffer is allocated in InitMem and has a size of 65538 bytes.
Let's investigate the InitMem function:
signed int __cdecl InitMem(int a1)
{
  _DWORD *v1; // esi@1
  int v2; // eax@1

  v1 = *(_DWORD **)(a1 + 16164);
  v1[74] = DMC_calloc(65538, 1);
  v1[75] = DMC_calloc(65538, 1);
  v1[76] = DMC_calloc(65538, 1);
  v2 = DMC_calloc(256, 728);
  *(_DWORD *)(a1 + 9236) = v2;
  if ( !v1[76] || !v1[75] || !v1[74] || !v2 )
    return 12;
  GetEndian(v1);
  return 0;
}
As we can see, three allocations are made with the aforementioned size.
Looking at the call stack where the overflow appears, we see that the operations involved are related to a string object. Everything becomes clearer when we look at the following code from the DHF_RGetObject function:
    else
    {
      if ( v12 != 229 )
        goto LABEL_156;
      v13 = MergeCellRec(v3);
    }
    goto LABEL_155;
  }
  if ( v12 == 520 )
  {
    v13 = RowRec(v3);
    goto LABEL_155;
  }
  if ( v12 <= 520 )
  {
    if ( v12 == 517 )
    {
      v13 = BoolerrRec(v3, *(_BYTE *)(*(_DWORD *)(v3 + 296) + 6), *(_BYTE *)(*(_DWORD *)(v3 + 296) + 7));
    }
    else if ( v12 > 517 )
    {
      if ( v12 != 519 )
        goto LABEL_156;
      String();
    }
    else
    {
      if ( v12 != 515 )
        goto LABEL_156;

      v13 = NumberRec(a1, (struct_a2 *)v3);
    }

    v2 = v13;
    goto LABEL_156;
  }
This code is responsible for calling the proper object constructor based on the XLS record type. The buggy code path parses an XLS String record (type 519, i.e. 0x207), documented in section 2.4.268 of the Excel Binary Format (https://msdn.microsoft.com/en-us/library/dd923608(v=office.12).aspx). Further investigation reveals that the malformed String record is located at offset 0x11ED:
0x11ED : 07 02 07 00 03 F2 00 31 2F 34 22
0x207 - record type
0x7 - record length
0x03... - Data
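To make the byte layout above concrete, here is a minimal C sketch that decodes the fixed fields of that record. The struct, the function names, and the BIFF_STRING macro are hypothetical and chosen only for illustration; they are not taken from the AntennaHouse code. The sketch assumes the usual little-endian layout: a 2-byte record type, a 2-byte record length, then the 2-byte cch and the 1-byte flags field of the string.

```c
#include <stddef.h>
#include <stdint.h>

/* 0x0207 is 519 decimal, the value tested by `v12 != 519` in DHF_RGetObject. */
#define BIFF_STRING 0x0207u

/* Illustrative layout only; all names here are hypothetical. */
struct string_record {
    uint16_t type;    /* record type, 0x0207 for String */
    uint16_t length;  /* number of payload bytes after the 4-byte header */
    uint16_t cch;     /* character count claimed by the string */
    uint8_t  flags;   /* bit 0 is fHighByte */
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the fixed fields at the start of a String record.
 * Returns 0 on success, -1 if fewer than 7 bytes are available. */
static int parse_string_record(const uint8_t *buf, size_t len,
                               struct string_record *out)
{
    if (len < 7)
        return -1;
    out->type   = read_le16(buf);
    out->length = read_le16(buf + 2);
    out->cch    = read_le16(buf + 4);
    out->flags  = buf[6];
    return 0;
}
```

Fed the eleven bytes shown at offset 0x11ED, this yields type 0x0207, length 7, cch 0xF203 and a flags byte of 0x00, even though only four bytes of character data follow the flags byte.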
According to the documentation:
string (variable): An XLUnicodeString structure that specifies the string value of a formula (section 2.2.2). The value of string.cch MUST be less than or equal to 32767.
The Data included in the record is described by section 2.5.294 XLUnicodeString (https://msdn.microsoft.com/en-us/library/dd922754(v=office.12).aspx). The most important information from there:
cch (2 bytes): An unsigned integer that specifies the count of CHARACTERS in the string.
A - fHighByte (1 bit): A bit that specifies whether the characters in rgb are double-byte characters. MUST be a value from the following table (Value - Meaning):
0x0 - All the characters in the string have a high byte of 0x00 and only the low bytes are in rgb.
0x1 - All the characters in the string are saved as double-byte characters in rgb.
rgb (variable): An array of bytes that specifies the characters. If fHighByte is 0x0, the size of the array MUST be equal to cch. If fHighByte is 0x1, the size of the array MUST be equal to cch*2.
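The constraints quoted above can be expressed as a small consistency check. The following C sketch is an assumption about what such a check could look like, not code from the vulnerable library; the function name and macro are hypothetical. It applies the rules exactly as quoted (cch at most 32767, rgb size equal to cch or cch*2) and ignores any string data that might continue in a following record.

```c
#include <stddef.h>
#include <stdint.h>

/* Limit quoted from the String record documentation. */
#define XL_STRING_MAX_CCH 32767u

/* Hypothetical helper: returns 1 if a string with the given cch and flags
 * byte is consistent with `payload_len` bytes of record data, else 0.
 * `payload_len` counts the 2-byte cch and the 1-byte flags as well as rgb. */
static int xl_unicode_string_is_valid(uint16_t cch, uint8_t flags,
                                      size_t payload_len)
{
    size_t rgb_len;

    if (cch > XL_STRING_MAX_CCH)
        return 0;
    if (payload_len < 3)
        return 0;
    /* fHighByte set: two bytes per character; clear: one byte per character. */
    rgb_len = (flags & 1) ? (size_t)cch * 2 : (size_t)cch;
    return rgb_len == payload_len - 3;
}
```

For the malformed record in this report (cch 0xF203, flags 0x00, record length 7) the check fails on the first condition, and it would also fail on the last one, because only 4 bytes of rgb are present.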
We see the check for the A bit (fHighByte) at line 15 of the String function below. In our case it is equal to 0x0.
Line 1 signed int __usercall String(int a1@<ebp>)
Line 2 {
Line 3 struct_v1 *v1; // edi@1
Line 4 int v3; // edx@3
Line 5 size_t v4; // esi@5
Line 6
Line 7 v1 = *(struct_v1 **)(a1 + 8);
Line 8 *(_DWORD *)(a1 - 16) = (unsigned __int16)Exc_GetWord(v1, v1->dword128);
Line 9 *(_DWORD *)(a1 - 16) *= 2;
Line 10 *(_DWORD *)(v1->dword9A0 + 24 * v1->dword9A4 - 4) = DMC_malloc(*(_DWORD *)(a1 - 16));
Line 11 if ( !*(_DWORD *)(v1->dword9A0 + 24 * v1->dword9A4 - 4) )
Line 12 return 12;
Line 13 memset(*(void **)(v1->dword9A0 + 24 * v1->dword9A4 - 4), 0, *(_DWORD *)(a1 - 16));
Line 14 v3 = v1->dword128;
Line 15 if ( *(_BYTE *)(v3 + 2) & 1 )
Line 16 {
Line 17 memcpy(*(void **)(v1->dword9A0 + 24 * v1->dword9A4 - 4), (const void *)(v3 + 3), *(_DWORD *)(a1 - 16));
Line 18 }
Line 19 else
Line 20 {
Line 21 v4 = *(_DWORD *)(a1 - 16) >> 1;
Line 22 memcpy(v1->pvoid12C, (const void *)(v1->dword128 + 3), v4);
Line 23 v1->dword134 = v4;
Line 24 UnCompressUnicode((int)v1);
Line 25 memcpy(*(void **)(v1->dword9A0 + 24 * v1->dword9A4 - 4), v1->pvoid130, *(_DWORD *)(a1 - 16));
Line 26 }
Line 27 *(_DWORD *)(v1->dword9A0 + 24 * v1->dword9A4 - 8) = *(_DWORD *)(a1 - 16);
Line 28 return 0;
Line 29 }
In this branch, cch is taken as the number of single-byte characters, and according to the specification it should not exceed 32767. In our case, however, its value is:
>>> 0xf203
61955
This is nearly twice the limit, and there are no extra checks in the UnCompressUnicode function:
char *__cdecl UnCompressUnicode(struct_a1_2 *stringObj)
{
  int i; // ecx@1
  char *result; // eax@2

  for ( i = 0; i < stringObj->cch; ++i )
  {
    stringObj->globalInitMemPtr[2 * i] = stringObj->rawPtr[i];
    result = stringObj->globalInitMemPtr;
    result[2 * i + 1] = 0;
  }
  return result;
}
As we can see, the Unicode conversion writes two bytes for every input character. With cch equal to 61955, the loop writes 123910 bytes into the 65538-byte buffer allocated by the InitMem function, causing heap corruption which can lead to arbitrary code execution.
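For comparison, the following C sketch shows one possible bounds-checked form of the same expansion loop. It is an illustration under stated assumptions, not the vendor's fix: the function name, its parameters, and the INITMEM_BUF_SIZE macro are hypothetical, and the destination size is passed in explicitly instead of being read from the string object.

```c
#include <stddef.h>
#include <stdint.h>

/* Allocation size seen in InitMem (DMC_calloc(65538, 1)). */
#define INITMEM_BUF_SIZE 65538u

/* Hypothetical bounds-checked variant of the expansion loop: widens `cch`
 * single-byte characters from `src` into two-byte units in `dst`
 * (low byte first, high byte zero).
 * Returns 0 on success, -1 if the output would not fit in `dst_size` bytes. */
static int uncompress_unicode_checked(uint8_t *dst, size_t dst_size,
                                      const uint8_t *src, size_t cch)
{
    size_t i;

    /* Each input character produces two output bytes. */
    if (cch > dst_size / 2)
        return -1;
    for (i = 0; i < cch; ++i) {
        dst[2 * i] = src[i];
        dst[2 * i + 1] = 0;
    }
    return 0;
}
```

With a 65538-byte destination, the guard accepts at most 32769 characters, so the cch of 61955 from the malformed record (which would need 123910 bytes) is rejected before any write takes place.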
Starting program: /home/icewall/bugs/cvtofc_86/convert config_xls
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
*** Error in `/home/icewall/bugs/cvtofc_86/convert': double free or corruption (!prev): 0x080b8720 ***
Program received signal SIGABRT, Aborted.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0xe37b
ECX: 0xe37b
EDX: 0x6
ESI: 0x67 ('g')
EDI: 0xf7f06000 --> 0x1aada8
EBP: 0xfffec7e8 --> 0x80c8720 --> 0x0
ESP: 0xfffec524 --> 0xfffec7e8 --> 0x80c8720 --> 0x0
EIP: 0xf7fdacd9 (pop ebp)
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0xf7fdacd3: mov ebp,esp
0xf7fdacd5: sysenter
0xf7fdacd7: int 0x80
=> 0xf7fdacd9: pop ebp
0xf7fdacda: pop edx
0xf7fdacdb: pop ecx
0xf7fdacdc: ret
0xf7fdacdd: and edi,edx
[------------------------------------stack-------------------------------------]
0000| 0xfffec524 --> 0xfffec7e8 --> 0x80c8720 --> 0x0
0004| 0xfffec528 --> 0x6
0008| 0xfffec52c --> 0xe37b
0012| 0xfffec530 --> 0xf7d89687 (xchg ebx,edi)
0016| 0xfffec534 --> 0xf7f06000 --> 0x1aada8
0020| 0xfffec538 --> 0xfffec5d4 --> 0x50 ('P')
0024| 0xfffec53c --> 0xf7d8cab3 (mov edx,DWORD PTR gs:0x8)
0028| 0xfffec540 --> 0x6
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGABRT
0xf7fdacd9 in ?? ()
gdb-peda$ exploitable
Description: Heap error
Short description: HeapError (15/29)
Hash: e2d1b7f3a507d7d332ec76b419bab576.275feecd5f318bb3faa395083dd35f75
Exploitability Classification: EXPLOITABLE
Explanation: The target's backtrace indicates that libc has detected a heap error or that the target was executing a heap function when it stopped. This could be due to heap corruption, passing a bad pointer to a heap function such as free(), etc. Since heap errors might include buffer overflows, use-after-free situations, etc. they are generally considered exploitable.
Other tags: AbortSignal (27/29)
2017-02-09 - Vendor Disclosure
2017-05-04 - Public Release
Discovered by Marcin 'Icewall' Noga of Cisco Talos.