CVE-2016-8386
An exploitable heap-based buffer overflow exists in Iceni Argus. When the tool converts a PDF containing a malformed font to XML, it uses a size taken from the font to search a linked list of buffers for one to return. Due to a signedness issue, a buffer smaller than the requested size can be returned. When the tool later populates this buffer, the overflow occurs, which can lead to code execution in the context of the user running the tool.
Iceni Argus Version 6.6.04 (Sep 7 2012) NK
http://www.iceni.com/legacy.htm
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
This is a heap-based buffer overflow that occurs in a conversion tool that comes with Iceni Argus. This tool is used primarily by MarkLogic to convert PDF files to (X)HTML form.
When parsing a font file embedded within a PDF, the tool calls the ipParseFontFile function, which calls GetTables to read the different tables located within the font file. Inside GetTables, the tool eventually allocates space for the subtable offsets of each entry in the “cmap” table. If the font is corrupted, the following code is executed: the entry count is clamped to 0x64, so the allocation is limited to 0x64 * 4 (0x190) bytes.
813ff14: 0f b7 c6 movzwl %si,%eax ; from file
813ff17: c1 e0 08 shl $0x8,%eax
813ff1a: 09 c2 or %eax,%edx
813ff1c: 83 fa 64 cmp $0x64,%edx
813ff1f: 7e 05 jle 813ff26 <GetTables+0x996>
813ff21: ba 64 00 00 00 mov $0x64,%edx ; size to clamp to
813ff26: 8b 8d fc fa ff ff mov -0x504(%ebp),%ecx
813ff2c: 8d 83 c7 ae 49 ff lea -0xb65139(%ebx),%eax
813ff32: 89 91 cc 04 00 00 mov %edx,0x4cc(%ecx) ; XXX: write 0x64 to 0x4cc(%ecx)
813ff38: 89 44 24 04 mov %eax,0x4(%esp) ; "cmap offsets"
813ff3c: 8d 04 95 00 00 00 00 lea 0x0(,%edx,4),%eax ; multiplied by 4
813ff43: 89 04 24 mov %eax,(%esp) ; size
813ff46: e8 25 16 06 00 call 81a1570 <icnMalloc>
813ff4b: 85 c0 test %eax,%eax
813ff4d: 89 85 4c fb ff ff mov %eax,-0x4b4(%ebp)
813ff53: 0f 84 c8 02 00 00 je 8140221 <GetTables+0xc91>
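The clamp and allocation size in the listing above can be sketched in C as follows. This is only an illustration of the arithmetic: the function and macro names are hypothetical, and which of the two file bytes is the high byte is an assumption read off the shl/or pair. Only the constants 0x64 and 4 come from the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constant taken from the listing (cmp $0x64,%edx); the name is hypothetical. */
#define CMAP_MAX_ENTRIES 0x64

/* Build the 16-bit entry count from two file bytes, then clamp it.
   Treating `hi` as the shifted byte is an assumption. */
int32_t clamp_cmap_count(uint8_t hi, uint8_t lo)
{
    int32_t count = ((int32_t)hi << 8) | (int32_t)lo; /* shl $0x8 ; or */
    if (count > CMAP_MAX_ENTRIES)                     /* cmp $0x64 ; jle */
        count = CMAP_MAX_ENTRIES;                     /* mov $0x64,%edx */
    return count;
}

/* Size in bytes requested from icnMalloc for the "cmap offsets" array. */
size_t cmap_offsets_bytes(int32_t count)
{
    assert(count >= 0 && count <= CMAP_MAX_ENTRIES);
    return (size_t)count * 4;                         /* lea 0x0(,%edx,4) */
}
```

With any count above 0x64 in the file, the sketch yields 0x64 entries and a 0x190-byte request, matching the values described above.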
After allocating the space, the tool will read data from the offset table into this buffer. In the following code, the table will be searched for specific values. When the loop terminates, the resulting index will be left in -0x548(%ebp) and the font object in %esi.
81408a5: 84 c0 test %al,%al
81408a7: 75 5c jne 8140905 <GetTables+0x1375>
81408a9: 89 8d b8 fa ff ff mov %ecx,-0x548(%ebp) ; XXX: index that is aggregated
81408af: 83 c1 01 add $0x1,%ecx
81408b2: 39 d1 cmp %edx,%ecx
81408b4: 0f 84 05 f8 ff ff je 81400bf <GetTables+0xb2f>
81408ba: 8b b5 fc fa ff ff mov -0x504(%ebp),%esi ; XXX: check font object
81408c0: 0f b6 84 4e d0 04 00 movzbl 0x4d0(%esi,%ecx,2),%eax
81408c7: 00
81408c8: 3c 03 cmp $0x3,%al
81408ca: 75 d9 jne 81408a5 <GetTables+0x1315>
After finding the index, the tool uses it to fetch a size from the “cmap offsets” table. The fetched value is then added to the base size property of the object (read into %edi), and the sum is left in %esi.
813fe31: 8b b5 64 fb ff ff mov -0x49c(%ebp),%esi ; object
813fe37: 8b 85 f8 fa ff ff mov -0x508(%ebp),%eax
813fe3d: 8b 7e 0c mov 0xc(%esi),%edi ; read base size
...
81400cc: 8b 95 b8 fa ff ff mov -0x548(%ebp),%edx ; read index back out
...
81400f4: 8b 85 4c fb ff ff mov -0x4b4(%ebp),%eax
81400fa: 8b 34 90 mov (%eax,%edx,4),%esi ; read bad size
81400fd: 8b 95 f8 fa ff ff mov -0x508(%ebp),%edx
8140103: 89 14 24 mov %edx,(%esp)
8140106: e8 25 14 f7 ff call 80b1530 <ipStreamReset>
814010b: 85 c0 test %eax,%eax
814010d: 0f 84 16 08 00 00 je 8140929 <GetTables+0x1399>
...
8140929: 01 fe add %edi,%esi ; add %edi to size
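To make the effect of that addition concrete, here is a minimal C sketch of a 32-bit wrap-around sum that is later interpreted as a signed value. The function names and the sample values are hypothetical and chosen only for illustration; the advisory does not state the actual values used by the proof of concept.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 32-bit pattern as signed without relying on
   implementation-defined narrowing conversions. */
int32_t as_signed32(uint32_t v)
{
    if (v <= (uint32_t)INT32_MAX)
        return (int32_t)v;
    return (int32_t)(v - 0x80000000u) + INT32_MIN;
}

/* Mirrors add %edi,%esi: 32-bit wrap-around addition of the base size
   and the value read from the offsets table. */
int32_t total_read_size(uint32_t base_size, uint32_t table_value)
{
    return as_signed32((uint32_t)(base_size + table_value));
}

/* Example: a small base size plus a large file-controlled value
   produces a negative size when viewed as signed. */
void example_wrap(void)
{
    assert(total_read_size(0x10, 0x7ffffff0u) == INT32_MIN);
}
```

Because the table value comes from the file, an attacker can pick it so that the sum has its top bit set, which is the condition the later signed comparison mishandles.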
The resulting size is then passed to the icnBufferAlloc function, which contains a signedness issue. If a negative value is passed as the size, an undersized buffer will be returned, because icnBufferAlloc searches through a linked list for a buffer that is larger than its size argument using a signed comparison.
814092b: 8b bd 0c fb ff ff mov -0x4f4(%ebp),%edi
8140931: 89 74 24 04 mov %esi,0x4(%esp) ; size
8140935: 89 3c 24 mov %edi,(%esp) ; icnobject
8140938: e8 d3 23 f2 ff call 8062d10 <icnBufferAlloc>
814093d: 85 c0 test %eax,%eax
814093f: 0f 84 30 0f 00 00 je 8141875 <GetTables+0x22e5>
The icnBufferAlloc function, which is responsible for allocating memory out of a linked list, takes two arguments: the “icn” object and a size. The function treats the size as a signed value. After adding 1 to the size, it walks the linked list pointed to by the first argument, checking whether the size recorded in each entry is at least the requested size. If the requested size is less than 0, the signed comparison accepts any buffer in the linked list, regardless of its actual capacity.
8062d2f: 8b 45 0c mov 0xc(%ebp),%eax ; size
8062d32: 8d b3 b6 34 44 ff lea -0xbbcb4a(%ebx),%esi
8062d38: 89 75 e8 mov %esi,-0x18(%ebp)
8062d3b: 83 c0 01 add $0x1,%eax
8062d3e: 89 45 f0 mov %eax,-0x10(%ebp) ; size+1
...
8062da6: 8b 4f 04 mov 0x4(%edi),%ecx
8062da9: 89 d0 mov %edx,%eax
8062dab: 29 c8 sub %ecx,%eax
8062dad: 3b 45 f0 cmp -0x10(%ebp),%eax ; size+1
8062db0: 7c 94 jl 8062d46 <icnBufferAlloc+0x36>
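The search loop above can be approximated with the following C sketch. The node layout and field names are hypothetical, since the advisory does not show the real structure; the sketch assumes the subtraction in the listing computes the free space of a node. It only illustrates why a signed "less than" test (the jl instruction) accepts any node when the requested size is negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout; the real structure is not shown in the advisory. */
struct buf_node {
    struct buf_node *next;
    int32_t used;      /* bytes already handed out */
    int32_t capacity;  /* total bytes in this node */
};

/* Return the first node whose free space is not less than size + 1,
   using a signed comparison as in the listing. */
struct buf_node *find_buffer_signed(struct buf_node *head, int32_t size)
{
    assert(size < INT32_MAX);           /* keep size + 1 free of overflow */
    int32_t want = size + 1;            /* add $0x1,%eax */

    for (struct buf_node *n = head; n != NULL; n = n->next) {
        assert(n->capacity >= n->used);
        int32_t avail = n->capacity - n->used;  /* sub %ecx,%eax */
        if (avail < want)                       /* cmp ; jl (signed) */
            continue;                           /* too small, try next */
        return n;
    }
    return NULL;
}
```

In this sketch, a 16-byte node is rejected for a request of 100 bytes but accepted for a request of INT32_MIN + 16, because no non-negative amount of free space is ever less than a negative request.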
After icnBufferAlloc returns, the convert tool resumes in GetTables and executes the following code. This code passes the potentially undersized buffer, along with the size, to ipDataFeedRead, which reads data from the file directly into the buffer. Because the signedness issue can leave the allocated buffer smaller than the amount of data being read, a heap-based buffer overflow occurs.
8140945: 8b 95 0c fb ff ff mov -0x4f4(%ebp),%edx
814094b: 89 14 24 mov %edx,(%esp)
814094e: e8 ad 1d f2 ff call 8062700 <icnBufferGetMemory>
...
8140953: 8b 8d f8 fa ff ff mov -0x508(%ebp),%ecx
8140959: 89 74 24 08 mov %esi,0x8(%esp) ; size
814095d: 89 44 24 04 mov %eax,0x4(%esp) ; buffer
8140961: 8b 41 18 mov 0x18(%ecx),%eax
8140964: 89 04 24 mov %eax,(%esp) ; source
8140967: e8 24 43 f4 ff call 8084c90 <ipDataFeedRead>
$ gdb --quiet --args /opt/MarkLogic/converters/cvtpdf/convert ~/config/
Reading symbols from /opt/MarkLogic/Converters/cvtpdf/convert...done.
(gdb) r
Starting program: /opt/MarkLogic/Converters/cvtpdf/convert /home/user/config/
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
features='11140221'
Processing...
Analysing '/home/user/poc.pdf'
Pages 1 to 1
Catchpoint 4 (signal SIGSEGV), 0xf7ea902d in __memmove_ssse3_rep ()
from /lib/libc.so.6
(gdb) bt 5
#0 0xf7ea902d in __memmove_ssse3_rep () from /lib/libc.so.6
#1 0x082574f7 in ipFlateFeedRead ()
#2 0x08084cd7 in ipDataFeedRead ()
#3 0x0814096c in GetTables ()
#4 0x08141c6e in ipParseFontFile ()
(More stack frames follow...)
(gdb) h
-=[registers]=-
[eax: 0x0997c6c8] [ebx: 0xf7ea902a] [ecx: 0x6b016400] [edx: 0x099c8ff3]
[esi: 0x00000010] [edi: 0x980283d4] [esp: 0xfffbf9a8] [ebp: 0xfffbf9e8]
[eflags: NZ SF OF CF ND NI]
-=[stack]=-
fffbf9a8 | 08f57000 082574f7 099c8ff3 0997c6c8 | .p...t%.........
fffbf9b8 | 00000010 08084fac 0997c2a0 00000000 | .....O..........
fffbf9c8 | fffbfa08 f7dd54ea 9803a13f 9803a003 | .....T..?.......
fffbf9d8 | fffbf9f8 08f57000 9803a13f 098eefd0 | .....p..?.......
-=[disassembly]=-
=> 0xf7ea902d <__memmove_ssse3_rep+3773>: mov %ecx,0xc(%edx)
0xf7ea9030 <__memmove_ssse3_rep+3776>: mov 0x8(%eax),%ecx
0xf7ea9033 <__memmove_ssse3_rep+3779>: mov %ecx,0x8(%edx)
0xf7ea9036 <__memmove_ssse3_rep+3782>: mov 0x4(%eax),%ecx
0xf7ea9039 <__memmove_ssse3_rep+3785>: mov %ecx,0x4(%edx)
0xf7ea903c <__memmove_ssse3_rep+3788>: mov (%eax),%ecx
2016-10-10 - Vendor Disclosure
2017-02-27 - Public Release
Discovered by Marcin Noga of Cisco Talos and a Talos team member.