CVE-2018-4040
An exploitable uninitialized pointer vulnerability exists in the rich text format parser of Atlantis Word Processor, version 3.2.7.2. A specially crafted document can cause certain RTF tokens to dereference an uninitialized pointer and then write to it. An attacker must convince a victim to open a specially crafted document in order to trigger this vulnerability.
Atlantis Word Processor 3.2.7.1, 3.2.7.2
https://www.atlantiswordprocessor.com/en/
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE-457: Use of Uninitialized Variable
Atlantis Word Processor is a traditional word processor that provides a number of useful features for a variety of users. The software is fully compatible with other word processors, such as Microsoft Office Word 2007, and even has a similar interface to Microsoft Word. Atlantis also has the ability to encrypt document files and fully customize the interface. This application is written in Delphi and contains the majority of its capabilities within a single relocatable binary.
When opening up an RTF document, the application will first fingerprint it in order to determine the correct file format parser via the following code. This code will first fetch the current TDoc object and then check one of its fields that represents the current file format enumeration. When RTF is selected, this field will have the value 0, which results in case 0 being executed. At [1], the application will then execute a function that fingerprints and parses the document. Inside that function, the application will allocate a number of data structures and then continue by executing the call at [2].
awp+0x1b3139:
005b3139 8b45e8 mov eax,dword ptr [ebp-18h] // TDoc
005b313c 8b80dc000000 mov eax,dword ptr [eax+0DCh] // TDoc.fileFormatEnumeration
005b3142 83f805 cmp eax,5
005b3145 776a ja awp+0x1b31b1 (005b31b1)
005b3147 ff24854e315b00 jmp dword ptr awp+0x1b314e (005b314e)[eax*4]
...
awp+0x1b3166:
005b3166 55 push ebp // Case 0
005b3167 e858dcfdff call awp+0x190dc4 (00590dc4) // [1]
005b316c 59 pop ecx
005b316d 8885d7f8ffff mov byte ptr [ebp-729h],al
005b3173 eb49 jmp awp+0x1b31be (005b31be)
awp+0x190dc4:
00590dc4 55 push ebp
00590dc5 8bec mov ebp,esp
00590dc7 81c44cf5ffff add esp,0FFFFF54Ch
00590dcd 53 push ebx
00590dce 56 push esi
00590dcf 57 push edi
00590dd0 33c0 xor eax,eax
...
00590fdd 55 push ebp
00590fde 33c0 xor eax,eax
00590fe0 e8c7a4ffff call awp+0x18b4ac (0058b4ac) // [2]
00590fe5 59 pop ecx
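The dispatch at awp+0x1b3142 is a bounds-checked jump table. The following is a minimal Python sketch of that control flow, not a reconstruction of the original Delphi source. The handler names are hypothetical, and only case 0 (RTF) is described in this advisory.

```python
# Sketch of the bounds-checked jump table (cmp eax,5 / ja / jmp [table+eax*4]).
# Handler names are hypothetical; the advisory only describes case 0 (RTF).

def parse_rtf(doc):
    return "rtf"            # stands in for the call at [1]

def other_format(doc):
    return "other"          # cases 1..5 are not described in the advisory

def default_case(doc):
    return "default"        # target of the 'ja' branch

# Six slots, because the field is compared against 5 before indexing.
JUMP_TABLE = [parse_rtf] + [other_format] * 5

def dispatch(file_format, doc=None):
    # 'ja' is an unsigned comparison, so a negative value would also
    # be treated as out of range and take the default path.
    if not 0 <= file_format <= 5:
        return default_case(doc)
    return JUMP_TABLE[file_format](doc)
```

The range check before the indexed jump is what keeps a corrupt enumeration value from indexing past the end of the table.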
Once inside the function call at 0x58b4ac, the application will eventually execute the function at [3]. This function will check that the current character is alphabetic. This is used to determine whether the current function should recurse into itself in order to handle grouping or some of the other features provided by the application’s Rich Text Format parser. After checking the beginning of the document, the application will then enter the loop at [4]. This loop will continue to parse the different groups within the document. When a group has been identified by the parser, the function will recurse into itself at [5].
awp+0x18b4ac:
0058b4ac 55 push ebp
0058b4ad 8bec mov ebp,esp
0058b4af 83c4e8 add esp,0FFFFFFE8h
0058b4b2 53 push ebx
0058b4b3 56 push esi
0058b4b4 57 push edi
0058b4b5 8845ff mov byte ptr [ebp-1],al
...
0058b4ca 8b07 mov eax,dword ptr [edi]
0058b4cc 0303 add eax,dword ptr [ebx]
0058b4ce 40 inc eax
0058b4cf e8cc64eaff call awp+0x319a0 (004319a0) // [3] Check alpha character
0058b4d4 84c0 test al,al
0058b4d6 0f8486000000 je awp+0x18b562 (0058b562)
...
awp+0x18b5bc:
0058b5bc 8b03 mov eax,dword ptr [ebx] // [4] Loop over RTF characters
0058b5be 8b17 mov edx,dword ptr [edi]
0058b5c0 0fb60402 movzx eax,byte ptr [edx+eax]
0058b5c4 83f87b cmp eax,7Bh
0058b5c7 7f21 jg awp+0x18b5ea (0058b5ea)
0058b5c9 7442 je awp+0x18b60d (0058b60d)
...
awp+0x18b60d:
0058b60d 8b07 mov eax,dword ptr [edi]
0058b60f 0303 add eax,dword ptr [ebx]
0058b611 e87663eaff call awp+0x3198c (0043198c) // Check current position for '{*\'
0058b616 84c0 test al,al
0058b618 742a je awp+0x18b644 (0058b644)
...
awp+0x18b644:
0058b644 8b4508 mov eax,dword ptr [ebp+8]
0058b647 50 push eax
0058b648 b001 mov al,1
0058b64a e85dfeffff call awp+0x18b4ac (0058b4ac) // [5] Recurse into current function
0058b64f 59 pop ecx
0058b650 e934040000 jmp awp+0x18ba89 (0058ba89)
...
awp+0x18bab2:
0058bab2 8b03 mov eax,dword ptr [ebx]
0058bab4 8b5508 mov edx,dword ptr [ebp+8]
0058bab7 8b5208 mov edx,dword ptr [edx+8]
0058baba 3b42dc cmp eax,dword ptr [edx-24h] // Distance to move
0058babd 0f8cf9faffff jl awp+0x18b5bc (0058b5bc)
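The loop-and-recurse structure described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the function name and return value are hypothetical, token and text handling are omitted, and returning on a closing brace is an assumption, since that branch is not shown in the listing.

```python
def parse_group(data: bytes, pos: int = 0, depth: int = 0):
    """Walk RTF groups from pos; return (next_pos, deepest_nesting_seen)."""
    max_depth = depth
    while pos < len(data):            # mirrors the 'jl' back-edge at 0x58babd
        ch = data[pos]
        if ch == 0x7B:                # '{' opens a group: recurse, as at [5]
            pos, inner = parse_group(data, pos + 1, depth + 1)
            max_depth = max(max_depth, inner)
        elif ch == 0x7D:              # '}' closes the group (assumed behaviour)
            return pos + 1, max_depth
        else:
            pos += 1                  # token and text handling omitted
    return pos, max_depth
```

For example, `parse_group(b"{\\rtf1{\\info{\\subject x}}}")` consumes the whole buffer and reports a nesting depth of 3, one level per opening brace.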
Once the previous function has determined it needs to recurse into itself, the application will re-enter the function at 0x58b4ac. After determining that the currently parsed character is beginning a group or a valid token, the application will execute the function call at [6]. This function will verify that the current character position is pointing at characters that could make up an RTF token and return the token in %eax. After getting the token identifier, the application will then pass the identifier in %eax to the function call at [7] to parse it.
awp+0x18b4dc:
0058b4dc 8b4508 mov eax,dword ptr [ebp+8]
0058b4df 50 push eax
0058b4e0 8b4508 mov eax,dword ptr [ebp+8]
0058b4e3 0508f7ffff add eax,0FFFFF708h
0058b4e8 e81fbbffff call awp+0x18700c (0058700c) // [6] Check valid RTF token
0058b4ed 59 pop ecx
0058b4ee 83f8ff cmp eax,0FFFFFFFFh // Token Id
0058b4f1 7557 jne awp+0x18b54a (0058b54a)
...
0058b54a 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b54d 52 push edx
0058b54e 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b551 8b920cf7ffff mov edx,dword ptr [edx-8F4h]
0058b557 e8cc510000 call awp+0x190728 (00590728) // [7]
0058b55c 59 pop ecx
0058b55d e97e050000 jmp awp+0x18bae0 (0058bae0)
Once inside the function at 0x590728, the application will first assign the current token identifier into a variable on the stack. Eventually, this identifier will be used to select a particular case to continue parsing with at [8]. When parsing the token “\subject”, which has the identifier 0x179, the application will then execute the case at [9]. The case at [9] will first grab the current TDoc instance, and then store the offset of the TDoc’s description field into %eax. These will then get passed as arguments to the call at [10].
awp+0x190728:
00590728 55 push ebp
00590729 8bec mov ebp,esp
0059072b 83c4f8 add esp,0FFFFFFF8h
0059072e 53 push ebx
0059072f 8bda mov ebx,edx
00590731 8945f8 mov dword ptr [ebp-8],eax // Case
...
00590775 0fb745f8 movzx eax,word ptr [ebp-8] // [8] Case
00590779 3ddf000000 cmp eax,0DFh
0059077e 0f8f32010000 jg awp+0x1908b6 (005908b6)
00590784 0f8442040000 je awp+0x190bcc (00590bcc)
...
005908b6 3d35010000 cmp eax,135h
005908bb 0f8fa1000000 jg awp+0x190962 (00590962)
...
00590962 3d77010000 cmp eax,177h
00590967 7f4a jg awp+0x1909b3 (005909b3)
...
005909b3 3da9010000 cmp eax,1A9h
005909b8 7f1f jg awp+0x1909d9 (005909d9)
005909ba 0f848e030000 je awp+0x190d4e (00590d4e)
005909c0 2d79010000 sub eax,179h
005909c5 0f8451030000 je awp+0x190d1c (00590d1c)
...
00590d1c 55 push ebp // [9]
00590d1d 8b4508 mov eax,dword ptr [ebp+8] // Frame
00590d20 8b4008 mov eax,dword ptr [eax+8]
00590d23 8b40e8 mov eax,dword ptr [eax-18h] // TDoc
00590d26 05ec000000 add eax,0ECh // TDoc.description
00590d2b b201 mov dl,1
00590d2d e802aeffff call awp+0x18bb34 (0058bb34) // [10]
00590d32 59 pop ecx
00590d33 eb7c jmp awp+0x190db1 (00590db1)
The function at 0x58bb34 is simply a wrapper around the call to [11]. This function will close around the variables belonging to the caller’s frame, and then call another function. Once inside the function at 0x58afa4, the application will then enter a loop at [12] in order to determine how to parse the different tokens in the document. This loop will read a byte from the current position in the file and then check to see if it is non-printable, a backslash, or one of the types of braces. When processing the token, the application will execute the case for the backslash at [13].
awp+0x18bb34:
0058bb34 55 push ebp
0058bb35 8bec mov ebp,esp
0058bb37 8b4d08 mov ecx,dword ptr [ebp+8] // Caller frame
0058bb3a 8b4908 mov ecx,dword ptr [ecx+8] // Frame belonging to 0x590dc4
0058bb3d 51 push ecx
0058bb3e 33c9 xor ecx,ecx
0058bb40 e85ff4ffff call awp+0x18afa4 (0058afa4) // [11]
0058bb45 59 pop ecx
0058bb46 8b4508 mov eax,dword ptr [ebp+8] // Caller frame
0058bb49 8b4008 mov eax,dword ptr [eax+8] // Frame belonging to 0x590dc4
0058bb4c 8b4008 mov eax,dword ptr [eax+8] // Frame belonging to 0x5b2a3c
0058bb4f ff40f8 inc dword ptr [eax-8]
0058bb52 5d pop ebp
0058bb53 c3 ret
awp+0x18afa4:
0058afa4 55 push ebp
0058afa5 8bec mov ebp,esp
0058afa7 83c4d4 add esp,0FFFFFFD4h
0058afaa 53 push ebx
0058afab 56 push esi
0058afac 57 push edi
0058afad 33db xor ebx,ebx
...
0058b003 8b03 mov eax,dword ptr [ebx] // [12] Loop
0058b005 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b008 8b5208 mov edx,dword ptr [edx+8] // Caller's Frame
0058b00b 8b52e0 mov edx,dword ptr [edx-20h] // File contents buffer
0058b00e 8a0402 mov al,byte ptr [edx+eax]
0058b011 2c20 sub al,20h
0058b013 0f8238030000 jb awp+0x18b351 (0058b351) // Non-printable
0058b019 2c3c sub al,3Ch
0058b01b 0f84d0000000 je awp+0x18b0f1 (0058b0f1) // [13] If character is "\"
0058b021 2c1f sub al,1Fh
0058b023 740d je awp+0x18b032 (0058b032) // Left Brace
0058b025 2c02 sub al,2
0058b027 0f8406030000 je awp+0x18b333 (0058b333) // Right Brace
0058b02d e926030000 jmp awp+0x18b358 (0058b358)
...
awp+0x18b450:
0058b450 8b03 mov eax,dword ptr [ebx] // [12] Continue
0058b452 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b455 8b5208 mov edx,dword ptr [edx+8] // Caller's Frame
0058b458 3b42dc cmp eax,dword ptr [edx-24h]
0058b45b 0f8ca2fbffff jl awp+0x18b003 (0058b003)
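The character test at [12] works by repeatedly subtracting from %al, so each constant is relative to the previous one: 0x20 + 0x3C = 0x5C (backslash), 0x5C + 0x1F = 0x7B (left brace), and 0x7B + 0x02 = 0x7D (right brace). The following Python sketch reproduces that arithmetic with 8-bit wraparound. The function name and the returned labels are illustrative only.

```python
def classify(byte: int) -> str:
    """Mimic the sub/jb/je chain starting at 0x58b011 on an 8-bit register."""
    al = byte & 0xFF
    if al < 0x20:                 # sub al,20h ; jb  -> borrow means al < 0x20
        return "non-printable"
    al = (al - 0x20) & 0xFF
    al = (al - 0x3C) & 0xFF       # sub al,3Ch ; je  -> original byte was 0x5C
    if al == 0:
        return "backslash"
    al = (al - 0x1F) & 0xFF       # sub al,1Fh ; je  -> original byte was 0x7B
    if al == 0:
        return "left brace"
    al = (al - 0x02) & 0xFF       # sub al,2   ; je  -> original byte was 0x7D
    if al == 0:
        return "right brace"
    return "other"                # falls through to the jmp at 0x58b02d
```

Because each comparison is an equality test after a cumulative subtraction, wraparound cannot produce a false match: only 0x5C, 0x7B, and 0x7D reach zero at their respective steps.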
When processing a backslash which prefixes a Rich Text Format token, the following code will be executed in order to identify the token. First at [14], the application will read a bigram from the current file position and use this to single out which array to match the token against. Once this is determined, at [15] the application will call a function that will do a comparison and store the actual token identifier. After doing a few checks to handle a specific case for another token, the application will then fetch the current paragraph from one of the caller’s frames and then pass it as an argument along with the token identifier to the call at [16].
awp+0x18b0f1:
0058b0f1 8b4508 mov eax,dword ptr [ebp+8] // Frame
0058b0f4 8b4008 mov eax,dword ptr [eax+8] // Caller's Frame
0058b0f7 8b40e0 mov eax,dword ptr [eax-20h] // File contents buffer
0058b0fa 0303 add eax,dword ptr [ebx]
0058b0fc e89f68eaff call awp+0x319a0 (004319a0) // [14] Read a bigram to identify token
0058b101 84c0 test al,al
0058b103 0f84f5000000 je awp+0x18b1fe (0058b1fe)
0058b109 8b4508 mov eax,dword ptr [ebp+8] // Frame
0058b10c 8b4008 mov eax,dword ptr [eax+8] // Caller's Frame
0058b10f 8b40e0 mov eax,dword ptr [eax-20h] // File contents buffer
0058b112 0303 add eax,dword ptr [ebx]
0058b114 40 inc eax
0058b115 8d55dc lea edx,[ebp-24h]
0058b118 e8af68eaff call awp+0x319cc (004319cc) // Help single out which token array
0058b11d 8d45dc lea eax,[ebp-24h]
0058b120 e88f6aeaff call awp+0x31bb4 (00431bb4) // [15] Return the token identifier
...
awp+0x18b12b:
0058b12b 3da7010000 cmp eax,1A7h
0058b130 755e jne awp+0x18b190 (0058b190)
...
awp+0x18b190:
0058b190 8b5508 mov edx,dword ptr [ebp+8]
0058b193 f6429240 test byte ptr [edx-6Eh],40h
0058b197 743c je awp+0x18b1d5 (0058b1d5)
0058b199 3d7c010000 cmp eax,17Ch
0058b19e 7535 jne awp+0x18b1d5 (0058b1d5)
...
awp+0x18b1d5:
0058b1d5 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b1d8 52 push edx
0058b1d9 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b1dc 8b5208 mov edx,dword ptr [edx+8] // Caller's Frame
0058b1df 8b9258f9ffff mov edx,dword ptr [edx-6A8h] // Current Paragraph (TPar)
0058b1e5 83c214 add edx,14h // Paragraph Property
0058b1e8 52 push edx
0058b1e9 8b5508 mov edx,dword ptr [ebp+8] // Frame
0058b1ec 8d4aa4 lea ecx,[edx-5Ch]
0058b1ef 8d55dc lea edx,[ebp-24h] // Token Identifier
0058b1f2 92 xchg eax,edx
0058b1f3 e8b8c4ffff call awp+0x1876b0 (005876b0) // [16] Handle Token Operand
0058b1f8 59 pop ecx
0058b1f9 e929020000 jmp awp+0x18b427 (0058b427)
Finally, when the application executes the function at 0x5876b0, the application will begin to parse any operands belonging to the token. There are a number of different types of tokens that the application can parse depending on the token identifier that was identified in the calling function. At [17], the application will use the determined token identifier to identify the correct case to execute. When handling case 232 for the token “\listsimple”, the block of code beginning near [18] will be executed. The function call at [18] will simply extract a numerical argument out of the text that follows the token and then store it into the %ebx register. Afterwards, the function call at [19] will be used to fetch the last element of a TAutoList that should have been initialized earlier. Because the application does not properly check the state of the list before the function call at [19], the call can return an invalid value based on the capacity of the list. Due to Delphi’s technique of caching objects, an attacker may be able to get an object with a controlled capacity freed, at which point it will be used to calculate a pointer. Following this at [20], the application will attempt to write the low byte of the integer parsed from the file (held in %ebx) to an address relative to the returned pointer. This can allow an attacker to corrupt heap memory, which can lead to code execution under the context of the application.
awp+0x1876b0:
005876b0 55 push ebp
005876b1 8bec mov ebp,esp
005876b3 83c4d4 add esp,0FFFFFFD4h
005876b6 53 push ebx
005876b7 56 push esi
005876b8 57 push edi
005876b9 894dfc mov dword ptr [ebp-4],ecx
...
005876c2 0fb7d0 movzx edx,ax // [17]
005876c5 81fac7010000 cmp edx,1C7h
005876cb 0f878b380000 ja awp+0x18af5c (0058af5c)
005876d1 ff2495d8765800 jmp dword ptr awp+0x1876d8 (005876d8)[edx*4]
...
awp+0x18942c:
0058942c 8bc6 mov eax,esi
0058942e e84586eaff call awp+0x31a78 (00431a78) // [18] Parse numerical argument
00589433 8bd8 mov ebx,eax // User-controlled value
00589435 8b450c mov eax,dword ptr [ebp+0Ch] // Frame
00589438 8b4008 mov eax,dword ptr [eax+8] // Caller's Frame
0058943b 8b8040f9ffff mov eax,dword ptr [eax-6C0h] // TAutoList
00589441 e80aeae7ff call awp+0x7e50 (00407e50) // [19] Return last element in TList (possibly uninitialized)
00589446 885818 mov byte ptr [eax+18h],bl // [20] Write byte to uninitialized address
00589449 e90e1b0000 jmp awp+0x18af5c (0058af5c)
The function call to fetch the last item from a TList is as follows. If the items pointer of the TList points directly into the TList’s cache and the length is 0, the application will fetch an item from outside the bounds of the items array, which can result in an arbitrary value being returned. The returned value is then used as the base address of the write described above.
awp+0x7e50:
00407e50 8b4804 mov ecx,dword ptr [eax+4] // Length
00407e53 8b401c mov eax,dword ptr [eax+1Ch] // Items (which can point into cached list)
00407e56 8b4488fc mov eax,dword ptr [eax+ecx*4-4] // Return item from List cache
00407e5a c3 ret
eax=00000001 ebx=00000bb3 ecx=00000000 edx=07354f14 esi=0018ea94 edi=0018fb00
eip=00589446 esp=0018e778 ebp=0018e7b0 iopl=0 nv up ei pl nz ac po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010212
awp+0x189446:
00589446 885818 mov byte ptr [eax+18h],bl ds:002b:00000019=??
0:000> ub .
awp+0x18942a:
0058942a 0000 add byte ptr [eax],al
0058942c 8bc6 mov eax,esi
0058942e e84586eaff call awp+0x31a78 (00431a78)
00589433 8bd8 mov ebx,eax
00589435 8b450c mov eax,dword ptr [ebp+0Ch]
00589438 8b4008 mov eax,dword ptr [eax+8]
0058943b 8b8040f9ffff mov eax,dword ptr [eax-6C0h]
00589441 e80aeae7ff call awp+0x7e50 (00407e50)
0:000> ? poi(poi(poi(@ebp+c)+8)-6c0)
Evaluate expression: 201277336 = 0bff3f98
0:000> $$>a<c:/users/user/audit/atlantis/scripts/TList.dbgscr poi(poi(poi(@ebp+c)+8)-6c0)
[0bff3f98] <type 'structure' name='TList' size=+0x20>
[0bff3f98] (+0) : p_InfoTable_0 : (00406898) 4221080
[0bff3f9c] (+4) : v_length_4 : (00000000) 0
[0bff3fa0] (+8) : v_capacity_8 : (00000001) 1
[0bff3fa4] (+c) : v_cache(4)_c : { 07366b34, 00000000, 00000000, 00000000 }
[0bff3fb4] (+1c) : p_items_1c : (0bff3fa4) 201277348
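The register state and TList dump above are consistent with the out-of-bounds read in the function at 0x407e50. The following Python sketch models the dumped object as a flat 0x20-byte image and replays the three loads. The field offsets and values are taken from the dump; the helper names are hypothetical, and `last_checked` is only an illustration of the missing length check, not the vendor's fix.

```python
import struct

BASE = 0x0BFF3F98  # address of the TList in the dump

# Fields in dump order: info table, length, capacity, cache[4], items pointer.
image = struct.pack("<8I",
                    0x00406898,            # +0x00 p_InfoTable
                    0,                     # +0x04 length
                    1,                     # +0x08 capacity
                    0x07366B34, 0, 0, 0,   # +0x0c cache[4]
                    BASE + 0x0C)           # +0x1c items -> points into the cache

def read32(addr: int) -> int:
    """Read a little-endian dword from the modelled object."""
    return struct.unpack_from("<I", image, addr - BASE)[0]

def last_unchecked(obj: int) -> int:
    length = read32(obj + 0x04)            # mov ecx,[eax+4]
    items = read32(obj + 0x1C)             # mov eax,[eax+1Ch]
    return read32(items + length * 4 - 4)  # mov eax,[eax+ecx*4-4]

def last_checked(obj: int) -> int:
    # Hypothetical guard: refuse to index an empty list.
    if read32(obj + 0x04) == 0:
        raise IndexError("list is empty")
    return last_unchecked(obj)

returned = last_unchecked(BASE)            # reads items - 4, i.e. the capacity field
fault_address = returned + 0x18            # mov byte ptr [eax+18h],bl
```

With a length of 0, the load lands four bytes before the cache, on the capacity field at +0x08. That yields the value 1 seen in eax, and 1 + 0x18 gives the faulting address 0x00000019 shown in the crash.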
To use the proof of concept, simply open up or preview the document in the target application. The application should crash at the address specified due to heap corruption.
This vulnerability is triggered by simply opening up a document file. The only way to mitigate this would be to not open a document file from an untrusted user.
2018-11-16 - Vendor Disclosure
2018-11-20 - Vendor Patched; Public Release
Discovered by a member of Cisco Talos.