CVE-2017-4941
An exploitable code execution vulnerability exists in the remote management functionality of VMware. A specially crafted set of VNC packets can cause a type confusion resulting in a stack overwrite, which could lead to code execution. An attacker can initiate a VNC session to trigger this vulnerability.
VMware Workstation Pro 12.5.7, Linux/Windows
https://my.vmware.com/web/vmware/info/slug/desktop_end_user_computing/vmware_workstation_pro/12_0
9.0 - CVSS:3.0/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:H/A:H
CWE-843: Access of Resource Using Incompatible Type
VMware’s VNC implementation is used for remote management, remote access, and automation in VMware products such as Workstation, Player, and ESXi, which all share a common VMware VNC code base.
As specified by the RFB protocol[1], all VNC servers have to support a specific set of VNC messages. The relevant messages for the purposes of this write-up are listed (format: [byteLen,Varname]):
VncPointerEvent
[\x05][1,'button-mask'][2,'xCord'][2,'yCord']
VncSetPixelFormat
[\x00][3,'pad'],[1,'bpp'],[1,'depth'],[1,'bigE_fl'],[1,'true_col_fl'],
[2,'redmax'],[2,'greenmax'],[2,'bluemax'],[1,'redshift'],[1,'greenshift'],[1,'blueshift'],
[3,'pad']
VncFrameBufferUpdateRequest
[\x03][1,'incremental'],[2,'xpos'],[2,'ypos'],[2,'width'],[2,'height']
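As a rough illustration of the layouts listed above, the following Python sketch packs the three client messages with the standard struct module. The function names and the example field values are hypothetical and chosen only for illustration; the field order and sizes follow the listing above, and RFB uses big-endian (network) byte order for multi-byte fields.

```python
import struct

def pointer_event(button_mask: int, x: int, y: int) -> bytes:
    # Type 0x05, 1-byte button mask, 2-byte x, 2-byte y.
    return struct.pack(">BBHH", 0x05, button_mask, x, y)

def set_pixel_format(bpp, depth, big_endian, true_colour,
                     red_max, green_max, blue_max,
                     red_shift, green_shift, blue_shift) -> bytes:
    # Type 0x00, 3 pad bytes, 13 bytes of pixel-format fields, 3 pad bytes.
    return struct.pack(">B3xBBBBHHHBBB3x", 0x00,
                       bpp, depth, big_endian, true_colour,
                       red_max, green_max, blue_max,
                       red_shift, green_shift, blue_shift)

def framebuffer_update_request(incremental, x, y, width, height) -> bytes:
    # Type 0x03, 1-byte incremental flag, four 2-byte rectangle fields.
    return struct.pack(">BBHHHH", 0x03, incremental, x, y, width, height)

# Example values only; max/shift values here are arbitrary placeholders.
print(len(pointer_event(0, 10, 10)))                         # 6
print(len(set_pixel_format(8, 8, 0, 0, 7, 7, 3, 0, 3, 6)))   # 20
print(framebuffer_update_request(0, 0, 0, 640, 480).hex())   # 03000000000002800 1e0 without the space
```

The sketch only builds byte strings; the RFB handshake and any network I/O are deliberately left out.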
At a high level, we ask the VNC server to produce a frame buffer update (essentially a screenshot) and then force the mouse cursor to be re-rendered. During this period, we also ask the VNC server to turn off “TrueColor” mode (as gathered from the VncServerInit packet) via the VncSetPixelFormat message. Turning TrueColor mode off causes the server to misinterpret the type of the cursor when it tries to decode it, so a large value (0xff0000) is written into the cursor’s PNG infoStruct. This value is later misinterpreted as the number of palette entries or pixels, so a loop that reads from and writes to the stack runs far past the end of its buffer.
A few details are required for this crash to occur:
1. The VncSetPixelFormat message must have the ‘bits-per-pixel’ and ‘depth’ fields set to 0x8; otherwise the libpng code will not reach the overflow loop.
2. Although a VncPointerEvent message was used here, any VNC or VMware VNC message that causes the cursor to be re-rendered is sufficient. This includes the VMware pointer event packet and the VNC messages that grab the cursor.
3. The VncFrameBufferUpdateRequest should not have the “incremental” field set. If it does, replaying the PoC packets still causes the crash.
It should also be noted that almost any ordering of these packets will work. The only way I could avoid the crash was to send the FrameBufferUpdate request first, followed by a 0.2 second sleep before the PixelFormat request. The position of the VncPointerEvent message is not important.
Now for the lower-level details. The following disassembly overwrites the stack variable that is eventually used as the loop counter:
mov dword ptr [rsp+136], 0FF0000h
mov dword ptr [rsp+144], 0FF00h
mov [rsp+80], rax
lea rax, concurency_demangle
mov dword ptr [rsp+140], 0FFh
The main indicators that this is a type confusion are the hardcoded values above, which also look like RGB color bitmasks. Interestingly, before this assignment there is a check on the byte at [rdi+0x413] within a custom VMware structure that also contains PNG struct info; the assumption is that this byte indicates whether the PNG cursor is currently in TrueColor mode.
<(^.^)># x/4gx $rdi+0x410
0x7f90900701a0: 0x0000000101000001 0xfffffffcfffffff9
0x7f90900701b0: 0x000000130000000d 0x0000000000000000
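To make the two observations above concrete, the short Python sketch below checks that the three hardcoded constants are what 8-bit channel masks look like, and pulls the byte at offset 0x413 out of the first quadword of the GDB dump. The channel shifts (16, 8, 0) are an assumption made for illustration, and the interpretation of the flag byte remains the same unconfirmed guess as in the text.

```python
import struct

# Constants written by the disassembly, keyed by their rsp offset.
STACK_WRITES = {136: 0xFF0000, 140: 0x0000FF, 144: 0x00FF00}

def channel_mask(max_value: int, shift: int) -> int:
    # A color channel mask is the channel maximum moved to its bit position.
    return max_value << shift

# Assumed layout: 8 bits per channel at shifts 16/8/0 (not confirmed by the source).
masks = {channel_mask(0xFF, shift) for shift in (16, 8, 0)}
print(masks == set(STACK_WRITES.values()))  # True

# First quadword shown by `x/4gx $rdi+0x410`; x86-64 is little-endian,
# so byte 0 of the packed value is the byte at $rdi+0x410.
qword_at_410 = 0x0000000101000001
raw = struct.pack("<Q", qword_at_410)
flag_413 = raw[0x413 - 0x410]
print(hex(flag_413))  # 0x1
```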
If you wait long enough (more than 0.20 seconds) between sending the FrameBuffer and SetPixelFormat VNC messages, this crash does not occur. The current working theory is therefore that there is a race condition between the thread that renders the cursor onto the frame buffer and the thread that updates the encoding of the cursor and the frame buffer, resulting in a type mismatch where one is TrueColor and the other is not.
Interestingly, the Linux version of this bug may be exploitable, while the Windows version is much less likely to be. The disassembly of the crashing loop on Linux and Windows, respectively, is listed below:
// Linux
loc_7F350F487A40:
movzx ecx, byte ptr [rdx+1Eh] ; extract RGB data
add esi, 1
mov [rax], cl
movzx ecx, byte ptr [rdx+1Dh]
mov [rax+1], cl
movzx ecx, byte ptr [rdx+1Ch]
add rdx, 4
mov [rax+2], cl
mov rdi, [rsp+398h+counter_base_obj]
add rax, 3
mov ecx, [rdi+18h] ; rdi+0x18==max_loops
cmp ecx, esi ; esi==iteration_count
ja short loc_7F350F487A40 ; extract RGB data
// Windows
lea rcx, [rsp+3B8h+var_327]
lea rdx, [rdi+1Dh]
mov r8, r9 ; Max_loops only read once, before loop
loc_14044C270: ; Crashing loop
movzx eax, byte ptr [rdx+1]
add rcx, 3
add rdx, 4
sub r8, 1
mov [rcx-4], al
movzx eax, byte ptr [rdx-4]
mov [rcx-3], al
movzx eax, byte ptr [rdx-5]
mov [rcx-2], al
jnz short loc_14044C270 ; Crashing loop
Since the Linux version re-reads the loop bound through the stack on every iteration, the overflow eventually overwrites that bound with another value. If this value is ever less than or equal to the current iteration count, the loop exits and the program resumes with a modified stack. The Windows version reads the bound only once, before the loop, so the overflow cannot shorten the loop and it simply runs until it faults. Since the vmware-vmx binary was compiled without stack-smashing protection, exploitation is easier for an attacker.
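The difference between the two loops can be illustrated with a toy model in Python. This is a simplified sketch, not a reproduction of the real code: the "stack" is a 64-byte bytearray, the loop bound is stored directly inside the region being overwritten (the real Linux code reaches it through a pointer), and the buffer size, bound offset, and fill bytes are hypothetical values chosen to keep the example small.

```python
import struct

BOUND = 0xFF0000  # value the type confusion leaves in the count field

def make_stack(size: int = 64, bound_off: int = 30) -> bytearray:
    # Toy stack frame with the 32-bit loop bound stored above the buffer.
    mem = bytearray(size)
    struct.pack_into("<I", mem, bound_off, BOUND)
    return mem

def copy_loop(mem: bytearray, bound_off: int, fill: int, reread_bound: bool):
    """Write 3 bytes per iteration; return (outcome, iterations)."""
    dst = 0
    count = 0
    bound = struct.unpack_from("<I", mem, bound_off)[0]
    while True:
        if dst + 3 > len(mem):
            return ("fault", count)      # stands in for the SIGSEGV
        mem[dst:dst + 3] = bytes([fill]) * 3
        dst += 3
        count += 1
        if reread_bound:                 # Linux-style: bound reloaded each pass
            bound = struct.unpack_from("<I", mem, bound_off)[0]
        if not bound > count:            # mirrors `cmp ecx, esi` / `ja`
            return ("exit", count)

print(copy_loop(make_stack(), 30, 0x00, reread_bound=True))   # ('exit', 11)
print(copy_loop(make_stack(), 30, 0x00, reread_bound=False))  # ('fault', 21)
print(copy_loop(make_stack(), 30, 0xFF, reread_bound=True))   # ('fault', 21)
```

In this model the Linux-style loop stops early only when the copied bytes happen to lower the bound, which is why the outcome depends on the data being copied, while the Windows-style loop always runs to the fault.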
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fda8992a700 (LWP 50584)]
[----------------------------------registers-----------------------------------]
RAX: 0x7fda8992a59f --> 0x0
RBX: 0x7fda54062470 --> 0x0
RCX: 0x49580200
RDX: 0x7fda8992afe4 --> 0x0
RSI: 0x6b5
RDI: 0x7fda89929510 --> 0x7f00409a92da7f00
RBP: 0x40 ('@')
RSP: 0x7fda89929110 --> 0xffffffff00000000
RIP: 0x7fda9035bdc0 (movzx ecx,BYTE PTR [rdx+0x1e])
R8 : 0x3
R9 : 0x0
R10: 0xf8a9422d6e103298
R11: 0x13690
R12: 0x7fda89929970 --> 0x0
R13: 0x7fda543e4090 --> 0x700000001
R14: 0xba
R15: 0x0
EFLAGS: 0x13216 (carry PARITY ADJUST zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x7fda9035bdb1: lea rax,[rsp+0x70]
0x7fda9035bdb6: xor esi,esi
0x7fda9035bdb8: nop DWORD PTR [rax+rax*1+0x0]
=> 0x7fda9035bdc0: movzx ecx,BYTE PTR [rdx+0x1e]
0x7fda9035bdc4: add esi,0x1
0x7fda9035bdc7: mov BYTE PTR [rax],cl
0x7fda9035bdc9: movzx ecx,BYTE PTR [rdx+0x1d]
0x7fda9035bdcd: mov BYTE PTR [rax+0x1],cl
[------------------------------------stack-------------------------------------]
0000| 0x7fda89929110 --> 0xffffffff00000000
0008| 0x7fda89929118 --> 0x0
0016| 0x7fda89929120 --> 0x0
0024| 0x7fda89929128 --> 0xffffffff0000000e
0032| 0x7fda89929130 --> 0x0
0040| 0x7fda89929138 --> 0x0
0048| 0x7fda89929140 --> 0xffffffff0000000f
0056| 0x7fda89929148 --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00007fda9035bdc0 in ?? ()
An important factor in this vulnerability is that it requires successful VNC authentication beforehand. By default, however, VMware does not require a username or password for VNC sessions. Turning on VNC authentication should mitigate this, turning it from a no-auth bug into a single-auth one.
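As a small defensive aid, the Python sketch below flags a .vmx configuration in which VNC is enabled with no password configured. It assumes the option names RemoteDisplay.vnc.enabled, RemoteDisplay.vnc.password, and RemoteDisplay.vnc.key, which are not taken from this report and should be verified against the documentation for the product version in use; the function names are hypothetical.

```python
def parse_vmx(text: str) -> dict:
    # Parse `key = "value"` lines; keys are lowercased for lookup.
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings

def vnc_exposed_without_password(text: str) -> bool:
    # Option names below are assumptions; check them for your VMware version.
    s = parse_vmx(text)
    enabled = s.get("remotedisplay.vnc.enabled", "false").lower() == "true"
    has_secret = bool(s.get("remotedisplay.vnc.password")
                      or s.get("remotedisplay.vnc.key"))
    return enabled and not has_secret

sample = 'RemoteDisplay.vnc.enabled = "TRUE"\nRemoteDisplay.vnc.port = "5900"\n'
print(vnc_exposed_without_password(sample))  # True
```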
2017-07-12 - Vendor Disclosure
2017-12-19 - Public Release
Discovered by Lilith Wyatt and another member of Cisco Talos.