Talos Vulnerability Report

TALOS-2024-1921

The Biosig Project libbiosig sopen_FAMOS_read integer overflow to out-of-bounds write vulnerability

February 20, 2024

CVE Number

CVE-2024-21812

SUMMARY

An integer overflow vulnerability exists in the sopen_FAMOS_read functionality of The Biosig Project libbiosig 2.5.0 and Master Branch (ab0ee111). A specially crafted .famos file can lead to an out-of-bounds write which in turn can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

The Biosig Project libbiosig 2.5.0
The Biosig Project libbiosig Master Branch (ab0ee111)

PRODUCT URLS

libbiosig - https://biosig.sourceforge.net/index.html

CVSSv3 SCORE

9.8 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-190 - Integer Overflow or Wraparound

DETAILS

Libbiosig is an open source library designed to process various types of medical signal data (EKG, EEG, etc.) in a wide variety of file formats. Libbiosig is also at the core of the biosig APIs in Octave and Matlab, sigviewer, and other scientific software used for interpreting biomedical signal data.

When reading in or writing out data of any filetype, libbiosig will always end up hitting the sopen_extended function:

HDRTYPE* sopen_extended(const char* FileName, const char* MODE, HDRTYPE* hdr, biosig_options_type *biosig_options) {
/*
    MODE="r"
        reads file and returns HDR
    MODE="w"
        writes HDR into file
 */

This is where the vast majority of the parsing logic lives for most file types, although some file types instead call more specific sopen_* functions. Regardless, unless specifically stated otherwise, it’s safe to assume we’re somewhere in this extremely large function. The general flow of sopen_extended is as one might expect: initialize generic structures, figure out what file type we’re dealing with, parse the file, and finally populate the generic structures that can be utilized by whatever is calling sopen_extended. To determine the file type, sopen_extended calls getfiletype, which goes through a list of magic-byte comparisons. Alternatively, we could force a particular file type, but this is generally more useful when writing data to a file.

Moving on from the general overview, we can now be more specific. The current vulnerability deals with the imc FAMOS file format, a generic format for quick data analysis. To figure out if we’re dealing with a .famos file, getfiletype runs the following magic-byte check:

else if (!memcmp(Header1,"|CF,",4))
        hdr->TYPE = FAMOS;
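
As a rough illustration only, the same test can be written as a standalone helper. The name looks_like_famos and the explicit length check are our own assumptions and are not part of libbiosig:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a libbiosig function: returns 1 when the first
 * four bytes of buf are the FAMOS magic "|CF,", and 0 otherwise. */
int looks_like_famos(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    /* Refuse buffers shorter than the magic so memcmp never reads past n. */
    if (n < 4)
        return 0;
    return memcmp(buf, "|CF,", 4) == 0;
}
```

A buffer beginning with |CF,2,1,1; passes this check, while one beginning with |CK,1 does not.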

Simple enough, and assuming we find a .famos file, sopen_extended hits the following branch:

#ifdef WITH_FAMOS
        else if (hdr->TYPE==FAMOS) {
            hdr->HeadLen=count;
            sopen_FAMOS_read(hdr);
    }
#endif

It is worth noting that while FAMOS support can be disabled via the WITH_FAMOS compiler flag, it is enabled by default. Continuing on into sopen_FAMOS_read:

EXTERN_C void sopen_FAMOS_read(HDRTYPE* hdr) {
#define Header1 ((char*)hdr->AS.Header) 

        size_t count = hdr->HeadLen;

        char *t, *t2;
        const char EOL[] = "|;\xA\xD";
        size_t pos, l1, len;
        pos  = strspn(Header1, EOL);   // [1]
        uint16_t gdftyp, CHAN=0;
        char OnOff=1;
        double Fs = NAN;
        uint32_t NoChanCurrentGroup = 0;    // number of (undefined) channels of current group 
        int level = 0;  // to check consistency of file

        char flag_AbstandFile = 0;      // interleaved format ??? used for experimental code 

        fprintf(stdout,"SOPEN(FAMOS): support is experimental. Only time series with equidistant sampling and single sampling rate are supported.\n"); // [2]

        while (pos < count-20) { // [3]
            t       = Header1+pos;  // start of line // [4]

            l1      = strcspn(t+5, ","); // [5]
            t[l1+5] = 0;
            len     = atol(t+5);  // [6]
            pos    += 6+l1;       
            t2      = Header1+pos;  // start of line // [7]
            if (count < max(pos,hdr->HeadLen)+256) { // HeadLen can be updated... //[8]
                    size_t bufsiz = 4095;
                    hdr->AS.Header = (uint8_t*)realloc(hdr->AS.Header, count+bufsiz+1); // [9]  
                    count += ifread(hdr->AS.Header+count,1,bufsiz,hdr);
            }
            pos    += len+1;
            
        // [...]
        pos += strcspn(Header1+pos,EOL);
        pos += strspn(Header1+pos,EOL);
    }
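
The tag and length handling at the top of this loop can be sketched non-destructively as follows. This is a simplified illustration rather than the library's code: struct famos_key and parse_key are hypothetical names, and unlike the original it does not write a null byte into the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view of one FAMOS key such as "CG,1,9,10,11,12;". */
struct famos_key {
    char tag[5];        /* four-byte tag, e.g. "CG,1", NUL-terminated */
    long len;           /* value of the length field */
    const char *data;   /* first byte after the length field's comma */
};

/* Returns 0 on success, -1 if the line is too short or the length field
 * is not followed by a comma. */
int parse_key(const char *t, struct famos_key *out)
{
    assert(t != NULL && out != NULL);
    if (strlen(t) < 6)
        return -1;
    size_t l1 = strcspn(t + 5, ",");   /* characters in the length field */
    if (t[5 + l1] != ',')
        return -1;
    memcpy(out->tag, t, 4);
    out->tag[4] = '\0';
    out->len = atol(t + 5);            /* atol stops at the comma */
    out->data = t + 6 + l1;            /* mirrors pos += 6+l1 */
    return 0;
}
```

For the input CG,1,9,10,11,12; this yields the tag CG,1, a length of 9, and a data pointer at 10,11,12;.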

To start, our code skips over any leading newlines or delimiters at [1], and then prints a notice that FAMOS support is experimental [2]. Heeding this warning, our input file proceeds with caution as we enter the main logic loop at [3]. The variable t [4] denotes the start of the current line, l1 [5] is the number of characters in the line’s length field (the +5 skips the four-byte tag denoting the data type and the comma after it), len [6] ends up being the length of the data of the current line, and finally t2 [7] ends up pointing to the actual start of our data for the current line. If our count variable is smaller than the position from which we’re about to start reading plus 0x100 [8], we reallocate our memory buffer and read more of the file into it [9]. To see how each opcode or tag is parsed within our file, we examine a sample opcode in the logic below:

        if (!strncmp(t,"CF,2",4) && (level==0)) {                     
            level = 1;
        }
        else if (!strncmp(t,"CK,1",4) && (level==1)) {                
            level = 2;
        }
        else if (!strncmp(t,"NO,1",4) && ((level==1) || (level==2))) { 
                  // [...]
        }
        else if (!strncmp(t,"CT,1",4) && (level>1)) {
        }

        else if (!strncmp(t,"Cb,1",4))  // Buffer Beschreibung         // [10]
        {
            // AnzahlBufferInKey
            int p = strcspn(t2,",");
            t2[p] = 0;
            if (atoi(t2) != 1) {
                biosigERROR(hdr, B4C_FORMAT_UNSUPPORTED, "FAMOS: more than one buffer not supported");
            }
            // BytesInUserInfo
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // Buffer Referenz
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // IndexSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // OffsetBufferInSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            hdr->CHANNEL[CHAN].bi = atol(t2); // [11]

Which parsing code we hit is determined first by the first four bytes of our line, and then by a check of an internal state machine variable (level). While not all opcodes require a particular state (e.g. [10]), as seen above, most do. Continuing on in the branch at [10], assuming any given line in our input file starts with “Cb,1”, we search the rest of that line for five commas and turn each of them into a null byte so that the fields can be parsed individually. The first example of this parsing is at [11], where we populate the bi member of hdr->CHANNEL[CHAN] with the value returned by atol. Importantly, this particular line is but one example of a more general idea: we can write into hdr->CHANNEL[CHAN] at various spots, and our task with FAMOS files is generally to find interesting ways to turn these into out-of-bounds heap writes. For this particular writeup, there’s a spot inside sopen_FAMOS_read where we can realloc the hdr->CHANNEL buffer, which will become very interesting very quickly:

        else if (!strncmp(t,"CG,1",4)) // Definition eines Datenfeldes     // [12]
        {
            int p;
            // Anzahl Komponenten               
            p = strcspn(t2,",");
            t2[p] = 0;
            NoChanCurrentGroup = atol(t2);  // additional channels // [13]
            hdr->NS += NoChanCurrentGroup; 
            // Feldtyp
            p = strcspn(t2,",");
            t2[p] = 0;
            OnOff = 1;
            if (atoi(t2) != 1) {
        //          OnOff = 0; 
        //          biosigERROR(hdr, B4C_DATATYPE_UNSUPPORTED, "FAMOS: data is not real and aquidistant sampled");
            }
            // Dimension 
            p = strcspn(t2,",");
            t2[p] = 0;

            hdr->CHANNEL = (CHANNEL_TYPE*)realloc(hdr->CHANNEL, hdr->NS*sizeof(CHANNEL_TYPE));   // [14]
            level = 3;   // [15]
        }

Assuming we have a line like CG,1,9,10,11,12;;; [12], the nine is treated as the size of the data and the following 10 is put into the uint32_t NoChanCurrentGroup variable [13]. hdr->NS is then advanced by this same amount, and at [14] we realloc our hdr->CHANNEL buffer to hdr->NS * sizeof(CHANNEL_TYPE) bytes. A quick relevant tangent to the above code snippet: what types do all these variables happen to be? hdr->NS is a uint16_t, NoChanCurrentGroup is a uint32_t, and sizeof(CHANNEL_TYPE) is 0x158, all of which will become relevant further on. Also useful to note, our state machine has now been set to level = 3 at [15]. To proceed, let us now assume that our input file’s next line is CG,1,10,65528,0,0;;;, and we hit the exact same code block.
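
The arithmetic for these two CG lines can be replayed in isolation. The sketch below is ours and only models the types involved: channel_bytes_after_group is a hypothetical helper, and CHANNEL_SIZE hard-codes the 0x158 value quoted above instead of using the real CHANNEL_TYPE definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHANNEL_SIZE 0x158u   /* sizeof(CHANNEL_TYPE) as quoted in this report */

/* Hypothetical model of "hdr->NS += NoChanCurrentGroup" followed by the
 * size computation passed to realloc. Returns the requested byte count. */
size_t channel_bytes_after_group(uint16_t *ns, uint32_t group)
{
    assert(ns != NULL);
    /* The sum is formed in 32 bits and truncated to 16 bits on assignment,
     * as happens with a compound assignment to a uint16_t. */
    *ns = (uint16_t)(*ns + group);
    return (size_t)*ns * CHANNEL_SIZE;
}
```

Assuming the channel count starts at zero, a group of 10 requests 0xd70 bytes and leaves the count at 10. A second group of 65528 wraps the count to 2 and requests only 0x2b0 bytes (688 in decimal, the region size that appears in the AddressSanitizer output under Crash Information).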

Since there’s no level restriction on the conditional at [12], processing another CG line is completely valid. This time we pass in 0xFFF8 as our NoChanCurrentGroup [13], which causes hdr->NS to overflow, since hdr->NS is a uint16_t and 0xA + 0xFFF8 is 0x10002. Thus, after the addition, hdr->NS is 0x2, and our reallocation at [14] ends up shrinking the buffer to size 0x2b0 (2 * 0x158) from 0xd70 (0xa * 0x158). While the buffer size still corresponds to the hdr->NS field, what’s key here is that NoChanCurrentGroup is still 0xFFF8, and there’s now a meaningful difference between these two variables. Continuing on, let’s now assume that our input file contains the line CC,1,1,1:

        else if (!strncmp(t,"CC,1",4) && (level>=3)) {  // [16]
            if (NoChanCurrentGroup<1) {                 
                biosigERROR(hdr, B4C_UNSPECIFIC_ERROR, "FAMOS: too many CC definitions in group");
            }
            CHAN = hdr->NS - NoChanCurrentGroup--; // [16]

            if (CHAN==0)
                hdr->SampleRate = Fs;
            else if (OnOff && (fabs(hdr->SampleRate - Fs)>1e-9*Fs)) {
                fprintf(stdout,"ERR2: %i %f %f\n",CHAN,hdr->SampleRate, Fs);
            //                  biosigERROR(hdr, B4C_DATATYPE_UNSUPPORTED, "FAMOS: multiple sampling rates not supported");
            }
            if (VERBOSE_LEVEL>7)
                fprintf(stdout,"CC: %i#%i Fs=%f,%i\n",OnOff,CHAN,Fs,(int)len);

            level = 4;  // [17]
        }

Since our state has been set to level three at [15], we hit the branch at the first [16]. The key point here for our current vulnerability is the assignment to CHAN, also marked [16]: since we overflowed hdr->NS to create a gap between it and NoChanCurrentGroup, CHAN is set to 0x2 - 0xFFF8, which for a uint16_t is 0xa. Our state machine level is also set to 0x4 at [17], but this is of lesser importance. With everything set in place, we have a few places to actually trigger this vulnerability. To keep it simple, let’s examine the Cb opcode once again, and assume that our input file now contains the line Cb,1,18,1,1,1,1,333,1,1,1;:

        else if (!strncmp(t,"Cb,1",4))  // Buffer Beschreibung
        {
            // AnzahlBufferInKey
            int p = strcspn(t2,",");
            t2[p] = 0;
            if (atoi(t2) != 1) {
                biosigERROR(hdr, B4C_FORMAT_UNSUPPORTED, "FAMOS: more than one buffer not supported");
            }
            // BytesInUserInfo
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // Buffer Referenz
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // IndexSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // OffsetBufferInSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            hdr->CHANNEL[CHAN].bi = atol(t2); // [18]
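
To make the field walk concrete, the sketch below extracts a numbered field from the data portion of such a line without modifying it. It is an illustration under our own assumptions: nth_field is a hypothetical helper, whereas the real code overwrites each comma with a null byte as shown above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores the n-th (zero-based) comma-separated field
 * of s, converted with atol, in *out. Returns 0 on success and -1 when s
 * has fewer than n+1 fields. */
int nth_field(const char *s, unsigned n, long *out)
{
    assert(s != NULL && out != NULL);
    while (n-- > 0) {
        s += strcspn(s, ",");   /* advance to the next comma or the end */
        if (*s != ',')
            return -1;
        s++;                    /* step past the comma */
    }
    *out = atol(s);             /* atol stops at the first non-digit */
    return 0;
}
```

For the data portion 1,1,1,1,333,1,1,1; of the sample Cb line, field index 4 (OffsetBufferInSamplesKey) is 333.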

Skipping all the boilerplate parsing code that was already covered at [10], we hit the atol(t2) at [18] with an input-controlled value of 333. This time, however, we know that hdr->CHANNEL only contains two channels while the CHAN index is 0xa, resulting in an out-of-bounds write on the heap. This process can be repeated as many times as needed, and the Cb opcode is not the only opcode that writes to hdr->CHANNEL[CHAN]; taken together, these writes could plausibly be leveraged for code execution.
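
Two checks would be enough to rule out this particular sequence, and they can be sketched as below. These are hypothetical helpers written for this discussion, not the vendor's patch: ns_add_overflows rejects a CG line whose channel count would not fit in the uint16_t hdr->NS, and chan_in_bounds rejects any CHAN that is not below the current channel count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical check for the CG branch: nonzero when adding `group`
 * channels to `ns` would exceed what a uint16_t can hold. */
int ns_add_overflows(uint16_t ns, uint32_t group)
{
    return group > (uint32_t)(UINT16_MAX - ns);
}

/* Hypothetical check before any hdr->CHANNEL[CHAN] access: nonzero when
 * chan indexes one of the ns allocated channels. */
int chan_in_bounds(uint16_t chan, uint16_t ns)
{
    return chan < ns;
}
```

With the values from this walkthrough, ns_add_overflows(10, 65528) is true and chan_in_bounds(0xa, 2) is false, so either check would stop processing before the write at [18].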

Crash Information

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3925793809
INFO: Loaded 2 modules   (18494 inline 8-bit counters): 18457 [0x7fb9cdaba570, 0x7fb9cdabed89), 37 [0x5565630421c8, 0x5565630421ed), 
INFO: Loaded 2 PC tables (18494 PCs): 18457 [0x7fb9cdabed90,0x7fb9cdb06f20), 37 [0x5565630421f0,0x556563042440), 
./biosig_fuzzer.bin: Running 1 inputs 1 time(s) each.
Running: ../fuzzing/triage/FAMOS_bugs/write_4_141/int_overflow_write4.famos
=================================================================
==40163==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x617000000f30 at pc 0x7fb9cd9c4b59 bp 0x7fff95d269d0 sp 0x7fff95d269c8
WRITE of size 4 at 0x617000000f30 thread T0
    #0 0x7fb9cd9c4b58 in sopen_FAMOS_read /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:137:27
    #1 0x7fb9cd9465cb in sopen_extended /biosig/stable_release/biosig-2.5.0/biosig4c++/biosig.c:8481:3
    #2 0x556562fff35f in LLVMFuzzerTestOneInput /biosig/stable_release/./fuzz_biosig.cpp:84:20
    #3 0x556562f254d3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x3f4d3) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #4 0x556562f0f24f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x2924f) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #5 0x556562f14fa6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/biosig/stable_release/biosig_fuzzer.bin+0x2efa6) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #6 0x556562f3edc2 in main (/biosig/stable_release/biosig_fuzzer.bin+0x58dc2) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #7 0x7fb9cd429d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #8 0x7fb9cd429e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #9 0x556562f09b14 in _start (/biosig/stable_release/biosig_fuzzer.bin+0x23b14) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)

0x617000000f30 is located 3072 bytes to the right of 688-byte region [0x617000000080,0x617000000330)
allocated by thread T0 here:
    #0 0x556562fc1f76 in __interceptor_realloc (/biosig/stable_release/biosig_fuzzer.bin+0xdbf76) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #1 0x7fb9cd9c2d7a in sopen_FAMOS_read /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:224:35
    #2 0x7fb9cd9465cb in sopen_extended /biosig/stable_release/biosig-2.5.0/biosig4c++/biosig.c:8481:3
    #3 0x556562fff35f in LLVMFuzzerTestOneInput /biosig/stable_release/./fuzz_biosig.cpp:84:20
    #4 0x556562f254d3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x3f4d3) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #5 0x556562f0f24f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x2924f) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #6 0x556562f14fa6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/biosig/stable_release/biosig_fuzzer.bin+0x2efa6) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #7 0x556562f3edc2 in main (/biosig/stable_release/biosig_fuzzer.bin+0x58dc2) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #8 0x7fb9cd429d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

SUMMARY: AddressSanitizer: heap-buffer-overflow /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:137:27 in sopen_FAMOS_read
Shadow bytes around the buggy address:
  0x0c2e7fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff81d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2e7fff81e0: fa fa fa fa fa fa[fa]fa fa fa fa fa fa fa fa fa
  0x0c2e7fff81f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff8200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff8210: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff8220: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2e7fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==40163==ABORTING

VENDOR RESPONSE

The vendor provided a new release at: https://biosig.sourceforge.net/download.html

TIMELINE

2024-02-05 - Initial Vendor Contact
2024-02-05 - Vendor Disclosure
2024-02-19 - Vendor Patch Release
2024-02-20 - Public Release

Credit

Discovered by Lilith >_> of Cisco Talos.