Talos Vulnerability Report

TALOS-2024-1925

The Biosig Project libbiosig sopen_FAMOS_read NULL calloc out-of-bounds write vulnerability

February 20, 2024
CVE Number

CVE-2024-23606

SUMMARY

An out-of-bounds write vulnerability exists in the sopen_FAMOS_read functionality of The Biosig Project libbiosig 2.5.0 and Master Branch (ab0ee111). A specially crafted .famos file can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

The Biosig Project libbiosig 2.5.0
The Biosig Project libbiosig Master Branch (ab0ee111)

PRODUCT URLS

libbiosig - https://biosig.sourceforge.net/index.html

CVSSv3 SCORE

9.8 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-131 - Incorrect Calculation of Buffer Size

DETAILS

Libbiosig is an open source library designed to process various types of medical signal data (EKG, EEG, etc) within a vast variety of different file formats. Libbiosig is also at the core of biosig APIs in Octave and Matlab, sigviewer, and other scientific software utilized for interpreting biomedical signal data.

When reading in or writing out data of any filetype, libbiosig will always end up hitting the sopen_extended function:

HDRTYPE* sopen_extended(const char* FileName, const char* MODE, HDRTYPE* hdr, biosig_options_type *biosig_options) {
/*
    MODE="r"
        reads file and returns HDR
    MODE="w"
        writes HDR into file
 */

Most of the parsing logic for most file types lives here, although some formats are handed off to more specific sopen_* functions. Unless stated otherwise, it’s safe to assume we’re somewhere in this extremely large function. The general flow of sopen_extended is as one might expect: initialize generic structures, figure out what file type we’re dealing with, parse the file, and finally populate the generic structures that can be used by whatever is calling sopen_extended. To determine the file type, sopen_extended calls getfiletype, which goes through a list of magic-byte comparisons. Alternatively, we could force a particular file type, but this is generally more useful when writing data to a file.
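
The magic-byte dispatch described above can be sketched in isolation. This is a minimal illustration rather than libbiosig's getfiletype: the enum, the table, and the detect_type name are assumptions made for the example, and only the "|CF," signature comes from the library's FAMOS check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags for this sketch; libbiosig's real list is far longer. */
enum file_type { TYPE_UNKNOWN = 0, TYPE_FAMOS, TYPE_DEMO };

struct magic_entry {
    const char *magic;     /* bytes expected at offset 0 of the file */
    size_t len;            /* number of bytes to compare */
    enum file_type type;   /* type reported on a match */
};

static const struct magic_entry MAGIC_TABLE[] = {
    { "|CF,", 4, TYPE_FAMOS },  /* signature used by libbiosig's FAMOS check */
    { "DEMO", 4, TYPE_DEMO },   /* made-up entry, for illustration only */
};

/* Return the type of the first table entry whose magic prefixes buf. */
static enum file_type detect_type(const unsigned char *buf, size_t buflen)
{
    size_t i;
    for (i = 0; i < sizeof(MAGIC_TABLE) / sizeof(MAGIC_TABLE[0]); i++) {
        const struct magic_entry *e = &MAGIC_TABLE[i];
        /* Never compare past the end of the bytes actually read. */
        if (buflen >= e->len && memcmp(buf, e->magic, e->len) == 0)
            return e->type;
    }
    return TYPE_UNKNOWN;
}
```

Calling detect_type on the first bytes of a file returns TYPE_FAMOS for any input that begins with "|CF," and TYPE_UNKNOWN for inputs that match no entry or are shorter than the signature.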

Moving on from the generic overview, we can get more specific. Our current vulnerability involves the imc FAMOS file format, a generic format for quick data analysis. To figure out if we’re dealing with a .famos file, getfiletype runs the following magic-byte check:

else if (!memcmp(Header1,"|CF,",4))
        hdr->TYPE = FAMOS;

Simple enough, and assuming we find a .famos file, sopen_extended hits the following branch:

#ifdef WITH_FAMOS
        else if (hdr->TYPE==FAMOS) {
            hdr->HeadLen=count;
            sopen_FAMOS_read(hdr);
        }
#endif

It is worth noting that while FAMOS support can be disabled via the WITH_FAMOS compile-time flag, it is enabled by default. Continuing on into sopen_FAMOS_read:

EXTERN_C void sopen_FAMOS_read(HDRTYPE* hdr) {
#define Header1 ((char*)hdr->AS.Header) 

        size_t count = hdr->HeadLen;

        char *t, *t2;
        const char EOL[] = "|;\xA\xD";
        size_t pos, l1, len;
        pos  = strspn(Header1, EOL);   // [1]
        uint16_t gdftyp, CHAN=0;
        char OnOff=1;
        double Fs = NAN;
        uint32_t NoChanCurrentGroup = 0;    // number of (undefined) channels of current group 
        int level = 0;  // to check consistency of file

        char flag_AbstandFile = 0;      // interleaved format ??? used for experimental code 

        fprintf(stdout,"SOPEN(FAMOS): support is experimental. Only time series with equidistant sampling and single sampling rate are supported.\n"); // [2]

        while (pos < count-20) { // [3]
            t       = Header1+pos;  // start of line // [4]

            l1      = strcspn(t+5, ","); // [5]
            t[l1+5] = 0;
            len     = atol(t+5);  // [6]
            pos    += 6+l1;       
            t2      = Header1+pos;  // start of line // [7]
            if (count < max(pos,hdr->HeadLen)+256) { // HeadLen can be updated... //[8]
                    size_t bufsiz = 4095;
                    hdr->AS.Header = (uint8_t*)realloc(hdr->AS.Header, count+bufsiz+1); // [9]  
                    count += ifread(hdr->AS.Header+count,1,bufsiz,hdr);
            }
            pos    += len+1;
            
        // [...]
        pos += strcspn(Header1+pos,EOL);
        pos += strspn(Header1+pos,EOL);
    }

To start, our code skips over any newlines or delimiters at [1], and then prints a warning that FAMOS support is experimental [2]. Heeding this warning, our input file proceeds with caution as we enter the main logic loop at [3]. The variable t [4] denotes the start of the line, and l1 [5] is the width of the decimal length field that follows the tag (the +5 skips the 4-byte tag denoting the data type and the comma after it). len [6] ends up being the length of the data of the current line, and finally t2 [7] points to the actual start of our data for the current line. If our count variable is smaller than the position where we’re about to start reading plus 256 (0x100) [8], we reallocate our memory buffer and read more of the file into it [9]. To see how each opcode or tag is parsed within our file, we examine a sample opcode in the logic below:

        if (!strncmp(t,"CF,2",4) && (level==0)) {                     
            level = 1;
        }
        else if (!strncmp(t,"CK,1",4) && (level==1)) {                
            level = 2;
        }
        else if (!strncmp(t,"NO,1",4) && ((level==1) || (level==2))) { 
                  // [...]
        }
        else if (!strncmp(t,"CT,1",4) && (level>1)) {
        }

        else if (!strncmp(t,"Cb,1",4))  // Buffer Beschreibung         // [10]
        {
            // AnzahlBufferInKey
            int p = strcspn(t2,",");
            t2[p] = 0;
            if (atoi(t2) != 1) {
                biosigERROR(hdr, B4C_FORMAT_UNSUPPORTED, "FAMOS: more than one buffer not supported");
            }
            // BytesInUserInfo
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // Buffer Referenz
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // IndexSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // OffsetBufferInSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            hdr->CHANNEL[CHAN].bi = atol(t2); // [11]

Which parsing code we hit is determined first by the first four bytes of our line, and then by an internal state machine variable (level). While most opcodes require a particular state, not all do (e.g. [10]). Continuing on in the branch at [10], assuming any given line in our input file starts with “Cb,1”, we search the rest of that line for five commas and turn each of them into a null byte so that the individual fields can be parsed. The first example of this parsing is at [11], where we populate the bi member of hdr->CHANNEL[CHAN] with the result of atol. This conveniently also happens to be the first instance of our current vulnerability, as it’s entirely possible for hdr->CHANNEL to have been allocated by a calloc(0x0,sizeof(CHANNEL_TYPE)), which makes this write out of bounds. Before getting too far ahead of ourselves, let’s backtrack to how the hdr->CHANNEL buffer is generated. First, for FAMOS files, there’s a spot inside sopen_FAMOS_read where we can hit realloc for the hdr->CHANNEL buffer:

        else if (!strncmp(t,"CG,1",4)) // Definition eines Datenfeldes     // [12]
        {
            int p;
            // Anzahl Komponenten               
            p = strcspn(t2,",");
            t2[p] = 0;
            NoChanCurrentGroup = atol(t2);  // additional channels   // [13]
            hdr->NS += NoChanCurrentGroup; 
            // Feldtyp
            p = strcspn(t2,",");
            t2[p] = 0;
            OnOff = 1;
            if (atoi(t2) != 1) {
        //          OnOff = 0; 
        //          biosigERROR(hdr, B4C_DATATYPE_UNSUPPORTED, "FAMOS: data is not real and aquidistant sampled");
            }
            // Dimension 
            p = strcspn(t2,",");
            t2[p] = 0;

            hdr->CHANNEL = (CHANNEL_TYPE*)realloc(hdr->CHANNEL, hdr->NS*sizeof(CHANNEL_TYPE));   // [14]
            level = 3;
        }
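
The record framing that the main loop relies on (a four-byte tag, a comma, a decimal length field, another comma, then the payload) can be sketched on its own. The split_record helper below is hypothetical and not part of libbiosig; it mirrors the steps marked [5] to [7] on a single NUL-terminated record and adds basic shape checks that the original loop does not perform.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libbiosig.
 * Splits a record of the form "KK,v,len,payload" in place.
 * On success stores the declared length and a pointer to the payload
 * and returns 0; returns -1 if the record does not have that shape.
 */
static int split_record(char *line, long *len, char **payload)
{
    size_t l1;

    if (strlen(line) < 6 || line[2] != ',' || line[4] != ',')
        return -1;                 /* too short, or tag is not "KK,v," */

    l1 = strcspn(line + 5, ",");   /* width of the decimal length field */
    if (line[5 + l1] != ',')
        return -1;                 /* no comma terminates the length field */

    line[5 + l1] = '\0';           /* cut the length field out of the line */
    *len = atol(line + 5);         /* declared payload length */
    *payload = line + 6 + l1;      /* first byte after the length field */
    return 0;
}
```

For a record such as CG,1,9,10,11,12; this yields a declared length of 9 and a payload beginning with 10,11,12, so a subsequent atol on the payload returns 10.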

Thus, assuming we have a line something like CG,1,9,10,11,12;;; the 9 is treated as the size of the data and the following 10 is put into the uint32_t NoChanCurrentGroup variable [13]. The hdr->NS variable is incremented by this amount, which then determines the size of our hdr->CHANNEL buffer at [14]. Curiously however, as mentioned before, there is no state machine requirement for writing to the hdr->CHANNEL[CHAN].bi field at [11], and so there’s not actually any requirement for us to hit the realloc at [14] before writing to it. As such we must go further back to how this hdr->CHANNEL buffer is originally allocated, and for that we have to go back to the constructHDR function:

HDRTYPE* constructHDR(const unsigned NS, const unsigned N_EVENT)
{
/*
    HDR is initialized, memory is allocated for
    NS channels and N_EVENT number of events.

    The purpose is to define all parameters at an initial step.
    No parameters must remain undefined.
 */
HDRTYPE* hdr = (HDRTYPE*)malloc(sizeof(HDRTYPE));
// [...]
     
hdr->FileName = NULL;
hdr->FILE.OPEN = 0;
hdr->FILE.FID = 0;
hdr->FILE.POS = 0;
hdr->FILE.Des = 0;
hdr->FILE.COMPRESSION = 0;
hdr->FILE.size = 0;
#ifdef ZLIB_H
    hdr->FILE.gzFID = 0;
#endif
    // [...] 

hdr->NRec = 0;
hdr->SPR  = 0;
hdr->NS = NS;            // [15]
hdr->SampleRate = 4321.5;
hdr->Patient.Id[0]=0;

//[...]
// define variable header
hdr->CHANNEL = (CHANNEL_TYPE*)calloc(hdr->NS, sizeof(CHANNEL_TYPE));  // [16] 
BitsPerBlock = 0;
for (k=0;k<hdr->NS;k++) {
    size_t nbits;
    CHANNEL_TYPE *hc = hdr->CHANNEL+k;
        hc->Label[0]  = 0;
        hc->LeadIdCode= 0;
     // [...]

As seen at [15] and [16], the size of our hdr->CHANNEL buffer is based on the const unsigned NS argument passed into constructHDR. To proceed we must now look at where and how constructHDR is called in the sopen_extended call chain:

HDRTYPE* sopen_extended(const char* FileName, const char* MODE, HDRTYPE* hdr, biosig_options_type *biosig_options) {
/*
    MODE="r"
        reads file and returns HDR
    MODE="w"
        writes HDR into file
 */

        size_t      count;
#ifndef  ONLYGDF
    char*       ptr_str;
    struct tm   tm_time;

    const char  GENDER[] = "XMFX";
    const uint16_t  CFWB_GDFTYP[] = {17,16,3};
    const float CNT_SETTINGS_NOTCH[] = {0.0, 50.0, 60.0};
    const float CNT_SETTINGS_LOWPASS[] = {30, 40, 50, 70, 100, 200, 500, 1000, 1500, 2000, 2500, 3000};
    const float CNT_SETTINGS_HIGHPASS[] = {NAN, 0, .05, .1, .15, .3, 1, 5, 10, 30, 100, 150, 300};
    uint16_t    BCI2000_StatusVectorLength=0;   // specific for BCI2000 format
#endif //ONLYGDF

    biosig_options_type default_options;
    default_options.free_text_event_limiter="\0";

    if (biosig_options==NULL) biosig_options = &default_options;

    if (VERBOSE_LEVEL>7) fprintf(stdout,"%s(%s,%s) (line %d): --delimiter=<%s> %p\n",__func__, FileName, MODE, __LINE__, biosig_options->free_text_event_limiter, biosig_options);

    if (FileName == NULL) {
        biosigERROR(hdr, B4C_CANNOT_OPEN_FILE, "no filename specified");
        return (hdr);
    }
    if (hdr==NULL)
        hdr = constructHDR(0,0); // initializes fields that may stay undefined during SOPEN   // [17]

Even with this call at [17], it is still somewhat ambiguous whether it is realistic for hdr->NS to be 0x0 when parsing begins. Two questions decide this: how often is a null hdr passed into sopen_extended, and how often is sopen_extended given a hdr that was created with constructHDR(0x0,X)? Fortunately we can take a really quick look at both questions:

  File                Function                    Line
0 biosig.c            sopen_extended              3724 hdr = constructHDR(0,0);
4 biosig2.c           biosig_unserialize          1308 HDRTYPE *hdr = constructHDR(0,0);
5 biosig_client.c     main                          76 HDRTYPE *hdr=constructHDR(0,0);
6 biosig_server.c     DoJob                        225 hdr = constructHDR(0,0);
7 biosig_server.c     DoJob                        603 HDRTYPE *hdr = constructHDR(0,0);
9 mexSLOAD.cpp        mexFunction                  336 hdr = constructHDR(0,0);
b biosig.c            sload                         46 HDRTYPE *hdr = constructHDR(0,0);
c biosig.c            uload                        154 HDRTYPE *hdr = constructHDR(0,0);
f ttl2trig.c          main                         318 hdr = constructHDR(minChan,0);

  File                Function                   Line
0 sload.c             sload                        37 HDRTYPE *hdr = sopen(CHAR(asChar(filename)), "r", NULL);
1 sload.c             jsonHeader                   67 HDRTYPE *hdr = sopen(CHAR(asChar(filename)), "r", NULL);
2 biosig4r.c          sload                        37 HDRTYPE *hdr = sopen(CHAR(asChar(filename)), "r", NULL);
3 biosig4r.c          jsonHeader                   67 HDRTYPE *hdr = sopen(CHAR(asChar(filename)), "r", NULL);
4 biosig.c            RerefCHANNEL               3410 HDRTYPE *RR = sopen((const char *)arg2,"r",NULL);
7 biosig.c            sopen_extended             6284 HDRTYPE *hdr2 = sopen(mrkfile,"r",NULL);
8 biosig2.c           biosig2_open_file_readonly  725 HDRTYPE *hdr = sopen(path,"r",NULL);
9 biosig2.c           biosig_open_file_readonly   743 HDRTYPE *hdr = sopen(path,"r",NULL);

As seen above, these conditions are met many times within the library. We can also examine the utility binaries that the Biosig project created for another good example of how the code is intended to be used. Looking at the code of biosig2gdf, we see the same vulnerable usage pattern:

hdr = constructHDR(0,0);     // [18]
// hdr->FLAG.OVERFLOWDETECTION = FlagOverflowDetection;
hdr->FLAG.UCAL = 1;
hdr->FLAG.TARGETSEGMENT = TARGETSEGMENT;
hdr->FLAG.ANONYMOUS = FLAG_ANON;

if (argsweep) {
    k = 0;
    do {
        hdr->AS.SegSel[k++] = strtod(argsweep+1, &argsweep);
    } while (argsweep[0]==',' && (k < 5) );
}
hdr = sopen_extended(source, "r", hdr, &biosig_options); // [19]

We can clearly see a header created at [18] with 0x0 channels and then passed into a call to sopen_extended at [19]. If the argument is made that the channel number could potentially be changed by the argsweep loop, we simply need to look at the usage output of the binary:

if (LenChanList > 0) {
    fprintf(stderr, "argument -c <chanlist> is currently not supported, the argument is ignored ");
    LenChanList=0;
}

In summary, it is reasonable to conclude that, by default, hdr->NS will be 0x0 in the vast majority of cases when sopen_FAMOS_read is reached. This gives us a quick out-of-bounds write via the Cb and CR FAMOS opcodes, both of which have no state machine requirement, so we can hit them while our hdr->CHANNEL buffer is still nothing more than a calloc(0x0). These out-of-bounds writes can potentially lead to code execution.
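
The condition summarized above can be modeled with a small standalone sketch. The demo_* types and functions are simplified stand-ins invented for this example, not libbiosig's HDRTYPE or CHANNEL_TYPE, and the bounds check in demo_set_bi is one possible guard under those assumptions, not the vendor's patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins; only the fields discussed in this report. */
typedef struct { uint32_t bi; } demo_channel;
typedef struct { unsigned NS; demo_channel *CHANNEL; } demo_hdr;

/* Models constructHDR(NS, ...): the channel array is sized by ns, which may be 0. */
static demo_hdr *demo_construct(unsigned ns)
{
    demo_hdr *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->NS = ns;
    /* With ns == 0 this yields NULL or a zero-size block; neither may be written to. */
    h->CHANNEL = calloc(ns, sizeof *h->CHANNEL);
    return h;
}

/* Hypothetical guard: refuse a per-channel write unless the slot exists. */
static int demo_set_bi(demo_hdr *h, unsigned chan, uint32_t value)
{
    if (h == NULL || h->CHANNEL == NULL || chan >= h->NS)
        return -1;   /* would be an out-of-bounds write */
    h->CHANNEL[chan].bi = value;
    return 0;
}

static void demo_destroy(demo_hdr *h)
{
    if (h != NULL) {
        free(h->CHANNEL);
        free(h);
    }
}
```

With a header built by demo_construct(0), demo_set_bi(h, 0, value) returns -1 instead of writing past the allocation, which is the write that the Cb branch performs unconditionally at [11].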

Crash Information

=================================================================
==42563==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000170 at pc 0x7ffff79c4b59 bp 0x7fffffffa970 sp 0x7fffffffa968
WRITE of size 4 at 0x602000000170 thread T0
[Detaching after fork from child process 42567]
    #0 0x7ffff79c4b58 in sopen_FAMOS_read /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:137:27
    #1 0x7ffff79465cb in sopen_extended /biosig/stable_release/biosig-2.5.0/biosig4c++/biosig.c:8481:3
    #2 0x55555566d35f in LLVMFuzzerTestOneInput /biosig/stable_release/./fuzz_biosig.cpp:84:20
    #3 0x5555555934d3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x3f4d3) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #4 0x55555557d24f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x2924f) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #5 0x555555582fa6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/biosig/stable_release/biosig_fuzzer.bin+0x2efa6) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #6 0x5555555acdc2 in main (/biosig/stable_release/biosig_fuzzer.bin+0x58dc2) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #7 0x7ffff7429d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #8 0x7ffff7429e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #9 0x555555577b14 in _start (/biosig/stable_release/biosig_fuzzer.bin+0x23b14) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)

0x602000000170 is located 272 bytes to the right of 16-byte region [0x602000000050,0x602000000060)
allocated by thread T0 here:
    #0 0x55555561a553 in __interceptor_strdup (/biosig/stable_release/biosig_fuzzer.bin+0xc6553) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #1 0x7ffff7901a8f in sopen_extended /biosig/stable_release/biosig-2.5.0/biosig4c++/biosig.c:3728:19
    #2 0x55555566d35f in LLVMFuzzerTestOneInput /biosig/stable_release/./fuzz_biosig.cpp:84:20
    #3 0x5555555934d3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x3f4d3) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #4 0x55555557d24f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x2924f) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #5 0x555555582fa6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/biosig/stable_release/biosig_fuzzer.bin+0x2efa6) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #6 0x5555555acdc2 in main (/biosig/stable_release/biosig_fuzzer.bin+0x58dc2) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #7 0x7ffff7429d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

SUMMARY: AddressSanitizer: heap-buffer-overflow /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:137:27 in sopen_FAMOS_read
Shadow bytes around the buggy address:
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff8000: fa fa 07 fa fa fa 01 fa fa fa 00 00 fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==42563==ABORTING
[Thread 0x7ffff41f9640 (LWP 42566) exited]
[Inferior 1 (process 42563) exited with code 01]

VENDOR RESPONSE

The vendor provided a new release at: https://biosig.sourceforge.net/download.html

TIMELINE

2024-02-05 - Initial Vendor Contact
2024-02-05 - Vendor Disclosure
2024-02-19 - Vendor Patch Release
2024-02-20 - Public Release

Credit

Discovered by Lilith >_> of Cisco Talos.