Talos Vulnerability Report

TALOS-2024-1922

The Biosig Project libbiosig sopen_FAMOS_read integer underflow to out-of-bounds write vulnerability

February 20, 2024

CVE Number

CVE-2024-23313

SUMMARY

An integer underflow vulnerability exists in the sopen_FAMOS_read functionality of The Biosig Project libbiosig 2.5.0 and Master Branch (ab0ee111). A specially crafted .famos file can lead to an out-of-bounds write which in turn can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

The Biosig Project libbiosig 2.5.0
The Biosig Project libbiosig Master Branch (ab0ee111)

PRODUCT URLS

libbiosig - https://biosig.sourceforge.net/index.html

CVSSv3 SCORE

9.8 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-191 - Integer Underflow (Wrap or Wraparound)

DETAILS

Libbiosig is an open source library designed to process various types of medical signal data (EKG, EEG, etc.) across a wide variety of file formats. Libbiosig is also at the core of biosig APIs in Octave and Matlab, sigviewer, and other scientific software used for interpreting biomedical signal data.

When reading in or writing out data of any filetype, libbiosig will always end up hitting the sopen_extended function:

HDRTYPE* sopen_extended(const char* FileName, const char* MODE, HDRTYPE* hdr, biosig_options_type *biosig_options) {
/*
    MODE="r"
        reads file and returns HDR
    MODE="w"
        writes HDR into file
 */

This is where the vast majority of the parsing logic lives for most file types, although a few formats end up calling more specific sopen_* functions. Regardless, unless specifically stated, it’s safe to assume we’re somewhere in this extremely large function. The general flow of sopen_extended is as one might expect: initialize generic structures, figure out what file type we’re dealing with, parse the file type, and finally populate the generic structures that can be used by whatever is calling sopen_extended. To determine the file type, sopen_extended calls getfiletype, which goes through a list of magic byte comparisons. Alternatively, we could force a particular file type, but this is generally more useful when writing data to a file.

Moving on from the generic overview, we can get more specific. The current vulnerability concerns the imc FAMOS file format, a generic format for quick data analysis. To figure out if we’re dealing with a .famos file, getfiletype runs the following magic-byte check:

else if (!memcmp(Header1,"|CF,",4))
        hdr->TYPE = FAMOS;

Simple enough, and assuming we find a .famos file, sopen_extended hits the following branch:

#ifdef WITH_FAMOS
        else if (hdr->TYPE==FAMOS) {
            hdr->HeadLen=count;
            sopen_FAMOS_read(hdr);
    }
#endif

It is worth noting that while FAMOS support can be disabled via the WITH_FAMOS compile-time flag, it is enabled by default. Continuing on into sopen_FAMOS_read:

EXTERN_C void sopen_FAMOS_read(HDRTYPE* hdr) {
#define Header1 ((char*)hdr->AS.Header) 

        size_t count = hdr->HeadLen;

        char *t, *t2;
        const char EOL[] = "|;\xA\xD";
        size_t pos, l1, len;
        pos  = strspn(Header1, EOL);   // [1]
        uint16_t gdftyp, CHAN=0;
        char OnOff=1;
        double Fs = NAN;
        uint32_t NoChanCurrentGroup = 0;    // number of (undefined) channels of current group 
        int level = 0;  // to check consistency of file

        char flag_AbstandFile = 0;      // interleaved format ??? used for experimental code 

        fprintf(stdout,"SOPEN(FAMOS): support is experimental. Only time series with equidistant sampling and single sampling rate are supported.\n"); // [2]

        while (pos < count-20) { // [3]
            t       = Header1+pos;  // start of line // [4]

            l1      = strcspn(t+5, ","); // [5]
            t[l1+5] = 0;
            len     = atol(t+5);  // [6]
            pos    += 6+l1;       
            t2      = Header1+pos;  // start of line // [7]
            if (count < max(pos,hdr->HeadLen)+256) { // HeadLen can be updated... //[8]
                    size_t bufsiz = 4095;
                    hdr->AS.Header = (uint8_t*)realloc(hdr->AS.Header, count+bufsiz+1); // [9]  
                    count += ifread(hdr->AS.Header+count,1,bufsiz,hdr);
            }
            pos    += len+1;
            
        // [...]
        pos += strcspn(Header1+pos,EOL);
        pos += strspn(Header1+pos,EOL);
    }
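
The line-splitting steps marked [5] through [7] can be isolated into a small standalone sketch. The helper below is hypothetical (split_line and line_header are not libbiosig names) and assumes a NUL-terminated line; it only mirrors how l1, len, and t2 are derived, and it adds a guard for a missing comma that the original loop does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the three values the loop derives per line. */
typedef struct {
    size_t l1;          /* number of characters in the length field */
    long len;           /* declared data length, as parsed by atol */
    const char *data;   /* start of the data (t2 in the original), or NULL */
} line_header;

/* Splits a line of the form "TT,v,<len>,<data>" in place. */
static line_header split_line(char *t)
{
    line_header h = { 0, 0, NULL };
    assert(t != NULL);
    if (strlen(t) < 5)
        return h;                    /* too short to hold a tag */
    h.l1 = strcspn(t + 5, ",");      /* skip the 4-byte tag and its comma */
    if (t[h.l1 + 5] != ',')
        return h;                    /* no comma after the length field */
    t[h.l1 + 5] = 0;                 /* terminate the length field */
    h.len = atol(t + 5);
    h.data = t + 6 + h.l1;           /* same as pos += 6 + l1 */
    return h;
}
```

For the line Cb,1,18,1,1,1,1,333,1,1,1; this yields l1 = 2, len = 18, and a data pointer at 1,1,1,1,333,1,1,1;.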

To start, our code skips over any leading newlines or delimiters at [1], and then prints a notice that FAMOS support is experimental [2]. Heeding this warning, our input file proceeds with caution as we enter the main parsing loop at [3]. The variable t [4] marks the start of the current line, l1 [5] is the length of that line's length field (the +5 skips the 4-byte tag denoting the data type plus the comma that follows it), len [6] ends up being the declared length of the data of the current line, and finally t2 [7] ends up pointing to the actual start of our data for the current line. If our count variable is smaller than the larger of pos and hdr->HeadLen plus 0x100 [8], then we end up reallocating our memory buffer and reading more of the file into it [9]. To see how each opcode or tag is parsed within our file, we examine a sample opcode in the logic below:

        if (!strncmp(t,"CF,2",4) && (level==0)) {                     
            level = 1;
        }
        else if (!strncmp(t,"CK,1",4) && (level==1)) {                
            level = 2;
        }
        else if (!strncmp(t,"NO,1",4) && ((level==1) || (level==2))) { 
                  // [...]
        }
        else if (!strncmp(t,"CT,1",4) && (level>1)) {
        }

        else if (!strncmp(t,"Cb,1",4))  // Buffer Beschreibung         // [10]
        {
            // AnzahlBufferInKey
            int p = strcspn(t2,",");
            t2[p] = 0;
            if (atoi(t2) != 1) {
                biosigERROR(hdr, B4C_FORMAT_UNSUPPORTED, "FAMOS: more than one buffer not supported");
            }
            // BytesInUserInfo
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // Buffer Referenz
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // IndexSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // OffsetBufferInSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            hdr->CHANNEL[CHAN].bi = atol(t2); // [11]
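
The repeated strcspn, terminate, advance pattern above amounts to walking comma-separated fields in place. A minimal sketch of that pattern follows; field_as_long is a hypothetical helper, not a libbiosig function, and unlike the original it stops once the line runs out of commas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of the zero-based comma-separated field `index`,
 * or 0 if the buffer has fewer fields. The buffer is modified: every
 * comma that is walked over is replaced with a NUL byte. */
static long field_as_long(char *t2, int index)
{
    assert(t2 != NULL && index >= 0);
    for (int i = 0; ; i++) {
        size_t p = strcspn(t2, ",");
        int had_comma = (t2[p] == ',');
        t2[p] = 0;                   /* terminate the current field */
        if (i == index)
            return atol(t2);
        if (!had_comma)
            return 0;                /* ran out of fields */
        t2 += p + 1;                 /* step past the terminator */
    }
}
```

With the data 1,1,1,1,333,1,1,1; the field at index 4 (OffsetBufferInSamplesKey in the snippet above) parses to 333.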

Which parsing code we hit is determined first by the first four bytes of our line, after which an internal state machine variable (level) is checked. While not all opcodes require a particular state (e.g. [10]), as seen above, most do. Continuing on in the branch at [10], assuming a given line in our input file starts with “Cb,1”, we search the rest of that line for five commas and turn each of them into a null byte so that the individual fields can be parsed. The first example of this parsing is at [11], where we populate the bi member of hdr->CHANNEL[CHAN] with the long returned by atol. Importantly, this particular line is but one example of a more general idea: we can write into hdr->CHANNEL[CHAN] at various spots, and our task with FAMOS files is generally to find interesting spots that turn these into out-of-bounds heap writes. For this particular writeup, there’s a spot inside sopen_FAMOS_read where we can realloc the hdr->CHANNEL buffer, which will become very interesting very quickly:

        else if (!strncmp(t,"CG,1",4)) // Definition eines Datenfeldes     // [12]
        {
            int p;
            // Anzahl Komponenten               
            p = strcspn(t2,",");
            t2[p] = 0;
            NoChanCurrentGroup = atol(t2);  // additional channels // [13]
            hdr->NS += NoChanCurrentGroup; 
            // Feldtyp
            p = strcspn(t2,",");
            t2[p] = 0;
            OnOff = 1;
            if (atoi(t2) != 1) {
        //          OnOff = 0; 
        //          biosigERROR(hdr, B4C_DATATYPE_UNSUPPORTED, "FAMOS: data is not real and aquidistant sampled");
            }
            // Dimension 
            p = strcspn(t2,",");
            t2[p] = 0;

            hdr->CHANNEL = (CHANNEL_TYPE*)realloc(hdr->CHANNEL, hdr->NS*sizeof(CHANNEL_TYPE));   // [14]
            level = 3;   // [15]
        }
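
The sizes involved at [13] and [14] can be checked in isolation. The sketch below assumes the values quoted in this report (hdr->NS is a uint16_t, NoChanCurrentGroup is a uint32_t, and sizeof(CHANNEL_TYPE) is 0x158); the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of one CHANNEL_TYPE entry, as quoted in this report. */
enum { CHANNEL_TYPE_SIZE = 0x158 };

/* Models hdr->NS += NoChanCurrentGroup: the sum is truncated to 16 bits. */
static uint16_t add_group(uint16_t ns, uint32_t group)
{
    return (uint16_t)(ns + group);
}

/* Models the size passed to realloc at [14]. */
static size_t channel_bytes(uint16_t ns)
{
    return (size_t)ns * CHANNEL_TYPE_SIZE;
}

/* Worked example: a single CG line declaring four channels. */
static void worked_example(void)
{
    uint16_t ns = add_group(0, 4);
    assert(ns == 4);
    assert(channel_bytes(ns) == 1376);
}
```

Four channels give an allocation of 4 * 0x158 = 1376 bytes, which matches the size of the heap region named in the crash information further down.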

Assuming we have a line something like CG,1,8,4,11,12;;; [12], the 8 is treated as the size of the data and the following 0x4 is put into the uint32_t NoChanCurrentGroup variable [13]. hdr->NS is then advanced by this same amount, and at [14] we realloc our hdr->CHANNEL buffer to hold hdr->NS entries of sizeof(CHANNEL_TYPE) bytes each. A quick relevant tangent to the above code snippet: what types do all these variables happen to be? Well, hdr->NS happens to be a uint16_t, NoChanCurrentGroup happens to be a uint32_t, and sizeof(CHANNEL_TYPE) is 0x158, all of which will become relevant further on. Also useful to note: our state machine has now been set to level = 3 at [15]. For fun, let us now assume that our next ten lines are all CC,1,1,1;:

        else if (!strncmp(t,"CC,1",4) && (level>=3)) {
            if (NoChanCurrentGroup<1) {                           // [15]
                biosigERROR(hdr, B4C_UNSPECIFIC_ERROR, "FAMOS: too many CC definitions in group");
            }
            CHAN = hdr->NS - NoChanCurrentGroup--;  // [16]

            if (CHAN==0)
                hdr->SampleRate = Fs;
            else if (OnOff && (fabs(hdr->SampleRate - Fs)>1e-9*Fs)) {
                fprintf(stdout,"ERR2: %i %f %f\n",CHAN,hdr->SampleRate, Fs);
                // biosigERROR(hdr, B4C_DATATYPE_UNSUPPORTED, "FAMOS: multiple sampling rates not supported");
            }
            if (VERBOSE_LEVEL>7)
                fprintf(stdout,"CC: %i#%i Fs=%f,%i\n",OnOff,CHAN,Fs,(int)len);

            level = 4;   // [17]
        }
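
The effect of repeating the statement at [16] can be reproduced in isolation. The sketch below is a hypothetical model (cc_state and cc_step are illustrative names) that keeps only the two variables involved, with the types given in this report, and assumes a platform where int is 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t ns;      /* models hdr->NS */
    uint32_t group;   /* models NoChanCurrentGroup */
} cc_state;

/* Models: CHAN = hdr->NS - NoChanCurrentGroup--; */
static uint16_t cc_step(cc_state *s)
{
    assert(s != NULL);
    /* ns is converted to unsigned int for the subtraction, so the result
     * wraps modulo 2^32 before being truncated to 16 bits. The
     * post-decrement of group wraps from 0 to 0xFFFFFFFF. */
    return (uint16_t)(s->ns - s->group--);
}
```

Starting from ns = 4 and group = 4, successive calls return 0, 1, 2, 3, 4, 5, 6. The fifth call already returns an index equal to ns, and the seventh, made while group holds 0xFFFFFFFE, returns 0x6.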

We’ve established that our NoChanCurrentGroup is currently 0x4, so we pass the NoChanCurrentGroup<1 check at [15] (which honestly doesn’t matter, because biosigERROR just sets flags in the hdr struct that do not get checked here). Next, at [16], the uint16_t CHAN variable is set to hdr->NS - NoChanCurrentGroup--. Since we’re hitting this code ten times, and there’s nothing to stop us from doing so, NoChanCurrentGroup is decremented on every pass until it goes below zero and wraps around, eventually reaching 0xFFFFFFFE. When we then compute 0x4 - 0xFFFFFFFE, i.e. 0x4 - (-2), we end up with 0x6, a CHAN value that is bigger than hdr->NS. Continuing on, with everything set in place, we have a few places to actually trigger this vulnerability. To keep it simple, let’s examine the Cb opcode once again, and assume that our input line now contains Cb,1,18,1,1,1,1,333,1,1,1;:

 else if (!strncmp(t,"Cb,1",4))  // Buffer Beschreibung        
        {
            // AnzahlBufferInKey
            int p = strcspn(t2,",");
            t2[p] = 0;
            if (atoi(t2) != 1) {
                biosigERROR(hdr, B4C_FORMAT_UNSUPPORTED, "FAMOS: more than one buffer not supported");
            }
            // BytesInUserInfo
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // Buffer Referenz
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // IndexSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            // OffsetBufferInSamplesKey
            t2 += 1+p;
            p = strcspn(t2,",");
            t2[p] = 0;
            hdr->CHANNEL[CHAN].bi = atol(t2); // [18]

Skipping all the boilerplate parsing code that was already covered at [10], we hit the atol(t2) at [18] with an input-controlled value of 333. This time, however, we know that hdr->CHANNEL only contains four channels, whilst the CHAN index is 0x6, resulting in an out-of-bounds write on the heap. This process can be repeated as often as needed, since the “CC” opcode can be repeated without limit and the other opcodes that write to hdr->CHANNEL[CHAN] can also be repeated. Taken together, these repeatable, input-controlled heap writes could potentially lead to code execution.
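
One way such an index could be kept in range is to refuse a CC line once the current group has no channels left. The sketch below is purely illustrative: next_channel is a hypothetical helper, and it is not the vendor's patch, whose contents are not described in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores the next channel index in *chan and returns 0, or returns -1
 * without touching *chan or *remaining when the group is exhausted or
 * claims more channels than ns. */
static int next_channel(uint16_t ns, uint32_t *remaining, uint16_t *chan)
{
    assert(remaining != NULL && chan != NULL);
    if (*remaining == 0 || *remaining > ns)
        return -1;                       /* would leave [0, ns) */
    *chan = (uint16_t)(ns - *remaining); /* now always below ns */
    (*remaining)--;
    return 0;
}
```

With ns = 4 and four channels remaining, the first four calls hand out indices 0 through 3 and every later call fails instead of wrapping the counter.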

Crash Information

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2214963163
INFO: Loaded 2 modules   (18494 inline 8-bit counters): 18457 [0x7f5beaeba570, 0x7f5beaebed89), 37 [0x557fc18661c8, 0x557fc18661ed), 
INFO: Loaded 2 PC tables (18494 PCs): 18457 [0x7f5beaebed90,0x7f5beaf06f20), 37 [0x557fc18661f0,0x557fc1866440), 
./biosig_fuzzer.bin: Running 1 inputs 1 time(s) each.
Running: ../fuzzing/triage/FAMOS_bugs/write_4_141/underflow_arb_write4.famos
=================================================================
==40206==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61a0000009d0 at pc 0x7f5beadc4b59 bp 0x7ffef7d45490 sp 0x7ffef7d45488
WRITE of size 4 at 0x61a0000009d0 thread T0
    #0 0x7f5beadc4b58 in sopen_FAMOS_read /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:137:27
    #1 0x7f5bead465cb in sopen_extended /biosig/stable_release/biosig-2.5.0/biosig4c++/biosig.c:8481:3
    #2 0x557fc182335f in LLVMFuzzerTestOneInput /biosig/stable_release/./fuzz_biosig.cpp:84:20
    #3 0x557fc17494d3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x3f4d3) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #4 0x557fc173324f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x2924f) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #5 0x557fc1738fa6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/biosig/stable_release/biosig_fuzzer.bin+0x2efa6) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #6 0x557fc1762dc2 in main (/biosig/stable_release/biosig_fuzzer.bin+0x58dc2) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #7 0x7f5bea829d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #8 0x7f5bea829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #9 0x557fc172db14 in _start (/biosig/stable_release/biosig_fuzzer.bin+0x23b14) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)

0x61a0000009d0 is located 1008 bytes to the right of 1376-byte region [0x61a000000080,0x61a0000005e0)
allocated by thread T0 here:
    #0 0x557fc17e5f76 in __interceptor_realloc (/biosig/stable_release/biosig_fuzzer.bin+0xdbf76) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #1 0x7f5beadc2d7a in sopen_FAMOS_read /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:224:35
    #2 0x7f5bead465cb in sopen_extended /biosig/stable_release/biosig-2.5.0/biosig4c++/biosig.c:8481:3
    #3 0x557fc182335f in LLVMFuzzerTestOneInput /biosig/stable_release/./fuzz_biosig.cpp:84:20
    #4 0x557fc17494d3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x3f4d3) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #5 0x557fc173324f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/biosig/stable_release/biosig_fuzzer.bin+0x2924f) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #6 0x557fc1738fa6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/biosig/stable_release/biosig_fuzzer.bin+0x2efa6) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #7 0x557fc1762dc2 in main (/biosig/stable_release/biosig_fuzzer.bin+0x58dc2) (BuildId: 9ffac83f55dadf5472f09c72de5ba7a7aa4860e0)
    #8 0x7f5bea829d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

SUMMARY: AddressSanitizer: heap-buffer-overflow /biosig/stable_release/biosig-2.5.0/biosig4c++/./t210/sopen_famos_read.c:137:27 in sopen_FAMOS_read
Shadow bytes around the buggy address:
  0x0c347fff80e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff80f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c347fff8130: fa fa fa fa fa fa fa fa fa fa[fa]fa fa fa fa fa
  0x0c347fff8140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8150: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8160: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8170: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c347fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==40206==ABORTING
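
The addresses in the output above are consistent with a write to channel index 6. A small sketch of that arithmetic follows; it assumes sizeof(CHANNEL_TYPE) is 0x158 as stated in the details, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one CHANNEL_TYPE entry, as quoted in this report. */
enum { CHANNEL_TYPE_SIZE = 0x158 };

typedef struct {
    uint64_t offset;     /* fault address minus start of the region */
    uint64_t index;      /* which CHANNEL_TYPE slot the offset falls in */
    uint64_t within;     /* byte offset inside that slot */
    uint64_t past_end;   /* distance beyond the end of the allocation */
} oob_info;

/* Relates a faulting address to the heap region it overflowed. */
static oob_info locate(uint64_t fault, uint64_t base, uint64_t region_size)
{
    oob_info r;
    assert(fault >= base + region_size);   /* only meaningful for overflows */
    r.offset = fault - base;
    r.index = r.offset / CHANNEL_TYPE_SIZE;
    r.within = r.offset % CHANNEL_TYPE_SIZE;
    r.past_end = r.offset - region_size;
    return r;
}
```

Calling locate with the fault address 0x61a0000009d0, the region start 0x61a000000080, and the region size 1376 gives an offset of 0x950, slot index 6, an offset of 0x140 within that slot, and 1008 bytes past the end of the region, the same distance AddressSanitizer reports.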

VENDOR RESPONSE

The vendor provided a new release at: https://biosig.sourceforge.net/download.html

TIMELINE

2024-02-05 - Initial Vendor Contact
2024-02-05 - Vendor Disclosure
2024-02-19 - Vendor Patch Release
2024-02-20 - Public Release

Credit

Discovered by Lilith >_> of Cisco Talos.