Talos Vulnerability Report

TALOS-2024-2068

GNOME Project G Structured File Library (libgsf) Compound Document Binary File Directory integer overflow vulnerability

October 3, 2024

CVE Number

CVE-2024-36474

SUMMARY

An integer overflow vulnerability exists in the Compound Document Binary File format parser of the GNOME Project G Structured File Library (libgsf) version v1.14.52. A specially crafted file can result in an integer overflow when processing the directory from the file that allows for an out-of-bounds index to be used when reading and writing to an array. This can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

GNOME Project G Structured File Library (libgsf) 1.14.52
GNOME Project G Structured File Library (libgsf) commit 634340d31177c02ccdb43171e37291948e7f8974

PRODUCT URLS

G Structured File Library (libgsf) - https://gitlab.gnome.org/GNOME/libgsf.git

CVSSv3 SCORE

8.4 - CVSS:3.1/AV:L/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-190 - Integer Overflow or Wraparound

DETAILS

The G Structured File Library (libgsf) is a GNOME project that aims to provide an abstraction layer around different structured file formats. The library supports common archive formats such as tar and zip, as well as other formats such as the compound document file format. It is used by a number of applications to extract data from the supported formats, including Gnumeric, GNOME Commander, AbiWord, and the tracker-miners service. The tracker-miners service is particularly important, as it automatically indexes and parses all files found under the user’s home directory without user interaction.

This vulnerability specifically involves the way the G Structured File Library (libgsf) parses the compound document binary file format. The format is designed as a container that can be used to store multiple streams of information, similar to an archive. Within the container, a directory retaining naming information for the contents of each document component is stored in order to allow for identification of the streams that it contains. This design allows for a writer of said format to manipulate the different streams individually without interfering with other applications that may be accessing the same file. This capability is facilitated by the format organizing its contents using a file allocation table and a layer of indirection to reference said allocation table. Within the file allocation table is a linked-list describing which sectors are contiguous, thus each directory entry will reference its contents by specifying which sector in the file allocation table to start at. It is also worth noting that there are two types of sectors within the file format, with their sizes residing in the document header. For more information on this file format, please review Microsoft’s documentation at https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-cfb/53989ce4-7b05-4f8d-829b-d08d6148375b.

After a consumer of the libgsf library has opened a file using the compound document binary file format, the gsf_infile_msole_new function will be used as an entry point to parse the file’s contents. This function allocates a GsfInfileMSOle structure to contain the information necessary to parse the compound document file. At [1], the ole_init_info function will be called to read the header for the file format.

gsf/gsf-infile-msole.c:971-993
GsfInfile *
gsf_infile_msole_new (GsfInput *source, GError **err)
{
    GsfInfileMSOle *ole;
    gsf_off_t calling_pos;

    g_return_val_if_fail (GSF_IS_INPUT (source), NULL);

    ole = (GsfInfileMSOle *)g_object_new (GSF_INFILE_MSOLE_TYPE, NULL);
    ole->input = gsf_input_proxy_new (source);
    gsf_input_set_size (GSF_INPUT (ole), 0);

    calling_pos = gsf_input_tell (source);
    if (ole_init_info (ole, err)) {                                     // [1] Initialize tables to parse file
        /* We do this so other kinds of archives can be tried.  */
        (void)gsf_input_seek (source, calling_pos, G_SEEK_SET);

        g_object_unref (ole);
        return NULL;
    }

    return GSF_INFILE (ole);
}

Once inside the ole_init_info function, the implementation will start by verifying the signature of the file at [2]. Afterwards at [3], each of the fields composing the header are loaded into variables local to the function’s scope. As mentioned earlier, the compound document binary file format stores its sector sizes within the header. These sizes are stored as a power of 2 (or “shift”). After reading the sector sizes, at [4] the function will verify that their sizes are within a specific range (64 - 1073741824). However, as per Microsoft’s documentation, the file sector shift can be either 9 (512) or 12 (4096) depending on the major version in the header.

gsf/gsf-infile-msole.c:492-669
static gboolean
ole_init_info (GsfInfileMSOle *ole, GError **err)
{
    static guint8 const signature[] =
        { 0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1 };
    guint8 *seen_before;
    guint8 const *header, *tmp;
    guint32 *metabat = NULL;
    MSOleInfo *info;
    guint32 bb_shift, sb_shift, num_bat, num_sbat, num_metabat, threshold, last, dirent_start;
    guint32 metabat_block, *ptr;
    gboolean fail;

    /* check the header */
    if (gsf_input_seek (ole->input, 0, G_SEEK_SET) ||                           
        NULL == (header = gsf_input_read (ole->input, OLE_HEADER_SIZE, NULL)) ||    // [2] Check the signature in the header
        0 != memcmp (header, signature, sizeof (signature))) {
...
    }

    bb_shift      = GSF_LE_GET_GUINT16 (header + OLE_HEADER_BB_SHIFT);              // [3] Read the sector shift (sector size)
    sb_shift      = GSF_LE_GET_GUINT16 (header + OLE_HEADER_SB_SHIFT);              // [3] Read the minisector shift (minisector size)
    num_bat	      = GSF_LE_GET_GUINT32 (header + OLE_HEADER_NUM_BAT);               // [3] Read the number of sectors for the file allocation table
    num_sbat      = GSF_LE_GET_GUINT32 (header + OLE_HEADER_NUM_SBAT);              // [3] Read the number of sectors for the mini file allocation table
    threshold     = GSF_LE_GET_GUINT32 (header + OLE_HEADER_THRESHOLD);             // [3] Read the stream threshold (mini)
    dirent_start  = GSF_LE_GET_GUINT32 (header + OLE_HEADER_DIRENT_START);          // [3] Read the starting sector of the directory
    metabat_block = GSF_LE_GET_GUINT32 (header + OLE_HEADER_METABAT_BLOCK);         // [3] Read the starting sector containing the indirection table
    num_metabat   = GSF_LE_GET_GUINT32 (header + OLE_HEADER_NUM_METABAT);           // [3] Read the number of sectors containing the indirection table
...
    /* Some sanity checks
     * 1) There should always be at least 1 BAT block
     * 2) It makes no sense to have a block larger than 2^31 for now.
     *    Maybe relax this later, but not much.
     */
    if (6 > bb_shift || bb_shift >= 31 || sb_shift > bb_shift ||                    // [4] Validate the sector sizes
        (gsf_input_size (ole->input) >> bb_shift) < 1) {                            // [4] Validate the sector sizes
        if (err != NULL)
            *err = g_error_new (gsf_input_error_id (), 0,
                        _("Unreasonable block sizes"));
        return TRUE;
    }
...
    return FALSE;
}
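
The gap between the range accepted at [4] and the values named in Microsoft’s documentation can be sketched as follows. This is a minimal illustration and not libgsf code: the helper names are hypothetical, the file-size part of the check is omitted, and the stricter check reflects one reading of MS-CFB (sector shift 9 for major version 3, sector shift 12 for major version 4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the shift portion of the check at [4]: any sector shift from
 * 6 to 30 passes as long as the minisector shift is not larger. */
static bool shift_accepted_by_check(uint32_t bb_shift, uint32_t sb_shift)
{
    return !(6 > bb_shift || bb_shift >= 31 || sb_shift > bb_shift);
}

/* Hypothetical stricter check: only the sector shift that MS-CFB names
 * for the given major version is allowed. */
static bool shift_allowed_by_spec(uint16_t major_version, uint32_t bb_shift)
{
    switch (major_version) {
    case 3:  return bb_shift == 9;   /* 512-byte sectors */
    case 4:  return bb_shift == 12;  /* 4096-byte sectors */
    default: return false;
    }
}

/* Converts a shift to a sector size, as done when the header is loaded. */
static uint32_t sector_size_from_shift(uint32_t shift)
{
    assert(shift < 32);
    return UINT32_C(1) << shift;
}
```

With these helpers, a sector shift of 30 passes the first check but is rejected by the second, which is the extra headroom that the overflow described later depends on.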

Once the fields have been read from the header and the sector sizes validated, the function will allocate space for the info local variable. After being allocated, the implementation will store the fields required to read the indirection table and its file allocation table. At [5], the function will use the sector “shift” read from the header to calculate the size of an individual sector, and store it into the info variable. The vulnerability being described specifically involves this sector size. At [6], the same calculation will be made with the minisector “shift” before storing it to the corresponding field of the info variable. At [7], the fields containing the dimensions of the minisector file allocation table will also be stored to info.

gsf/gsf-infile-msole.c:492-669
static gboolean
ole_init_info (GsfInfileMSOle *ole, GError **err)
{
...
    MSOleInfo *info;
...
    info = g_new0 (MSOleInfo, 1);
    ole->info = info;

    info->ref_count	     = 1;
    info->bb.shift	     = bb_shift;                                                                            // [5] Store sector "shift"
    info->bb.size	     = 1 << info->bb.shift;                                                                 // [5] Convert sector shift to size
    info->bb.filter	     = info->bb.size - 1;
    info->sb.shift	     = sb_shift;                                                                            // [6] Store minisector "shift"
    info->sb.size	     = 1 << info->sb.shift;                                                                 // [6] Convert minisector shift to size
    info->sb.filter	     = info->sb.size - 1;
    info->threshold	     = threshold;
    info->sbat_start     = GSF_LE_GET_GUINT32 (header + OLE_HEADER_SBAT_START);                                 // [7] Start of minisector fat
    info->num_sbat       = num_sbat;                                                                            // [7] Number of sectors for minisector fat
    info->max_block	     = (gsf_input_size (ole->input) - OLE_HEADER_SIZE + info->bb.size -1) / info->bb.size;
    info->sb_file	     = NULL;

...
    return FALSE;
}
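
The derived fields stored at [5] can be sketched in isolation. This is an illustrative sketch only: the SectorGeometry type and function names are hypothetical, and HEADER_SIZE is assumed to match OLE_HEADER_SIZE (0x200).

```c
#include <assert.h>
#include <stdint.h>

#define HEADER_SIZE 0x200u  /* assumption: same value as OLE_HEADER_SIZE */

/* Hypothetical stand-in for the shift/size/filter triple kept in info->bb. */
typedef struct {
    uint32_t shift;   /* power of 2 read from the header */
    uint32_t size;    /* sector size in bytes, 1 << shift */
    uint32_t filter;  /* mask selecting the offset within a sector */
} SectorGeometry;

static SectorGeometry geometry_from_shift(uint32_t shift)
{
    SectorGeometry g;

    assert(shift < 32);
    g.shift  = shift;
    g.size   = UINT32_C(1) << shift;
    g.filter = g.size - 1;
    return g;
}

/* Number of sectors that fit after the header, rounded up, following the
 * max_block expression in the listing. */
static uint64_t max_block_for(uint64_t file_size, uint32_t sector_size)
{
    assert(file_size >= HEADER_SIZE && sector_size != 0);
    return (file_size - HEADER_SIZE + sector_size - 1) / sector_size;
}
```

For a shift of 9, the size is 512 and the filter is 0x1ff, so a byte offset such as 0x1234 splits into sector 9 (offset >> shift) and position 0x34 within that sector (offset & filter).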

After the info field has been populated by the ole_init_info function, the following code will be encountered. At [8], the ole_init_info function will start by validating the number of sectors for the file allocation table against the size of the file. Afterwards, the file allocation table will need to be allocated in order to store the sector indices that compose it. The number of entries is calculated by dividing the sector size by the size of an entry (4) and multiplying the result by the number of sectors that was read from the header. This count is then used to perform the allocation at [9]. After space for the file allocation table has been allocated, the function will read each sector from the indirection table, decode a guint32 containing an entry, and then store each entry into the file allocation table at [10].

gsf/gsf-infile-msole.c:492-669
static gboolean
ole_init_info (GsfInfileMSOle *ole, GError **err)
{
...
    guint32 *metabat = NULL;
    MSOleInfo *info;
    guint32 bb_shift, sb_shift, num_bat, num_sbat, num_metabat, threshold, last, dirent_start;
    guint32 metabat_block, *ptr;
...

    /* very rough heuristic, just in case */
    if (num_bat < info->max_block && info->num_sbat < info->max_block) {            // [8] Check number of sectors (regular and mini) against file
        info->bb.bat.num_blocks = num_bat * (info->bb.size / BAT_INDEX_SIZE);
        info->bb.bat.block	= g_new0 (guint32, info->bb.bat.num_blocks);            // [9] Allocate space for the file allocation table

        metabat = g_try_new (guint32, MAX (info->bb.size, OLE_HEADER_SIZE));
        if (!metabat) {
...
        }

        /* Reading the elements invalidates this memory, make copy */
        gsf_ole_get_guint32s (metabat, header + OLE_HEADER_START_BAT,               // [10] Read sector containing file allocation table entries
            OLE_HEADER_SIZE - OLE_HEADER_START_BAT);
        last = num_bat;
        if (last > OLE_HEADER_METABAT_SIZE)
            last = OLE_HEADER_METABAT_SIZE;

        ptr = ole_info_read_metabat (ole, info->bb.bat.block,                       // [10] Read entries into file allocation table
            info->bb.bat.num_blocks, metabat, metabat + last);
        num_bat -= last;
    } else
        ptr = NULL;

...
    return FALSE;
}
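
The entry count used for the allocation at [9] can be sketched as a small helper. The function name is hypothetical, FAT_ENTRY_SIZE stands in for BAT_INDEX_SIZE, and the sketch deliberately returns a 64-bit result so that the illustration itself cannot wrap.

```c
#include <assert.h>
#include <stdint.h>

#define FAT_ENTRY_SIZE 4u  /* size of one table entry, as stated in the text */

/* Entries per sector multiplied by the number of FAT sectors. */
static uint64_t fat_entry_count(uint32_t num_fat_sectors, uint32_t sector_size)
{
    assert(sector_size >= FAT_ENTRY_SIZE);
    return (uint64_t)num_fat_sectors * (sector_size / FAT_ENTRY_SIZE);
}
```

A 512-byte sector holds 128 entries, so the 2064 FAT sectors reported by the proof-of-concept would correspond to 264192 entries.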

After successfully reading the sectors containing the file allocation table, the following code will be executed. At [11], the library will use the starting sector for the directory that was stored in the header to determine the number of sectors that are occupied by the directory using the ole_make_bat function. This function will fetch the index of each sector composing the directory, and then update its parameter with whatever was discovered. At [12], the number of sectors as calculated by the ole_make_bat function will be used to allocate an array for the number of potential directory entries. This is done by taking the product of the number of sectors for the directory, the sector size, and the size of a directory entry (0x80). Because the entire expression is evaluated with 32-bit arithmetic, a product that does not fit in 32 bits will wrap around, resulting in the array being undersized. After allocating the memory with g_malloc0, the allocated memory will then be passed to the ole_dirent_new function call at [13].

gsf/gsf-infile-msole.c:492-669
static gboolean
ole_init_info (GsfInfileMSOle *ole, GError **err)
{
    static guint8 const signature[] =
        { 0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1 };
    guint8 *seen_before;
...

    /* Read the directory's bat, we do not know the size */
    if (ole_make_bat (&info->bb.bat, 0, dirent_start, &ole->bat)) {                         // [11] Count the number of sectors used by the directory
        if (err != NULL)
            *err = g_error_new (gsf_input_error_id (), 0,
                        _("Problems making block allocation table"));
        return TRUE;
    }

    /* Read the directory */
    seen_before = g_malloc0 ((ole->bat.num_blocks << info->bb.shift) * DIRENT_SIZE + 1);    // [12] Multiply the number of sectors by the sector size and a directory entry
    ole->dirent = info->root_dir =
        ole_dirent_new (ole, 0, NULL, seen_before);                                         // [13] Proceed to read the contents of the directory.
    g_free (seen_before);                                                                   // Free allocated memory
    if (ole->dirent == NULL) {
        if (err != NULL)
            *err = g_error_new (gsf_input_error_id (), 0,
                        _("Problems reading directory"));
        return TRUE;
    }

    /*
     * The spec says to ignore modtime for root object.  That doesn't
     * keep files from actually have a modtime there.
     */
    gsf_input_set_modtime (GSF_INPUT (ole), ole->dirent->modtime);

    return FALSE;
}
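
The wraparound at [12] can be reproduced with plain fixed-width integers. This is a sketch under stated assumptions: the function names are hypothetical, and the 32-bit variant models the expression with uint32_t operands rather than the library's own types.

```c
#include <assert.h>
#include <stdint.h>

#define DIRENT_SIZE 0x80u  /* size of a directory entry, as stated in the text */

/* 32-bit evaluation of the size expression. Unsigned arithmetic wraps
 * modulo 2^32, so the high bits of the product are silently dropped. */
static uint32_t seen_before_size_32(uint32_t num_blocks, uint32_t shift)
{
    assert(shift < 32);
    return (num_blocks << shift) * DIRENT_SIZE + 1;
}

/* The same expression widened to 64 bits, showing the intended size. */
static uint64_t seen_before_size_64(uint32_t num_blocks, uint32_t shift)
{
    assert(shift < 32);
    return ((uint64_t)num_blocks << shift) * DIRENT_SIZE + 1;
}
```

For example, 65536 directory sectors with a shift of 9 give 2^16 * 2^9 * 2^7 = 2^32, which wraps to 0 in 32 bits. The 32-bit result is therefore 1 byte, while the 64-bit result is 0x100000001 bytes.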

The following is the implementation of the ole_make_bat function. This function is simply responsible for scanning the file allocation table for a chain at the specified sector. This is accomplished by allocating an array at [14], and then entering a loop that reads indices from the file allocation table into the array. Once an entry in the file allocation table has been read, at [15] this will be appended into the array. After the loop has finished processing all of the available sectors, at [16] the length of the array will be stored to a parameter. This length is used to trigger the integer overflow that is relevant to the mentioned vulnerability.

gsf/gsf-infile-msole.c:154-186
static gboolean
ole_make_bat (MSOleBAT const *metabat, size_t size_guess, guint32 block,
          MSOleBAT *res)
{
    /* NOTE : Only use size as a suggestion, sometimes it is wrong */
    GArray *bat = g_array_sized_new (FALSE, FALSE,                          // [14] Allocate a new array.
        sizeof (guint32), size_guess);

    guint8 *used = (guint8*)g_alloca (1 + metabat->num_blocks / 8);
    memset (used, 0, 1 + metabat->num_blocks / 8);

    while (block < metabat->num_blocks) {
        /* Catch cycles in the bat list */
        if (used[block/8] & (1 << (block & 0x7)))
            break;
        used[block/8] |= 1 << (block & 0x7);

        g_array_append_val (bat, block);                                    // [15] Append current sector to array.
        block = metabat->block [block];
    }

    res->num_blocks = bat->len;                                             // [16] Store length of array into parameter.
    res->block = (guint32 *) (gpointer) g_array_free (bat, FALSE);

    if (block != BAT_MAGIC_END_OF_CHAIN) {
        g_warning ("This OLE2 file is invalid.\n"
               "The Block Allocation Table for one of the streams had 0x%08x instead of a terminator (0x%08x).\n"
               "We might still be able to extract some data, but you'll want to check the file.",
               block, BAT_MAGIC_END_OF_CHAIN);
    }

    return FALSE;
}
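
The chain walk with cycle detection can be illustrated without GLib. This is a simplified sketch, not the library's implementation: the function name is hypothetical, the bitmap is heap-allocated with calloc instead of g_alloca, and only the chain length is returned instead of the list of sectors.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define END_OF_CHAIN 0xfffffffeu  /* chain terminator (ENDOFCHAIN in MS-CFB) */

/* Follows the chain starting at `start` through `fat` and returns the
 * number of sectors visited. Returns UINT32_MAX if the bitmap cannot be
 * allocated. The walk stops at the first index outside the table or at
 * the first sector that was already visited. */
static uint32_t chain_length(const uint32_t *fat, uint32_t num_entries,
                             uint32_t start)
{
    uint8_t *used;
    uint32_t block = start;
    uint32_t count = 0;

    assert(fat != NULL);
    used = calloc((size_t)num_entries / 8 + 1, 1);  /* one bit per sector */
    if (used == NULL)
        return UINT32_MAX;

    while (block < num_entries) {
        if (used[block / 8] & (1u << (block & 7)))
            break;                                   /* cycle detected */
        used[block / 8] |= (uint8_t)(1u << (block & 7));
        count++;
        block = fat[block];                          /* follow the link */
    }

    free(used);
    return count;
}
```

Because the bitmap limits each sector to one visit, the returned length can never exceed the number of entries in the table, but it can be as large as the table itself when a crafted file links every sector into the directory chain.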

Once the result of the integer overflow has been used by the ole_init_info function to allocate the “seen_before” array, the ole_dirent_new function will be called with the undersized array as its last parameter. On the initial call to ole_dirent_new, the “entry” parameter is set to 0. At [17], this parameter is validated to ensure that the index points to a valid sector and is not larger than G_MAXUINT divided by DIRENT_SIZE (0x80). After validation, at [18] the implementation will check a flag from the undersized “seen_before” array before writing to it. At this point, the function will read the contents of the current directory entry and store them inside local variables at [19]. After the contents of the directory entry have been read, at [20] the function will use the fields that were read to recurse and continue parsing the rest of the directory entries within the stream. When this function recurses, the “entry” parameter is controlled by the corresponding field from the directory entry. This results in the ability to read and write outside the boundaries of the “seen_before” array.

gsf/gsf-infile-msole.c:303-412
static MSOleDirent *
ole_dirent_new (GsfInfileMSOle *ole, guint32 entry, MSOleDirent *parent,
        guint8 *seen_before)
{
    MSOleDirent *dirent;
    guint32 block, next, prev, child, size;
    guint8 const *data;
    guint8 type;
    guint16 name_len;
    guint64 ft;

    if (entry >= DIRENT_MAGIC_END)                                      // [17] Validate entry index
        return NULL;

    g_return_val_if_fail (entry <= G_MAXUINT / DIRENT_SIZE, NULL);      // [17] Validate entry index

    block = OLE_BIG_BLOCK (entry * DIRENT_SIZE, ole);
    g_return_val_if_fail (block < ole->bat.num_blocks, NULL);           // [17] Validate that block is valid

    g_return_val_if_fail (!seen_before[entry], NULL);                   // [18] Read out of allocated array
    seen_before[entry] = TRUE;                                          // [18] Write to allocated array

...
    prev  = GSF_LE_GET_GUINT32 (data + DIRENT_PREV);                    // [19] Read field from directory entry
    next  = GSF_LE_GET_GUINT32 (data + DIRENT_NEXT);                    // [19] Read field from directory entry
    child = GSF_LE_GET_GUINT32 (data + DIRENT_CHILD);                   // [19] Read field from directory entry
...
    /* NOTE : These links are a tree, not a linked list */
    ole_dirent_new (ole, prev, parent, seen_before);                    // [20] Recurse for the previous directory entry
    ole_dirent_new (ole, next, parent, seen_before);                    // [20] Recurse for the next directory entry

...
    return dirent;
}
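
One way to guard this pattern is to compute the array size with an explicit overflow check and to compare each index against the resulting capacity before touching the array. The sketch below is an illustration of that idea only. It is not the vendor's fix, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DIRENT_SIZE 0x80u  /* size of a directory entry, as stated in the text */

/* Computes (num_blocks << shift) * DIRENT_SIZE + 1 without wrapping.
 * Returns false when the result cannot be represented in a size_t. */
static bool checked_seen_before_size(uint32_t num_blocks, uint32_t shift,
                                     size_t *out)
{
    uint64_t sector_bytes;

    assert(out != NULL);
    if (shift >= 32)
        return false;

    /* A 32-bit count shifted by at most 31 bits always fits in 64 bits. */
    sector_bytes = (uint64_t)num_blocks << shift;
    if (sector_bytes > (SIZE_MAX - 1) / DIRENT_SIZE)
        return false;

    *out = (size_t)(sector_bytes * DIRENT_SIZE + 1);
    return true;
}

/* An entry index is only usable if it falls inside the allocated array. */
static bool entry_in_bounds(uint32_t entry, size_t capacity)
{
    return entry < capacity;
}
```

With a one-byte array, as produced by the wrapped calculation, entry_in_bounds accepts only index 0, so any index supplied by a directory entry during recursion would be rejected before the read and write at [18].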

Crash Information

If the libgsf library is built with AddressSanitizer, an out-of-bounds read will be encountered when loading the document generated by the proof-of-concept.

==15165==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x502000002590 at pc 0x7ff50b2e5910 bp 0x7ffc7df6b650 sp 0x7ffc7df6b648
READ of size 1 at 0x502000002590 thread T0
    #0 0x7ff50b2e590f in ole_dirent_new /tracker-miners/libgsf/gsf/gsf-infile-msole.c:322:2
    #1 0x7ff50b2e5792 in ole_dirent_new /tracker-miners/libgsf/gsf/gsf-infile-msole.c:403:2
    #2 0x7ff50b2e1ddb in ole_init_info /tracker-miners/libgsf/gsf/gsf-infile-msole.c:653:3
    #3 0x7ff50b2e1ddb in gsf_infile_msole_new /tracker-miners/libgsf/gsf/gsf-infile-msole.c:984:6
    #4 0x42a3b0 in main /tracker-miners/libgsf/tools/fuck.c:23:11
    #5 0x7ff50aa71087 in __libc_start_call_main /usr/src/debug/glibc-2.39-22.fc40.x86_64/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #6 0x7ff50aa7114a in __libc_start_main@GLIBC_2.2.5 /usr/src/debug/glibc-2.39-22.fc40.x86_64/csu/../csu/libc-start.c:360:3
    #7 0x42a414 in _start (/tracker-miners/libgsf/asan/fuck+0x42a414) (BuildId: 00fd75f831d384c1c99a05a2fce59ad16a3afd71)
    
0x502000002590 is located 3998 bytes after 2-byte region [0x5020000015f0,0x5020000015f2)
allocated by thread T0 here:
    #0 0x4c5263 in malloc (/tracker-miners/libgsf/asan/fuck+0x4c5263) (BuildId: 00fd75f831d384c1c99a05a2fce59ad16a3afd71)
    #1 0x7ff50af7e879 in g_malloc /usr/src/debug/glib2-2.80.3-1.fc40.x86_64/redhat-linux-build/../glib/gmem.c:100:13
    #2 0x7ff50b2ce084 in gsf_msole_sorting_key_new /tracker-miners/libgsf/gsf/gsf-msole-utils.c:2678:14
    #3 0x7ff50b2e1ddb in ole_init_info /tracker-miners/libgsf/gsf/gsf-infile-msole.c:653:3
    #4 0x7ff50b2e1ddb in gsf_infile_msole_new /tracker-miners/libgsf/gsf/gsf-infile-msole.c:984:6

SUMMARY: AddressSanitizer: heap-buffer-overflow /tracker-miners/libgsf/gsf/gsf-infile-msole.c:322:2 in ole_dirent_new
Shadow bytes around the buggy address:
  0x502000002300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002400: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002480: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002500: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x502000002580: fa fa[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002600: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002680: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002780: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000002800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07  
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==15165==ABORTING

The proof-of-concept causes only a single element to be allocated for the “seen_before” array, so any non-zero directory entry index is out of bounds. When the ole_dirent_new function is made to recurse, the implementation will use the out-of-bounds index to read from and then write to the “seen_before” array.

Exploit Proof of Concept

In order to generate the document that triggers the vulnerability, Python is required to run the provided proof-of-concept. To generate the file, run the proof-of-concept with the output filename as its only parameter. After a short time, the proof-of-concept will terminate with the file written to disk. The generated file is then ready to be parsed by libgsf, and can be opened with the library using either the gsf tool or by explicitly calling the gsf_infile_msole_new function.

$ python poc.directory.py3.zip filename
Growing DIFAT in order to make space for 2064 entries...
Allocating 2064 sectors for FAT...
Designating 16 sectors in FAT as belonging to the DIFAT...
Designating 2064 sectors in FAT as belonging to the FAT...
Committing changes (17 sectors)...
...

The first sector (0x200 bytes) of the generated file contains the header of the compound document. Within the header are 2 fields that relate to the vulnerability. The first field is the sector size at offset 0x1e, stored as a power of 2. This value can be between 6 and 30, resulting in a sector size from 0x40 to 0x40000000 (respectively). The next field is at offset 0x28 and contains the number of sectors used for the directory. If the product of the sector size, the number of directory sectors, and the size of a directory entry (0x80) does not fit in 32 bits, the vulnerability is triggered.

<class storage.File> 'unnamed_7f3a8e045e50' {unnamed=True}
[0] <instance storage.Header 'Header'> (little) 0xd0cf11e0a1b11ae1 version=3.62 clsid={00000000-0000-0000-0000-000000000000}
[1e] <instance storage.HeaderSectorShift 'SectorShift'> uSectorShift=9 (0x200) uMiniSectorShift=8 (0x100)
[22] <instance ptype.block 'reserved'> (6) "\x00\x00\x00\x00\x00\x00"
[28] <instance storage.HeaderFat 'Fat'> sectDirectory=0x00000820 csectDirectory=65536 csectFat=2064 dwTransaction=0x00000000
[38] <instance storage.HeaderMiniFat 'MiniFat'> ulMiniSectorCutoff=4096 sectMiniFat=ENDOFCHAIN(0xfffffffe) csectMiniFat=0
[44] <instance storage.HeaderDiFat 'DiFat'> sectDifat=0x00000000 csectDifat=16
[4c] <instance storage.DIFAT 'Table'> storage.DIFAT.IndirectPointer[109] +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[200] <instance ptype.block 'padding(Table)'> ...
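
The fields shown in the dump can be read from a raw header buffer with a few little-endian loads. This is a sketch for inspection purposes only: the struct and function names are hypothetical, and the offsets follow the header layout in MS-CFB (sector shift at 0x1e, minisector shift at 0x20, directory sector count at 0x28, FAT sector count at 0x2c, first directory sector at 0x30).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian loads from an unaligned byte buffer. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical container for the header fields discussed in this report. */
typedef struct {
    uint16_t sector_shift;       /* offset 0x1e */
    uint16_t mini_sector_shift;  /* offset 0x20 */
    uint32_t num_dir_sectors;    /* offset 0x28 */
    uint32_t num_fat_sectors;    /* offset 0x2c */
    uint32_t first_dir_sector;   /* offset 0x30 */
} CfbHeaderFields;

static CfbHeaderFields read_header_fields(const uint8_t *header, size_t len)
{
    CfbHeaderFields f;

    assert(header != NULL && len >= 0x34);
    f.sector_shift      = le16(header + 0x1e);
    f.mini_sector_shift = le16(header + 0x20);
    f.num_dir_sectors   = le32(header + 0x28);
    f.num_fat_sectors   = le32(header + 0x2c);
    f.first_dir_sector  = le32(header + 0x30);
    return f;
}
```

Applied to the generated file, this would be expected to report the values from the dump above: a sector shift of 9, a minisector shift of 8, 65536 directory sectors, 2064 FAT sectors, and a first directory sector of 0x820.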

VENDOR RESPONSE

Fixed in 1.14.53

TIMELINE

2024-09-03 - Vendor Disclosure
2024-09-03 - Initial Vendor Contact
2024-10-01 - Vendor Patch Release
2024-10-03 - Public Release

Credit

Discovered by a member of Cisco Talos.