Talos Vulnerability Report

TALOS-2023-1817

GTKWave VZT vzt_rd_process_block autosort out-of-bounds write vulnerabilities

January 8, 2024

CVE NUMBER

CVE-2023-39235, CVE-2023-39234

SUMMARY

Multiple out-of-bounds write vulnerabilities exist in the VZT vzt_rd_process_block autosort functionality of GTKWave 3.3.115. A specially crafted .vzt file can lead to arbitrary code execution. A victim would need to open a malicious file to trigger these vulnerabilities.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

GTKWave 3.3.115

PRODUCT URLS

GTKWave - https://gtkwave.sourceforge.net

CVSSv3 SCORE

7.8 - CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-129 - Improper Validation of Array Index

DETAILS

GTKWave is a wave viewer, often used to analyze FPGA simulations and logic analyzer captures. It includes a GUI to view and analyze traces, and can convert between several file formats (.lxt, .lxt2, .vzt, .fst, .ghw, .vcd, .evcd) either through the UI or with its command line tools. GTKWave is available for Linux, Windows and macOS. Trace files can be shared within teams or organizations, for example to compare results of simulation runs across different design implementations, to analyze protocols captured with logic analyzers, or just as a reference when porting design implementations.

GTKWave registers MIME types for its supported extensions. So, for example, it’s enough for a victim to double-click on a wave file received by e-mail to cause the gtkwave program to be executed and load a potentially malicious file.

VZT (Verilog Zipped Trace) files are parsed by the functions found in vzt_read.c. These functions are used in the vzt2vcd file conversion utility, in vztminer, and by the GUI portion of GTKWave. All of these are therefore affected by the issue described in this report.

To parse VZT files, the function vzt_rd_init_smp is called:

     struct vzt_rd_trace *vzt_rd_init_smp(const char *name, unsigned int num_cpus) {
[1]      struct vzt_rd_trace *lt = (struct vzt_rd_trace *)calloc(1, sizeof(struct vzt_rd_trace));
         ...

[2]      if (!(lt->handle = fopen(name, "rb"))) {
             vzt_rd_close(lt);
             lt = NULL;
         } else {
             vztint16_t id = 0, version = 0;
             ...
[3]          if (!fread(&id, 2, 1, lt->handle)) {
                 id = 0;
             }
             if (!fread(&version, 2, 1, lt->handle)) {
                 id = 0;
             }
             if (!fread(&lt->granule_size, 1, 1, lt->handle)) {
                 id = 0;
             }
         ...

At [1] the lt structure is initialized. This is the structure that will contain all the information about the input file.
The input file is opened [2] and three fields are read [3] to make sure the input file is a supported VZT file.

         ...
         rcf = fread(&lt->numfacs, 4, 1, lt->handle);
[4]      lt->numfacs = rcf ? vzt_rd_get_32(&lt->numfacs, 0) : 0;
         ...
         rcf = fread(&lt->numfacbytes, 4, 1, lt->handle);
         lt->numfacbytes = rcf ? vzt_rd_get_32(&lt->numfacbytes, 0) : 0;
         rcf = fread(&lt->longestname, 4, 1, lt->handle);
         lt->longestname = rcf ? vzt_rd_get_32(&lt->longestname, 0) : 0;
         rcf = fread(&lt->zfacnamesize, 4, 1, lt->handle);
         lt->zfacnamesize = rcf ? vzt_rd_get_32(&lt->zfacnamesize, 0) : 0;
         rcf = fread(&lt->zfacname_predec_size, 4, 1, lt->handle);
         lt->zfacname_predec_size = rcf ? vzt_rd_get_32(&lt->zfacname_predec_size, 0) : 0;
         rcf = fread(&lt->zfacgeometrysize, 4, 1, lt->handle);
         lt->zfacgeometrysize = rcf ? vzt_rd_get_32(&lt->zfacgeometrysize, 0) : 0;
         rcf = fread(&lt->timescale, 1, 1, lt->handle);
         ...

Several fields are then read from the file [4]:

  • numfacs: the number of facilities (elements in facnames)
  • numfacbytes: unused
  • longestname: keeps the longest length of all defined facilities’ names
  • zfacnamesize: compressed size of facnames
  • zfacname_predec_size: decompressed size of facnames
  • zfacgeometrysize: compressed size of facgeometry

Then, the facnames and facgeometry structures are extracted. They can be compressed with either gzip, bzip2 or lzma, depending on the first 2 bytes within the structure buffer.

Right after these two structures, there’s a sequence of blocks that can be arbitrarily long.

     for (;;) {
         ...
[5]      b = calloc(1, sizeof(struct vzt_rd_block));
         b->last_rd_value_idx = ~0;

[6]      rcf = fread(&b->uncompressed_siz, 4, 1, lt->handle);
         b->uncompressed_siz = rcf ? vzt_rd_get_32(&b->uncompressed_siz, 0) : 0;
         rcf = fread(&b->compressed_siz, 4, 1, lt->handle);
         b->compressed_siz = rcf ? vzt_rd_get_32(&b->compressed_siz, 0) : 0;
         rcf = fread(&b->start, 8, 1, lt->handle);
         b->start = rcf ? vzt_rd_get_64(&b->start, 0) : 0;
         rcf = fread(&b->end, 8, 1, lt->handle);
         b->end = rcf ? vzt_rd_get_64(&b->end, 0) : 0;
         pos = ftello(lt->handle);

         ...
         if ((b->uncompressed_siz) && (b->compressed_siz) && (b->end)) {
             /* fprintf(stderr, VZT_RDLOAD"block [%d] %lld / %lld\n", lt->numblocks, b->start, b->end); */
             fseeko(lt->handle, b->compressed_siz, SEEK_CUR);

             lt->numblocks++;
             if (lt->numblocks <= lt->pthreads) {
                 vzt_rd_pthread_mutex_init(lt, &b->mutex, NULL);
                 vzt_rd_decompress_blk_pth(lt, b); /* prefetch first block */
             }

[7]          if (lt->block_curr) {
                 b->prev = lt->block_curr;
                 lt->block_curr->next = b;
                 lt->block_curr = b;
                 lt->end = b->end;
             } else {
                 lt->block_head = lt->block_curr = b;
                 lt->start = b->start;
                 lt->end = b->end;
             }
         } else {
             free(b);
             break;
         }

         pos += b->compressed_siz;
     }

At [5] the block structure is initialized. At [6] some fields are extracted. Finally, the block is saved inside a linked list [7].

From this code we can see the file structure for a block as follows:

  • uncompressed_siz - unsigned big endian 32-bit
  • compressed_siz - unsigned big endian 32-bit
  • start_time - unsigned big endian 64-bit
  • end_time - unsigned big endian 64-bit
  • compressed data of size compressed_siz
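The layout above can be sketched as a small parser over an in-memory buffer. This is only an illustration of the field order and byte order listed here, not GTKWave's code: the names blk_header, read_be and parse_blk_header are hypothetical, and the sketch checks the remaining length before reading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the on-disk block header described above. */
struct blk_header {
    uint32_t uncompressed_siz;
    uint32_t compressed_siz;
    uint64_t start_time;
    uint64_t end_time;
};

#define BLK_HEADER_LEN 24u /* 4 + 4 + 8 + 8 bytes */

/* Read an unsigned big-endian integer of n bytes (n <= 8). */
uint64_t read_be(const unsigned char *p, size_t n) {
    uint64_t v = 0;
    assert(p != NULL && n <= 8);
    for (size_t i = 0; i < n; i++) v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if fewer than BLK_HEADER_LEN bytes are available. */
int parse_blk_header(const unsigned char *buf, size_t len, struct blk_header *out) {
    if (buf == NULL || out == NULL || len < BLK_HEADER_LEN) return -1;
    out->uncompressed_siz = (uint32_t)read_be(buf, 4);
    out->compressed_siz = (uint32_t)read_be(buf + 4, 4);
    out->start_time = read_be(buf + 8, 8);
    out->end_time = read_be(buf + 16, 8);
    return 0;
}
```

The compressed payload of compressed_siz bytes would follow immediately after these 24 bytes.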

Upon return from the current vzt_rd_init_smp function, the blocks are parsed inside vzt_rd_iter_blocks.

Eventually, a call to vzt_rd_decompress_blk decompresses the compressed contents of the block and sets b->mem to point to the contents of the decompressed data.

Once b->mem is set, we reach a call to vzt_rd_block_vch_decode that parses the decompressed block contents.

     static void vzt_rd_block_vch_decode(struct vzt_rd_trace *lt, struct vzt_rd_block *b) {
         vzt_rd_pthread_mutex_lock(lt, &b->mutex);

         if ((!b->times) && (b->mem)) {
             vztint64_t *times = NULL;
             vztint32_t *change_dict = NULL;
             vztint32_t *val_dict = NULL;
             unsigned int num_time_ticks, num_sections, num_dict_entries;
[8]          unsigned char *pnt = b->mem;
             vztint32_t i, j, m, num_dict_words;
             /* vztint32_t *block_end = (vztint32_t *)(pnt + b->uncompressed_siz); */
             vztint32_t *val_tmp;
             unsigned int num_bitplanes;
             uintptr_t padskip;

[9]         num_time_ticks = vzt_rd_get_v32(&pnt);
            ...
[10]        num_sections = vzt_rd_get_v32(&pnt);
            num_dict_entries = vzt_rd_get_v32(&pnt);
            padskip = ((uintptr_t)pnt) & 3;
[11]        pnt += (padskip) ? 4 - padskip : 0; /* skip pad to next 4 byte boundary */
            ...
            val_dict = (vztint32_t *)pnt;
            pnt = (char *)(val_dict + (num_dict_words = num_dict_entries * num_sections));

        bpcalc:
[12]        num_bitplanes = vzt_rd_get_byte(pnt, 0) + 1;
            pnt++;
            b->multi_state = (num_bitplanes > 1);
            padskip = ((uintptr_t)pnt) & 3;
[13]        pnt += (padskip) ? 4 - padskip : 0; /* skip pad to next 4 byte boundary */
            b->vindex = (vztint32_t *)(pnt);
            ...
            pnt = (char *)(b->vindex + num_bitplanes * lt->total_values);
            ...

            num_dict_words = (num_sections * num_dict_entries) * sizeof(vztint32_t);
[14]        change_dict = malloc(num_dict_words ? num_dict_words : sizeof(vztint32_t)); /* scan-build */
            m = 0;
            for (i = 0; i < num_dict_entries; i++) {
                vztint32_t pbit = 0;
                for (j = 0; j < num_sections; j++) {
                    vztint32_t k = val_dict[m];
[15]                vztint32_t l = k ^ ((k << 1) ^ pbit);
                    change_dict[m++] = l;
                    pbit = k >> 31;
                }
            }

[16]        b->val_dict = val_dict;
            b->change_dict = change_dict;
            b->times = times;
            b->num_time_ticks = num_time_ticks;
            b->num_dict_entries = num_dict_entries;
            b->num_sections = num_sections;
        }

At [8] pnt is set to point to the decompressed block data.
At [9] num_time_ticks is extracted as a 32-bit varint from pnt and times are extracted from the block.
At [10] num_sections and num_dict_entries are extracted.
At [11] a pointer to val_dict is extracted, ensuring it’s aligned to a 4-byte boundary. This array contains wave values.
At [12] num_bitplanes is extracted, 1 byte.
At [13] a pointer to vindex is extracted, ensuring it’s aligned to a 4-byte boundary.
At [14] change_dict is allocated and then filled in from val_dict. We can see at [15] that the loop computes all signal transitions in the val_dict integer array, which is treated as a bit-stream. This is important to note because it means the contents of change_dict are fully controlled by the file contents.

Finally at [16] the extracted values are saved inside the block structure b.
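To make the arithmetic at [15] concrete, the following sketch applies the same expression to a single row of words. It assumes vztint32_t is an unsigned 32-bit type (uint32_t is used here), and the function name change_row is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same arithmetic as [15], applied to one row of num_sections words.
 * Bit n of an output word is set when bit n of the input differs from the
 * bit before it in the stream; pbit carries bit 31 of the previous word
 * across the word boundary. */
void change_row(const uint32_t *val, uint32_t *chg, size_t num_sections) {
    uint32_t pbit = 0;
    assert(val != NULL && chg != NULL);
    for (size_t j = 0; j < num_sections; j++) {
        uint32_t k = val[j];
        chg[j] = k ^ ((k << 1) ^ pbit);
        pbit = k >> 31;
    }
}
```

For example, the input word 0x0000000C (bits 2 and 3 set) produces 0x00000014: one transition at bit 2, where the signal rises, and one at bit 4, where it falls. Any bit pattern in val_dict therefore maps to a predictable pattern in change_dict.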

Back inside vzt_rd_process_block, execution resumes at [17], and the function then loops over all facilities [19]:

 int vzt_rd_process_block(struct vzt_rd_trace *lt, struct vzt_rd_block *b) {
     unsigned int i, i2;
     vztint32_t idx;
     char *pnt = lt->value_current_sector, *pnt2 = lt->value_previous_sector;
     char buf[32];
     char *bufpnt;

     struct vzt_ncycle_autosort **autosort;
     struct vzt_ncycle_autosort *deadlist = NULL;
     struct vzt_ncycle_autosort *autofacs = calloc(lt->numrealfacs ? lt->numrealfacs : 1, sizeof(struct vzt_ncycle_autosort)); /* fix for scan-build on lt->numrealfacs */

     vzt_rd_block_vch_decode(lt, b);
[17] vzt_rd_pthread_mutex_lock(lt, &b->mutex);

[18] autosort = calloc(b->num_time_ticks, sizeof(struct vzt_ncycle_autosort *));
     for (i = 0; i < b->num_time_ticks; i++) autosort[i] = NULL;
     deadlist = NULL;

[19] for (idx = 0; idx < lt->numrealfacs; idx++) {
         int process_idx = idx / 8;
         int process_bit = idx & 7;

             ...

             i2 = vzt_rd_next_value_chg_time(lt, b, i, idx);
             if (i2) {
                 struct vzt_ncycle_autosort *t = autosort[i2];

                 autofacs[idx].next = t;
[22]             autosort[i2] = autofacs + idx;
             } else {
                 struct vzt_ncycle_autosort *t = deadlist;
                 autofacs[idx].next = t;
                 deadlist = autofacs + idx;
             }
         }
     }

[20] for (i = 1; i < b->num_time_ticks; i++) {
         struct vzt_ncycle_autosort *t = autosort[i];

         if (t) {
             while (t) {
                 struct vzt_ncycle_autosort *tn = t->next;

                 idx = t - autofacs;

                 vzt_rd_fac_value(lt, b, i, idx, pnt);
                 if (!(lt->flags[idx] & (VZT_RD_SYM_F_DOUBLE | VZT_RD_SYM_F_STRING))) {
                     lt->value_change_callback(&lt, &b->times[i], &idx, &pnt);
                 } else {
                     if (lt->flags[idx] & VZT_RD_SYM_F_DOUBLE) {
                         bufpnt = buf;
                         vzt_rd_double_xdr(pnt, buf);
[21]                     lt->value_change_callback(&lt, &b->times[i], &idx, &bufpnt);
                     } else {
                         unsigned int spnt = vzt_rd_make_sindex(pnt);
                         char *msg = ((!i) && (b->prev)) ? "UNDEF" : b->sindex[spnt];
[21]                     lt->value_change_callback(&lt, &b->times[i], &idx, &msg);
                     }
                 }

                 i2 = vzt_rd_next_value_chg_time(lt, b, i, idx);

                 if (i2 != i) {
                     struct vzt_ncycle_autosort *ta = autosort[i2];

                     autofacs[idx].next = ta;
[23]                 autosort[i2] = autofacs + idx;
                 } else {
                     struct vzt_ncycle_autosort *ta = deadlist;
                     autofacs[idx].next = ta;
                     deadlist = autofacs + idx;
                 }

                 t = tn;
             }
         }
     }
     ...

At a high level, this function sorts the wave value changes by looping over all facilities [19] and time ticks [20] and eventually emits VCD syntax accordingly [21].
At [18], autosort is allocated with a size of b->num_time_ticks * sizeof(void *). Entries of the autosort array are then written, at index i2, with pointers into the autofacs array. This index should always be smaller than b->num_time_ticks. However, this is not guaranteed: i2 can be arbitrarily controlled, and there are no checks that make sure the writes at [22] and [23] happen within the bounds of the autosort array.
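The kind of check that is absent at [22] and [23] can be sketched as follows. This is a minimal illustration under stated assumptions, not the vendor's patch: struct node stands in for struct vzt_ncycle_autosort, and bucket_push_checked is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vzt_ncycle_autosort. */
struct node {
    struct node *next;
};

/* Link n at the head of buckets[slot] only if slot is inside the array.
 * Returns 1 if linked, 0 if the slot was rejected. */
int bucket_push_checked(struct node **buckets, size_t num_buckets,
                        size_t slot, struct node *n) {
    assert(buckets != NULL && n != NULL);
    if (slot >= num_buckets) return 0; /* the comparison missing at [22] and [23] */
    n->next = buckets[slot];
    buckets[slot] = n;
    return 1;
}
```

In terms of the code above, buckets corresponds to autosort, num_buckets to b->num_time_ticks, and slot to i2.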

i2 is returned by vzt_rd_next_value_chg_time:

 vztint32_t vzt_rd_next_value_chg_time(struct vzt_rd_trace *lt, struct vzt_rd_block *b, vztint32_t time_offset, vztint32_t facidx) {
     unsigned int i;
     vztint32_t len = lt->len[facidx];
     vztint32_t vindex_offset = lt->vindex_offset[facidx];
     vztint32_t vindex_offset_x = vindex_offset + lt->total_values;
     vztint32_t old_time_offset = time_offset;
     int word = time_offset / 32;
     int bit = (time_offset & 31) + 1;
     int row_size = b->num_sections;
     vztint32_t *valpnt, *valpnt_x;
     vztint32_t change_msk;

     if ((time_offset >= (b->num_time_ticks - 1)) || (facidx > lt->numrealfacs)) return (time_offset);

     time_offset &= ~31;

     for (; word < row_size; word++) {
         if (bit != 32) {
             change_msk = 0;

             if (!(lt->flags[facidx] & VZT_RD_SYM_F_SYNVEC)) {
                 if (b->multi_state) {
                     for (i = 0; i < len; i++) {
                         valpnt = b->change_dict + (b->vindex[vindex_offset + i] * row_size + word);
                         valpnt_x = b->change_dict + (b->vindex[vindex_offset_x + i] * row_size + word);
                         change_msk |= *valpnt;
                         change_msk |= *valpnt_x;
                     }
                 } else {
                     for (i = 0; i < len; i++) {
                         valpnt = b->change_dict + (b->vindex[vindex_offset + i] * row_size + word);
                         change_msk |= *valpnt;
                     }
                 }
             } else {
                 if (b->multi_state) {
                     for (i = 0; i < len; i++) {
                         if ((facidx + i) >= lt->numfacs) break;

                         vindex_offset = lt->vindex_offset[facidx + i];
                         vindex_offset_x = vindex_offset + lt->total_values;

                         valpnt = b->change_dict + (b->vindex[vindex_offset] * row_size + word);
                         valpnt_x = b->change_dict + (b->vindex[vindex_offset_x] * row_size + word);
                         change_msk |= *valpnt;
                         change_msk |= *valpnt_x;
                     }
                 } else {
                     for (i = 0; i < len; i++) {
                         if ((facidx + i) >= lt->numfacs) break;

                         vindex_offset = lt->vindex_offset[facidx + i];

                         valpnt = b->change_dict + (b->vindex[vindex_offset] * row_size + word);
                         change_msk |= *valpnt;
                     }
                 }
             }

             change_msk >>= bit;
             if (change_msk) {
[24]             return ((change_msk & 1 ? 0 : vzt_rd_tzc(change_msk)) + time_offset + bit);
             }
         }

         time_offset += 32;
         bit = 0;
     }

     return (old_time_offset);
 }

In this function change_msk is calculated from change_dict, which is controlled by the values inside blocks [15]. At [24], the function returns time_offset + bit plus the number of trailing zero bits in the shifted change_msk (as computed by vzt_rd_tzc), which is the position of the next set bit in the mask. As this is controlled by file contents, an attacker can control i2, which in turn leads to out-of-bounds writes in the heap at [22] and [23].
With careful heap manipulation, this issue can be used to achieve arbitrary code execution.

CVE-2023-39234 - autosort numrealfacs

At [22], while looping over lt->numrealfacs, autosort can be written out-of-bounds in the heap, which can in turn lead to arbitrary code execution.

CVE-2023-39235 - autosort num_time_ticks

At [23], while looping over b->num_time_ticks, autosort can be written out-of-bounds in the heap, which can in turn lead to arbitrary code execution.

VENDOR RESPONSE

Fixed in version 3.3.118, available from https://sourceforge.net/projects/gtkwave/files/gtkwave-3.3.118/

TIMELINE

2023-08-02 - Vendor Disclosure
2023-12-31 - Vendor Patch Release
2024-01-08 - Public Release

CREDIT

Discovered by Claudio Bozzato of Cisco Talos.