Talos Vulnerability Report

TALOS-2023-1804

GTKWave VCD parse_valuechange portdump out-of-bounds write vulnerabilities

January 8, 2024

CVE Number

CVE-2023-37416, CVE-2023-37417, CVE-2023-37418, CVE-2023-37419, CVE-2023-37420

SUMMARY

Multiple out-of-bounds write vulnerabilities exist in the VCD parse_valuechange portdump functionality of GTKWave 3.3.115. A specially crafted .vcd file can lead to arbitrary code execution. A victim would need to open a malicious file to trigger these vulnerabilities.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

GTKWave 3.3.115

PRODUCT URLS

GTKWave - https://gtkwave.sourceforge.net

CVSSv3 SCORE

7.8 - CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-787 - Out-of-bounds Write

DETAILS

GTKWave is a wave viewer, often used to analyze FPGA simulations and logic analyzer captures. It includes a GUI to view and analyze traces, and it can convert between several file formats (.lxt, .lxt2, .vzt, .fst, .ghw, .vcd, .evcd) either through the UI or through its command line tools. GTKWave is available for Linux, Windows and macOS. Trace files can be shared within teams or organizations, for example to compare results of simulation runs across different design implementations, to analyze protocols captured with logic analyzers, or just as a reference when porting design implementations.

GTKWave sets up MIME types for its supported extensions.

VCD (Value Change Dump) files are parsed by the vcd_parse function. This function is duplicated in several conversion utilities (vcd2lxt, vcd2lxt2, vcd2vzt) and in the GUI portion of GTKWave. In general, the various implementations are very similar or identical, but in some cases they differ enough that the issue described in this advisory cannot be triggered. For example, this issue is not present in the vcd_parse function in vcd_recorder.c.
The rest of this section describes the execution flow for the vcd2lxt utility; the other implementations have very similar behavior.

The function vcd_parse loops over each line in the file [1]. Depending on which token has been read [2], a different switch block is executed:

     static void vcd_parse(int linear) {
         int tok;

[1]      for (;;) {
[2]          switch (get_token()) {
                 ...

The get_token() function simply extracts a token from the file at the current cursor position, saving it to the global yytext buffer and assigning the token’s length to the global yylen.
Moreover, if the token starts with “$”, the token is considered a special symbol, and the text after the “$” has to match one of these tokens:

char *tokens[] = {"var", "end", "scope", "upscope",
                  "comment", "date", "dumpall", "dumpoff", "dumpon",
                  "dumpvars", "enddefinitions",
                  "dumpports", "dumpportsoff", "dumpportson", "dumpportsall",
                  "timescale", "version", "vcdclose", "timezero",
                  "", "", ""};

The return value of get_token is a token type, which is an index inside the tokens array above.
If the token does not start with “$”, the token is considered a string and the returned token type will be T_STRING.
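
The classification rule above can be illustrated with a small sketch. This is not GTKWave's get_token(); the names classify_token, sketch_tokens, T_STRING_SKETCH and T_UNKNOWN_SKETCH are hypothetical, and the handling of an unrecognized “$” keyword is an assumption made only so the example is complete.

```c
#include <assert.h>
#include <string.h>

/* Keyword table copied from the advisory, without the trailing empty
 * entries. The index of a keyword doubles as its token type. */
static const char *const sketch_tokens[] = {
    "var", "end", "scope", "upscope", "comment", "date",
    "dumpall", "dumpoff", "dumpon", "dumpvars", "enddefinitions",
    "dumpports", "dumpportsoff", "dumpportson", "dumpportsall",
    "timescale", "version", "vcdclose", "timezero"
};

enum { SKETCH_NUM_TOKENS = sizeof(sketch_tokens) / sizeof(sketch_tokens[0]) };

/* Hypothetical token types for the two non-keyword outcomes. */
enum { T_UNKNOWN_SKETCH = SKETCH_NUM_TOKENS, T_STRING_SKETCH };

/* Returns the keyword index for "$keyword", T_STRING_SKETCH for text
 * that does not start with '$', and T_UNKNOWN_SKETCH (an assumption)
 * for a '$' token that matches no keyword. */
int classify_token(const char *text)
{
    int i;

    assert(text != NULL);
    if (text[0] != '$')
        return T_STRING_SKETCH;
    for (i = 0; i < SKETCH_NUM_TOKENS; i++) {
        if (strcmp(text + 1, sketch_tokens[i]) == 0)
            return i;
    }
    return T_UNKNOWN_SKETCH;
}
```

With this sketch, classify_token("$var") yields the index of "var" in the table, while a value-change line such as "p0101" is classified as a string.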

Going back to the switch above, if the parsed token is “$var”, the token type will be T_VAR and we will enter the block at [3]:

[3]  case T_VAR: {
         int vtok;
         struct vcdsymbol *v = NULL;

         var_prevch = 0;
         ...
[4]      vtok = get_vartoken(1);

A new token is read using get_vartoken() [4] and saved into vtok.
This function, similarly to get_token(), extracts a token from the file, separated by any of “ “, “\t”, “\n”, or “\r”.
If match_kw (the first argument to the function) is 1, then the token is matched against the vartypes array and the return value is set to the matching index inside it:

static char *vartypes[] = {"event", "parameter",
                           "integer", "real", "real_parameter", "realtime", "string", "reg", "supply0",
                           "supply1", "time", "tri", "triand", "trior",
                           "trireg", "tri0", "tri1", "wand", "wire", "wor", "port", "in", "out", "inout",
                           "$end", "", "", "", ""};

Continuing with the T_VAR case, the token needs to be “port” or any of the symbols that precede it in vartypes [5]:

         ...
[5]      if (vtok > V_PORT) goto bail;

[6]      v = (struct vcdsymbol *)calloc_2(1, sizeof(struct vcdsymbol));
         v->vartype = vtok;
         v->msi = v->lsi = vcd_explicit_zero_subscripts; /* indicate [un]subscripted status */

         if (vtok == V_PORT) {
             ...
         } else /* regular vcd var, not an evcd port var */
         {
[7]          vtok = get_vartoken(1);
             if (vtok == V_END) goto err;
[7]          v->size = atoi_64(yytext);
[8]          vtok = get_strtoken();
             if (vtok == V_END) goto err;
             v->id = (char *)malloc_2(yylen + 1);
[8]          strcpy(v->id, yytext);
             v->nid = vcdid_hash(yytext, yylen);

             if (v->nid < vcd_minid) vcd_minid = v->nid;
             if (v->nid > vcd_maxid) vcd_maxid = v->nid;

[9]          vtok = get_vartoken(0);
             if (vtok != V_STRING) goto err;
             if (slisthier_len) {
                ...
             } else {
                 v->name = (char *)malloc_2(yylen + 1);
[9]              strcpy(v->name, yytext);
             }

[10]         vtok = get_vartoken(1);
             if (vtok == V_END) goto dumpv;
             ...
         }

A series of tokens is then extracted to populate the vcdsymbol pointed to by v [6], in this order:

  • [7] v->size (integer) is the size of this symbol
  • [8] v->id is a string representing the symbol ID
  • [9] v->name is a string representing the name of this symbol
  • [10] the declaration must end with the string “$end”
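
To make the field order concrete, the following sketch parses the text that follows “$var” in a declaration such as “$var wire 8 ! data $end”. It is a simplified illustration, not the parser's code: struct var_decl_sketch and parse_var_decl are hypothetical names, fixed-size buffers replace the malloc_2 allocations, and sscanf replaces the token-by-token reads.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct vcdsymbol. */
struct var_decl_sketch {
    int  size;      /* [7]  width of the symbol        */
    char id[32];    /* [8]  short identifier code      */
    char name[64];  /* [9]  human-readable symbol name */
};

/* Parses "<type> <size> <id> <name> $end".
 * Returns 1 on success and 0 if a field is missing or the
 * declaration is not closed by "$end" [10]. */
int parse_var_decl(const char *text, struct var_decl_sketch *out)
{
    char type[32];
    char end[8];

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%31s %d %31s %63s %7s",
               type, &out->size, out->id, out->name, end) != 5)
        return 0;
    return strcmp(end, "$end") == 0;
}
```

For the example declaration, the sketch fills in a size of 8, the id “!” and the name “data”.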

The code then continues by populating v->value and v->narray:

     dumpv:
         ...

         /* initial conditions */
[11]     v->value = (char *)malloc_2(v->size + 1);
         v->value[v->size] = 0;
         v->narray = (struct Node **)calloc_2(v->size, sizeof(struct Node *));
         {
             int i;
             for (i = 0; i < v->size; i++) {
[12]             v->value[i] = 'x';

                 v->narray[i] = (struct Node *)calloc_2(1, sizeof(struct Node));
                 v->narray[i]->head.time = -1;
                 v->narray[i]->head.v.val = 1;
             }
         }

         ...

[13]     if (!vcdsymroot) {
             vcdsymroot = vcdsymcurr = v;
         } else {
             vcdsymcurr->next = v;
             vcdsymcurr = v;
         }
         numsyms++;

         ...

     bail:
         if (vtok != V_END) sync_end(NULL);
         break;
     }

v->value is allocated with a size of v->size + 1 [11] before it is initialized with “x” characters [12] and null-terminated.
Finally, the new v symbol is added to the symbols list [13] and numsyms is incremented. Here the case block ends and we go back to the loop at [1].
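
The sizing convention matters for the rest of the analysis: a symbol of width v->size owns a buffer of v->size + 1 bytes holding a string of exactly v->size characters. A minimal sketch of that initialization, assuming plain malloc in place of the malloc_2 wrapper and a hypothetical helper name make_initial_value, could look like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocates size + 1 bytes and fills them with `size` 'x' characters
 * followed by a terminating null byte, as done at [11] and [12].
 * Returns NULL if the allocation fails; the caller frees the result. */
char *make_initial_value(int size)
{
    char *value;

    assert(size >= 0);
    value = malloc((size_t)size + 1);
    if (value == NULL)
        return NULL;
    memset(value, 'x', (size_t)size);
    value[size] = '\0';
    return value;
}
```

For a width of 4, the result is the string “xxxx”, which occupies 5 bytes including the terminator.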

The next token is read; if it matches “$enddefinitions” the switch block at [14] is entered.

[14] case T_ENDDEFINITIONS:
         if (!header_over) {
[15]         header_over = 1; /* do symbol table management here */
             create_sorted_table();
             if ((!sorted) && (!indexed)) {
                 fprintf(stderr, "No symbols in VCD file..nothing to do!\n");
                 exit(1);
             }

             if (linear) lt_set_no_interlace(lt);
         }
         break;

At [15], the variable header_over is set to 1. This is important, because this variable must be set to 1 in order to call the parse_valuechange() function in the next step.
At this point, we go back to the loop at [1].

The next token is read. If it’s a string, we enter the switch block at [16].

     case T_STRING:
[16]     if (header_over) {
             /* catchall for events when header over */
[17]         if (yytext[0] == '#') {
                ...
             } else {
[18]             parse_valuechange();
             }
         }
         break;

If header_over is 1 [16] and the string does not start with “#” [17], the function parse_valuechange() is called.

     static void parse_valuechange(void) {
         struct vcdsymbol *v;
         char *vector;
         int vlen;

         switch (yytext[0]) {
             ...
[19]         case 'p':
                 /* extract port dump value.. */
[20]             vector = malloc_2(yylen_cache = yylen);
                 strcpy(vector, yytext + 1);
                 vlen = yylen - 1;

[21]             get_strtoken(); /* throw away 0_strength_component */
                 get_strtoken(); /* throw away 0_strength_component */
                 get_strtoken(); /* this is the id                  */
[22]             v = bsearch_vcd(yytext, yylen);
                 if (!v) {
                     fprintf(stderr, "Near line %d, Unknown identifier: '%s'\n", vcdlineno, yytext);
                     free_2(vector);
                 } else {
[23]                 if (vlen < v->size) /* fill in left part */
                     {
                         char extend;
                         int i, fill;

                         extend = '0';

                         fill = v->size - vlen;
                         for (i = 0; i < fill; i++) {
                             v->value[i] = extend;
                         }
                         evcd_strcpy(v->value + fill, vector);
                     }
                     ...

                     if ((v->size == 1) || (!atomic_vectors)) {
                        ...
                     } else {
[24]                     if (yylen_cache < v->size) {
                             free_2(vector);
[25]                         vector = malloc_2(v->size + 1);
                         }
[26]                     strcpy(vector, v->value);
                         add_histent(current_time, v->narray[0], 0, 1, vector);
                         free_2(vector);
                     }
                 }
                 break;
             ...
         }
     }

If the token starts with the letter “p”, we enter the block at [19].
The array vector is allocated with a size of yylen [20], the length of the token string excluding the null terminator. Since only the part after “p” is copied into it, the allocation is just large enough to hold that string and its terminator. vlen is then assigned yylen - 1, which is the length of the string after “p” (again without the null terminator).
Two tokens are then extracted and discarded [21]. Finally, the id token is extracted and looked up in the vcdsymbol list [22]. If it is found, v points to the matching vcdsymbol.
At [23], if vlen is smaller than v->size, the string in vector is not long enough to fill the whole v->value buffer, so v->value is padded on the left with “0” characters and the contents of vector are copied after the padding.
At [24], the code attempts to make sure the padded string will fit in vector, reallocating it if necessary [25]. Finally, the padded v->value is copied back into vector [26].
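
The padding step at [23] can be sketched as follows. This is an illustration under assumptions: left_pad_value is a hypothetical helper, and a plain memcpy stands in for evcd_strcpy, whose character translation is not shown in the excerpt. Unlike the original branch, which is only taken when vlen < v->size, the sketch also accepts a vector that already has the full width.

```c
#include <assert.h>
#include <string.h>

/* Writes `vector` right-aligned into `value`, padding on the left with
 * '0'. `value` must have room for size + 1 bytes and `vector` must not
 * be longer than `size` characters. */
void left_pad_value(char *value, int size, const char *vector)
{
    int vlen;
    int fill;

    assert(value != NULL && vector != NULL);
    vlen = (int)strlen(vector);
    fill = size - vlen;
    assert(fill >= 0);

    memset(value, '0', (size_t)fill);                /* left padding    */
    memcpy(value + fill, vector, (size_t)vlen + 1);  /* data plus '\0'  */
}
```

After this step the string in value is always exactly size characters long, regardless of how short the token in the file was. This is why the later copy back into vector needs size + 1 bytes.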

The problem in this code is the check at [24]: yylen_cache is the size of vector, and v->size is the length of the v->value string. So, to copy the v->value string into vector safely, vector needs to have a size of v->size + 1. The check at [24] only reallocates if vector has a size smaller than v->size, whereas it should check for yylen_cache <= v->size. When yylen_cache equals v->size, the strcpy at [26] therefore writes one null byte out of bounds on the heap. With careful heap massaging, which looks feasible because allocations in this parser can be easily controlled via the file contents, an attacker can abuse this to execute arbitrary code.
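
The off-by-one can be checked with simple arithmetic, without performing any unsafe copy. The sketch below models only the sizes involved; the function names are hypothetical, and it assumes the string in v->value is exactly v->size characters long when it is copied at [26].

```c
#include <assert.h>

/* Condition at [24] as written in the vulnerable code. */
int reallocates_as_written(int yylen_cache, int size)
{
    return yylen_cache < size;
}

/* Condition suggested in the advisory. */
int reallocates_corrected(int yylen_cache, int size)
{
    return yylen_cache <= size;
}

/* Number of bytes that strcpy(vector, v->value) would write past the
 * end of vector, given the capacity of vector (yylen_cache) and the
 * symbol width (size). `corrected` selects which check is applied. */
int overflow_bytes(int yylen_cache, int size, int corrected)
{
    int grow;
    int capacity;
    int needed;

    assert(yylen_cache >= 0 && size >= 0);
    grow = corrected ? reallocates_corrected(yylen_cache, size)
                     : reallocates_as_written(yylen_cache, size);
    capacity = grow ? size + 1 : yylen_cache;  /* [25] allocates size + 1 */
    needed = size + 1;                         /* string plus terminator  */
    return needed > capacity ? needed - capacity : 0;
}
```

For yylen_cache = 30 and a symbol width of 30, overflow_bytes(30, 30, 0) reports a single out-of-bounds byte: a 31-byte write into a 30-byte buffer, which has the same shape as the AddressSanitizer report in the Crash Information section. With the corrected check, or with any other combination of sizes, the result is 0.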

As mentioned before, this issue affects five different source files, listed separately below.

CVE-2023-37416 - VCD GUI legacy

The GUI’s legacy VCD parsing code incorrectly checks the buffer size at line src/vcd.c:965, leading to a NULL byte out-of-bounds write in heap.

This can be triggered by using the -L flag when starting GTKWave.

CVE-2023-37417 - VCD GUI interactive

The GUI’s interactive VCD parsing code incorrectly checks the buffer size at line src/vcd_partial.c:901, leading to a NULL byte out-of-bounds write in heap.

This can be triggered by using the -I flag when starting GTKWave.

CVE-2023-37418 - vcd2vzt

The vcd2vzt conversion utility incorrectly checks the buffer size at line src/helpers/vcd2vzt.c:935, leading to a NULL byte out-of-bounds write in heap.

CVE-2023-37419 - vcd2lxt2

The vcd2lxt2 conversion utility incorrectly checks the buffer size at line src/helpers/vcd2lxt2.c:933, leading to a NULL byte out-of-bounds write in heap.

CVE-2023-37420 - vcd2lxt

The vcd2lxt conversion utility incorrectly checks the buffer size at line src/helpers/vcd2lxt.c:928, leading to a NULL byte out-of-bounds write in heap.

Crash Information

==156408==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf4d00b6e at pc 0xf79f6887 bp 0xffffd698 sp 0xffffd270
WRITE of size 31 at 0xf4d00b6e thread T0
    #0 0xf79f6886 in __interceptor_strcpy ../../../../src/libsanitizer/asan/asan_interceptors.cpp:425
    #1 0x5655cc29 in parse_valuechange vcd2lxt.c:933
    #2 0x5655f9de in vcd_parse vcd2lxt.c:1417
    #3 0x56561640 in vcd_main vcd2lxt.c:1704
    #4 0x56562dad in main vcd2lxt.c:1959
    #5 0xf7647294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #6 0xf7647357 in __libc_start_main_impl ../csu/libc-start.c:381
    #7 0x565583f6 in _start (vcd2lxt+0x33f6)

0xf4d00b6e is located 0 bytes to the right of 30-byte region [0xf4d00b50,0xf4d00b6e)
allocated by thread T0 here:
    #0 0xf7a55ffb in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x56578a6d in malloc_2 v2l_debug.c:92
    #2 0x5655c470 in parse_valuechange vcd2lxt.c:874
    #3 0x5655f9de in vcd_parse vcd2lxt.c:1417
    #4 0x56561640 in vcd_main vcd2lxt.c:1704
    #5 0x56562dad in main vcd2lxt.c:1959
    #6 0xf7647294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: heap-buffer-overflow ../../../../src/libsanitizer/asan/asan_interceptors.cpp:425 in __interceptor_strcpy
Shadow bytes around the buggy address:
  0x3e9a0110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a0120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a0130: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a0140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a0150: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x3e9a0160: fa fa fa fa fa fa fa fa fa fa 00 00 00[06]fa fa
  0x3e9a0170: 00 00 00 07 fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a0180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a0190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a01a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e9a01b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb

VENDOR RESPONSE

Fixed in version 3.3.118, available from https://sourceforge.net/projects/gtkwave/files/gtkwave-3.3.118/

TIMELINE

2023-08-01 - Vendor Disclosure
2023-12-31 - Vendor Patch Release
2024-01-08 - Public Release

CREDIT

Discovered by Claudio Bozzato of Cisco Talos.