CVE-2023-37447,CVE-2023-37446,CVE-2023-37445,CVE-2023-37444,CVE-2023-37442,CVE-2023-37443
Multiple out-of-bounds read vulnerabilities exist in the VCD var definition section functionality of GTKWave 3.3.115. A specially crafted .vcd file can lead to arbitrary code execution. A victim would need to open a malicious file to trigger these vulnerabilities.
The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.
GTKWave 3.3.115
GTKWave - https://gtkwave.sourceforge.net
7.8 - CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE-119 - Improper Restriction of Operations within the Bounds of a Memory Buffer
GTKWave is a wave viewer, often used to analyze FPGA simulations and logic analyzer captures. It includes a GUI to view and analyze traces, as well as convert across several file formats (.lxt, .lxt2, .vzt, .fst, .ghw, .vcd, .evcd) either by using the UI or its command line tools. GTKWave is available for Linux, Windows and MacOS. Trace files can be shared within teams or organizations, for example to compare results of simulation runs across different design implementations, to analyze protocols captured with logic analyzers or just as a reference when porting design implementations.
GTKWave sets up mime types for its supported extensions.
VCD (Value Change Dump) files are parsed by the vcd_parse function. This function is duplicated in several conversion utilities (vcd2lxt, vcd2lxt2, vcd2vzt) and in the GUI portion of GTKWave. In general the various implementations are very similar or identical, and in this case they are all affected by the issue described in this advisory.
Let’s describe the execution flow for the vcd2lxt utility; the other implementations have very similar behavior.
The function vcd_parse loops over each line in the file [1]. Depending on which token has been read [2], a different switch block is executed:
     static void vcd_parse(int linear) {
         int tok;
[1]      for (;;) {
[2]          switch (get_token()) {
                 ...
The get_token() function simply extracts a token from the file at the current cursor position, saving it to the global yytext buffer and assigning the token’s length to the global yylen.
Moreover, if the token starts with “$”, the token is considered a special symbol, and it has to match one of these tokens:
    char *tokens[] = {"var", "end", "scope", "upscope",
                      "comment", "date", "dumpall", "dumpoff", "dumpon",
                      "dumpvars", "enddefinitions",
                      "dumpports", "dumpportsoff", "dumpportson", "dumpportsall",
                      "timescale", "version", "vcdclose", "timezero",
                      "", "", ""};
The return value of get_token is a token type, which is an index inside the tokens array above.
If the token does not start with “$”, the token is considered a string and the returned token type will be T_STRING.
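The keyword lookup described above can be sketched as follows. This is a minimal illustration and not GTKWave's code: lookup_keyword and TOKEN_STRING are hypothetical names, the trailing empty entries of the table are omitted, and treating an unknown "$" keyword as a plain string is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Keyword table as listed in the advisory (empty padding entries omitted). */
static const char *const tokens[] = {
    "var", "end", "scope", "upscope",
    "comment", "date", "dumpall", "dumpoff", "dumpon",
    "dumpvars", "enddefinitions",
    "dumpports", "dumpportsoff", "dumpportson", "dumpportsall",
    "timescale", "version", "vcdclose", "timezero"};

#define NUM_TOKENS ((int)(sizeof(tokens) / sizeof(tokens[0])))

/* Hypothetical stand-in for T_STRING: any value outside the keyword range. */
#define TOKEN_STRING NUM_TOKENS

/* Hypothetical helper: map raw token text to a token type. */
static int lookup_keyword(const char *text) {
    assert(text != NULL);
    if (text[0] != '$') {
        return TOKEN_STRING; /* plain string, e.g. a value change */
    }
    for (int i = 0; i < NUM_TOKENS; i++) {
        if (strcmp(text + 1, tokens[i]) == 0) {
            return i; /* the index into tokens[] is the token type */
        }
    }
    /* Assumption of this sketch: unknown "$" keywords fall back to a string. */
    return TOKEN_STRING;
}
```

With this table, lookup_keyword("$var") yields index 0 and lookup_keyword("$enddefinitions") yields index 10, which is the sense in which the token type is an index into the array.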
Going back to the switch above, if the parsed token is “$var”, the token type will be T_VAR and we will enter the block at [3]:
[3]  case T_VAR: {
         int vtok;
         struct vcdsymbol *v = NULL;
         var_prevch = 0;
         ...
[4]      vtok = get_vartoken(1);
A new token is read using get_vartoken() [4] and saved into vtok.
This function, similarly to get_token(), extracts a token from the file, separated by any of “ ”, “\t”, “\n”, or “\r”.
If match_kw (the first argument to the function) is 1, then the token is matched against the vartypes array, and the return value is set to the matching index inside it:
    static char *vartypes[] = {"event", "parameter",
        "integer", "real", "real_parameter", "realtime", "string", "reg", "supply0",
        "supply1", "time", "tri", "triand", "trior",
        "trireg", "tri0", "tri1", "wand", "wire", "wor", "port", "in", "out", "inout",
        "$end", "", "", "", ""};
Continuing with the T_VAR case, our token needs to be “port” or any of the other symbols that precede it in vartypes [5]:
         ...
[5]      if (vtok > V_PORT) goto bail;
[6]      v = (struct vcdsymbol *)calloc_2(1, sizeof(struct vcdsymbol));
         v->vartype = vtok;
         v->msi = v->lsi = vcd_explicit_zero_subscripts; /* indicate [un]subscripted status */
         if (vtok == V_PORT) {
             ...
         } else /* regular vcd var, not an evcd port var */
         {
[7]          vtok = get_vartoken(1);
             if (vtok == V_END) goto err;
[7]          v->size = atoi_64(yytext);
[8]          vtok = get_strtoken();
             if (vtok == V_END) goto err;
             v->id = (char *)malloc_2(yylen + 1);
[8]          strcpy(v->id, yytext);
[11]         v->nid = vcdid_hash(yytext, yylen);
[12]         if (v->nid < vcd_minid) vcd_minid = v->nid;
             if (v->nid > vcd_maxid) vcd_maxid = v->nid;
[9]          vtok = get_vartoken(0);
             if (vtok != V_STRING) goto err;
             if (slisthier_len) {
                 ...
             } else {
                 v->name = (char *)malloc_2(yylen + 1);
[9]              strcpy(v->name, yytext);
             }
[10]         vtok = get_vartoken(1);
             if (vtok == V_END) goto dumpv;
             ...
         }
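A series of tokens is then extracted to populate the vcdsymbol pointed to by v [6], in this order: the variable size [7], the variable id [8], the variable name [9], and a further token [10] that completes the definition when it is “$end”.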
Most importantly, a hash of v->id is computed [11] and the vcd_minid and vcd_maxid global variables are updated [12] to represent, respectively, the minimum and maximum hash IDs ever encountered.
It is interesting to note how the hash is computed in the vcdid_hash function:
    static unsigned int vcdid_hash(char *s, int len) {
        unsigned int val = 0;
        int i;
        s += (len - 1);
        for (i = 0; i < len; i++) {
            val *= 94;
            val += (((unsigned char)*s) - 32);
            s--;
        }
        return (val);
    }
As we can see, the input symbol s is consumed from its last character to its first: on each iteration val is multiplied by 94, and the current character, minus 32, is added to it. Short symbols therefore produce small hashes, while longer symbols can land anywhere in the unsigned integer range. Let’s see two simple examples:
vcdid_hash("A", 1) returns 0x21
vcdid_hash("AAAAA", 5) returns 0x9b3890cb
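The two values above can be reproduced with a small standalone version of the hash. This is a sketch that keeps the arithmetic of the listing but indexes the string instead of walking a pointer; it assumes a 32-bit unsigned int, so the multiplication wraps modulo 2^32.

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the vcdid_hash() listing: base-94 digits, last
 * character processed first, unsigned wrap-around on overflow.
 * Assumes unsigned int is 32 bits wide. */
static unsigned int vcdid_hash(const char *s, int len) {
    unsigned int val = 0;
    assert(s != NULL && len >= 0);
    for (int i = len - 1; i >= 0; i--) {
        val = val * 94u + ((unsigned char)s[i] - 32u);
    }
    return val;
}
```

For “A” the single digit is 65 - 32 = 33, that is 0x21. For “AAAAA” the sum is 33 * (94^4 + 94^3 + 94^2 + 94 + 1) = 2604175563, that is 0x9b3890cb, which still fits in 32 bits without wrapping.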
Going back to the variable definition, the code continues by adding the new v symbol to the symbols list [13], and numsyms is incremented:
     dumpv:
         ...
[13]     if (!vcdsymroot) {
             vcdsymroot = vcdsymcurr = v;
         } else {
             vcdsymcurr->next = v;
             vcdsymcurr = v;
         }
         numsyms++;
         ...
     bail:
         if (vtok != V_END) sync_end(NULL);
         break;
     }
The next token is read. If it matches “$enddefinitions” the switch block at [14] is entered.
[14] case T_ENDDEFINITIONS:
         if (!header_over) {
[15]         header_over = 1; /* do symbol table management here */
[16]         create_sorted_table();
             if ((!sorted) && (!indexed)) {
                 fprintf(stderr, "No symbols in VCD file..nothing to do!\n");
                 exit(1);
             }
             if (linear) lt_set_no_interlace(lt);
         }
         break;
At [15], the variable header_over is set to 1. This is important, as it declares the end of the variable definitions and allows the definition of the next sections.
As the variable definition has completed, the function create_sorted_table() is called to store all VCD symbols in an index for faster access [16].
     static void create_sorted_table(void) {
         struct vcdsymbol *v;
         struct vcdsymbol **pnt;
         unsigned int vcd_distance;
         struct vcdsymbol *root_v;
         int i;
         if (numsyms) {
[17]         vcd_distance = vcd_maxid - vcd_minid + 1;
[18]         if (vcd_distance <= 8 * 1024 * 1024) {
[19]             indexed = (struct vcdsymbol **)calloc_2(vcd_distance, sizeof(struct vcdsymbol *));
                 printf("%d symbols span ID range of %d, using indexing...\n", numsyms, vcd_distance);
                 v = vcdsymroot;
                 while (v) {
                     if (!(root_v = indexed[v->nid - vcd_minid])) {
                         indexed[v->nid - vcd_minid] = v;
                     }
                     alias_vs_normal_symadd(v, root_v);
                     v = v->next;
                 }
[20]         } else {
                 pnt = sorted = (struct vcdsymbol **)calloc_2(numsyms, sizeof(struct vcdsymbol *));
                 v = vcdsymroot;
                 while (v) {
                     *(pnt++) = v;
                     v = v->next;
                 }
                 qsort(sorted, numsyms, sizeof(struct vcdsymbol *), vcdsymcompare);
                 ...
             }
         }
At [17], the difference between vcd_maxid and vcd_minid is calculated. Taking the example above, if two variables with symbols “A” and “AAAAA” are declared, their distance will be very large: 0x9b3890cb - 0x21 + 1. If the distance is at most 8 * 1024 * 1024 (about 8 million) [18], create_sorted_table will create a simple hash table in indexed [19], so that symbols can be retrieved directly after computing the vcdid_hash of the symbol to look up. The size of this table is at most 8 * 1024 * 1024 * sizeof(void *), so 32 MB for 32-bit code and 64 MB for 64-bit code.
If instead the distance is too big [20], a sorted array is created in sorted.
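The choice between the two branches can be sketched in isolation. The helper names uses_indexed_table and max_index_bytes are hypothetical and exist only for this illustration; the threshold and the distance formula are the ones shown at [17] and [18].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Threshold used at [18]. */
#define INDEX_LIMIT (8u * 1024u * 1024u)

/* Hypothetical helper mirroring [17] and [18]: true when the ID range is
 * small enough for the direct-index table, false when the sorted array
 * would be used instead. */
static bool uses_indexed_table(unsigned int minid, unsigned int maxid) {
    assert(minid <= maxid);
    unsigned int distance = maxid - minid + 1u;
    return distance <= INDEX_LIMIT;
}

/* Hypothetical helper: upper bound on the table size in bytes for a given
 * pointer width. */
static size_t max_index_bytes(size_t pointer_size) {
    return (size_t)INDEX_LIMIT * pointer_size;
}
```

For the pair “A” and “AAAAA” the distance is 0x9b3890ab, far above the limit, so the sorted array is chosen. A file with a single symbol has a distance of 1 and always takes the indexed branch.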
At this point, we go back to the loop at [1].
As “$enddefinitions” has been issued and the index has been created, it should not be possible to create a new variable definition. However, there’s nothing preventing it. This is the core of the issue described in this advisory.
If another variable is defined, we end up again in the T_VAR case (reporting the code from above for reference):
[3]  case T_VAR: {
         int vtok;
         struct vcdsymbol *v = NULL;
         var_prevch = 0;
         ...
[4]      vtok = get_vartoken(1);
         ...
[11]     v->nid = vcdid_hash(yytext, yylen);
[12]     if (v->nid < vcd_minid) vcd_minid = v->nid;
         if (v->nid > vcd_maxid) vcd_maxid = v->nid;
Here the code may set new vcd_minid and vcd_maxid values, depending on the hash value for the new variable. If vcd_minid changes, the hash lookups will be affected, as we will see soon.
Let’s assume we go back to the loop at [1] and we encounter a port dump definition. This would hit the T_STRING case:
     case T_STRING:
[21]     if (header_over) {
             /* catchall for events when header over */
[22]         if (yytext[0] == '#') {
                 ...
             } else {
[23]             parse_valuechange();
             }
         }
         break;
If header_over is 1 [21] and the string does not start with “#” [22], the function parse_valuechange() is called [23].
     static void parse_valuechange(void) {
         struct vcdsymbol *v;
         char *vector;
         int vlen;
         switch (yytext[0]) {
             ...
[24]         case 'p':
                 /* extract port dump value.. */
                 vector = malloc_2(yylen_cache = yylen);
                 strcpy(vector, yytext + 1);
                 vlen = yylen - 1;
[25]             get_strtoken(); /* throw away 0_strength_component */
                 get_strtoken(); /* throw away 0_strength_component */
                 get_strtoken(); /* this is the id */
[26]             v = bsearch_vcd(yytext, yylen);
                 if (!v) {
                     fprintf(stderr, "Near line %d, Unknown identifier: '%s'\n", vcdlineno, yytext);
                     free_2(vector);
                 } else {
[27]                 if (vlen < v->size) /* fill in left part */
                     {
                         char extend;
                         int i, fill;
                         extend = '0';
                         fill = v->size - vlen;
                         for (i = 0; i < fill; i++) {
                             v->value[i] = extend;
                         }
                         evcd_strcpy(v->value + fill, vector);
                     }
                     ...
If the token starts with the letter “p”, we enter the block at [24].
Two tokens are extracted and discarded [25]. Finally, the id token is extracted and searched for in the vcdsymbol list [26] using bsearch_vcd.
     static struct vcdsymbol *bsearch_vcd(char *key, int len) {
         struct vcdsymbol **v;
         struct vcdsymbol *t;
         if (indexed) {
[28]         unsigned int hsh = vcdid_hash(key, len);
[29]         if ((hsh >= vcd_minid) && (hsh <= vcd_maxid)) {
[30]             return (indexed[hsh - vcd_minid]);
             }
         }
         v = (struct vcdsymbol **)bsearch(key, sorted, numsyms,
                                          sizeof(struct vcdsymbol *), vcdsymbsearchcompare);
         ...
     }
If the vcd symbols have been stored in the indexed hash table, a direct array access is performed to retrieve the symbol.
At [28] the hash is calculated, and if it’s within vcd_minid and vcd_maxid [29], the nth element from indexed is returned, calculated as hsh - vcd_minid [30].
As vcd_minid may have been modified after $enddefinitions, using a direct access this way may lead to an out-of-bounds read.
For example, assume this input file:
$var wire 2 symid symname $end
$enddefinitions
$var wire 5 A anything $end
p AA
The first line will create a vcdsymbol and set both vcd_minid and vcd_maxid to 0x401a052d (the hash for “symid”).
The indexed hash table is created as vcd_maxid - vcd_minid + 1 is 1, and it is filled with the only variable defined (symname) at position 0. The size of indexed will thus be 4 bytes, a single pointer in 32-bit code (or a higher number depending on the malloc implementation, but this does not matter in this case).
Another variable is defined, this time with an id of “A”. vcd_minid will be set to 0x21.
When the last line p AA is read, bsearch_vcd("AA", 2) will be called. Inside bsearch_vcd, hsh will be set to 0xc3f, which is within vcd_minid (0x21) and vcd_maxid (0x401a052d). Finally, indexed[0xc3f - 0x21] is returned. As indexed only has one element, this will read out-of-bounds in the heap.
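The mismatch between the range check and the real table size can be modeled without touching any heap memory. The struct and helper names below are hypothetical; the sketch only computes the slot that the access at [30] would use and compares it against the number of slots that were allocated when the table was built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the parser state relevant to the lookup. */
struct index_state {
    unsigned int minid; /* current vcd_minid */
    unsigned int maxid; /* current vcd_maxid */
    size_t table_len;   /* slots allocated for indexed[] at $enddefinitions */
};

/* Slot that would be read at [30], i.e. hsh - vcd_minid. */
static size_t lookup_slot(const struct index_state *st, unsigned int hsh) {
    assert(st != NULL);
    return (size_t)(hsh - st->minid);
}

/* True when the range check at [29] passes but the slot lies past the table. */
static bool lookup_is_out_of_bounds(const struct index_state *st, unsigned int hsh) {
    assert(st != NULL);
    bool passes_check = hsh >= st->minid && hsh <= st->maxid;
    return passes_check && lookup_slot(st, hsh) >= st->table_len;
}
```

With the example file, the state after the late “$var” is minid 0x21, maxid 0x401a052d and a table of one slot. Looking up 0xc3f passes the range check and selects slot 0xc1e (3102), well beyond the single allocated element.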
Even though this is a read operation, this can later be used to perform arbitrary writes. By controlling the heap, it is possible at this point to return a vcdsymbol that points to arbitrary contents. With this assumption, let’s see an example of how to turn this into an arbitrary write. When we return from bsearch_vcd we are back in parse_valuechange:
[26] v = bsearch_vcd(yytext, yylen);
     if (!v) {
         fprintf(stderr, "Near line %d, Unknown identifier: '%s'\n", vcdlineno, yytext);
         free_2(vector);
     } else {
[27]     if (vlen < v->size) /* fill in left part */
         {
             char extend;
             int i, fill;
             extend = '0';
[31]         fill = v->size - vlen;
             for (i = 0; i < fill; i++) {
[32]             v->value[i] = extend;
             }
             evcd_strcpy(v->value + fill, vector);
         }
         ...
At [27] vlen is checked to be smaller than v->size, but as v->size is controlled, we can choose to enter this block. fill at [31] is also controlled as we control v->size, which lets us control the number of loop iterations. Finally, as we control the v->value pointer, we can write the ‘0’ character anywhere in memory [32]. This allows, in turn, arbitrary code execution.
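The write primitive comes entirely from two fields of the returned symbol, which a reduced model makes explicit. The struct mini_symbol and the function left_fill are hypothetical names for this sketch; only the size and value fields are modeled, and the loop is the “fill in left part” loop from the listing, here pointed at a harmless local buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced symbol: only the two fields the fill loop uses. */
struct mini_symbol {
    int size;    /* stands in for v->size */
    char *value; /* stands in for v->value */
};

/* Model of the "fill in left part" loop; returns the number of bytes
 * written. Both the destination and the count come from *v. */
static int left_fill(struct mini_symbol *v, int vlen) {
    assert(v != NULL && v->value != NULL);
    int written = 0;
    if (vlen < v->size) {
        int fill = v->size - vlen;
        for (int i = 0; i < fill; i++) {
            v->value[i] = '0';
            written++;
        }
    }
    return written;
}
```

With size 5 and vlen 2 the loop writes three ‘0’ bytes starting at value. If an attacker chooses both fields, as happens when the symbol pointer is read out of bounds, the same loop writes an attacker-chosen number of bytes at an attacker-chosen address.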
As mentioned before, this issue affects both the GUI program and some conversion utilities, which lie in separate source files. For this reason, we are listing each issue separately below.
The GUI’s recoder VCD parsing code (default parser) allows “$var” definitions to happen after “$enddefinitions”. This allows the out-of-bounds read described earlier to be exploited into an out-of-bounds write, leading to arbitrary code execution.
Note that in vcd_recoder.c:1590 there seems to be the right check to defend against this issue, but it’s nullified by the && 0. This if condition can never be true:
    case T_VAR:
        if((GLOBALS->header_over_vcd_recoder_c_3)&&(0))
        {
            fprintf(stderr,"$VAR encountered after $ENDDEFINITIONS near byte %d. VCD is malformed, exiting.\n",
                (int)(GLOBALS->vcdbyteno_vcd_recoder_c_3+(GLOBALS->vst_vcd_recoder_c_3-GLOBALS->vcdbuf_vcd_recoder_c_3)));
            vcd_exit(255);
        }
This issue does not need any special command-line switch to be triggered when starting GTKWave.
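For illustration, the effect of the disabled check with the “&& 0” removed can be sketched as a small decision function. This is an assumption-laden sketch, not the change shipped upstream: parser_state, var_action and on_var_token are hypothetical names, and header_over simply mirrors the flag that is set when “$enddefinitions” is parsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parser state; header_over mirrors the flag set at
 * $enddefinitions. */
struct parser_state {
    bool header_over;
};

enum var_action { VAR_ACCEPT, VAR_REJECT };

/* Sketch of the guard with the "&& 0" removed: refuse a $var definition
 * once the header is over, so vcd_minid and vcd_maxid cannot change after
 * the index has been built. */
static enum var_action on_var_token(const struct parser_state *st) {
    assert(st != NULL);
    return st->header_over ? VAR_REJECT : VAR_ACCEPT;
}
```

In this model a “$var” seen before “$enddefinitions” is accepted and one seen afterwards is rejected, which keeps the bounds used by the lookup consistent with the table that was allocated.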
The GUI’s legacy VCD parsing code allows “$var” definitions to happen after “$enddefinitions”. This allows the out-of-bounds read described earlier to be exploited into an out-of-bounds write, leading to arbitrary code execution.
Note that in vcd.c:1248 there seems to be the right check to defend against this issue, but it’s nullified by the && 0. This if condition can never be true:
    case T_VAR:
        if((GLOBALS->header_over_vcd_c_1)&&(0))
        {
            fprintf(stderr,"$VAR encountered after $ENDDEFINITIONS near byte %d. VCD is malformed, exiting.\n",
                (int)(GLOBALS->vcdbyteno_vcd_c_1+(GLOBALS->vst_vcd_c_1-GLOBALS->vcdbuf_vcd_c_1)));
            vcd_exit(255);
        }
This issue can be triggered by using the -L flag when starting GTKWave.
The GUI’s interactive VCD parsing code allows “$var” definitions to happen after “$enddefinitions”. This allows the out-of-bounds read described earlier to be exploited into an out-of-bounds write, leading to arbitrary code execution.
Note that in vcd_partial.c:1196 there seems to be the right check to defend against this issue, but it’s nullified by the && 0. This if condition can never be true:
    case T_VAR:
        if((GLOBALS->header_over_vcd_partial_c_2)&&(0))
        {
            fprintf(stderr,"$VAR encountered after $ENDDEFINITIONS near byte %d. VCD is malformed, exiting.\n",
                (int)(GLOBALS->vcdbyteno_vcd_partial_c_2+(GLOBALS->vst_vcd_partial_c_2-GLOBALS->vcdbuf_vcd_partial_c_2)));
            exit(0);
        }
This issue can be triggered by using the -I flag when starting GTKWave.
The VCD parsing code in the vcd2vzt conversion utility allows “$var” definitions to happen after “$enddefinitions”. This allows the out-of-bounds read described earlier to be exploited into an out-of-bounds write, leading to arbitrary code execution.
The VCD parsing code in the vcd2lxt2 conversion utility allows “$var” definitions to happen after “$enddefinitions”. This allows the out-of-bounds read described earlier to be exploited into an out-of-bounds write, leading to arbitrary code execution.
The VCD parsing code in the vcd2lxt conversion utility allows “$var” definitions to happen after “$enddefinitions”. This allows the out-of-bounds read described earlier to be exploited into an out-of-bounds write, leading to arbitrary code execution.
Fixed in version 3.3.118, available from https://sourceforge.net/projects/gtkwave/files/gtkwave-3.3.118/
2023-08-01 - Vendor Disclosure
2023-12-31 - Vendor Patch Release
2024-01-08 - Public Release
Discovered by Claudio Bozzato of Cisco Talos.