Talos Vulnerability Report

TALOS-2021-1405

Gerbv RS-274X aperture macro outline primitive integer overflow vulnerability

December 6, 2021
CVE Number

CVE-2021-40394

Summary

An integer overflow vulnerability exists in the RS-274X aperture macro outline primitive functionality of Gerbv 2.7.0 and dev (commit b5f1eacd) and the forked version of Gerbv (commit 71493260). A specially-crafted gerber file can lead to code execution. An attacker can provide a malicious file to trigger this vulnerability.

Tested Versions

Gerbv 2.7.0
Gerbv dev (commit b5f1eacd)
Gerbv forked dev (commit 71493260)

Product URLs

https://sourceforge.net/projects/gerbv/

CVSSv3 Score

10.0 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H

CWE

CWE-190 - Integer Overflow or Wraparound

Details

Gerbv is open-source software for viewing RS-274X Gerber files, Excellon drill files and pick-n-place files. These file formats are used in industry to describe the layers of a printed circuit board and are a core part of the manufacturing process.

Some PCB (printed circuit board) manufacturers use software like Gerbv in their web interfaces as a tool to convert Gerber (or other supported) files into images. Users can upload Gerber files to the manufacturer’s website, where they are converted to an image displayed in the browser, so that users can verify that what they uploaded matches their expectations. Gerbv can do such conversions using the -x switch (export). For this reason, we consider this software reachable via network without user interaction or privilege requirements.

Gerbv uses the function gerbv_open_image to open files. In this advisory we’re interested in the RS-274X file-type.

int
gerbv_open_image(gerbv_project_t *gerbvProject, char *filename, int idx, int reload,
                gerbv_HID_Attribute *fattr, int n_fattr, gboolean forceLoadFile)
{
    ...        
    dprintf("In open_image, about to try opening filename = %s\n", filename);
    
    fd = gerb_fopen(filename);
    if (fd == NULL) {
        GERB_COMPILE_ERROR(_("Trying to open \"%s\": %s"),
                        filename, strerror(errno));
        return -1;
    }
    ...
    if (gerber_is_rs274x_p(fd, &foundBinary)) {                                 // [1]
        dprintf("Found RS-274X file\n");
        if (!foundBinary || forceLoadFile) {
                /* figure out the directory path in case parse_gerb needs to
                 * load any include files */
                gchar *currentLoadDirectory = g_path_get_dirname (filename);
                parsed_image = parse_gerb(fd, currentLoadDirectory);            // [2]
                g_free (currentLoadDirectory);
        }
    }
    ...

A file is considered to be of type “RS-274X” if the function gerber_is_rs274x_p [1] returns true. When true, parse_gerb is called [2] to parse the input file. Let’s first look at the requirements that an input file must satisfy to be recognized as an RS-274X file:

gboolean
gerber_is_rs274x_p(gerb_file_t *fd, gboolean *returnFoundBinary) 
{
    ...
    while (fgets(buf, MAXL, fd->fd) != NULL) {
        dprintf ("buf = \"%s\"\n", buf);
        len = strlen(buf);
    
        /* First look through the file for indications of its type by
         * checking that file is not binary (non-printing chars and white 
         * spaces)
         */
        for (i = 0; i < len; i++) {                                             // [3]
            if (!isprint((int) buf[i]) && (buf[i] != '\r') && 
                (buf[i] != '\n') && (buf[i] != '\t')) {
                found_binary = TRUE;
                dprintf ("found_binary (%d)\n", buf[i]);
            }
        }
        if (g_strstr_len(buf, len, "%ADD")) {
            found_ADD = TRUE;
            dprintf ("found_ADD\n");
        }
        if (g_strstr_len(buf, len, "D00") || g_strstr_len(buf, len, "D0")) {
            found_D0 = TRUE;
            dprintf ("found_D0\n");
        }
        if (g_strstr_len(buf, len, "D02") || g_strstr_len(buf, len, "D2")) {
            found_D2 = TRUE;
            dprintf ("found_D2\n");
        }
        if (g_strstr_len(buf, len, "M00") || g_strstr_len(buf, len, "M0")) {
            found_M0 = TRUE;
            dprintf ("found_M0\n");
        }
        if (g_strstr_len(buf, len, "M02") || g_strstr_len(buf, len, "M2")) {
            found_M2 = TRUE;
            dprintf ("found_M2\n");
        }
        if (g_strstr_len(buf, len, "*")) {
            found_star = TRUE;
            dprintf ("found_star\n");
        }
        /* look for X<number> or Y<number> */
        if ((letter = g_strstr_len(buf, len, "X")) != NULL) {
            if (isdigit((int) letter[1])) { /* grab char after X */
                found_X = TRUE;
                dprintf ("found_X\n");
            }
        }
        if ((letter = g_strstr_len(buf, len, "Y")) != NULL) {
            if (isdigit((int) letter[1])) { /* grab char after Y */
                found_Y = TRUE;
                dprintf ("found_Y\n");
            }
        }
    }
    ...
    /* Now form logical expression determining if the file is RS-274X */
    if ((found_D0 || found_D2 || found_M0 || found_M2) &&                     // [4]
        found_ADD && found_star && (found_X || found_Y)) 
        return TRUE;
    
    return FALSE;

} /* gerber_is_rs274x */

For an input to be considered an RS-274X file, it must first of all contain only printable characters (plus tab, carriage return and line feed) [3]. The other requirements can be gathered from the conditional expression at [4]. An example of a minimal RS-274X file is the following:

%FSLAX26Y26*%
%MOMM*%
%ADD100C,1.5*%
D100*
X0Y0D03*
M02*

Though not important for the purposes of the vulnerability itself, note that the checks use g_strstr_len, so all those fields can be found anywhere in the file. For example, this file is also recognized as an RS-274X file, even though it will fail later checks in the execution flow:

%ADD0X0*
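The detection logic above can be condensed into a small standalone sketch. This is only an illustration: looks_like_rs274x and letter_then_digit are hypothetical names, the text is scanned as one buffer instead of line by line with fgets, and the binary-character check at [3] is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the first occurrence of `letter` in `text`
 * is immediately followed by a digit (the original checks one hit per line). */
static bool letter_then_digit(const char *text, char letter)
{
    const char *p = strchr(text, letter);
    return p != NULL && isdigit((unsigned char)p[1]);
}

/* Simplified version of the expression at [4]. "D0" also matches "D00",
 * "M0" also matches "M00", and so on, so only the short forms are tested. */
bool looks_like_rs274x(const char *text)
{
    assert(text != NULL);
    bool has_code = strstr(text, "D0") || strstr(text, "D2") ||
                    strstr(text, "M0") || strstr(text, "M2");
    bool has_add  = strstr(text, "%ADD") != NULL;
    bool has_star = strchr(text, '*') != NULL;
    bool has_xy   = letter_then_digit(text, 'X') ||
                    letter_then_digit(text, 'Y');
    return has_code && has_add && has_star && has_xy;
}
```

With this sketch, the one-line input %ADD0X0* satisfies every condition at once: it contains %ADD, D0, a star, and an X followed by a digit.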

After an RS-274X file has been recognized, parse_gerb is called, which in turn calls gerber_parse_file_segment:

gboolean
gerber_parse_file_segment (gint levelOfRecursion, gerbv_image_t *image, 
                           gerb_state_t *state,        gerbv_net_t *curr_net, 
                           gerbv_stats_t *stats, gerb_file_t *fd, 
                           gchar *directoryPath)
{
    ...
    while ((read = gerb_fgetc(fd)) != EOF) {
        ...
        case '%':
            dprintf("... Found %% code at line %ld\n", line_num);
            while (1) {
                    parse_rs274x(levelOfRecursion, fd, image, state, curr_net,
                                stats, directoryPath, &line_num);

If our file starts with “%”, we end up calling parse_rs274x:

static void 
parse_rs274x(gint levelOfRecursion, gerb_file_t *fd, gerbv_image_t *image, 
             gerb_state_t *state, gerbv_net_t *curr_net, gerbv_stats_t *stats, 
             gchar *directoryPath, long int *line_num_p)
{
    ...
    switch (A2I(op[0], op[1])){
    ...
    case A2I('A','D'): /* Aperture Description */
        a = (gerbv_aperture_t *) g_new0 (gerbv_aperture_t,1);

        ano = parse_aperture_definition(fd, a, image, scale, line_num_p); // [6]
        ...
        break;
    case A2I('A','M'): /* Aperture Macro */
        tmp_amacro = image->amacro;
        image->amacro = parse_aperture_macro(fd);                         // [5]
        if (image->amacro) {
            image->amacro->next = tmp_amacro;
        ...

For this advisory, we’re interested in the AM and AD commands. For details on the Gerber format see the specification from Ucamco.

In summary, AM defines a “macro aperture template,” which is, in other terms, a parametrized shape. It is a flexible way to define arbitrary shapes by building on top of simpler shapes (primitives). It allows arithmetic operations and variable definition. After a template has been defined, the AD command is used to instantiate the template and optionally pass some parameters to customize the shape.

From the specification, this is the syntax of the AM command:

<AM command>          = AM<Aperture macro name>*<Macro content>
<Macro content>       = {{<Variable definition>*}{<Primitive>*}}
<Variable definition> = $K=<Arithmetic expression>
<Primitive>           = <Primitive code>,<Modifier>{,<Modifier>}|<Comment>
<Modifier>            = $M|< Arithmetic expression>
<Comment>             = 0 <Text>

While this is the syntax for the AD command:

<AD command> = ADD<D-code number><Template>[,<Modifiers set>]*
<Modifiers set> = <Modifier>{X<Modifier>}

For this advisory, we’re interested in the “Outline” primitive (code 4). From the specification:

An outline primitive is an area defined by its outline or contour. The outline is a polygon, consisting of linear segments only, defined by its start vertex and n subsequent vertices.

The outline primitive should contain the following fields:

+-----------------+----------------------------------------------------------------------------------------+
| Modifier number | Description                                                                            |
+-----------------+----------------------------------------------------------------------------------------+
| 1               | Exposure off/on (0/1)                                                                  |
+-----------------+----------------------------------------------------------------------------------------+
| 2               | The number of vertices of the outline = the number of coordinate pairs minus one.      |
|                 | An integer ≥3.                                                                         |
+-----------------+----------------------------------------------------------------------------------------+
| 3, 4            | Start point X and Y coordinates. Decimals.                                             |
+-----------------+----------------------------------------------------------------------------------------+
| 5, 6            | First subsequent X and Y coordinates. Decimals.                                        |
+-----------------+----------------------------------------------------------------------------------------+
| ...             | Further subsequent X and Y coordinates. Decimals.                                      |
|                 | The X and Y coordinates are not modal: both X and Y must be specified for all points.  |
+-----------------+----------------------------------------------------------------------------------------+
| 3+2n, 4+2n      | Last subsequent X and Y coordinates. Decimals. Must be equal to the start coordinates. |
+-----------------+----------------------------------------------------------------------------------------+
| 5+2n            | Rotation angle, in degrees counterclockwise, a decimal.                                |
|                 | The primitive is rotated around the origin of the macro definition,                    |
|                 | i.e. the (0, 0) point of macro                                                         |
+-----------------+----------------------------------------------------------------------------------------+

The specification also states that “The maximum number of vertices is 5000,” a value that is controlled by modifier number 2. So, depending on the number of vertices, the length of this primitive will change.
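To make the relationship between the vertex count and the primitive length concrete, the table can be read as a simple formula: two leading modifiers (exposure and vertex count), n + 1 coordinate pairs, and one rotation angle. The helper below is a hypothetical illustration of that count and is not part of Gerbv.

```c
#include <assert.h>

/* Number of modifiers in an outline primitive with `vertices` = n:
 * exposure + vertex count + (n + 1) coordinate pairs + rotation = 5 + 2n. */
int outline_modifier_count(int vertices)
{
    assert(vertices >= 0 && vertices <= 5000);  /* limit from the specification */
    return 2 + (vertices + 1) * 2 + 1;
}
```

For n = 3 this gives 11 modifiers, and for the maximum of 5000 vertices it gives 10005.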

In the parse_rs274x function, when an AM command is found, the function parse_aperture_macro is called [5]. Let’s see how this outline primitive is handled there:

gerbv_amacro_t *
parse_aperture_macro(gerb_file_t *fd)
{
    gerbv_amacro_t *amacro;
    gerbv_instruction_t *ip = NULL;
    int primitive = 0, c, found_primitive = 0;
    ...
    int equate = 0;

    amacro = new_amacro();

    ...        
    /*
     * Since I'm lazy I have a dummy head. Therefore the first 
     * instruction in all programs will be NOP.
     */
    amacro->program = new_instruction();
    ip = amacro->program;
    
    while(continueLoop) {
        
        c = gerb_fgetc(fd);
        switch (c) {
        ...
        case '*':
            ...
            /*
             * Check is due to some gerber files has spurious empty lines.
             * (EagleCad of course).
             */
            if (found_primitive) {
                ip->next = new_instruction(); /* XXX Check return value */
                ip = ip->next;
                if (equate) {
                    ip->opcode = GERBV_OPCODE_PPOP;
                    ip->data.ival = equate;
                } else {
                    ip->opcode = GERBV_OPCODE_PRIM;                         // [10]
                    ip->data.ival = primitive;
                }
                equate = 0;
                primitive = 0;
                found_primitive = 0;
            }
            break;
        ...
        case ',':
            if (!found_primitive) {                                         // [8]
                found_primitive = 1;
                break;
            }
            ...
            break;
        ...
        case '1':
        case '2':
        case '3':
        case '4':
        case '5':
        case '6':
        case '7':
        case '8':
        case '9':
        case '.':
            /* 
             * First number in an aperture macro describes the primitive
             * as a numerical value
             */
            if (!found_primitive) {                                         // [7]
                primitive = (primitive * 10) + (c - '0');
                break;
            }
            (void)gerb_ungetc(fd);
            ip->next = new_instruction(); /* XXX Check return value */      // [9]
            ip = ip->next;
            ip->opcode = GERBV_OPCODE_PUSH;
            amacro->nuf_push++;
            ip->data.fval = gerb_fgetdouble(fd);
            if (neg) 
                ip->data.fval = -ip->data.fval;
            neg = 0;
            comma = 0;
            break;
        case '%':
            gerb_ungetc(fd);  /* Must return with % first in string
                                 since the main parser needs it */
            return amacro;                                                  // [11]
        default :
            /* Whitespace */
            break;
        }
        if (c == EOF) {
            continueLoop = 0;
        }
    }
    free (amacro);
    return NULL;
}

As we can see, this function implements a set of opcodes for a virtual machine. The opcodes are used to perform arithmetic operations, to handle variable definitions and references via a virtual stack, and to handle primitives.
Let’s take an outline primitive definition as an example:

%AMX0*4,0,3,1,1,1*%

As discussed before, %AM will land us in the parse_aperture_macro function, and X0 is the name of the macro. The macro parsing starts with 4 [7]: this is the primitive number, which is read as a decimal number until a , is found [8]. After that, each field separated by , is read as a double and added to the stack via PUSH [9]. These form the arguments to the primitive. When * is found, the primitive instruction is added [10], and when % is found the macro is returned [11].

For reference, these are the definitions of the macro and program instruction structures:

struct amacro {
    gchar *name;
    gerbv_instruction_t *program;
    unsigned int nuf_push;
    struct amacro *next;
};

struct instruction {
    gerbv_opcodes_t opcode;
    union {
        int ival;
        float fval;
    } data;
    struct instruction *next;
};
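The translation from macro text to instructions can be sketched as follows. This is a deliberately reduced model, not Gerbv’s parser: compile_primitive, instr_t and the OP_* names are hypothetical, the instructions live in a flat array instead of a linked list, operands are kept as double, and only a single primitive with numeric fields is handled (no variables, arithmetic or comments).

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { OP_NOP, OP_PUSH, OP_PRIM } opcode_t;  /* hypothetical subset */

typedef struct {
    opcode_t opcode;
    int ival;      /* primitive code, used by OP_PRIM */
    double fval;   /* operand, used by OP_PUSH */
} instr_t;

/* Compiles one primitive such as "4,0,3,1,1,1*" into out[].
 * Returns the number of instructions written, or 0 if the text is
 * malformed or does not fit into `cap` entries. */
size_t compile_primitive(const char *body, instr_t *out, size_t cap)
{
    assert(body != NULL && out != NULL);
    size_t n = 0;
    char *end;
    long prim = strtol(body, &end, 10);   /* digits before the first ',' */
    if (end == body)
        return 0;
    while (*end == ',') {                 /* every further field is a PUSH */
        const char *field = end + 1;
        double value = strtod(field, &end);
        if (end == field || n + 1 >= cap) /* keep one slot for the PRIM */
            return 0;
        out[n++] = (instr_t){ OP_PUSH, 0, value };
    }
    if (*end != '*' || n >= cap)
        return 0;
    out[n++] = (instr_t){ OP_PRIM, (int)prim, 0.0 };  /* emitted at '*' */
    return n;
}
```

For the body 4,0,3,1,1,1* this produces five PUSH instructions followed by one PRIM instruction carrying the primitive code 4, which is the same ordering the real parser generates at [9] and [10].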

Back to parse_rs274x: When an AD command is found, the function parse_aperture_definition is called [6], which in turn calls simplify_aperture_macro when the AD command is using a template.

static int
simplify_aperture_macro(gerbv_aperture_t *aperture, gdouble scale)
{
    ...
    gerbv_instruction_t *ip;
    int handled = 1, nuf_parameters = 0, i, j, clearOperatorUsed = FALSE;   // [18]
    double *lp; /* Local copy of parameters */
    double tmp[2] = {0.0, 0.0};
    ...
    /* Allocate stack for VM */
    s = new_stack(aperture->amacro->nuf_push + extra_stack_size);           // [12]
    if (s == NULL) 
        GERB_FATAL_ERROR("malloc stack failed in %s()", __FUNCTION__);
    ...
    for(ip = aperture->amacro->program; ip != NULL; ip = ip->next) {
        switch(ip->opcode) {
        case GERBV_OPCODE_NOP:
            break;
        case GERBV_OPCODE_PUSH :
            push(s, ip->data.fval);                                         // [13]
            break;
        ...
        case GERBV_OPCODE_PRIM :
            /* 
             * This handles the exposure thing in the aperture macro
             * The exposure is always the first element on stack independent
             * of aperture macro.
             */
            switch(ip->data.ival) {
            ...
            case 4 :                                                        // [14]
                dprintf("  Aperture macro outline [4] (");
                type = GERBV_APTYPE_MACRO_OUTLINE;
                /*
                 * Number of parameters are:
                 * - number of points defined in entry 1 of the stack + 
                 *   start point. Times two since it is both X and Y.
                 * - Then three more; exposure,  nuf points and rotation.
                 */
                nuf_parameters = ((int)s->stack[1] + 1) * 2 + 3;            // [15]
                break;
            ...
            }

            if (type != GERBV_APTYPE_NONE) { 
                if (nuf_parameters > APERTURE_PARAMETERS_MAX) {             // [16]
                        GERB_COMPILE_ERROR(_("Number of parameters to aperture macro (%d) "
                                                        "are more than gerbv is able to store (%d)"),
                                                        nuf_parameters, APERTURE_PARAMETERS_MAX);
                        nuf_parameters = APERTURE_PARAMETERS_MAX;
                }

                /*
                 * Create struct for simplified aperture macro and
                 * start filling in the blanks.
                 */
                sam = g_new (gerbv_simplified_amacro_t, 1);
                sam->type = type;
                sam->next = NULL;
                memset(sam->parameter, 0, 
                       sizeof(double) * APERTURE_PARAMETERS_MAX);
                memcpy(sam->parameter, s->stack,                            // [17]
                       sizeof(double) *  nuf_parameters);

For this advisory, all the AD commands have to do is utilize the macro that we just created, without special parameters. Let’s consider the following aperture definition:

%ADD09X0*

For AD to use the template, it has to execute the template in the virtual machine. To this end, a virtual stack is allocated at [12] to handle parameters.

As previously discussed, our program contains a series of GERBV_OPCODE_PUSH instructions (pushing the numbers 0,3,1,1,1 to the stack, at [13]) and a GERBV_OPCODE_PRIM instruction for primitive 4 (outline), executed at [14].
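The execution of such a program can be modeled with a few lines of C. The sketch below is a simplification under our own assumptions: run_until_primitive, instr_t and the OP_* names are hypothetical, the program is a flat array, and only PUSH and PRIM are implemented. It shows why the vertex count ends up in the second stack slot by the time the primitive is reached.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_PUSH, OP_PRIM } opcode_t;   /* hypothetical subset */
typedef struct { opcode_t opcode; int ival; double fval; } instr_t;

/* Executes PUSH instructions until the first PRIM is reached.
 * Returns the primitive code, or -1 if the stack would overflow
 * or the program contains no PRIM. *depth receives the stack depth. */
int run_until_primitive(const instr_t *prog, size_t len,
                        double *stack, size_t cap, size_t *depth)
{
    assert(prog != NULL && stack != NULL && depth != NULL);
    *depth = 0;
    for (size_t i = 0; i < len; i++) {
        if (prog[i].opcode == OP_PRIM)
            return prog[i].ival;          /* operands are now on the stack */
        if (*depth == cap)
            return -1;                    /* refuse to write past the stack */
        stack[(*depth)++] = prog[i].fval;
    }
    return -1;
}
```

Running the program for 4,0,3,1,1,1 leaves 0, 3, 1, 1, 1 on the stack, so stack[1] holds the attacker-supplied vertex count when primitive 4 is dispatched.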

At [15] the number of vertices is taken from the second field in the stack (as per specification) and the number of parameters for the primitive is calculated. At [16] the code makes sure that nuf_parameters is not bigger than APERTURE_PARAMETERS_MAX (102), otherwise nuf_parameters gets limited to APERTURE_PARAMETERS_MAX. Finally at [17] the parameters are copied from the stack into the newly allocated sam structure.

The problem in this logic is how integers are treated, since the calculations at [15] and [17] can be forced to overflow.
The variable nuf_parameters is a signed int. As an example (valid for both 32 and 64 bit systems), let’s assume that the number of vertices taken from the file is 1073741977. The calculation at [15] will set nuf_parameters to (1073741977 + 1) * 2 + 3 = 0x80000137, which is negative when stored in a signed int. A negative value passes the check at [16], leaving nuf_parameters set to 0x80000137. At [17], sizeof(double) is 8, so on a build with a 32-bit size_t, 8 * 0x80000137 wraps around and results in 0x9b8.
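The arithmetic in this example can be reproduced in isolation. The sketch below uses fixed-width unsigned types so that the wraparound is well defined in C; the function names are hypothetical, and the second function models a 32-bit size_t only.

```c
#include <assert.h>
#include <stdint.h>

/* Models the calculation at [15] with explicit 32-bit wraparound.
 * The final conversion back to a signed type assumes two's complement,
 * which is what GCC and Clang implement. */
int32_t nuf_parameters_for(int32_t vertices)
{
    assert(vertices >= 0);
    uint32_t wrapped = ((uint32_t)vertices + 1u) * 2u + 3u;
    return (int32_t)wrapped;
}

/* Models the size argument at [17] when size_t is 32 bits wide:
 * sizeof(double) * nuf_parameters, truncated to 32 bits. */
uint32_t memcpy_size_32bit(int32_t nuf_parameters)
{
    return 8u * (uint32_t)nuf_parameters;
}
```

For 1073741977 vertices, the first function returns a negative value with bit pattern 0x80000137, and the second turns that value into a copy size of 0x9b8 bytes.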

The type of the sam structure is:

struct gerbv_simplified_amacro {
    gerbv_aperture_type_t type;
    double parameter[102];
    struct gerbv_simplified_amacro *next;
} *

Since sam->parameter has a size of 0x330 bytes and the size argument to memcpy is 0x9b8, memcpy writes past the end of the heap-allocated structure pointed to by sam. Note that the contents of s->stack are controlled by the attacker, since the stack holds the macro parameters (taken from the file and stored in IEEE 754 encoding). This can lead to code execution.
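One possible way to avoid this class of problem is to validate the vertex count before any arithmetic is performed on it. The sketch below is our own illustration and not the vendor’s patch: checked_outline_parameters is a hypothetical name, and the value 102 for APERTURE_PARAMETERS_MAX is taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define APERTURE_PARAMETERS_MAX 102

/* Validates the raw stack value before it is used in the calculation
 * from [15]. On success, stores the parameter count and returns true. */
bool checked_outline_parameters(double raw_vertices, int *out_count)
{
    assert(out_count != NULL);
    /* Largest n for which (n + 1) * 2 + 3 <= APERTURE_PARAMETERS_MAX. */
    const int max_vertices = (APERTURE_PARAMETERS_MAX - 3) / 2 - 1;  /* 48 */

    /* Written so that NaN fails both comparisons and is rejected. */
    if (!(raw_vertices >= 0.0 && raw_vertices <= (double)max_vertices))
        return false;

    *out_count = ((int)raw_vertices + 1) * 2 + 3;   /* cannot overflow here */
    return true;
}
```

Because the range check happens on the double before the cast, the value 1073741977 from the example is rejected outright instead of producing a negative count that slips past an upper-bound comparison.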

Crash Information

# gerbv -x png -o out aperture_macro_parameters_intoverflow.poc
=================================================================
==9184==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf23037b8 at pc 0xf798e90e bp 0xffa1f368 sp 0xffa1ef38
WRITE of size 2088 at 0xf23037b8 thread T0
    #0 0xf798e90d  (/usr/lib/i386-linux-gnu/libasan.so.4+0x7790d)
    #1 0x566a224d in simplify_aperture_macro src/gerber.c:2051
    #2 0x566a4017 in parse_aperture_definition src/gerber.c:2272
    #3 0x5669eaaf in parse_rs274x src/gerber.c:1637
    #4 0x56691fd1 in gerber_parse_file_segment src/gerber.c:243
    #5 0x56697b57 in parse_gerb src/gerber.c:768
    #6 0x566adb73 in gerbv_open_image src/gerbv.c:526
    #7 0x566ab520 in gerbv_open_layer_from_filename_with_color src/gerbv.c:249
    #8 0x5661724b in main src/main.c:929
    #9 0xf6b82f20 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x18f20)
    #10 0x565d51b0  (gerbv+0x161b0)

0xf23037b8 is located 0 bytes to the right of 824-byte region [0xf2303480,0xf23037b8)
allocated by thread T0 here:
    #0 0xf79fcf54 in malloc (/usr/lib/i386-linux-gnu/libasan.so.4+0xe5f54)
    #1 0xf6f05568 in g_malloc (/usr/lib/i386-linux-gnu/libglib-2.0.so.0+0x4e568)
    #2 0x566a4017 in parse_aperture_definition src/gerber.c:2272
    #3 0x5669eaaf in parse_rs274x src/gerber.c:1637
    #4 0x56691fd1 in gerber_parse_file_segment src/gerber.c:243
    #5 0x56697b57 in parse_gerb src/gerber.c:768
    #6 0x566adb73 in gerbv_open_image src/gerbv.c:526
    #7 0x566ab520 in gerbv_open_layer_from_filename_with_color src/gerbv.c:249
    #8 0x5661724b in main src/main.c:929
    #9 0xf6b82f20 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x18f20)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/usr/lib/i386-linux-gnu/libasan.so.4+0x7790d)
Shadow bytes around the buggy address:
  0x3e4606a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e4606b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e4606c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e4606d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e4606e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x3e4606f0: 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa fa
  0x3e460700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e460710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e460720: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e460730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3e460740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==9184==ABORTING

Timeline

2021-11-03 - Vendor Disclosure
2021-11-24 - Vendor Patched
2021-12-06 - Public Release

Credit

Discovered by Claudio Bozzato of Cisco Talos.