C/C++ data flow analysis rules (Cause of control flow failure in C)

  • General discussion

  • The rules are categorized into errors, mistakes, warnings, security, and portability so as to help you decide which to enable.

    You should consider portability checks if you are concerned about porting your code from machines of one width (like 32-bit machines) to machines of another width (like 64-bit machines).

    In contrast, an error is something we expect everybody to want to see, while warnings are something very few of you will turn on. Mistakes are somewhere between errors and warnings. Depending on your coding style, they may or may not be symptoms of something serious.

    Restriction: On Windows®, if you are analyzing files that contain multibyte characters or have multibyte characters in their filename or file path, C/C++ Data Flow Analysis will produce no results.

    Errors

    Do not use unassigned variables

    The value of <var>xyz</var> is used, but it has not been assigned or initialized. In this case, <var>xyz</var> evaluates to an indeterminate bit pattern, which is rarely desired.

    If the results indicate that the variable has not been assigned, there is not a single assignment to <var>xyz</var> anywhere in the function. If the results indicate that the variable has not been initialized, there exists a feasible path to the usage of <var>xyz</var> that bypasses all assignments (even though there may be some assignments to <var>xyz</var>). If so, the path is displayed.

    If the results indicate this error, you should not assume that you simply need to initialize the variable. The majority of uninitialized variables are not caused by lack of initialization at the declaration, but rather by some mistake along the identified path.

    Example:

    int ival;
    void func() {
        struct foo { int x; int y; } v1, v2;
        v1.x = 42;      /* only v1.x is assigned; v1.y never is */
        v2 = v1;
        func2(v1);
        ival = v1.y;    /* uses v1.y, which has not been assigned */
    }

    It can happen that only a piece of a variable is initialized, and the remainder is not. For example, look at this C example:

    int func() {
        union { int i; short s[2]; } x;
        x.s[0] = 0;
        return x.i;
    }

    In this case, the message will be:

    uninitialized 'x.i[16:31]'

    The notation <samp>[16:31]</samp> refers to bit positions inside <samp>x.i</samp>.
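
    The path-dependent case described above can be sketched with a hypothetical function <samp>sign_of</samp> (not part of the original text). Initializing <var>sign</var> at its declaration would silence the complaint, but the underlying mistake is the path that skips every assignment, so the sketch repairs that path instead.

```c
/* Hypothetical example: returns -1, 0, or 1 depending on the sign of n. */
int sign_of(int n)
{
    int sign;               /* deliberately not initialized here */

    if (n > 0) {
        sign = 1;
    } else if (n < 0) {
        sign = -1;
    } else {
        /* Without this branch, the path for n == 0 would reach the
           return statement without any assignment to sign. */
        sign = 0;
    }
    return sign;
}
```

    Writing <samp>int sign = 0;</samp> at the declaration would happen to give the same result here, but in general it hides the path that was forgotten rather than fixing it.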

    Do not perform invalid operations involving NULL pointers

    Example 1:

    a = &(p->f); b = &(q[2]); c = r + 2;

    The C standard considers these invalid if <var>p</var>, <var>q</var>, or <var>r</var> is a NULL pointer.

    This complaint is suppressed when you explicitly write

    a = &(NULL->f);

    because that is a common way of calculating the offset of the field <var>f</var> (typically via the <samp>offsetof</samp> macro).
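
    As a minimal sketch of the offset calculation mentioned above, the standard <samp>offsetof</samp> macro from <samp>&lt;stddef.h&gt;</samp> can be used directly. The type <samp>struct record</samp> and the function <samp>offset_of_f</samp> are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
struct record {
    int id;
    int f;
};

/* Returns the byte offset of field f inside struct record.
   No pointer is formed from NULL in the source code. */
size_t offset_of_f(void)
{
    return offsetof(struct record, f);
}
```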

    Example 2:

    i = p->f; j = *p;

    If <var>p</var> is <samp>NULL</samp>, the assignment to <var>j</var> involves dereferencing a <samp>NULL</samp> pointer, which is reported as "Do not dereference a NULL pointer". The rule "Do not perform invalid operations involving NULL pointers" is closely related to dereferencing a <samp>NULL</samp> pointer, but it is not the same thing. Consider the assignment to <var>i</var> above, and suppose that the field <var>f</var> is at offset 4. If <var>p</var> is <samp>NULL</samp>, then <samp>p->f</samp> does not dereference <samp>NULL</samp> (in other words, address 0); it dereferences address 4.

    Do not deallocate previously deallocated memory

    This complaint is followed by a path through the program. This path ends in a deallocation, such as <samp>free(<var>xyz</var>)</samp>. Somewhere along that path, there is another deallocation of <var>xyz</var>.

    Deallocating a memory location more than once can lead to memory corruption and random crashes much later in the program.

    Common causes include using two pointers to the same memory instead of making a copy of an object, or falling through an unexpected path of code where a second deallocation is occurring.

    Take a look at where the memory was allocated, and see if there is any reason that there is more than one place that the memory can be deallocated. There should be a one-to-one correspondence between allocations and deallocations for each pointer.
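
    One way to keep the one-to-one correspondence is to route every exit path through a single deallocation. The sketch below assumes a hypothetical function <samp>check_text</samp> that copies a string and rejects empty input; the names and the error convention are illustrative only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: returns 0 if text is non-empty, -1 otherwise.
   The buffer is allocated once and freed once, whichever path is taken. */
int check_text(const char *text)
{
    int status = -1;
    char *copy = malloc(strlen(text) + 1);

    if (copy == NULL) {
        goto done;              /* allocation failed */
    }
    strcpy(copy, text);
    if (copy[0] == '\0') {
        goto done;              /* error path: no free() here */
    }
    status = 0;

done:
    free(copy);                 /* the only deallocation; free(NULL) is a no-op */
    return status;
}
```

    Because no error path frees <var>copy</var> itself, falling through to the shared cleanup cannot deallocate the buffer a second time.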

    Do not access deallocated memory

    This complaint is followed by a path through the program. This path ends in an access to a memory location <var>xyz</var> which could be an assignment or use of the value of <var>xyz</var>. However, somewhere along the path, the location <var>xyz</var> was previously deallocated. Once memory is deallocated, its value is undefined, and should not be used.

    This is commonly caused by having more than one pointer to a memory location and deallocating the memory through one of them. All of the other pointers will still contain the correct address, but the value at that address cannot be determined and should never be used.

    To investigate, ascertain where the memory is being used and why it is possible to get to that point after the memory has been deallocated.
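
    A frequent instance of this complaint is a loop that frees a list node and then reads the node's <samp>next</samp> field to advance. The sketch below, with a hypothetical <samp>struct node</samp> and helper functions, reads the field before the deallocation instead.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked list node. */
struct node {
    int value;
    struct node *next;
};

/* Prepends a node; returns the old head unchanged if allocation fails. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return head;
    }
    n->value = value;
    n->next = head;
    return n;
}

/* Frees every node and returns how many were freed. */
size_t free_list(struct node *head)
{
    size_t count = 0;
    while (head != NULL) {
        struct node *next = head->next;  /* read before the node is freed */
        free(head);
        head = next;                     /* never touches freed memory */
        count++;
    }
    return count;
}
```

    Writing the loop as <samp>for (p = head; p != NULL; p = p->next) free(p);</samp> would access <samp>p->next</samp> after <var>p</var> was deallocated, which is the kind of path this rule describes.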

    Do not dereference a NULL pointer

    This complaint is followed by a path through the program. This path ends in the dereferencing of a pointer, for example <samp>*xyz</samp>. When dereferenced, <var>xyz</var> will be <samp>NULL</samp>, which is a problem because dereferencing a <samp>NULL</samp> pointer is likely to result in unpredictable behavior. There are two ways the path might cause <var>xyz</var> to be considered <samp>NULL</samp>:

    • by an assignment, for example
      xyz = NULL;
    • by a test, for example
      if ( xyz )

    Any path that follows the <samp>else</samp> branch of such a test implies that <var>xyz</var> is <samp>NULL</samp>.

    Common causes include forgetting to check if a pointer is <samp>NULL</samp> before using it, or moving through a path of code where the pointer is expected to be valid, but is <samp>NULL</samp> instead.
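
    As a small illustration of checking before dereferencing, the sketch below uses the standard <samp>strchr</samp> function, which returns <samp>NULL</samp> when the character is not found. The function <samp>value_length</samp> and its "key:value" input format are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: returns the length of the text after the first ':'
   in line, or 0 if line contains no ':'. line itself must not be NULL. */
size_t value_length(const char *line)
{
    const char *colon = strchr(line, ':');

    if (colon == NULL) {
        return 0;               /* not found: do not dereference colon */
    }
    return strlen(colon + 1);   /* colon is known to be non-NULL here */
}
```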

    Do not invoke a method with a NULL class pointer

    This complaint is followed by a path through the program. This path ends in an expression of the form <samp>xyz->foo()</samp>. When dereferenced, <var>xyz</var> will be <samp>NULL</samp>, which is a problem because calling a method via a <samp>NULL</samp> class pointer can have unpredictable results, and will not work in many cases. There are two ways the path might cause <var>xyz</var> to be considered <samp>NULL</samp>:

    • by an assignment, for example
      xyz = NULL;
    • by a test, for example
      if ( xyz )

    Any path that follows the <samp>else</samp> branch of such a test implies that <var>xyz</var> is <samp>NULL</samp>.

    Common causes include forgetting to check if a pointer is <samp>NULL</samp> before using it, or moving through a path of code where the pointer is expected to be valid, but is <samp>NULL</samp> instead.

    The difference between this rule and "Do not dereference a NULL pointer" is that this rule refers to the special case of invoking a class member <var>foo</var> through a pointer <var>xyz</var>. This may not automatically lead to a failure, provided the method <var>foo</var> handles the possibility that <var>this</var> is <samp>NULL</samp>. Allowing <var>this</var> to be <samp>NULL</samp> can lead to unpredictable results (for example, it could fail for virtual functions).

    Do not access an array beyond its bounds

    This complaint is followed by a path through the program. This path ends in one of:

    • a dereferenced pointer, for example <samp>*xyz</samp>
    • an array that is subscripted, for example <samp>A[I]</samp> (which is the same as <samp>*(A+I)</samp>)
    • a called function that accesses a buffer, for example <samp>strcpy(A, B)</samp>

    Reading from memory outside of an array's bounds may result in a random value. Writing into memory outside of an array's bounds may result in memory corruption.

    Common causes include not checking a variable's value before using it to index an array, or using a value that is the same as the number of array elements (which is too large to actually index the array with). A common cause of buffer overflow is forgetting to leave room for, or to write, the terminating null character of a string.

    To investigate the problem, look at the index that is being used. If it is a variable, ensure that it can only contain values that are within the array's bounds. If it is not a variable, ensure the declaration of the array is large enough - and ensure that the index is correct.
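
    The two common causes above can be sketched as follows. The names <samp>SLOT_COUNT</samp>, <samp>set_slot</samp>, and <samp>copy_name</samp> are hypothetical, as is the convention of returning -1 on rejection.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_COUNT 4            /* hypothetical array size */

/* Stores value at slots[index]; rejects any index outside 0..SLOT_COUNT-1.
   Note that index == SLOT_COUNT is already out of bounds. */
int set_slot(int *slots, size_t index, int value)
{
    if (index >= SLOT_COUNT) {
        return -1;
    }
    slots[index] = value;
    return 0;
}

/* Copies src into dst only if it fits together with its terminating
   null character; dst_size is the full size of the dst buffer. */
int copy_name(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size) {
        return -1;              /* no room for the terminator */
    }
    memcpy(dst, src, len + 1);  /* len + 1 includes the '\0' */
    return 0;
}
```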

    C/C++ data flow analysis assumes a memory model where all allocations (on the heap as well as on the stack) are independent of each other. It may issue "Do not access an array beyond its bounds" if your code assumes that allocations are not independent, for example that memory is allocated consecutively.

    Example 1:

    typedef struct { int a; int b[10]; int c; } abc;
    void f(int *p) {
        abc *s = (abc *) malloc(sizeof(abc));
        s->b[9] = 0;
        s->b[10] = 0;
        s->b[11] = 0;
        s->b[-1] = 0;
        s->b[-2] = 0;
        p[-1] = 0;
    }

    Note the difference between <samp>s->b[10]</samp> and <samp>s->b[11]</samp> in the example. The latter most likely accesses memory beyond the allocation, while the former does not: it is likely to access the field <var>c</var> (depending on how the compiler lays out the <samp>struct</samp>). Both are rarely intended, but <samp>s->b[11]</samp> has two potential problems, while <samp>s->b[10]</samp> has only one.

    Example 2:

    extern struct { int a; int b[1]; } *s;
    s->b[10] = 0;

    Avoid passing pointers which point to deallocated memory

    This complaint is followed by a path through the program. This path ends in the use of a variable which is a pointer that was previously freed along the path. For example,

    free ( xyz );
    a = xyz;
    foo ( xyz );
    if (xyz) ...
    b = (xyz == another_ptr) ? c : d;

    Typically, you will not want a dangling pointer to deallocated memory. Using a pointer to deallocated memory can have unpredictable results because it is usually unknown whether it will be dereferenced in the future. The only possible exception is comparing the value of a deallocated pointer to some other pointer. In the above example, the <samp>if</samp> statement compares <var>xyz</var> against <samp>NULL</samp> and, in the assignment to <var>b</var>, there is a comparison of <var>xyz</var> against <var>another_ptr</var>. In contrast, the assignment to <var>a</var> and the call to <var>foo</var> do not involve any comparison.

    Common causes include meaning to use the pointer before it is deallocated, or meaning to set the pointer to <samp>NULL</samp> or to new memory before it is used again after the deallocation.

    Things to look at include where the memory is deallocated, and the path of the code until it is first used again. Most of the time, the pointer should be reassigned to <samp>NULL</samp> or another memory location before it is used again.
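
    A minimal sketch of resetting the pointer at the point of deallocation is shown below. The helper <samp>release_buffer</samp> is a hypothetical name; it takes the address of the pointer so that the caller's variable is cleared as well.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and sets *pp to NULL so that the caller
   is not left holding a dangling pointer. Safe to call more than once. */
void release_buffer(char **pp)
{
    if (pp == NULL) {
        return;
    }
    free(*pp);      /* free(NULL) is a no-op on a repeated call */
    *pp = NULL;     /* later uses see NULL instead of a stale address */
}
```

    After <samp>release_buffer(&amp;xyz);</samp> any later test such as <samp>if (xyz)</samp> compares against <samp>NULL</samp> rather than against the address of deallocated memory. Other copies of the pointer, if any exist, are not affected and still need attention.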

    Avoid passing NULL

    This complaint is followed by a path through the program. The path ends in a function call such as

    f ( xyz );

    At this point, <var>xyz</var> will be <samp>NULL</samp>, which is a problem because <var>f</var> is incapable of handling a <samp>NULL</samp> pointer.

    Passing <samp>NULL</samp> to certain functions can result in unpredictable behavior.

    Common causes include forgetting to check whether a pointer is <samp>NULL</samp> before passing it to a function, using the wrong function, or assuming that the function allows <samp>NULL</samp> pointers as arguments.

    Look at the argument that is being passed into the function, and look back at its declaration or definition. Decide if you should be checking its value against <samp>NULL</samp> before using it in the function call.
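
    For example, the standard <samp>getenv</samp> function returns <samp>NULL</samp> when a variable is not set, and passing that result to <samp>strlen</samp> is not allowed. The sketch below, with the hypothetical function <samp>env_length</samp>, checks the value before the call.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: returns the length of the environment variable's
   value, or 0 if the variable is not set. */
size_t env_length(const char *name)
{
    const char *value = getenv(name);

    if (value == NULL) {
        return 0;           /* never pass NULL to strlen() */
    }
    return strlen(value);
}
```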

     


    Boobalan M
    Saturday, June 11, 2011 6:20 AM