Developing in C for the ATmega328P: Buffer Overflow

4 minute read

Where I use examples from “The C Programming Language”, Kernighan & Ritchie, to demonstrate string copies, using pointers and how to check for buffer overflows.


I find it very helpful to periodically review/read “The C Programming Language”, Kernighan and Ritchie (K&R). In this past review, I ran across some examples provided on pages 105-6 (Second Edition) as to using pointers to copy strings. There were 4 examples provided, with the last commented as “the idiom should be mastered”, which is an implication of “this is a programming best practice”. Perhaps, for the code in question is quite simple and extremely powerful, that said, it leads to the issue of buffer overflows, for which I provide a mechanism to resolve.

The K&R Examples

In the book, there are four examples provided in the section 5.5 Character Pointers and Functions:

  • strcopy: copy t to s: array subscript version

  • strcopy: copy t to s: pointer version 1, increment pointers separately

  • strcopy: copy t to s: pointer version 2, move pointer increment into comparison

  • strcopy: copy t to s: pointer version 3, comparison to \0 is redundent,

I provide the code to these examples, in my AVR_C Repository in examples/strcpy. I find it extremely helpful to review each one of these functions in detail.

Array Subscript Version

This is the version I would have written just to begin testing out code. I tend to think more intuitively using arrays and indices, than with pointers:

    i = 0;
    while((s[i] = t[i]) != '\0')

It does beg for optimization, though, my first thought would have been to play with a for loop.

Increment Pointers Separately

    while((*s = *t) != '\0')

This is a great transition. It also dramatically simplifies how you begin to think about the copy operation. In this case, it makes clear it is a copy from one memory location to another, by using pointers.

Move Pointer Increment Into Comparison

    while((*s++ = *t++) != '\0')

This one looks great due to its simplicity. And what I thought the solution would be. It’s important to realize all of the operations which are taking place:

  1. The location s points to, is set to the value in t
  2. Both s and t are incremented
  3. Due to the extra set of "()", character in s is now compared to ’\0’, and if true, exits the loop (or it only stays in the loop if the comparison is false).

Comparison To \0 Is Redundant

   while((*s++ = *t++))

This one takes a little more thought. It performs steps 1 and 2 from the previous version, however, it also acts as a boolean as well due to the extra set of "()". (Its helpful to remove the extra set and see the error message.) When "*s++ = *t++" is evaluated as a boolean when the character is ’\0’, the value will be false, as ’\0’ is false.

This walk through was very helpful for me to continue to process how to work with pointers, particularly, as it pertains to character arrays. For remember, in C, you will have to use a function such as strcpy, to initialize a new value for a string.

Buffer Overflow

A extremely significant issue with the final version is that it enables, encourages, promotes, buffer overflows. As clean as the code is, it assumes the second string will always end with a ’\0’. A very significant error, indeed.

So, how do we fix it? We create a new version which only allows a copy to a specific size. We update our interface to include a pointer to a variable which contains the length of the destination buffer.

In the case below, I add the len variable, which tells the function how many characters to copy, and because its a pointer, one can also discern information as to its value upon return.

The enhanced version is this:

void strcpyn_p3(char *s, char *t, uint8_t *len) 
    while((*s++ = *t++) && (--(*len)))

This approach doesn’t remove our susceptibility to buffer overflows. There can still be an overflow if the b string does not end in an ’\0’ and is shorter than the a buffer..

The overhead isn’t much, I was able to use a scope to determine how it takes to execute one loop in both strcpy_p3 (third version, no length checking) and one loop in strcpyn_p3 (with length checking). The latter was about 880us slower or about 14 clock cycles of a 16Mhz clock. Those 880us can save days of debugging later…

I also wanted to confirm that every string copied was a legal string. In C, a legal string is one which ends in ’\0’. I added a function which prints out the string passed to it and confirms the last character in the string is a ’\0’.

Comments powered by Talkyard.