This blog post introduces buffer overflows in C, an important concept for writing secure code.

Null-Terminated Strings
When we write a string literal in C, the compiler automatically appends a null character, \0 (the byte value 0, not the digit character '0'), to mark the end of the string. Hence, when we want to store 5 characters, we need to prepare a buffer of size 6.
int main () {
    char buffer[6];
    buffer[0] = '1'; // single quotes: a char, not a string literal
    buffer[1] = '2';
    buffer[2] = '3';
    buffer[3] = '4';
    buffer[4] = '5';
    buffer[5] = '\0'; // or 0
    return 0;
}
This is a compact way of implementing strings because the string does not need to store its length separately, which would typically take an extra 4 or 8 bytes. However, this is also where a significant security vulnerability comes from.
Buffer Overflow
Let's see what happens if we forget to add the null terminator.
#include <stdio.h>
#include <string.h>
int main() {
    char buffer[6];
    buffer[0] = '1';
    buffer[1] = '2';
    buffer[2] = '3';
    buffer[3] = '4';
    buffer[4] = '5';
    // buffer[5] = '\0'; // or 0
    printf("len: %zu", strlen(buffer)); // does not work properly: no \0 to stop at
    return 0;
}
The strlen call above does not work properly because it keeps reading until it finds \0, which is not present in the buffer, so it reads past the end of the buffer (undefined behavior in C). A similar problem occurs when we try to copy a larger string into a buffer with strcpy. It keeps copying characters until it sees \0 in the source, so values outside of the buffer get modified. This is called a buffer overflow, and attackers can use it for all kinds of malicious activities.
Buffer Overflow Attack
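Let's say our main function takes a string from the user as a command-line argument, like the following:
#include <stdio.h>
#include <string.h>
int main (int argc, char *argv[]) {
    // argc : Argument count
    // argv : Argument values (argv[1] is the first user-supplied argument)
    if (argc < 2) return 1; // no input given
    char buffer[500];
    strcpy(buffer, argv[1]); // vulnerable: no check that the input fits in 500 bytes
    printf("%s", buffer);
    return 0;
}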
It simply takes a user input string, copies it into a buffer of size 500, and prints that buffer. Now, by supplying more characters than the buffer can hold, we can overwrite the memory that sits next to the buffer on the stack, including the function's return address. Hence, we can inject a string that overflows into the return address and makes it point to malicious code embedded in the string itself (for example, code that gains super-user access to files or steals information). I recommend the video from Computerphile linked below if you are interested in the details.
The exploitation of a buffer overflow made possible by null-terminated strings in C is called a buffer overflow attack, and C programmers have to be careful to guard against it.
Countermeasures
Luckily, there are countermeasures against such attacks that let us handle user input strings safely. For example, instead of using strcpy, we can use strncpy, which copies at most a given number of characters and therefore never writes outside of the buffer.
#include <stdio.h>
#include <string.h>
int main (int argc, char *argv[]) {
    if (argc < 2) return 1; // no input given
    char buffer[500];
    strncpy(buffer, argv[1], 499); // copy at most 499 characters, even if \0 has not been reached
    buffer[499] = '\0'; // strncpy does not add a terminator when the input is truncated
    printf("%s", buffer);
    return 0;
}
However, realistically speaking, it is hard to remember to apply these countermeasures every time, just as it is hard to manage heap memory safely. This is why many programming languages today implement strings differently, and why many programmers choose something other than C.
Exercises
Starting with this article, there will be an exercise section where you can test your understanding of the material introduced in the article. I highly recommend solving these questions by yourself after reading the main part of the article. You can click on each question to see its answer.
Resources
- Computerphile. 2016. Running a Buffer Overflow Attack - Computerphile. YouTube.
- Portfolio Courses. 2023. Null Terminated String Safety Issues | C Programming Tutorial. YouTube.