This post introduces modularization in C (and C++).

When you embark on a large project in C, you will reach a point where the code becomes so large that you might want to break it down into related modules. In this article, we will discuss the motivation behind modularization and how it can be done in C (and C++).
Why Modularize
As mentioned above, a single source file can grow so large that it becomes impractical to navigate. In such cases, modularization comes in handy. It also helps with build times: when everything lives in one large file, even a small change forces you to recompile all of it. Modularization alleviates this by letting you recompile only the files that changed.
For example, we have worked with various custom data types like AdjacencyMatrix and BTree in C++. If we kept them all in the same file as the main function, it would be difficult to benefit from the abstraction these objects provide when reasoning about the program. Moreover, a small change in a hidden implementation would force recompilation of large, unchanged parts of the program. The same problem exists in C, where ideally we would define a struct and the functions that operate on it in a separate file.
How To Modularize
There is no formal definition of a module in C (and C++), but we can use header and source files to achieve "modularization."
The header file (.h) contains the declarations of functions and types, serving as an interface for other code. The source file (.c or .cpp) contains the implementation of the declared functions. Below is an example of a header file for a linked list.
// The struct needs the tag Node so that it can refer to itself in next.
typedef struct Node {
    int data;
    struct Node *next;
} Node;

void printLinkedList(Node *head);
Node *prependNode(Node *head, int data);
The header file declares the functions without defining them. (You can technically define functions in a header, but then every source file that includes it must be recompiled whenever an implementation changes.) You might also document what each function does in the header file.
The source file uses the #include preprocessor directive to copy in the declarations and then provides the definitions as follows.
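#include <stdio.h>
#include <stdlib.h>
#include "LinkedList.h"

// Print every element from head to tail.
void printLinkedList(Node *head) {
    Node *current = head;
    while (current != NULL) {
        printf("%d -> ", current->data);
        current = current->next;
    }
    printf("\n");
}

// Allocate a new node, place it in front of head, and return the new head.
Node *prependNode(Node *head, int data) {
    Node *newNode = malloc(sizeof(Node));
    if (newNode == NULL) {
        return head; // allocation failed; the list is unchanged
    }
    newNode->data = data;
    newNode->next = head;
    return newNode;
}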
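The above uses "" for LinkedList.h instead of <>, which we use when including standard library headers. With "", the preprocessor typically looks for the header relative to the including file first (usually its own directory) before falling back to the system include paths.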
The #include directive is replaced by the contents of the specified header file, which can cause redefinition errors if the same header ends up included more than once in one source file, for example through other headers. To prevent this, we use include guards as follows.
#ifndef LINKEDLIST_H
#define LINKEDLIST_H

typedef struct Node {
    int data;
    struct Node *next;
} Node;

void printLinkedList(Node *head);
Node *prependNode(Node *head, int data);

#endif
The include guard uses preprocessor conditionals to skip the header's contents when the macro LINKEDLIST_H, chosen to be unique to this header, is already defined. This ensures that the contents of the header are processed only once per source file, so functions and types are not redeclared.
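The effect of the guard can be sketched in a single file by pasting the same guarded contents twice, which is roughly what the preprocessor would produce if a header were included twice. The header name point.h, the Point type, and the pointSum function are hypothetical and exist only for this illustration.

```c
/* First "inclusion" of a hypothetical header point.h.
   POINT_H is not defined yet, so the contents are processed. */
#ifndef POINT_H
#define POINT_H
typedef struct Point {
    int x;
    int y;
} Point;
#endif

/* Second "inclusion" of the same header.
   POINT_H is now defined, so everything up to #endif is skipped
   and struct Point is not defined a second time. */
#ifndef POINT_H
#define POINT_H
typedef struct Point {
    int x;
    int y;
} Point;
#endif

/* Client code that relies on the type from the header. */
int pointSum(Point p) {
    return p.x + p.y;
}
```

Without the #ifndef/#define/#endif lines, the second copy would define struct Point again and the compiler would reject the file.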
We can then include the header file in the main source file with #include "LinkedList.h" (the name must match the file name exactly on case-sensitive file systems) and use its functions and types.
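As a rough sketch of how the pieces fit together, the following puts the three parts into one file so that it compiles on its own; in a real project each labeled part would live in the file named in its comment. The helper buildDemoList is hypothetical and stands in for code that would normally sit in main.c, and the NULL check after malloc is an assumption about how the implementation might handle allocation failure.

```c
#include <stdio.h>
#include <stdlib.h>

/* ---- LinkedList.h: the interface ---- */
#ifndef LINKEDLIST_H
#define LINKEDLIST_H
typedef struct Node {
    int data;
    struct Node *next;
} Node;
void printLinkedList(Node *head);
Node *prependNode(Node *head, int data);
#endif

/* ---- LinkedList.c: the implementation ---- */
void printLinkedList(Node *head) {
    for (Node *current = head; current != NULL; current = current->next) {
        printf("%d -> ", current->data);
    }
    printf("\n");
}

Node *prependNode(Node *head, int data) {
    Node *newNode = malloc(sizeof(Node));
    if (newNode == NULL) {
        return head; /* allocation failed; leave the list unchanged */
    }
    newNode->data = data;
    newNode->next = head;
    return newNode;
}

/* ---- main.c: client code that only relies on the interface ---- */
Node *buildDemoList(void) {
    Node *head = NULL;
    head = prependNode(head, 3);
    head = prependNode(head, 2);
    head = prependNode(head, 1);
    return head; /* the list now holds 1, 2, 3 in that order */
}
```

The client part never touches malloc or the loop inside printLinkedList; it only uses the names declared in the header, which is the separation the module is meant to provide.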
Object Files
When compiling source code that is split into modules, we compile each source file into an object file (.o). Object files contain machine code and a symbol table but cannot be executed on their own. The linker uses the symbol tables to match each function call with its definition when it combines the object files, exactly one of which must define the main function, into a single executable program. To compile a source file into an object file without linking, we can use the -c flag.
> gcc -c LinkedList.c
> gcc -c main.c
> ls
main.c main.h main.o LinkedList.c LinkedList.h LinkedList.o
To link them, we pass the object files to gcc, and then we can run the resulting executable.
> gcc main.o LinkedList.o
> ls
a.out main.c main.h main.o LinkedList.c LinkedList.h LinkedList.o
> ./a.out
By default, the name of the executable file is a.out, which can be changed using the -o flag, like gcc -o <name> main.o LinkedList.o. The same applies to g++ for C++.
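The reason a source file can be compiled into an object file before the other modules exist is that the compiler only needs declarations; the definitions are matched up later. The single-file sketch below imitates this, with the hypothetical functions add and twice standing in for a module function and its caller.

```c
/* What the caller sees: only a declaration, as if it came from a header. */
int add(int a, int b);

/* The caller compiles against the declaration alone. In its object file,
   add would be recorded as an undefined symbol for the linker to resolve. */
int twice(int x) {
    return add(x, x);
}

/* The definition. In a real project this would live in a separate .c file,
   and the linker would connect the call in twice to this code. */
int add(int a, int b) {
    return a + b;
}
```

If the definition of add were missing from every object file passed to the linker, each file would still compile with -c, and the error would only appear at the linking step.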
Static vs Dynamic Libraries
While object files can serve as "modules," listing all the relevant object files to compile them into an executable is tedious. This is especially true when there are many object files used in multiple places. It becomes intractable for users to keep track of and list relevant object files each time. Hence, we usually bundle the object files together into an archive to create a library.
A static library is an archive of object files; at link time, the linker copies the object code that the program needs from the archive into the executable. We can create a static library using ar rcs lib<name>.a <name>.o .... Static libraries usually have a .a extension with lib as a prefix. Then, we can link the static library using the -L. and -l flags, like gcc main.o -L. -l<name>, where -l<name> makes the linker look for lib<name>.a and the . after -L adds the current directory to the library search path.
A dynamic library, or shared library, is a library whose code is not copied into the executable; instead, it is loaded and its symbols are resolved when the program runs. This often results in a much smaller executable file. We can create a shared library by compiling the source code into object files with the -fPIC flag and then linking them with the -shared flag, like gcc -shared -o lib<name>.so <name>.o .... Shared libraries typically have a .so (shared object) extension on Linux, and we can link against them using the same -L. and -l flags. At runtime, the dynamic loader must also be able to find the library, for example through the LD_LIBRARY_PATH environment variable on Linux.
Conclusion
In this article, we covered why modularization is useful, how it can be achieved using header files and object files, and how to create static or shared libraries from object files. Modularization allows us to work with larger codebases, potentially in collaborative environments. For practice, I recommend revisiting the articles on data structures and creating modules or libraries for them.
Resources
- Jain, A. 2024. Static Library vs Dynamic Library: Understanding the Differences. Medium.
- Jordan, K. 2021. C "Modules" - Tutorial on .h Header Files, Include Guards, .o Object Code, & Incremental Compilation. YouTube.