Road to C++ Programmer #26 - Pybind11

Last Edited: 3/24/2025

This blog post introduces pybind11, a C++ library for creating Python bindings.

Pybind11

When creating a module, we want it to be easy to use, accessible, and highly performant. However, a trade-off often exists between ease of use and performance in programming languages. Scripting languages like Python are generally easier to use due to automatic memory management with a garbage collector, dynamic typing, and tools with high levels of abstraction. However, this often results in slower execution compared to compiled languages like C++ with manual memory management and fewer abstractions.

In most cases, performance isn't the highest priority, and performance differences tend to be minor, making Python a generally more favorable choice. Meanwhile, in performance-critical scenarios like machine learning (ML), we often resort to compiled languages like C++. However, there's a way to achieve the best of both worlds. We can create bindings that act as an interface, exposing modules written in a compiled language to a scripting language, allowing even beginners to take advantage of highly performant modules in a compiled language while remaining within a scripting language.

In fact, we've already been using such bound modules throughout our ML series: NumPy, PyTorch, and TensorFlow all have a large portion of their code written in C or C++ and exposed to Python. (This lets Python programs run performance-critical work at native speed and contributes to Python's popularity, to the extent that it has reached the #1 position in the TIOBE index.) There are multiple ways to set up Python bindings for C++ code (Cython, SWIG, Boost.Python), but in this article, we're going to cover the very basics of pybind11, a lightweight, header-only library that offers a simple way to expose C++11 code to Python.

Installing Pybind11

To install the library as a Git submodule, we can run git submodule add -b stable ../../pybind/pybind11 extern/pybind11 followed by git submodule update --init, assuming the project is hosted on GitHub and keeps external dependencies under extern. If the project is not hosted on GitHub, use the full HTTPS or SSH URL of the pybind11 repository instead of the relative URL. When building, you can include the header files from extern/pybind11/include or use the official integration tools we will discuss later.

You can also install the library from PyPI, conda-forge, vcpkg, or brew. To check that the development environment is set up correctly, create a build directory inside the pybind11 source directory, navigate into it (cd build), run cmake .., and then execute make check -j 4 to run the tests. On Linux, you need to install python-dev or python3-dev and cmake. On macOS, you only need to install cmake, as the included Python version generally works out of the box. For all code presented from here on, we assume you use #include <pybind11/pybind11.h> and namespace py = pybind11; to include the header file and define the namespace alias.
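If pybind11 was installed as a Python package (for example from PyPI) rather than as a submodule, the header directory can be looked up from Python. The following is a minimal sketch; the helper name pybind11_include_dir is made up for illustration, and it returns None when the package is not installed.

```python
import importlib.util


def pybind11_include_dir():
    """Return the pybind11 header directory, or None if pybind11 is not installed."""
    if importlib.util.find_spec("pybind11") is None:
        return None
    import pybind11  # imported lazily so the check above can fail gracefully

    return pybind11.get_include()


if __name__ == "__main__":
    include_dir = pybind11_include_dir()
    if include_dir is None:
        print("pybind11 is not installed in this environment")
    else:
        print(f"pybind11 headers: {include_dir}")
```

The returned path can be passed to the compiler with -I in place of extern/pybind11/include.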

Functions & Variables

We can demonstrate how pybind11 works by exporting functions and variables. The following code example exposes a simple add function with a default argument and a variable seed.

example.cpp
int add(int i, int j = 2) {
    return i + j;
}
 
PYBIND11_MODULE(example, m) {
    m.doc() = "pybind11 example plugin"; // optional module docstring
 
    // py::arg names the parameters and sets the default value seen from Python
    m.def("add", &add, "A function that adds two numbers",
          py::arg("i"), py::arg("j") = 2);
    m.attr("seed") = 42; // module-level variable
}

On macOS, this code can be compiled manually with the command clang++ -O3 -Wall -shared -std=c++11 -fPIC `python3-config --includes` -undefined dynamic_lookup example.cpp -o example`python3-config --extension-suffix` -I 'extern/pybind11/include'. (The -undefined dynamic_lookup flag is specific to macOS; on Linux, omit it and use c++ or g++ instead.) You can then import the module and use it in a Python script in the same directory, like the following.

example.py
import example
 
print(f"Seed: {example.seed}") # => Seed: 42
print(f"Add: {example.add(i=3)}") # => Add: 5
print(f"Add: {example.add(i=2, j=4)}") # => Add: 6
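
The python3-config --extension-suffix part of the compile command matters because the interpreter only imports extension modules whose file name carries a suffix it recognizes. As a rough sketch, the expected file name can be computed from Python itself; the helper name extension_filename is hypothetical.

```python
import importlib.machinery
import sysconfig


def extension_filename(module_name):
    """Return the file name the running interpreter expects for an extension module."""
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    if not suffix:
        # Fallback: the first suffix the import system recognizes
        suffix = importlib.machinery.EXTENSION_SUFFIXES[0]
    return module_name + suffix


# The exact result depends on the platform and Python version
print(extension_filename("example"))
```

If import example fails right after compiling, comparing the name of the produced file with this value is a quick first check.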

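Many data types are supported, including STL containers, chrono (time), and Eigen (which works seamlessly with NumPy arrays). Most of these conversions are enabled by including an additional header such as pybind11/stl.h, pybind11/chrono.h, or pybind11/eigen.h. You can see a complete list of supported types here.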

Custom Type

Using pybind11, you can convert custom-defined classes, data structures, and enums in C++ into Python classes. Specifically, you can use class_ and enum_ to create bindings as follows.

struct Pokemon {
    std::string name; // attribute
    Pokemon(const std::string &name) : name(name) {} // constructor
    void setName(const std::string &name_) { name = name_; } // setter
    const std::string &getName() const { return name; } // getter (const method)
}; // struct can have methods in C++
 
enum PokemonType { Fire = 0, Water, Grass };
 
PYBIND11_MODULE(example, m) {
    py::class_<Pokemon>(m, "Pokemon")
        .def(py::init<const std::string &>())
        .def("setName", &Pokemon::setName)
        .def("getName", &Pokemon::getName)
        // (Optional) __repr__ outputs meaningful info when the object is printed
        .def("__repr__", [](const Pokemon &a) {
            return "<example.Pokemon named '" + a.name + "'>";
        });
 
    py::enum_<PokemonType>(m, "PokemonType", py::arithmetic())
        .value("Fire", PokemonType::Fire) // pass the enumerator itself, not its address
        .value("Water", PokemonType::Water)
        .value("Grass", PokemonType::Grass);
}

In the above example, we bind the struct Pokemon using class_ and method chaining. The constructor is exposed with init, which takes the types of the constructor's parameters as template arguments. You can access the methods in Python like below.

import example
 
p = example.Pokemon("Pikachu")
print(p.getName()) # => Pikachu
p.setName("Raichu")
print(p) # => <example.Pokemon named 'Raichu'>
 

You can also expose the name attribute directly with def_readwrite("name", &Pokemon::name), or with def_readonly("name", &Pokemon::name) for constant fields. For a class Pokemon with a private name attribute and getter and setter functions, you can use def_property("name", &Pokemon::getName, &Pokemon::setName). Corresponding methods (def_readwrite_static, def_readonly_static, and def_property_static) are available for static attributes.
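
From the Python side, an attribute bound with def_property behaves much like a built-in property: reads and writes look like plain attribute access but are routed through the getter and setter. The class below is a pure-Python stand-in written only to illustrate that behavior; it is not generated by pybind11, and the names get_name and set_name are assumptions.

```python
class Pokemon:
    """Pure-Python stand-in that mimics how a def_property binding behaves."""

    def __init__(self, name):
        self._name = name  # plays the role of the private C++ member

    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value

    # Comparable in spirit to def_property("name", &Pokemon::getName, &Pokemon::setName)
    name = property(get_name, set_name)


p = Pokemon("Pikachu")
p.name = "Raichu"  # routed through set_name
print(p.name)      # routed through get_name
```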

Inheritance

When C++ classes are related by inheritance, you need to tell pybind11 about the relationship so that it is reflected in the Python classes. This can be achieved as follows.

class Pokemon {
public:
    std::string nickname;
    Pokemon(const std::string &nickname) : nickname(nickname) {}
};
 
class Pikachu : public Pokemon {
public: // members must be public to be bound
    Pikachu(const std::string &nickname) : Pokemon(nickname) {}
    std::string speak() const { return "Pikachu!"; }
};
 
PYBIND11_MODULE(example, m) {
    py::class_<Pokemon>(m, "Pokemon")
        .def(py::init<const std::string &>())
        .def_readwrite("nickname", &Pokemon::nickname);
 
    py::class_<Pikachu, Pokemon>(m, "Pikachu") // parent class Pokemon specified
        .def(py::init<const std::string &>())
        .def("speak", &Pikachu::speak);
 
    // Alternatively we can do:
    // py::class_<Pokemon> pokemon(m, "Pokemon");
    // pokemon.def(py::init<const std::string &>())
    //        .def_readwrite("nickname", &Pokemon::nickname);
    // py::class_<Pikachu>(m, "Pikachu", pokemon) // parent passed as an extra argument
    //    .def(py::init<const std::string &>())
    //    .def("speak", &Pikachu::speak);
}

However, this approach alone is not enough when the parent class is an abstract class whose virtual functions should be overridable from Python (dynamic binding). To handle such classes, you need to define helper trampoline classes for both the parent and child classes. Details on this are available here.
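
Once the module is built, the inheritance relationship can be checked from Python. The sketch below uses a hypothetical helper, summarize, that accepts any module-like object exposing Pokemon and Pikachu. To stay runnable without the compiled extension, it is exercised here with pure-Python stand-ins that are assumed to mirror the C++ classes above.

```python
from types import SimpleNamespace


def summarize(module):
    """Exercise the Pokemon/Pikachu classes exposed by `module`."""
    pika = module.Pikachu("Sparky")
    return {
        "is_pokemon": isinstance(pika, module.Pokemon),
        "nickname": pika.nickname,  # inherited from Pokemon
        "speak": pika.speak(),      # defined on Pikachu
    }


# Pure-Python stand-ins (assumption: they mirror the C++ classes above)
class _Pokemon:
    def __init__(self, nickname):
        self.nickname = nickname


class _Pikachu(_Pokemon):
    def speak(self):
        return "Pikachu!"


stand_in = SimpleNamespace(Pokemon=_Pokemon, Pikachu=_Pikachu)
print(summarize(stand_in))
```

With the compiled module available, calling summarize(example) after import example should report the same relationship, provided the parent class was specified in the binding.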

Build Modules With CMake

Instead of the lengthy c++ command to compile and build a module, we can leverage build tools like CMake. Pybind11 offers CMake configuration and the pybind11_add_module command for building modules, which can be utilized as follows.

CMakeLists.txt
cmake_minimum_required(VERSION 3.5...3.29)
project(example LANGUAGES CXX)
 
add_subdirectory(extern/pybind11)
# or find_package(pybind11 CONFIG REQUIRED)
 
pybind11_add_module(example example.cpp)
install(TARGETS example DESTINATION ${CMAKE_SOURCE_DIR})

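Configuring and building the project with CMake (for example, by running cmake .. and then make from a build directory) produces the module as a shared library with a .so extension. When pybind11 is already installed on our device, we can use find_package instead of add_subdirectory.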

Distribute Modules with PyPI

Often, we want to distribute the Python package online, and we can do so on PyPI using pybind11, setuptools, build, and twine. A typical file structure for that looks like the following.

example/
├── src/
│   └── example.cpp
├── LICENSE
├── pyproject.toml
├── README.md
└── setup.py

In pyproject.toml, we can specify build dependencies and the project's metadata. (If you only use setuptools for building the module, you can use [tool.setuptools] within the TOML file to perform the build solely based on this file.)

pyproject.toml
[build-system]
requires = [
    "setuptools>=77.0.3",  # the license-files key below needs a PEP 639-aware setuptools
    "pybind11>=2.10.0",
]
build-backend = "setuptools.build_meta"
 
[project]
name = "example"
version = "0.0.1"
authors = [
    {name = "Testing", email = "testing@example.com"},
]
description = "A test project using pybind11"
readme = "README.md"
license-files = ["LICEN[CS]E*"]
classifiers = [
  "Development Status :: 4 - Beta",
  "Programming Language :: Python"
]

Then, we can make use of helper functions provided by pybind11 and setuptools to create the extension modules in setup.py.

setup.py
from setuptools import setup
from pybind11.setup_helpers import Pybind11Extension, build_ext
 
ext_modules = [
    Pybind11Extension(
        'example',  # must match the name passed to PYBIND11_MODULE
        ['src/example.cpp'],
        language='c++'
    ),
]
 
setup(
    ext_modules=ext_modules,
    cmdclass={'build_ext': build_ext},
    zip_safe=False,
)

Running python3 -m build in the root directory creates a dist directory containing a source distribution (.tar.gz) and a wheel (.whl). We can test how the package looks on PyPI by uploading it to Test PyPI first with twine upload --repository testpypi dist/*, which prompts for an API token that you can create on Test PyPI after setting up an account here.

You can confirm that your module works by installing it from Test PyPI using python3 -m pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple example==0.0.1 and calling the module's functions. (The name example might not work on Test PyPI, so I recommend using a different name.) You can also create more complex packages with submodules and separate build targets for different modules. For more details, I recommend checking out the official documentation by PyPA, setuptools, and pybind11 cited below.
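
After the pip install step, a quick way to confirm which version actually landed in the environment is to query the package metadata. This is a minimal sketch using the standard library; the helper name installed_version is made up for illustration, and the distribution name should be whatever name was used for the upload.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "example" is the project name from pyproject.toml; substitute your own
print(installed_version("example"))
```

A result of None means the install did not reach the interpreter running the script, which often points to a mismatch between the pip and python3 being used.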

Conclusion

In this article, we covered the basics of what pybind11 is, how we can use it to create Python bindings, and how we can build and distribute the module with various tools. While these tools seem simple to use, the complexity grows quickly as we deal with larger and more complicated repositories, especially when they use advanced C++ features and elaborate file structures. To learn how to cleanly manage larger projects that involve them, I recommend examining large repositories like PyTorch on GitHub.

Resources