Building Your Own Efficient uint128 in C++
Summary
This article walks through implementing a fixed-width 128-bit unsigned integer in modern C++ as a pair of 64-bit limbs. It uses add-with-carry, subtract-with-borrow, and widening-multiply intrinsics so that each operation maps directly to hardware instructions, and it shows that the hand-written arithmetic can match the performance and generated code of the built-in __uint128_t across common operations. Platform notes cover GCC/Clang, MSVC, and AArch64. The piece also discusses scope, limitations, and possible extensions to larger widths.
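
To make the two-limb idea concrete, here is a minimal sketch of carry propagation, borrow propagation, and a widening multiply. It is an illustration only, not the article's implementation: the names u128, add, sub, mul64, mul, and check_examples are hypothetical, and where the article relies on compiler intrinsics (for example, _addcarry_u64 on x86-64 or _umul128 on MSVC x64 are real intrinsics of this kind), the sketch substitutes plain portable C++ so that it compiles on any conforming compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type: two 64-bit limbs, low limb first.
struct u128 {
    std::uint64_t lo; // least-significant 64 bits
    std::uint64_t hi; // most-significant 64 bits
};

// Addition: add the low limbs, then feed the carry into the high limbs.
inline u128 add(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo + b.lo;
    // Unsigned addition wrapped around iff the sum is smaller than an operand.
    std::uint64_t carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = a.hi + b.hi + carry;
    return r;
}

// Subtraction: subtract the low limbs, then propagate the borrow.
inline u128 sub(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo - b.lo;
    std::uint64_t borrow = (a.lo < b.lo) ? 1u : 0u;
    r.hi = a.hi - b.hi - borrow;
    return r;
}

// Widening multiply, 64 x 64 -> 128 bits, built from 32-bit halves.
// This is a portable stand-in for a hardware widening-multiply intrinsic.
inline u128 mul64(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = 0xFFFFFFFFu;
    std::uint64_t a0 = a & mask, a1 = a >> 32;
    std::uint64_t b0 = b & mask, b1 = b >> 32;
    std::uint64_t p00 = a0 * b0;
    std::uint64_t p01 = a0 * b1;
    std::uint64_t p10 = a1 * b0;
    std::uint64_t p11 = a1 * b1;
    // Sum of the three terms that land in bits 32..63; cannot overflow 64 bits.
    std::uint64_t mid = (p00 >> 32) + (p01 & mask) + (p10 & mask);
    u128 r;
    r.lo = (p00 & mask) | (mid << 32);
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return r;
}

// Truncating multiply, 128 x 128 -> low 128 bits.
// The hi*hi term lies entirely above bit 127 and is dropped.
inline u128 mul(u128 a, u128 b) {
    u128 r = mul64(a.lo, b.lo);
    r.hi += a.lo * b.hi + a.hi * b.lo;
    return r;
}

// Small usage example exercising the carry, borrow, and multiply paths.
inline void check_examples() {
    u128 max64 = {~std::uint64_t{0}, 0}; // 2^64 - 1
    u128 one = {1, 0};
    u128 s = add(max64, one);            // carries into the high limb
    assert(s.lo == 0 && s.hi == 1);
    u128 d = sub(s, one);                // borrows back out of it
    assert(d.lo == max64.lo && d.hi == 0);
    u128 p = mul64(max64.lo, max64.lo);  // (2^64 - 1)^2
    assert(p.lo == 1 && p.hi == 0xFFFFFFFFFFFFFFFEull);
}
```

The comparisons in add and sub are the portable way to recover the carry and borrow flags. An intrinsic-based version would replace them with a single add-with-carry or subtract-with-borrow instruction, which is where the article's codegen comparison against __uint128_t applies.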