io_uring, libaio performance across Linux kernels and an unexpected IOMMU trap

The article compares io_uring and libaio, two Linux asynchronous I/O interfaces, across kernel versions 5.4 through 7.0-rc3. As expected, io_uring outperformed libaio by roughly 2x, but the most significant finding was an unexpected performance regression.

The study focused primarily on 4K random writes, chosen because they resemble database I/O patterns and expose software latency on NVMe devices.
Newer kernels showed a performance drop of roughly 30%.
The regression was attributed to the IOMMU (Input/Output Memory Management Unit) being enabled by default in those kernel versions.
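
Because the regression depends on whether the IOMMU is enabled, it helps to check the state of a machine before comparing benchmark numbers. The following is a minimal sketch, not the article's method. It assumes a Linux system that exposes the usual sysfs locations (/sys/class/iommu and /sys/kernel/iommu_groups) and reads /proc/cmdline for the iommu, intel_iommu and amd_iommu boot parameters. The function name iommu_status and its root argument are hypothetical, introduced only for this illustration.

```python
from pathlib import Path

# Kernel command-line parameters that influence IOMMU behaviour.
IOMMU_PARAMS = ("iommu", "iommu.passthrough", "intel_iommu", "amd_iommu")


def iommu_status(root="/"):
    """Collect hints about IOMMU state from sysfs and the kernel command line.

    `root` is a hypothetical knob that lets the function run against a
    copied or mocked filesystem tree instead of the live system.
    """
    root = Path(root)

    def count_entries(rel):
        # Number of entries in a directory, or 0 if it does not exist.
        d = root / rel
        return len(list(d.iterdir())) if d.is_dir() else 0

    cmdline_file = root / "proc/cmdline"
    tokens = cmdline_file.read_text().split() if cmdline_file.is_file() else []

    # Keep only key=value boot parameters that relate to the IOMMU.
    params = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep and key in IOMMU_PARAMS:
            params[key] = value

    units = count_entries("sys/class/iommu")
    groups = count_entries("sys/kernel/iommu_groups")
    return {
        "iommu_units": units,
        "iommu_groups": groups,
        "cmdline_params": params,
        # Assumption: any registered unit or group means the IOMMU was
        # initialised. This does not tell us whether DMA is translated.
        "likely_active": units > 0 or groups > 0,
    }


if __name__ == "__main__":
    for key, value in iommu_status().items():
        print(f"{key}: {value}")
```

A non-zero group count only indicates that the IOMMU was initialised. In passthrough mode (for example with iommu=pt on the command line) groups still exist even though most DMA is not translated, so the boot parameters should be read alongside the counts.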

The results show how Linux I/O performance has evolved: modern interfaces like io_uring bring clear gains, but default kernel configuration, such as an enabled IOMMU, can carry significant and easily overlooked performance costs.

The Lowdown

The Gossip

I/O Interrogations: Understanding IOMMU's Impact & Benchmark Choices