Mitigating CPU Frontend Stalls for Complex Data Plane Applications
Proceedings of USENIX NSDI'26
Abstract
The functionality and requirements of modern networks are becoming increasingly complex, giving rise to Complex Data Plane Applications (CDPAs) with rich semantics but often limited performance. However, most existing optimizations fail to improve CDPA performance. This is because CDPAs typically have excessively large code sizes, often two orders of magnitude larger than today's L1 instruction cache (I-cache), causing the CPU to stall frequently on instruction fetches and yielding a distinct performance profile that is bound by the CPU frontend. This paper proposes NanoPL, a new I-cache-friendly execution model that shuffles the packet processing order to mitigate CPU frontend stalls for CDPAs. Building on the common execution pattern of CDPAs, NanoPL analyzes a CDPA's code to ensure semantic consistency after shuffling. By collecting performance profiles of CDPAs over the underlying traffic, NanoPL partitions a CDPA into execution stages and applies an I-cache-friendly shuffling policy. Experiments show that NanoPL achieves 17.2%–30.2% higher throughput on real-world CDPAs by reducing I-cache misses by up to 86.4%.
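To make the execution model concrete, the following is a minimal C sketch of the kind of packet-order shuffling the abstract describes; it is not NanoPL's implementation, and the stage names, packet fields, and stage bodies are hypothetical. Instead of running every stage on one packet before moving to the next packet, a batch of packets is driven through one execution stage at a time, so each stage's instructions can stay resident in the L1 I-cache.

/* Illustrative sketch only: contrasting per-packet execution with
 * stage-batched ("shuffled") execution. Stage names, packet fields,
 * and stage bodies are hypothetical placeholders. */
#include <stddef.h>
#include <stdint.h>

struct packet { uint8_t proto; uint32_t dst; uint16_t out_port; };

/* Hypothetical pipeline stages with trivial bodies for illustration. */
static void stage_parse(struct packet *p)   { p->proto &= 0xff; }
static void stage_lookup(struct packet *p)  { p->out_port = (uint16_t)(p->dst % 16); }
static void stage_rewrite(struct packet *p) { p->dst ^= 0x1u; }

/* Conventional model: run all stages on one packet before the next packet.
 * With a code footprint far larger than the L1 I-cache, the stages keep
 * evicting each other's instructions. */
void run_per_packet(struct packet *pkts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        stage_parse(&pkts[i]);
        stage_lookup(&pkts[i]);
        stage_rewrite(&pkts[i]);
    }
}

/* Shuffled (stage-batched) model: run one stage across the whole batch
 * before moving to the next, so each stage's instructions stay resident
 * in the I-cache and its misses are amortized over the batch. */
void run_per_stage(struct packet *pkts, size_t n)
{
    for (size_t i = 0; i < n; i++) stage_parse(&pkts[i]);
    for (size_t i = 0; i < n; i++) stage_lookup(&pkts[i]);
    for (size_t i = 0; i < n; i++) stage_rewrite(&pkts[i]);
}

Reordering is only safe when stages do not carry dependences across the swapped packet order, which is why the paper's code analysis for semantic consistency is a prerequisite to applying such a shuffling policy.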