sched_ext: pluggable scheduling in the Linux kernel

Scheduling is a notoriously difficult problem. An effective scheduler should fully utilize a system while also optimizing for cache locality, honoring real-time constraints, accounting for battery life and power management, ensuring fairness, and more. The landscape of the tech industry has changed a lot in the last 15 years. Back in the late 2000s, cores were typically homogeneous and were spaced farther apart from one another. Modern systems are much more complex by comparison. Heterogeneous architectures are the norm for mobile devices and are becoming more common in x86. Cache hierarchies are also less uniform, with Core Complex (CCX) chips having multiple shared L3 caches within a single socket. Use cases have evolved as well. Mobile and VR applications have latency requirements, where a missed deadline degrades the user experience, and workload stacking in data centers keeps raising the demands on the scheduler for workload isolation and resource distribution. While CFS is a great scheduler, there are opportunities to continue improving it for such use cases. sched_ext lets developers implement scheduling policies as BPF programs, which makes it easy to experiment and find scheduling algorithms that address these use cases.

David VERNET

David is a kernel engineer at Meta, working primarily on the scheduler and BPF. David likes to utilize the CPU as much as possible, but gets anxiety if he sees too many cache misses. In his free time, David is the co-chair of the IETF BPF standardization working group, and he also likes to cook, drink beer, and do freelance writing for LWN.