Linux 7.2 Kernel Set to Introduce Long-Awaited Cache Aware Scheduling

As the Linux 7.2 kernel merge window approaches in mid-June, Cache Aware Scheduling (CAS) is finally poised to enter the mainline kernel. The feature has been under development for over a year, primarily by Intel engineers, and promises significant performance improvements for modern multi-core CPUs, especially server-class processors from both Intel and AMD.

Cache Aware Scheduling is designed to optimize task placement by scheduling processes that share data on CPUs within the same last level cache (LLC) domain. This reduces cache misses and “cache bouncing,” where shared data is repeatedly moved between caches or refetched from main memory, degrading performance. The feature is enabled via the CONFIG_SCHED_CACHE Kconfig option and can be toggled at runtime through a debugfs interface, allowing administrators to compare performance with and without CAS active.
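To see which CPUs fall into the same LLC domain on a given machine, the cache topology the kernel already exports in sysfs can be read directly. The following is a minimal sketch, not part of the CAS patches: it assumes the standard /sys/devices/system/cpu/cpuN/cache/indexM/ layout with its level and shared_cpu_list files, and treats the highest cache level reported for each CPU as its LLC. The function names parse_cpu_list and llc_domains are illustrative.

```python
import glob
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def llc_domains(sysfs_root="/sys/devices/system/cpu"):
    """Group CPUs by the highest-level cache they share, as reported by sysfs."""
    domains = set()
    for cpu_dir in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        best_level, best_shared = -1, None
        for index_dir in glob.glob(os.path.join(cpu_dir, "cache", "index[0-9]*")):
            try:
                with open(os.path.join(index_dir, "level")) as f:
                    level = int(f.read())
                with open(os.path.join(index_dir, "shared_cpu_list")) as f:
                    shared = f.read()
            except (OSError, ValueError):
                continue  # Skip cache entries that are missing or malformed.
            if level > best_level:
                best_level, best_shared = level, shared
        if best_shared is not None:
            domains.add(tuple(parse_cpu_list(best_shared)))
    return sorted(domains)


if __name__ == "__main__":
    for i, cpus in enumerate(llc_domains()):
        print(f"LLC domain {i}: CPUs {cpus}")
```

On a system with several LLC domains, such as a multi-CCD EPYC part, each printed group is a set of CPUs between which CAS would try to keep data-sharing tasks together.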

According to recent updates, the CAS code has been merged into the “sched/cache” branch by Peter Zijlstra and is now part of the sched/core branch of the tip tree, where scheduler changes are staged. This is the final step before submission to Linus Torvalds for inclusion in Linux 7.2. Assuming no last-minute issues, the feature is expected to land in the mainline kernel this summer.

Real-World Benchmarks Show Impressive Gains

Testing on modern server platforms has demonstrated tangible benefits. On an AMD EPYC Genoa system, the ChaCha20 cryptographic benchmark saw a 44% throughput improvement with CAS enabled. Intel’s Sapphire Rapids platforms also showed significant wins, particularly in latency-sensitive workloads like hackbench and schbench, with up to 30% faster task completion in some scenarios.

However, not all workloads benefit equally. Some memory- and network-intensive benchmarks, such as stream and netperf, showed minimal change, and certain netperf configurations even saw slight regressions. Still, the overall trend is positive, especially for workloads that are cache-sensitive or involve many threads sharing data.

Tunables and Future Work

CAS introduces several runtime tunables under /sys/kernel/debug/sched/, including:

  • llc_aggr_tolerance: Controls how aggressively tasks are grouped by LLC affinity.
  • llc_overload_pct and llc_imb_pct: Prevent overloading LLC domains.
  • llc_epoch_period and llc_epoch_affinity_timeout: Fine-tune how often LLC occupancy is sampled and how long a process retains its cache affinity.

The current implementation is global, but developers are considering per-process controls in the future via prctl.
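The tunables listed above can be inspected with a few lines of script. This is a minimal sketch under stated assumptions: the file names are taken from this article and may differ in the kernel that finally ships, debugfs must be mounted at /sys/kernel/debug, and reading it normally requires root. The helper read_llc_tunables is a hypothetical name, and the sketch makes no assumption about the format of each value beyond it being text.

```python
import os

# Tunable names as listed in this article; the exact set may differ
# in the kernel that finally ships.
LLC_TUNABLES = [
    "llc_aggr_tolerance",
    "llc_overload_pct",
    "llc_imb_pct",
    "llc_epoch_period",
    "llc_epoch_affinity_timeout",
]


def read_llc_tunables(base="/sys/kernel/debug/sched", names=LLC_TUNABLES):
    """Return {name: value-or-None}; None means the file is missing or unreadable."""
    values = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as f:
                values[name] = f.read().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_llc_tunables().items():
        shown = value if value is not None else "(not available)"
        print(f"{name}: {shown}")
```

Recording these values alongside each benchmark run makes it easier to attribute a gain or regression to a specific setting when comparing results with and without CAS.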

Source: Phoronix, LWN
