Introduction
The Linux kernel's swap subsystem manages anonymous pages that are moved to secondary storage when memory is scarce. For years, this subsystem was largely neglected, but recent developer interest, sparked by three dedicated sessions at the 2026 Linux Storage, Filesystem, Memory Management, and BPF Summit, has revived it. Two sessions focused on improving swap performance and maintainability, while a joint session with the storage track explored making swap friendlier to solid-state drives (SSDs). This guide walks you through the key improvements discussed: swap tables for faster lookups, flash-friendly swap to reduce write amplification, and the new swap_ops API for custom per-device behavior.
What You Need
- A working Linux kernel source tree (version 6.x or later, with swap subsystem patches)
- Basic understanding of memory management and swap concepts
- A test system with both a traditional HDD and an SSD (or NVMe drive) for comparison
- A swap partition or a swap file on each storage device
- Kernel build tools (gcc, make, headers) and a suitable environment for building and deploying custom kernels
- Benchmarking utilities like perf, vmstat, and fio for performance testing
- Root access to configure swap and kernel parameters
Step-by-Step Instructions
Step 1: Understand Current Swap Limitations
Before making changes, analyze why the existing swap code is inefficient. The swap cache has traditionally been indexed through shared lookup structures (XArrays, formerly radix trees) whose locks become contended on multi-core systems. On SSDs, naive swap operations cause write amplification, and freed slots are discarded (TRIM) only when swap is activated with the discard option; both degrade endurance. Sessions at the summit highlighted these pain points. Begin by reviewing the kernel's swap code in mm/swap_state.c, mm/swapfile.c, and mm/page_io.c, noting where slot allocation scans linearly and where per-device configuration is missing.
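Before touching the code, it helps to record how much swap traffic the unmodified kernel generates. The following is a minimal sketch, assuming the pswpin and pswpout counters that /proc/vmstat exposes (pages swapped in and out since boot). The function names are illustrative and not part of any kernel tool.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Return pages swapped in and out between two snapshots."""
    return {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("pswpin", "pswpout")
    }


def read_snapshot(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_vmstat(handle.read())
```

Taking one read_snapshot() before a workload and one after, then passing both to swap_activity(), gives a baseline to compare against once the patches are applied.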
Step 2: Enable and Configure Swap Tables
The new swap tables replace that shared lookup structure with per-device, scalable data structures. To enable them:
- Apply the swap-table patchset (available from the memory management mailing list).
- Configure your kernel to include the CONFIG_SWAP_TABLES option.
- Rebuild and boot the kernel.
- Verify activation by checking /proc/swaps and dmesg for messages about per-device tables.
- Optionally, tune the table size via /sys/kernel/mm/swap_tables/ parameters (e.g., num_entries, backing_shift).
Swap tables reduce lock contention and improve lookup performance, especially under heavy swapping workloads.
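The contention argument can be illustrated with a toy userspace model. This is only a sketch of the idea described above and not the kernel's data structure: each device gets its own array indexed by swap offset, guarded by its own lock, so lookups on different devices never wait on each other. The class names DeviceSwapTable and SwapTables are hypothetical.

```python
import threading


class DeviceSwapTable:
    """Toy per-device table: one slot per swap offset, with its own lock."""

    def __init__(self, num_entries):
        self._slots = [None] * num_entries
        self._lock = threading.Lock()

    def store(self, offset, page):
        with self._lock:
            self._slots[offset] = page

    def lookup(self, offset):
        with self._lock:
            return self._slots[offset]

    def clear(self, offset):
        with self._lock:
            self._slots[offset] = None


class SwapTables:
    """Routes (device, offset) lookups to the table owned by that device."""

    def __init__(self):
        self._devices = {}

    def add_device(self, dev_id, num_entries):
        self._devices[dev_id] = DeviceSwapTable(num_entries)

    def store(self, dev_id, offset, page):
        self._devices[dev_id].store(offset, page)

    def lookup(self, dev_id, offset):
        return self._devices[dev_id].lookup(offset)

    def clear(self, dev_id, offset):
        self._devices[dev_id].clear(offset)
```

Because the offset indexes directly into an array, a lookup costs one lock acquisition and one list access, with no hashing or tree walk.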
Step 3: Implement Flash-Friendly Swap Mechanisms
To make swap friendly to SSDs, modify the I/O path to issue TRIM commands and align writes to erase block boundaries. Follow these steps:
- Enable discard on the swap device if it is not already on (for example, with swapon --discard or the discard option in /etc/fstab) so that TRIM is sent after swap slots are freed.
- Apply the flash-friendly swap patchset that introduces write-combining and atomic swap slot allocation.
- Configure per-device parameters via sysfs: /sys/block/<device>/queue/discard_max_bytes and /sys/kernel/mm/swap_flash_friendly.
- Test with a benchmark that generates heavy swap traffic (e.g., stress --vm 1 --vm-bytes 2G --vm-keep) and monitor SSD write amplification using vendor tools or smartctl.
These changes ensure that swap operations do not prematurely wear out flash storage.
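The arithmetic behind erase-block alignment and write amplification can be sketched as follows. The erase-block size is an assumption you must obtain for your drive, and the host and NAND byte counts come from vendor-specific SMART attributes whose names vary. The helper names are illustrative only.

```python
def align_up(offset, erase_block):
    """Round offset up to the next multiple of erase_block (in bytes)."""
    if erase_block <= 0:
        raise ValueError("erase_block must be positive")
    return -(-offset // erase_block) * erase_block


def split_aligned(start, length, erase_block):
    """Split a byte range into (start, length) chunks that never cross
    an erase-block boundary."""
    if erase_block <= 0:
        raise ValueError("erase_block must be positive")
    chunks = []
    end = start + length
    while start < end:
        boundary = (start // erase_block + 1) * erase_block
        chunk_end = min(boundary, end)
        chunks.append((start, chunk_end - start))
        start = chunk_end
    return chunks


def write_amplification(nand_bytes, host_bytes):
    """Write amplification factor: bytes written to flash per host byte."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return nand_bytes / host_bytes
```

A factor close to 1.0 means the drive writes little more than the host asked for. Comparing the factor before and after the flash-friendly changes, under the same workload, shows whether they help.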
Step 4: Leverage Swap_Ops for Custom Behaviors
The swap_ops API allows each swap device to define its own read, write, and discard operations. Implementation steps:
- Review the existing struct swap_operations in include/linux/swapops.h.
- Write a custom swap_ops implementation for your device (e.g., one that prioritizes compression or encryption).
- Register the ops when the swap device is activated using swap_set_ops().
- Modify the swap activation code in mm/swapfile.c to call swap_set_ops before enabling the device.
- Test that custom operations are invoked by checking kernel debug logs.
This flexibility was a key focus of the summit's joint storage track session.
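The per-device dispatch pattern can be modeled in userspace before writing kernel code. The sketch below is a Python analogy, not the kernel API: SwapOps stands in for a table of read, write, and discard callbacks, SwapDevice.set_ops mirrors the register-before-enable ordering described above, and the compressing backend is a hypothetical example that uses zlib.

```python
import zlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SwapOps:
    """Hypothetical per-device operations table."""
    read: Callable[[int], bytes]
    write: Callable[[int, bytes], None]
    discard: Callable[[int], None]


def make_compressing_ops(store):
    """Build ops that keep zlib-compressed pages in the given dict."""
    def write(slot, data):
        store[slot] = zlib.compress(data)

    def read(slot):
        return zlib.decompress(store[slot])

    def discard(slot):
        store.pop(slot, None)

    return SwapOps(read=read, write=write, discard=discard)


class SwapDevice:
    """Toy device that requires ops to be registered before enabling."""

    def __init__(self, name):
        self.name = name
        self.ops = None
        self.enabled = False

    def set_ops(self, ops):
        if self.enabled:
            raise RuntimeError("ops must be set before the device is enabled")
        self.ops = ops

    def enable(self):
        if self.ops is None:
            raise RuntimeError("no ops registered for " + self.name)
        self.enabled = True
```

Keeping the callbacks in one immutable table makes it easy to swap an entire backend (plain, compressed, encrypted) without touching the callers, which is the property the per-device API is meant to provide.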
Step 5: Test and Benchmark Performance
Once changes are in place, measure improvements:
- Use perf stat -e 'page-faults,swap:swap_*' to gather swap-specific events (quote the pattern so the shell does not expand it; perf list shows which tracepoints your kernel actually provides).
- Run a realistic workload (e.g., compiling the kernel with limited RAM) and record swap utilization with vmstat 1.
- Compare throughput and latency between the old and new implementations on both HDD and SSD.
- Document results, paying attention to reduction in write amplification and lock contention.
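To turn the vmstat recordings from the steps above into comparable numbers, the captured output can be summarized with a short script. This is a minimal sketch that assumes the usual vmstat layout, in which a header row names the si (swap-in) and so (swap-out) columns. It locates the columns from that header instead of hard-coding positions, since the column set differs between procps versions.

```python
def parse_vmstat_swap(output):
    """Extract (si, so) pairs from captured 'vmstat 1' output."""
    columns = None
    samples = []
    for line in output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            # Header row: remember where the swap columns are.
            columns = (fields.index("si"), fields.index("so"))
            continue
        if columns is None or not fields:
            continue
        if not all(field.isdigit() for field in fields):
            continue  # banner line or other non-data row
        if len(fields) <= max(columns):
            continue
        samples.append((int(fields[columns[0]]), int(fields[columns[1]])))
    return samples


def mean_swap_rates(samples):
    """Average swap-in and swap-out rates, in vmstat's reporting units."""
    if not samples:
        return (0.0, 0.0)
    count = len(samples)
    return (
        sum(si for si, _ in samples) / count,
        sum(so for _, so in samples) / count,
    )
```

Running the same workload on the old and new kernels and comparing the two means gives a first-order view of the change. The first vmstat row reports averages since boot, so consider dropping it before averaging.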
Step 6: Integrate and Submit Patches (Optional)
If you intend to contribute back to the community, ensure your code follows kernel coding style. Prepare patches for the swap tables, flash-friendly swap, and swap_ops changes. Submit them to the linux-mm mailing list. Include benchmark data from Step 5 to demonstrate benefits.
Tips for Success
- Focus on maintainability: The sessions emphasized that new code must be clean and well-commented to avoid regressions.
- Test on multiple storage types: HDD behavior differs from SSD; ensure your changes work well on both.
- Consider power loss: Flash-friendly swap should not sacrifice data integrity during sudden power-offs.
- Monitor I/O patterns: Use blktrace to verify that TRIM and write-combining are actually happening.
- Engage with the community: The summit discussions showed that collaboration between memory management and storage developers is essential for optimal results.
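For the blktrace tip above, a small script can count discards and measure write sizes in text produced by blkparse. This sketch assumes blkparse's default line layout (device, CPU, sequence, timestamp, PID, action, RWBS flags, sector, "+", sector count) and that a D in the RWBS field marks a discard. Check both against your blkparse version before relying on the numbers.

```python
def summarize_blkparse(output, action="D"):
    """Count discards and writes for one trace action.

    The default action "D" selects requests issued to the driver.
    """
    stats = {"discards": 0, "writes": 0, "write_sectors": 0}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 10 or fields[5] != action:
            continue
        if fields[8] != "+" or not fields[9].isdigit():
            continue
        rwbs = fields[6]
        if "D" in rwbs:
            stats["discards"] += 1
        elif "W" in rwbs:
            stats["writes"] += 1
            stats["write_sectors"] += int(fields[9])
    return stats


def mean_write_sectors(stats):
    """Average sectors per write; larger values suggest write-combining."""
    if stats["writes"] == 0:
        return 0.0
    return stats["write_sectors"] / stats["writes"]
```

A nonzero discard count confirms that TRIM requests reach the device, and a rising average write size after the changes is consistent with write-combining taking effect.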
By following these steps, you can bring the latest swap improvements from the LSFMM+BPF summit into your own systems, achieving better performance and longer SSD life.