Protect the working set under memory pressure to prevent thrashing, avoid high latency and prevent livelock in near-OOM conditions

The kernel does not provide a way to protect the working set under memory
pressure. A certain amount of anonymous and clean file pages is required by
the userspace for normal operation. First of all, the userspace needs a
cache of shared libraries and executable binaries. If the amount of the
clean file pages falls below a certain level, then thrashing and even
livelock can take place.

The patch provides sysctl knobs for protecting the working set (anonymous
and clean file pages) under memory pressure.

The vm.anon_min_kbytes sysctl knob provides hard protection of anonymous
pages. The anonymous pages on the current node won't be reclaimed under any
conditions when their amount is below vm.anon_min_kbytes. This knob may be
used to prevent excessive swap thrashing when anonymous memory is low (for
example, when memory is going to be overfilled by compressed data of zram
module).

The vm.clean_low_kbytes sysctl knob provides best-effort protection of
clean file pages. The file pages on the current node won't be reclaimed
under memory pressure when the amount of clean file pages is below
vm.clean_low_kbytes unless we threaten to OOM. Protection of clean file
pages using this knob may be used when swapping is still possible to:

- Prevent disk I/O thrashing under memory pressure.
- Improve performance in disk cache-bound tasks under memory pressure.

The vm.clean_min_kbytes sysctl knob provides hard protection of clean file
pages. The file pages on the current node won't be reclaimed under memory
pressure when the amount of clean file pages is below vm.clean_min_kbytes.
Hard protection of clean file pages using this knob may be used to:

- Prevent disk I/O thrashing under memory pressure even with no free swap
  space.
- Improve performance in disk cache-bound tasks under memory pressure.
- Avoid high latency and prevent livelock in near-OOM conditions.

The le9ec patches provide three sysctl knobs (vm.anon_min_kbytes,
vm.clean_low_kbytes, vm.clean_min_kbytes). All three default to zero, so
the working set is not protected by default (CONFIG_ANON_MIN_KBYTES=0,
CONFIG_CLEAN_LOW_KBYTES=0, CONFIG_CLEAN_MIN_KBYTES=0). You can specify
other values during kernel build, or change the knob values on the fly.

Effects

- Improving system responsiveness under low-memory conditions.
- Improving performance in I/O-bound tasks under memory pressure.
- OOM killer comes faster (with hard protection).
- Fast system reclaiming after OOM (with hard protection).

Note that the effects depend on the values of the sysctl tunables.

source patch: https://github.com/hakavlad/le9-patch

Signed-off-by: kanonifyX <kanonify01@gmail.com>
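
The protection rules described in the patch description can be modelled
in a few lines. The sketch below is a userspace illustration only, not
the kernel code from the patch: the Knobs class, both function names and
the near_oom flag are hypothetical, and the per-node accounting that the
kernel does is reduced to plain integers in kilobytes.

```python
from dataclasses import dataclass


@dataclass
class Knobs:
    """Values of the three sysctl knobs in kilobytes (0 = no protection)."""
    anon_min_kbytes: int = 0
    clean_low_kbytes: int = 0
    clean_min_kbytes: int = 0


def may_reclaim_anon(anon_kb: int, knobs: Knobs) -> bool:
    """Hard protection: anonymous pages below the minimum are never reclaimed."""
    return anon_kb >= knobs.anon_min_kbytes


def may_reclaim_file(clean_file_kb: int, knobs: Knobs, near_oom: bool) -> bool:
    """Apply the hard floor (clean_min) first, then the best-effort floor (clean_low)."""
    if clean_file_kb < knobs.clean_min_kbytes:
        return False      # hard protection, holds even when close to OOM
    if clean_file_kb < knobs.clean_low_kbytes:
        return near_oom   # best-effort protection, given up when close to OOM
    return True
```

With all knobs left at zero, both functions always return True, which
matches the default of not protecting the working set.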
Documentation for /proc/sys/            kernel version 2.2.10
        (c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>

'Why', I hear you ask, 'would anyone even _want_ documentation
for them sysctl files? If anybody really needs it, it's all in
the source...'

Well, this documentation is written because some people either
don't know they need to tweak something, or because they don't
have the time or knowledge to read the source code.

Furthermore, the programmers who built sysctl have built it to
be actually used, not just for the fun of programming it :-)

==============================================================

Legal blurb:

As usual, there are two main things to consider:
1. you get what you pay for
2. it's free

The consequences are that I won't guarantee the correctness of
this document, and if you come to me complaining about how you
screwed up your system because of wrong documentation, I won't
feel sorry for you. I might even laugh at you...

But of course, if you _do_ manage to screw up your system using
only the sysctl options used in this file, I'd like to hear of
it. Not only to have a great laugh, but also to make sure that
you're the last RTFMing person to screw up.

In short, e-mail your suggestions, corrections and / or horror
stories to: <riel@nl.linux.org>

Rik van Riel.

==============================================================

Introduction:

Sysctl is a means of configuring certain aspects of the kernel
at run-time, and the /proc/sys/ directory is there so that you
don't even need special tools to do it!
In fact, there are only four things needed to use these config
facilities:
- a running Linux system
- root access
- common sense (this is especially hard to come by these days)
- knowledge of what all those values mean

As a quick 'ls /proc/sys' will show, the directory consists of
several (arch-dependent?) subdirs. Each subdir is mainly about
one part of the kernel, so you can do configuration on a piece
by piece basis, or just some 'thematic frobbing'.
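
Since no special tools are needed, a sysctl can be read or set with
ordinary file I/O. The following is a minimal sketch of that idea; the
three helper names are made up for this example, and the mapping from a
dotted name to a path is deliberately naive (a name with a dot inside a
single component, as some network interface names have, is not handled).

```python
import os


def sysctl_path(name: str, root: str = "/proc/sys") -> str:
    """Map a dotted sysctl name such as 'vm.swappiness' to its file path."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys") -> str:
    """Return the current value of a sysctl as a stripped string."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()


def write_sysctl(name: str, value, root: str = "/proc/sys") -> None:
    """Set a sysctl; on a real system this needs root access."""
    with open(sysctl_path(name, root), "w") as f:
        f.write(f"{value}\n")
```

For instance, read_sysctl("kernel.ostype") reads /proc/sys/kernel/ostype,
and on a kernel carrying the patch described earlier something like
write_sysctl("vm.clean_low_kbytes", 150000) would change that knob on
the fly (the value here is arbitrary, not a recommendation). The root
parameter exists only so the helpers can be tried against a scratch
directory instead of the live system.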

The subdirs are about:
abi/            execution domains & personalities
debug/          <empty>
dev/            device specific information (eg dev/cdrom/info)
fs/             specific filesystems
                filehandle, inode, dentry and quota tuning
                binfmt_misc <Documentation/binfmt_misc.txt>
kernel/         global kernel info / tuning
                miscellaneous stuff
net/            networking stuff, for documentation look in:
                <Documentation/networking/>
proc/           <empty>
sunrpc/         SUN Remote Procedure Call (NFS)
vm/             memory management tuning
                buffer and cache management
user/           Per user per user namespace limits

These are the subdirs I have on my system. There might be more
or other subdirs in another setup. If you see another dir, I'd
really like to hear about it :-)
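
To check whether a given system has subdirs beyond the ones listed
above, something along these lines may help. It is only a sketch: the
DOCUMENTED set is copied by hand from the list above, and the function
names are invented for this example.

```python
import os

# Subdirectory names taken from the list above.
DOCUMENTED = {"abi", "debug", "dev", "fs", "kernel", "net",
              "proc", "sunrpc", "vm", "user"}


def list_subdirs(root: str = "/proc/sys") -> set:
    """Return the names of the directories directly under root."""
    return {entry.name for entry in os.scandir(root) if entry.is_dir()}


def undocumented_subdirs(root: str = "/proc/sys") -> list:
    """Return, sorted, the subdirs under root that the list above does not name."""
    return sorted(list_subdirs(root) - DOCUMENTED)
```

Calling undocumented_subdirs() with no argument inspects the live
/proc/sys; passing another directory makes it easy to try out without
touching the real tree.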