Files
kernel_google_wahoo/Documentation
Daniel Colascione ca4760396a Add /proc/pid/smaps_rollup
/proc/pid/smaps_rollup is a new proc file that improves the
performance of user programs that determine aggregate memory
statistics (e.g., total PSS) of a process.

Android regularly "samples" the memory usage of various processes in
order to balance its memory pool sizes. This sampling process involves
opening /proc/pid/smaps and summing certain fields. For very large
processes, sampling memory use this way can take several hundred
milliseconds, due mostly to the overhead of the seq_printf calls in
task_mmu.c.

smaps_rollup improves the situation. It contains most of the fields of
/proc/pid/smaps, but instead of a set of fields for each VMA,
smaps_rollup instead contains one synthetic smaps-format entry
representing the whole process. In the single smaps_rollup synthetic
entry, each field is the summation of the corresponding field in all
of the real-smaps VMAs. Using a common format for smaps_rollup and
smaps allows userspace parsers to repurpose parsers meant for use with
non-rollup smaps for smaps_rollup, and it allows userspace to switch
between smaps_rollup and smaps at runtime (say, based on the
availability of smaps_rollup in a given kernel) with minimal fuss.

By using smaps_rollup instead of smaps, a caller can avoid the
significant overhead of formatting, reading, and parsing each of a
large process's potentially very numerous memory mappings. For
sampling system_server's PSS in Android, we measured a 12x speedup,
representing a savings of several hundred milliseconds.

One alternative to a new per-process proc file would have been
including PSS information in /proc/pid/status. We considered this
option but thought that PSS would be too expensive (by a few orders of
magnitude) to collect relative to what's already emitted as part of
/proc/pid/status, and slowing every user of /proc/pid/status for the
sake of readers that happen to want PSS feels wrong.

The code itself works by reusing the existing VMA-walking framework we
use for regular smaps generation and keeping the mem_size_stats
structure around between VMA walks instead of using a fresh one for
each VMA.  In this way, summation happens automatically.  We let
seq_file walk over the VMAs just as it does for regular smaps and just
emit nothing to the seq_file until we hit the last VMA.

Benchmarks:

    using smaps:
    iterations:1000 pid:1163 pss:220023808
    0m29.46s real 0m08.28s user 0m20.98s system

    using smaps_rollup:
    iterations:1000 pid:1163 pss:220702720
    0m04.39s real 0m00.03s user 0m04.31s system

Signed-off-by: Daniel Colascione <dancol@google.com>

Bug: 63439864
Change-Id: I4af852034c014f9d606060649207f8a1eb3d2702
2017-08-17 14:57:46 -07:00
..
2017-08-17 14:57:46 -07:00
2015-09-24 15:39:42 -06:00
2015-08-28 22:59:36 -04:00
2015-10-27 18:55:31 -07:00
2015-06-08 16:43:09 -06:00
2017-05-22 23:19:33 -07:00
2015-06-08 23:15:45 +02:00
2015-07-14 12:35:42 -06:00
2015-06-23 06:49:44 -06:00
2016-02-16 13:53:25 -08:00