author: Colin Cross <ccross@google.com> 2020-07-27 12:40:27 +0530
committer: Sumit Semwal <sumit.semwal@linaro.org> 2020-08-31 19:16:27 +0530
commit: 9ed5101f8718abf7da005dc00702b3af04839ee9
tree: 823c185fad16d6fa461e1d84dd7beb7cf50515ed /include/linux/mm_types.h
parent: 8b0478a367a1738080b17f9de2970b1bc6420444
mm: add a field to store names for private anonymous memory [tmp/p3-vma-upstr]

In many userspace applications, and especially in the VM-based applications that Android uses heavily, there are multiple different allocators in use. At a minimum there is libc malloc and the stack, and in many cases there are libc malloc, the stack, direct syscalls to mmap anonymous memory, and multiple VM heaps (one for small objects, one for big objects, etc.). Each of these layers usually has its own tools to inspect its usage: malloc by compiling a debug version, the VM through heap inspection tools, and for direct syscalls there is usually no way to track them.

On Android we heavily use a set of tools that use an extended version of the logic covered in Documentation/vm/pagemap.txt to walk all pages mapped in userspace and slice their usage by process, shared (COW) vs. unique mappings, backing, etc. This can account for real physical memory usage even in cases like fork without exec (which Android uses heavily to share as many private COW pages as possible between processes), Kernel SamePage Merging, and clean zero pages. It produces a measurement of the pages that only exist in that process (USS, for unique), and a measurement of the physical memory usage of that process with the cost of shared pages being evenly split between processes that share them (PSS).

If all anonymous memory is indistinguishable, then figuring out the real physical memory usage (PSS) of each heap requires either a pagemap walking tool that can understand the heap debugging of every layer, or for every layer's heap debugging tools to implement the pagemap walking logic, in which case it is hard to get a consistent view of memory across the whole system.

Tracking the information in userspace leads to all sorts of problems. It either needs to be stored inside the process, which means every process has to have an API to export its current heap information upon request, or it has to be stored externally in a filesystem that somebody needs to clean up on crashes. It needs to be readable while the process is still running, so it has to have some sort of synchronization with every layer of userspace. Efficiently tracking the ranges requires reimplementing something like the kernel vma trees, and linking to it from every layer of userspace. It requires more memory, more syscalls, more runtime cost, and more complexity to separately track regions that the kernel is already tracking.

This patch adds a field to /proc/pid/maps and /proc/pid/smaps to show a userspace-provided name for anonymous vmas. The names of named anonymous vmas are shown in /proc/pid/maps and /proc/pid/smaps as [anon:<name>].

Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);
Setting the name to NULL clears it.

The name is stored in a user pointer in the shared union in vm_area_struct that points to a null terminated string inside the user process. vmas that point to the same address and are otherwise mergeable will be merged, but vmas that point to equivalent strings at different addresses will not be merged.

The idea to store a userspace pointer to reduce the complexity within mm (at the expense of the complexity of reading /proc/pid/mem) came from Dave Hansen. This results in no runtime overhead in the mm subsystem other than comparing the anon_name pointers when considering vma merging. The pointer is stored in a union with fields that are only used on file-backed mappings, so it does not increase memory usage.

(Upstream changed to remove the union, so this patch adds it back as well.)

Signed-off-by: Colin Cross <ccross@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jan Glauber <jan.glauber@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Rob Landley <rob@landley.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Change-Id: I0213a267e1b98b2932fd74ea8c63a50f0c760797
---
v2: updates the commit message to explain in more detail why the patch is useful.
v3: renames vma_get_anon_name to vma_anon_name
    replaces logic in seq_print_vma_name with access_process_vm
    removes Name: entry from smaps, it's already on the header line
    changes the prctl option number to match what is currently in use on Android
v4: adds paragraph to commit log on why this is better than tracking in userspace
    squashes fixes from Andrew Morton to fix build error and warning
    fixes build error reported by Mark Salter when !CONFIG_MMU
v5: rebased to v5.9-rc1, added minor fixes to match upstream
v6: rebased to v5.9-rc3, and addressed review comments:
    - added missing callers in fs/userfaultfd.c
    - simplified the union
    - use the new access_remote_vm_locked() in show_map_vma() since that already holds mmap_lock

Diffstat (limited to 'include/linux/mm_types.h')
-rw-r--r-- include/linux/mm_types.h | 25
1 files changed, 21 insertions, 4 deletions
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 496c3ff97cce..f7d54ae487e6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -336,11 +336,19 @@ struct vm_area_struct {
 	/*
 	 * For areas with an address space and backing store,
 	 * linkage into the address_space->i_mmap interval tree.
+	 *
+	 * For private anonymous mappings, a pointer to a null terminated string
+	 * in the user process containing the name given to the vma, or NULL
+	 * if unnamed.
 	 */
-	struct {
-		struct rb_node rb;
-		unsigned long rb_subtree_last;
-	} shared;
+
+	union {
+		struct {
+			struct rb_node rb;
+			unsigned long rb_subtree_last;
+		} shared;
+		const char __user *anon_name;
+	};
 
 	/*
 	 * A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma
@@ -772,4 +780,13 @@ typedef struct {
 	unsigned long val;
 } swp_entry_t;
 
+/* Return the name for an anonymous mapping or NULL for a file-backed mapping */
+static inline const char __user *vma_anon_name(struct vm_area_struct *vma)
+{
+	if (vma->vm_file)
+		return NULL;
+
+	return vma->anon_name;
+}
+
 #endif /* _LINUX_MM_TYPES_H */
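
The effect of the union and of vma_anon_name() can be modelled in plain userspace C. The sketch below is not kernel code: struct rb_node_model, struct vma_model, vma_anon_name_model() and names_mergeable() are hypothetical stand-ins, and the merge check is reduced to the name comparison alone. It aims to show two points from the commit message: the name pointer shares storage with the file-backed linkage, so the struct does not grow, and names are compared by address, not by string contents.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rb_node (assumption). */
struct rb_node_model {
    void *parent, *left, *right;
};

/* Simplified stand-in for the relevant part of vm_area_struct. */
struct vma_model {
    void *vm_file; /* non-NULL for file-backed mappings */
    union {
        struct {
            struct rb_node_model rb;
            unsigned long rb_subtree_last;
        } shared;              /* meaningful only when vm_file != NULL */
        const char *anon_name; /* meaningful only when vm_file == NULL */
    };
};

/* Mirrors vma_anon_name(): file-backed mappings never report a name. */
static const char *vma_anon_name_model(const struct vma_model *vma)
{
    if (vma->vm_file)
        return NULL;
    return vma->anon_name;
}

/*
 * Name part of a merge decision, reduced to its core: the pointers are
 * compared, so equal strings stored at different addresses do not match.
 */
static int names_mergeable(const struct vma_model *a,
                           const struct vma_model *b)
{
    return vma_anon_name_model(a) == vma_anon_name_model(b);
}
```

In this model, two vmas named through the same pointer are mergeable as far as the name is concerned, while two vmas whose names are separate copies of the same text are not. That matches the behaviour described in the commit message for strings at different addresses.
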