summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2013-04-29memcg: add memory.pressure_level eventsAnton Vorontsov
With this patch userland applications that want to maintain the interactivity/memory allocation cost can use the pressure level notifications. The levels are defined like this: The "low" level means that the system is reclaiming memory for new allocations. Monitoring this reclaiming activity might be useful for maintaining cache level. Upon notification, the program (typically "Activity Manager") might analyze vmstat and act in advance (i.e. prematurely shutdown unimportant services). The "medium" level means that the system is experiencing medium memory pressure, the system might be making swap, paging out active file caches, etc. Upon this event applications may decide to further analyze vmstat/zoneinfo/memcg or internal memory usage statistics and free any resources that can be easily reconstructed or re-read from a disk. The "critical" level means that the system is actively thrashing, it is about to out of memory (OOM) or even the in-kernel OOM killer is on its way to trigger. Applications should do whatever they can to help the system. It might be too late to consult with vmstat or any other statistics, so it's advisable to take an immediate action. The events are propagated upward until the event is handled, i.e. the events are not pass-through. Here is what this means: for example you have three cgroups: A->B->C. Now you set up an event listener on cgroups A, B and C, and suppose group C experiences some pressure. In this situation, only group C will receive the notification, i.e. groups A and B will not receive it. This is done to avoid excessive "broadcasting" of messages, which disturbs the system and which is especially bad if we are low on memory or thrashing. So, organize the cgroups wisely, or propagate the events manually (or, ask us to implement the pass-through events, explaining why would you need them.) Performance wise, the memory pressure notifications feature itself is lightweight and does not require much of bookkeeping, in contrast to the rest of memcg features. Unfortunately, as of current memcg implementation, pages accounting is an inseparable part and cannot be turned off. The good news is that there are some efforts[1] to improve the situation; plus, implementing the same, fully API-compatible[2] interface for CONFIG_MEMCG=n case (e.g. embedded) is also a viable option, so it will not require any changes on the userland side. [1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291 [2] http://lkml.org/lkml/2013/2/21/454 [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix CONFIG_CGROPUPS=n warnings] Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Glauber Costa <glommer@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Leonid Moiseichuk <leonid.moiseichuk@nokia.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, hotplug: avoid compiling memory hotremove functions when disabledDavid Rientjes
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC pseries will return -EOPNOTSUPP if unsupported. Adding an #ifdef causes several other functions it depends on to also become unnecessary, which saves in .text when disabled (it's disabled in most defconfigs besides powerpc, including x86). remove_memory_block() becomes static since it is not referenced outside of drivers/base/memory.c. Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled and disabled. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29resource: add release_mem_region_adjustable()Toshi Kani
Add release_mem_region_adjustable(), which releases a requested region from a currently busy memory resource. This interface adjusts the matched memory resource accordingly even if the requested region does not match exactly but still fits into. This new interface is intended for memory hot-delete. During bootup, memory resources are inserted from the boot descriptor table, such as EFI Memory Table and e820. Each memory resource entry usually covers the whole contigous memory range. Memory hot-delete request, on the other hand, may target to a particular range of memory resource, and its size can be much smaller than the whole contiguous memory. Since the existing release interfaces like __release_region() require a requested region to be exactly matched to a resource entry, they do not allow a partial resource to be released. This new interface is restrictive (i.e. release under certain conditions), which is consistent with other release interfaces, __release_region() and __release_resource(). Additional release conditions, such as an overlapping region to a resource entry, can be supported after they are confirmed as valid cases. There is no change to the existing interfaces since their restriction is valid for I/O resources. [akpm@linux-foundation.org: use GFP_ATOMIC under write_lock()] [akpm@linux-foundation.org: switch back to GFP_KERNEL, less buggily] [akpm@linux-foundation.org: remove unneeded and wrong kfree(), per Toshi] Signed-off-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by : Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Cc: T Makphaibulchoke <tmac@hp.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: remove CONFIG_HOTPLUG ifdefsYijing Wang
CONFIG_HOTPLUG is going away as an option, cleanup CONFIG_HOTPLUG ifdefs in mm files. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: replace hardcoded 3% with admin_reserve_pages knobAndrew Shewmaker
Add an admin_reserve_kbytes knob to allow admins to change the hardcoded memory reserve to something other than 3%, which may be multiple gigabytes on large memory systems. Only about 8MB is necessary to enable recovery in the default mode, and only a few hundred MB are required even when overcommit is disabled. This affects OVERCOMMIT_GUESS and OVERCOMMIT_NEVER. admin_reserve_kbytes is initialized to min(3% free pages, 8MB) I arrived at 8MB by summing the RSS of sshd or login, bash, and top. Please see first patch in this series for full background, motivation, testing, and full changelog. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: make init_admin_reserve() static] Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: limit growth of 3% hardcoded other user reserveAndrew Shewmaker
Add user_reserve_kbytes knob. Limit the growth of the memory reserved for other user processes to min(3% current process size, user_reserve_pages). Only about 8MB is necessary to enable recovery in the default mode, and only a few hundred MB are required even when overcommit is disabled. user_reserve_pages defaults to min(3% free pages, 128MB) I arrived at 128MB by taking the max VSZ of sshd, login, bash, and top ... then adding the RSS of each. This only affects OVERCOMMIT_NEVER mode. Background 1. user reserve __vm_enough_memory reserves a hardcoded 3% of the current process size for other applications when overcommit is disabled. This was done so that a user could recover if they launched a memory hogging process. Without the reserve, a user would easily run into a message such as: bash: fork: Cannot allocate memory 2. admin reserve Additionally, a hardcoded 3% of free memory is reserved for root in both overcommit 'guess' and 'never' modes. This was intended to prevent a scenario where root-cant-log-in and perform recovery operations. Note that this reserve shrinks, and doesn't guarantee a useful reserve. Motivation The two hardcoded memory reserves should be updated to account for current memory sizes. Also, the admin reserve would be more useful if it didn't shrink too much. When the current code was originally written, 1GB was considered "enterprise". Now the 3% reserve can grow to multiple GB on large memory systems, and it only needs to be a few hundred MB at most to enable a user or admin to recover a system with an unwanted memory hogging process. I've found that reducing these reserves is especially beneficial for a specific type of application load: * single application system * one or few processes (e.g. one per core) * allocating all available memory * not initializing every page immediately * long running I've run scientific clusters with this sort of load. A long running job sometimes failed many hours (weeks of CPU time) into a calculation. They weren't initializing all of their memory immediately, and they weren't using calloc, so I put systems into overcommit 'never' mode. These clusters run diskless and have no swap. However, with the current reserves, a user wishing to allocate as much memory as possible to one process may be prevented from using, for example, almost 2GB out of 32GB. The effect is less, but still significant when a user starts a job with one process per core. I have repeatedly seen a set of processes requesting the same amount of memory fail because one of them could not allocate the amount of memory a user would expect to be able to allocate. For example, Message Passing Interfce (MPI) processes, one per core. And it is similar for other parallel programming frameworks. Changing this reserve code will make the overcommit never mode more useful by allowing applications to allocate nearly all of the available memory. Also, the new admin_reserve_kbytes will be safer than the current behavior since the hardcoded 3% of available memory reserve can shrink to something useless in the case where applications have grabbed all available memory. Risks * "bash: fork: Cannot allocate memory" The downside of the first patch-- which creates a tunable user reserve that is only used in overcommit 'never' mode--is that an admin can set it so low that a user may not be able to kill their process, even if they already have a shell prompt. Of course, a user can get in the same predicament with the current 3% reserve--they just have to launch processes until 3% becomes negligible. * root-cant-log-in problem The second patch, adding the tunable rootuser_reserve_pages, allows the admin to shoot themselves in the foot by setting it too small. They can easily get the system into a state where root-can't-log-in. However, the new admin_reserve_kbytes will be safer than the current behavior since the hardcoded 3% of available memory reserve can shrink to something useless in the case where applications have grabbed all available memory. Alternatives * Memory cgroups provide a more flexible way to limit application memory. Not everyone wants to set up cgroups or deal with their overhead. * We could create a fourth overcommit mode which provides smaller reserves. The size of useful reserves may be drastically different depending on the whether the system is embedded or enterprise. * Force users to initialize all of their memory or use calloc. Some users don't want/expect the system to overcommit when they malloc. Overcommit 'never' mode is for this scenario, and it should work well. The new user and admin reserve tunables are simple to use, with low overhead compared to cgroups. The patches preserve current behavior where 3% of memory is less than 128MB, except that the admin reserve doesn't shrink to an unusable size under pressure. The code allows admins to tune for embedded and enterprise usage. FAQ * How is the root-cant-login problem addressed? What happens if admin_reserve_pages is set to 0? Root is free to shoot themselves in the foot by setting admin_reserve_kbytes too low. On x86_64, the minimum useful reserve is: 8MB for overcommit 'guess' 128MB for overcommit 'never' admin_reserve_pages defaults to min(3% free memory, 8MB) So, anyone switching to 'never' mode needs to adjust admin_reserve_pages. * How do you calculate a minimum useful reserve? A user or the admin needs enough memory to login and perform recovery operations, which includes, at a minimum: sshd or login + bash (or some other shell) + top (or ps, kill, etc.) For overcommit 'guess', we can sum resident set sizes (RSS) because we only need enough memory to handle what the recovery programs will typically use. On x86_64 this is about 8MB. For overcommit 'never', we can take the max of their virtual sizes (VSZ) and add the sum of their RSS. We use VSZ instead of RSS because mode forces us to ensure we can fulfill all of the requested memory allocations-- even if the programs only use a fraction of what they ask for. On x86_64 this is about 128MB. When swap is enabled, reserves are useful even when they are as small as 10MB, regardless of overcommit mode. When both swap and overcommit are disabled, then the admin should tune the reserves higher to be absolutley safe. Over 230MB each was safest in my testing. * What happens if user_reserve_pages is set to 0? Note, this only affects overcomitt 'never' mode. Then a user will be able to allocate all available memory minus admin_reserve_kbytes. However, they will easily see a message such as: "bash: fork: Cannot allocate memory" And they won't be able to recover/kill their application. The admin should be able to recover the system if admin_reserve_kbytes is set appropriately. * What's the difference between overcommit 'guess' and 'never'? "Guess" allows an allocation if there are enough free + reclaimable pages. It has a hardcoded 3% of free pages reserved for root. "Never" allows an allocation if there is enough swap + a configurable percentage (default is 50) of physical RAM. It has a hardcoded 3% of free pages reserved for root, like "Guess" mode. It also has a hardcoded 3% of the current process size reserved for additional applications. * Why is overcommit 'guess' not suitable even when an app eventually writes to every page? It takes free pages, file pages, available swap pages, reclaimable slab pages into consideration. In other words, these are all pages available, then why isn't overcommit suitable? Because it only looks at the present state of the system. It does not take into account the memory that other applications have malloced, but haven't initialized yet. It overcommits the system. Test Summary There was little change in behavior in the default overcommit 'guess' mode with swap enabled before and after the patch. This was expected. Systems run most predictably (i.e. no oom kills) in overcommit 'never' mode with swap enabled. This also allowed the most memory to be allocated to a user application. Overcommit 'guess' mode without swap is a bad idea. It is easy to crash the system. None of the other tested combinations crashed. This matches my experience on the Roadrunner supercomputer. Without the tunable user reserve, a system in overcommit 'never' mode and without swap does not allow the admin to recover, although the admin can. With the new tunable reserves, a system in overcommit 'never' mode and without swap can be configured to: 1. maximize user-allocatable memory, running close to the edge of recoverability 2. maximize recoverability, sacrificing allocatable memory to ensure that a user cannot take down a system Test Description Fedora 18 VM - 4 x86_64 cores, 5725MB RAM, 4GB Swap System is booted into multiuser console mode, with unnecessary services turned off. Caches were dropped before each test. Hogs are user memtester processes that attempt to allocate all free memory as reported by /proc/meminfo In overcommit 'never' mode, memory_ratio=100 Test Results 3.9.0-rc1-mm1 Overcommit | Swap | Hogs | MB Got/Wanted | OOMs | User Recovery | Admin Recovery ---------- ---- ---- ------------- ---- ------------- -------------- guess yes 1 5432/5432 no yes yes guess yes 4 5444/5444 1 yes yes guess no 1 5302/5449 no yes yes guess no 4 - crash no no never yes 1 5460/5460 1 yes yes never yes 4 5460/5460 1 yes yes never no 1 5218/5432 no no yes never no 4 5203/5448 no no yes 3.9.0-rc1-mm1-tunablereserves User and Admin Recovery show their respective reserves, if applicable. Overcommit | Swap | Hogs | MB Got/Wanted | OOMs | User Recovery | Admin Recovery ---------- ---- ---- ------------- ---- ------------- -------------- guess yes 1 5419/5419 no - yes 8MB yes guess yes 4 5436/5436 1 - yes 8MB yes guess no 1 5440/5440 * - yes 8MB yes guess no 4 - crash - no 8MB no * process would successfully mlock, then the oom killer would pick it never yes 1 5446/5446 no 10MB yes 20MB yes never yes 4 5456/5456 no 10MB yes 20MB yes never no 1 5387/5429 no 128MB no 8MB barely never no 1 5323/5428 no 226MB barely 8MB barely never no 1 5323/5428 no 226MB barely 8MB barely never no 1 5359/5448 no 10MB no 10MB barely never no 1 5323/5428 no 0MB no 10MB barely never no 1 5332/5428 no 0MB no 50MB yes never no 1 5293/5429 no 0MB no 90MB yes never no 1 5001/5427 no 230MB yes 338MB yes never no 4* 4998/5424 no 230MB yes 338MB yes * more memtesters were launched, able to allocate approximately another 100MB Future Work - Test larger memory systems. - Test an embedded image. - Test other architectures. - Time malloc microbenchmarks. - Would it be useful to be able to set overcommit policy for each memory cgroup? - Some lines are slightly above 80 chars. Perhaps define a macro to convert between pages and kb? Other places in the kernel do this. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: make init_user_reserve() static] Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29include/linux/memory.h: implement register_hotmemory_notifier()Andrew Morton
When CONFIG_MEMORY_HOTPLUG=n, we don't want the memory-hotplug notifier handlers to be included in the .o files, for space reasons. The existing hotplug_memory_notifier() tries to handle this but testing with gcc-4.4.4 shows that it doesn't work - the hotplug functions are still present in the .o files. So implement a new register_hotmemory_notifier() which is a copy of register_hotcpu_notifier(), and which actually works as desired. hotplug_memory_notifier() and register_memory_notifier() callsites should be converted to use this new register_hotmemory_notifier(). While we're there, let's repair the existing hotplug_memory_notifier(): it simply stomps on the register_memory_notifier() return value, so well-behaved code cannot check for errors. Apparently non of the existing callers were well-behaved :( Cc: Andrew Shewmaker <agshew@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29page_alloc: make setup_nr_node_ids() usable for arch init codeCody P Schafer
powerpc and x86 were opencoding copies of setup_nr_node_ids(), which page_alloc provides but makes static. Make it avaliable to the archs in linux/mm.h. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29sparse-vmemmap: specify vmemmap population range in bytesJohannes Weiner
The sparse code, when asking the architecture to populate the vmemmap, specifies the section range as a starting page and a number of pages. This is an awkward interface, because none of the arch-specific code actually thinks of the range in terms of 'struct page' units and always translates it to bytes first. In addition, later patches mix huge page and regular page backing for the vmemmap. For this, they need to call vmemmap_populate_basepages() on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind. But these are not necessarily multiples of the 'struct page' size and so this unit is too coarse. Just translate the section range into bytes once in the generic sparse code, then pass byte ranges down the stack. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Bernhard Schmidt <Bernhard.Schmidt@lrz.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: David S. Miller <davem@davemloft.net> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, hugetlb: include hugepages in meminfoDavid Rientjes
Particularly in oom conditions, it's troublesome that hugetlb memory is not displayed. All other meminfo that is emitted will not add up to what is expected, and there is no artifact left in the kernel log to show that a potentially significant amount of memory is actually allocated as hugepages which are not available to be reclaimed. Booting with hugepages=8192 on the command line, this memory is now shown in oom conditions. For example, with echo m > /proc/sysrq-trigger: Node 0 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB Node 1 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB Node 2 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB Node 3 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: allow arch code to control the user page table ceilingHugh Dickins
On architectures where a pgd entry may be shared between user and kernel (e.g. ARM+LPAE), freeing page tables needs a ceiling other than 0. This patch introduces a generic USER_PGTABLES_CEILING that arch code can override. It is the responsibility of the arch code setting the ceiling to ensure the complete freeing of the page tables (usually in pgd_free()). [catalin.marinas@arm.com: commit log; shift_arg_pages(), asm-generic/pgtables.h changes] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: <stable@vger.kernel.org> [3.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29kexec, vmalloc: export additional vmalloc layer informationAtsushi Kumagai
Now, vmap_area_list is exported as VMCOREINFO for makedumpfile to get the start address of vmalloc region (vmalloc_start). The address which contains vmalloc_start value is represented as below: vmap_area_list.next - OFFSET(vmap_area.list) + OFFSET(vmap_area.va_start) However, both OFFSET(vmap_area.va_start) and OFFSET(vmap_area.list) aren't exported as VMCOREINFO. So this patch exports them externally with small cleanup. [akpm@linux-foundation.org: vmalloc.h should include list.h for list_head] Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, vmalloc: export vmap_area_list, instead of vmlistJoonsoo Kim
Although our intention is to unexport internal structure entirely, but there is one exception for kexec. kexec dumps address of vmlist and makedumpfile uses this information. We are about to remove vmlist, then another way to retrieve information of vmalloc layer is needed for makedumpfile. For this purpose, we export vmap_area_list, instead of vmlist. Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, vmalloc: move get_vmalloc_info() to vmalloc.cJoonsoo Kim
Now get_vmalloc_info() is in fs/proc/mmu.c. There is no reason that this code must be here and it's implementation needs vmlist_lock and it iterate a vmlist which may be internal data structure for vmalloc. It is preferable that vmlist_lock and vmlist is only used in vmalloc.c for maintainability. So move the code to vmalloc.c Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: make snapshotting pages for stable writes a per-bio operationDarrick J. Wong
Walking a bio's page mappings has proved problematic, so create a new bio flag to indicate that a bio's data needs to be snapshotted in order to guarantee stable pages during writeback. Next, for the one user (ext3/jbd) of snapshotting, hook all the places where writes can be initiated without PG_writeback set, and set BIO_SNAP_STABLE there. We must also flag journal "metadata" bios for stable writeout, since file data can be written through the journal. Finally, the MS_SNAP_STABLE mount flag (only used by ext3) is now superfluous, so get rid of it. [akpm@linux-foundation.org: rename _submit_bh()'s `flags' to `bio_flags', delobotomize the _submit_bh declaration] [akpm@linux-foundation.org: teeny cleanup] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Artem Bityutskiy <dedekind1@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm/hugetlb: add more arch-defined huge_pte functionsGerald Schaefer
Commit abf09bed3cce ("s390/mm: implement software dirty bits") introduced another difference in the pte layout vs. the pmd layout on s390, thoroughly breaking the s390 support for hugetlbfs. This requires replacing some more pte_xxx functions in mm/hugetlbfs.c with a huge_pte_xxx version. This patch introduces those huge_pte_xxx functions and their generic implementation in asm-generic/hugetlb.h, which will now be included on all architectures supporting hugetlbfs apart from s390. This change will be a no-op for those architectures. [akpm@linux-foundation.org: fix warning] Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <dhillf@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> [for !s390 parts] Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29fs: don't compile in drop_caches.c when CONFIG_SYSCTL=nJosh Triplett
drop_caches.c provides code only invokable via sysctl, so don't compile it in when CONFIG_SYSCTL=n. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29cgroup: remove css_get_nextMichal Hocko
Now that we have generic and well ordered cgroup tree walkers there is no need to keep css_get_next in the place. Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Ying Han <yinghan@google.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: introduce free_highmem_page() helper to free highmem pages into buddy systemJiang Liu
The original goal of this patchset is to fix the bug reported by https://bugzilla.kernel.org/show_bug.cgi?id=53501 Now it has also been expanded to reduce common code used by memory initializion. This is the second part, which applies to the previous part at: http://marc.info/?l=linux-mm&m=136289696323825&w=2 It introduces a helper function free_highmem_page() to free highmem pages into the buddy system when initializing mm subsystem. Introduction of free_highmem_page() is one step forward to clean up accesses and modificaitons of totalhigh_pages, totalram_pages and zone->managed_pages etc. I hope we could remove all references to totalhigh_pages from the arch/ subdirectory. We have only tested these patchset on x86 platforms, and have done basic compliation tests using cross-compilers from ftp.kernel.org. That means some code may not pass compilation on some architectures. So any help to test this patchset are welcomed! There are several other parts still under development: Part3: refine code to manage totalram_pages, totalhigh_pages and zone->managed_pages Part4: introduce helper functions to simplify mem_init() and remove the global variable num_physpages. This patch: Introduce helper function free_highmem_page(), which will be used by architectures with HIGHMEM enabled to free highmem pages into the buddy system. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Suzuki K. Poulose" <suzuki@in.ibm.com> Cc: Alexander Graf <agraf@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Attilio Rao <attilio.rao@citrix.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Cong Wang <amwang@redhat.com> Cc: David Daney <david.daney@cavium.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rik van Riel <riel@redhat.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: introduce common help functions to deal with reserved/managed pagesJiang Liu
The original goal of this patchset is to fix the bug reported by https://bugzilla.kernel.org/show_bug.cgi?id=53501 Now it has also been expanded to reduce common code used by memory initializion. This is the first part, which applies to v3.9-rc1. It introduces following common helper functions to simplify free_initmem() and free_initrd_mem() on different architectures: adjust_managed_page_count(): will be used to adjust totalram_pages, totalhigh_pages, zone->managed_pages when reserving/unresering a page. __free_reserved_page(): free a reserved page into the buddy system without adjusting page statistics info free_reserved_page(): free a reserved page into the buddy system and adjust page statistics info mark_page_reserved(): mark a page as reserved and adjust page statistics info free_reserved_area(): free a continous ranges of pages by calling free_reserved_page() free_initmem_default(): default method to free __init pages. We have only tested these patchset on x86 platforms, and have done basic compliation tests using cross-compilers from ftp.kernel.org. That means some code may not pass compilation on some architectures. So any help to test this patchset are welcomed! There are several other parts still under development: Part2: introduce free_highmem_page() to simplify freeing highmem pages Part3: refine code to manage totalram_pages, totalhigh_pages and zone->managed_pages Part4: introduce helper functions to simplify mem_init() and remove the global variable num_physpages. This patch: Code to deal with reserved/managed pages are duplicated by many architectures, so introduce common help functions to reduce duplicated code. These common help functions will also be used to concentrate code to modify totalram_pages and zone->managed_pages, which makes the code much more clear. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Anatolij Gustschin <agust@denx.de> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29vm: adjust ifdef for TINY_RCUPaul E. McKenney
There is an ifdef in page_cache_get_speculative() that checks for !SMP and TREE_RCU, which has been an impossible combination since the advent of TINY_RCU. The ifdef enables a fastpath that is valid when preemption is disabled by rcu_read_lock() in UP systems, which is the case when TINY_RCU is enabled. This commit therefore adjusts the ifdef to generate the fastpath when TINY_RCU is enabled. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reported-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm/shmem.c: remove an ifdefAndrew Morton
Create a CONFIG_MMU=y stub for ramfs_nommu_expand_for_mapping() in the usual fashion. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, show_mem: suppress page counts in non-blockable contextsDavid Rientjes
On large systems with a lot of memory, walking all RAM to determine page types may take a half second or even more. In non-blockable contexts, the page allocator will emit a page allocation failure warning unless __GFP_NOWARN is specified. In such contexts, irqs are typically disabled and such a lengthy delay may even result in NMI watchdog timeouts. To fix this, suppress the page walk in such contexts when printing the page allocation failure warning. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: trace filemap add and delRobert Jarzmik
Use the events API to trace filemap loading and unloading of file pieces into the page cache. This patch aims at tracing the eviction reload cycle of executable and shared libraries pages in a memory constrained environment. The typical usage is to spot a specific device and inode (for example /lib/libc.so) to see the eviction cycles, and find out if frequently used code is rather spread across many pages (bad) or coallesced (good). Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Cc: Dave Chinner <david@fromorbit.com> Cc: Hugh Dickins <hughd@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29debug_locks.h: make warning more verboseJames Hogan
The WARN_ON(1) in DEBUG_LOCKS_WARN_ON is surprisingly awkward to track down when it's hit, as it's usually buried in macros, causing multiple instances to land on the same line number. This patch makes it more useful by switching to: WARN(1, "DEBUG_LOCKS_WARN_ON(%s)", #c); so that the particular DEBUG_LOCKS_WARN_ON is more easily identified and grep'd for. For example: WARNING: at kernel/mutex.c:198 _mutex_lock_nested+0x31c/0x380() DEBUG_LOCKS_WARN_ON(l->magic != l) Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29drivers/video: add Hyper-V Synthetic Video Frame Buffer DriverHaiyang Zhang
This is the driver for the Hyper-V Synthetic Video, which supports screen resolution up to Full HD 1920x1080 on Windows Server 2012 host, and 1600x1200 on Windows Server 2008 R2 or earlier. It also solves the double mouse cursor issue of the emulated video mode. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Cc: Olaf Hering <olaf@aepfle.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29Merge tag 'usb-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB patches from Greg Kroah-Hartman: "Here's the big USB pull request for 3.10-rc1. Lots of USB patches here, the majority being USB gadget changes and USB-serial driver cleanups, the rest being ARM build fixes / cleanups, and individual driver updates. We also finally got some chipidea fixes, which have been delayed for a number of kernel releases, as the maintainer has now reappeared. All of these have been in linux-next for a while" * tag 'usb-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (568 commits) USB: ehci-msm: USB_MSM_OTG needs USB_PHY USB: OHCI: avoid conflicting platform drivers USB: OMAP: ISP1301 needs USB_PHY USB: lpc32xx: ISP1301 needs USB_PHY USB: ftdi_sio: enable two UART ports on ST Microconnect Lite usb: phy: tegra: don't call into tegra-ehci directly usb: phy: phy core cannot yet be a module USB: Fix initconst in ehci driver usb-storage: CY7C68300A chips do not support Cypress ATACB USB: serial: option: Added support Olivetti Olicard 145 USB: ftdi_sio: correct ST Micro Connect Lite PIDs ARM: mxs_defconfig: add CONFIG_USB_PHY ARM: imx_v6_v7_defconfig: add CONFIG_USB_PHY usb: phy: remove exported function from __init section usb: gadget: zero: put function instances on unbind usb: gadget: f_sourcesink.c: correct a copy-paste misnomer usb: gadget: cdc2: fix error return code in cdc_do_config() usb: gadget: multi: fix error return code in rndis_do_config() usb: gadget: f_obex: fix error return code in obex_bind() USB: storage: convert to use module_usb_driver() ...
2013-04-29Merge tag 'tty-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver update from Greg Kroah-Hartman: "Here's the big tty/serial driver merge request for 3.10-rc1 Once again, Jiri has a number of TTY driver fixes and cleanups, and Peter Hurley came through with a bunch of ldisc fixes that resolve a number of reported issues. There are some other serial driver cleanups as well. All of these have been in the linux-next tree for a while" * tag 'tty-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (117 commits) tty/serial/sirf: fix MODULE_DEVICE_TABLE serial: mxs: drop superfluous {get|put}_device serial: mxs: fix buffer overflow ARM: PL011: add support for extended FIFO-size of PL011-r1p5 serial_core.c: add put_device() after device_find_child() tty: Fix unsafe bit ops in tty_throttle_safe/unthrottle_safe serial: sccnxp: Replace pdata.init/exit with regulator API serial: sccnxp: Do not override device name TTY: pty, fix compilation warning TTY: rocket, fix compilation warning TTY: ircomm: fix DTR being raised on hang up TTY: synclinkmp: fix DTR being raised on hang up TTY: synclink_gt: fix DTR being raised on hang up TTY: synclink: fix DTR being raised on hang up serial: 8250_dw: Fix the stub for dw8250_probe_acpi() serial: 8250_dw: Convert to devm_ioremap() serial: 8250_dw: Set port capabilities based on CPR register serial: 8250_dw: Let ACPI code extract the DMA client info serial: 8250_dw: Support clk framework also with ACPI serial: 8250_dw: Enable runtime PM ...
2013-04-29Merge tag 'staging-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver tree update from Greg Kroah-Hartman: "Here's the big staging driver tree update for 3.10-rc1 This update contains loads of comedi driver cleanups and fixes in here, iio updates, android driver changes, and other various staging driver cleanups. Thanks to some drivers being removed, and the comedi driver cleanups, we have removed more code than we added: 627 files changed, 65145 insertions(+), 76321 deletions(-) which is always nice to see. All of these have been in linux-next for a while." * tag 'staging-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (940 commits) staging: comedi: ni_labpc: fix legacy driver build staging: comedi: das800: cleanup the cio-das802/16 fifo comments staging: comedi: das800: rename CamelCase vars in das800_ai_do_cmd() staging: comedi: das800: tidy up the private data staging: comedi: das800: tidy up das800_interrupt() staging: comedi: das800: tidy up das800_ai_insn_read() staging: comedi: das800: tidy up das800_di_insn_bits() staging: comedi: das800: tidy up das800_do_insn_bits() staging: comedi: das800: remove extra divisor calculation call staging: comedi: das800: rename {enable,disable}_das800 staging: comedi: das800: tidy up subdevice init staging: comedi: das800: allow attaching without interrupt support staging: comedi: das800: interrupts are required for async command support staging: comedi: das800: tidy up das800_ai_do_cmdtest() staging: comedi: das800: remove 'volatile' on private data variables staging: comedi: das800: cleanup the boardinfo staging: comedi: das800: cleanup range table declarations staging: comedi: das800: introduce das800_ind_{write, read}() staging: comedi: das800: remove forward declarations staging: comedi: das800: move das800_set_frequency() ...
2013-04-29Merge tag 'driver-core-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core update from Greg Kroah-Hartman: "Here's the merge request for the driver core tree for 3.10-rc1 It's pretty small, just a number of driver core and sysfs updates and fixes, all of which have been in linux-next for a while now. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed conflict in kernel/rtmutex-tester.c, the locking tree had a better fix for the same sysfs file mode problem. * tag 'driver-core-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: PM / Runtime: Idle devices asynchronously after probe|release driver core: handle user namespaces properly with the uid/gid devtmpfs change driver core: devtmpfs: fix compile failure with CONFIG_UIDGID_STRICT_TYPE_CHECKS devtmpfs: add base.h include driver core: add uid and gid to devtmpfs sysfs: check if one entry has been removed before freeing sysfs: fix crash_notes_size build warning sysfs: fix use after free in case of concurrent read/write and readdir rtmutex-tester: fix mode of sysfs files Documentation: Add ABI entry for crash_notes and crash_notes_size sysfs: Add crash_notes_size to export percpu note size driver core: platform_device.h: fix checkpatch errors and warnings driver core: platform.c: fix checkpatch errors and warnings driver core: warn that platform_driver_probe can not use deferred probing sysfs: use atomic_inc_unless_negative in sysfs_get_active base: core: WARN() about bogus permissions on device attributes device: separate all subsys mutexes
2013-04-29Merge tag 'char-misc-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver update from Greg Kroah-Hartman: "Here's the big char / misc driver update for 3.10-rc1 A number of various driver updates, the majority being new functionality in the MEI driver subsystem (it's now a subsystem, it started out just a single driver), extcon updates, memory updates, hyper-v updates, and a bunch of other small stuff that doesn't fit in any other tree. All of these have been in linux-next for a while" * tag 'char-misc-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (148 commits) Tools: hv: Fix a checkpatch warning tools: hv: skip iso9660 mounts in hv_vss_daemon tools: hv: use FIFREEZE/FITHAW in hv_vss_daemon tools: hv: use getmntent in hv_vss_daemon Tools: hv: Fix a checkpatch warning tools: hv: fix checks for origin of netlink message in hv_vss_daemon Tools: hv: fix warnings in hv_vss_daemon misc: mark spear13xx-pcie-gadget as broken mei: fix krealloc() misuse in in mei_cl_irq_read_msg() mei: reduce flow control only for completed messages mei: reseting -> resetting mei: fix reading large reposnes mei: revamp mei_irq_read_client_message function mei: revamp mei_amthif_irq_read_message mei: revamp hbm state machine Revert "drivers/scsi: use module_pcmcia_driver() in pcmcia drivers" Revert "scsi: pcmcia: nsp_cs: remove module init/exit function prototypes" scsi: pcmcia: nsp_cs: remove module init/exit function prototypes mei: wd: fix line over 80 characters misc: tsl2550: Use dev_pm_ops ...
2013-04-29Merge tag 'hwmon-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon update from Guenter Roeck: - New drivers for NCT6775, NCT6776, NCT6779, and LM95234. - Added support for LTC2974, LTC3883, LM25056, TMP431, TMP432, ADT7310, and ADT7320 to existing drivers. - Various code cleanups and minor improvements. * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (54 commits) hwmon: (nct6775) Fix coding style problems hwmon: (nct6775) Constify strings hwmon: (tmp401) Add support for TMP432 hwmon: (tmp401) Add support for update_interval attribute hwmon: (tmp401) Reset valid flag when resetting temperature history hwmon: (tmp401) Simplification and cleanup hwmon: (tmp401) Use sysfs_create_group / sysfs_remove_group hwmon: (tmp401) Drop unused defines, use BIT for bit masks hwmon: (nct6775) Use ARRAY_SIZE for loops where possible documentation: hwmon: Fix typo in documentation/hwmon hwmon: (nct6775) Enable both AUXTIN and VIN3 on NCT6776 hwmon: (ad7314) use spi_get_drvdata() and spi_set_drvdata() MAINTAINERS: Add myself as maintainer for the NCT6775 driver hwmon: (nct6775) Expand scope of supported chips hwmon: (gpio-fan) Use is_visible to determine if attributes should be created hwmon: (tmp401) Fix device detection for TMP411B and TMP411C hwmon: Add driver for LM95234 hwmon: (tmp401) Add support for TMP431 hwmon: (pmbus/lm25066) Add support for LM25056 hwmon: (pmbus/lm25066) Refactor device specific coefficients ...
2013-04-29Merge tag 'pinctrl-for-v3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pinctrl update from Linus Walleij: "These are the pinctrl changes for v3.10: - Patrice Chotard contributed a new configuration debugfs interface and reintroduced fine-grained locking into the core: instead of having a "big pinctrl lock" we have a per-controller lock and specialized locks for the global controller and pinctrl handle lists. - Haoijan Zhuang deleted all the PXA and MMP2 pinctrl drivers and replaced them with pinctrl-single (which is also used by other SoCs) so we are gaining consolidation. The platform particulars now come in through the device tree. - Haoijan also added support for generic pin config into the pinctrl-single driver which is another big consolidation win. - Finally also GPIO ranges are now supported by the pinctrl-single driver. - Tomasz Figa contributed a new Samsung S3C pinctrl driver, bringing more of the older Samsung platforms under the pinctrl umbrella and out of arch/arm. - Maxime Ripard contributed new Allwinner A10/A13 drivers. - Sachin Kamat, Wei Yongjun and Axel Lin did a lot of cleanups." * tag 'pinctrl-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (66 commits) pinctrl: move subsystem mutex to pinctrl_dev struct pinctrl/pinconfig: fix misplaced goto pinctrl: s3c64xx: Fix build error caused by undefined chained_irq_enter pinctrl/pinconfig: add debug interface pinctrl: abx500: fix issue when no pdata pinctrl: pinctrl-single: add missing double quote pinctrl: sunxi: Rename wemac functions to emac pinctrl: exynos5440: add gpio interrupt support pinctrl: exynos5440: fix probe failure due to missing pin-list in config nodes pinctrl: ab8505: Staticize some symbols pinctrl: ab8540: Staticize some symbols pinctrl: ab9540: Staticize some symbols pinctrl: ab8500: Staticize some symbols pinctrl: abx500: Staticize some symbols pinctrl: Add pinctrl-s3c64xx driver pinctrl: samsung: Handle banks with two configuration registers pinctrl: samsung: Remove hardcoded register offsets pinctrl: samsung: Split pin bank description into two structures pinctrl: samsung: Include pinctrl-exynos driver data conditionally pinctrl: samsung: Protect bank registers with a spinlock ...
2013-04-29Merge tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linuxLinus Torvalds
Pull fbdev updates from Tomi Valkeinen: - use vm_iomap_memory() in various fb drivers to map the fb memory to userspace - Cleanups for the videomode and display_timing features - Updates to vt8500, wm8505 and auo-k190x fb drivers * tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linux: (36 commits) fbdev: fix check for fb_mmap's mmio availability fbdev: improve fb_mmap bounds checks fbdev/ps3fb: use vm_iomap_memory() fbdev/sgivwfb: use vm_iomap_memory() fbdev/vermillion: use vm_iomap_memory() fbdev/sa1100fb: use vm_iomap_memory() fbdev/fb-puv3: use vm_iomap_memory() fbdev/controlfb: use vm_iomap_memory() fbdev/omapfb: use vm_iomap_memory() video: vt8500: fix Kconfig for videomode video/s3c: move platform_data out of arch/arm video/exynos: remove unnecessary header inclusions drivers/video: fsl-diu-fb: add hardware cursor support drivers: video: use module_platform_driver_probe() ARM: OMAP: remove "config FB_OMAP_CONSISTENT_DMA_SIZE" video: wm8505fb: Convert to devm_ioremap_resource() AUO-K190x: Add resolutions for portrait displays AUO-K190x: add framebuffer rotation support AUO-K190x: add a 16bit truecolor mode AUO-K190x: make color handling more flexible ...
2013-04-29Merge tag 'pci-v3.10-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v3.10 merge window: PCI device hotplug - Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe) - Make acpiphp builtin only, not modular (Jiang Liu) - Add acpiphp mutual exclusion (Jiang Liu) Power management - Skip "PME enabled/disabled" messages when not supported (Rafael Wysocki) - Fix fallback to PCI_D0 (Rafael Wysocki) Miscellaneous - Factor quirk_io_region (Yinghai Lu) - Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas) - Clean up EISA resource initialization and logging (Bjorn Helgaas) - Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas) - MIPS: Initialize of_node before scanning bus (Gabor Juhos) - Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor Juhos) - Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang) - Fix aer_inject return values (Prarit Bhargava) - Remove PME/ACPI dependency (Andrew Murray) - Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)" * tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits) vfio-pci: Use cached MSI/MSI-X capabilities vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Remove "extern" from function declarations PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h PCI: Use msix_table_size() directly, drop multi_msix_capable() PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros PCI: Drop is_64bit_address() and is_mask_bit_support() macros PCI: Drop msi_data_reg() macro PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc PCI: Clean up MSI/MSI-X capability #defines PCI: Use cached MSI-X cap while enabling MSI-X PCI: Use cached MSI cap while enabling MSI interrupts PCI: Remove MSI/MSI-X cap check in pci_msi_check_device() PCI: Cache MSI/MSI-X capability offsets in struct pci_dev PCI: Use u8, not int, for PM capability offset [SCSI] megaraid_sas: Use correct #define for MSI-X capability PCI: Remove "extern" from function declarations ...
2013-04-29Merge branch 'core-locking-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking changes from Ingo Molnar: "The most noticeable change are mutex speedups from Waiman Long, for higher loads. These scalability changes should be most noticeable on larger server systems. There are also cleanups, fixes and debuggability improvements." * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Consolidate bug messages into a single print_lockdep_off() function lockdep: Print out additional debugging advice when we hit lockdep BUGs mutex: Back out architecture specific check for negative mutex count mutex: Queue mutex spinners with MCS lock to reduce cacheline contention mutex: Make more scalable by doing less atomic operations mutex: Move mutex spinning code from sched/core.c back to mutex.c locking/rtmutex/tester: Set correct permissions on sysfs files lockdep: Remove unnecessary 'hlock_next' variable
2013-04-29Merge tag 'stable/for-linus-3.10-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen updates from Konrad Rzeszutek Wilk: "Features: - Populate the boot_params with EDD data. - Cleanups in the IRQ code. Bug-fixes: - CPU hotplug offline/online in PVHVM mode. - Re-upload processor PM data after ACPI S3 suspend/resume cycle." And Konrad gets a gold star for sending the pull request early when he thought he'd be away for the first week of the merge window (but because of 3.9 dragging out to -rc8 he then re-sent the reminder on the first day of the merge window anyway) * tag 'stable/for-linus-3.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: resolve section mismatch warnings in xen-acpi-processor xen: Re-upload processor PM data to hypervisor after S3 resume (v2) xen/smp: Unifiy some of the PVs and PVHVM offline CPU path xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one. xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM xen/spinlock: Check against default value of -1 for IRQ line. xen/time: Add default value of -1 for IRQ and check for that. xen/events: Check that IRQ value passed in is valid. xen/time: Fix kasprintf splat when allocating timer%d IRQ line. xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline xen/smp: Fix leakage of timer interrupt line for every CPU online/offline. xen kconfig: fix select INPUT_XEN_KBDDEV_FRONTEND xen: drop tracking of IRQ vector x86/xen: populate boot_params with EDD data
2013-04-26Merge branch '3.10/fb-mmap' into for-nextTomi Valkeinen
Merge topic branch to get vm_iomap_memory into use. Conflicts: drivers/video/fbmon.c
2013-04-24Merge branch 'pci/gavin-msi-cleanup' into nextBjorn Helgaas
* pci/gavin-msi-cleanup: vfio-pci: Use cached MSI/MSI-X capabilities vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Remove "extern" from function declarations PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h PCI: Use msix_table_size() directly, drop multi_msix_capable() PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros PCI: Drop is_64bit_address() and is_mask_bit_support() macros PCI: Drop msi_data_reg() macro PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc PCI: Clean up MSI/MSI-X capability #defines PCI: Use cached MSI-X cap while enabling MSI-X PCI: Use cached MSI cap while enabling MSI interrupts PCI: Remove MSI/MSI-X cap check in pci_msi_check_device() PCI: Cache MSI/MSI-X capability offsets in struct pci_dev PCI: Use u8, not int, for PM capability offset [SCSI] megaraid_sas: Use correct #define for MSI-X capability
2013-04-23usb: phy: tegra: don't call into tegra-ehci directlyArnd Bergmann
Both phy-tegra-usb.c and ehci-tegra.c export symbols used by the other one, which does not work if one of them or both are loadable modules, resulting in an error like: drivers/built-in.o: In function `utmi_phy_clk_disable': drivers/usb/phy/phy-tegra-usb.c:302: undefined reference to `tegra_ehci_set_phcd' drivers/built-in.o: In function `utmi_phy_clk_enable': drivers/usb/phy/phy-tegra-usb.c:324: undefined reference to `tegra_ehci_set_phcd' drivers/built-in.o: In function `utmi_phy_power_on': drivers/usb/phy/phy-tegra-usb.c:447: undefined reference to `tegra_ehci_set_pts' This turns the interface into a one-way dependency by letting the tegra ehci driver pass two function pointers for callbacks that need to be called by the phy driver. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Venu Byravarasu <vbyravarasu@nvidia.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Felipe Balbi <balbi@ti.com> Cc: Stephen Warren <swarren@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-23PCI: Remove "extern" from function declarationsBjorn Helgaas
We had an inconsistent mix of using and omitting the "extern" keyword on function declarations in header files. This removes them all. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Clean up MSI/MSI-X capability #definesBjorn Helgaas
This doesn't change any existing symbols, but it puts them in logical order and uses explicit masks instead of shifts, like the rest of the file. It also adds new symbols for PCI_MSIX_TABLE_BIR, PCI_MSIX_TABLE_OFFSET, PCI_MSIX_PBA_BIR, and PCI_MSIX_PBA_OFFSET to replace the mis-named PCI_MSIX_FLAGS_BIRMASK (the BAR index fields are part of the Table and PBA registers, not the flags register). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Cache MSI/MSI-X capability offsets in struct pci_devGavin Shan
The patch caches the MSI and MSI-X capability offset in PCI device (struct pci_dev) so that we needn't read it from the config space upon enabling or disabling MSI or MSI-X interrupts. [bhelgaas: moved pm_cap size change to separate patch] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Use u8, not int, for PM capability offsetBjorn Helgaas
The Power Management Capability (PCI_CAP_ID_PM == 0x01) is defined by PCI and must appear in the 256-byte PCI Configuration Space from 0-0xff. It cannot be in the PCIe Extended Configuration space from 0x100-0xfff, so we only need a u8 to hold its offset. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-20Merge branch 'x86-kdump-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kdump fixes from Peter Anvin: "The kexec/kdump people have found several problems with the support for loading over 4 GiB that was introduced in this merge cycle. This is partly due to a number of design problems inherent in the way the various pieces of kdump fit together (it is pretty horrifically manual in many places.) After a *lot* of iterations this is the patchset that was agreed upon, but of course it is now very late in the cycle. However, because it changes both the syntax and semantics of the crashkernel option, it would be desirable to avoid a stable release with the broken interfaces." I'm not happy with the timing, since originally the plan was to release the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead... * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kexec: use Crash kernel for Crash kernel low x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low x86, kdump: Retore crashkernel= to allocate under 896M x86, kdump: Set crashkernel_low automatically
2013-04-20Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Three groups of fixes: 1. Make sure we don't execute the early microcode patching if family < 6, since it would touch MSRs which don't exist on those families, causing crashes. 2. The Xen partial emulation of HyperV can be dealt with more gracefully than just disabling the driver. 3. More EFI variable space magic. In particular, variables hidden from runtime code need to be taken into account too." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Verify the family before dispatching microcode patching x86, hyperv: Handle Xen emulation of Hyper-V more gracefully x86,efi: Implement efi_no_storage_paranoia parameter efi: Export efi_query_variable_store() for efivars.ko x86/Kconfig: Make EFI select UCS2_STRING efi: Distinguish between "remaining space" and actually used space efi: Pass boot services variable info to runtime code Move utf16 functions to kernel core and rename x86,efi: Check max_size only if it is non-zero. x86, efivars: firmware bug workarounds should be in platform code
2013-04-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking updates from David Miller: 1) ax88796 does 64-bit divides which causes link errors on ARM, fix from Arnd Bergmann. 2) Once an improper offload setting is detected on an SKB we don't rate limit the log message so we can very easily live lock. From Ben Greear. 3) Openvswitch cannot report vport configuration changes reliably because it didn't preallocate the netlink notification message before changing state. From Jesse Gross. 4) The effective UID/GID SCM credentials fix, from Linus. 5) When a user explicitly asks for wireless authentication, cfg80211 isn't told about the AP detachment leaving inconsistent state. Fix from Johannes Berg. 6) Fix self-MAC checks in batman-adv on multi-mesh nodes, from Antonio Quartulli. 7) Revert build_skb() change sin IGB driver, can result in memory corruption. From Alexander Duyck. 8) Fix setting VLANs on virtual functions in IXGBE, from Greg Rose. 9) Fix TSO races in qlcnic driver, from Sritej Velaga. 10) In bnx2x the kernel driver and UNDI firmware can try to program the chip at the same time, resulting in corruption. Add proper synchronization. From Dmitry Kravkov. 11) Fix corruption of status block in firmware ram in bxn2x, from Ariel Elior. 12) Fix load balancing hash regression of bonding driver in forwarding configurations, from Eric Dumazet. 13) Fix TS ECR regression in TCP by calling tcp_replace_ts_recent() in all the right spots, from Eric Dumazet. 14) Fix several bonding bugs having to do with address manintainence, including not removing address when configuration operations encounter errors, missed locking on the address lists, missing refcounting on VLAN objects, etc. All from Nikolay Aleksandrov. 15) Add workarounds for firmware bugs in LTE qmi_wwan devices, wherein the devices fail to add a proper ethernet header while on LTE networks but otherwise properly do so on 2G and 3G ones. From Bjørn Mork. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits) net: fix incorrect credentials passing net: rate-limit warn-bad-offload splats. net: ax88796: avoid 64 bit arithmetic qlge: Update version to 1.00.00.32. qlge: Fix ethtool autoneg advertising. qlge: Fix receive path to drop error frames net: qmi_wwan: prevent duplicate mac address on link (firmware bug workaround) net: qmi_wwan: fixup destination address (firmware bug workaround) net: qmi_wwan: fixup missing ethernet header (firmware bug workaround) bonding: in bond_mc_swap() bond's mc addr list is walked without lock bonding: disable netpoll on enslave failure bonding: primary_slave & curr_active_slave are not cleaned on enslave failure bonding: vlans don't get deleted on enslave failure bonding: mc addresses don't get deleted on enslave failure pkt_sched: fix error return code in fw_change_attrs() irda: small read past the end of array in debug code tcp: call tcp_replace_ts_recent() from tcp_ack() netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too netfilter: ipset: bitmap:ip,mac: fix listing with timeout bonding: fix l23 and l34 load balancing in forwarding path ...
2013-04-20net: fix incorrect credentials passingLinus Torvalds
Commit 257b5358b32f ("scm: Capture the full credentials of the scm sender") changed the credentials passing code to pass in the effective uid/gid instead of the real uid/gid. Obviously this doesn't matter most of the time (since normally they are the same), but it results in differences for suid binaries when the wrong uid/gid ends up being used. This just undoes that (presumably unintentional) part of the commit. Reported-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: David S. Miller <davem@davemloft.net> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19Merge remote-tracking branch 'efi/urgent' into x86/urgentH. Peter Anvin
Matt Fleming (1): x86, efivars: firmware bug workarounds should be in platform code Matthew Garrett (3): Move utf16 functions to kernel core and rename efi: Pass boot services variable info to runtime code efi: Distinguish between "remaining space" and actually used space Richard Weinberger (2): x86,efi: Check max_size only if it is non-zero. x86,efi: Implement efi_no_storage_paranoia parameter Sergey Vlasov (2): x86/Kconfig: Make EFI select UCS2_STRING efi: Export efi_query_variable_store() for efivars.ko Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-19irda: small read past the end of array in debug codeDan Carpenter
The "reason" can come from skb->data[] and it hasn't been capped so it can be from 0-255 instead of just 0-6. For example in irlmp_state_dtr() the code does: reason = skb->data[3]; ... irlmp_disconnect_indication(self, reason, skb); Also LMREASON has a couple other values which don't have entries in the irlmp_reasons[] array. And 0xff is a valid reason as well which means "unknown". So far as I can see we don't actually care about "reason" except for in the debug code. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>