Age | Commit message | Author
2015-01-15 | cpuidle: irq: Increase the rating to have the governor as default [cpuidle-eas-next] | Daniel Lezcano
2015-01-15 | irq: timings: Fix next_prediction type to s64 | Daniel Lezcano
2015-01-15 | cpuidle: Add a simple irq governor | Daniel Lezcano
This simple governor takes the predictable events into account: the timer sleep duration and the next expected irq sleep duration. By combining both, it deduces which idle state fits best.
The main purpose of this governor is to handle the guessed next events in a categorized way:
1. deterministic events : timers
2. guessed events : IOs
3. predictable events : keystroke, incoming network packet, ...
This governor is meant to be moved closer to the scheduler later, so that the scheduler can inspect/inject more information and act proactively rather than reactively.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Conflicts: drivers/cpuidle/Kconfig
2015-01-15 | irq: Add per block device the flag to measure the irq | Daniel Lezcano
This is for KVM sata virtual hardware. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | irq_timing: proc: Register irq to be tracked for timing measurements | Daniel Lezcano
The irq timing framework tracks only the interrupts that were explicitly flagged with IRQF_TIMING. That gives fine-grained control over which event sources are tracked. Unfortunately, each driver must be modified accordingly, which makes development very difficult.
This patch adds a /proc/irq/<nr>/timing boolean file to select from userspace the interrupts we want to track. There is no longer any need to find and change the drivers in the kernel for each platform; that can be done from userspace now.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | Default IRQ_TIMINGS to 'yes' | Daniel Lezcano
2015-01-15 | irq_timings: registration for IRQ timing processing | Nicolas Pitre
Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | irq_timings: add a trace event | Nicolas Pitre
The standard deviation may be obtained by computing the variance's square root. Given the cost this is left to user space to do. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | irq_timings: connect timing processing to IRQ events | Nicolas Pitre
Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | irq_timings: function to retrieve time of next predicted IRQ | Nicolas Pitre
Those events in the past, if any, are purged and then the first item on the list, if any, contains our next predicted IRQ time. We have access to the standard deviation and could use it to qualify our confidence in the prediction eventually. For now it's only the raw prediction that is returned. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | irq_timings: add per-CPU prediction queueing | Nicolas Pitre
Once a good IRQ prediction is made, we need to enqueue it for later consumption. While at it we discard any predictions whose time stamp is in the past.
There shouldn't be that many expected IRQs at any given time. A sorted list is most likely going to be good enough. And, by definition, the most frequent IRQs will end up near the beginning of the list anyway.
There is no generic way to determine what the IRQ controller is going to do if the IRQ affinity mask contains multiple CPUs. It is therefore assumed that the next occurrence of an IRQ is most likely to happen on the same CPU as the last one. It appears to be the case overall from observations on X86 despite active migration controlled from user space. On ARM it is the first CPU in the affinity mask that is selected by the GIC driver so this assumption is quite right in that case. If migration frequency becomes significant compared to IRQ occurrences then we could consider registering an affinity notifier.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
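The queueing described above can be sketched in plain C. This is only an illustration under stated assumptions, not the code of the patch: the names (pred_queue, pred_enqueue, pred_next), the fixed-size array and the MAX_PRED capacity are all hypothetical, and the per-CPU aspect is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PRED 8 /* hypothetical capacity, not taken from the patch */

struct prediction {
	uint64_t time; /* predicted time stamp of the next IRQ */
	int irq;
};

struct pred_queue {
	struct prediction items[MAX_PRED]; /* sorted by ascending time */
	unsigned int len;
};

/* Drop every prediction whose time stamp is not after 'now'. */
static void pred_purge(struct pred_queue *q, uint64_t now)
{
	unsigned int stale = 0;

	while (stale < q->len && q->items[stale].time <= now)
		stale++;
	memmove(q->items, q->items + stale,
		(q->len - stale) * sizeof(q->items[0]));
	q->len -= stale;
}

/* Sorted insert. Returns 0 on success, -1 if the prediction is stale or the queue is full. */
static int pred_enqueue(struct pred_queue *q, uint64_t now, uint64_t time, int irq)
{
	unsigned int pos = 0;

	assert(q->len <= MAX_PRED);
	pred_purge(q, now);
	if (time <= now || q->len == MAX_PRED)
		return -1;
	while (pos < q->len && q->items[pos].time <= time)
		pos++;
	memmove(q->items + pos + 1, q->items + pos,
		(q->len - pos) * sizeof(q->items[0]));
	q->items[pos].time = time;
	q->items[pos].irq = irq;
	q->len++;
	return 0;
}

/* Purge past events, then report the head of the list. Returns 1 if one exists, else 0. */
static int pred_next(struct pred_queue *q, uint64_t now, struct prediction *out)
{
	pred_purge(q, now);
	if (q->len == 0)
		return 0;
	*out = q->items[0];
	return 1;
}
```

With a short list, the linear scan and memmove stay cheap, which matches the "a sorted list is most likely going to be good enough" reasoning; a real implementation would more likely use the kernel's list primitives.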
2015-01-15 | irq_timings: introduce IRQ occurrence timing statistics | Nicolas Pitre
Many IRQs are quiet most of the time, or they tend to come in bursts of fairly equal time intervals within each burst. It is therefore possible to detect those IRQs with stable intervals and guestimate when the next IRQ event is most likely to happen. Examples of such IRQs may include audio related IRQs where the FIFO size and/or DMA descriptor size with the sample rate create stable intervals, block devices during large data transfers, etc. Even network streaming of multimedia content creates patterns of periodic network interface IRQs in some cases. This patch adds code to track the mean interval and variance for each IRQ over a window of time intervals between IRQ events. Those statistics can be used to assist cpuidle in selecting the most appropriate sleep state by predicting the most likely time for the next interrupt. Because the stats are gathered in interrupt context, the core computation is as light as possible, turning into 3 subs, 3 adds and 6 mults where 4 of those mults involve a small compile-time constant. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
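As a rough illustration of tracking the mean and variance of IRQ intervals over a window, here is a minimal sketch in C. It is an assumption-laden example, not the arithmetic used by the patch: it keeps a ring buffer plus a running sum and sum of squares, and the names (irq_stats, irq_stats_add) and the WINDOW size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW 4 /* hypothetical window size */

struct irq_stats {
	uint64_t samples[WINDOW]; /* last WINDOW intervals, e.g. in microseconds */
	unsigned int idx;         /* next slot to write (the oldest sample once full) */
	unsigned int count;       /* number of valid samples, at most WINDOW */
	uint64_t sum;             /* sum of the valid samples */
	uint64_t sum_sq;          /* sum of their squares (assumes no overflow) */
};

/* Record a new interval, evicting the oldest one once the window is full. */
static void irq_stats_add(struct irq_stats *s, uint64_t interval)
{
	if (s->count == WINDOW) {
		uint64_t old = s->samples[s->idx];

		s->sum -= old;
		s->sum_sq -= old * old;
	} else {
		s->count++;
	}
	s->samples[s->idx] = interval;
	s->idx = (s->idx + 1) % WINDOW;
	s->sum += interval;
	s->sum_sq += interval * interval;
}

/* Integer mean of the intervals currently in the window. */
static uint64_t irq_stats_mean(const struct irq_stats *s)
{
	assert(s->count > 0);
	return s->sum / s->count;
}

/* Population variance, E[x^2] - E[x]^2, computed as (n * sum_sq - sum^2) / n^2. */
static uint64_t irq_stats_variance(const struct irq_stats *s)
{
	uint64_t n = s->count;

	assert(n > 0);
	return (n * s->sum_sq - s->sum * s->sum) / (n * n);
}
```

A low variance relative to the mean indicates a stable interval, in which case the last IRQ time plus the mean gives a plausible estimate for the next occurrence.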
2015-01-15 | sched: fair: Don't wake up a cpu before its break even | Daniel Lezcano
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | sched: fair: Choose an non idle cpu in case the energy sched feature is set | Daniel Lezcano
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | sched: Add SCHED_ENERGY_IDLE option | Daniel Lezcano
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-15 | cpuidle: sysfs: Add per cpu idle state prediction statistics | Daniel Lezcano
This patch exposes some statistics under /sys/devices/system/cpu/cpuX/cpuidle/stats. The statistics compare the prediction with the idle state that was chosen:
under_estimate: the sleep duration was longer than expected and a deeper idle state could have been chosen
over_estimate: the sleep duration was shorter than expected and a shallower state should have been chosen; the choice that was made increases the exit latency
right_estimate: the sleep duration is correct with regard to the target residency and the idle state chosen
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
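One way to read the three counters is sketched below in C. This is an interpretation, not the patch's code: the classify function, the enum names and the use of a bare array of target residencies are hypothetical, and the exact comparison the patch performs may differ.

```c
#include <assert.h>

enum estimate { UNDER_ESTIMATE, OVER_ESTIMATE, RIGHT_ESTIMATE };

/*
 * residency[i] is the target residency (in us) of idle state i, with
 * deeper states at higher indexes. 'chosen' is the index of the state
 * that was entered and 'measured_us' the sleep duration that was observed.
 */
static enum estimate classify(const unsigned int *residency, int count,
			      int chosen, unsigned int measured_us)
{
	assert(chosen >= 0 && chosen < count);

	/* Slept less than the chosen state's residency: a shallower state was due. */
	if (measured_us < residency[chosen])
		return OVER_ESTIMATE;

	/* Slept long enough for the next deeper state: it could have been chosen. */
	if (chosen + 1 < count && measured_us >= residency[chosen + 1])
		return UNDER_ESTIMATE;

	return RIGHT_ESTIMATE;
}
```

For example, with residencies of 1, 50 and 500 us and state 1 chosen, a 20 us sleep counts as over_estimate, a 100 us sleep as right_estimate and an 800 us sleep as under_estimate.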
2015-01-15 | cpuidle: Use the state selection from the menu governor | Daniel Lezcano
The menu governor chooses a state with the same selection logic as the 'cpuidle_find_state' function defined in the previous patch. Let's use it and factor out the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: Factor out the selection loop of the idle state | Daniel Lezcano
The selection loop follows the logic "choose an idle state fulfilling the sleep duration and the exit latency constraints". It can be encapsulated into a function and reused in different places. That provides an API to choose an idle state from the timing information passed as parameters.
The 'cpuidle_find_deepest_state' function performs a similar selection. The new function can be used instead, with infinite time constraints, as we are going to suspend.
That is one more step toward the cpuidle/scheduler integration.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
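The selection logic quoted above can be illustrated with a small C sketch. The struct layout, the find_state name and its signature are assumptions made for the example and are not the kernel's actual cpuidle_find_state API.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical, simplified idle state description (times in us). */
struct idle_state {
	unsigned int exit_latency;
	unsigned int target_residency;
	int disabled;
};

/*
 * Return the index of the deepest enabled state whose target residency
 * fits in the expected sleep duration and whose exit latency respects
 * the latency requirement, or -1 if no state qualifies. States are
 * assumed to be ordered from shallowest to deepest.
 */
static int find_state(const struct idle_state *states, int count,
		      unsigned int sleep_us, unsigned int latency_req)
{
	int i, best = -1;

	assert(count >= 0);
	for (i = 0; i < count; i++) {
		const struct idle_state *s = &states[i];

		if (s->disabled)
			continue;
		if (s->target_residency > sleep_us)
			continue;
		if (s->exit_latency > latency_req)
			continue;
		best = i;
	}
	return best;
}
```

Passing UINT_MAX for both constraints stands in for the "infinite time constraints" case and yields the deepest enabled state, which is what a suspend path would want.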
2014-12-11 | cpuidle: Don't use the exit latency to find the deepest state | Daniel Lezcano
In the code, the convention is that the deeper an idle state is, the greater its exit latency. There is no need to compare the exit latency between the states. Furthermore, that will allow the next patches to factor out this loop.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: Remove the evil CPUIDLE_DRIVER_STATE_START macro | Daniel Lezcano
The CPUIDLE_DRIVER_STATE_START macro is no longer used. Remove it. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: menu: Default to state index 0 | Daniel Lezcano
The poll state no longer exists, so we can safely default to state index 0, as it won't be the poll state anymore.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: x86: Remove the poll state for the x86 driver | Daniel Lezcano
All the code avoids using the poll state. The only place where polling is used is when we failed to find an idle state and a timer is due to expire within 5us. But there we call cpuidle_poll directly, without using the poll state callback in the idle state array of the cpuidle driver.
The poll state in the driver's idle state array is no longer used. Remove it and clean up this mess.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: Remove CPUIDLE_DRIVER_STATE_START users from cpuidle.c | Daniel Lezcano
In order to remove the poll state from the idle drivers in the next patch, let's remove the usage of the CPUIDLE_DRIVER_STATE_START from the idle state selection loops.
1. cpuidle_play_dead will ignore the poll idle state because this one does not implement the 'enter_dead' callback. We can safely remove the usage of the CPUIDLE_DRIVER_STATE_START macro and start from the zero index.
2. cpuidle_find_deepest_state will always ignore the poll idle state because its exit_latency is 0 and because of the check in the loop:
       if (s->disabled || su->disable || s->exit_latency <= latency_req)
               continue;
   ... it will always be ignored (exit_latency == latency_req)
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: menu: Remove the default poll state | Daniel Lezcano
The following applies only to x86. The current code defaults to C0 only if a timer is about to expire. As the timer expiration is reliable information, this gives a double guarantee:
* we ensure a fast exit from idle
* we ensure we won't be polling for too long a period, which is dangerous from a thermal point of view
Unfortunately this code brings a lot of weirdness around the default idle state and the CPUIDLE_DRIVER_STATE_START macro.
Given how rarely the poll function is called (1/10000 on my server), we can legitimately ask whether this test is worth keeping in the menu governor, given the very low exit latency of the hardware we have nowadays.
Just remove this test. If it turns out to bring a real, measurable gain in performance, we can re-introduce it in a more clever way later.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: menu: Move the update function before its declaration | Daniel Lezcano
In order to prevent a pointless forward declaration, just move the function at the beginning of the file. This patch does not change the behavior of the governor, it is just code reordering. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Len Brown <len.brown@intel.com>
2014-12-11 | cpuidle: menu: Fix the get_typical_interval | Daniel Lezcano
The first time 'get_typical_interval' is called, it computes an average of zero as no data has been recorded yet. That leads the 'data->predicted_us' variable to be set to zero too. The caller, 'menu_select', will then do:
    interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
That sets interactivity_req to zero (0/performance...), and then:
    if (latency_req > interactivity_req)
            latency_req = interactivity_req;
... sets 'latency_req' to zero too. No idle state will fulfill this constraint, so we go to the C1 state by default, which leads to an update. The next calls will therefore compute an average different from zero.
This works with the current code, but with broken semantics, and it will break with the next patches where we are stricter with the latency checks: the first check will fail (latency_req is zero), then no update will occur, so choosing an idle state will always fail.
As there are no previous values, it is pointless to compute a standard deviation for these nonexistent values. Change the function to return the computed value and use it only if it is different from zero and greater than the next timer expiration.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
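The arithmetic behind this failure mode can be reproduced in isolation. The helper below is a hypothetical stand-in written for illustration (clamp_latency is not a function of the menu governor, and 'multiplier' replaces the result of performance_multiplier()):

```c
#include <assert.h>

/*
 * Mimic the two steps quoted in the commit message: derive the
 * interactivity requirement from the predicted sleep time, then clamp
 * the latency requirement to it.
 */
static unsigned int clamp_latency(unsigned int latency_req,
				  unsigned int predicted_us,
				  unsigned int multiplier)
{
	unsigned int interactivity_req;

	assert(multiplier > 0);
	interactivity_req = predicted_us / multiplier;
	if (latency_req > interactivity_req)
		latency_req = interactivity_req;
	return latency_req;
}
```

With predicted_us equal to zero, as on the very first call, any latency requirement collapses to zero, so a selection loop that requires the exit latency to fit within latency_req cannot accept any state with a non-zero exit latency.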
2014-12-11 | cpuidle: idle: menu: Don't reflect when a state selection failed | Daniel Lezcano
In the current code, the check to reflect or not the outcoming state is done against the idle state which has been chosen and its value. Instead of doing a check in each of the reflect functions, just don't call reflect if something went wrong in the idle path. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org>
2014-12-11 | sched: idle: Get the next timer event and pass it the cpuidle framework | Daniel Lezcano
Following the logic of the previous patch, retrieve from the idle task the expected timer sleep duration and pass it to the cpuidle framework. Take the opportunity to remove the unused headers in the menu.c file. This patch does not change the current behavior. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Len Brown <len.brown@intel.com>
2014-12-11 | sched: idle: cpuidle: Check the latency req before idle | Daniel Lezcano
When the pmqos latency requirement is set to zero, that means "poll in all the cases". That is correctly implemented on x86 but not on the other archs.
As the code is written, if the latency request is zero, the governor returns zero, which corresponds to the poll function on x86 but to the default idle function on the other archs. For example, on ARM this is wait-for-interrupt with a latency of '1', which violates the constraint.
In order to fix that, do the latency requirement check *before* calling the cpuidle framework, so as to jump to the poll function without entering cpuidle. That has several benefits:
1. It clarifies and unifies the code
2. It fixes the x86 vs other archs behavior
3. It factors out the call to the same function
4. It avoids entering the cpuidle framework with its expensive calculations
As the latency_req is needed in all the cases, change the select API to take the latency_req as a parameter in case it is not equal to zero.
As a positive side effect, the latency constraint is now specified externally, which is one more step toward the cpuidle/scheduler integration.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Len Brown <len.brown@intel.com>
2014-12-11 | sched: idle: Add a weak arch_cpu_idle_poll function | Daniel Lezcano
The poll function is called when a timer expired or if we force polling when the cpu_idle_force_poll option is set. The poll function does:
    local_irq_enable();
    while (!tif_need_resched())
            cpu_relax();
This default poll function suits the x86 arch because of its rep; nop; hardware power optimization. But on other archs this optimization does not exist and we are not saving power. The arch specific bits may want to optimize this loop by adding their own optimization.
Give the different platforms an opportunity to specify their own polling loop by adding a weak cpu_idle_poll_loop function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | sched: fair: Fix wrong idle timestamp usage | Daniel Lezcano
The find_idlest_cpu function assumes the rq->idle_stamp information reflects when the cpu entered the idle state. This is wrong, as the cpu may exit and enter the idle state several times without rq->idle_stamp being updated.
We have two pieces of information here:
* rq->idle_stamp gives when the idle task has been scheduled
* idle->idle_stamp gives when the cpu entered the idle state
The patch fixes that by using the latter and falls back to the rq's timestamp when the idle state is not accessible.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: Add some comments in the cpuidle_enter function | Daniel Lezcano
The code is a bit poor in comments. Fix that by adding some comments in the cpuidle enter function. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: Store the idle start time stamp | Daniel Lezcano
The scheduler uses the idle timestamp stored in the struct rq to retrieve the time when the cpu went idle in order to find the idlest cpu. Unfortunately this information is wrong as it does not have the same meaning from the cpuidle point of view. The idle_stamp in the struct rq gives the information when the idle task has been scheduled while the idle task could be interrupted several times and the cpu going through an idle/wakeup multiple times. Add the idle start time in the idle state structure. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-12-11 | cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logic | Daniel Lezcano
The only place where the time is invalid is when the ACPI_CSTATE_FFH entry method is not set. Otherwise for all the drivers, the time can be correctly measured. Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the drivers for all the states, just invert the logic by replacing it by the flag CPUIDLE_FLAG_TIME_INVALID, hence we can set this flag only for the acpi idle driver, remove the former flag from all the drivers and invert the logic with this flag in the different governor. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-07 | Linux 3.18 | Linus Torvalds
2014-12-07 | Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata | Linus Torvalds
Pull libata fixes from Tejun Heo:
"Three libata fixes for v3.18. Nothing too interesting. PCI ID and quirk additions to ahci and an error handling path fix in sata_fsl"
* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: disable MSI on SAMSUNG 0xa800 SSD
  sata_fsl: fix error handling of irq_of_parse_and_map
  AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller
2014-12-06 | Merge git://www.linux-watchdog.org/linux-watchdog | Linus Torvalds
Pull watchdog fix from Wim Van Sebroeck: "Fix the watchdog mask bit offset for Exynos7" * git://www.linux-watchdog.org/linux-watchdog: watchdog: s3c2410_wdt: Fix the mask bit offset for Exynos7
2014-12-06 | Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux | Linus Torvalds
Pull i2c fixes from Wolfram Sang:
"Here are two more driver bugfixes for I2C which would be good to have"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: cadence: Set the hardware time-out register to maximum value
  i2c: davinci: generate STP always when NACK is received
2014-12-05 | watchdog: s3c2410_wdt: Fix the mask bit offset for Exynos7 | Abhilash Kesavan
The watchdog mask bit offset listed for Exynos7 is incorrect. Fix this.
Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Acked-by: Naveen Krishna Chatradhi <naveenkrishna.ch@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2014-12-05 | Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull x86 fixes from Thomas Gleixner:
"Two final fixlets for 3.18:
- Prevent microcode reload wreckage on 32bit
- Unbreak cross compilation"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Limit the microcode reloading to 64-bit for now
  x86: Use $(OBJDUMP) instead of plain objdump
2014-12-05 | Merge tag 'sound-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound | Linus Torvalds
Pull sound fixlet from Takashi Iwai:
"Just one commit for adding a couple of HD-audio quirk entries"
* tag 'sound-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Add headset Mic support for new Dell machine
2014-12-04 | Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux | Linus Torvalds
Pull drm intel fixes from Dave Airlie: "Two intel stable fixes, that should be it from me for this round" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/i915: Unlock panel even when LVDS is disabled drm/i915: More cautious with pch fifo underruns
2014-12-04 | Merge tag 'pm+acpi-3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds
Pull ACPI backlight fix from Rafael Wysocki:
"This is a simple fix for an ACPI backlight regression introduced by a recent commit that overlooked a corner case which should have been taken into account"
* tag 'pm+acpi-3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / video: update condition to check if device is in _DOD list
2014-12-05 | Merge tag 'drm-intel-fixes-2014-12-04' of git://anongit.freedesktop.org/drm-intel into drm-fixes | Dave Airlie
Silence some pch fifo underrun reports and panel locking backtraces, both cc: stable.
* tag 'drm-intel-fixes-2014-12-04' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Unlock panel even when LVDS is disabled
  drm/i915: More cautious with pch fifo underruns
2014-12-04 | Merge tag 'media/v3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media | Linus Torvalds
Pull media fixes from Mauro Carvalho Chehab:
"A core fix and some driver fixes:
- regression fix in Remote Controller core affecting RC6 protocol handling
- fix video buffer handling in cx23885
- race fix in solo6x10
- fix image selection in smiapp
- fix reported payload size on s2255drv
- two updates for MAINTAINERS file"
* tag 'media/v3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] rc-core: fix toggle handling in the rc6 decoder
  MAINTAINERS: Update mchehab's addresses
  [media] cx23885: use sg = sg_next(sg) instead of sg++
  [media] s2255drv: fix payload size for JPG, MJPEG
  [media] Update MAINTAINERS for solo6x10
  [media] solo6x10: fix a race in IRQ handler
  [media] smiapp: Only some selection targets are settable
2014-12-04 | uapi: fix to export linux/vm_sockets.h | Masahiro Yamada
A typo "header=y" was introduced by commit 7071cf7fc435 ("uapi: add missing network related headers to kbuild"). Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-04 | i2c: cadence: Set the hardware time-out register to maximum value | Vishnu Motghare
The Cadence I2C controller has a bug wherein it generates invalid read transactions after a timeout in master receiver mode. This driver does not use the HW timeout and this interrupt is disabled, but the feature itself cannot be disabled. Hence, this patch writes the maximum value (0xFF) to this register. This is one of the workarounds for this bug; it will not avoid the issue completely, but it reduces the chances of error.
Signed-off-by: Vishnu Motghare <vishnum@xilinx.com>
Signed-off-by: Harini Katakam <harinik@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2014-12-04 | i2c: davinci: generate STP always when NACK is received | Grygorii Strashko
According to the I2C specification, the NACK should be handled as follows:
"When SDA remains HIGH during this ninth clock pulse, this is defined as the Not Acknowledge signal. The master can then generate either a STOP condition to abort the transfer, or a repeated START condition to start a new transfer."
[I2C spec Rev. 6, 3.1.6: http://www.nxp.com/documents/user_manual/UM10204.pdf]
Currently the Davinci i2c driver interrupts the transfer on receipt of a NACK but fails to send a STOP in some situations, and so leaves the bus stuck until the next I2C IP reset (idle/enable).
For example, the issue will happen during an SMBus read transfer, which consists of two i2c messages, write command/address and read data:
S Slave Address Wr A Command Code A Sr Slave Address Rd A D1..Dn A P
<--- write -----------------------> <--- read --------------------->
The I2C client device will send a NACK if it can't recognize "Command Code", and the I2C master is expected to generate STP in this case. But currently the Davinci I2C driver just exits with -EREMOTEIO and STP is not generated.
Hence, fix it by always generating a Stop condition (STP) when a NACK is received.
This patch fixes Davinci I2C in the same way it was done for OMAP I2C in commit cda2109a26eb ("i2c: omap: query STP always when NACK is received").
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reported-by: Hein Tibosch <hein_tibosch@yahoo.es>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2014-12-04 | ahci: disable MSI on SAMSUNG 0xa800 SSD | Tejun Heo
Just like 0x1600 which got blacklisted by 66a7cbc303f4 ("ahci: disable MSI instead of NCQ on Samsung pci-e SSDs on macbooks"), 0xa800 chokes on NCQ commands if MSI is enabled. Disable MSI. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dominik Mierzejewski <dominik@greysector.net> Link: https://bugzilla.kernel.org/show_bug.cgi?id=89171 Cc: stable@vger.kernel.org
2014-12-03 | context_tracking: Restore previous state in schedule_user | Andy Lutomirski
It appears that some SCHEDULE_USER (asm for schedule_user) callers in arch/x86/kernel/entry_64.S are called from RCU kernel context, and schedule_user will return in RCU user context. This causes RCU warnings and possible failures. This is intended to be a minimal fix suitable for 3.18. Reported-and-tested-by: Dave Jones <davej@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>