|
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
This reverts commit c817b87cb66410545e0b45f05a015d3b6bc2cec3.
Per request from the patch's author.
|
|
In order to support ONESHOT_STOPPED mode.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
In order to support ONESHOT_STOPPED mode.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
The clockevent device can now be stopped by switching to ONESHOT_STOPPED mode, to
avoid getting spurious interrupts on a tickless CPU.
This patch switches the mode to ONESHOT_STOPPED at three different places; the
reasoning behind each of them follows.
1.) NOHZ_MODE_LOWRES
Timers & hrtimers depend on the tick in this mode, and the only place from which
the clockevent device is programmed is the tick code.
So, we only need to switch the clockevent device to ONESHOT_STOPPED mode once ticks
aren't required anymore, and the only call site is tick_nohz_stop_sched_tick().
In LOWRES mode we skip reprogramming the clockevent device here if expires ==
KTIME_MAX. In addition, we must also switch the clockevent device to
ONESHOT_STOPPED mode to avoid any spurious interrupts that may follow.
2.) NOHZ_MODE_HIGHRES
The tick & timers depend on hrtimers in this mode, and the only place from which
the clockevent device is programmed is the hrtimer code.
There are two places here from which we reprogram the clockevent device, or skip
reprogramming it when expires == KTIME_MAX.
Instead of merely skipping the reprogramming, also switch the device's mode to
ONESHOT_STOPPED so that it doesn't generate any spurious interrupts.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18: manually applied a few patches]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
The clockevent device might have been switched to ONESHOT_STOPPED mode to avoid
getting spurious interrupts on a tickless CPU. Before reprogramming the next event,
we must reconfigure the clockevent device back to ONESHOT mode if required.
This patch switches the mode to ONESHOT at three different places; the
reasoning behind each of them follows.
1.) NOHZ_MODE_LOWRES
Timers & hrtimers depend on the tick in this mode, and the only place from which
the clockevent device is programmed is the tick code.
So, we need to switch the clockevent device to ONESHOT mode before we start using
it. Two routines can restart ticks here in LOWRES mode:
tick_nohz_stop_sched_tick() and tick_nohz_restart().
2.) NOHZ_MODE_HIGHRES
The tick & timers depend on hrtimers in this mode, and the only place from which
the clockevent device is programmed is the hrtimer code.
Only hrtimer_reprogram() is responsible for programming the clockevent device
for the next event if the device was stopped earlier, and updating that
alone is sufficient here.
To make sure we haven't missed any corner case, add a WARN() for the case where
we try to reprogram clockevent device while we aren't configured in
ONESHOT_STOPPED mode.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
When no timers/hrtimers are pending, the expiry time is set to a special value:
'KTIME_MAX'. This normally happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES
modes.
When 'expiry == KTIME_MAX', we either cancel the 'tick-sched' hrtimer
(NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device (NOHZ_MODE_LOWRES).
But the clockevent device has already been reprogrammed from the tick handler for
the next tick.
As the clockevent device is programmed in ONESHOT mode, it will fire at least one
more time (unnecessarily). Timers on many implementations (like arm_arch_timer,
powerpc, etc.) only support PERIODIC mode and their drivers emulate ONESHOT over
it. This means that on these platforms we will get spurious interrupts at the
last programmed interval rate, normally the tick rate.
In order to avoid spurious interrupts/wakeups, the clockevent device should be
stopped or its interrupts should be masked.
A simple (yet hacky) solution to get this fixed could be: update
hrtimer_force_reprogram() to always reprogram the clockevent device, and update
clockevent drivers to STOP generating events (or delay them to the max time) when
'expires' is set to KTIME_MAX. But the drawback here is that every clockevent
driver would have to be hacked for this particular case, and it's very easy for
new ones to miss it.
However, Thomas suggested adding an optional mode, ONESHOT_STOPPED, to solve this
problem: lkml.org/lkml/2014/5/9/508.
This patch adds support for ONESHOT_STOPPED mode in clockevents core. It will
only be available to drivers that implement the mode-specific set-mode callbacks
instead of the legacy ->set_mode() callback.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Many clockevent drivers use a switch block for handling modes in their
->set_mode() callback. Some of these do not have a 'default' case, and adding a
new mode to enum clock_event_mode starts producing warnings about unhandled
modes on these platforms.
This patch adds default cases for them.
In order to keep things simple, add these two lines to the switch blocks:
default:
break;
This can lead to different behavior for individual cases. Some of the drivers
don't do any special stuff in their ->set_mode() callback before or after the
switch blocks. And so this default case would simply return for them without any
updates to the clockevent device.
But in some cases, the clockevent device is stopped as soon as we enter the
->set_mode() callback and so it will stay stopped if we hit the default case.
The rationale behind this approach is that the default case *will never* be hit
at runtime. All new modes (beyond RESUME) are handled with mode-specific
->set_mode_*() callbacks, and ->set_mode() is never called for them; all modes
up to and including RESUME are already handled by the clockevent drivers.
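As a sketch of the pattern (a hypothetical driver callback; the enum values and handler bodies are illustrative, not the kernel's actual definitions):

```c
enum clock_event_mode {
	CLOCK_EVT_MODE_UNUSED,
	CLOCK_EVT_MODE_SHUTDOWN,
	CLOCK_EVT_MODE_PERIODIC,
	CLOCK_EVT_MODE_ONESHOT,
	CLOCK_EVT_MODE_RESUME,
	/* New mode: delivered only via a mode-specific callback,
	 * never through the legacy ->set_mode(). */
	CLOCK_EVT_MODE_ONESHOT_STOPPED,
};

/* Returns 1 when the mode was handled, 0 when the default case ran. */
static int example_set_mode(enum clock_event_mode mode)
{
	switch (mode) {
	case CLOCK_EVT_MODE_PERIODIC:
	case CLOCK_EVT_MODE_ONESHOT:
		/* program the hardware */
		return 1;
	case CLOCK_EVT_MODE_SHUTDOWN:
	case CLOCK_EVT_MODE_UNUSED:
		/* stop the device */
		return 1;
	case CLOCK_EVT_MODE_RESUME:
		return 1;
	default:
		/* silences unhandled-enum warnings for new modes;
		 * never reached, since new modes use the
		 * mode-specific callbacks */
		break;
	}
	return 0;
}
```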
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18;
ignore these files as they don't exist in the 3.18 kernel:
- arch/mips/loongson/loongson-3/hpet.c
- arch/nios2/kernel/time.c
]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
It is not possible for the clockevents core to know which modes (other than
those with a corresponding feature flag) are supported by a particular
implementation, and drivers are expected to handle transitions to all modes
gracefully, as ->set_mode() is issued for them unconditionally.
Now, adding support for a new mode complicates things a bit if we want to use
the legacy ->set_mode() callback. We need to closely review all clockevents
drivers to see if they would break on addition of a new mode, and such reviews
showed that non-trivial changes would be needed in most of the drivers [1].
Introduce mode-specific set_mode_*() callbacks, some of which the drivers may or
may not implement. A missing callback would clearly convey the message that the
corresponding mode isn't supported.
A driver may still choose to keep supporting the legacy ->set_mode() callback,
but ->set_mode() won't support any new modes beyond RESUME. If a driver
wants to benefit from a new mode, it is required to migrate to the
mode-specific callbacks.
The legacy ->set_mode() callback and the newly introduced mode-specific
callbacks are mutually exclusive. Only one of them should be supported by the
driver.
A sanity check is done at registration time to distinguish between optional
and required callbacks and to make error recovery and handling simpler. If the
legacy ->set_mode() callback is provided, all mode-specific ones are
ignored by the core.
Call sites calling ->set_mode() directly are also updated to use
__clockevents_set_mode() instead, as ->set_mode() may not be available anymore
for some drivers.
[1] https://lkml.org/lkml/2014/12/9/605
[2] https://lkml.org/lkml/2015/1/23/255
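The mutual-exclusion check described above might look roughly like this (the struct and callback names are simplified assumptions, not the exact kernel interface):

```c
struct clock_event_ops {
	void (*set_mode)(int);            /* legacy callback */
	int  (*set_mode_shutdown)(void);  /* mode-specific callbacks */
	int  (*set_mode_periodic)(void);
	int  (*set_mode_oneshot)(void);
};

/* Registration-time sanity check: a driver supplies either the legacy
 * ->set_mode() or the mode-specific callbacks, never both (and never
 * neither). Returns 0 on success, -1 on an invalid combination. */
static int clockevents_check_ops(const struct clock_event_ops *ops)
{
	int has_specific = ops->set_mode_shutdown || ops->set_mode_periodic ||
			   ops->set_mode_oneshot;

	if (ops->set_mode && has_specific)
		return -1;	/* mutually exclusive */
	if (!ops->set_mode && !has_specific)
		return -1;	/* no way to drive the device at all */
	return 0;
}

/* Dummy callbacks for demonstration. */
static void demo_set_mode(int mode) { (void)mode; }
static int demo_set_mode_shutdown(void) { return 0; }
```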
Suggested-by: Thomas Gleixner <tglx@linutronix.de> [2]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
An hrtimer may be pinned to a CPU but inactive, so it is no longer valid
to test the hrtimer.state struct member for having no bits set when checking
for inactivity. Change the test function to mask out the HRTIMER_STATE_PINNED
bit when checking for the inactive state.
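The changed check can be sketched as follows (the bit values are illustrative, not the kernel's actual definitions):

```c
/* Illustrative hrtimer state bits; the real kernel values may differ. */
#define HRTIMER_STATE_INACTIVE	0x00UL
#define HRTIMER_STATE_ENQUEUED	0x01UL
#define HRTIMER_STATE_CALLBACK	0x02UL
#define HRTIMER_STATE_PINNED	0x04UL

/* A pinned-but-inactive hrtimer carries only the PINNED bit, so it
 * must be masked out before comparing against INACTIVE. */
static inline int hrtimer_state_inactive(unsigned long state)
{
	return (state & ~HRTIMER_STATE_PINNED) == HRTIMER_STATE_INACTIVE;
}
```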
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
Allow debugfs override of sched_tick_max_deferment in order to ease
finding/fixing the remaining issues with full nohz.
The value to be written is in jiffies, and -1 means the max deferment
is disabled (scheduler_tick_max_deferment() returns KTIME_MAX.)
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
|
|
When expires is set to KTIME_MAX in tick_program_event(), we are sure that there
are no events enqueued for a very long time, so there is no point keeping the
event device running. We would be interrupted many times without any work to do,
for example when the timer's counter overflows.
So, it's better to SHUTDOWN the event device then, and restart it once we get a
request for the next event. To implement this, a new field 'last_mode' is added
to 'struct clock_event_device' to keep track of the last mode used.
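The idea can be sketched as follows (a simplified model; the real structure, modes, and programming logic live in the kernel's clockevents code):

```c
#define KTIME_MAX	9223372036854775807LL

enum clock_mode { MODE_SHUTDOWN, MODE_ONESHOT, MODE_PERIODIC };

struct clock_event_model {
	enum clock_mode mode;
	enum clock_mode last_mode;	/* new field: mode before SHUTDOWN */
};

/* Shut the device down when no event is queued for a very long time,
 * and restore last_mode when the next real event is programmed. */
static void tick_program_event_model(struct clock_event_model *dev,
				     long long expires)
{
	if (expires == KTIME_MAX) {
		dev->last_mode = dev->mode;
		dev->mode = MODE_SHUTDOWN;
		return;
	}
	if (dev->mode == MODE_SHUTDOWN)
		dev->mode = dev->last_mode;
	/* ...program the hardware for `expires` here... */
}
```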
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
In hrtimer_force_reprogram(), we reprogram the event device only if the next
timer event is before KTIME_MAX. But what if it is equal to KTIME_MAX? As we
don't reprogram the device then, it stays set to its last value, probably the
tick interval, i.e. a few milliseconds.
We will then get an interrupt, have no hrtimers to service, and return without
doing much. The implementation of the event device's driver may make this even
worse. For example, drivers/clocksource/arm_arch_timer.c disables the event
device only on SHUTDOWN/UNUSED requests in set-mode; otherwise, it keeps
raising interrupts at the tick interval even if hrtimer_interrupt() didn't
reprogram it.
To fix this, let's reprogram the event device even for KTIME_MAX, so that the
timer is scheduled far enough into the future.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18 kernel]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
CPUsets now have a cpusets.quiesce sysfs file, with which some CPUs can opt to
isolate themselves from background kernel activities such as timers & hrtimers.
get_nohz_timer_target() is used for finding a suitable CPU for firing a timer. To
guarantee that new timers won't be queued on quiesced CPUs, we need to modify
this routine.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
For networking applications, platforms need to provide one CPU for each
user-space data-plane thread. These CPUs shouldn't be interrupted by the kernel
at all unless userspace has requested some functionality. Currently, background
kernel activities run on almost every CPU, such as
timers/hrtimers/watchdogs/etc., and these need to be migrated to other CPUs.
To achieve that, this patch adds another option to cpusets: 'quiesce'.
Writing '1' to this file migrates unbound/unpinned timers/hrtimers
away from the CPUs of the cpuset in question. It also disallows adding any new
unpinned timers/hrtimers to the isolated CPUs (this is handled in the next
patch). Writing '0' disables isolation of the CPUs in the current cpuset, and
unpinned timers/hrtimers are allowed on these CPUs again.
Currently, only timers and hrtimers are migrated. This would be followed by
other kernel infrastructure later if required.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
To isolate CPUs from hrtimers via cpusets from sysfs, we need some support from
the hrtimer core: a routine hrtimer_quiesce_cpu() which migrates away all the
unpinned hrtimers but doesn't touch the pinned ones.
This patch creates this routine.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
The 'pinned' information is now required in migrate_hrtimers(), as we can
migrate non-pinned timers away without a hotplug (i.e. with cpuset.quiesce), and
so we need to identify pinned timers, which we can't migrate.
This patch reuses the timer->state variable for this flag, as there are enough
free bits available in it; there is no point growing the struct by adding
another field.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
To isolate CPUs from timers via cpusets from sysfs, we need some support from
the timer core: a routine timer_quiesce_cpu() which migrates away all the
unpinned timers but doesn't touch the pinned ones.
This patch creates this routine.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
In order to quiesce a CPU on which isolation might be required, we need to move
away all the timers queued on that CPU. There are two types of timers queued on
any CPU: ones that are pinned to that CPU, and others that can run on any CPU
but happen to be queued on the CPU in question. We only need to migrate the
second type away from the CPU entering the quiesce state.
For this we need some basic infrastructure in timer core to identify which
timers are pinned and which are not.
Hence, this patch adds another flag bit TIMER_PINNED which will be set only for
the timers which are pinned to a CPU.
It also removes the 'pinned' parameter of __mod_timer() as it is no longer required.
NOTE: one functional change worth mentioning:
Existing behavior: add_timer_on() followed by multiple mod_timer() calls
wouldn't pin the timer on the CPU given to add_timer_on().
New behavior: add_timer_on() followed by multiple mod_timer() calls pins the
timer on the CPU running mod_timer().
I didn't give this much attention, as we should call mod_timer_on() for the
timers queued with add_timer_on(). Though, if required, we can simply clear the
TIMER_PINNED flag in mod_timer().
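A sketch of the flag and the migration filter (the bit value, struct layout, and helper names are illustrative, not the kernel's actual definitions):

```c
#define TIMER_PINNED	0x1UL	/* illustrative flag bit */

struct timer_model {
	unsigned long flags;
};

/* Queuing on a specific CPU marks the timer as pinned. */
static void add_timer_on_model(struct timer_model *t, int cpu)
{
	(void)cpu;
	t->flags |= TIMER_PINNED;
}

/* Quiescing a CPU migrates only the timers that are not pinned. */
static int timer_should_migrate(const struct timer_model *t)
{
	return !(t->flags & TIMER_PINNED);
}
```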
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
|
|
Noticed a build break on x86:
In file included from drivers/mtd/chips/chipreg.c:11:0:
include/linux/mtd/map.h: In function ‘inline_map_write’:
include/linux/mtd/map.h:420:3: error: implicit declaration of function ‘writeb_relaxed’ [-Werror=implicit-function-declaration]
writeb_relaxed(datum.x[0], map->virt + ofs);
^
include/linux/mtd/map.h:422:3: error: implicit declaration of function ‘writew_relaxed’ [-Werror=implicit-function-declaration]
writew_relaxed(datum.x[0], map->virt + ofs);
^
include/linux/mtd/map.h:424:3: error: implicit declaration of function ‘writel_relaxed’ [-Werror=implicit-function-declaration]
writel_relaxed(datum.x[0], map->virt + ofs);
This commit breaks the build for x86:
commit 1e885fabb4b148110c43056dd16842d11cde11db
Author: Victor Kamensky <victor.kamensky@linaro.org>
Date: Mon Jul 29 22:03:27 2013 -0700
mtd: map.h endian fix
Need to use endian neutral functions to read/write LE h/w registers, i.e.
instead of __raw_readl and __raw_writel, readl_relaxed and writel_relaxed.
While the former just read/write the register with a memory barrier, the latter
will byteswap it if the host operates in BE mode.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
Previous patches fixed endian-neutral issues in the Arndale BSP, so mark it as
one that supports big endian.
Original patch signed-by: Victor Kamensky <victor.kamensky@linaro.org>
Merged into linux v3.18.11 by Gary S. Robertson
Conflicts:
arch/arm/mach-exynos/Kconfig
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
Need to use endian neutral functions to read/write LE h/w registers, i.e.
instead of the __raw_read[lw] and __raw_write[lw] functions, the code needs to
use the read[lw]_relaxed and write[lw]_relaxed functions. While the former just
read/write the register with a memory barrier, the latter will byteswap it if
the host operates in BE mode.
This patch covers drivers used by the Arndale board where all changes are
trivial, sed-like replacements of __raw_xxx functions with their xxx_relaxed
variants.
Literally this sed program was used to make the change:
s|__raw_readl|readl_relaxed|g
s|__raw_writel|writel_relaxed|g
s|__raw_readw|readw_relaxed|g
s|__raw_writew|writew_relaxed|g
Original patch signed-by: Victor Kamensky <victor.kamensky@linaro.org>
Original patch signed-by: Anders Roxell <anders.roxell@linaro.org>
Merged into linux v3.18.11 by Gary S. Robertson
Conflicts:
arch/arm/mach-exynos/platsmp.c
Conflicts:
arch/arm/mach-exynos/common.c
arch/arm/mach-exynos/cpuidle.c
arch/arm/mach-exynos/hotplug.c
arch/arm/mach-exynos/include/mach/pm-core.h
arch/arm/mach-exynos/platsmp.c
arch/arm/mach-exynos/pm.c
arch/arm/mach-exynos/pm_domains.c
arch/arm/mach-exynos/pmu.c
arch/arm/plat-samsung/pm.c
drivers/clk/samsung/clk-pll.c
drivers/clk/samsung/clk.c
drivers/clocksource/exynos_mct.c
drivers/cpufreq/exynos4210-cpufreq.c
drivers/cpufreq/exynos4x12-cpufreq.c
drivers/cpufreq/exynos5250-cpufreq.c
drivers/cpufreq/exynos5440-cpufreq.c
drivers/gpio/gpio-samsung.c
drivers/thermal/samsung/exynos_tmu.c
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
The uncompress serial line write utils need to use endian-neutral functions to
read h/w registers, i.e. in case of a BE host a byteswap is needed. Fix uart_rd,
uart_wr and the serial chip FIFO related macros.
Original-patch-signed-by: Victor Kamensky <victor.kamensky@linaro.org>
Merged into linux v3.18.11 by Gary S. Robertson
Conflicts:
arch/arm/include/debug/samsung.S
arch/arm/mach-exynos/include/mach/uncompress.h
arch/arm/plat-samsung/include/plat/uncompress.h
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
The Samsung serial driver uses the __set_bit and __clear_bit functions directly
on h/w registers, but these functions are not endian-neutral. On a BE host, a
byteswap is needed before operating on the bit, and another byteswap after the
operation on the bit is complete. This patch creates and uses __hw_set_bit and
__hw_clear_bit, which on LE just call __set_bit and __clear_bit, but on BE do
the required byteswaps.
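The shape of such helpers can be sketched as follows (assuming 32-bit LE registers; the swap helper and the endianness test are illustrative, not the driver's actual code):

```c
#include <stdint.h>

/* Byteswap a 32-bit value. */
static inline uint32_t swab32_val(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
}

/* On a BE host the LE register value must be swapped before and after
 * the bit operation; on LE it is used as-is. */
#ifdef __BIG_ENDIAN
#define hw_fixup(x) swab32_val(x)
#else
#define hw_fixup(x) (x)
#endif

static inline void __hw_set_bit(int nr, volatile uint32_t *reg)
{
	*reg = hw_fixup(hw_fixup(*reg) | (1u << nr));
}

static inline void __hw_clear_bit(int nr, volatile uint32_t *reg)
{
	*reg = hw_fixup(hw_fixup(*reg) & ~(1u << nr));
}
```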
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
|
|
If the kernel operates in BE mode on a board that has an LE bootloader/ROM code,
we need to switch the CPU to BE mode as soon as possible. The generic
secondary_startup that is called from the exynos-specific secondary startup code
will do the switch, but we need it to happen earlier because the exynos-specific
secondary_startup code works with BE data.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
|
|
Need to use endian neutral functions to read/write h/w registers, i.e.
__raw_readl replaced with readl_relaxed and __raw_writel replaced with
writel_relaxed. The relaxed version reads/writes the LE h/w register and
byteswaps it if the host operates in BE mode.
However, in this file __raw_read(wlq) and __raw_write(wlq) are also used to
transfer data from a uchar buffer into the h/w mmc host register, and in this
case no byteswap is needed: bytes of the data buffer should go into the h/w
register in the same order as they are in memory. So we need to split the
control mci_readl/mci_writel macros from those that operate on data:
mci_readw_data, mci_readl_data, mci_readq_data, mci_writew_data,
mci_writel_data, mci_writeq_data. The latter do not byteswap.
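The split can be sketched as follows (the swap helper and the BE detection are illustrative; the macro names follow the commit text):

```c
#include <stdint.h>

/* LE register value to CPU order: swap only on a BE host. */
static inline uint32_t le32_to_cpu_model(uint32_t v)
{
#ifdef __BIG_ENDIAN
	return __builtin_bswap32(v);
#else
	return v;
#endif
}

/* Control registers: byteswap on BE (readl_relaxed semantics). */
#define mci_readl(addr)		 le32_to_cpu_model(*(volatile uint32_t *)(addr))
#define mci_writel(addr, v)	 (*(volatile uint32_t *)(addr) = le32_to_cpu_model(v))

/* Data FIFO: keep memory byte order, never swap. */
#define mci_readl_data(addr)	 (*(volatile uint32_t *)(addr))
#define mci_writel_data(addr, v) (*(volatile uint32_t *)(addr) = (v))
```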
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
|
|
Need to use endian neutral functions to read/write LE h/w registers, i.e.
instead of __raw_readl and __raw_writel, readl_relaxed and writel_relaxed.
While the former just read/write the register with a memory barrier, the latter
will byteswap it if the host operates in BE mode.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
|
|
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
Conflicts:
kernel/irq/manage.c
Conflicts:
include/linux/interrupt.h
kernel/irq/manage.c
|
|
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
|
|
git://git.linaro.org/people/anders.roxell/linux-rt into linux-linaro-lsk-v3.18-rt
Linux 3.18.13-rt10
Changes since v3.18.13-rt9:
- Update of the dead lock fix for ftrace.
|
|
|
|
* v3.18/topic/configs:
linaro/configs: preempt-rt: remove SLUB setting
linaro/configs: remove android.conf, not used
|
|
RT now has its own Kconfig dependencies set up for the right allocator,
so it's not needed here.
Signed-off-by: Kevin Hilman <khilman@linaro.org>
|
|
|
|
Used the "ours" merge strategy to throw away the previous -rt releases
|
|
Android/AOSP ships with android/configs/*.cfg which should be used
as config fragments. Drop the Linaro-specific one in favor of the AOSP
fragments.
Signed-off-by: Kevin Hilman <khilman@linaro.org>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-8vdw4bfcsds27cvox6rpb334@git.kernel.org
|
|
Austin reported an XFS deadlock/stall on RT where scheduled work never
gets executed and tasks wait on each other forever.
The underlying problem is the RT modification to the handling of workers
which are about to go to sleep. In mainline, a worker thread which goes
to sleep wakes an idle worker if there is more work to do; this happens
from the guts of the schedule() function. On RT this must be done outside,
and the accessed data structures are not protected against scheduling due
to the spinlock-to-rtmutex conversion. So the naive solution was to move
the code outside of the scheduler and protect the data structures with
the pool lock. That approach turned out to be a little naive, as we cannot
call into that code when the thread blocks on a lock, since it is not
allowed to block on two locks in parallel. So we don't call into the worker
wakeup magic when the worker is blocked on a lock, which causes the
deadlock/stall observed by Austin and Mike.
Looking deeper into that worker code it turns out that the only
relevant data structure which needs to be protected is the list of
idle workers which can be woken up.
So the solution is to protect the list manipulation operations with
preempt_enable/disable pairs on RT and call unconditionally into the
worker code even when the worker is blocked on a lock. The preemption
protection is safe as there is nothing which can fiddle with the list
outside of thread context.
Reported-and-tested-by: Austin Schuh <austin@peloton-tech.com>
Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://vger.kernel.org/r/alpine.DEB.2.10.1406271249510.5170@nanos
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: stable-rt@vger.kernel.org
|
|
I talked with Peter Zijlstra about this, and he told me that the clearing
of the PF_NO_SETAFFINITY flag was to deal with the optimization of
migrate_disable/enable() that ignores tasks that have that flag set. But
that optimization was removed when I did a rework of the cpu hotplug code.
I found that ignoring tasks that had that flag set would cause those tasks
to fall out of sync with the hotplug code and crash the kernel. Thus the
code needed to stop treating them specially, and those tasks had to go
through the same work as tasks without that flag set.
Now that those tasks are not treated special, there's no reason to clear the
flag.
May still need to be tested as the migrate_me() code does not ignore those
flags.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140701111444.0cfebaa1@gandalf.local.home
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
It uses anon semaphores
|drivers/md/bcache/request.c: In function ‘cached_dev_write_complete’:
|drivers/md/bcache/request.c:1007:2: error: implicit declaration of function ‘up_read_non_owner’ [-Werror=implicit-function-declaration]
| up_read_non_owner(&dc->writeback_lock);
| ^
|drivers/md/bcache/request.c: In function ‘request_write’:
|drivers/md/bcache/request.c:1033:2: error: implicit declaration of function ‘down_read_non_owner’ [-Werror=implicit-function-declaration]
| down_read_non_owner(&dc->writeback_lock);
| ^
either we get rid of those or we have to introduce them…
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
notify_cmos_timer() in the ntp code is called from hard interrupt context.
Under PREEMPT_RT_FULL, schedule_delayed_work() takes spinlocks that have been
converted to mutexes, so calling schedule_delayed_work() from interrupt context
is not safe.
Add a helper thread that does the call to schedule_delayed_work(), and wake
that thread up instead of calling schedule_delayed_work() directly.
This is only done for CONFIG_PREEMPT_RT_FULL; otherwise the code still calls
schedule_delayed_work() directly in irq context.
Note: there are a few places in the kernel that do this. Perhaps the RT code
should have a dedicated thread that does the checks: just register a notifier
on boot up for your check and wake the thread when needed. This will be a todo.
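A user-space model of the approach (pthreads stand in for the kernel's kthread and waitqueue; schedule_delayed_work() is modeled as a counter, and the mutex in the notify path is a simplification of the kernel's irq-safe wake-up, so all names here are illustrative):

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int pending, done;
static atomic_int work_runs;

/* Stands in for schedule_delayed_work(), which may sleep on RT. */
static void schedule_delayed_work_model(void)
{
	atomic_fetch_add(&work_runs, 1);
}

/* Helper thread: performs the potentially sleeping call in thread
 * context, draining any pending request before exiting. */
static void *cmos_delay_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	for (;;) {
		while (!pending && !done)
			pthread_cond_wait(&cond, &lock);
		if (pending) {
			pending = 0;
			pthread_mutex_unlock(&lock);
			schedule_delayed_work_model();
			pthread_mutex_lock(&lock);
		} else if (done) {
			break;
		}
	}
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Called from (modeled) hard-irq context: only flag and wake. */
static void notify_cmos_timer_model(void)
{
	pthread_mutex_lock(&lock);
	pending = 1;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
}

static void cmos_delay_thread_stop(pthread_t t)
{
	pthread_mutex_lock(&lock);
	done = 1;
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
	pthread_join(t, NULL);
}
```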
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
mm, memcg: make refill_stock() use get_cpu_light()
Nikita reported the following memcg scheduling while atomic bug:
Call Trace:
[e22d5a90] [c0007ea8] show_stack+0x4c/0x168 (unreliable)
[e22d5ad0] [c0618c04] __schedule_bug+0x94/0xb0
[e22d5ae0] [c060b9ec] __schedule+0x530/0x550
[e22d5bf0] [c060bacc] schedule+0x30/0xbc
[e22d5c00] [c060ca24] rt_spin_lock_slowlock+0x180/0x27c
[e22d5c70] [c00b39dc] res_counter_uncharge_until+0x40/0xc4
[e22d5ca0] [c013ca88] drain_stock.isra.20+0x54/0x98
[e22d5cc0] [c01402ac] __mem_cgroup_try_charge+0x2e8/0xbac
[e22d5d70] [c01410d4] mem_cgroup_charge_common+0x3c/0x70
[e22d5d90] [c0117284] __do_fault+0x38c/0x510
[e22d5df0] [c011a5f4] handle_pte_fault+0x98/0x858
[e22d5e50] [c060ed08] do_page_fault+0x42c/0x6fc
[e22d5f40] [c000f5b4] handle_page_fault+0xc/0x80
What happens:
refill_stock()
get_cpu_var()
drain_stock()
res_counter_uncharge()
res_counter_uncharge_until()
spin_lock() <== boom
Fix it by replacing get/put_cpu_var() with get/put_cpu_light().
Cc: stable-rt@vger.kernel.org
Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
To avoid:
|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
|in_atomic(): 1, irqs_disabled(): 0, pid: 92, name: rcuc/11
|2 locks held by rcuc/11/92:
| #0: (rcu_callback){......}, at: [<ffffffff810e037e>] rcu_cpu_kthread+0x3de/0x940
| #1: (rcu_read_lock_sched){......}, at: [<ffffffff81328390>] percpu_ref_call_confirm_rcu+0x0/0xd0
|Preemption disabled at:[<ffffffff813284e2>] percpu_ref_switch_to_atomic_rcu+0x82/0xc0
|CPU: 11 PID: 92 Comm: rcuc/11 Not tainted 3.18.7-rt0+ #1
| ffff8802398cdf80 ffff880235f0bc28 ffffffff815b3a12 0000000000000000
| 0000000000000000 ffff880235f0bc48 ffffffff8109aa16 0000000000000000
| ffff8802398cdf80 ffff880235f0bc78 ffffffff815b8dd4 000000000000df80
|Call Trace:
| [<ffffffff815b3a12>] dump_stack+0x4f/0x7c
| [<ffffffff8109aa16>] __might_sleep+0x116/0x190
| [<ffffffff815b8dd4>] rt_spin_lock+0x24/0x60
| [<ffffffff8108d2cd>] queue_work_on+0x6d/0x1d0
| [<ffffffff8110c881>] css_release+0x81/0x90
| [<ffffffff8132844e>] percpu_ref_call_confirm_rcu+0xbe/0xd0
| [<ffffffff813284e2>] percpu_ref_switch_to_atomic_rcu+0x82/0xc0
| [<ffffffff810e03e5>] rcu_cpu_kthread+0x445/0x940
| [<ffffffff81098a2d>] smpboot_thread_fn+0x18d/0x2d0
| [<ffffffff810948d8>] kthread+0xe8/0x100
| [<ffffffff815b9c3c>] ret_from_fork+0x7c/0xb0
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Completions have no long-lasting callbacks and therefore do not need
the complex waitqueue variant. Use simple waitqueues, which reduces
contention on the waitqueue lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merged Steven's
static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) {
- swait_wake(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
+ wake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
}
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|