summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-04-07clocksource: Augment bindings for Faraday timerLinus Walleij
It turns out that the Cortina Gemini timer block is just a standard IP block from Faraday Technology named FTTMR010. In order to make things clear and understandable, we rename the bindings with a Faraday compatible as primary and the Cortina gemini as a more specific case. For the plain Faraday timer we require two clock references, while the Gemini can keep it's syscon lookup pattern. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Rob Herring <robh@kernel.org>
2017-04-07ARM: dts: rockchip: disable arm-global-timer for rk3188Alexander Kochetkov
The clocksource and the sched_clock provided by the arm_global_timer are quite unstable because their rates depend on the cpu frequency. On the other side, the arm_global_timer has a higher rating than the rockchip_timer, it will be selected by default by the time framework while we want to use the stable rockchip clocksource. Let's disable the arm_global_timer in order to have the rockchip clocksource selected by default. Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
2017-04-07ARM: dts: rockchip: Add timer entries to rk3188 SoCAlexander Kochetkov
The patch add two timers to all rk3188 based boards. The first timer is from alive subsystem and it act as a backup for the local timers at sleep time. It act the same as other SoC rockchip timers already present in kernel. The second timer is from CPU subsystem and act as replacement for the arm-global-timer clocksource and sched clock. It run at stable frequency 24MHz. Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
2017-04-07clocksource/drivers/rockchip_timer: Implement clocksource timerAlexander Kochetkov
The clock supplying the arm-global-timer on the rk3188 is coming from the the cpu clock itself and thus changes its rate everytime cpufreq adjusts the cpu frequency making this timer unsuitable as a stable clocksource and sched clock. The rk3188, rk3288 and following socs share a separate timer block already handled by the rockchip-timer driver. Therefore adapt this driver to also be able to act as clocksource and sched clock on rk3188. In order to test clocksource you can run following commands and check how much time it take in real. On rk3188 it take about ~45 seconds. cpufreq-set -f 1.6GHZ date; sleep 60; date In order to use the patch you need to declare two timers in the dts file. The first timer will be initialized as clockevent provider and the second one as clocksource. The clockevent must be from alive subsystem as it used as backup for the local timers at sleep time. The patch does not break compatibility with older device tree files. The older device tree files contain only one timer. The timer will be initialized as clockevent, as expected. rk3288 (and probably anything newer) is irrelevant to this patch, as it has the arch timer interface. This patch may be useful for Cortex-A9/A5 based parts. Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-04-07ARM: dts: rockchip: Update compatible property for rk322x timerAlexander Kochetkov
Property set to '"rockchip,rk3228-timer", "rockchip,rk3288-timer"' to match devicetree bindings. Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Suggested-by: Heiko Stübner <heiko@sntech.de> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-04-07dt-bindings: Clarify compatible property for rockchip timersAlexander Kochetkov
Make all properties description in form '"rockchip,<chip>-timer", "rockchip,rk3288-timer"' for all chips found in linux kernel. Suggested-by: Heiko Stübner <heiko@sntech.de> Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-04-07clocksource: Add missing line break to error messagesRafał Miłecki
Printing with pr_* functions requires adding line break manually. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-04-07clocksource/drivers/orion: Add delay_timer implementationRussell King
Add an implementation for the ARM delay timer, which is used for udelay(). This provides less CPU dependent and more accurate delays - the CPU loop on Marvell Dove appears to calibrate to around 6% too short. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-04-07clocksource/drivers/orion: Read clock rate onceRussell King
Rather than reading the clock rate three times, read it once - we are about to add a fourth usage. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-03-31timekeeping: Remove pointless conversion to boolNicholas Mc Guire
interp_forward is type bool so assignment from a logical operation directly is sufficient. Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Cc: "Christopher S. Hall" <christopher.s.hall@intel.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1490382215-30505-1-git-send-email-der.herr@hofr.at Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-03-31Merge branch 'fortglx/4.12/time' of ↵Thomas Gleixner
https://git.linaro.org/people/john.stultz/linux into timers/core Pull timekeeping changes from John Stultz: Main changes are the initial steps of Nicoli's work to make the clockevent timers be corrected for NTP adjustments. Then a few smaller fixes that I've queued, and adding Stephen Boyd to the maintainers list for timekeeping.
2017-03-23sysrq: Reset the watchdog timers while displaying high-resolution timersTom Hromatka
On systems with a large number of CPUs, running sysrq-<q> can cause watchdog timeouts. There are two slow sections of code in the sysrq-<q> path in timer_list.c. 1. print_active_timers() - This function is called by print_cpu() and contains a slow goto loop. On a machine with hundreds of CPUs, this loop took approximately 100ms for the first CPU in a NUMA node. (Subsequent CPUs in the same node ran much quicker.) The total time to print all of the CPUs is ultimately long enough to trigger the soft lockup watchdog. 2. print_tickdevice() - This function outputs a large amount of textual information. This function also took approximately 100ms per CPU. Since sysrq-<q> is not a performance critical path, there should be no harm in touching the nmi watchdog in both slow sections above. Touching it in just one location was insufficient on systems with hundreds of CPUs as occasional timeouts were still observed during testing. This issue was observed on an Oracle T7 machine with 128 CPUs, but I anticipate it may affect other systems with similarly large numbers of CPUs. Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23timers, sched_clock: Update timeout for clock wrapDavid Engraf
The scheduler clock framework may not use the correct timeout for the clock wrap. This happens when a new clock driver calls sched_clock_register() after the kernel called sched_clock_postinit(). In this case the clock wrap timeout is too long thus sched_clock_poll() is called too late and the clock already wrapped. On my ARM system the scheduler was no longer scheduling any other task than the idle task because the sched_clock() wrapped. Signed-off-by: David Engraf <david.engraf@sysgo.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23MAINTAINERS: Add Stephen Boyd as timekeeping reviewerJohn Stultz
After showing expertise and presenting on the timekeeping subsystem at ELC[1], Stephen clearly should be included in the maintainer list. [1] https://www.youtube.com/watch?v=Puv4mW55bF8 Acked-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23clockevents: Make clockevents_config() staticNicolai Stange
A clockevent device's rate should be configured before or at registration and changed afterwards through clockevents_update_freq() only. For the configuration at registration, we already have clockevents_config_and_register(). Right now, there are no clockevents_config() users outside of the clockevents core. To mitigiate the risk of drivers errorneously reconfiguring their rates through clockevents_config() *after* device registration, make clockevents_config() static. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23clocksource: h8300_timer8: Don't reset rate in ->set_state_oneshot()Nicolai Stange
With the upcoming NTP correction related rate adjustments to be implemented in the clockevents core, the latter needs to get informed about every rate change of a clockevent device made after its registration. Currently, h8300_timer8 violates this requirement in that it registers its clockevent device with the correct rate, but resets its ->mult and ->rate values in timer8_clock_event_start(), called from its ->set_state_oneshot() function. It seems like commit 4633f4cac85a ("clocksource/drivers/h8300: Cleanup startup and remove module code."), which introduced the rate initialization at registration, missed to remove the manual setting of ->mult and ->shift from timer8_clock_event_start(). Purge the setting of ->mult, ->shift, ->min_delta_ns and ->max_delta_ns from timer8_clock_event_start(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23clocksource: em_sti: Compute rate before registrationNicolai Stange
With the upcoming NTP correction related rate adjustments to be implemented in the clockevents core, the latter needs to get informed about every rate change of a clockevent device made after its registration. Currently, em_sti violates this requirement in that it registers its clockevent device with a dummy rate and sets its final rate through clockevents_config() called from its ->set_state_oneshot(). This patch moves the setting of the clockevent device's rate to its registration. I checked all current em_sti users in arch/arm/mach-shmobile and right now, none of them changes any rate in any clock tree relevant to em_sti after their respective time_init(). Since all em_sti instances are created after time_init(), none of them should ever observe any clock rate changes. - Determine the ->rate value in em_sti_probe() at device probing rather than at first usage. - Set the clockevent device's rate at its registration. - Although not strictly necessary for the upcoming clockevent core changes, set the clocksource's rate at its registration for consistency. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23clocksource: em_sti: Split clock prepare and enable stepsNicolai Stange
Currently, the em_sti driver prepares and enables the needed clock in em_sti_enable(), potentially called through its clockevent device's ->set_state_oneshot(). However, the clk_prepare() step may sleep whereas tick_program_event() and thus, ->set_state_oneshot(), can be called in atomic context. Split the clk_prepare_enable() in em_sti_enable() into two steps: - prepare the clock at device probing via clk_prepare() - and enable it in em_sti_enable() via clk_enable(). Slightly reorder resource initialization in em_sti_probe() in order to facilitate error handling in later patches. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23clocksource: sh_tmu: Compute rate before registration againNicolai Stange
With the upcoming NTP correction related rate adjustments to be implemented in the clockevents core, the latter needs to get informed about every rate change of a clockevent device made after its registration. Currently, sh_tmu violates this requirement in that it registers its clockevent device with a dummy rate and sets its final rate through clockevents_config() called from its ->set_state_oneshot() and ->set_state_periodic() functions respectively. This patch moves the setting of the clockevent device's rate to its registration. Note that there has been some back and forth regarding this question with respect to the clocksource also provided by this driver: commit 66f49121ffa4 ("clocksource: sh_tmu: compute mult and shift before registration") moves the rate determination from the clocksource's ->enable() function to before its registration. OTOH, the later commit 0aeac458d9eb ("clocksource: sh_tmu: __clocksource_updatefreq_hz() update") basically reverts this, saying "Without this patch the old code uses clocksource_register() together with a hack that assumes a never changing clock rate." However, I checked all current sh_tmu users in arch/sh as well as in arch/arm/mach-shmobile carefully and right now, none of them changes any rate in any clock tree relevant to sh_tmu after their respective time_init(). Since all sh_tmu instances are created after time_init(), none of them should ever observe any clock rate changes. What's more, both, a clocksource as well as a clockevent device, can immediately get selected for use at their registration and thus, enabled at this point already. So it's probably safer to assume a "never changing clock rate" here. - Move the struct sh_tmu_channel's ->rate member to struct sh_tmu_device: it's a property of the underlying clock which is in turn specific to the sh_tmu_device. - Determine the ->rate value in sh_tmu_setup() at device probing rather than at first usage. - Set the clockevent device's rate at its registration. - Although not strictly necessary for the upcoming clockevent core changes, set the clocksource's rate at its registration for consistency. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-23clocksource: sh_cmt: Compute rate before registration againNicolai Stange
With the upcoming NTP correction related rate adjustments to be implemented in the clockevents core, the latter needs to get informed about every rate change of a clockevent device made after its registration. Currently, sh_cmt violates this requirement in that it registers its clockevent device with a dummy rate and sets its final ->mult and ->shift values from its ->set_state_oneshot() and ->set_state_periodic() functions respectively. This patch moves the setting of the clockevent device's ->mult and ->shift values to before its registration. Note that there has been some back and forth regarding this question with respect to the clocksource also provided by this driver: commit f4d7c3565c16 ("clocksource: sh_cmt: compute mult and shift before registration") moves the rate determination from the clocksource's ->enable() function to before its registration. OTOH, the later commit 3593f5fe40a1 ("clocksource: sh_cmt: __clocksource_updatefreq_hz() update") basically reverts this, saying "Without this patch the old code uses clocksource_register() together with a hack that assumes a never changing clock rate." However, I checked all current sh_cmt users in arch/sh as well as in arch/arm/mach-shmobile carefully and right now, none of them changes any rate in any clock tree relevant to sh_cmt after their respective time_init(). Since all sh_cmt instances are created after time_init(), none of them should ever observe any clock rate changes. What's more, both, a clocksource as well as a clockevent device, can immediately get selected for use at their registration and thus, enabled at this point already. So it's probably safer to assume a "never changing clock rate" here. - Move the struct sh_cmt_channel's ->rate member to struct sh_cmt_device: it's a property of the underlying clock which is in turn specific to the sh_cmt_device. - Determine the ->rate value in sh_cmt_setup() at device probing rather than at first usage. - Set the clockevent device's ->mult and ->shift values right before its registration. - Although not strictly necessary for the upcoming clockevent core changes, set the clocksource's rate at its registration for consistency. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-03-19Linux 4.11-rc3Linus Torvalds
2017-03-19mm/swap: don't BUG_ON() due to uninitialized swap slot cacheLinus Torvalds
This BUG_ON() triggered for me once at shutdown, and I don't see a reason for the check. The code correctly checks whether the swap slot cache is usable or not, so an uninitialized swap slot cache is not actually problematic afaik. I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since I'm not sure why that seemingly pointless check was there. I suspect the real fix is to just remove it entirely, but for now we'll warn about it but not bring the machine down. Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-19Merge tag 'powerpc-4.11-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc fixes from Michael Ellerman: "A couple of minor powerpc fixes for 4.11: - wire up statx() syscall - don't print a warning on memory hotplug when HPT resizing isn't available Thanks to: David Gibson, Chandan Rajendra" * tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Don't give a warning when HPT resizing isn't available powerpc: Wire up statx() syscall
2017-03-19Merge branch 'parisc-4.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in modules with CONFIG_MODVERSIONS. - Dave Anglin optimized the cache flushing for vmap ranges. - Arvind Yadav provided a fix for a potential NULL pointer dereference in the parisc perf code (and some code cleanups). - I wired up the new statx system call, fixed some compiler warnings with the access_ok() macro and fixed shutdown code to really halt a system at shutdown instead of crashing & rebooting. * 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix system shutdown halt parisc: perf: Fix potential NULL pointer dereference parisc: Avoid compiler warnings with access_ok() parisc: Wire up statx system call parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range parisc: support R_PARISC_SECREL32 relocation in modules
2017-03-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds
Pull SCSI target fixes from Nicholas Bellinger: "The bulk of the changes are in qla2xxx target driver code to address various issues found during Cavium/QLogic's internal testing (stable CC's included), along with a few other stability and smaller miscellaneous improvements. There are also a couple of different patch sets from Mike Christie, which have been a result of his work to use target-core ALUA logic together with tcm-user backend driver. Finally, a patch to address some long standing issues with pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices, which will make folks using physical (or virtual) magnetic tape happy" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits) qla2xxx: Update driver version to 9.00.00.00-k qla2xxx: Fix delayed response to command for loop mode/direct connect. qla2xxx: Change scsi host lookup method. qla2xxx: Add DebugFS node to display Port Database qla2xxx: Use IOCB interface to submit non-critical MBX. qla2xxx: Add async new target notification qla2xxx: Export DIF stats via debugfs qla2xxx: Improve T10-DIF/PI handling in driver. qla2xxx: Allow relogin to proceed if remote login did not finish qla2xxx: Fix sess_lock & hardware_lock lock order problem. qla2xxx: Fix inadequate lock protection for ABTS. qla2xxx: Fix request queue corruption. qla2xxx: Fix memory leak for abts processing qla2xxx: Allow vref count to timeout on vport delete. tcmu: Convert cmd_time_out into backend device attribute tcmu: make cmd timeout configurable tcmu: add helper to check if dev was configured target: fix race during implicit transition work flushes target: allow userspace to set state to transitioning target: fix ALUA transition timeout handling ...
2017-03-19Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull device-dax fixes from Dan Williams: "The device-dax driver was not being careful to handle falling back to smaller fault-granularity sizes. The driver already fails fault attempts that are smaller than the device's alignment, but it also needs to handle the cases where a larger page mapping could be established. For simplicity of the immediate fix the implementation just signals VM_FAULT_FALLBACK until fault-size == device-alignment. One fix is for -stable to address pmd-to-pte fallback from the original implementation, another fix is for the new (introduced in 4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the ride. These have received a build success notification from the kbuild robot" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: fix debug output typo device-dax: fix pud fault fallback handling device-dax: fix pmd/pte fault fallback handling
2017-03-18qla2xxx: Update driver version to 9.00.00.00-kHimanshu Madhani
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Fix delayed response to command for loop mode/direct connect.Quinn Tran
Current driver wait for FW to be in the ready state before processing in-coming commands. For Arbitrated Loop or Point-to- Point (not switch), FW Ready state can take a while. FW will transition to ready state after all Nports have been logged in. In the mean time, certain initiators have completed the login and starts IO. Driver needs to start processing all queues if FW is already started. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Change scsi host lookup method.Quinn Tran
For target mode, when new scsi command arrive, driver first performs a look up of the SCSI Host. The current look up method is based on the ALPA portion of the NPort ID. For Cisco switch, the ALPA can not be used as the index. Instead, the new search method is based on the full value of the Nport_ID via btree lib. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Add DebugFS node to display Port DatabaseHimanshu Madhani
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Use IOCB interface to submit non-critical MBX.Quinn Tran
The Mailbox interface is currently over subscribed. We like to reserve the Mailbox interface for the chip managment and link initialization. Any non essential Mailbox command will be routed through the IOCB interface. The IOCB interface is able to absorb more commands. Following commands are being routed through IOCB interface - Get ID List (007Ch) - Get Port DB (0064h) - Get Link Priv Stats (006Dh) Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Add async new target notificationQuinn Tran
Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Export DIF stats via debugfsAnil Gurumurthy
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Improve T10-DIF/PI handling in driver.Quinn Tran
Add routines to support T10 DIF tag. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Allow relogin to proceed if remote login did not finishQuinn Tran
If the remote port have started the login process, then the PLOGI and PRLI should be back to back. Driver will allow the remote port to complete the process. For the case where the remote port decide to back off from sending PRLI, this local port sets an expiration timer for the PRLI. Once the expiration time passes, the relogin retry logic is allowed to go through and perform login with the remote port. Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Fix sess_lock & hardware_lock lock order problem.Quinn Tran
The main lock that needs to be held for CMD or TMR submission to upper layer is the sess_lock. The sess_lock is used to serialize cmd submission and session deletion. The addition of hardware_lock being held is not necessary. This patch removes hardware_lock dependency from CMD/TMR submission. Use hardware_lock only for error response in this case. Path1 CPU0 CPU1 ---- ---- lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&ha->hardware_lock)->rlock); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&ha->hardware_lock)->rlock); Path2/deadlock *** DEADLOCK *** Call Trace: dump_stack+0x85/0xc2 print_circular_bug+0x1e3/0x250 __lock_acquire+0x1425/0x1620 lock_acquire+0xbf/0x210 _raw_spin_lock_irqsave+0x53/0x70 qlt_sess_work_fn+0x21d/0x480 [qla2xxx] process_one_work+0x1f4/0x6e0 Cc: <stable@vger.kernel.org> Cc: Bart Van Assche <Bart.VanAssche@sandisk.com> Reported-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Fix inadequate lock protection for ABTS.Quinn Tran
Normally, ABTS is sent to Target Core as Task MGMT command. In the case of error, qla2xxx needs to send response, hardware_lock is required to prevent request queue corruption. Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Fix request queue corruption.Quinn Tran
When FW notify driver or driver detects low FW resource, driver tries to send out Busy SCSI Status to tell Initiator side to back off. During the send process, the lock was not held. Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Fix memory leak for abts processingQuinn Tran
Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18qla2xxx: Allow vref count to timeout on vport delete.Joe Carnuccio
Cc: <stable@vger.kernel.org> Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18tcmu: Convert cmd_time_out into backend device attributeNicholas Bellinger
Instead of putting cmd_time_out under ../target/core/user_0/foo/control, which has historically been used by parameters needed for initial backend device configuration, go ahead and move cmd_time_out into a backend device attribute. In order to do this, tcmu_module_init() has been updated to create a local struct configfs_attribute **tcmu_attrs, that is based upon the existing passthrough_attrib_attrs along with the new cmd_time_out attribute. Once **tcm_attrs has been setup, go ahead and point it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core. Also following MNC's previous change, ->cmd_time_out is stored in milliseconds but exposed via configfs in seconds. Also, note this patch restricts the modification of ->cmd_time_out to before + after the TCMU device has been configured, but not while it has active fabric exports. Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18tcmu: make cmd timeout configurableMike Christie
A single daemon could implement multiple types of devices using multuple types of real devices that may not support restarting from crashes and/or handling tcmu timeouts. This makes the cmd timeout configurable, so handlers that do not support it can turn if off for now. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18tcmu: add helper to check if dev was configuredMike Christie
This adds a helper to check if the dev was configured. It will be used in the next patch to prevent updates to some config settings after the device has been setup. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linuxLinus Torvalds
Pull OpenRISC fixes from Stafford Horne: "OpenRISC fixes for build issues that were exposed by kbuild robots after 4.11 merge. All from allmodconfig builds. This includes: - bug in the handling of 8-byte get_user() calls - module build failure due to multile missing symbol exports" * tag 'openrisc-for-linus' of git://github.com/openrisc/linux: openrisc: Export symbols needed by modules openrisc: fix issue handling 8 byte get_user calls openrisc: xchg: fix `computed is not used` warning
2017-03-18target: fix race during implicit transition work flushesMike Christie
This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18target: allow userspace to set state to transitioningMike Christie
Userspace target_core_user handlers like tcmu-runner may want to set the ALUA state to transitioning while it does implicit transitions. This patch allows that state when set from configfs. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18target: fix ALUA transition timeout handlingMike Christie
The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18target: Use system workqueue for ALUA transitionsMike Christie
If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18target: fail ALUA transitions for pscsiMike Christie
We do not setup the LU group for pscsi devices, so if you write a state to alua_access_state that will cause a transition you will get a NULL pointer dereference. This patch will fail attempts to try and transition the path for backend devices that set the TRANSPORT_FLAG_PASSTHROUGH_ALUA flag. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18target: allow ALUA setup for some passthrough backendsMike Christie
This patch allows passthrough backends to use the core/base LIO ALUA setup and state checks, but still handle the execution of commands. This will allow the target_core_user module to execute STPG and RTPG in userspace, and not have to duplicate the ALUA state checks, path information (needed so we can check if command is executable on specific paths) and setup (rtslib sets/updates the configfs ALUA interface like it does for iblock or file). For STPG, the target_core_user userspace daemon, tcmu-runner will still execute the STPG, and to update the core/base LIO state it will use the existing configfs interface. For RTPG, tcmu-runner will loop over configfs and/or cache the state. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>