summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-02-08clocksource/drivers/arm_arch_timer: Work around Hisilicon erratum 161010101clockevents/4.11Ding Tianhong
Erratum Hisilicon-161010101 says that the ARM generic timer counter "has the potential to contain an erroneous value when the timer value changes". Accesses to TVAL (both read and write) are also affected due to the implicit counter read. Accesses to CVAL are not affected. The workaround is to reread the system count registers until the value of the second read is larger than the first one by less than 32, the system counter can be guaranteed not to return wrong value twice by back-to-back read and the error value is always larger than the correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> [Mark: split patch, fix Kconfig, reword commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08clocksource/drivers/arm_arch_timer: Introduce generic errata handling ↵Ding Tianhong
infrastructure Currently we have code inline in the arch timer probe path to cater for Freescale erratum A-008585, complete with ifdeffery. This is a little ugly, and will get worse as we try to add more errata handling. This patch refactors the handling of Freescale erratum A-008585. Now the erratum is described in a generic arch_timer_erratum_workaround structure, and the probe path can iterate over these to detect errata and enable workarounds. This will simplify the addition and maintenance of code handling Hisilicon erratum 161010101. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> [Mark: split patch, correct Kconfig, reword commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08clocksource/drivers/arm_arch_timer: Remove fsl-a008585 parameterDing Tianhong
Having a command line option to flip the errata handling for a particular erratum is a little bit unusual, and it's vastly superior to pass this in the DT. By common consensus, it's best to kill off the command line parameter. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> [Mark: split patch, reword commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08clocksource/drivers/arm_arch_timer: Add dt binding for hisilicon-161010101 ↵Ding Tianhong
erratum This erratum describes a bug in logic outside the core, so MIDR can't be used to identify its presence, and reading an SoC-specific revision register from common arch timer code would be awkward. So, describe it in the device tree. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07clocksource/drivers/ostm: Add renesas-ostm timer driverChris Brandt
This patch adds a OSTM driver for the Renesas architecture. The OS Timer (OSTM) has independent channels that can be used as a freerun or interval times. This driver uses the first probed device as a clocksource and then any additional devices as clock events. Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07clocksource/drivers/ostm: Document renesas-ostm timer DT bindingsChris Brandt
Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07clocksource/drivers/tcb_clksrc: Use 32 bit tcb as sched_clockDavid Engraf
On newer boards the TC can be read as single 32 bit value without locking. Thus the clock can be used as reference for sched_clock which is much more accurate than the jiffies implementation. Tested on a Atmel SAMA5D2 board. Signed-off-by: David Engraf <david.engraf@sysgo.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07clocksource/drivers/gemini: Add driver for the Cortina GeminiLinus Walleij
This is a rewrite of the Gemini timer driver in arch/arm/mach-gemini/timer.c trying to do everything the device tree way: - Make every IO-access relative to a base address and dynamic so we can do a dynamic ioremap and get going. - Do not poke around directly in the global syscon registers, access them using the syscon regmap style design pattern for the one register we need to check. - Find register range and interrupt from the device tree. Cc: Janos Laube <janos.dev@gmail.com> Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com> Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07clocksource: add DT bindings for Cortina GeminiLinus Walleij
This adds device tree bindings for the Cortina Systems Gemini timer block used in these SoCs. Cc: Janos Laube <janos.dev@gmail.com> Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com> Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: devicetree@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07clockevents: Add a clkevt-of mechanism like clksrc-ofDaniel Lezcano
The current code uses the CLOCKSOURCE_OF_DECLARE macro to fill the clksrc table with a t-uple (name, init_function). Unfortunately it ends up to the clockevent and the clocksource being both initialized with this macro. It is not a problem by itself but there is not a clear distinction between a clockevent and a clocksource in the code initialization path. Somebody can argue there are the same IP block and the same DT node. But conceptually from the software side, there are two distincts entities and as is they should be initialized separetely. Some drivers which do not have a clocksource end up by using the CLOCKSOURCE_OF_DECLARE macro to declare a clockevent. Another result is the fuzzy organization in the clocksource directory, where the clockevents are implemented in the same file than the clocksources or file labelled timer-something implementing a clocksource. This patch provides another macro to specifically declare a clockevent in the same way than the clocksource and gives the opportunity to write two separate drivers, one for the clocksource and another for the clockevents. Hopefully, that can help to do some housework in the directory, perhaps split the drivers in to entities, for example: - clksrc-rockchip.c - clkevt-rockchip.c Also, it gives the possibility to declare clocksources separately in the DT and then use a clocksource from IP block while while clockevents are used from another IP block. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-04tick/broadcast: Reduce lock cacheline contentionWaiman Long
It was observed that on an Intel x86 system without the ARAT (Always running APIC timer) feature and with fairly large number of CPUs as well as CPUs coming in and out of intel_idle frequently, the lock contention on the tick_broadcast_lock can become significant. To reduce contention, the lock is put into its own cacheline and all the cpumask_var_t variables are put into the __read_mostly section. Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket 16-core 32-thread Nehalam system, the performance number improved from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on a 4.9.6 kernel. This is a 3.4% improvement. Signed-off-by: Waiman Long <longman@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-30Merge branch 'fortglx/4.11/time' of ↵Thomas Gleixner
https://git.linaro.org/people/john.stultz/linux into timers/core - Remove unused functions - Document udelay inaccuracy - Remove posix timer data from task struct when posix timers are off
2017-01-27timers: Omit POSIX timer stuff from task_struct when disabledNicolas Pitre
When CONFIG_POSIX_TIMERS is disabled, it is preferable to remove related structures from struct task_struct and struct signal_struct as they won't contain anything useful and shouldn't be relied upon by mistake. Code still referencing those structures is also disabled here. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-01-22x86/timer: Make delay() work during early bootupJiri Slaby
When a panic happens during bootup, "Rebooting in X seconds.." is shown, but reboot happens immediatelly. It is because panic() uses mdelay() and mdelay() calls __const_udelay() immediately, which does not work while booting. The per_cpu cpu_info.loops_per_jiffy value is not initialized yet, so __const_udelay() actually multiplies the number of loops by zero. This results in __const_udelay() to delay the execution only by a nanosecond or so. So check whether cpu_info.loops_per_jiffy is zero and use loops_per_jiffy in that case. mdelay() will not be so precise without proper calibration, but it works relatively well. Before: [ 0.170039] delaying 100ms [ 0.170828] done After [ 0.214042] delaying 100ms [ 0.313974] done I do not think the added check matters given we are about to spin the processor in the next few hundred cycles. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170119114730.2670-1-jslaby@suse.cz [ Minor edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-20delay: Add explanation of udelay() inaccuracyRussell King
There seems to be some misunderstanding that udelay() and friends will always guarantee the specified delay. This is a false understanding. When udelay() is based on CPU cycles, it can return early for many reasons which are detailed by Linus' reply to me in a thread in 2011: http://lists.openwall.net/linux-kernel/2011/01/12/372 However, a udelay test module was created in 2014 which allows udelay() to only be 0.5% fast, which is outside of the CPU-cycles udelay() results I measured back in 2011, which were deemed to be in the "we don't care" region. test_udelay() should be fixed to reflect the real allowable tolerance on udelay(), rather than 0.5%. Cc: David Riley <davidriley@chromium.org> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-01-20timerqueue: Use rb_entry_safe() instead of open-coding itGeliang Tang
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/0d5cf199ac43792df0b6f7e2145545c30fa1dbbe.1482222135.git.geliangtang@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-14math64, timers: Fix 32bit mul_u64_u32_shr() and friendsPeter Zijlstra
It turns out that while GCC-4.4 manages to generate 32x32->64 mult instructions for the 32bit mul_u64_u32_shr() code, any GCC after that fails horribly. Fix this by providing an explicit mul_u32_u32() function which can be architcture provided. Reported-by: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Cc: Christopher S. Hall <christopher.s.hall@intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: John Stultz <john.stultz@linaro.org> Cc: Laurent Vivier <lvivier@redhat.com> Cc: Liav Rehana <liavr@mellanox.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Parit Bhargava <prarit@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161209083011.GD15765@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-13Merge branch 'for-linus-4.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "These are all over the place. The tracepoint part of the pull fixes a crash and adds a little more information to two tracepoints, while the rest are good old fashioned fixes" * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: make tracepoint format strings more compact Btrfs: add truncated_len for ordered extent tracepoints Btrfs: add 'inode' for extent map tracepoint btrfs: fix crash when tracepoint arguments are freed by wq callbacks Btrfs: adjust outstanding_extents counter properly when dio write is split Btrfs: fix lockdep warning about log_mutex Btrfs: use down_read_nested to make lockdep silent btrfs: fix locking when we put back a delayed ref that's too new btrfs: fix error handling when run_delayed_extent_op fails btrfs: return the actual error value from from btrfs_uuid_tree_iterate
2017-01-13Merge tag 'ceph-for-4.10-rc4' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Two small fixups for the filesystem changes that went into this merge window" * tag 'ceph-for-4.10-rc4' of git://github.com/ceph/ceph-client: ceph: fix get_oldest_context() ceph: fix mds cluster availability check
2017-01-13Merge tag 'vfio-v4.10-rc4' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fixes from Alex Williamson: - Cleanups and bug fixes for the mtty sample driver (Dan Carpenter) - Export and make use of has_capability() to fix incorrect use of ns_capable() for testing task capabilities (Jike Song) * tag 'vfio-v4.10-rc4' of git://github.com/awilliam/linux-vfio: vfio/type1: Remove pid_namespace.h include vfio iommu type1: fix the testing of capability for remote task capability: export has_capability vfio-mdev: remove some dead code vfio-mdev: buffer overflow in ioctl() vfio-mdev: return -EFAULT if copy_to_user() fails
2017-01-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: - fix for module unload vs deferred jump labels (note: there might be other buggy modules!) - two NULL pointer dereferences from syzkaller - also syzkaller: fix emulation of fxsave/fxrstor/sgdt/sidt, problem made worse during this merge window, "just" kernel memory leak on releases - fix emulation of "mov ss" - somewhat serious on AMD, less so on Intel * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix emulation of "MOV SS, null selector" KVM: x86: fix NULL deref in vcpu_scan_ioapic KVM: eventfd: fix NULL deref irqbypass consumer KVM: x86: Introduce segmented_write_std KVM: x86: flush pending lapic jump label updates on module unload jump_labels: API for flushing deferred jump label updates
2017-01-13Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Fix huge_ptep_set_access_flags() to return "changed" when any of the ptes in the contiguous range is changed, not just the last one - Fix the adr_l assembly macro to work in modules under KASLR * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: assembler: make adr_l work in modules under KASLR arm64: hugetlb: fix the wrong return value for huge_ptep_set_access_flags
2017-01-13Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "The major fix is the bfa firmware, since the latest 10Gb cards fail probing with the current firmware. The rest is a set of minor fixes: one missed Kconfig dependency causing randconfig failures, a missed error return on an error leg, a change for how multiqueue waits on a blocked device and a don't reset while in reset fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: bfa: Increase requested firmware version to 3.2.5.1 scsi: snic: Return error code on memory allocation failure scsi: fnic: Avoid sending reset to firmware when another reset is in progress scsi: qedi: fix build, depends on UIO scsi: scsi-mq: Wait for .queue_rq() if necessary
2017-01-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "Small driver fixups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elants_i2c - avoid divide by 0 errors on bad touchscreen data Input: adxl34x - make it enumerable in ACPI environment Input: ALPS - fix TrackStick Y axis handling for SS5 hardware Input: synaptics-rmi4 - fix F03 build error when serio is module Input: xpad - use correct product id for x360w controllers Input: synaptics_i2c - change msleep to usleep_range for small msecs Input: i8042 - add Pegatron touchpad to noloop table Input: joydev - remove unused linux/miscdevice.h include
2017-01-13vfio/type1: Remove pid_namespace.h includeAlex Williamson
Using has_capability() rather than ns_capable(), we're no longer using this header. Cc: Jike Song <jike.song@intel.com> Cc: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12vfio iommu type1: fix the testing of capability for remote taskJike Song
Before the mdev enhancement type1 iommu used capable() to test the capability of current task; in the course of mdev development a new requirement, testing for another task other than current, was raised. ns_capable() was used for this purpose, however it still tests current, the only difference is, in a specified namespace. Fix it by using has_capability() instead, which tests the cap for specified task in init_user_ns, the same namespace as capable(). Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Jike Song <jike.song@intel.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12Merge tag 'sound-4.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This time we got a few more fixes than the previous rc's, and most of commits were about ASoC. The only significant change in the core side is the regression fix wrt the aux device list handling, and all the rest are driver-specific small / trivial fixes" * tag 'sound-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Add a quirk for Plantronics BT600 ASoC: rt5645: set sel_i2s_pre_div1 to 2 ASoC: dpcm: Avoid putting stream state to STOP when FE stream is paused ASoC: Intel: Skylake: Release FW ctx in cleanup ASoC: Intel: bytcr-rt5640: fix settings in internal clock mode ASoC: fsl_ssi: set fifo watermark to more reliable value ASoC: nau8825: fix invalid configuration in Pre-Scalar of FLL ASoC: nau8825: correct the function name of register ASoC: Intel: Skylake: Fix to fail safely if module not available in path ASoC: tlv320aic3x: Mark the RESET register as volatile ASoC: Fix binding and probing of auxiliary components ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data ASoC: Intel: bytcr_rt5640: fallback mechanism if MCLK is not enabled ASoC: hdmi-codec: use unsigned type to structure members with bit-field ASoC: topology: kfree kcontrol->private_value before freeing kcontrol ASoC: rsnd: don't double free kctrl ASoC: dwc: Fix PIO mode initialization
2017-01-12Merge tag 'xfs-for-linus-4.10-rc4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "As promised last week, here's some stability fixes from Christoph and Jan Kara: - fix free space request handling when low on disk space - remove redundant log failure error messages - free truncated dirty pages instead of letting them build up forever" * tag 'xfs-for-linus-4.10-rc4-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Timely free truncated dirty pages xfs: don't print warnings when xfs_log_force fails xfs: don't rely on ->total in xfs_alloc_space_available xfs: adjust allocation length in xfs_alloc_space_available xfs: fix bogus minleft manipulations xfs: bump up reserved blocks in xfs_alloc_set_aside
2017-01-12Merge tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteprocLinus Torvalds
Pull remoteproc fixes from Bjorn Andersson: "This fixes two regressions that have been reported to be introduced in v4.10-rc1. - correct an incorrect usage of the kref api - revert the change to make the resource table read-only. As the space each vdev resource is used as virtio device config space it must be shared with the remote" * tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc: Revert "remoteproc: Merge table_ptr and cached_table pointers" remoteproc: fix vdev reference management
2017-01-12Merge tag 'rpmsg-v4.10-fixes' of git://github.com/andersson/remoteprocLinus Torvalds
Pull rpmsg fixes from Bjorn Andersson: "This fixes a regression introduced in v4.10-rc1 that prohibits multiple channels with the same name but different endpoint addresses to be used" * tag 'rpmsg-v4.10-fixes' of git://github.com/andersson/remoteproc: rpmsg: virtio_rpmsg_bus: fix channel creation
2017-01-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - device descriptor length validation fix to hid-cypress driver from Greg - introduction of a short delay into i2c-hid, which is not really mandated by the spec, but fixes Asus Touchpads - Petzl USB connectable flashlight quirk from myself * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: i2c-hid: Add sleep between POWER ON and RESET HID: hid-cypress: validate length of report HID: ignore Petzl USB headlamp
2017-01-12Merge branch 'scsi-target-for-v4.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux Pull scsi target fixes from Bart Van Assche: - a series of bug fixes for the XCOPY implementation from David Disseldorp - one bug fix for the ibmvscsis driver, a driver that is used for communication between partitions on IBM POWER systems. * 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux: ibmvscsis: Fix srp_transfer_data fail return code target: support XCOPY requests without parameters target: check for XCOPY parameter truncation target: use XCOPY segment descriptor CSCD IDs target: check XCOPY segment descriptor CSCD IDs target: simplify XCOPY wwn->se_dev lookup helper target: return UNSUPPORTED TARGET/SEGMENT DESC TYPE CODE sense target: bounds check XCOPY total descriptor list length target: bounds check XCOPY segment descriptor list target: use XCOPY TOO MANY TARGET DESCRIPTORS sense target: add XCOPY target/segment desc sense codes
2017-01-12ceph: fix get_oldest_context()Geng, Jichao
For no snapshot case, we should use ci->truncate_{seq,size}. Fixes: 5f743e456606 ("ceph: record truncate size/seq for snap data writeback") Signed-off-by: Geng, Jichao <geng.jichao@h3c.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12ceph: fix mds cluster availability checkYan, Zheng
We should apply the check after getting the initial mdsmap. Fixes: e9e427f0a14f ("ceph: check availability of mds cluster on mount") Link: http://tracker.ceph.com/issues/18161 Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12Merge tag 'md/4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/mdLinus Torvalds
Pull md fixes from Shaohua Li: "Basically one fix for raid5 cache which is merged in this cycle, others are trival fixes" * tag 'md/4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: md/raid5: Use correct IS_ERR() variation on pointer check md: cleanup mddev flag clear for takeover md/r5cache: fix spelling mistake on "recoverying" md/r5cache: assign conf->log before r5l_load_log() md/r5cache: simplify handling of sh->log_start in recovery md/raid5-cache: removes unnecessary write-through mode judgments md/raid10: Refactor raid10_make_request md/raid1: Refactor raid1_make_request
2017-01-12arm64: assembler: make adr_l work in modules under KASLRArd Biesheuvel
When CONFIG_RANDOMIZE_MODULE_REGION_FULL=y, the offset between loaded modules and the core kernel may exceed 4 GB, putting symbols exported by the core kernel out of the reach of the ordinary adrp/add instruction pairs used to generate relative symbol references. So make the adr_l macro emit a movz/movk sequence instead when executing in module context. While at it, remove the pointless special case for the stack pointer. Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-12KVM: x86: fix emulation of "MOV SS, null selector"Paolo Bonzini
This is CVE-2017-2583. On Intel this causes a failed vmentry because SS's type is neither 3 nor 7 (even though the manual says this check is only done for usable SS, and the dmesg splat says that SS is unusable!). On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb. The fix fabricates a data segment descriptor when SS is set to a null selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb. Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3; this in turn ensures CPL < 3 because RPL must be equal to CPL. Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing the bug and deciphering the manuals. Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com> Fixes: 79d5b4c3cd809c770d4bf9812635647016c56011 Cc: stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12capability: export has_capabilityJike Song
has_capability() is sometimes needed by modules to test capability for specified task other than current, so export it. Cc: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Jike Song <jike.song@intel.com> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12KVM: x86: fix NULL deref in vcpu_scan_ioapicWanpeng Li
Reported by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0 IP: _raw_spin_lock+0xc/0x30 PGD 3e28eb067 PUD 3f0ac6067 PMD 0 Oops: 0002 [#1] SMP CPU: 0 PID: 2431 Comm: test Tainted: G OE 4.10.0-rc1+ #3 Call Trace: ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm] kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm] ? pick_next_task_fair+0xe1/0x4e0 ? kvm_arch_vcpu_load+0xea/0x260 [kvm] kvm_vcpu_ioctl+0x33a/0x600 [kvm] ? hrtimer_try_to_cancel+0x29/0x130 ? do_nanosleep+0x97/0xf0 do_vfs_ioctl+0xa1/0x5d0 ? __hrtimer_init+0x90/0x90 ? do_nanosleep+0x5b/0xf0 SyS_ioctl+0x79/0x90 do_syscall_64+0x6e/0x180 entry_SYSCALL64_slow_path+0x25/0x25 RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0 The syzkaller folks reported a NULL pointer dereference due to ENABLE_CAP succeeding even without an irqchip. The Hyper-V synthetic interrupt controller is activated, resulting in a wrong request to rescan the ioapic and a NULL pointer dereference. #include <sys/ioctl.h> #include <sys/mman.h> #include <sys/types.h> #include <linux/kvm.h> #include <pthread.h> #include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #ifndef KVM_CAP_HYPERV_SYNIC #define KVM_CAP_HYPERV_SYNIC 123 #endif void* thr(void* arg) { struct kvm_enable_cap cap; cap.flags = 0; cap.cap = KVM_CAP_HYPERV_SYNIC; ioctl((long)arg, KVM_ENABLE_CAP, &cap); return 0; } int main() { void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); int kvmfd = open("/dev/kvm", 0); int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0); struct kvm_userspace_memory_region memreg; memreg.slot = 0; memreg.flags = 0; memreg.guest_phys_addr = 0; memreg.memory_size = 0x1000; memreg.userspace_addr = (unsigned long)host_mem; host_mem[0] = 0xf4; ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg); int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0); struct kvm_sregs sregs; ioctl(cpufd, KVM_GET_SREGS, &sregs); sregs.cr0 = 0; sregs.cr4 = 0; sregs.efer = 0; sregs.cs.selector = 0; sregs.cs.base = 0; ioctl(cpufd, KVM_SET_SREGS, &sregs); struct kvm_regs regs = { .rflags = 2 }; ioctl(cpufd, KVM_SET_REGS, &regs); ioctl(vmfd, KVM_CREATE_IRQCHIP, 0); pthread_t th; pthread_create(&th, 0, thr, (void*)(long)cpufd); usleep(rand() % 10000); ioctl(cpufd, KVM_RUN, 0); pthread_join(th, 0); return 0; } This patch fixes it by failing ENABLE_CAP if without an irqchip. Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: 5c919412fe61 (kvm/x86: Hyper-V synthetic interrupt controller) Cc: stable@vger.kernel.org # 4.5+ Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: eventfd: fix NULL deref irqbypass consumerWanpeng Li
Reported syzkaller: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] PGD 0 Oops: 0002 [#1] SMP CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1 Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm] task: ffff9bbe0dfbb900 task.stack: ffffb61802014000 RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] Call Trace: irqfd_shutdown+0x66/0xa0 [kvm] process_one_work+0x16b/0x480 worker_thread+0x4b/0x500 kthread+0x101/0x140 ? process_one_work+0x480/0x480 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x25/0x30 RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20 CR2: 0000000000000008 The syzkaller folks reported a NULL pointer dereference that due to unregister an consumer which fails registration before. The syzkaller creates two VMs w/ an equal eventfd occasionally. So the second VM fails to register an irqbypass consumer. It will make irqfd as inactive and queue an workqueue work to shutdown irqfd and unregister the irqbypass consumer when eventfd is closed. However, the second consumer has been initialized though it fails registration. So the token(same as the first VM's) is taken to unregister the consumer through the workqueue, the consumer of the first VM is found and unregistered, then NULL deref incurred in the path of deleting consumer from the consumers list. This patch fixes it by making irq_bypass_register/unregister_consumer() looks for the consumer entry based on consumer pointer itself instead of token matching. Reported-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: Introduce segmented_write_stdSteve Rutherford
Introduces segemented_write_std. Switches from emulated reads/writes to standard read/writes in fxsave, fxrstor, sgdt, and sidt. This fixes CVE-2017-2584, a longstanding kernel memory leak. Since commit 283c95d0e389 ("KVM: x86: emulate FXSAVE and FXRSTOR", 2016-11-09), which is luckily not yet in any final release, this would also be an exploitable kernel memory *write*! Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: 96051572c819194c37a8367624b285be10297eca Fixes: 283c95d0e3891b64087706b344a4b545d04a6e62 Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: flush pending lapic jump label updates on module unloadDavid Matlack
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled). These are implemented with delayed_work structs which can still be pending when the KVM module is unloaded. We've seen this cause kernel panics when the kvm_intel module is quickly reloaded. Use the new static_key_deferred_flush() API to flush pending updates on module unload. Signed-off-by: David Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12jump_labels: API for flushing deferred jump label updatesDavid Matlack
Modules that use static_key_deferred need a way to synchronize with any delayed work that is still pending when the module is unloaded. Introduce static_key_deferred_flush() which flushes any pending jump label updates. Signed-off-by: David Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-11HID: i2c-hid: Add sleep between POWER ON and RESETBrendan McGrath
Support for the Asus Touchpad was recently added. It turns out this device can fail initialisation (and become unusable) when the RESET command is sent too soon after the POWER ON command. Unfortunately the i2c-hid specification does not specify the need for a delay between these two commands. But it was discovered the Windows driver has a 1ms delay. As a result, this patch modifies the i2c-hid module to add a sleep inbetween the POWER ON and RESET commands which lasts between 1ms and 5ms. See https://github.com/vlasenko/hid-asus-dkms/issues/24 for further details. Signed-off-by: Brendan McGrath <redmcg@redmandi.dyndns.org> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "27 fixes. There are three patches that aren't actually fixes. They're simple function renamings which are nice-to-have in mainline as ongoing net development depends on them." * akpm: (27 commits) timerfd: export defines to userspace mm/hugetlb.c: fix reservation race when freeing surplus pages mm/slab.c: fix SLAB freelist randomization duplicate entries zram: support BDI_CAP_STABLE_WRITES zram: revalidate disk under init_lock mm: support anonymous stable page mm: add documentation for page fragment APIs mm: rename __page_frag functions to __page_frag_cache, drop order from drain mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free mm, memcg: fix the active list aging for lowmem requests when memcg is enabled mm: don't dereference struct page fields of invalid pages mailmap: add codeaurora.org names for nameless email commits signal: protect SIGNAL_UNKILLABLE from unintentional clearing. mm: pmd dirty emulation in page fault handler ipc/sem.c: fix incorrect sem_lock pairing lib/Kconfig.debug: fix frv build failure mm: get rid of __GFP_OTHER_NODE mm: fix remote numa hits statistics mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done} ocfs2: fix crash caused by stale lvb with fsdlm plugin ...
2017-01-11vfio-mdev: remove some dead codeDan Carpenter
We set info.count to 1 in mtty_get_irq_info() so static checkers complain that, "Why do we have impossible conditions?" The answer is that it seems to be left over dead code that can be safely removed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11vfio-mdev: buffer overflow in ioctl()Dan Carpenter
This is a sample driver for documentation so the impact is probably pretty low. But we should check that bar_index is valid so we don't write beyond the end of the mdev_state->region_info[] array. Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11vfio-mdev: return -EFAULT if copy_to_user() failsDan Carpenter
The copy_to_user() function returns the number of bytes which it wasn't able to copy but we want to return a negative error code. Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11Merge tag 'asoc-fix-v4.10-rc3' of ↵Takashi Iwai
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v4.10 As well as the usual smattering of driver specific fixes collected since the merge window this has one particularly important fix to the core for handling of aux_devs which was broken during the merge window by some of the componentization refactoring.
2017-01-11xfs: Timely free truncated dirty pagesJan Kara
Commit 99579ccec4e2 "xfs: skip dirty pages in ->releasepage()" started to skip dirty pages in xfs_vm_releasepage() which also has the effect that if a dirty page is truncated, it does not get freed by block_invalidatepage() and is lingering in LRU list waiting for reclaim. So a simple loop like: while true; do dd if=/dev/zero of=file bs=1M count=100 rm file done will keep using more and more memory until we hit low watermarks and start pagecache reclaim which will eventually reclaim also the truncate pages. Keeping these truncated (and thus never usable) pages in memory is just a waste of memory, is unnecessarily stressing page cache reclaim, and reportedly also leads to anonymous mmap(2) returning ENOMEM prematurely. So instead of just skipping dirty pages in xfs_vm_releasepage(), return to old behavior of skipping them only if they have delalloc or unwritten buffers and fix the spurious warnings by warning only if the page is clean. CC: stable@vger.kernel.org CC: Brian Foster <bfoster@redhat.com> CC: Vlastimil Babka <vbabka@suse.cz> Reported-by: Petr Tůma <petr.tuma@d3s.mff.cuni.cz> Fixes: 99579ccec4e271c3d4d4e7c946058766812afdab Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>