aboutsummaryrefslogtreecommitdiff
path: root/drivers/vhost
AgeCommit message (Collapse)Author
2018-07-22vhost_net: validate sock before trying to put its fdJason Wang
[ Upstream commit b8f1f65882f07913157c44673af7ec0b308d03eb ] Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when we meet errors during ubuf allocation, the code does not check for NULL before calling sockfd_put(), this will lead NULL dereferencing. Fixing by checking sock pointer before. Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-26vhost: fix info leak due to uninitialized memoryMichael S. Tsirkin
commit 670ae9caaca467ea1bfd325cb2a5c98ba87f94ad upstream. struct vhost_msg within struct vhost_msg_node is copied to userspace. Unfortunately it turns out on 64 bit systems vhost_msg has padding after type which gcc doesn't initialize, leaking 4 uninitialized bytes to userspace. This padding also unfortunately means 32 bit users of this interface are broken on a 64 bit kernel which will need to be fixed separately. Fixes: CVE-2018-1118 Cc: stable@vger.kernel.org Reported-by: Kevin Easton <kevin@guarana.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-13vhost: synchronize IOTLB message with dev cleanupJason Wang
[ Upstream commit 1b15ad683ab42a203f98b67045b40720e99d0e9a ] DaeRyong Jeong reports a race between vhost_dev_cleanup() and vhost_process_iotlb_msg(): Thread interleaving: CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup) (In the case of both VHOST_IOTLB_UPDATE and VHOST_IOTLB_INVALIDATE) ===== ===== vhost_umem_clean(dev->iotlb); if (!dev->iotlb) { ret = -EFAULT; break; } dev->iotlb = NULL; The reason is we don't synchronize between them, fixing by protecting vhost_process_iotlb_msg() with dev mutex. Reported-by: DaeRyong Jeong <threeearcat@gmail.com> Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-20vhost: fix vhost_vq_access_ok() log checkStefan Hajnoczi
[ Upstream commit d14d2b78090c7de0557362b26a4ca591aa6a9faa ] Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log when IOTLB is enabled") introduced a regression. The logic was originally: if (vq->iotlb) return 1; return A && B; After the patch the short-circuit logic for A was inverted: if (A || vq->iotlb) return A; return B; This patch fixes the regression by rewriting the checks in the obvious way, no longer returning A when vq->iotlb is non-NULL (which is hard to understand). Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13vhost_net: add missing lock nesting notationJason Wang
[ Upstream commit aaa3149bbee9ba9b4e6f0bd6e3e7d191edeae942 ] We try to hold TX virtqueue mutex in vhost_net_rx_peek_head_len() after RX virtqueue mutex is held in handle_rx(). This requires an appropriate lock nesting notation to calm down deadlock detector. Fixes: 0308813724606 ("vhost_net: basic polling support") Reported-by: syzbot+7f073540b1384a614e09@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13vhost: validate log when IOTLB is enabledJason Wang
[ Upstream commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ] Vq log_base is the userspace address of bitmap which has nothing to do with IOTLB. So it needs to be validated unconditionally otherwise we may try use 0 as log_base which may lead to pin pages that will lead unexpected result (e.g trigger BUG_ON() in set_bit_to_user()). Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13vhost: correctly remove wait queue during poll failureJason Wang
[ Upstream commit dc6455a71c7fc5117977e197f67f71b49f27baba ] We tried to remove vq poll from wait queue, but do not check whether or not it was in a list before. This will lead double free. Fixing this by switching to use vhost_poll_stop() which zeros poll->wqh after removing poll from waitqueue to make sure it won't be freed twice. Cc: Darren Kenny <darren.kenny@oracle.com> Reported-by: syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com Fixes: 2b8b328b61c79 ("vhost_net: handle polling errors when setting backend") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()Jason Wang
commit e9cb4239134c860e5f92c75bf5321bd377bb505b upstream. We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to hold mutexes of all virtqueues. This may confuse lockdep to report a possible deadlock because of trying to hold locks belong to same class. Switch to use mutex_lock_nested() to avoid false positive. Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-13vhost_net: stop device during reset ownerJason Wang
[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ] We don't stop device before reset owner, this means we could try to serve any virtqueue kick before reset dev->worker. This will result a warn since the work was pending at llist during owner resetting. Fix this by stopping device during owner reset. Reported-by: syzbot+eb17c6162478cc50632c@syzkaller.appspotmail.com Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-25vhost-vsock: add pkt cancel capabilityPeng Tao
[ Upstream commit 16320f363ae128d9b9c70e60f00f2a572f57c23d ] To allow canceling all packets of a connection. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30fix a page leak in vhost_scsi_iov_to_sgl() error recoveryAl Viro
commit 11d49e9d089ccec81be87c2386dfdd010d7f7f6e upstream. we are advancing sg as we go, so the pages we need to drop in case of error are *before* the current sg. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-20vhost_net: correctly check tx avail during rx busy pollingJason Wang
[ Upstream commit 8b949bef9172ca69d918e93509a4ecb03d0355e0 ] We check tx avail through vhost_enable_notify() in the past which is wrong since it only checks whether or not guest has filled more available buffer since last avail idx synchronization which was just done by vhost_vq_avail_empty() before. What we really want is checking pending buffers in the avail ring. Fix this by calling vhost_vq_avail_empty() instead. This issue could be noticed by doing netperf TCP_RR benchmark as client from guest (but not host). With this fix, TCP_RR from guest to localhost restores from 1375.91 trans per sec to 55235.28 trans per sec on my laptop (Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz). Fixes: 030881372460 ("vhost_net: basic polling support") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17vhost/vsock: handle vhost_vq_init_access() errorStefan Hajnoczi
[ Upstream commit 0516ffd88fa0d006ee80389ce14a9ca5ae45e845 ] Propagate the error when vhost_vq_init_access() fails and set vq->private_data to NULL. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09vhost: fix initialization for vq->is_leHalil Pasic
commit cda8bba0f99d25d2061c531113c14fa41effc3ae upstream. Currently, under certain circumstances vhost_init_is_le does just a part of the initialization job, and depends on vhost_reset_is_le being called too. For this reason vhost_vq_init_access used to call vhost_reset_is_le when vq->private_data is NULL. This is not only counter intuitive, but also real a problem because it breaks vhost_net. The bug was introduced to vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for legacy devices"). The symptom is corruption of the vq's used.idx field (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost shutdown on a vq with pending descriptors. Let us make sure the outcome of vhost_init_is_le never depend on the state it is actually supposed to initialize, and fix virtio_net by removing the reset from vhost_vq_init_access. With the above, there is no reason for vhost_reset_is_le to do just half of the job. Let us make vhost_reset_is_le reinitialize is_le. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reported-by: Michael A. Tebolt <miket@us.ibm.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Michael A. Tebolt <miket@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08vhost-vsock: fix orphan connection resetPeng Tao
local_addr.svm_cid is host cid. We should check guest cid instead, which is remote_addr.svm_cid. Otherwise we end up resetting all connections to all guests. Cc: stable@vger.kernel.org [4.8+] Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-05Merge 4.8-rc5 into char-misc-nextGreg Kroah-Hartman
We want the fixes in here for merging and testing. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-31miscdevice: Add helper macro for misc device boilerplatePrasannaKumar Muralidharan
Many modules call misc_register and misc_deregister in its module init and exit methods without any additional code. This ends up being boilerplate. This patch adds helper macro module_misc_device(), that replaces module_init()/ module_exit() with template functions. This patch also converts drivers to use new macro. Change since v1: Add device.h include in miscdevice.h as module_driver macro was not available from other include files in some architectures. Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-23vhost/scsi: fix reuse of &vq->iov[out] in responseBenjamin Coddington
The address of the iovec &vq->iov[out] is not guaranteed to contain the scsi command's response iovec throughout the lifetime of the command. Rather, it is more likely to contain an iovec from an immediately following command after looping back around to vhost_get_vq_desc(). Pass along the iovec entirely instead. Fixes: 79c14141a487 ("vhost/scsi: Convert completion path to use copy_to_iter") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-15vhost/test: fix after swiotlb changesMichael S. Tsirkin
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-09vhost/vsock: fix vhost virtio_vsock_pkt use-after-freeStefan Hajnoczi
Stash the packet length in a local variable before handing over ownership of the packet to virtio_transport_recv_pkt() or virtio_transport_free_pkt(). This patch solves the use-after-free since pkt is no longer guaranteed to be alive. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-06Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/vhost updates from Michael Tsirkin: - new vsock device support in host and guest - platform IOMMU support in host and guest, including compatibility quirks for legacy systems. - misc fixes and cleanups. * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: VSOCK: Use kvfree() vhost: split out vringh Kconfig vhost: detect 32 bit integer wrap around vhost: new device IOTLB API vhost: drop vringh dependency vhost: convert pre sorted vhost memory array to interval tree vhost: introduce vhost memory accessors VSOCK: Add Makefile and Kconfig VSOCK: Introduce vhost_vsock.ko VSOCK: Introduce virtio_transport.ko VSOCK: Introduce virtio_vsock_common.ko VSOCK: defer sock removal to transports VSOCK: transport-specific vsock_transport functions vhost: drop vringh dependency vop: pull in vhost Kconfig virtio: new feature to detect IOMMU device quirk balloon: check the number of available pages in leak balloon vhost: lockless enqueuing vhost: simplify work flushing
2016-08-02VSOCK: Use kvfree()Wei Yongjun
Use kvfree() instead of open-coding it. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: split out vringh KconfigMichael S. Tsirkin
vringh is pulled in by caif and mic, but the other vhost config does not need to be there. In particular, it makes no sense to have vhost net/scsi/sock under caif/mic. Create a separate Kconfig file and put vringh bits there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: detect 32 bit integer wrap aroundMichael S. Tsirkin
Detect and fail early if long wrap around is triggered. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: new device IOTLB APIJason Wang
This patch tries to implement an device IOTLB for vhost. This could be used with userspace(qemu) implementation of DMA remapping to emulate an IOMMU for the guest. The idea is simple, cache the translation in a software device IOTLB (which is implemented as an interval tree) in vhost and use vhost_net file descriptor for reporting IOTLB miss and IOTLB update/invalidation. When vhost meets an IOTLB miss, the fault address, size and access can be read from the file. After userspace finishes the translation, it writes the translated address to the vhost_net file to update the device IOTLB. When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq addresses set by ioctl are treated as iova instead of virtual address and the accessing can only be done through IOTLB instead of direct userspace memory access. Before each round or vq processing, all vq metadata is prefetched in device IOTLB to make sure no translation fault happens during vq processing. In most cases, virtqueues are contiguous even in virtual address space. The IOTLB translation for virtqueue itself may make it a little slower. We might add fast path cache on top of this patch. Signed-off-by: Jason Wang <jasowang@redhat.com> [mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ] [mst: fix build warnings ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [ weiyj.lk: missing unlock on error ] Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
2016-08-02vhost: convert pre sorted vhost memory array to interval treeJason Wang
Current pre-sorted memory region array has some limitations for future device IOTLB conversion: 1) need extra work for adding and removing a single region, and it's expected to be slow because of sorting or memory re-allocation. 2) need extra work of removing a large range which may intersect several regions with different size. 3) need trick for a replacement policy like LRU To overcome the above shortcomings, this patch convert it to interval tree which can easily address the above issue with almost no extra work. The patch could be used for: - Extend the current API and only let the userspace to send diffs of memory table. - Simplify Device IOTLB implementation. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: introduce vhost memory accessorsJason Wang
This patch introduces vhost memory accessors which were just wrappers for userspace address access helpers. This is a requirement for vhost device iotlb implementation which will add iotlb translations in those accessors. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: Add Makefile and KconfigAsias He
Enable virtio-vsock and vhost-vsock. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: Introduce vhost_vsock.koAsias He
VM sockets vhost transport implementation. This driver runs on the host. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: drop vringh dependencyMichael S. Tsirkin
vringh isn't used by vhost net or scsi - it's used by CAIF only at the moment. Drop the dependency. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01vhost: lockless enqueuingJason Wang
We use spinlock to synchronize the work list now which may cause unnecessary contentions. So this patch switch to use llist to remove this contention. Pktgen tests shows about 5% improvement: Before: ~1300000 pps After: ~1370000 pps Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01vhost: simplify work flushingJason Wang
We used to implement the work flushing through tracking queued seq, done seq, and the number of flushing. This patch simplify this by just implement work flushing through another kind of vhost work with completion. This will be used by lockless enqueuing patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-01tun: switch to use skb array for txJason Wang
We used to queue tx packets in sk_receive_queue, this is less efficient since it requires spinlocks to synchronize between producer and consumer. This patch tries to address this by: - switch from sk_receive_queue to a skb_array, and resize it when tx_queue_len was changed. - introduce a new proto_ops peek_len which was used for peeking the skb length. - implement a tun version of peek_len for vhost_net to use and convert vhost_net to use peek_len if possible. Pktgen test shows about 15.3% improvement on guest receiving pps for small buffers: Before: ~1300000pps After : ~1500000pps Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07vhost_net: stop polling socket during rx processingJason Wang
We don't stop rx polling socket during rx processing, this will lead unnecessary wakeups from under layer net devices (E.g sock_def_readable() form tun). Rx will be slowed down in this way. This patch avoids this by stop polling socket during rx processing. A small drawback is that this introduces some overheads in light load case because of the extra start/stop polling, but single netperf TCP_RR does not notice any change. In a super heavy load case, e.g using pktgen to inject packet to guest, we get about ~8.8% improvement on pps: before: ~1240000 pkt/s after: ~1350000 pkt/s Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10target: make close_session optionalChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-05-10target: make ->shutdown_session optionalChristoph Hellwig
Turns out the template and thus many drivers got the return value wrong: 0 means the fabrics driver needs to put a session reference, which no driver except for the iSCSI target drivers did. Fortunately none of these drivers supports explicit Node ACLs, so the bug was harmless. Even without that only qla2xxx and iscsi every did real work in shutdown_session, so get rid of the boilerplate code in all other drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-22Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - Add target_alloc_session() w/ callback helper for doing se_session allocation + tag + se_node_acl lookup. (HCH + nab) - Tree-wide fabric driver conversion to use target_alloc_session() - Convert sbp-target to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Chris Boot + nab) - Convert usb-gadget to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Andrzej Pietrasiewicz + nab) - Convert xen-scsiback to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Juergen Gross + nab) - Convert tcm_fc to use TARGET_SCF_ACK_KREF I/O + TMR krefs - Convert ib_srpt to use percpu_ida tag pre-allocation - Add DebugFS node for qla2xxx target sess list (Quinn) - Rework iser-target connection termination (Jenny + Sagi) - Convert iser-target to new CQ API (HCH) - Add pass-through WRITE_SAME support for IBLOCK (Mike Christie) - Introduce data_bitmap for asynchronous access of data area (Sheng Yang + Andy) - Fix target_release_cmd_kref shutdown comp leak (Himanshu Madhani) Also, there is a separate PULL request coming for cxgb4 NIC driver prerequisites for supporting hw iscsi segmentation offload (ISO), that will be the base for a number of v4.7 developments involving iscsi-target hw offloads" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits) target: Fix target_release_cmd_kref shutdown comp leak target: Avoid DataIN transfers for non-GOOD SAM status target/user: Report capability of handling out-of-order completions to userspace target/user: Fix size_t format-spec build warning target/user: Don't free expired command when time out target/user: Introduce data_bitmap, replace data_length/data_head/data_tail target/user: Free data ring in unified function target/user: Use iovec[] to describe continuous area target: Remove enum transport_lunflags_table target/iblock: pass WRITE_SAME to device if possible iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc iser-target: Kill struct isert_rdma_wr iser-target: Convert to new CQ API iser-target: Split and properly type the login buffer iser-target: Remove ISER_RECV_DATA_SEG_LEN iser-target: Remove impossible condition from isert_wait_conn iser-target: Remove redundant wait in release_conn iser-target: Rework connection termination iser-target: Separate flows for np listeners and connections cma events iser-target: Add new state ISER_CONN_BOUND to isert_conn ...
2016-03-10vhost/scsi: Convert to target_alloc_session usageNicholas Bellinger
This patch converts vhost/scsi pre-allocation of vhost_scsi_cmd descriptors to use the new alloc_session callback(). Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-11vhost_net: basic polling supportJason Wang
This patch tries to poll for new added tx buffer or socket receive queue for a while at the end of tx/rx processing. The maximum time spent on polling were specified through a new kind of vring ioctl. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11vhost: introduce vhost_vq_avail_empty()Jason Wang
This patch introduces a helper which will return true if we're sure that the available ring is empty for a specific vq. When we're not sure, e.g vq access failure, return false instead. This could be used for busy polling code to exit the busy loop. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11vhost: introduce vhost_has_work()Jason Wang
This path introduces a helper which can give a hint for whether or not there's a work queued in the work list. This could be used for busy polling code to exit the busy loop. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-02vhost: rename vhost_init_used()Greg Kurz
Looking at how callers use this, maybe we should just rename init_used to vhost_vq_init_access. The _used suffix was a hint that we access the vq used ring. But maybe what callers care about is that it must be called after access_ok. Also, this function manipulates the vq->is_le field which isn't related to the vq used ring. This patch simply renames vhost_init_used() to vhost_vq_init_access() as suggested by Michael. No behaviour change. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-02vhost: rename cross-endian helpersGreg Kurz
The default use case for vhost is when the host and the vring have the same endianness (default native endianness). But there are cases where they differ and vhost should byteswap when accessing the vring. The first case is when the host is big endian and the vring belongs to a virtio 1.0 device, which is always little endian. This is covered by the vq->is_le field. This field is initialized when userspace calls the VHOST_SET_FEATURES ioctl. It is reset when the device stops. We already have a vhost_init_is_le() helper, but the reset operation is opencoded as follows: vq->is_le = virtio_legacy_is_little_endian(); It isn't clear that we are resetting vq->is_le here. This patch moves the code to a helper with a more explicit name. The other case where we may have to byteswap is when the architecture can switch endianness at runtime (bi-endian). If endianness differs in the host and in the guest, then legacy devices need to be used in cross-endian mode. This mode is available with CONFIG_VHOST_CROSS_ENDIAN_LEGACY=y, which introduces a vq->user_be field. Userspace may enable cross-endian mode by calling the SET_VRING_ENDIAN ioctl before the device is started. The cross-endian mode is disabled when the device is stopped. The current names of the helpers that manipulate vq->user_be are unclear. This patch renames those helpers to clearly show that this is cross-endian stuff and with explicit enable/disable semantics. No behaviour change. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-02vhost: fix error path in vhost_init_used()Greg Kurz
We don't want side effects. If something fails, we rollback vq->is_le to its previous value. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07vhost: replace % with & on data pathMichael S. Tsirkin
We know vring num is a power of 2, so use & to mask the high bits. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07vhost: relax log address alignmentMichael S. Tsirkin
commit 5d9a07b0de512b77bf28d2401e5fe3351f00a240 ("vhost: relax used address alignment") fixed the alignment for the used virtual address, but not for the physical address used for logging. That's a mistake: alignment should clearly be the same for virtual and physical addresses, Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-13Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "This series contains HCH's changes to absorb configfs attribute ->show() + ->store() function pointer usage from it's original tree-wide consumers, into common configfs code. It includes usb-gadget, target w/ drivers, netconsole and ocfs2 changes to realize the improved simplicity, that now renders the original include/target/configfs_macros.h CPP magic for fabric drivers and others, unnecessary and obsolete. And with common code in place, new configfs attributes can be added easier than ever before. Note, there are further improvements in-flight from other folks for v4.5 code in configfs land, plus number of target fixes for post -rc1 code" In the meantime, a new user of the now-removed old configfs API came in through the char/misc tree in commit 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices"). This merge resolution comes from Alexander Shishkin, who updated his stm class tracing abstraction to account for the removal of the old show_attribute and store_attribute methods in commit 517982229f78 ("configfs: remove old API") from this pull. As Alexander says about that patch: "There's no need to keep an extra wrapper structure per item and the awkward show_attribute/store_attribute item ops are no longer needed. This patch converts policy code to the new api, all the while making the code quite a bit smaller and easier on the eyes. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>" That patch was folded into the merge so that the tree should be fully bisectable. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits) configfs: remove old API ocfs2/cluster: use per-attribute show and store methods ocfs2/cluster: move locking into attribute store methods netconsole: use per-attribute show and store methods target: use per-attribute show and store methods spear13xx_pcie_gadget: use per-attribute show and store methods dlm: use per-attribute show and store methods usb-gadget/f_serial: use per-attribute show and store methods usb-gadget/f_phonet: use per-attribute show and store methods usb-gadget/f_obex: use per-attribute show and store methods usb-gadget/f_uac2: use per-attribute show and store methods usb-gadget/f_uac1: use per-attribute show and store methods usb-gadget/f_mass_storage: use per-attribute show and store methods usb-gadget/f_sourcesink: use per-attribute show and store methods usb-gadget/f_printer: use per-attribute show and store methods usb-gadget/f_midi: use per-attribute show and store methods usb-gadget/f_loopback: use per-attribute show and store methods usb-gadget/ether: use per-attribute show and store methods usb-gadget/f_acm: use per-attribute show and store methods usb-gadget/f_hid: use per-attribute show and store methods ...
2015-10-27vhost: fix performance on LE hostsMichael S. Tsirkin
commit 2751c9882b947292fcfb084c4f604e01724af804 ("vhost: cross-endian support for legacy devices") introduced a minor regression: even with cross-endian disabled, and even on LE host, vhost_is_little_endian is checking is_le flag so there's always a branch. To fix, simply check virtio_legacy_is_little_endian first. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13target: use per-attribute show and store methodsChristoph Hellwig
This also allows to remove the target-specific old configfs macros, and gets rid of the target_core_fabric_configfs.h header which only had one function declaration left that could be moved to a better place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-18Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes and cleanups from Michael Tsirkin: "This fixes the virtio-test tool, and improves the error handling for virtio-ccw" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio/s390: handle failures of READ_VQ_CONF ccw tools/virtio: propagate V=X to kernel build vhost: move features to core tools/virtio: fix build after 4.2 changes