aboutsummaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2015-09-13libfc: Fix fc_fcp_cleanup_each_cmd()Bart Van Assche
commit 8f2777f53e3d5ad8ef2a176a4463a5c8e1a16431 upstream. Since fc_fcp_cleanup_cmd() can sleep this function must not be called while holding a spinlock. This patch avoids that fc_fcp_cleanup_each_cmd() triggers the following bug: BUG: scheduling while atomic: sg_reset/1512/0x00000202 1 lock held by sg_reset/1512: #0: (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc] Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc] Call Trace: [<ffffffff816c612c>] dump_stack+0x4f/0x7b [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0 [<ffffffff816c87aa>] __schedule+0x71a/0xa10 [<ffffffff816c8ad2>] schedule+0x32/0x80 [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc] [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc] [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc] [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc] [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60 [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0 [<ffffffff814a2650>] scsi_ioctl+0x150/0x440 [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120 [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810 [<ffffffff811da608>] block_ioctl+0x38/0x40 [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530 [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0 [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16sg_start_req(): make sure that there's not too many elements in iovecAl Viro
commit 451a2886b6bf90e2fb378f7c46c655450fb96e81 upstream. unfortunately, allowing an arbitrary 16bit value means a possibility of overflow in the calculation of total number of pages in bio_map_user_iov() - we rely on there being no more than PAGE_SIZE members of sum in the first loop there. If that sum wraps around, we end up allocating too small array of pointers to pages and it's easy to overflow it in the second loop. X-Coverup: TINC (and there's no lumber cartel either) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [bwh: s/MAX_UIOVEC/UIO_MAXIOV/. This was fixed upstream by commit fdc81f45e9f5 ("sg_start_req(): use import_iovec()"), but we don't have that function.] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ipr: Fix invalid array indexing for HRRQBrian King
commit 3f1c0581310d5d94bd72740231507e763a6252a4 upstream. Fixes another signed / unsigned array indexing bug in the ipr driver. Currently, when hrrq_index wraps, it becomes a negative number. We do the modulo, but still have a negative number, so we end up indexing backwards in the array. Given where the hrrq array is located in memory, we probably won't actually reference memory we don't own, but nonetheless ipr is still looking at data within struct ipr_ioa_cfg and interpreting it as struct ipr_hrr_queue data, so bad things could certainly happen. Each ipr adapter has anywhere from 1 to 16 HRRQs. By default, we use 2 on new adapters. Let's take an example: Assume ioa_cfg->hrrq_index=0x7fffffffe and ioa_cfg->hrrq_num=4: The atomic_add_return will then return -1. We mod this with 3 and get -2, add one and get -1 for an array index. On adapters which support more than a single HRRQ, we dedicate HRRQ to adapter initialization and error interrupts so that we can optimize the other queues for fast path I/O. So all normal I/O uses HRRQ 1-15. So we want to spread the I/O requests across those HRRQs. With the default module parameter settings, this bug won't hit, only when someone sets the ipr.number_of_msix parameter to a value larger than 3 is when bad things start to happen. Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ipr: Fix incorrect trace indexingBrian King
commit bb7c54339e6a10ecce5c4961adf5e75b3cf0af30 upstream. When ipr's internal driver trace was changed to an atomic, a signed/unsigned bug slipped in which results in us indexing backwards in our memory buffer writing on memory that does not belong to us. This patch fixes this by removing the modulo and instead just mask off the low bits. Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ipr: Fix locking for unit attention handlingBrian King
commit 36b8e180e1e929e00b351c3b72aab3147fc14116 upstream. Make sure we have the host lock held when calling scsi_report_bus_reset. Fixes a crash seen as the __devices list in the scsi host was changing as we were iterating through it. Reviewed-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10st: null pointer dereference panic caused by use after kref_put by st_openSeymour, Shane M
commit e7ac6c6666bec0a354758a1298d3231e4a635362 upstream. Two SLES11 SP3 servers encountered similar crashes simultaneously following some kind of SAN/tape target issue: ... qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002. qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002. qla2xxx [0000:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0. qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0. qla2xxx [0000:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0. qla2xxx [0000:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0. qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0. qla2xxx [0000:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0. qla2xxx [0000:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2. qla2xxx [0000:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2. qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps). qla2xxx [0000:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2. qla2xxx [0000:81:00.0]-00af:3: Performing ISP error recovery - ha=ffff88bf04d18000. rport-3:0-0: blocked FC remote port time out: removing target and saving binding qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps). qla2xxx [0000:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2. rport-2:0-0: blocked FC remote port time out: removing target and saving binding sg_rq_end_io: device detached BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8 IP: [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90 PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP CPU 0 ... Supported: No, Proprietary modules are loaded [1739975.390463] Pid: 27965, comm: ABCD Tainted: PF X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8 RIP: 0010:[<ffffffff8133b268>] [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90 RSP: 0018:ffff8839dc1e7c68 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff883f0592fc00 RCX: 0000000000000090 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000138 RBP: 0000000000000138 R08: 0000000000000010 R09: ffffffff81bd39d0 R10: 00000000000009c0 R11: ffffffff81025790 R12: 0000000000000001 R13: ffff883022212b80 R14: 0000000000000004 R15: ffff883022212b80 FS: 00007f8e54560720(0000) GS:ffff88407f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000000002a8 CR3: 0000007e6ced6000 CR4: 00000000001407f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process ABCD (pid: 27965, threadinfo ffff8839dc1e6000, task ffff883592e0c640) Stack: ffff883f0592fc00 00000000fffffffa 0000000000000001 ffff883022212b80 ffff883eff772400 ffffffffa03fa309 0000000000000000 0000000000000000 ffffffffa04003a0 ffff883f063196c0 ffff887f0379a930 ffffffff8115ea1e Call Trace: [<ffffffffa03fa309>] st_open+0x129/0x240 [st] [<ffffffff8115ea1e>] chrdev_open+0x13e/0x200 [<ffffffff811588a8>] __dentry_open+0x198/0x310 [<ffffffff81167d74>] do_last+0x1f4/0x800 [<ffffffff81168fe9>] path_openat+0xd9/0x420 [<ffffffff8116946c>] do_filp_open+0x4c/0xc0 [<ffffffff8115a00f>] do_sys_open+0x17f/0x250 [<ffffffff81468d92>] system_call_fastpath+0x16/0x1b [<00007f8e4f617fd0>] 0x7f8e4f617fcf Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 <f0> ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0 RIP [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90 RSP <ffff8839dc1e7c68> CR2: 00000000000002a8 Analysis reveals the cause of the crash to be due to STp->device being NULL. The pointer was NULLed via scsi_tape_put(STp) when it calls scsi_tape_release(). In st_open() we jump to err_out after scsi_block_when_processing_errors() completes and returns the device as offline (sdev_state was SDEV_DEL): 1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host 1181 module count. */ 1182 static int st_open(struct inode *inode, struct file *filp) 1183 { 1184 int i, retval = (-EIO); 1185 int resumed = 0; 1186 struct scsi_tape *STp; 1187 struct st_partstat *STps; 1188 int dev = TAPE_NR(inode); 1189 char *name; ... 1217 if (scsi_autopm_get_device(STp->device) < 0) { 1218 retval = -EIO; 1219 goto err_out; 1220 } 1221 resumed = 1; 1222 if (!scsi_block_when_processing_errors(STp->device)) { 1223 retval = (-ENXIO); 1224 goto err_out; 1225 } ... 1264 err_out: 1265 normalize_buffer(STp->buffer); 1266 spin_lock(&st_use_lock); 1267 STp->in_use = 0; 1268 spin_unlock(&st_use_lock); 1269 scsi_tape_put(STp); <-- STp->device = 0 after this 1270 if (resumed) 1271 scsi_autopm_put_device(STp->device); 1272 return retval; The ref count for the struct scsi_tape had already been reduced to 1 when the .remove method of the st module had been called. The kref_put() in scsi_tape_put() caused scsi_tape_release() to be called: 0266 static void scsi_tape_put(struct scsi_tape *STp) 0267 { 0268 struct scsi_device *sdev = STp->device; 0269 0270 mutex_lock(&st_ref_mutex); 0271 kref_put(&STp->kref, scsi_tape_release); <-- calls this 0272 scsi_device_put(sdev); 0273 mutex_unlock(&st_ref_mutex); 0274 } In scsi_tape_release() the struct scsi_device in the struct scsi_tape gets set to NULL: 4273 static void scsi_tape_release(struct kref *kref) 4274 { 4275 struct scsi_tape *tpnt = to_scsi_tape(kref); 4276 struct gendisk *disk = tpnt->disk; 4277 4278 tpnt->device = NULL; <<<---- where the dev is nulled 4279 4280 if (tpnt->buffer) { 4281 normalize_buffer(tpnt->buffer); 4282 kfree(tpnt->buffer->reserved_pages); 4283 kfree(tpnt->buffer); 4284 } 4285 4286 disk->private_data = NULL; 4287 put_disk(disk); 4288 kfree(tpnt); 4289 return; 4290 } Although the problem was reported on SLES11.3 the problem appears in linux-next as well. The crash is fixed by reordering the code so we no longer access the struct scsi_tape after the kref_put() is done on it in st_open(). Signed-off-by: Shane Seymour <shane.seymour@hp.com> Signed-off-by: Darren Lavender <darren.lavender@hp.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03qla2xxx: Mark port lost when we receive an RSCN for it.Chad Dupuis
commit ef86cb2059a14b4024c7320999ee58e938873032 upstream. Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ipr: Increase default adapter init stage change timeoutBrian King
commit 45c44b5ff9caa743ed9c2bfd44307c536c9caf1e upstream. Increase the default init stage change timeout from 15 seconds to 30 seconds. This resolves issues we have seen with some adapters not transitioning to the first init stage within 15 seconds, which results in adapter initialization failures. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-03hpsa: add missing pci_set_master in kdump pathTomas Henzl
commit 859c75aba20264d87dd026bab0d0ca3bff385955 upstream. Add a call to pci_set_master(...) missing in the previous patch "hpsa: refine the pci enable/disable handling". Found thanks to Rob Elliot. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Robert Elliott <elliott@hp.com> Tested-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-03hpsa: refine the pci enable/disable handlingTomas Henzl
commit 132aa220b45d60e9b20def1e9d8be9422eed9616 upstream. When a second(kdump) kernel starts and the hard reset method is used the driver calls pci_disable_device without previously enabling it, so the kernel shows a warning - [ 16.876248] WARNING: at drivers/pci/pci.c:1431 pci_disable_device+0x84/0x90() [ 16.882686] Device hpsa disabling already-disabled device ... This patch fixes it, in addition to this I tried to balance also some other pairs of enable/disable device in the driver. Unfortunately I wasn't able to verify the functionality for the case of a sw reset, because of a lack of proper hw. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-29lpfc: Add iotag memory barrierJames Smart
commit 27f344eb15dd0da80ebec80c7245e8c85043f841 upstream. Add a memory barrier to ensure the valid bit is read before any of the cqe payload is read. This fixes an issue seen on Power where the cqe payload was getting loaded before the valid bit. When this occurred, we saw an iotag out of range error when a command completed, but since the iotag looked invalid the command didn't get completed to scsi core. Later we hit the command timeout, attempted to abort the command, then waited for the aborted command to get returned. Since the adapter already returned the command, we timeout waiting, and end up escalating EEH all the way to host reset. This patch fixes this issue. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-133w-sas: fix command completion raceChristoph Hellwig
commit 579d69bc1fd56d5af5761969aa529d1d1c188300 upstream. The 3w-sas driver needs to tear down the dma mappings before returning the command to the midlayer, as there is no guarantee the sglist and count are valid after that point. Also remove the dma mapping helpers which have another inherent race due to the request_id index. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Torsten Luettgert <ml-lkml@enda.eu> Tested-by: Bernd Kardatzki <Bernd.Kardatzki@med.uni-tuebingen.de> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-133w-9xxx: fix command completion raceChristoph Hellwig
commit 118c855b5623f3e2e6204f02623d88c09e0c34de upstream. The 3w-9xxx driver needs to tear down the dma mappings before returning the command to the midlayer, as there is no guarantee the sglist and count are valid after that point. Also remove the dma mapping helpers which have another inherent race due to the request_id index. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-133w-xxxx: fix command completion raceChristoph Hellwig
commit 9cd9554615cba14f0877cc9972a6537ad2bdde61 upstream. The 3w-xxxx driver needs to tear down the dma mappings before returning the command to the midlayer, as there is no guarantee the sglist and count are valid after that point. Also remove the dma mapping helpers which have another inherent race due to the request_id index. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06mvsas: fix panic on expander attached SATA devicesJames Bottomley
commit 56cbd0ccc1b508de19561211d7ab9e1c77e6b384 upstream. mvsas is giving a General protection fault when it encounters an expander attached ATA device. Analysis of mvs_task_prep_ata() shows that the driver is assuming all ATA devices are locally attached and obtaining the phy mask by indexing the local phy table (in the HBA structure) with the phy id. Since expanders have many more phys than the HBA, this is causing the index into the HBA phy table to overflow and returning rubbish as the pointer. mvs_task_prep_ssp() instead does the phy mask using the port properties. Mirror this in mvs_task_prep_ata() to fix the panic. Reported-by: Adam Talbot <ajtalbot1@gmail.com> Tested-by: Adam Talbot <ajtalbot1@gmail.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06scsi: storvsc: Fix a bug in copy_from_bounce_buffer()K. Y. Srinivasan
commit 8de580742fee8bc34d116f57a20b22b9a5f08403 upstream. We may exit this function without properly freeing up the maapings we may have acquired. Fix the bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-19be2iscsi: Fix kernel panic when device initialization failsJohn Soni Jose
commit 2e7cee027b26cbe7e6685a7a14bd2850bfe55d33 upstream. Kernel panic was happening as iscsi_host_remove() was called on a host which was not yet added. Signed-off-by: John Soni Jose <sony.john-n@emulex.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-19Defer processing of REQ_PREEMPT requests for blocked devicesBart Van Assche
commit bba0bdd7ad4713d82338bcd9b72d57e9335a664b upstream. SCSI transport drivers and SCSI LLDs block a SCSI device if the transport layer is not operational. This means that in this state no requests should be processed, even if the REQ_PREEMPT flag has been set. This patch avoids that a rescan shortly after a cable pull sporadically triggers the following kernel oops: BUG: unable to handle kernel paging request at ffffc9001a6bc084 IP: [<ffffffffa04e08f2>] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib] Process rescan-scsi-bus (pid: 9241, threadinfo ffff88053484a000, task ffff880534aae100) Call Trace: [<ffffffffa0718135>] srp_post_send+0x65/0x70 [ib_srp] [<ffffffffa071b9df>] srp_queuecommand+0x1cf/0x3e0 [ib_srp] [<ffffffffa0001ff1>] scsi_dispatch_cmd+0x101/0x280 [scsi_mod] [<ffffffffa0009ad1>] scsi_request_fn+0x411/0x4d0 [scsi_mod] [<ffffffff81223b37>] __blk_run_queue+0x27/0x30 [<ffffffff8122a8d2>] blk_execute_rq_nowait+0x82/0x110 [<ffffffff8122a9c2>] blk_execute_rq+0x62/0xf0 [<ffffffffa000b0e8>] scsi_execute+0xe8/0x190 [scsi_mod] [<ffffffffa000b2f3>] scsi_execute_req+0xa3/0x130 [scsi_mod] [<ffffffffa000c1aa>] scsi_probe_lun+0x17a/0x450 [scsi_mod] [<ffffffffa000ce86>] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod] [<ffffffffa000dc2f>] __scsi_scan_target+0xdf/0x1f0 [scsi_mod] [<ffffffffa000dfa3>] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod] [<ffffffffa000edfb>] scsi_scan+0xdb/0xe0 [scsi_mod] [<ffffffffa000ee13>] store_scan+0x13/0x20 [scsi_mod] [<ffffffff811c8d9b>] sysfs_write_file+0xcb/0x160 [<ffffffff811589de>] vfs_write+0xce/0x140 [<ffffffff81158b53>] sys_write+0x53/0xa0 [<ffffffff81464592>] system_call_fastpath+0x16/0x1b [<00007f611c9d9300>] 0x7f611c9d92ff Reported-by: Max Gurtuvoy <maxg@mellanox.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-13tcm_qla2xxx: Fix incorrect use of __transport_register_sessionBart Van Assche
commit 75c3d0bf9caebb502e96683b2bc37f9692437e68 upstream. This patch fixes the incorrect use of __transport_register_session() in tcm_qla2xxx_check_initiator_node_acl() code, that does not perform explicit se_tpg->session_lock when accessing se_tpg->tpg_sess_list to add new se_sess nodes. Given that tcm_qla2xxx_check_initiator_node_acl() is not called with qla_hw->hardware_lock held for all accesses of ->tpg_sess_list, the code should be using transport_register_session() instead. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26libsas: Fix Kernel Crash in smp_execute_taskJames Bottomley
commit 6302ce4d80aa82b3fdb5c5cd68e7268037091b47 upstream. This crash was reported: [ 366.947370] sd 3:0:1:0: [sdb] Spinning up disk.... [ 368.804046] BUG: unable to handle kernel NULL pointer dereference at (null) [ 368.804072] IP: [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b [ 368.804098] PGD 0 [ 368.804114] Oops: 0002 [#1] SMP [ 368.804143] CPU 1 [ 368.804151] Modules linked in: sg netconsole s3g(PO) uinput joydev hid_multitouch usbhid hid snd_hda_codec_via cpufreq_userspace cpufreq_powersave cpufreq_stats uhci_hcd cpufreq_conservative snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sdhci_pci snd_page_alloc sdhci snd_timer snd psmouse evdev serio_raw pcspkr soundcore xhci_hcd shpchp s3g_drm(O) mvsas mmc_core ahci libahci drm i2c_core acpi_cpufreq mperf video processor button thermal_sys dm_dmirror exfat_fs exfat_core dm_zcache dm_mod padlock_aes aes_generic padlock_sha iscsi_target_mod target_core_mod configfs sswipe libsas libata scsi_transport_sas picdev via_cputemp hwmon_vid fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sd_mod crc_t10dif usb_storage scsi_mod ehci_hcd usbcore usb_common [ 368.804749] [ 368.804764] Pid: 392, comm: kworker/u:3 Tainted: P W O 3.4.87-logicube-ng.22 #1 To be filled by O.E.M. To be filled by O.E.M./EPIA-M920 [ 368.804802] RIP: 0010:[<ffffffff81358457>] [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b [ 368.804827] RSP: 0018:ffff880117001cc0 EFLAGS: 00010246 [ 368.804842] RAX: 0000000000000000 RBX: ffff8801185030d0 RCX: ffff88008edcb420 [ 368.804857] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8801185030d4 [ 368.804873] RBP: ffff8801181531c0 R08: 0000000000000020 R09: 00000000fffffffe [ 368.804885] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801185030d4 [ 368.804899] R13: 0000000000000002 R14: ffff880117001fd8 R15: ffff8801185030d8 [ 368.804916] FS: 0000000000000000(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000 [ 368.804931] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 368.804946] CR2: 0000000000000000 CR3: 000000000160b000 CR4: 00000000000006e0 [ 368.804962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 368.804978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 368.804995] Process kworker/u:3 (pid: 392, threadinfo ffff880117000000, task ffff8801181531c0) [ 368.805009] Stack: [ 368.805017] ffff8801185030d8 0000000000000000 ffffffff8161ddf0 ffffffff81056f7c [ 368.805062] 000000000000b503 ffff8801185030d0 ffff880118503000 0000000000000000 [ 368.805100] ffff8801185030d0 ffff8801188b8000 ffff88008edcb420 ffffffff813583ac [ 368.805135] Call Trace: [ 368.805153] [<ffffffff81056f7c>] ? up+0xb/0x33 [ 368.805168] [<ffffffff813583ac>] ? mutex_lock+0x16/0x25 [ 368.805194] [<ffffffffa018c414>] ? smp_execute_task+0x4e/0x222 [libsas] [ 368.805217] [<ffffffffa018ce1c>] ? sas_find_bcast_dev+0x3c/0x15d [libsas] [ 368.805240] [<ffffffffa018ce4f>] ? sas_find_bcast_dev+0x6f/0x15d [libsas] [ 368.805264] [<ffffffffa018e989>] ? sas_ex_revalidate_domain+0x37/0x2ec [libsas] [ 368.805280] [<ffffffff81355a2a>] ? printk+0x43/0x48 [ 368.805296] [<ffffffff81359a65>] ? _raw_spin_unlock_irqrestore+0xc/0xd [ 368.805318] [<ffffffffa018b767>] ? sas_revalidate_domain+0x85/0xb6 [libsas] [ 368.805336] [<ffffffff8104e5d9>] ? process_one_work+0x151/0x27c [ 368.805351] [<ffffffff8104f6cd>] ? worker_thread+0xbb/0x152 [ 368.805366] [<ffffffff8104f612>] ? manage_workers.isra.29+0x163/0x163 [ 368.805382] [<ffffffff81052c4e>] ? kthread+0x79/0x81 [ 368.805399] [<ffffffff8135fea4>] ? kernel_thread_helper+0x4/0x10 [ 368.805416] [<ffffffff81052bd5>] ? kthread_flush_work_fn+0x9/0x9 [ 368.805431] [<ffffffff8135fea0>] ? gs_change+0x13/0x13 [ 368.805442] Code: 83 7d 30 63 7e 04 f3 90 eb ab 4c 8d 63 04 4c 8d 7b 08 4c 89 e7 e8 fa 15 00 00 48 8b 43 10 4c 89 3c 24 48 89 63 10 48 89 44 24 08 <48> 89 20 83 c8 ff 48 89 6c 24 10 87 03 ff c8 74 35 4d 89 ee 41 [ 368.805851] RIP [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b [ 368.805877] RSP <ffff880117001cc0> [ 368.805886] CR2: 0000000000000000 [ 368.805899] ---[ end trace b720682065d8f4cc ]--- It's directly caused by 89d3cf6 [SCSI] libsas: add mutex for SMP task execution, but shows a deeper cause: expander functions expect to be able to cast to and treat domain devices as expanders. The correct fix is to only do expander discover when we know we've got an expander device to avoid wrongly casting a non-expander device. Reported-by: Praveen Murali <pmurali@logicube.com> Tested-by: Praveen Murali <pmurali@logicube.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18fixed invalid assignment of 64bit mask to host dma_boundary for scatter ↵Minh Duc Tran
gather segment boundary limit. commit f76a610a8b4b6280eaedf48f3af9d5d74e418b66 upstream. In reference to bug https://bugzilla.redhat.com/show_bug.cgi?id=1097141 Assert is seen with AMD cpu whenever calling pci_alloc_consistent. [ 29.406183] ------------[ cut here ]------------ [ 29.410505] kernel BUG at lib/iommu-helper.c:13! Signed-off-by: Minh Tran <minh.tran@emulex.com> Fixes: 6733b39a1301b0b020bbcbf3295852e93e624cb1 Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18sg: fix read() error reportingTony Battersby
commit 3b524a683af8991b4eab4182b947c65f0ce1421b upstream. Fix SCSI generic read() incorrectly returning success after detecting an error. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-29ipr: wait for aborted command responsesBrian King
commit 6cdb08172bc89f0a39e1643c5e7eab362692fd1b upstream. Fixes a race condition in abort handling that was injected when multiple interrupt support was added. When only a single interrupt is present, the adapter guarantees it will send responses for aborted commands prior to the response for the abort command itself. With multiple interrupts, these responses generally come back on different interrupts, so we need to ensure the abort thread waits until the aborted command is complete so we don't perform a double completion. This race condition was being hit frequently in environments which were triggering command timeouts, which was resulting in a double completion causing a kernel oops. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-27storvsc: ring buffer failures may result in I/O freezeLong Li
commit e86fb5e8ab95f10ec5f2e9430119d5d35020c951 upstream. When ring buffer returns an error indicating retry, storvsc may not return a proper error code to SCSI when bounce buffer is not used. This has introduced I/O freeze on RAID running atop storvsc devices. This patch fixes it by always returning a proper error code. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-08megaraid_sas: corrected return of wait_event from abort frame pathSumit.Saxena@avagotech.com
commit 170c238701ec38b1829321b17c70671c101bac55 upstream. Corrected wait_event() call which was waiting for wrong completion status (0xFF). Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-06bnx2fc: do not add shared skbs to the fcoe_rx_listMaurizio Lombardi
commit 01a4cc4d0cd6a836c7b923760e8eb1cbb6a47258 upstream. In some cases, the fcoe_rx_list may contains multiple instances of the same skb (the so called "shared skbs"). the bnx2fc_l2_rcv thread is a loop that extracts a skb from the list, modifies (and destroys) its content and then proceed to the next one. The problem is that if the skb is shared, the remaining instances will be corrupted. The solution is to use skb_share_check() before adding the skb to the fcoe_rx_list. [ 6286.808725] ------------[ cut here ]------------ [ 6286.808729] WARNING: at include/scsi/fc_frame.h:173 bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc]() [ 6286.808748] Modules linked in: bnx2x(-) mdio dm_service_time bnx2fc cnic uio fcoe libfcoe 8021q garp stp mrp libfc llc scsi_transport_fc scsi_tgt sg iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel e1000e ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper ptp cryptd hpilo serio_raw hpwdt lpc_ich pps_core ipmi_si pcspkr mfd_core ipmi_msghandler shpchp pcc_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc dm_multipath xfs libcrc32c ata_generic pata_acpi sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit ata_piix drm_kms_helper ttm drm libata i2c_core hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mdio] [ 6286.808750] CPU: 3 PID: 1304 Comm: bnx2fc_l2_threa Not tainted 3.10.0-121.el7.x86_64 #1 [ 6286.808750] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 6286.808752] 0000000000000000 000000000b36e715 ffff8800deba1e00 ffffffff815ec0ba [ 6286.808753] ffff8800deba1e38 ffffffff8105dee1 ffffffffa05618c0 ffff8801e4c81888 [ 6286.808754] ffffe8ffff663868 ffff8801f402b180 ffff8801f56bc000 ffff8800deba1e48 [ 6286.808754] Call Trace: [ 6286.808759] [<ffffffff815ec0ba>] dump_stack+0x19/0x1b [ 6286.808762] [<ffffffff8105dee1>] warn_slowpath_common+0x61/0x80 [ 6286.808763] [<ffffffff8105e00a>] warn_slowpath_null+0x1a/0x20 [ 6286.808765] [<ffffffffa054f415>] bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc] [ 6286.808767] [<ffffffffa054eff0>] ? bnx2fc_disable+0x90/0x90 [bnx2fc] [ 6286.808769] [<ffffffff81085aef>] kthread+0xcf/0xe0 [ 6286.808770] [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140 [ 6286.808772] [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0 [ 6286.808773] [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140 [ 6286.808774] ---[ end trace c6cdb939184ccb4e ]--- Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21SCSI: hpsa: fix a race in cmd_free/scsi_doneTomas Henzl
commit 2cc5bfaf854463d9d1aa52091f60110fbf102a96 upstream. When the driver calls scsi_done and after that frees it's internal preallocated memory it can happen that a new job is enqueud before the memory is freed. The allocation fails and the message "cmd_alloc returned NULL" is shown. Patch below fixes it by moving cmd->scsi_done after cmd_free. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Cc: Masoud Sharbiani <msharbiani@twitter.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21scsi: only re-lock door after EH on devices that were resetChristoph Hellwig
commit 48379270fe6808cf4612ee094adc8da2b7a83baa upstream. Setups that use the blk-mq I/O path can lock up if a host with a single device that has its door locked enters EH. Make sure to only send the command to re-lock the door to devices that actually were reset and thus might have lost their state. Otherwise the EH code might be get blocked on blk_get_request as all requests for non-reset devices might be in use. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Meelis Roos <meelis.roos@ut.ee> Tested-by: Meelis Roos <meelis.roos@ut.ee> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14qla_target: don't delete changed naclsJoern Engel
commit f4c24db1b7ad0ce84409e15744d26c6f86a96840 upstream. The code is currently riddled with "drop the hardware_lock to avoid a deadlock" bugs that expose races. One of those races seems to expose a valid warning in tcm_qla2xxx_clear_nacl_from_fcport_map. Add some bandaid to it. Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30qla2xxx: Use correct offset to req-q-out for reserve calculationArun Easi
commit 75554b68ac1e018bca00d68a430b92ada8ab52dd upstream. Signed-off-by: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30be2iscsi: check ip buffer before copyingMike Christie
commit a41a9ad3bbf61fae0b6bfb232153da60d14fdbd9 upstream. Dan Carpenter found a issue where be2iscsi would copy the ip from userspace to the driver buffer before checking the len of the data being copied: http://marc.info/?l=linux-scsi&m=140982651504251&w=2 This patch just has us only copy what we the driver buffer can support. Tested-by: John Soni Jose <sony.john-n@emulex.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05SCSI: libiscsi: fix potential buffer overrun in __iscsi_conn_send_pduMike Christie
commit db9bfd64b14a3a8f1868d2164518fdeab1b26ad1 upstream. This patches fixes a potential buffer overrun in __iscsi_conn_send_pdu. This function is used by iscsi drivers and userspace to send iscsi PDUs/ commands. For login commands, we have a set buffer size. For all other commands we do not support data buffers. This was reported by Dan Carpenter here: http://www.spinics.net/lists/linux-scsi/msg66838.html Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17bfa: Fix undefined bit shift on big-endian architectures with 32-bit DMA addressBen Hutchings
commit 03a6c3ff3282ee9fa893089304d951e0be93a144 upstream. bfa_swap_words() shifts its argument (assumed to be 64-bit) by 32 bits each way. In two places the argument type is dma_addr_t, which may be 32-bit, in which case the effect of the bit shift is undefined: drivers/scsi/bfa/bfa_fcpim.c: In function 'bfa_ioim_send_ioreq': drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: left shift count >= width of type [enabled by default] addr = bfa_sgaddr_le(sg_dma_address(sg)); ^ drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: right shift count >= width of type [enabled by default] drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: left shift count >= width of type [enabled by default] addr = bfa_sgaddr_le(sg_dma_address(sg)); ^ drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: right shift count >= width of type [enabled by default] Avoid this by adding casts to u64 in bfa_swap_words(). Compile-tested only. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com> Fixes: f16a17507b09 ('[SCSI] bfa: remove all OS wrappers') Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17drivers: scsi: storvsc: Correctly handle TEST_UNIT_READY failureK. Y. Srinivasan
commit 3533f8603d28b77c62d75ec899449a99bc6b77a1 upstream. On some Windows hosts on FC SANs, TEST_UNIT_READY can return SRB_STATUS_ERROR. Correctly handle this. Note that there is sufficient sense information to support scsi error handling even in this case. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17Drivers: scsi: storvsc: Implement a eh_timed_out handlerK. Y. Srinivasan
commit 56b26e69c8283121febedd12b3cc193384af46b9 upstream. On Azure, we have seen instances of unbounded I/O latencies. To deal with this issue, implement handler that can reset the timeout. Note that the host gaurantees that it will respond to each command that has been issued. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Hannes Reinecke <hare@suse.de> [hch: added a better comment explaining the issue] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-05hpsa: fix bad -ENOMEM return value in hpsa_big_passthru_ioctlStephen M. Cameron
commit 0758f4f732b08b6ef07f2e5f735655cf69fea477 upstream. When copy_from_user fails, return -EFAULT, not -ENOMEM Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Reported-by: Robert Elliott <elliott@hp.com> Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com> Reviewed-by: Scott Teel <scott.teel@hp.com> Reviewed by: Mike MIller <michael.miller@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-07scsi: handle flush errors properlyJames Bottomley
commit 89fb4cd1f717a871ef79fa7debbe840e3225cd54 upstream. Flush commands don't transfer data and thus need to be special cased in the I/O completion handler so that we can propagate errors to the block layer and filesystem. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Reported-by: Steven Haber <steven@qumulo.com> Tested-by: Steven Haber <steven@qumulo.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09sym53c8xx_2: Set DID_REQUEUE return code when aborting squeueMikulas Patocka
commit fd1232b214af43a973443aec6a2808f16ee5bf70 upstream. This patch fixes I/O errors with the sym53c8xx_2 driver when the disk returns QUEUE FULL status. When the controller encounters an error (including QUEUE FULL or BUSY status), it aborts all not yet submitted requests in the function sym_dequeue_from_squeue. This function aborts them with DID_SOFT_ERROR. If the disk has full tag queue, the request that caused the overflow is aborted with QUEUE FULL status (and the scsi midlayer properly retries it until it is accepted by the disk), but the sym53c8xx_2 driver aborts the following requests with DID_SOFT_ERROR --- for them, the midlayer does just a few retries and then signals the error up to sd. The result is that disk returning QUEUE FULL causes request failures. The error was reproduced on 53c895 with COMPAQ BD03685A24 disk (rebranded ST336607LC) with command queue 48 or 64 tags. The disk has 64 tags, but under some access patterns it return QUEUE FULL when there are less than 64 pending tags. The SCSI specification allows returning QUEUE FULL anytime and it is up to the host to retry. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09virtio-scsi: fix various bad behavior on aborted requestsPaolo Bonzini
commit 8faeb529b2dabb9df691d614dda18910a43d05c9 upstream. Even though the virtio-scsi spec guarantees that all requests related to the TMF will have been completed by the time the TMF itself completes, the request queue's callback might not have run yet. This causes requests to be completed more than once, and as a result triggers a variety of BUGs or oopses. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09virtio-scsi: avoid cancelling uninitialized work itemsPaolo Bonzini
commit cdda0e5acbb78f7b777049f8c27899e5c5bb368f upstream. Calling the workqueue interface on uninitialized work items isn't a good idea even if they're zeroed. It's not failing catastrophically only through happy accidents. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09ibmvscsi: Add memory barriers for send / receiveBrian King
commit 7114aae02742d6b5c5a0d39a41deb61d415d3717 upstream. Add a memory barrier prior to sending a new command to the VIOS to ensure the VIOS does not receive stale data in the command buffer. Also add a memory barrier when processing the CRQ for completed commands. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09ibmvscsi: Abort init sequence during error recoveryBrian King
commit 9ee755974bea2f9880e517ec985dc9dede1b3a36 upstream. If a CRQ reset is triggered for some reason while in the middle of performing VSCSI adapter initialization, we don't want to call the done function for the initialization MAD commands as this will only result in two threads attempting initialization at the same time, resulting in failures. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-26net: Use netlink_ns_capable to verify the permisions of netlink messagesEric W. Biederman
[ Upstream commit 90f62cf30a78721641e08737bda787552428061e ] It is possible by passing a netlink socket to a more privileged executable and then to fool that executable into writing to the socket data that happens to be valid netlink message to do something that privileged executable did not intend to do. To keep this from happening replace bare capable and ns_capable calls with netlink_capable, netlink_net_calls and netlink_ns_capable calls. Which act the same as the previous calls except they verify that the opener of the socket had the desired permissions as well. Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-16SCSI: megaraid: Use resource_size_t for PCI resources, not longBen Collins
commit 11f8a7b31f2140b0dc164bb484281235ffbe51d3 upstream. The assumption that sizeof(long) >= sizeof(resource_size_t) can lead to truncation of the PCI resource address, meaning this driver didn't work on 32-bit systems with 64-bit PCI adressing ranges. Signed-off-by: Ben Collins <ben.c@servergy.com> Acked-by: Sumit Saxena <sumit.saxena@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30SCSI: megaraid: missing bounds check in mimd_to_kioc()Dan Carpenter
commit 3de2260140417759c669d391613d583baf03b0cf upstream. pthru32->dataxferlen comes from the user so we need to check that it's not too large so we don't overflow the buffer. Reported-by: Nico Golde <nico@ngolde.de> Reported-by: Fabian Yamaguchi <fabs@goesec.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sumit Saxena <sumit.saxena@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30SCSI: dual scan thread bug fixJames Bottomley
commit f2495e228fce9f9cec84367547813cbb0d6db15a upstream. In the highly unusual case where two threads are running concurrently through the scanning code scanning the same target, we run into the situation where one may allocate the target while the other is still using it. In this case, because the reap checks for STARGET_CREATED and kills the target without reference counting, the second thread will do the wrong thing on reap. Fix this by reference counting even creates and doing the STARGET_CREATED check in the final put. Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30scsi: fix our current target reap infrastructureJames Bottomley
commit e63ed0d7a98014fdfc2cfeb3f6dada313dcabb59 upstream. This patch eliminates the reap_ref and replaces it with a proper kref. On last put of this kref, the target is removed from visibility in sysfs. The final call to scsi_target_reap() for the device is done from __scsi_remove_device() and only if the device was made visible. This ensures that the target disappears as soon as the last device is gone rather than waiting until final release of the device (which is often too long). Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13mpt2sas: Don't disable device twice at suspend.Tyler Stachecki
commit af61e27c3f77c7623b5335590ae24b6a5c323e22 upstream. On suspend, _scsih_suspend calls mpt2sas_base_free_resources, which in turn calls pci_disable_device if the device is enabled prior to suspending. However, _scsih_suspend also calls pci_disable_device itself. Thus, in the event that the device is enabled prior to suspending, pci_disable_device will be called twice. This patch removes the duplicate call to pci_disable_device in _scsi_suspend as it is both unnecessary and results in a kernel oops. Signed-off-by: Tyler Stachecki <tstache1@binghamton.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13virtio-scsi: Skip setting affinity on uninitialized vqFam Zheng
commit 0c8482ac92db5ac15792caf23b7f7df9e4f48ae1 upstream. virtscsi_init calls virtscsi_remove_vqs on err, even before initializing the vqs. The latter calls virtscsi_set_affinity, so let's check the pointer there before setting affinity on it. This fixes a panic when setting device's num_queues=2 on RHEL 6.5: qemu-system-x86_64 ... \ -device virtio-scsi-pci,id=scsi0,addr=0x13,...,num_queues=2 \ -drive file=/stor/vm/dummy.raw,id=drive-scsi-disk,... \ -device scsi-hd,drive=drive-scsi-disk,... [ 0.354734] scsi0 : Virtio SCSI HBA [ 0.379504] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 0.380141] IP: [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120 [ 0.380141] PGD 0 [ 0.380141] Oops: 0000 [#1] SMP [ 0.380141] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0+ #5 [ 0.380141] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 [ 0.380141] task: ffff88003c9f0000 ti: ffff88003c9f8000 task.ti: ffff88003c9f8000 [ 0.380141] RIP: 0010:[<ffffffff814741ef>] [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120 [ 0.380141] RSP: 0000:ffff88003c9f9c08 EFLAGS: 00010256 [ 0.380141] RAX: 0000000000000000 RBX: ffff88003c3a9d40 RCX: 0000000000001070 [ 0.380141] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000 [ 0.380141] RBP: ffff88003c9f9c28 R08: 00000000000136c0 R09: ffff88003c801c00 [ 0.380141] R10: ffffffff81475229 R11: 0000000000000008 R12: 0000000000000000 [ 0.380141] R13: ffffffff81cc7ca8 R14: ffff88003cac3d40 R15: ffff88003cac37a0 [ 0.380141] FS: 0000000000000000(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000 [ 0.380141] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 0.380141] CR2: 0000000000000020 CR3: 0000000001c0e000 CR4: 00000000000006f0 [ 0.380141] Stack: [ 0.380141] ffff88003c3a9d40 0000000000000000 ffff88003cac3d80 ffff88003cac3d40 [ 0.380141] ffff88003c9f9c48 ffffffff814742e8 ffff88003c26d000 ffff88003c26d000 [ 0.380141] ffff88003c9f9c68 ffffffff81474321 ffff88003c26d000 ffff88003c3a9d40 [ 0.380141] Call Trace: [ 0.380141] [<ffffffff814742e8>] virtscsi_set_affinity+0x28/0x40 [ 0.380141] [<ffffffff81474321>] virtscsi_remove_vqs+0x21/0x50 [ 0.380141] [<ffffffff81475231>] virtscsi_init+0x91/0x240 [ 0.380141] [<ffffffff81365290>] ? vp_get+0x50/0x70 [ 0.380141] [<ffffffff81475544>] virtscsi_probe+0xf4/0x280 [ 0.380141] [<ffffffff81363ea5>] virtio_dev_probe+0xe5/0x140 [ 0.380141] [<ffffffff8144c669>] driver_probe_device+0x89/0x230 [ 0.380141] [<ffffffff8144c8ab>] __driver_attach+0x9b/0xa0 [ 0.380141] [<ffffffff8144c810>] ? driver_probe_device+0x230/0x230 [ 0.380141] [<ffffffff8144c810>] ? driver_probe_device+0x230/0x230 [ 0.380141] [<ffffffff8144ac1c>] bus_for_each_dev+0x8c/0xb0 [ 0.380141] [<ffffffff8144c499>] driver_attach+0x19/0x20 [ 0.380141] [<ffffffff8144bf28>] bus_add_driver+0x198/0x220 [ 0.380141] [<ffffffff8144ce9f>] driver_register+0x5f/0xf0 [ 0.380141] [<ffffffff81d27c91>] ? spi_transport_init+0x79/0x79 [ 0.380141] [<ffffffff8136403b>] register_virtio_driver+0x1b/0x30 [ 0.380141] [<ffffffff81d27d19>] init+0x88/0xd6 [ 0.380141] [<ffffffff81d27c18>] ? scsi_init_procfs+0x5b/0x5b [ 0.380141] [<ffffffff81ce88a7>] do_one_initcall+0x7f/0x10a [ 0.380141] [<ffffffff81ce8aa7>] kernel_init_freeable+0x14a/0x1de [ 0.380141] [<ffffffff81ce8b3b>] ? kernel_init_freeable+0x1de/0x1de [ 0.380141] [<ffffffff817dec20>] ? rest_init+0x80/0x80 [ 0.380141] [<ffffffff817dec29>] kernel_init+0x9/0xf0 [ 0.380141] [<ffffffff817e68fc>] ret_from_fork+0x7c/0xb0 [ 0.380141] [<ffffffff817dec20>] ? rest_init+0x80/0x80 [ 0.380141] RIP [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120 [ 0.380141] RSP <ffff88003c9f9c08> [ 0.380141] CR2: 0000000000000020 [ 0.380141] ---[ end trace 8074b70c3d5e1d73 ]--- [ 0.475018] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 0.475018] [ 0.475068] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) [ 0.475068] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [jejb: checkpatch fixes] Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-06SCSI: arcmsr: upper 32 of dma address lostDan Carpenter
commit e2c70425f05219b142b3a8a9489a622c736db39d upstream. The original code always set the upper 32 bits to zero because it was doing a shift of the wrong variable. Fixes: 1a4f550a09f8 ('[SCSI] arcmsr: 1.20.00.15: add SATA RAID plus other fixes') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>