[ Upstream commit 120f1d95efb1cdb6fe023c84e38ba06d8f78cd03 ]
I'm facing this warning when building for the parisc64 architecture:
drivers/scsi/mpt3sas/mpt3sas_base.c: In function ‘_base_make_ioc_operational’:
drivers/scsi/mpt3sas/mpt3sas_base.c:5396:40: warning: array subscript ‘Mpi2SasIOUnitPage1_t {aka struct _MPI2_CONFIG_PAGE_SASIOUNIT_1}[0]’ is partly outside array bounds of ‘unsigned char[20]’ [-Warray-bounds]
5396 | (le16_to_cpu(sas_iounit_pg1->SASWideMaxQueueDepth)) ?
drivers/scsi/mpt3sas/mpt3sas_base.c:5382:26: note: referencing an object of size 20 allocated by ‘kzalloc’
5382 | sas_iounit_pg1 = kzalloc(sz, GFP_KERNEL);
| ^~~~~~~~~~~~~~~~~~~~~~~
The problem is, that only 20 bytes are allocated with kmalloc(), which is
sufficient to hold the bytes which are needed. Nevertheless, gcc complains
because the whole Mpi2SasIOUnitPage1_t struct is 32 bytes in size and thus
doesn't fit into those 20 bytes.
This patch simply allocates all 32 bytes (instead of 20) and thus avoids
the warning. There is no functional change introduced by this patch.
While touching the code I cleaned up to calculation of max_wideport_qd,
max_narrowport_qd and max_sata_qd to make it easier readable.
Test successfully tested on a HP C8000 PA-RISC workstation with 64-bit
kernel.
Link: https://lore.kernel.org/r/YpZ197iZdDZSCzrT@p100
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 336d63615466b4c06b9401c987813fd19bdde39b ]
After issuing a LIP, a specific target vendor does not ACC the FLOGI that
lpfc sends. However, it does send its own FLOGI that lpfc ACCs. The
target then establishes the port IDs by sending a PLOGI. lpfc PLOGI_ACCs
and starts the RPI registration for DID 0x000001. The target then sends a
LOGO to the fabric DID. lpfc is currently treating the LOGO from the
fabric DID as a link down and cleans up all the ndlps. The ndlp for DID
0x000001 is put back into NPR and discovery stops, leaving the port in
stuck in bypassed mode.
Change lpfc behavior such that if a LOGO is received for the fabric DID in
PT2PT topology skip the lpfc_linkdown_port() routine and just move the
fabric DID back to NPR.
Link: https://lore.kernel.org/r/20220603174329.63777-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1b3440f437b75fb2a9b0cfe58df461e40eca474 ]
A use-after-free crash can occur after an ELS LOGO is aborted.
Specifically, a nodelist structure is freed and then
ndlp->vport->cfg_log_verbose is dereferenced in lpfc_nlp_get() when the
discovery state machine is mistakenly called a second time with
NLP_EVT_DEVICE_RM argument.
Rework lpfc_cmpl_els_logo() to prevent the duplicate calls to release a
nodelist structure.
Link: https://lore.kernel.org/r/20220603174329.63777-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf71d59c2eceadfcde0fb52e237990a0909880d7 ]
vcpuHint has been expanded to 16 bit on host to enable routing to more
CPUs. Guest side should align with the change. This change has been tested
with hosts with 8-bit and 16-bit vcpuHint, on both platforms host side can
get correct value.
Link: https://lore.kernel.org/r/EF35F4D5-5DCC-42C5-BCC4-29DF1729B24C@vmware.com
Signed-off-by: Wentao Wang <wwentao@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f9f0a46141e2e39bedb4779c88380d1b5f018c14 ]
When myrb_probe() fails the callback might not be set, so we need to
validate the 'disable_intr' callback in myrb_cleanup() to not cause a null
pointer exception. And while at it do not call myrb_cleanup() if we cannot
enable the PCI device at all.
Link: https://lore.kernel.org/r/20220523120244.99515-1-hare@suse.de
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Tested-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 05fbde3a77a4f1d62e4c4428f384288c1f1a0be5 ]
If sd_probe() sees an early error before sdkp->device is initialized,
sd_zbc_release_disk() is called. This causes a NULL pointer dereference
when sd_is_zoned() is called inside that function. Avoid this by removing
the call to sd_zbc_release_disk() in sd_probe() error path.
This change is safe and does not result in zone information memory leakage
because the zone information for a zoned disk is allocated only when
sd_revalidate_disk() is called, at which point sdkp->disk_dev is fully set,
resulting in sd_disk_release() being called when needed to cleanup a disk
zone information using sd_zbc_release_disk().
Link: https://lore.kernel.org/r/20220601062544.905141-2-damien.lemoal@opensource.wdc.com
Fixes: 89d9475610 ("sd: Implement support for ZBC devices")
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8eecddfca30e1651dc1c74531ed5eef21dcce7e3 upstream.
In ufs_qcom_dev_ref_clk_ctrl(), it was noted that the ref_clk needs to be
stable for at least 1us. Even though there is wmb() to make sure the write
gets "completed", there is no guarantee that the write actually reached the
UFS device. There is a good chance that the write could be stored in a
Write Buffer (WB). In that case, even though the CPU waits for 1us, the
ref_clk might not be stable for that period.
So lets do a readl() to make sure that the previous write has reached the
UFS device before udelay().
Also, the wmb() after writel_relaxed() is not really needed. Both writel()
and readl() are ordered on all architectures and the CPU won't speculate
instructions after readl() due to the in-built control dependency with read
value on weakly ordered architectures. So it can be safely removed.
Link: https://lore.kernel.org/r/20220504084212.11605-4-manivannan.sadhasivam@linaro.org
Fixes: f06fcc7155 ("scsi: ufs-qcom: add QUniPro hardware support and power optimizations")
Cc: stable@vger.kernel.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 036a45aa587a10fa2abbd50fbd0f6c4cfc44f69f upstream.
The bug is here:
p->target_id, p->target_lun);
The list iterator 'p' will point to a bogus position containing HEAD if the
list is empty or no element is found. This case must be checked before any
use of the iterator, otherwise it will lead to an invalid memory access.
To fix this bug, add a check. Use a new variable 'iter' as the list
iterator, and use the original variable 'p' as a dedicated pointer to point
to the found element.
Link: https://lore.kernel.org/r/20220414040231.2662-1-xiam0nd.tong@gmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 54db804d5d7d36709d1ce70bde3b9a6c61b290b6 ]
Fix the following Wstringop-overflow warnings when building with GCC-11:
drivers/scsi/fcoe/fcoe.c: In function ‘fcoe_netdev_config’:
drivers/scsi/fcoe/fcoe.c:744:32: warning: ‘fcoe_wwn_from_mac’ accessing 32 bytes in a region of size 6 [-Wstringop-overflow=]
744 | wwnn = fcoe_wwn_from_mac(ctlr->ctl_src_addr, 1, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/fcoe/fcoe.c:744:32: note: referencing argument 1 of type ‘unsigned char *’
In file included from drivers/scsi/fcoe/fcoe.c:36:
./include/scsi/libfcoe.h:252:5: note: in a call to function ‘fcoe_wwn_from_mac’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
| ^~~~~~~~~~~~~~~~~
drivers/scsi/fcoe/fcoe.c:747:32: warning: ‘fcoe_wwn_from_mac’ accessing 32 bytes in a region of size 6 [-Wstringop-overflow=]
747 | wwpn = fcoe_wwn_from_mac(ctlr->ctl_src_addr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
748 | 2, 0);
| ~~~~~
drivers/scsi/fcoe/fcoe.c:747:32: note: referencing argument 1 of type ‘unsigned char *’
In file included from drivers/scsi/fcoe/fcoe.c:36:
./include/scsi/libfcoe.h:252:5: note: in a call to function ‘fcoe_wwn_from_mac’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
| ^~~~~~~~~~~~~~~~~
CC drivers/scsi/bnx2fc/bnx2fc_io.o
In function ‘bnx2fc_net_config’,
inlined from ‘bnx2fc_if_create’ at drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1543:7:
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:833:32: warning: ‘fcoe_wwn_from_mac’ accessing 32 bytes in a region of size 6 [-Wstringop-overflow=]
833 | wwnn = fcoe_wwn_from_mac(ctlr->ctl_src_addr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
834 | 1, 0);
| ~~~~~
drivers/scsi/bnx2fc/bnx2fc_fcoe.c: In function ‘bnx2fc_if_create’:
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:833:32: note: referencing argument 1 of type ‘unsigned char *’
In file included from drivers/scsi/bnx2fc/bnx2fc.h:53,
from drivers/scsi/bnx2fc/bnx2fc_fcoe.c:17:
./include/scsi/libfcoe.h:252:5: note: in a call to function ‘fcoe_wwn_from_mac’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
| ^~~~~~~~~~~~~~~~~
In function ‘bnx2fc_net_config’,
inlined from ‘bnx2fc_if_create’ at drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1543:7:
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:839:32: warning: ‘fcoe_wwn_from_mac’ accessing 32 bytes in a region of size 6 [-Wstringop-overflow=]
839 | wwpn = fcoe_wwn_from_mac(ctlr->ctl_src_addr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
840 | 2, 0);
| ~~~~~
drivers/scsi/bnx2fc/bnx2fc_fcoe.c: In function ‘bnx2fc_if_create’:
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:839:32: note: referencing argument 1 of type ‘unsigned char *’
In file included from drivers/scsi/bnx2fc/bnx2fc.h:53,
from drivers/scsi/bnx2fc/bnx2fc_fcoe.c:17:
./include/scsi/libfcoe.h:252:5: note: in a call to function ‘fcoe_wwn_from_mac’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
| ^~~~~~~~~~~~~~~~~
drivers/scsi/qedf/qedf_main.c: In function ‘__qedf_probe’:
drivers/scsi/qedf/qedf_main.c:3520:30: warning: ‘fcoe_wwn_from_mac’ accessing 32 bytes in a region of size 6 [-Wstringop-overflow=]
3520 | qedf->wwnn = fcoe_wwn_from_mac(qedf->mac, 1, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/qedf/qedf_main.c:3520:30: note: referencing argument 1 of type ‘unsigned char *’
In file included from drivers/scsi/qedf/qedf.h:9,
from drivers/scsi/qedf/qedf_main.c:23:
./include/scsi/libfcoe.h:252:5: note: in a call to function ‘fcoe_wwn_from_mac’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
| ^~~~~~~~~~~~~~~~~
drivers/scsi/qedf/qedf_main.c:3521:30: warning: ‘fcoe_wwn_from_mac’ accessing 32 bytes in a region of size 6 [-Wstringop-overflow=]
3521 | qedf->wwpn = fcoe_wwn_from_mac(qedf->mac, 2, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/qedf/qedf_main.c:3521:30: note: referencing argument 1 of type ‘unsigned char *’
In file included from drivers/scsi/qedf/qedf.h:9,
from drivers/scsi/qedf/qedf_main.c:23:
./include/scsi/libfcoe.h:252:5: note: in a call to function ‘fcoe_wwn_from_mac’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
| ^~~~~~~~~~~~~~~~~
by changing the array size to the correct value of ETH_ALEN in the
argument declaration.
Also, fix a couple of checkpatch warnings:
WARNING: function definition argument 'unsigned int' should also have an identifier name
This helps with the ongoing efforts to globally enable
-Wstringop-overflow.
Link: https://github.com/KSPP/linux/issues/181
Fixes: 85b4aa4926 ("[SCSI] fcoe: Fibre Channel over Ethernet")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef60031022eb6d972aac86ca26c98c33e1289436 ]
Some devices may return invalid or zeroed data during an UIC error
condition. In addition, reading these SFRs will clear them. This means the
subsequent error handling will not be able to see them and therefore no
error handling will be scheduled.
Skip reading these SFRs in ufshcd_dump_regs().
Link: https://lore.kernel.org/r/1648689845-33521-1-git-send-email-kwmad.kim@samsung.com
Fixes: d672475664 ("scsi: ufs: Use explicit access size in ufshcd_dump_regs")
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6f51041450282a8668af3a8fc5c7744e81a447c ]
When configuring CMF management based on signals instead of FPINs, FPIN
alarm and warning statistics are not tracked.
Change the behavior so that FPIN alarms and warnings are always tracked
regardless of the configured mode.
Similar changes are made in the CMF signal stat accounting logic. Upon
receipt of a signal, only track signaled alarms and warnings. FPIN stats
should not be incremented upon receipt of a signal.
Link: https://lore.kernel.org/r/20220506035519.50908-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d6d45f67a11136cb88a70a29ab22ea6db8ae6bd5 ]
The following was seen with CMF enabled:
BUG: using smp_processor_id() in preemptible
code: systemd-udevd/31711
kernel: caller is lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: CPU: 12 PID: 31711 Comm: systemd-udevd
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x44/0x57
kernel: check_preemption_disabled+0xbf/0xe0
kernel: lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: lpfc_nvme_fcp_io_submit+0x23b4/0x4df0 [lpfc]
this_cpu_ptr() calls smp_processor_id() in a preemptible context.
Fix by using per_cpu_ptr() with raw_smp_processor_id() instead.
Link: https://lore.kernel.org/r/20220412222008.126521-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03cbbd7c2f5ee288f648f4aeedc765a181188553 ]
During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.
CPU A:
native_queued_spin_lock_slowpath+0x192
_raw_spin_lock_irqsave+0x32
lpfc_handle_fcp_err+0x4c6
lpfc_fcp_io_cmd_wqe_cmpl+0x964
lpfc_sli4_fp_handle_cqe+0x266
__lpfc_sli4_process_cq+0x105
__lpfc_sli4_hba_process_cq+0x3c
lpfc_cq_poll_hdler+0x16
irq_poll_softirq+0x76
__softirqentry_text_start+0xe4
irq_exit+0xf7
do_IRQ+0x7f
CPU B:
native_queued_spin_lock_slowpath+0x5b
_raw_spin_lock+0x1c
lpfc_abort_handler+0x13e
scmd_eh_abort_handler+0x85
process_one_work+0x1a7
worker_thread+0x30
kthread+0x112
ret_from_fork+0x1f
Diagram of lockup:
CPUA CPUB
---- ----
lpfc_cmd->buf_lock
phba->hbalock
lpfc_cmd->buf_lock
phba->hbalock
Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.
Link: https://lore.kernel.org/r/20220412222008.126521-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e294647b1aed4247fe52851f3a3b2b19ae906228 ]
In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.
Call Trace:
_raw_spin_lock_irqsave+0x32/0x40
lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
lpfc_do_work+0x1485/0x1d70 [lpfc]
kthread+0x112/0x130
ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP
The same CPU tries to claim the phba->port_list_lock twice.
Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().
Link: https://lore.kernel.org/r/20220412222008.126521-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 26f9ce53817a8fd84b69a73473a7de852a24c897 ]
Aborting commands that have already been sent to the firmware can
cause BUG in qlt_free_cmd(): BUG_ON(cmd->sg_mapped)
For instance:
- Command passes rdx_to_xfer state, maps sgl, sends to the firmware
- Reset occurs, qla2xxx performs ISP error recovery, aborts the command
- Target stack calls qlt_abort_cmd() and then qlt_free_cmd()
- BUG_ON(cmd->sg_mapped) in qlt_free_cmd() occurs because sgl was not
unmapped
Thus, unmap sgl in qlt_abort_cmd() for commands with the aborted flag set.
Link: https://lore.kernel.org/r/AS8PR10MB4952D545F84B6B1DFD39EC1E9DEE9@AS8PR10MB4952.EURPRD10.PROD.OUTLOOK.COM
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Gleb Chesnokov <Chesnokov.G@raidix.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6056a92ceb2a7705d61df7ec5370548e96aee258 ]
The handling of the ALUA transitioning state is currently broken. When a
target goes into this state, it is expected that the target is allowed to
stay in this state for the implicit transition timeout without a path
failure. The handler has this logic, but it gets skipped currently.
When the target transitions, there is in-flight I/O from the initiator. The
first of these responses from the target will be a unit attention letting
the initiator know that the ALUA state has changed. The remaining
in-flight I/Os, before the initiator finds out that the portal state has
changed, will return not ready, ALUA state is transitioning. The portal
state will change to SCSI_ACCESS_STATE_TRANSITIONING. This will lead to all
new I/O immediately failing the path unexpectedly. The path failure happens
in less than a second instead of the expected successes until the
transition timer is exceeded.
Allow I/Os to continue while the path is in the ALUA transitioning
state. The handler already takes care of a target that stays in the
transitioning state for too long by changing the state to ALUA state
standby once the transition timeout is exceeded at which point the path
will fail.
Link: https://lore.kernel.org/r/CAHZQxy+4sTPz9+pY3=7VJH+CLUJsDct81KtnR2be8ycN5mhqTg@mail.gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Krishna Kant <krishna.kant@purestorage.com>
Acked-by: Seamus Connor <sconnor@purestorage.com>
Signed-off-by: Brian Bunker <brian@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit faad6cebded8e0fd902b672f220449b93db479eb ]
sr_ioctl.c uses this pattern:
result = sr_do_ioctl(cd, &cgc);
to-user = buffer[];
kfree(buffer);
return result;
Use of a buffer without checking leaks information. Check result and jump
over the use of buffer if there is an error.
result = sr_do_ioctl(cd, &cgc);
if (result)
goto err;
to-user = buffer[];
err:
kfree(buffer);
return result;
Additionally, initialize the buffer to zero.
This problem can be seen in the 2.4.0 kernel.
Link: https://lore.kernel.org/r/20220411174756.2418435-1-trix@redhat.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 857b06527f707f5df634b854898a191b5c1d0272 ]
We set the qedi_ep state to EP_STATE_OFLDCONN_START when the ep is
created. Then in qedi_set_path we kick off the offload work. If userspace
times out the connection and calls ep_disconnect, qedi will only flush the
offload work if the qedi_ep state has transitioned away from
EP_STATE_OFLDCONN_START. If we can't connect we will not have transitioned
state and will leave the offload work running, and we will free the qedi_ep
from under it.
This patch just has us init the work when we create the ep, then always
flush it.
Link: https://lore.kernel.org/r/20220408001314.5014-10-michael.christie@oracle.com
Tested-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2bd3b6b75946db2ace06e145d53988e10ed7e99a upstream.
When ufs initializes without scmd->device->sector_size set, scsi_get_lba()
will get a wrong shift number and trigger an ubsan error. The shift
exponent 4294967286 is too large for the 64-bit type 'sector_t' (aka
'unsigned long long').
Call scsi_get_lba() only when opcode is READ_10/WRITE_10/UNMAP.
Link: https://lore.kernel.org/r/20220307111752.10465-1-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 56495f295d8e021f77d065b890fc0100e3f9f6d8 ]
The megaraid_sas driver supports single LUN for RAID devices. That is LUN
0. All other LUNs are unsupported. When a device scan on a logical target
with invalid LUN number is invoked through sysfs, that target ends up
getting removed.
Add LUN ID validation in the slave destroy function to avoid the target
deletion.
Link: https://lore.kernel.org/r/20220324094711.48833-1-chandrakanth.patil@broadcom.com
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0bade8e53279157c7cc9dd95d573b7e82223d78a ]
The adapter request_limit is hardcoded to be INITIAL_SRP_LIMIT which is
currently an arbitrary value of 800. Increase this value to 1024 which
better matches the characteristics of the typical IBMi Initiator that
supports 32 LUNs and a queue depth of 32.
This change also has the secondary benefit of being a power of two as
required by the kfifo API. Since, Commit ab9bb6318b ("Partially revert
"kfifo: fix kfifo_alloc() and kfifo_init()"") the size of IU pool for each
target has been rounded down to 512 when attempting to kfifo_init() those
pools with the current request_limit size of 800.
Link: https://lore.kernel.org/r/20220322194443.678433-1-tyreld@linux.ibm.com
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df0101197c4d9596682901631f3ee193ed354873 ]
When recovering from a pci-parity error the driver is failing to re-create
queues, causing recovery to fail. Looking deeper, it was found that the
interrupt vector count allocated on the recovery was fewer than the vectors
originally allocated. This disparity resulted in CPU map entries with stale
information. When the driver tries to re-create the queues, it attempts to
use the stale information which indicates an eq/interrupt vector that was
no longer created.
Fix by clearng the cpup map array before enabling and requesting the IRQs
in the lpfc_sli_reset_slot_s4 routine().
Link: https://lore.kernel.org/r/20220317032737.45308-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03690d81974535f228e892a14f0d2d44404fe555 ]
If a driver raises a connection error before the connection is bound, we
can leave a cleanup_work queued that can later run and disconnect/stop a
connection that is logged in. The problem is that drivers can call
iscsi_conn_error_event for endpoints that are connected but not yet bound
when something like the network port they are using is brought down.
iscsi_cleanup_conn_work_fn will check for this and exit early, but if the
cleanup_work is stuck behind other works, it might not get run until after
userspace has done ep_disconnect. Because the endpoint is not yet bound
there was no way for ep_disconnect to flush the work.
The bug of leaving stop_conns queued was added in:
Commit 23d6fefbb3 ("scsi: iscsi: Fix in-kernel conn failure handling")
and:
Commit 0ab710458d ("scsi: iscsi: Perform connection failure entirely in
kernel space")
was supposed to fix it, but left this case.
This patch moves the conn state check to before we even queue the work so
we can avoid queueing.
Link: https://lore.kernel.org/r/20220408001314.5014-7-michael.christie@oracle.com
Fixes: 0ab710458d ("scsi: iscsi: Perform connection failure entirely in kernel space")
Tested-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Lee Duncan <lduncan@@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c6e99c18167ed89729bf167ccb4a7e3ab3115ba ]
If iscsid is doing a stop_conn at the same time the kernel is starting
error recovery we can hit a race that allows the cleanup work to run on a
valid connection. In the race, iscsi_if_stop_conn sees the cleanup bit set,
but it calls flush_work on the clean_work before iscsi_conn_error_event has
queued it. The flush then returns before the queueing and so the
cleanup_work can run later and disconnect/stop a conn while it's in a
connected state.
The patch:
Commit 0ab710458d ("scsi: iscsi: Perform connection failure entirely in
kernel space")
added the late stop_conn call bug originally, and the patch:
Commit 23d6fefbb3 ("scsi: iscsi: Fix in-kernel conn failure handling")
attempted to fix it but only fixed the normal EH case and left the above
race for the iscsid restart case. For the normal EH case we don't hit the
race because we only signal userspace to start recovery after we have done
the queueing, so the flush will always catch the queued work or see it
completed.
For iscsid restart cases like boot, we can hit the race because iscsid will
call down to the kernel before the kernel has signaled any error, so both
code paths can be running at the same time. This adds a lock around the
setting of the cleanup bit and queueing so they happen together.
Link: https://lore.kernel.org/r/20220408001314.5014-6-michael.christie@oracle.com
Fixes: 0ab710458d ("scsi: iscsi: Perform connection failure entirely in kernel space")
Tested-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0aadafb5c34403a7cced1a8d61877048dc059f70 ]
This patch fixes a bug where when using iSCSI offload we can free an
endpoint while userspace still thinks it's active. That then causes the
endpoint ID to be reused for a new connection's endpoint while userspace
still thinks the ID is for the original connection. Userspace will then end
up disconnecting a running connection's endpoint or trying to bind to
another connection's endpoint.
This bug is a regression added in:
Commit 23d6fefbb3 ("scsi: iscsi: Fix in-kernel conn failure handling")
where we added a in kernel ep_disconnect call to fix a bug in:
Commit 0ab710458d ("scsi: iscsi: Perform connection failure entirely in
kernel space")
where we would call stop_conn without having done ep_disconnect. This early
ep_disconnect call will then free the endpoint and it's ID while userspace
still thinks the ID is valid.
Fix the early release of the ID by having the in kernel recovery code keep
a reference to the endpoint until userspace has called into the kernel to
finish cleaning up the endpoint/connection. It requires the previous commit
"scsi: iscsi: Release endpoint ID when its freed" which moved the freeing
of the ID until when the endpoint is released.
Link: https://lore.kernel.org/r/20220408001314.5014-5-michael.christie@oracle.com
Fixes: 23d6fefbb3 ("scsi: iscsi: Fix in-kernel conn failure handling")
Tested-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbd2283aaf47fef4ded4b29124b1ef3beb515f3a ]
When userspace restarts during boot or upgrades it won't know about the
offload driver's endpoint and connection mappings. iscsid will start by
cleaning up the old session by doing a stop_conn call. Later, if we are
able to create a new connection, we clean up the old endpoint during the
binding stage. The problem is that if we do stop_conn before doing the
ep_disconnect call offload, drivers can still be executing I/O. We then
might free tasks from the under the card/driver.
This moves the ep_disconnect call to before we do the stop_conn call for
this case. It will then work and look like a normal recovery/cleanup
procedure from the driver's point of view.
Link: https://lore.kernel.org/r/20220408001314.5014-3-michael.christie@oracle.com
Tested-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 294080eacf92a0781e6d43663448a55001ec8c64 ]
When upper inbound and outbound queues 32-63 are enabled, we see upper
vectors 32-63 in interrupt service routine. We need corresponding registers
to handle masking and unmasking of these upper interrupts.
To achieve this, we use registers MSGU_ODMR_U(0x34) to mask and
MSGU_ODMR_CLR_U(0x3C) to unmask the interrupts. In these registers bit 0-31
represents interrupt vectors 32-63.
Link: https://lore.kernel.org/r/20220411064603.668448-2-Ajish.Koshy@microchip.com
Fixes: 05c6c029a4 ("scsi: pm80xx: Increase number of supported queues")
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 87d663d40801dffc99a5ad3b0188ad3e2b4d1557 upstream.
The function mpt3sas_transport_port_remove() called in
_scsih_expander_node_remove() frees the port field of the sas_expander
structure, leading to the following use-after-free splat from KASAN when
the ioc_info() call following that function is executed (e.g. when doing
rmmod of the driver module):
[ 3479.371167] ==================================================================
[ 3479.378496] BUG: KASAN: use-after-free in _scsih_expander_node_remove+0x710/0x750 [mpt3sas]
[ 3479.386936] Read of size 1 at addr ffff8881c037691c by task rmmod/1531
[ 3479.393524]
[ 3479.395035] CPU: 18 PID: 1531 Comm: rmmod Not tainted 5.17.0-rc8+ #1436
[ 3479.401712] Hardware name: Supermicro Super Server/H12SSL-NT, BIOS 2.1 06/02/2021
[ 3479.409263] Call Trace:
[ 3479.411743] <TASK>
[ 3479.413875] dump_stack_lvl+0x45/0x59
[ 3479.417582] print_address_description.constprop.0+0x1f/0x120
[ 3479.423389] ? _scsih_expander_node_remove+0x710/0x750 [mpt3sas]
[ 3479.429469] kasan_report.cold+0x83/0xdf
[ 3479.433438] ? _scsih_expander_node_remove+0x710/0x750 [mpt3sas]
[ 3479.439514] _scsih_expander_node_remove+0x710/0x750 [mpt3sas]
[ 3479.445411] ? _raw_spin_unlock_irqrestore+0x2d/0x40
[ 3479.452032] scsih_remove+0x525/0xc90 [mpt3sas]
[ 3479.458212] ? mpt3sas_expander_remove+0x1d0/0x1d0 [mpt3sas]
[ 3479.465529] ? down_write+0xde/0x150
[ 3479.470746] ? up_write+0x14d/0x460
[ 3479.475840] ? kernfs_find_ns+0x137/0x310
[ 3479.481438] pci_device_remove+0x65/0x110
[ 3479.487013] __device_release_driver+0x316/0x680
[ 3479.493180] driver_detach+0x1ec/0x2d0
[ 3479.498499] bus_remove_driver+0xe7/0x2d0
[ 3479.504081] pci_unregister_driver+0x26/0x250
[ 3479.510033] _mpt3sas_exit+0x2b/0x6cf [mpt3sas]
[ 3479.516144] __x64_sys_delete_module+0x2fd/0x510
[ 3479.522315] ? free_module+0xaa0/0xaa0
[ 3479.527593] ? __cond_resched+0x1c/0x90
[ 3479.532951] ? lockdep_hardirqs_on_prepare+0x273/0x3e0
[ 3479.539607] ? syscall_enter_from_user_mode+0x21/0x70
[ 3479.546161] ? trace_hardirqs_on+0x1c/0x110
[ 3479.551828] do_syscall_64+0x35/0x80
[ 3479.556884] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3479.563402] RIP: 0033:0x7f1fc482483b
...
[ 3479.943087] ==================================================================
Fix this by introducing the local variable port_id to store the port ID
value before executing mpt3sas_transport_port_remove(). This local variable
is then used in the call to ioc_info() instead of dereferencing the freed
port structure.
Link: https://lore.kernel.org/r/20220322055702.95276-1-damien.lemoal@opensource.wdc.com
Fixes: 7d310f2410 ("scsi: mpt3sas: Get device objects using sas_address & portID")
Cc: stable@vger.kernel.org
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eaba83b5b8506bbc9ee7ca2f10aeab3fff3719e7 ]
In commit edb854a3680b ("scsi: core: Reallocate device's budget map on
queue depth change"), the sbitmap for the device budget map may be
reallocated after the slave device depth is configured.
When the sbitmap is reallocated we use the result from
scsi_device_max_queue_depth() for the sbitmap size, but don't resize to
match the actual device queue depth.
Fix by resizing the sbitmap after reallocating the budget sbitmap. We do
this instead of init'ing the sbitmap to the device queue depth as the user
may want to change the queue depth later via sysfs or other.
Link: https://lore.kernel.org/r/1647423870-143867-1-git-send-email-john.garry@huawei.com
Fixes: edb854a3680b ("scsi: core: Reallocate device's budget map on queue depth change")
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc5519c18a32ce855bb51b9f5eceb77a9489d080 ]
Commit 2e27f576ab ("scsi: scsi_ioctl: Call scsi_cmd_ioctl() from
scsi_ioctl()") seems to have a typo as it is checking ret instead of cmd in
the if statement checking for CDROMCLOSETRAY and CDROMEJECT. This changes
the behaviour of these ioctls as the cdrom_ioctl handling of these is more
restrictive than the scsi_ioctl version.
Link: https://lore.kernel.org/r/20220323002242.21157-1-kgroeneveld@lenbrook.com
Fixes: 2e27f576ab ("scsi: scsi_ioctl: Call scsi_cmd_ioctl() from scsi_ioctl()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 271add11994ba1a334859069367e04d2be2ebdd4 ]
fc_exch_release(ep) will decrease the ep's reference count. When the
reference count reaches zero, it is freed. But ep is still used in the
following code, which will lead to a use after free.
Return after the fc_exch_release() call to avoid use after free.
Link: https://lore.kernel.org/r/20220303015115.459778-1-niejianglei2021@163.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>