367 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
a929037d0e Merge 4.14.195 into android-4.14-stable
Changes in 4.14.195
	drm/vgem: Replace opencoded version of drm_gem_dumb_map_offset()
	perf probe: Fix memory leakage when the probe point is not found
	khugepaged: khugepaged_test_exit() check mmget_still_valid()
	khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
	powerpc/mm: Only read faulting instruction when necessary in do_page_fault()
	powerpc: Allow 4224 bytes of stack expansion for the signal frame
	btrfs: export helpers for subvolume name/id resolution
	btrfs: don't show full path of bind mounts in subvol=
	btrfs: Move free_pages_out label in inline extent handling branch in compress_file_range
	btrfs: inode: fix NULL pointer dereference if inode doesn't need compression
	btrfs: sysfs: use NOFS for device creation
	romfs: fix uninitialized memory leak in romfs_dev_read()
	kernel/relay.c: fix memleak on destroy relay channel
	mm: include CMA pages in lowmem_reserve at boot
	mm, page_alloc: fix core hung in free_pcppages_bulk()
	ext4: fix checking of directory entry validity for inline directories
	jbd2: add the missing unlock_buffer() in the error path of jbd2_write_superblock()
	spi: Prevent adding devices below an unregistering controller
	scsi: ufs: Add DELAY_BEFORE_LPM quirk for Micron devices
	media: budget-core: Improve exception handling in budget_register()
	rtc: goldfish: Enable interrupt in set_alarm() when necessary
	media: vpss: clean up resources in init
	Input: psmouse - add a newline when printing 'proto' by sysfs
	m68knommu: fix overwriting of bits in ColdFire V3 cache control
	xfs: fix inode quota reservation checks
	jffs2: fix UAF problem
	cpufreq: intel_pstate: Fix cpuinfo_max_freq when MSR_TURBO_RATIO_LIMIT is 0
	scsi: libfc: Free skb in fc_disc_gpn_id_resp() for valid cases
	virtio_ring: Avoid loop when vq is broken in virtqueue_poll
	xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init
	alpha: fix annotation of io{read,write}{16,32}be()
	ext4: fix potential negative array index in do_split()
	i40e: Set RX_ONLY mode for unicast promiscuous on VLAN
	i40e: Fix crash during removing i40e driver
	net: fec: correct the error path for regulator disable in probe
	bonding: show saner speed for broadcast mode
	bonding: fix a potential double-unregister
	ASoC: msm8916-wcd-analog: fix register Interrupt offset
	ASoC: intel: Fix memleak in sst_media_open
	vfio/type1: Add proper error unwind for vfio_iommu_replay()
	bonding: fix active-backup failover for current ARP slave
	hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit()
	net: dsa: b53: check for timeout
	powerpc/pseries: Do not initiate shutdown when system is running on UPS
	epoll: Keep a reference on files added to the check list
	do_epoll_ctl(): clean the failure exits up a bit
	mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
	xen: don't reschedule in preemption off sections
	clk: Evict unregistered clks from parent caches
	KVM: arm/arm64: Don't reschedule in unmap_stage2_range()
	Linux 4.14.195

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6c25044ba9166ec01723671d9cfa3fdf08ccc43f
2020-08-26 11:01:28 +02:00
Alex Williamson
36e839f9c0 vfio/type1: Add proper error unwind for vfio_iommu_replay()
[ Upstream commit aae7a75a821a793ed6b8ad502a5890fb8e8f172d ]

The vfio_iommu_replay() function does not currently unwind on error,
yet it does pin pages, perform IOMMU mapping, and modify the vfio_dma
structure to indicate IOMMU mapping.  The IOMMU mappings are torn down
when the domain is destroyed, but the other actions go on to cause
trouble later.  For example, the iommu->domain_list can be empty if we
only have a non-IOMMU backed mdev attached.  We don't currently check
if the list is empty before getting the first entry in the list, which
leads to a bogus domain pointer.  If a vfio_dma entry is erroneously
marked as iommu_mapped, we'll attempt to use that bogus pointer to
retrieve the existing physical page addresses.

This is the scenario that uncovered this issue, attempting to hot-add
a vfio-pci device to a container with an existing mdev device and DMA
mappings, one of which could not be pinned, causing a failure adding
the new group to the existing container and setting the conditions
for a subsequent attempt to explode.

To resolve this, we can first check if the domain_list is empty so
that we can reject replay of a bogus domain, should we ever encounter
this inconsistent state again in the future.  The real fix though is
to add the necessary unwind support, which means cleaning up the
current pinning if an IOMMU mapping fails, then walking back through
the r-b tree of DMA entries, reading from the IOMMU which ranges are
mapped, and unmapping and unpinning those ranges.  To be able to do
this, we also defer marking the DMA entry as IOMMU mapped until all
entries are processed, in order to allow the unwind to know the
disposition of each entry.

Fixes: a54eb55045 ("vfio iommu type1: Add support for mediated devices")
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26 10:29:59 +02:00
Greg Kroah-Hartman
e570b0fb2f Merge 4.14.186 into android-4.14-stable
Changes in 4.14.186
	s390: fix syscall_get_error for compat processes
	drm/i915: Whitelist context-local timestamp in the gen9 cmdparser
	power: supply: bq24257_charger: Replace depends on REGMAP_I2C with select
	clk: sunxi: Fix incorrect usage of round_down()
	i2c: piix4: Detect secondary SMBus controller on AMD AM4 chipsets
	iio: pressure: bmp280: Tolerate IRQ before registering
	remoteproc: Fix IDR initialisation in rproc_alloc()
	clk: qcom: msm8916: Fix the address location of pll->config_reg
	backlight: lp855x: Ensure regulators are disabled on probe failure
	ASoC: davinci-mcasp: Fix dma_chan refcnt leak when getting dma type
	ARM: integrator: Add some Kconfig selections
	scsi: qedi: Check for buffer overflow in qedi_set_path()
	ALSA: isa/wavefront: prevent out of bounds write in ioctl
	scsi: qla2xxx: Fix issue with adapter's stopping state
	iio: bmp280: fix compensation of humidity
	f2fs: report delalloc reserve as non-free in statfs for project quota
	i2c: pxa: clear all master action bits in i2c_pxa_stop_message()
	usblp: poison URBs upon disconnect
	dm mpath: switch paths in dm_blk_ioctl() code path
	PCI: aardvark: Don't blindly enable ASPM L0s and don't write to read-only register
	ps3disk: use the default segment boundary
	vfio/pci: fix memory leaks in alloc_perm_bits()
	m68k/PCI: Fix a memory leak in an error handling path
	mfd: wm8994: Fix driver operation if loaded as modules
	scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
	clk: clk-flexgen: fix clock-critical handling
	powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run
	nfsd: Fix svc_xprt refcnt leak when setup callback client failed
	powerpc/crashkernel: Take "mem=" option into account
	yam: fix possible memory leak in yam_init_driver
	NTB: Fix the default port and peer numbers for legacy drivers
	mksysmap: Fix the mismatch of '.L' symbols in System.map
	apparmor: fix introspection of of task mode for unconfined tasks
	scsi: sr: Fix sr_probe() missing deallocate of device minor
	scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM
	staging: greybus: fix a missing-check bug in gb_lights_light_config()
	scsi: qedi: Do not flush offload work if ARP not resolved
	ALSA: usb-audio: Improve frames size computation
	s390/qdio: put thinint indicator after early error
	tty: hvc: Fix data abort due to race in hvc_open
	thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTR
	staging: sm750fb: add missing case while setting FB_VISUAL
	i2c: pxa: fix i2c_pxa_scream_blue_murder() debug output
	serial: amba-pl011: Make sure we initialize the port.lock spinlock
	drivers: base: Fix NULL pointer exception in __platform_driver_probe() if a driver developer is foolish
	PCI: rcar: Fix incorrect programming of OB windows
	PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges
	scsi: qla2xxx: Fix warning after FC target reset
	power: supply: lp8788: Fix an error handling path in 'lp8788_charger_probe()'
	power: supply: smb347-charger: IRQSTAT_D is volatile
	scsi: mpt3sas: Fix double free warnings
	dlm: remove BUG() before panic()
	clk: ti: composite: fix memory leak
	PCI: Fix pci_register_host_bridge() device_register() error handling
	tty: n_gsm: Fix SOF skipping
	tty: n_gsm: Fix waking up upper tty layer when room available
	powerpc/pseries/ras: Fix FWNMI_VALID off by one
	powerpc/ps3: Fix kexec shutdown hang
	vfio-pci: Mask cap zero
	usb/ohci-platform: Fix a warning when hibernating
	drm/msm/mdp5: Fix mdp5_init error path for failed mdp5_kms allocation
	USB: host: ehci-mxc: Add error handling in ehci_mxc_drv_probe()
	tty: n_gsm: Fix bogus i++ in gsm_data_kick
	clk: samsung: exynos5433: Add IGNORE_UNUSED flag to sclk_i2s1
	powerpc/64s/pgtable: fix an undefined behaviour
	dm zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a zone
	PCI/PTM: Inherit Switch Downstream Port PTM settings from Upstream Port
	IB/cma: Fix ports memory leak in cma_configfs
	watchdog: da9062: No need to ping manually before setting timeout
	usb: dwc2: gadget: move gadget resume after the core is in L0 state
	USB: gadget: udc: s3c2410_udc: Remove pointless NULL check in s3c2410_udc_nuke
	usb: gadget: lpc32xx_udc: don't dereference ep pointer before null check
	usb: gadget: fix potential double-free in m66592_probe.
	usb: gadget: Fix issue with config_ep_by_speed function
	x86/apic: Make TSC deadline timer detection message visible
	clk: bcm2835: Fix return type of bcm2835_register_gate
	scsi: ufs-qcom: Fix scheduling while atomic issue
	net: sunrpc: Fix off-by-one issues in 'rpc_ntop6'
	NFSv4.1 fix rpc_call_done assignment for BIND_CONN_TO_SESSION
	powerpc/4xx: Don't unmap NULL mbase
	extcon: adc-jack: Fix an error handling path in 'adc_jack_probe()'
	ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed
	vfio/mdev: Fix reference count leak in add_mdev_supported_type
	openrisc: Fix issue with argument clobbering for clone/fork
	gfs2: Allow lock_nolock mount to specify jid=X
	scsi: iscsi: Fix reference count leak in iscsi_boot_create_kobj
	scsi: ufs: Don't update urgent bkops level when toggling auto bkops
	pinctrl: imxl: Fix an error handling path in 'imx1_pinctrl_core_probe()'
	pinctrl: freescale: imx: Fix an error handling path in 'imx_pinctrl_probe()'
	crypto: omap-sham - add proper load balancing support for multicore
	geneve: change from tx_error to tx_dropped on missing metadata
	lib/zlib: remove outdated and incorrect pre-increment optimization
	include/linux/bitops.h: avoid clang shift-count-overflow warnings
	elfnote: mark all .note sections SHF_ALLOC
	selftests/vm/pkeys: fix alloc_random_pkey() to make it really random
	blktrace: use errno instead of bi_status
	blktrace: fix endianness in get_pdu_int()
	blktrace: fix endianness for blk_log_remap()
	gfs2: fix use-after-free on transaction ail lists
	selftests/net: in timestamping, strncpy needs to preserve null byte
	drm/sun4i: hdmi ddc clk: Fix size of m divider
	scsi: acornscsi: Fix an error handling path in acornscsi_probe()
	usb/xhci-plat: Set PM runtime as active on resume
	usb/ehci-platform: Set PM runtime as active on resume
	perf report: Fix NULL pointer dereference in hists__fprintf_nr_sample_events()
	bcache: fix potential deadlock problem in btree_gc_coalesce
	block: Fix use-after-free in blkdev_get()
	arm64: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints
	libata: Use per port sync for detach
	drm: encoder_slave: fix refcouting error for modules
	drm/dp_mst: Reformat drm_dp_check_act_status() a bit
	drm/qxl: Use correct notify port address when creating cursor ring
	selinux: fix double free
	ext4: fix partial cluster initialization when splitting extent
	drm/dp_mst: Increase ACT retry timeout to 3s
	x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld
	block: nr_sects_write(): Disable preemption on seqcount write
	mtd: rawnand: Pass a nand_chip object to nand_release()
	mtd: rawnand: diskonchip: Fix the probe error path
	mtd: rawnand: sharpsl: Fix the probe error path
	mtd: rawnand: xway: Fix the probe error path
	mtd: rawnand: orion: Fix the probe error path
	mtd: rawnand: oxnas: Add of_node_put()
	mtd: rawnand: oxnas: Fix the probe error path
	mtd: rawnand: socrates: Fix the probe error path
	mtd: rawnand: plat_nand: Fix the probe error path
	mtd: rawnand: mtk: Fix the probe error path
	mtd: rawnand: tmio: Fix the probe error path
	crypto: algif_skcipher - Cap recv SG list at ctx->used
	crypto: algboss - don't wait during notifier callback
	kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
	e1000e: Do not wake up the system via WOL if device wakeup is disabled
	kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
	sched/rt, net: Use CONFIG_PREEMPTION.patch
	net: core: device_rename: Use rwsem instead of a seqcount
	md: add feature flag MD_FEATURE_RAID0_LAYOUT
	kvm: x86: Move kvm_set_mmio_spte_mask() from x86.c to mmu.c
	kvm: x86: Fix reserved bits related calculation errors caused by MKTME
	KVM: x86/mmu: Set mmio_value to '0' if reserved #PF can't be generated
	Linux 4.14.186

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5a9f5c8483f37ac08cf01991ffa43b333fdfa0a3
2020-06-25 16:12:13 +02:00
Qiushi Wu
fcc43e9661 vfio/mdev: Fix reference count leak in add_mdev_supported_type
[ Upstream commit aa8ba13cae3134b8ef1c1b6879f66372531da738 ]

kobject_init_and_add() takes reference even when it fails.
If this function returns an error, kobject_put() must be called to
properly clean up the memory associated with the object. Thus,
replace kfree() by kobject_put() to fix this issue. Previous
commit "b8eb718348b8" fixed a similar problem.

Fixes: 7b96953bc6 ("vfio: Mediated device Core driver")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25 15:41:56 +02:00
Alex Williamson
26340f8f0d vfio-pci: Mask cap zero
[ Upstream commit bc138db1b96264b9c1779cf18d5a3b186aa90066 ]

The PCI Code and ID Assignment Specification changed capability ID 0
from reserved to a NULL capability in the v1.1 revision.  The NULL
capability is defined to include only the 16-bit capability header,
ie. only the ID and next pointer.  Unfortunately vfio-pci creates a
map of config space, where ID 0 is used to reserve the standard type
0 header.  Finding an actual capability with this ID therefore results
in a bogus range marked in that map and conflicts with subsequent
capabilities.  As this seems to be a dummy capability anyway and we
already support dropping capabilities, let's hide this one rather than
delving into the potentially subtle dependencies within our map.

Seen on an NVIDIA Tesla T4.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25 15:41:53 +02:00
Qian Cai
f87ebe6764 vfio/pci: fix memory leaks in alloc_perm_bits()
[ Upstream commit 3e63b94b6274324ff2e7d8615df31586de827c4e ]

vfio_pci_disable() calls vfio_config_free() but forgets to call
free_perm_bits() resulting in memory leaks,

unreferenced object 0xc000000c4db2dee0 (size 16):
  comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s)
  hex dump (first 16 bytes):
    00 00 ff 00 ff ff ff ff ff ff ff ff ff ff 00 00  ................
  backtrace:
    [<00000000a6a4552d>] alloc_perm_bits+0x58/0xe0 [vfio_pci]
    [<00000000ac990549>] vfio_config_init+0xdf0/0x11b0 [vfio_pci]
    init_pci_cap_msi_perm at drivers/vfio/pci/vfio_pci_config.c:1125
    (inlined by) vfio_msi_cap_len at drivers/vfio/pci/vfio_pci_config.c:1180
    (inlined by) vfio_cap_len at drivers/vfio/pci/vfio_pci_config.c:1241
    (inlined by) vfio_cap_init at drivers/vfio/pci/vfio_pci_config.c:1468
    (inlined by) vfio_config_init at drivers/vfio/pci/vfio_pci_config.c:1707
    [<000000006db873a1>] vfio_pci_open+0x234/0x700 [vfio_pci]
    [<00000000630e1906>] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio]
    [<000000009e34c54f>] ksys_ioctl+0xd8/0x130
    [<000000006577923d>] sys_ioctl+0x28/0x40
    [<000000006d7b1cf2>] system_call_exception+0x114/0x1e0
    [<0000000008ea7dd5>] system_call_common+0xf0/0x278
unreferenced object 0xc000000c4db2e330 (size 16):
  comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s)
  hex dump (first 16 bytes):
    00 ff ff 00 ff ff ff ff ff ff ff ff ff ff 00 00  ................
  backtrace:
    [<000000004c71914f>] alloc_perm_bits+0x44/0xe0 [vfio_pci]
    [<00000000ac990549>] vfio_config_init+0xdf0/0x11b0 [vfio_pci]
    [<000000006db873a1>] vfio_pci_open+0x234/0x700 [vfio_pci]
    [<00000000630e1906>] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio]
    [<000000009e34c54f>] ksys_ioctl+0xd8/0x130
    [<000000006577923d>] sys_ioctl+0x28/0x40
    [<000000006d7b1cf2>] system_call_exception+0x114/0x1e0
    [<0000000008ea7dd5>] system_call_common+0xf0/0x278

Fixes: 89e1f7d4c6 ("vfio: Add PCI device driver")
Signed-off-by: Qian Cai <cai@lca.pw>
[aw: rolled in follow-up patch]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25 15:41:49 +02:00
Greg Kroah-Hartman
10557abb61 Merge 4.14.179 into android-4.14-stable
Changes in 4.14.179
	ext4: fix special inode number checks in __ext4_iget()
	drm/edid: Fix off-by-one in DispID DTD pixel clock
	drm/qxl: qxl_release leak in qxl_draw_dirty_fb()
	drm/qxl: qxl_release leak in qxl_hw_surface_alloc()
	drm/qxl: qxl_release use after free
	btrfs: fix block group leak when removing fails
	btrfs: fix partial loss of prealloc extent past i_size after fsync
	mmc: sdhci-xenon: fix annoying 1.8V regulator warning
	mmc: sdhci-pci: Fix eMMC driver strength for BYT-based controllers
	ALSA: hda/realtek - Two front mics on a Lenovo ThinkCenter
	ALSA: hda/hdmi: fix without unlocked before return
	ALSA: pcm: oss: Place the plugin buffer overflow checks correctly
	PM: ACPI: Output correct message on target power state
	PM: hibernate: Freeze kernel threads in software_resume()
	dm verity fec: fix hash block number in verity_fec_decode
	RDMA/mlx5: Set GRH fields in query QP on RoCE
	RDMA/mlx4: Initialize ib_spec on the stack
	vfio: avoid possible overflow in vfio_iommu_type1_pin_pages
	vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()
	iommu/qcom: Fix local_base status check
	scsi: target/iblock: fix WRITE SAME zeroing
	iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system
	ALSA: opti9xx: shut up gcc-10 range warning
	nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl
	dmaengine: dmatest: Fix iteration non-stop logic
	selinux: properly handle multiple messages in selinux_netlink_send()
	Linux 4.14.179

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic0182187e4c248ccfcfa3a7ff407d64f2756ef14
2020-05-06 09:01:56 +02:00
Sean Christopherson
440e152362 vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()
commit 5cbf3264bc715e9eb384e2b68601f8c02bb9a61d upstream.

Use follow_pfn() to get the PFN of a PFNMAP VMA instead of assuming that
vma->vm_pgoff holds the base PFN of the VMA.  This fixes a bug where
attempting to do VFIO_IOMMU_MAP_DMA on an arbitrary PFNMAP'd region of
memory calculates garbage for the PFN.

Hilariously, this only got detected because the first "PFN" calculated
by vaddr_get_pfn() is PFN 0 (vma->vm_pgoff==0), and iommu_iova_to_phys()
uses PA==0 as an error, which triggers a WARN in vfio_unmap_unpin()
because the translation "failed".  PFN 0 is now unconditionally reserved
on x86 in order to mitigate L1TF, which causes is_invalid_reserved_pfn()
to return true and in turns results in vaddr_get_pfn() returning success
for PFN 0.  Eventually the bogus calculation runs into PFNs that aren't
reserved and leads to failure in vfio_pin_map_dma().  The subsequent
call to vfio_remove_dma() attempts to unmap PFN 0 and WARNs.

  WARNING: CPU: 8 PID: 5130 at drivers/vfio/vfio_iommu_type1.c:750 vfio_unmap_unpin+0x2e1/0x310 [vfio_iommu_type1]
  Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio ...
  CPU: 8 PID: 5130 Comm: sgx Tainted: G        W         5.6.0-rc5-705d787c7fee-vfio+ #3
  Hardware name: Intel Corporation Mehlow UP Server Platform/Moss Beach Server, BIOS CNLSE2R1.D00.X119.B49.1803010910 03/01/2018
  RIP: 0010:vfio_unmap_unpin+0x2e1/0x310 [vfio_iommu_type1]
  Code: <0f> 0b 49 81 c5 00 10 00 00 e9 c5 fe ff ff bb 00 10 00 00 e9 3d fe
  RSP: 0018:ffffbeb5039ebda8 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff9a55cbf8d480 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9a52b771c200
  RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000fffffff2
  R10: 0000000000000001 R11: ffff9a51fa896000 R12: 0000000184010000
  R13: 0000000184000000 R14: 0000000000010000 R15: ffff9a55cb66ea08
  FS:  00007f15d3830b40(0000) GS:ffff9a55d5600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000561cf39429e0 CR3: 000000084f75f005 CR4: 00000000003626e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   vfio_remove_dma+0x17/0x70 [vfio_iommu_type1]
   vfio_iommu_type1_ioctl+0x9e3/0xa7b [vfio_iommu_type1]
   ksys_ioctl+0x92/0xb0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x4c/0x180
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15d04c75d7
  Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48

Fixes: 73fa0d10d0 ("vfio: Type1 IOMMU implementation")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05 19:15:51 +02:00
Yan Zhao
b4324db6f5 vfio: avoid possible overflow in vfio_iommu_type1_pin_pages
commit 0ea971f8dcd6dee78a9a30ea70227cf305f11ff7 upstream.

add parentheses to avoid possible vaddr overflow.

Fixes: a54eb55045 ("vfio iommu type1: Add support for mediated devices")
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05 19:15:51 +02:00
Greg Kroah-Hartman
509b38045c Merge 4.14.168 into android-4.14
Changes in 4.14.168
	xfs: Sanity check flags of Q_XQUOTARM call
	mfd: intel-lpss: Add default I2C device properties for Gemini Lake
	powerpc/archrandom: fix arch_get_random_seed_int()
	tipc: fix wrong timeout input for tipc_wait_for_cond()
	mt7601u: fix bbp version check in mt7601u_wait_bbp_ready
	crypto: sun4i-ss - fix big endian issues
	drm/sti: do not remove the drm_bridge that was never added
	drm/virtio: fix bounds check in virtio_gpu_cmd_get_capset()
	ALSA: hda: fix unused variable warning
	apparmor: don't try to replace stale label in ptrace access check
	PCI: iproc: Remove PAXC slot check to allow VF support
	drm/hisilicon: hibmc: Don't overwrite fb helper surface depth
	IB/rxe: replace kvfree with vfree
	IB/hfi1: Add mtu check for operational data VLs
	ALSA: usb-audio: update quirk for B&W PX to remove microphone
	staging: comedi: ni_mio_common: protect register write overflow
	pwm: lpss: Release runtime-pm reference from the driver's remove callback
	drm/sun4i: hdmi: Fix double flag assignation
	mlxsw: reg: QEEC: Add minimum shaper fields
	NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks
	pcrypt: use format specifier in kobject_add
	exportfs: fix 'passing zero to ERR_PTR()' warning
	drm/dp_mst: Skip validating ports during destruction, just ref
	net: phy: Fix not to call phy_resume() if PHY is not attached
	IB/rxe: Fix incorrect cache cleanup in error flow
	staging: bcm2835-camera: Abort probe if there is no camera
	switchtec: Remove immediate status check after submitting MRPC command
	pinctrl: sh-pfc: r8a7740: Add missing REF125CK pin to gether_gmii group
	pinctrl: sh-pfc: r8a7740: Add missing LCD0 marks to lcd0_data24_1 group
	pinctrl: sh-pfc: r8a7791: Remove bogus ctrl marks from qspi_data4_b group
	pinctrl: sh-pfc: r8a7791: Remove bogus marks from vin1_b_data18 group
	pinctrl: sh-pfc: sh73a0: Add missing TO pin to tpu4_to3 group
	pinctrl: sh-pfc: r8a7794: Remove bogus IPSR9 field
	pinctrl: sh-pfc: sh7734: Add missing IPSR11 field
	pinctrl: sh-pfc: r8a77995: Remove bogus SEL_PWM[0-3]_3 configurations
	pinctrl: sh-pfc: sh7269: Add missing PCIOR0 field
	pinctrl: sh-pfc: sh7734: Remove bogus IPSR10 value
	vxlan: changelink: Fix handling of default remotes
	Input: nomadik-ske-keypad - fix a loop timeout test
	clk: highbank: fix refcount leak in hb_clk_init()
	clk: qoriq: fix refcount leak in clockgen_init()
	clk: socfpga: fix refcount leak
	clk: samsung: exynos4: fix refcount leak in exynos4_get_xom()
	clk: imx6q: fix refcount leak in imx6q_clocks_init()
	clk: imx6sx: fix refcount leak in imx6sx_clocks_init()
	clk: imx7d: fix refcount leak in imx7d_clocks_init()
	clk: vf610: fix refcount leak in vf610_clocks_init()
	clk: armada-370: fix refcount leak in a370_clk_init()
	clk: kirkwood: fix refcount leak in kirkwood_clk_init()
	clk: armada-xp: fix refcount leak in axp_clk_init()
	clk: mv98dx3236: fix refcount leak in mv98dx3236_clk_init()
	clk: dove: fix refcount leak in dove_clk_init()
	MIPS: BCM63XX: drop unused and broken DSP platform device
	IB/usnic: Fix out of bounds index check in query pkey
	RDMA/ocrdma: Fix out of bounds index check in query pkey
	RDMA/qedr: Fix out of bounds index check in query pkey
	drm/shmob: Fix return value check in shmob_drm_probe
	arm64: dts: apq8016-sbc: Increase load on l11 for SDCARD
	spi: cadence: Correct initialisation of runtime PM
	RDMA/iw_cxgb4: Fix the unchecked ep dereference
	drm/etnaviv: NULL vs IS_ERR() buf in etnaviv_core_dump()
	media: s5p-jpeg: Correct step and max values for V4L2_CID_JPEG_RESTART_INTERVAL
	kbuild: mark prepare0 as PHONY to fix external module build
	crypto: brcm - Fix some set-but-not-used warning
	crypto: tgr192 - fix unaligned memory access
	ASoC: imx-sgtl5000: put of nodes if finding codec fails
	IB/iser: Pass the correct number of entries for dma mapped SGL
	rtc: cmos: ignore bogus century byte
	spi/topcliff_pch: Fix potential NULL dereference on allocation error
	clk: sunxi-ng: sun8i-a23: Enable PLL-MIPI LDOs when ungating it
	iwlwifi: mvm: avoid possible access out of array.
	net/mlx5: Take lock with IRQs disabled to avoid deadlock
	iwlwifi: mvm: fix A-MPDU reference assignment
	tty: ipwireless: Fix potential NULL pointer dereference
	driver: uio: fix possible memory leak in __uio_register_device
	driver: uio: fix possible use-after-free in __uio_register_device
	crypto: crypto4xx - Fix wrong ppc4xx_trng_probe()/ppc4xx_trng_remove() arguments
	driver core: Do not resume suppliers under device_links_write_lock()
	ARM: dts: lpc32xx: add required clocks property to keypad device node
	ARM: dts: lpc32xx: reparent keypad controller to SIC1
	ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller variant
	ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller clocks property
	ARM: dts: lpc32xx: phy3250: fix SD card regulator voltage
	iwlwifi: mvm: fix RSS config command
	staging: most: cdev: add missing check for cdev_add failure
	rtc: ds1672: fix unintended sign extension
	thermal: mediatek: fix register index error
	net: phy: fixed_phy: Fix fixed_phy not checking GPIO
	rtc: ds1307: rx8130: Fix alarm handling
	rtc: 88pm860x: fix unintended sign extension
	rtc: 88pm80x: fix unintended sign extension
	rtc: pm8xxx: fix unintended sign extension
	fbdev: chipsfb: remove set but not used variable 'size'
	iw_cxgb4: use tos when importing the endpoint
	iw_cxgb4: use tos when finding ipv6 routes
	drm/etnaviv: potential NULL dereference
	pinctrl: sh-pfc: emev2: Add missing pinmux functions
	pinctrl: sh-pfc: r8a7791: Fix scifb2_data_c pin group
	pinctrl: sh-pfc: r8a7792: Fix vin1_data18_b pin group
	pinctrl: sh-pfc: sh73a0: Fix fsic_spdif pin groups
	PCI: endpoint: functions: Use memcpy_fromio()/memcpy_toio()
	usb: phy: twl6030-usb: fix possible use-after-free on remove
	block: don't use bio->bi_vcnt to figure out segment number
	keys: Timestamp new keys
	vfio_pci: Enable memory accesses before calling pci_map_rom
	hwmon: (pmbus/tps53679) Fix driver info initialization in probe routine
	KVM: PPC: Release all hardware TCE tables attached to a group
	staging: r8822be: check kzalloc return or bail
	dmaengine: mv_xor: Use correct device for DMA API
	cdc-wdm: pass return value of recover_from_urb_loss
	regulator: pv88060: Fix array out-of-bounds access
	regulator: pv88080: Fix array out-of-bounds access
	regulator: pv88090: Fix array out-of-bounds access
	net: dsa: qca8k: Enable delay for RGMII_ID mode
	drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RON
	drm/nouveau/pmu: don't print reply values if exec is false
	ASoC: qcom: Fix of-node refcount unbalance in apq8016_sbc_parse_of()
	fs/nfs: Fix nfs_parse_devname to not modify it's argument
	staging: rtlwifi: Use proper enum for return in halmac_parse_psd_data_88xx
	powerpc/64s: Fix logic when handling unknown CPU features
	NFS: Fix a soft lockup in the delegation recovery code
	clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable
	clocksource/drivers/exynos_mct: Fix error path in timer resources initialization
	platform/x86: wmi: fix potential null pointer dereference
	NFS/pnfs: Bulk destroy of layouts needs to be safe w.r.t. umount
	mmc: sdhci-brcmstb: handle mmc_of_parse() errors during probe
	ARM: 8847/1: pm: fix HYP/SVC mode mismatch when MCPM is used
	ARM: 8848/1: virt: Align GIC version check with arm64 counterpart
	regulator: wm831x-dcdc: Fix list of wm831x_dcdc_ilim from mA to uA
	netfilter: nft_set_hash: fix lookups with fixed size hash on big endian
	NFSv4/flexfiles: Fix invalid deref in FF_LAYOUT_DEVID_NODE()
	net: aquantia: fixed instack structure overflow
	powerpc/mm: Check secondary hash page table
	nios2: ksyms: Add missing symbol exports
	x86/mm: Remove unused variable 'cpu'
	scsi: megaraid_sas: reduce module load time
	drivers/rapidio/rio_cm.c: fix potential oops in riocm_ch_listen()
	xen, cpu_hotplug: Prevent an out of bounds access
	net: sh_eth: fix a missing check of of_get_phy_mode
	regulator: lp87565: Fix missing register for LP87565_BUCK_0
	media: ivtv: update *pos correctly in ivtv_read_pos()
	media: cx18: update *pos correctly in cx18_read_pos()
	media: wl128x: Fix an error code in fm_download_firmware()
	media: cx23885: check allocation return
	regulator: tps65086: Fix tps65086_ldoa1_ranges for selector 0xB
	jfs: fix bogus variable self-initialization
	tipc: tipc clang warning
	m68k: mac: Fix VIA timer counter accesses
	arm64: dts: allwinner: a64: Add missing PIO clocks
	ARM: OMAP2+: Fix potentially uninitialized return value for _setup_reset()
	media: davinci-isif: avoid uninitialized variable use
	media: tw5864: Fix possible NULL pointer dereference in tw5864_handle_frame
	spi: tegra114: clear packed bit for unpacked mode
	spi: tegra114: fix for unpacked mode transfers
	spi: tegra114: terminate dma and reset on transfer timeout
	spi: tegra114: flush fifos
	spi: tegra114: configure dma burst size to fifo trig level
	soc/fsl/qe: Fix an error code in qe_pin_request()
	spi: bcm2835aux: fix driver to not allow 65535 (=-1) cs-gpios
	ehea: Fix a copy-paste err in ehea_init_port_res
	scsi: qla2xxx: Unregister chrdev if module initialization fails
	scsi: target/core: Fix a race condition in the LUN lookup code
	ARM: pxa: ssp: Fix "WARNING: invalid free of devm_ allocated data"
	net: hns3: fix for vport->bw_limit overflow problem
	hwmon: (w83627hf) Use request_muxed_region for Super-IO accesses
	platform/x86: alienware-wmi: fix kfree on potentially uninitialized pointer
	tipc: set sysctl_tipc_rmem and named_timeout right range
	selftests/ipc: Fix msgque compiler warnings
	powerpc: vdso: Make vdso32 installation conditional in vdso_install
	ARM: dts: ls1021: Fix SGMII PCS link remaining down after PHY disconnect
	media: ov2659: fix unbalanced mutex_lock/unlock
	6lowpan: Off by one handling ->nexthdr
	dmaengine: axi-dmac: Don't check the number of frames for alignment
	ALSA: usb-audio: Handle the error from snd_usb_mixer_apply_create_quirk()
	NFS: Don't interrupt file writeout due to fatal errors
	irqchip/gic-v3-its: fix some definitions of inner cacheability attributes
	scsi: qla2xxx: Fix a format specifier
	scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory
	packet: in recvmsg msg_name return at least sizeof sockaddr_ll
	ASoC: fix valid stream condition
	usb: gadget: fsl: fix link error against usb-gadget module
	dwc2: gadget: Fix completed transfer size calculation in DDMA
	IB/mlx5: Add missing XRC options to QP optional params mask
	iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU
	net: ena: fix swapped parameters when calling ena_com_indirect_table_fill_entry
	net: ena: fix: Free napi resources when ena_up() fails
	net: ena: fix incorrect test of supported hash function
	net: ena: fix ena_com_fill_hash_function() implementation
	dmaengine: tegra210-adma: restore channel status
	mmc: core: fix possible use after free of host
	lightnvm: pblk: fix lock order in pblk_rb_tear_down_check
	afs: Fix the afs.cell and afs.volume xattr handlers
	vfio/mdev: Avoid release parent reference during error path
	vfio/mdev: Fix aborting mdev child device removal if one fails
	l2tp: Fix possible NULL pointer dereference
	media: omap_vout: potential buffer overflow in vidioc_dqbuf()
	media: davinci/vpbe: array underflow in vpbe_enum_outputs()
	platform/x86: alienware-wmi: printing the wrong error code
	crypto: caam - fix caam_dump_sg that iterates through scatterlist
	netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule
	pwm: meson: Consider 128 a valid pre-divider
	pwm: meson: Don't disable PWM when setting duty repeatedly
	ARM: riscpc: fix lack of keyboard interrupts after irq conversion
	kdb: do a sanity check on the cpu in kdb_per_cpu()
	backlight: lm3630a: Return 0 on success in update_status functions
	thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power
	EDAC/mc: Fix edac_mc_find() in case no device is found
	ARM: dts: sun8i-h3: Fix wifi in Beelink X2 DT
	dmaengine: tegra210-adma: Fix crash during probe
	arm64: dts: meson: libretech-cc: set eMMC as removable
	RDMA/qedr: Fix incorrect device rate.
	spi: spi-fsl-spi: call spi_finalize_current_message() at the end
	crypto: ccp - fix AES CFB error exposed by new test vectors
	crypto: ccp - Fix 3DES complaint from ccp-crypto module
	serial: stm32: fix rx error handling
	serial: stm32: fix transmit_chars when tx is stopped
	serial: stm32: Add support of TC bit status check
	serial: stm32: fix wakeup source initialization
	misc: sgi-xp: Properly initialize buf in xpc_get_rsvd_page_pa
	iommu: Use right function to get group for device
	signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig
	inet: frags: call inet_frags_fini() after unregister_pernet_subsys()
	netvsc: unshare skb in VF rx handler
	cpufreq: brcmstb-avs-cpufreq: Fix initial command check
	cpufreq: brcmstb-avs-cpufreq: Fix types for voltage/frequency
	media: vivid: fix incorrect assignment operation when setting video mode
	mpls: fix warning with multi-label encap
	iommu/vt-d: Duplicate iommu_resv_region objects per device list
	qed: iWARP - Use READ_ONCE and smp_store_release to access ep->state
	powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild
	powerpc/pseries/mobility: rebuild cacheinfo hierarchy post-migration
	drm/msm/mdp5: Fix mdp5_cfg_init error return
	net: netem: fix backlog accounting for corrupted GSO frames
	net/af_iucv: always register net_device notifier
	ASoC: ti: davinci-mcasp: Fix slot mask settings when using multiple AXRs
	rtc: pcf8563: Fix interrupt trigger method
	rtc: pcf8563: Clear event flags and disable interrupts before requesting irq
	drm/msm/a3xx: remove TPL1 regs from snapshot
	perf/ioctl: Add check for the sample_period value
	dmaengine: hsu: Revert "set HSU_CH_MTSR to memory width"
	clk: qcom: Fix -Wunused-const-variable
	nvmem: imx-ocotp: Ensure WAIT bits are preserved when setting timing
	bnxt_en: Fix ethtool selftest crash under error conditions.
	iommu/amd: Make iommu_disable safer
	mfd: intel-lpss: Release IDA resources
	rxrpc: Fix uninitialized error code in rxrpc_send_data_packet()
	devres: allow const resource arguments
	RDMA/hns: Fixs hw access invalid dma memory error
	net: pasemi: fix an use-after-free in pasemi_mac_phy_init()
	scsi: libfc: fix null pointer dereference on a null lport
	clk: sunxi-ng: v3s: add the missing PLL_DDR1
	PM: sleep: Fix possible overflow in pm_system_cancel_wakeup()
	libertas_tf: Use correct channel range in lbtf_geo_init
	qed: reduce maximum stack frame size
	usb: host: xhci-hub: fix extra endianness conversion
	mic: avoid statically declaring a 'struct device'.
	x86/kgbd: Use NMI_VECTOR not APIC_DM_NMI
	crypto: ccp - Reduce maximum stack usage
	ALSA: aoa: onyx: always initialize register read value
	tipc: reduce risk of wakeup queue starvation
	ARM: dts: stm32: add missing vdda-supply to adc on stm32h743i-eval
	net/mlx5: Fix mlx5_ifc_query_lag_out_bits
	cifs: fix rmmod regression in cifs.ko caused by force_sig changes
	crypto: caam - free resources in case caam_rng registration failed
	ext4: set error return correctly when ext4_htree_store_dirent fails
	ASoC: es8328: Fix copy-paste error in es8328_right_line_controls
	ASoC: cs4349: Use PM ops 'cs4349_runtime_pm'
	ASoC: wm8737: Fix copy-paste error in wm8737_snd_controls
	net/rds: Add a few missing rds_stat_names entries
	bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
	signal: Allow cifs and drbd to receive their terminating signals
	ASoC: sun4i-i2s: RX and TX counter registers are swapped
	dmaengine: dw: platform: Switch to acpi_dma_controller_register()
	mac80211: minstrel_ht: fix per-group max throughput rate initialization
	media: atmel: atmel-isi: fix timeout value for stop streaming
	rtc: pcf2127: bugfix: read rtc disables watchdog
	mips: avoid explicit UB in assignment of mips_io_port_base
	iommu/mediatek: Fix iova_to_phys PA start for 4GB mode
	ahci: Do not export local variable ahci_em_messages
	Partially revert "kfifo: fix kfifo_alloc() and kfifo_init()"
	hwmon: (lm75) Fix write operations for negative temperatures
	power: supply: Init device wakeup after device_add()
	x86, perf: Fix the dependency of the x86 insn decoder selftest
	staging: greybus: light: fix a couple double frees
	irqdomain: Add the missing assignment of domain->fwnode for named fwnode
	bcma: fix incorrect update of BCMA_CORE_PCI_MDIO_DATA
	iio: dac: ad5380: fix incorrect assignment to val
	ath9k: dynack: fix possible deadlock in ath_dynack_node_{de}init
	tty: serial: fsl_lpuart: Use appropriate lpuart32_* I/O funcs
	net: sonic: return NETDEV_TX_OK if failed to map buffer
	scsi: fnic: fix msix interrupt allocation
	Btrfs: fix hang when loading existing inode cache off disk
	Btrfs: fix inode cache waiters hanging on failure to start caching thread
	Btrfs: fix inode cache waiters hanging on path allocation failure
	btrfs: use correct count in btrfs_file_write_iter()
	ixgbe: sync the first fragment unconditionally
	hwmon: (shtc1) fix shtc1 and shtw1 id mask
	net: sonic: replace dev_kfree_skb in sonic_send_packet
	pinctrl: iproc-gpio: Fix incorrect pinconf configurations
	ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet
	RDMA/cma: Fix false error message
	net/rds: Fix 'ib_evt_handler_call' element in 'rds_ib_stat_names'
	iommu/amd: Wait for completion of IOTLB flush in attach_device
	net: aquantia: Fix aq_vec_isr_legacy() return value
	net: hisilicon: Fix signedness bug in hix5hd2_dev_probe()
	net: broadcom/bcmsysport: Fix signedness in bcm_sysport_probe()
	net: stmmac: dwmac-meson8b: Fix signedness bug in probe
	net: axienet: fix a signedness bug in probe
	of: mdio: Fix a signedness bug in of_phy_get_and_connect()
	net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse()
	nvme: retain split access workaround for capability reads
	net: stmmac: gmac4+: Not all Unicast addresses may be available
	mac80211: accept deauth frames in IBSS mode
	llc: fix another potential sk_buff leak in llc_ui_sendmsg()
	llc: fix sk_buff refcounting in llc_conn_state_process()
	net: stmmac: fix length of PTP clock's name string
	act_mirred: Fix mirred_init_module error handling
	net: avoid possible false sharing in sk_leave_memory_pressure()
	net: add {READ|WRITE}_ONCE() annotations on ->rskq_accept_head
	tcp: annotate lockless access to tcp_memory_pressure
	drm/msm/dsi: Implement reset correctly
	dmaengine: imx-sdma: fix size check for sdma script_number
	net: netem: fix error path for corrupted GSO frames
	net: netem: correct the parent's backlog when corrupted packet was dropped
	net: qca_spi: Move reset_count to struct qcaspi
	afs: Fix large file support
	MIPS: Loongson: Fix return value of loongson_hwmon_init
	hv_netvsc: flag software created hash value
	net: neigh: use long type to store jiffies delta
	packet: fix data-race in fanout_flow_is_huge()
	mmc: sdio: fix wl1251 vendor id
	mmc: core: fix wl1251 sdio quirks
	affs: fix a memory leak in affs_remount
	dmaengine: ti: edma: fix missed failure handling
	drm/radeon: fix bad DMA from INTERRUPT_CNTL2
	arm64: dts: juno: Fix UART frequency
	IB/iser: Fix dma_nents type definition
	serial: stm32: fix clearing interrupt error flags
	m68k: Call timer_interrupt() with interrupts disabled
	Linux 4.14.168

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3eeaa348e8e99998356d27c99d06dcb38e48e7d5
2020-01-27 15:30:05 +01:00
Parav Pandit
f736690af3 vfio/mdev: Fix aborting mdev child device removal if one fails
[ Upstream commit 6093e348a5e2475c5bb2e571346460f939998670 ]

device_for_each_child() stops executing callback function for remaining
child devices, if callback hits an error.
Each child mdev device is independent of each other.
While unregistering parent device, mdev core must remove all child mdev
devices.
Therefore, mdev_device_remove_cb() always returns success so that
device_for_each_child doesn't abort if one child removal hits error.

While at it, improve remove and unregister functions for below simplicity.

There isn't need to pass forced flag pointer during mdev parent
removal which invokes mdev_device_remove(). So simplify the flow.

mdev_device_remove() is called from two paths.
1. mdev_unregister_driver()
     mdev_device_remove_cb()
       mdev_device_remove()
2. remove_store()
     mdev_device_remove()

Fixes: 7b96953bc6 ("vfio: Mediated device Core driver")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:46:33 +01:00
Parav Pandit
2d464b0246 vfio/mdev: Avoid release parent reference during error path
[ Upstream commit 60e7f2c3fe9919cee9534b422865eed49f4efb15 ]

During mdev parent registration in mdev_register_device(),
if parent device is duplicate, it releases the reference of existing
parent device.
This is incorrect. Existing parent device should not be touched.

Fixes: 7b96953bc6 ("vfio: Mediated device Core driver")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:46:33 +01:00
Eric Auger
6de29266dd vfio_pci: Enable memory accesses before calling pci_map_rom
[ Upstream commit 0cfd027be1d6def4a462cdc180c055143af24069 ]

pci_map_rom/pci_get_rom_size() performs memory access in the ROM.
In case the Memory Space accesses were disabled, readw() is likely
to trigger a synchronous external abort on some platforms.

In case memory accesses were disabled, re-enable them before the
call and disable them back again just after.

Fixes: 89e1f7d4c6 ("vfio: Add PCI device driver")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:46:19 +01:00
Greg Kroah-Hartman
0f543a0283 Merge 4.14.160 into android-4.14
Changes in 4.14.160
	net: bridge: deny dev_set_mac_address() when unregistering
	net: dsa: fix flow dissection on Tx path
	net: ethernet: ti: cpsw: fix extra rx interrupt
	net: thunderx: start phy before starting autonegotiation
	openvswitch: support asymmetric conntrack
	tcp: md5: fix potential overestimation of TCP option space
	tipc: fix ordering of tipc module init and exit routine
	tcp: fix rejected syncookies due to stale timestamps
	tcp: tighten acceptance of ACKs not matching a child socket
	tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
	inet: protect against too small mtu values.
	nvme: host: core: fix precedence of ternary operator
	Revert "regulator: Defer init completion for a while after late_initcall"
	PCI/PM: Always return devices to D0 when thawing
	PCI: Fix Intel ACS quirk UPDCR register address
	PCI/MSI: Fix incorrect MSI-X masking on resume
	PCI: Apply Cavium ACS quirk to ThunderX2 and ThunderX3
	xtensa: fix TLB sanity checker
	rpmsg: glink: Set tail pointer to 0 at end of FIFO
	rpmsg: glink: Fix reuse intents memory leak issue
	rpmsg: glink: Fix use after free in open_ack TIMEOUT case
	rpmsg: glink: Put an extra reference during cleanup
	rpmsg: glink: Fix rpmsg_register_device err handling
	rpmsg: glink: Don't send pending rx_done during remove
	rpmsg: glink: Free pending deferred work on remove
	CIFS: Respect O_SYNC and O_DIRECT flags during reconnect
	ARM: dts: s3c64xx: Fix init order of clock providers
	ARM: tegra: Fix FLOW_CTLR_HALT register clobbering by tegra_resume()
	vfio/pci: call irq_bypass_unregister_producer() before freeing irq
	dma-buf: Fix memory leak in sync_file_merge()
	dm btree: increase rebalance threshold in __rebalance2()
	scsi: iscsi: Fix a potential deadlock in the timeout handler
	drm/radeon: fix r1xx/r2xx register checker for POT textures
	xhci: fix USB3 device initiated resume race with roothub autosuspend
	net: stmmac: use correct DMA buffer size in the RX descriptor
	net: stmmac: don't stop NAPI processing when dropping a packet
	Linux 4.14.160

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-12-21 13:49:01 -05:00
Jiang Yi
05d1ce97c6 vfio/pci: call irq_bypass_unregister_producer() before freeing irq
commit d567fb8819162099035e546b11a736e29c2af0ea upstream.

Since irq_bypass_register_producer() is called after request_irq(), we
should do tear-down in reverse order: irq_bypass_unregister_producer()
then free_irq().

Specifically free_irq() may release resources required by the
irqbypass del_producer() callback.  Notably an example provided by
Marc Zyngier on arm64 with GICv4 that he indicates has the potential
to wedge the hardware:

 free_irq(irq)
   __free_irq(irq)
     irq_domain_deactivate_irq(irq)
       its_irq_domain_deactivate()
         [unmap the VLPI from the ITS]

 kvm_arch_irq_bypass_del_producer(cons, prod)
   kvm_vgic_v4_unset_forwarding(kvm, irq, ...)
     its_unmap_vlpi(irq)
       [Unmap the VLPI from the ITS (again), remap the original LPI]

Signed-off-by: Jiang Yi <giangyi@amazon.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 6d7425f109 ("vfio: Register/unregister irq_bypass_producer")
Link: https://lore.kernel.org/kvm/20191127164910.15888-1-giangyi@amazon.com
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[aw: commit log]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 10:47:50 +01:00
Greg Kroah-Hartman
84afceb668 Merge 4.14.158 into android-4.14
Changes in 4.14.158
	Revert "KVM: nVMX: reset cache/shadows when switching loaded VMCS"
	clk: meson: gxbb: let sar_adc_clk_div set the parent clock rate
	ASoC: msm8916-wcd-analog: Fix RX1 selection in RDAC2 MUX
	ASoC: compress: fix unsigned integer overflow check
	reset: Fix memory leak in reset_control_array_put()
	ASoC: kirkwood: fix external clock probe defer
	clk: samsung: exynos5420: Preserve PLL configuration during suspend/resume
	reset: fix reset_control_ops kerneldoc comment
	clk: at91: avoid sleeping early
	clk: sunxi-ng: a80: fix the zero'ing of bits 16 and 18
	idr: Fix idr_alloc_u32 on 32-bit systems
	x86/resctrl: Prevent NULL pointer dereference when reading mondata
	clk: ti: dra7-atl-clock: Remove ti_clk_add_alias call
	net: fec: add missed clk_disable_unprepare in remove
	bridge: ebtables: don't crash when using dnat target in output chains
	can: peak_usb: report bus recovery as well
	can: c_can: D_CAN: c_can_chip_config(): perform a sofware reset on open
	can: rx-offload: can_rx_offload_queue_tail(): fix error handling, avoid skb mem leak
	can: rx-offload: can_rx_offload_offload_one(): do not increase the skb_queue beyond skb_queue_len_max
	can: rx-offload: can_rx_offload_offload_one(): increment rx_fifo_errors on queue overflow or OOM
	can: rx-offload: can_rx_offload_offload_one(): use ERR_PTR() to propagate error value in case of errors
	can: rx-offload: can_rx_offload_irq_offload_timestamp(): continue on error
	can: rx-offload: can_rx_offload_irq_offload_fifo(): continue on error
	watchdog: meson: Fix the wrong value of left time
	scripts/gdb: fix debugging modules compiled with hot/cold partitioning
	net: bcmgenet: reapply manual settings to the PHY
	ceph: return -EINVAL if given fsc mount option on kernel w/o support
	mac80211: fix station inactive_time shortly after boot
	block: drbd: remove a stray unlock in __drbd_send_protocol()
	pwm: bcm-iproc: Prevent unloading the driver module while in use
	scsi: lpfc: Fix kernel Oops due to null pring pointers
	scsi: lpfc: Fix dif and first burst use in write commands
	ARM: dts: Fix up SQ201 flash access
	ARM: debug-imx: only define DEBUG_IMX_UART_PORT if needed
	ARM: dts: imx53-voipac-dmm-668: Fix memory node duplication
	parisc: Fix serio address output
	parisc: Fix HP SDC hpa address output
	arm64: mm: Prevent mismatched 52-bit VA support
	arm64: smp: Handle errors reported by the firmware
	ARM: OMAP1: fix USB configuration for device-only setups
	RDMA/vmw_pvrdma: Use atomic memory allocation in create AH
	PM / AVS: SmartReflex: NULL check before some freeing functions is not needed
	ARM: ks8695: fix section mismatch warning
	ACPI / LPSS: Ignore acpi_device_fix_up_power() return value
	scsi: lpfc: Enable Management features for IF_TYPE=6
	crypto: user - support incremental algorithm dumps
	mwifiex: fix potential NULL dereference and use after free
	mwifiex: debugfs: correct histogram spacing, formatting
	rtl818x: fix potential use after free
	xfs: require both realtime inodes to mount
	ubi: Put MTD device after it is not used
	ubi: Do not drop UBI device reference before using
	microblaze: adjust the help to the real behavior
	microblaze: move "... is ready" messages to arch/microblaze/Makefile
	iwlwifi: move iwl_nvm_check_version() into dvm
	gpiolib: Fix return value of gpio_to_desc() stub if !GPIOLIB
	kvm: vmx: Set IA32_TSC_AUX for legacy mode guests
	VSOCK: bind to random port for VMADDR_PORT_ANY
	mmc: meson-gx: make sure the descriptor is stopped on errors
	mtd: rawnand: sunxi: Write pageprog related opcodes to WCMD_SET
	btrfs: only track ref_heads in delayed_ref_updates
	HID: intel-ish-hid: fixes incorrect error handling
	serial: 8250: Rate limit serial port rx interrupts during input overruns
	kprobes/x86/xen: blacklist non-attachable xen interrupt functions
	xen/pciback: Check dev_data before using it
	vfio-mdev/samples: Use u8 instead of char for handle functions
	pinctrl: xway: fix gpio-hog related boot issues
	net/mlx5: Continue driver initialization despite debugfs failure
	exofs_mount(): fix leaks on failure exits
	bnxt_en: Return linux standard errors in bnxt_ethtool.c
	bnxt_en: query force speeds before disabling autoneg mode.
	KVM: s390: unregister debug feature on failing arch init
	pinctrl: sh-pfc: sh7264: Fix PFCR3 and PFCR0 register configuration
	pinctrl: sh-pfc: sh7734: Fix shifted values in IPSR10
	HID: doc: fix wrong data structure reference for UHID_OUTPUT
	dm flakey: Properly corrupt multi-page bios.
	gfs2: take jdata unstuff into account in do_grow
	xfs: Align compat attrlist_by_handle with native implementation.
	xfs: Fix bulkstat compat ioctls on x32 userspace.
	IB/qib: Fix an error code in qib_sdma_verbs_send()
	clocksource/drivers/fttmr010: Fix invalid interrupt register access
	vxlan: Fix error path in __vxlan_dev_create()
	powerpc/book3s/32: fix number of bats in p/v_block_mapped()
	powerpc/xmon: fix dump_segments()
	drivers/regulator: fix a missing check of return value
	Bluetooth: hci_bcm: Handle specific unknown packets after firmware loading
	serial: max310x: Fix tx_empty() callback
	openrisc: Fix broken paths to arch/or32
	RDMA/srp: Propagate ib_post_send() failures to the SCSI mid-layer
	scsi: qla2xxx: deadlock by configfs_depend_item
	scsi: csiostor: fix incorrect dma device in case of vport
	ath6kl: Only use match sets when firmware supports it
	ath6kl: Fix off by one error in scan completion
	powerpc/perf: Fix unit_sel/cache_sel checks
	powerpc/prom: fix early DEBUG messages
	powerpc/mm: Make NULL pointer deferences explicit on bad page faults.
	powerpc/44x/bamboo: Fix PCI range
	vfio/spapr_tce: Get rid of possible infinite loop
	powerpc/powernv/eeh/npu: Fix uninitialized variables in opal_pci_eeh_freeze_status
	drbd: ignore "all zero" peer volume sizes in handshake
	drbd: reject attach of unsuitable uuids even if connected
	drbd: do not block when adjusting "disk-options" while IO is frozen
	drbd: fix print_st_err()'s prototype to match the definition
	IB/rxe: Make counters thread safe
	regulator: tps65910: fix a missing check of return value
	powerpc/83xx: handle machine check caused by watchdog timer
	powerpc/pseries: Fix node leak in update_lmb_associativity_index()
	crypto: mxc-scc - fix build warnings on ARM64
	pwm: clps711x: Fix period calculation
	net/netlink_compat: Fix a missing check of nla_parse_nested
	net/net_namespace: Check the return value of register_pernet_subsys()
	f2fs: fix to dirty inode synchronously
	um: Make GCOV depend on !KCOV
	net: (cpts) fix a missing check of clk_prepare
	net: stmicro: fix a missing check of clk_prepare
	net: dsa: bcm_sf2: Propagate error value from mdio_write
	atl1e: checking the status of atl1e_write_phy_reg
	tipc: fix a missing check of genlmsg_put
	net/wan/fsl_ucc_hdlc: Avoid double free in ucc_hdlc_probe()
	ocfs2: clear journal dirty flag after shutdown journal
	vmscan: return NODE_RECLAIM_NOSCAN in node_reclaim() when CONFIG_NUMA is n
	lib/genalloc.c: fix allocation of aligned buffer from non-aligned chunk
	lib/genalloc.c: use vzalloc_node() to allocate the bitmap
	fork: fix some -Wmissing-prototypes warnings
	drivers/base/platform.c: kmemleak ignore a known leak
	lib/genalloc.c: include vmalloc.h
	mtd: Check add_mtd_device() ret code
	tipc: fix memory leak in tipc_nl_compat_publ_dump
	net/core/neighbour: tell kmemleak about hash tables
	PCI/MSI: Return -ENOSPC from pci_alloc_irq_vectors_affinity()
	net/core/neighbour: fix kmemleak minimal reference count for hash tables
	serial: 8250: Fix serial8250 initialization crash
	gpu: ipu-v3: pre: don't trigger update if buffer address doesn't change
	sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe
	ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel
	decnet: fix DN_IFREQ_SIZE
	net/smc: prevent races between smc_lgr_terminate() and smc_conn_free()
	blktrace: Show requests without sector
	tipc: fix skb may be leaky in tipc_link_input
	sfc: initialise found bitmap in efx_ef10_mtd_probe
	net: fix possible overflow in __sk_mem_raise_allocated()
	sctp: don't compare hb_timer expire date before starting it
	bpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()
	net: dev: Use unsigned integer as an argument to left-shift
	kvm: properly check debugfs dentry before using it
	bpf: drop refcount if bpf_map_new_fd() fails in map_create()
	net: hns3: Change fw error code NOT_EXEC to NOT_SUPPORTED
	iommu/amd: Fix NULL dereference bug in match_hid_uid
	apparmor: delete the dentry in aafs_remove() to avoid a leak
	scsi: libsas: Support SATA PHY connection rate unmatch fixing during discovery
	ACPI / APEI: Don't wait to serialise with oops messages when panic()ing
	ACPI / APEI: Switch estatus pool to use vmalloc memory
	scsi: libsas: Check SMP PHY control function result
	powerpc/pseries/dlpar: Fix a missing check in dlpar_parse_cc_property()
	mtd: Remove a debug trace in mtdpart.c
	mm, gup: add missing refcount overflow checks on s390
	clk: at91: fix update bit maps on CFG_MOR write
	clk: at91: generated: set audio_pll_allowed in at91_clk_register_generated()
	staging: rtl8192e: fix potential use after free
	staging: rtl8723bs: Drop ACPI device ids
	staging: rtl8723bs: Add 024c:0525 to the list of SDIO device-ids
	USB: serial: ftdi_sio: add device IDs for U-Blox C099-F9P
	mei: bus: prefix device names on bus with the bus name
	xfrm: Fix memleak on xfrm state destroy
	media: v4l2-ctrl: fix flags for DO_WHITE_BALANCE
	net: macb: fix error format in dev_err()
	pwm: Clear chip_data in pwm_put()
	media: atmel: atmel-isc: fix asd memory allocation
	media: atmel: atmel-isc: fix INIT_WORK misplacement
	macvlan: schedule bc_work even if error
	net: psample: fix skb_over_panic
	openvswitch: fix flow command message size
	slip: Fix use-after-free Read in slip_open
	openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info()
	openvswitch: remove another BUG_ON()
	tipc: fix link name length check
	sctp: cache netns in sctp_ep_common
	net: sched: fix `tc -s class show` no bstats on class with nolock subqueues
	ext4: add more paranoia checking in ext4_expand_extra_isize handling
	watchdog: sama5d4: fix WDD value to be always set to max
	net: macb: Fix SUBNS increment and increase resolution
	net: macb driver, check for SKBTX_HW_TSTAMP
	mtd: rawnand: atmel: Fix spelling mistake in error message
	mtd: rawnand: atmel: fix possible object reference leak
	mtd: spi-nor: cast to u64 to avoid uint overflows
	y2038: futex: Move compat implementation into futex.c
	futex: Prevent robust futex exit race
	futex: Move futex exit handling into futex code
	futex: Replace PF_EXITPIDONE with a state
	exit/exec: Seperate mm_release()
	futex: Split futex_mm_release() for exit/exec
	futex: Set task::futex_state to DEAD right after handling futex exit
	futex: Mark the begin of futex exit explicitly
	futex: Sanitize exit state handling
	futex: Provide state handling for exec() as well
	futex: Add mutex around futex exit
	futex: Provide distinct return value when owner is exiting
	futex: Prevent exit livelock
	HID: core: check whether Usage Page item is after Usage ID items
	crypto: stm32/hash - Fix hmac issue more than 256 bytes
	media: stm32-dcmi: fix DMA corruption when stopping streaming
	hwrng: stm32 - fix unbalanced pm_runtime_enable
	mailbox: mailbox-test: fix null pointer if no mmio
	pinctrl: stm32: fix memory leak issue
	ASoC: stm32: i2s: fix dma configuration
	ASoC: stm32: i2s: fix 16 bit format support
	ASoC: stm32: i2s: fix IRQ clearing
	platform/x86: hp-wmi: Fix ACPI errors caused by too small buffer
	platform/x86: hp-wmi: Fix ACPI errors caused by passing 0 as input size
	net: fec: fix clock count mis-match
	Linux 4.14.158

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-12-05 15:48:19 +01:00
Alexey Kardashevskiy
7a0d07f81e vfio/spapr_tce: Get rid of possible infinite loop
[ Upstream commit 517ad4ae8aa93dccdb9a88c27257ecb421c9e848 ]

As a part of cleanup, the SPAPR TCE IOMMU subdriver releases preregistered
memory. If there is a bug in memory release, the loop in
tce_iommu_release() becomes infinite; this actually happened to me.

This makes the loop finite and prints a warning on every failure to make
the code more bug prone.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 15:37:42 +01:00
Greg Kroah-Hartman
7bc77fd339 Merge 4.14.155 into android-4.14
Changes in 4.14.155
	kvm: mmu: Don't read PDPTEs when paging is not enabled
	KVM: x86: introduce is_pae_paging
	MIPS: BCM63XX: fix switch core reset on BCM6368
	scsi: core: Handle drivers which set sg_tablesize to zero
	Revert "Input: synaptics-rmi4 - avoid processing unknown IRQs"
	powerpc/perf: Fix IMC_MAX_PMU macro
	powerpc/perf: Fix kfree memory allocated for nest pmus
	ax88172a: fix information leak on short answers
	net: usb: qmi_wwan: add support for Foxconn T77W968 LTE modules
	slip: Fix memory leak in slip_open error path
	ALSA: usb-audio: Fix missing error check at mixer resolution test
	ALSA: usb-audio: not submit urb for stopped endpoint
	Input: ff-memless - kill timer in destroy()
	Input: synaptics-rmi4 - fix video buffer size
	Input: synaptics-rmi4 - disable the relative position IRQ in the F12 driver
	Input: synaptics-rmi4 - do not consume more data than we have (F11, F12)
	Input: synaptics-rmi4 - clear IRQ enables for F54
	Input: synaptics-rmi4 - destroy F54 poller workqueue when removing
	IB/hfi1: Ensure full Gen3 speed in a Gen4 system
	i2c: acpi: Force bus speed to 400KHz if a Silead touchscreen is present
	ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
	ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
	iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
	mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
	mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
	mmc: sdhci-of-at91: fix quirk2 overwrite
	iio: adc: max9611: explicitly cast gain_selectors
	tee: optee: take DT status property into account
	ath10k: fix kernel panic by moving pci flush after napi_disable
	iio: dac: mcp4922: fix error handling in mcp4922_write_raw
	arm64: dts: allwinner: a64: Olinuxino: fix DRAM voltage
	arm64: dts: allwinner: a64: NanoPi-A64: Fix DCDC1 voltage
	ALSA: pcm: signedness bug in snd_pcm_plug_alloc()
	arm64: dts: tegra210-p2180: Correct sdmmc4 vqmmc-supply
	ARM: dts: at91/trivial: Fix USART1 definition for at91sam9g45
	rtc: rv8803: fix the rv8803 id in the OF table
	remoteproc/davinci: Use %zx for formating size_t
	extcon: cht-wc: Return from default case to avoid warnings
	cfg80211: Avoid regulatory restore when COUNTRY_IE_IGNORE is set
	ALSA: seq: Do error checks at creating system ports
	ath9k: fix tx99 with monitor mode interface
	ath10k: limit available channels via DT ieee80211-freq-limit
	gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
	ASoC: dpcm: Properly initialise hw->rate_max
	pinctrl: ingenic: Probe driver at subsys_initcall
	MIPS: BCM47XX: Enable USB power on Netgear WNDR3400v3
	ARM: dts: exynos: Fix sound in Snow-rev5 Chromebook
	liquidio: fix race condition in instruction completion processing
	ARM: dts: exynos: Fix regulators configuration on Peach Pi/Pit Chromebooks
	i40e: use correct length for strncpy
	i40e: hold the rtnl lock on clearing interrupt scheme
	i40e: Prevent deleting MAC address from VF when set by PF
	IB/rxe: fixes for rdma read retry
	iwlwifi: don't WARN on trying to dump dead firmware
	iwlwifi: mvm: avoid sending too many BARs
	ARM: dts: pxa: fix the rtc controller
	ARM: dts: pxa: fix power i2c base address
	rtl8187: Fix warning generated when strncpy() destination length matches the sixe argument
	soc: imx: gpc: fix PDN delay
	ASoC: rsnd: ssi: Fix issue in dma data address assignment
	net: phy: mscc: read 'vsc8531,vddmac' as an u32
	net: phy: mscc: read 'vsc8531, edge-slowdown' as an u32
	ARM: dts: meson8: fix the clock controller register size
	ARM: dts: meson8b: fix the clock controller register size
	net: lan78xx: Bail out if lan78xx_get_endpoints fails
	ASoC: sgtl5000: avoid division by zero if lo_vag is zero
	ARM: dts: exynos: Disable pull control for S5M8767 PMIC
	ath10k: wmi: disable softirq's while calling ieee80211_rx
	IB/ipoib: Ensure that MTU isn't less than minimum permitted
	RDMA/core: Rate limit MAD error messages
	RDMA/core: Follow correct unregister order between sysfs and cgroup
	mips: txx9: fix iounmap related issue
	ASoC: Intel: hdac_hdmi: Limit sampling rates at dai creation
	of: make PowerMac cache node search conditional on CONFIG_PPC_PMAC
	ARM: dts: omap3-gta04: give spi_lcd node a label so that we can overwrite in other DTS files
	ARM: dts: omap3-gta04: fixes for tvout / venc
	ARM: dts: omap3-gta04: tvout: enable as display1 alias
	ARM: dts: omap3-gta04: fix touchscreen tsc2007
	ARM: dts: omap3-gta04: make NAND partitions compatible with recent U-Boot
	ARM: dts: omap3-gta04: keep vpll2 always on
	sched/debug: Use symbolic names for task state constants
	arm64: dts: rockchip: Fix VCC5V0_HOST_EN on rk3399-sapphire
	dmaengine: dma-jz4780: Don't depend on MACH_JZ4780
	dmaengine: dma-jz4780: Further residue status fix
	EDAC, sb_edac: Return early on ADDRV bit and address type test
	rtc: mt6397: fix possible race condition
	rtc: pl030: fix possible race condition
	ath9k: add back support for using active monitor interfaces for tx99
	IB/hfi1: Missing return value in error path for user sdma
	signal: Always ignore SIGKILL and SIGSTOP sent to the global init
	signal: Properly deliver SIGILL from uprobes
	signal: Properly deliver SIGSEGV from x86 uprobes
	f2fs: fix memory leak of percpu counter in fill_super()
	scsi: qla2xxx: Fix iIDMA error
	scsi: qla2xxx: Defer chip reset until target mode is enabled
	scsi: qla2xxx: Fix dropped srb resource.
	scsi: lpfc: Fix errors in log messages.
	scsi: sym53c8xx: fix NULL pointer dereference panic in sym_int_sir()
	ARM: imx6: register pm_power_off handler if "fsl,pmic-stby-poweroff" is set
	scsi: pm80xx: Corrected dma_unmap_sg() parameter
	scsi: pm80xx: Fixed system hang issue during kexec boot
	kprobes: Don't call BUG_ON() if there is a kprobe in use on free list
	Drivers: hv: vmbus: Fix synic per-cpu context initialization
	nvmem: core: return error code instead of NULL from nvmem_device_get
	media: dt-bindings: adv748x: Fix decimal unit addresses
	media: fix: media: pci: meye: validate offset to avoid arbitrary access
	media: dvb: fix compat ioctl translation
	arm64: dts: meson: libretech: update board model
	ALSA: intel8x0m: Register irq handler after register initializations
	pinctrl: at91-pio4: fix has_config check in atmel_pctl_dt_subnode_to_map()
	llc: avoid blocking in llc_sap_close()
	ARM: dts: qcom: ipq4019: fix cpu0's qcom,saw2 reg value
	soc: qcom: wcnss_ctrl: Avoid string overflow
	powerpc/vdso: Correct call frame information
	ARM: dts: socfpga: Fix I2C bus unit-address error
	pinctrl: at91: don't use the same irqchip with multiple gpiochips
	cxgb4: Fix endianness issue in t4_fwcache()
	blok, bfq: do not plug I/O if all queues are weight-raised
	arm64: dts: meson: Fix erroneous SPI bus warnings
	power: supply: ab8500_fg: silence uninitialized variable warnings
	power: reset: at91-poweroff: do not procede if at91_shdwc is allocated
	power: supply: max8998-charger: Fix platform data retrieval
	component: fix loop condition to call unbind() if bind() fails
	kernfs: Fix range checks in kernfs_get_target_path
	ip_gre: fix parsing gre header in ipgre_err
	ARM: dts: rockchip: Fix erroneous SPI bus dtc warnings on rk3036
	ACPI / LPSS: Exclude I2C busses shared with PUNIT from pmc_atom_d3_mask
	ath9k: Fix a locking bug in ath9k_add_interface()
	s390/qeth: invoke softirqs after napi_schedule()
	PCI/ACPI: Correct error message for ASPM disabling
	serial: uartps: Fix suspend functionality
	serial: samsung: Enable baud clock for UART reset procedure in resume
	serial: mxs-auart: Fix potential infinite loop
	samples/bpf: fix a compilation failure
	spi: mediatek: Don't modify spi_transfer when transfer.
	ipmi:dmi: Ignore IPMI SMBIOS entries with a zero base address
	net: hns3: fix return type of ndo_start_xmit function
	powerpc/iommu: Avoid derefence before pointer check
	powerpc/64s/hash: Fix stab_rr off by one initialization
	powerpc/pseries: Disable CPU hotplug across migrations
	powerpc: Fix duplicate const clang warning in user access code
	RDMA/i40iw: Fix incorrect iterator type
	OPP: Protect dev_list with opp_table lock
	libfdt: Ensure INT_MAX is defined in libfdt_env.h
	power: supply: twl4030_charger: fix charging current out-of-bounds
	power: supply: twl4030_charger: disable eoc interrupt on linear charge
	net: toshiba: fix return type of ndo_start_xmit function
	net: xilinx: fix return type of ndo_start_xmit function
	net: broadcom: fix return type of ndo_start_xmit function
	net: amd: fix return type of ndo_start_xmit function
	net: sun: fix return type of ndo_start_xmit function
	net: hns3: Fix for setting speed for phy failed problem
	net: hns3: Fix parameter type for q_id in hclge_tm_q_to_qs_map_cfg()
	nfp: provide a better warning when ring allocation fails
	usb: chipidea: imx: enable OTG overcurrent in case USB subsystem is already started
	usb: chipidea: Fix otg event handler
	mlxsw: spectrum: Init shaper for TCs 8..15
	ARM: dts: am335x-evm: fix number of cpsw
	f2fs: fix to recover inode's uid/gid during POR
	ARM: dts: ux500: Correct SCU unit address
	ARM: dts: ux500: Fix LCDA clock line muxing
	ARM: dts: ste: Fix SPI controller node names
	spi: pic32: Use proper enum in dmaengine_prep_slave_rg
	cpufeature: avoid warning when compiling with clang
	crypto: arm/crc32 - avoid warning when compiling with Clang
	ARM: dts: marvell: Fix SPI and I2C bus warnings
	x86/mce-inject: Reset injection struct after injection
	ARM: dts: clearfog: fix sdhci supply property name
	bnx2x: Ignore bandwidth attention in single function mode
	samples/bpf: fix compilation failure
	net: phy: mdio-bcm-unimac: Allow configuring MDIO clock divider
	net: micrel: fix return type of ndo_start_xmit function
	net: freescale: fix return type of ndo_start_xmit function
	x86/CPU: Use correct macros for Cyrix calls
	x86/CPU: Change query logic so CPUID is enabled before testing
	MIPS: kexec: Relax memory restriction
	arm64: dts: rockchip: Fix microSD in rk3399 sapphire board
	media: pci: ivtv: Fix a sleep-in-atomic-context bug in ivtv_yuv_init()
	media: au0828: Fix incorrect error messages
	media: davinci: Fix implicit enum conversion warning
	ARM: dts: rockchip: explicitly set vcc_sd0 pin to gpio on rk3188-radxarock
	usb: gadget: uvc: configfs: Drop leaked references to config items
	usb: gadget: uvc: configfs: Prevent format changes after linking header
	i2c: aspeed: fix invalid clock parameters for very large divisors
	phy: brcm-sata: allow PHY_BRCM_SATA driver to be built for DSL SoCs
	phy: renesas: rcar-gen3-usb2: fix vbus_ctrl for role sysfs
	phy: phy-twl4030-usb: fix denied runtime access
	usb: gadget: uvc: Factor out video USB request queueing
	usb: gadget: uvc: Only halt video streaming endpoint in bulk mode
	coresight: Fix handling of sinks
	coresight: perf: Fix per cpu path management
	coresight: perf: Disable trace path upon source error
	coresight: etm4x: Configure EL2 exception level when kernel is running in HYP
	coresight: tmc: Fix byte-address alignment for RRP
	misc: kgdbts: Fix restrict error
	misc: genwqe: should return proper error value.
	vfio/pci: Fix potential memory leak in vfio_msi_cap_len
	vfio/pci: Mask buggy SR-IOV VF INTx support
	scsi: libsas: always unregister the old device if going to discover new
	phy: lantiq: Fix compile warning
	ARM: dts: tegra30: fix xcvr-setup-use-fuses
	ARM: tegra: apalis_t30: fix mmc1 cmd pull-up
	ARM: dts: paz00: fix wakeup gpio keycode
	net: smsc: fix return type of ndo_start_xmit function
	net: faraday: fix return type of ndo_start_xmit function
	f2fs: fix to recover inode's project id during POR
	f2fs: mark inode dirty explicitly in recover_inode()
	EDAC: Raise the maximum number of memory controllers
	ARM: dts: realview: Fix SPI controller node names
	firmware: dell_rbu: Make payload memory uncachable
	Bluetooth: hci_serdev: clear HCI_UART_PROTO_READY to avoid closing proto races
	Bluetooth: L2CAP: Detect if remote is not able to use the whole MPS
	x86/hyperv: Suppress "PCI: Fatal: No config space access function found"
	crypto: s5p-sss: Fix Fix argument list alignment
	crypto: fix a memory leak in rsa-kcs1pad's encryption mode
	iwlwifi: dbg: don't crash if the firmware crashes in the middle of a debug dump
	iwlwifi: api: annotate compressed BA notif array sizes
	iwlwifi: mvm: Allow TKIP for AP mode
	scsi: NCR5380: Clear all unissued commands on host reset
	scsi: NCR5380: Have NCR5380_select() return a bool
	scsi: NCR5380: Withhold disconnect privilege for REQUEST SENSE
	scsi: NCR5380: Use DRIVER_SENSE to indicate valid sense data
	scsi: NCR5380: Check for invalid reselection target
	scsi: NCR5380: Don't clear busy flag when abort fails
	scsi: NCR5380: Don't call dsprintk() following reselection interrupt
	scsi: NCR5380: Handle BUS FREE during reselection
	scsi: NCR5380: Check for bus reset
	arm64: dts: amd: Fix SPI bus warnings
	arm64: dts: lg: Fix SPI controller node names
	ARM: dts: lpc32xx: Fix SPI controller node names
	rtc: armada38x: fix possible race condition
	netfilter: masquerade: don't flush all conntracks if only one address deleted on device
	usb: xhci-mtk: fix ISOC error when interval is zero
	fuse: use READ_ONCE on congestion_threshold and max_background
	IB/iser: Fix possible NULL deref at iser_inv_desc()
	net: phy: mdio-bcm-unimac: mark PM functions as __maybe_unused
	memfd: Use radix_tree_deref_slot_protected to avoid the warning.
	slcan: Fix memory leak in error path
	Linux 4.14.155

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-11-20 20:51:45 +01:00
Alex Williamson
d7bb792bc8 vfio/pci: Mask buggy SR-IOV VF INTx support
[ Upstream commit db04264fe9bc0f2b62e036629f9afb530324b693 ]

The SR-IOV spec requires that VFs must report zero for the INTx pin
register as VFs are precluded from INTx support.  It's much easier for
the host kernel to understand whether a device is a VF and therefore
whether a non-zero pin register value is bogus than it is to do the
same in userspace.  Override the INTx count for such devices and
virtualize the pin register to provide a consistent view of the device
to the user.

As this is clearly a spec violation, warn about it to support hardware
validation, but also provide a known whitelist as it doesn't do much
good to continue complaining if the hardware vendor doesn't plan to
fix it.

Known devices with this issue: 8086:270c

Tested-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-20 18:00:40 +01:00
Li Qiang
f1b10ba573 vfio/pci: Fix potential memory leak in vfio_msi_cap_len
[ Upstream commit 30ea32ab1951c80c6113f300fce2c70cd12659e4 ]

Free allocated vdev->msi_perm in error path.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-20 18:00:40 +01:00
Andrey Konovalov
674e978231 UPSTREAM: vfio/type1: untag user pointers in vaddr_get_pfn
(Upstream commit 6cf5354c1c4b74fd2e5527db084f163e9d4dae4e).

This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.

vaddr_get_pfn() uses provided user pointers for vma lookups, which can
only by done with untagged pointers.

Untag user pointers in this function.

Link: http://lkml.kernel.org/r/87422b4d72116a975896f2b19b00f38acbd28f33.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: Ic4a7b673658538f7fcc5a5b76e77e574417214c7
2019-10-07 14:56:04 -04:00
Greg Kroah-Hartman
8390d98a1a Merge 4.14.148 into android-4.14
Changes in 4.14.148
	tpm: migrate pubek_show to struct tpm_buf
	tpm: use tpm_try_get_ops() in tpm-sysfs.c.
	tpm: Fix TPM 1.2 Shutdown sequence to prevent future TPM operations
	drm/bridge: tc358767: Increase AUX transfer length limit
	drm/panel: simple: fix AUO g185han01 horizontal blanking
	video: ssd1307fb: Start page range at page_offset
	drm/stm: attach gem fence to atomic state
	drm/radeon: Fix EEH during kexec
	gpu: drm: radeon: Fix a possible null-pointer dereference in radeon_connector_set_property()
	ipmi_si: Only schedule continuously in the thread in maintenance mode
	clk: qoriq: Fix -Wunused-const-variable
	clk: sunxi-ng: v3s: add missing clock slices for MMC2 module clocks
	clk: sirf: Don't reference clk_init_data after registration
	clk: zx296718: Don't reference clk_init_data after registration
	powerpc/xmon: Check for HV mode when dumping XIVE info from OPAL
	powerpc/rtas: use device model APIs and serialization during LPM
	powerpc/futex: Fix warning: 'oldval' may be used uninitialized in this function
	powerpc/pseries/mobility: use cond_resched when updating device tree
	pinctrl: tegra: Fix write barrier placement in pmx_writel
	vfio_pci: Restore original state on release
	drm/nouveau/volt: Fix for some cards having 0 maximum voltage
	drm/amdgpu/si: fix ASIC tests
	powerpc/64s/exception: machine check use correct cfar for late handler
	powerpc/pseries: correctly track irq state in default idle
	arm64: fix unreachable code issue with cmpxchg
	clk: at91: select parent if main oscillator or bypass is enabled
	scsi: core: Reduce memory required for SCSI logging
	dma-buf/sw_sync: Synchronize signal vs syncpt free
	MIPS: tlbex: Explicitly cast _PAGE_NO_EXEC to a boolean
	i2c-cht-wc: Fix lockdep warning
	mfd: intel-lpss: Remove D3cold delay
	PCI: tegra: Fix OF node reference leak
	livepatch: Nullify obj->mod in klp_module_coming()'s error path
	ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes
	rtc: snvs: fix possible race condition
	HID: apple: Fix stuck function keys when using FN
	PCI: rockchip: Propagate errors for optional regulators
	PCI: imx6: Propagate errors for optional regulators
	PCI: exynos: Propagate errors for optional PHYs
	security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb()
	ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address
	fat: work around race with userspace's read via blockdev while mounting
	pktcdvd: remove warning on attempting to register non-passthrough dev
	hypfs: Fix error number left in struct pointer member
	kbuild: clean compressed initramfs image
	ocfs2: wait for recovering done after direct unlock request
	kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K
	bpf: fix use after free in prog symbol exposure
	cxgb4:Fix out-of-bounds MSI-X info array access
	erspan: remove the incorrect mtu limit for erspan
	hso: fix NULL-deref on tty open
	ipv6: drop incoming packets having a v4mapped source address
	net: ipv4: avoid mixed n_redirects and rate_tokens usage
	net: qlogic: Fix memory leak in ql_alloc_large_buffers
	net: Unpublish sk from sk_reuseport_cb before call_rcu
	nfc: fix memory leak in llcp_sock_bind()
	qmi_wwan: add support for Cinterion CLS8 devices
	sch_dsmark: fix potential NULL deref in dsmark_init()
	vsock: Fix a lockdep warning in __vsock_release()
	net/rds: Fix error handling in rds_ib_add_one()
	xen-netfront: do not use ~0U as error return value for xennet_fill_frags()
	tipc: fix unlimited bundling of small messages
	sch_cbq: validate TCA_CBQ_WRROPT to avoid crash
	ipv6: Handle missing host route in __ipv6_ifa_notify
	Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
	smack: use GFP_NOFS while holding inode_smack::smk_lock
	NFC: fix attrs checks in netlink interface
	kexec: bail out upon SIGKILL when allocating memory.
	Linux 4.14.148

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-10-07 19:12:31 +02:00
hexin
ed9544cadc vfio_pci: Restore original state on release
[ Upstream commit 92c8026854c25093946e0d7fe536fd9eac440f06 ]

vfio_pci_enable() saves the device's initial configuration information
with the intent that it is restored in vfio_pci_disable().  However,
the commit referenced in Fixes: below replaced the call to
__pci_reset_function_locked(), which is not wrapped in a state save
and restore, with pci_try_reset_function(), which overwrites the
restored device state with the current state before applying it to the
device.  Reinstate use of __pci_reset_function_locked() to return to
the desired behavior.

Fixes: 890ed578df ("vfio-pci: Use pci "try" reset interface")
Signed-off-by: hexin <hexin15@baidu.com>
Signed-off-by: Liu Qi <liuqi16@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-07 18:55:03 +02:00
David Howells
fae859c849 UPSTREAM: Make anon_inodes unconditional
Make the anon_inodes facility unconditional so that it can be used by core
VFS code.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

(cherry picked from commit dadd2299ab61fc2b55b95b7b3a8f674cdd3b69c9)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I2f97bda4f360d8d05bbb603de839717b3d8067ae
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:29:46 -04:00
Farhan Ali
488c10be17 vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING"
[ Upstream commit 41be3e2618174fdf3361e49e64f2bf530f40c6b0 ]

vfio_dev_present() which is the condition to
wait_event_interruptible_timeout(), will call vfio_group_get_device
and try to acquire the mutex group->device_lock.

wait_event_interruptible_timeout() will set the state of the current
task to TASK_INTERRUPTIBLE, before doing the condition check. This
means that we will try to acquire the mutex while already in a
sleeping state. The scheduler warns us by giving the following
warning:

[ 4050.264464] ------------[ cut here ]------------
[ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b33c00e2>] prepare_to_wait_event+0x14a/0x188
[ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112 __might_sleep+0x76/0x90
....

 4050.264756] Call Trace:
[ 4050.264765] ([<000000000017bbaa>] __might_sleep+0x72/0x90)
[ 4050.264774]  [<0000000000b97edc>] __mutex_lock+0x44/0x8c0
[ 4050.264782]  [<0000000000b9878a>] mutex_lock_nested+0x32/0x40
[ 4050.264793]  [<000003ff800d7abe>] vfio_group_get_device+0x36/0xa8 [vfio]
[ 4050.264803]  [<000003ff800d87c0>] vfio_del_group_dev+0x238/0x378 [vfio]
[ 4050.264813]  [<000003ff8015f67c>] mdev_remove+0x3c/0x68 [mdev]
[ 4050.264825]  [<00000000008e01b0>] device_release_driver_internal+0x168/0x268
[ 4050.264834]  [<00000000008de692>] bus_remove_device+0x162/0x190
[ 4050.264843]  [<00000000008daf42>] device_del+0x1e2/0x368
[ 4050.264851]  [<00000000008db12c>] device_unregister+0x64/0x88
[ 4050.264862]  [<000003ff8015ed84>] mdev_device_remove+0xec/0x130 [mdev]
[ 4050.264872]  [<000003ff8015f074>] remove_store+0x6c/0xa8 [mdev]
[ 4050.264881]  [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8
[ 4050.264890]  [<00000000003c1530>] __vfs_write+0x38/0x1a8
[ 4050.264899]  [<00000000003c187c>] vfs_write+0xb4/0x198
[ 4050.264908]  [<00000000003c1af2>] ksys_write+0x5a/0xb0
[ 4050.264916]  [<0000000000b9e270>] system_call+0xdc/0x2d8
[ 4050.264925] 4 locks held by sh/35924:
[ 4050.264933]  #0: 000000001ef90325 (sb_writers#4){.+.+}, at: vfs_write+0x9e/0x198
[ 4050.264948]  #1: 000000005c1ab0b3 (&of->mutex){+.+.}, at: kernfs_fop_write+0x1cc/0x1f8
[ 4050.264963]  #2: 0000000034831ab8 (kn->count#297){++++}, at: kernfs_remove_self+0x12e/0x150
[ 4050.264979]  #3: 00000000e152484f (&dev->mutex){....}, at: device_release_driver_internal+0x5c/0x268
[ 4050.264993] Last Breaking-Event-Address:
[ 4050.265002]  [<000000000017bbaa>] __might_sleep+0x72/0x90
[ 4050.265010] irq event stamp: 7039
[ 4050.265020] hardirqs last  enabled at (7047): [<00000000001cee7a>] console_unlock+0x6d2/0x740
[ 4050.265029] hardirqs last disabled at (7054): [<00000000001ce87e>] console_unlock+0xd6/0x740
[ 4050.265040] softirqs last  enabled at (6416): [<0000000000b8fe26>] __udelay+0xb6/0x100
[ 4050.265049] softirqs last disabled at (6415): [<0000000000b8fe06>] __udelay+0x96/0x100
[ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]---

Let's fix this as described in the article
https://lwn.net/Articles/628628/.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
[remove now redundant vfio_dev_present()]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-15 11:54:54 +02:00
Louis Taylor
5a56270b66 vfio/pci: use correct format characters
[ Upstream commit 426b046b748d1f47e096e05bdcc6fb4172791307 ]

When compiling with -Wformat, clang emits the following warnings:

drivers/vfio/pci/vfio_pci.c:1601:5: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                ^~~~~~

drivers/vfio/pci/vfio_pci.c:1601:13: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                        ^~~~~~

drivers/vfio/pci/vfio_pci.c:1601:21: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1601:32: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                           ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1605:5: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                ^~~~~~

drivers/vfio/pci/vfio_pci.c:1605:13: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                        ^~~~~~

drivers/vfio/pci/vfio_pci.c:1605:21: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1605:32: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                           ^~~~~~~~~
The types of these arguments are unconditionally defined, so this patch
updates the format character to the correct ones for unsigned ints.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Louis Taylor <louis@kragniz.eu>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-08 07:20:49 +02:00
Alex Williamson
73a95f1a41 vfio/type1: Limit DMA mappings per container
commit 492855939bdb59c6f947b0b5b44af9ad82b7e38c upstream.

Memory backed DMA mappings are accounted against a user's locked
memory limit, including multiple mappings of the same memory.  This
accounting bounds the number of such mappings that a user can create.
However, DMA mappings that are not backed by memory, such as DMA
mappings of device MMIO via mmaps, do not make use of page pinning
and therefore do not count against the user's locked memory limit.
These mappings still consume memory, but the memory is not well
associated to the process for the purpose of oom killing a task.

To add bounding on this use case, we introduce a limit to the total
number of concurrent DMA mappings that a user is allowed to create.
This limit is exposed as a tunable module option where the default
value of 64K is expected to be well in excess of any reasonable use
case (a large virtual machine configuration would typically only make
use of tens of concurrent mappings).

This fixes CVE-2019-3882.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-02 09:40:30 +02:00
Alex Williamson
827faa4eb5 vfio/type1: Fix task tracking for QEMU vCPU hotplug
[ Upstream commit 48d8476b41eed63567dd2f0ad125c895b9ac648a ]

MAP_DMA ioctls might be called from various threads within a process,
for example when using QEMU, the vCPU threads are often generating
these calls and we therefore take a reference to that vCPU task.
However, QEMU also supports vCPU hotplug on some machines and the task
that called MAP_DMA may have exited by the time UNMAP_DMA is called,
resulting in the mm_struct pointer being NULL and thus a failure to
match against the existing mapping.

To resolve this, we instead take a reference to the thread
group_leader, which has the same mm_struct and resource limits, but
is less likely exit, at least in the QEMU case.  A difficulty here is
guaranteeing that the capabilities of the group_leader match that of
the calling thread, which we resolve by tracking CAP_IPC_LOCK at the
time of calling rather than at an indeterminate time in the future.
Potentially this also results in better efficiency as this is now
recorded once per MAP_DMA ioctl.

Reported-by: Xu Yandong <xuyandong2@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03 07:50:23 +02:00
Alex Williamson
8f38152f2a vfio/mdev: Check globally for duplicate devices
[ Upstream commit 002fe996f67f4f46d8917b14cfb6e4313c20685a ]

When we create an mdev device, we check for duplicates against the
parent device and return -EEXIST if found, but the mdev device
namespace is global since we'll link all devices from the bus.  We do
catch this later in sysfs_do_create_link_sd() to return -EEXIST, but
with it comes a kernel warning and stack trace for trying to create
duplicate sysfs links, which makes it an undesirable response.

Therefore we should really be looking for duplicates across all mdev
parent devices, or as implemented here, against our mdev device list.
Using mdev_list to prevent duplicates means that we can remove
mdev_parent.lock, but in order not to serialize mdev device creation
and removal globally, we add mdev_device.active which allows UUIDs to
be reserved such that we can drop the mdev_list_lock before the mdev
device is fully in place.

Two behavioral notes; first, mdev_parent.lock had the side-effect of
serializing mdev create and remove ops per parent device.  This was
an implementation detail, not an intentional guarantee provided to
the mdev vendor drivers.  Vendor drivers can trivially provide this
serialization internally if necessary.  Second, review comments note
the new -EAGAIN behavior when the device, and in particular the remove
attribute, becomes visible in sysfs.  If a remove is triggered prior
to completion of mdev_device_create() the user will see a -EAGAIN
error.  While the errno is different, receiving an error during this
period is not, the previous implementation returned -ENODEV for the
same condition.  Furthermore, the consistency to the user is improved
in the case where mdev_device_remove_ops() returns error.  Previously
concurrent calls to mdev_device_remove() could see the device
disappear with -ENODEV and return in the case of error.  Now a user
would see -EAGAIN while the device is in this transitory state.

Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03 07:50:22 +02:00
Geert Uytterhoeven
ca014df110 vfio: platform: Fix reset module leak in error path
[ Upstream commit 28a68387888997e8a7fa57940ea5d55f2e16b594 ]

If the IOMMU group setup fails, the reset module is not released.

Fixes: b5add544d6 ("vfio, platform: make reset driver a requirement by default")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03 07:50:22 +02:00
Alexey Kardashevskiy
58113603a4 KVM: PPC: Check if IOMMU page is contained in the pinned physical page
commit 76fa4975f3ed12d15762bc979ca44078598ed8ee upstream.

A VM which has:
 - a DMA capable device passed through to it (eg. network card);
 - running a malicious kernel that ignores H_PUT_TCE failure;
 - capability of using IOMMU pages bigger that physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.

The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.

The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.

We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace which in turn calls VFIO IOMMU driver, this fails and
the guest does not retry,

This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.

This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.

Fixes: 121f80ba68 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-28 07:55:41 +02:00
Alexey Kardashevskiy
9a2e4a01de vfio/spapr: Use IOMMU pageshift rather than pagesize
commit 1463edca6734d42ab4406fa2896e20b45478ea36 upstream.

The size is always equal to 1 page so let's use this. Later on this will
be used for other checks which use page shifts to check the granularity
of access.

This should cause no behavioral change.

Cc: stable@vger.kernel.org # v4.12+
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 11:25:08 +02:00
Gustavo A. R. Silva
a5b8eae536 vfio/pci: Fix potential Spectre v1
commit 0e714d27786ce1fb3efa9aac58abc096e68b1c2a upstream.

info.index can be indirectly controlled by user-space, hence leading
to a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/vfio/pci/vfio_pci.c:734 vfio_pci_ioctl()
warn: potential spectre issue 'vdev->region'

Fix this by sanitizing info.index before indirectly using it to index
vdev->region

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 11:25:08 +02:00
Jason Gunthorpe
5d8ddc819c vfio: Use get_user_pages_longterm correctly
commit bb94b55af3461e26b32f0e23d455abeae0cfca5d upstream.

The patch noted in the fixes below converted get_user_pages_fast() to
get_user_pages_longterm(), however the two calls differ in a few ways.

First _fast() is documented to not require the mmap_sem, while _longterm()
is documented to need it. Hold the mmap sem as required.

Second, _fast accepts an 'int write' while _longterm uses 'unsigned int
gup_flags', so the expression '!!(prot & IOMMU_WRITE)' is only working by
luck as FOLL_WRITE is currently == 0x1. Use the expected FOLL_WRITE
constant instead.

Fixes: 94db151dc892 ("vfio: disable filesystem-dax page pinning")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:15 +02:00
Alex Williamson
2ccdea040e vfio/pci: Virtualize Maximum Read Request Size
commit cf0d53ba4947aad6e471491d5b20a567cbe92e56 upstream.

MRRS defines the maximum read request size a device is allowed to
make.  Drivers will often increase this to allow more data transfer
with a single request.  Completions to this request are bound by the
MPS setting for the bus.  Aside from device quirks (none known), it
doesn't seem to make sense to set an MRRS value less than MPS, yet
this is a likely scenario given that user drivers do not have a
system-wide view of the PCI topology.  Virtualize MRRS such that the
user can set MRRS >= MPS, but use MPS as the floor value that we'll
write to hardware.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24 09:36:34 +02:00
Dan Williams
d5168ce354 vfio: disable filesystem-dax page pinning
commit 94db151dc89262bfa82922c44e8320cea2334667 upstream.

Filesystem-DAX is incompatible with 'longterm' page pinning. Without
page cache indirection a DAX mapping maps filesystem blocks directly.
This means that the filesystem must not modify a file's block map while
any page in a mapping is pinned. In order to prevent the situation of
userspace holding of filesystem operations indefinitely, disallow
'longterm' Filesystem-DAX mappings.

RDMA has the same conflict and the plan there is to add a 'with lease'
mechanism to allow the kernel to notify userspace that the mapping is
being torn down for block-map maintenance. Perhaps something similar can
be put in place for vfio.

Note that xfs and ext4 still report:

   "DAX enabled. Warning: EXPERIMENTAL, use at your own risk"

...at mount time, and resolving the dax-dma-vs-truncate problem is one
of the last hurdles to remove that designation.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org>
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
Fixes: d475c6346a ("dax,ext2: replace XIP read and write with DAX I/O")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-08 22:41:06 -08:00
Alex Williamson
b27bbf1f5b vfio/pci: Virtualize Maximum Payload Size
[ Upstream commit 523184972b282cd9ca17a76f6ca4742394856818 ]

With virtual PCI-Express chipsets, we now see userspace/guest drivers
trying to match the physical MPS setting to a virtual downstream port.
Of course a lone physical device surrounded by virtual interconnects
cannot make a correct decision for a proper MPS setting.  Instead,
let's virtualize the MPS control register so that writes through to
hardware are disallowed.  Userspace drivers like QEMU assume they can
write anything to the device and we'll filter out anything dangerous.
Since mismatched MPS can lead to AER and other faults, let's add it
to the kernel side rather than relying on userspace virtualization to
handle it.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-25 14:26:29 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Arvind Yadav
417fb50d55 vfio: platform: constify amba_id
amba_id are not supposed to change at runtime. All functions
working with const amba_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-08-30 14:03:42 -06:00
Alex Williamson
6586b561a9 vfio: Stall vfio_del_group_dev() for container group detach
When the user unbinds the last device of a group from a vfio bus
driver, the devices within that group should be available for other
purposes.  We currently have a race that makes this generally, but
not always true.  The device can be unbound from the vfio bus driver,
but remaining IOMMU context of the group attached to the container
can result in errors as the next driver configures DMA for the device.

Wait for the group to be detached from the IOMMU backend before
allowing the bus driver remove callback to complete.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-08-30 14:02:16 -06:00
Eric Auger
d935ad91f0 vfio: fix noiommu vfio_iommu_group_get reference count
In vfio_iommu_group_get() we want to increase the reference
count of the iommu group.

In noiommu case, the group does not exist and is allocated.
iommu_group_add_device() increases the group ref count. However we
then call iommu_group_put() which decrements it.

This leads to a "refcount_t: underflow WARN_ON".

Only decrement the ref count in case of iommu_group_add_device
failure.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-08-30 14:00:47 -06:00
Robin Murphy
f203f7f1db vfio/type1: Give hardware MSI regions precedence
If the IOMMU driver advertises 'real' reserved regions for MSIs, but
still includes the software-managed region as well, we are currently
blind to the former and will configure the IOMMU domain to map MSIs into
the latter, which is unlikely to work as expected.

Since it would take a ridiculous hardware topology for both regions to
be valid (which would be rather difficult to support in general), we
should be safe to assume that the presence of any hardware regions makes
the software region irrelevant. However, the IOMMU driver might still
advertise the software region by default, particularly if the hardware
regions are filled in elsewhere by generic code, so it might not be fair
for VFIO to be super-strict about not mixing them. To that end, make
vfio_iommu_has_sw_msi() robust against the presence of both region types
at once, so that we end up doing what is almost certainly right, rather
than what is almost certainly wrong.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-08-10 13:11:50 -06:00
Robin Murphy
db406cc0ac vfio/type1: Cope with hardware MSI reserved regions
For ARM-based systems with a GICv3 ITS to provide interrupt isolation,
but hardware limitations which are worked around by having MSIs bypass
SMMU translation (e.g. HiSilicon Hip06/Hip07), VFIO neglects to check
for the IRQ_DOMAIN_FLAG_MSI_REMAP capability, (and thus erroneously
demands unsafe_interrupts) if a software-managed MSI region is absent.

Fix this by always checking for isolation capability at both the IRQ
domain and IOMMU domain levels, rather than predicating that on whether
MSIs require an IOMMU mapping (which was always slightly tenuous logic).

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-08-10 13:11:49 -06:00
Alex Williamson
796b755066 vfio/pci: Fix handling of RC integrated endpoint PCIe capability size
Root complex integrated endpoints do not have a link and therefore may
use a smaller PCIe capability in config space than we expect when
building our config map.  Add a case for these to avoid reporting an
erroneous overlap.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-07-27 10:39:33 -06:00
Alex Williamson
9f47803503 vfio/pci: Use pci_try_reset_function() on initial open
Device lock bites again; if a device .remove() callback races a user
calling ioctl(VFIO_GROUP_GET_DEVICE_FD), the unbind request will hold
the device lock, but the user ioctl may have already taken a vfio_device
reference.  In the case of a PCI device, the initial open will attempt
to reset the device, which again attempts to get the device lock,
resulting in deadlock.  Use the trylock PCI reset interface and return
error on the open path if reset fails due to lock contention.

Link: https://lkml.org/lkml/2017/7/25/381
Reported-by: Wen Congyang <wencongyang2@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-07-26 14:33:15 -06:00
Linus Torvalds
8c6f5e7359 Merge tag 'vfio-v4.13-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - Include Intel XXV710 in INTx workaround (Alex Williamson)

 - Make use of ERR_CAST() for error return (Dan Carpenter)

 - Fix vfio_group release deadlock from iommu notifier (Alex Williamson)

 - Unset KVM-VFIO attributes only on group match (Alex Williamson)

 - Fix release path group/file matching with KVM-VFIO (Alex Williamson)

 - Remove unnecessary lock uses triggering lockdep splat (Alex Williamson)

* tag 'vfio-v4.13-rc1' of git://github.com/awilliam/linux-vfio:
  vfio: Remove unnecessary uses of vfio_container.group_lock
  vfio: New external user group/file match
  kvm-vfio: Decouple only when we match a group
  vfio: Fix group release deadlock
  vfio: Use ERR_CAST() instead of open coding it
  vfio/pci: Add Intel XXV710 to hidden INTx devices
2017-07-13 12:23:54 -07:00
Alex Williamson
7f56c30bd0 vfio: Remove unnecessary uses of vfio_container.group_lock
The original intent of vfio_container.group_lock is to protect
vfio_container.group_list, however over time it's become a crutch to
prevent changes in container composition any time we call into the
iommu driver backend.  This introduces problems when we start to have
more complex interactions, for example when a user's DMA unmap request
triggers a notification to an mdev vendor driver, who responds by
attempting to unpin mappings within that request, re-entering the
iommu backend.  We incorrectly assume that the use of read-locks here
allow for this nested locking behavior, but a poorly timed write-lock
could in fact trigger a deadlock.

The current use of group_lock seems to fall into the trap of locking
code, not data.  Correct that by removing uses of group_lock that are
not directly related to group_list.  Note that the vfio type1 iommu
backend has its own mutex, vfio_iommu.lock, which it uses to protect
itself for each of these interfaces anyway.  The group_lock appears to
be a redundancy for these interfaces and type1 even goes so far as to
release its mutex to allow for exactly the re-entrant code path above.

Reported-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: stable@vger.kernel.org # v4.10+
2017-07-07 15:37:38 -06:00
Alex Williamson
5d6dee80a1 vfio: New external user group/file match
At the point where the kvm-vfio pseudo device wants to release its
vfio group reference, we can't always acquire a new reference to make
that happen.  The group can be in a state where we wouldn't allow a
new reference to be added.  This new helper function allows a caller
to match a file to a group to facilitate this.  Given a file and
group, report if they match.  Thus the caller needs to already have a
group reference to match to the file.  This allows the deletion of a
group without acquiring a new reference.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Cc: stable@vger.kernel.org
2017-06-28 13:50:05 -06:00
Alex Williamson
811642d8d8 vfio: Fix group release deadlock
If vfio_iommu_group_notifier() acquires a group reference and that
reference becomes the last reference to the group, then vfio_group_put
introduces a deadlock code path where we're trying to unregister from
the iommu notifier chain from within a callout of that chain.  Use a
work_struct to release this reference asynchronously.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Cc: stable@vger.kernel.org
2017-06-28 13:49:38 -06:00
Ingo Molnar
ac6424b981 sched/wait: Rename wait_queue_t => wait_queue_entry_t
Rename:

	wait_queue_t		=>	wait_queue_entry_t

'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
which had to carry the name.

Start sorting this out by renaming it to 'wait_queue_entry_t'.

This also allows the real structure name 'struct __wait_queue' to
lose its double underscore and become 'struct wait_queue_entry',
which is the more canonical nomenclature for such data types.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 12:18:27 +02:00