Commit Graph

26900 Commits

Author SHA1 Message Date
Sami Tolvanen
8c25f26ead ANDROID: cfi: remove unnecessary <asm/memory.h> include
This change makes it possible to compile CFI for x86, which doesn't
provide this header file.

Bug: 145297900
Change-Id: I60ad190bb0c2296b67eef2194b72f381e7f94e2c
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2020-04-30 10:08:09 -07:00
Greg Kroah-Hartman
95495cdf37 Merge 4.14.177 into android-4.14-stable
Changes in 4.14.177
	bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads
	net: vxge: fix wrong __VA_ARGS__ usage
	hinic: fix a bug of waitting for IO stopped
	hinic: fix wrong para of wait_for_completion_timeout
	cxgb4/ptp: pass the sign of offset delta in FW CMD
	qlcnic: Fix bad kzalloc null test
	i2c: st: fix missing struct parameter description
	null_blk: Fix the null_add_dev() error path
	null_blk: Handle null_add_dev() failures properly
	null_blk: fix spurious IO errors after failed past-wp access
	x86: Don't let pgprot_modify() change the page encryption bit
	block: keep bdi->io_pages in sync with max_sectors_kb for stacked devices
	irqchip/versatile-fpga: Handle chained IRQs properly
	sched: Avoid scale real weight down to zero
	selftests/x86/ptrace_syscall_32: Fix no-vDSO segfault
	PCI/switchtec: Fix init_completion race condition with poll_wait()
	libata: Remove extra scsi_host_put() in ata_scsi_add_hosts()
	gfs2: Don't demote a glock until its revokes are written
	x86/boot: Use unsigned comparison for addresses
	efi/x86: Ignore the memory attributes table on i386
	genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy()
	block: Fix use-after-free issue accessing struct io_cq
	usb: dwc3: core: add support for disabling SS instances in park mode
	irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependency
	locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps()
	block, bfq: fix use-after-free in bfq_idle_slice_timer_body
	btrfs: remove a BUG_ON() from merge_reloc_roots()
	btrfs: track reloc roots based on their commit root bytenr
	uapi: rename ext2_swab() to swab() and share globally in swab.h
	misc: rtsx: set correct pcr_ops for rts522A
	slub: improve bit diffusion for freelist ptr obfuscation
	ASoC: fix regwmask
	ASoC: dapm: connect virtual mux with default value
	ASoC: dpcm: allow start or stop during pause for backend
	ASoC: topology: use name_prefix for new kcontrol
	usb: gadget: f_fs: Fix use after free issue as part of queue failure
	usb: gadget: composite: Inform controller driver of self-powered
	ALSA: usb-audio: Add mixer workaround for TRX40 and co
	ALSA: hda: Add driver blacklist
	ALSA: hda: Fix potential access overflow in beep helper
	ALSA: ice1724: Fix invalid access for enumerated ctl items
	ALSA: pcm: oss: Fix regression by buffer overflow fix
	ALSA: doc: Document PC Beep Hidden Register on Realtek ALC256
	ALSA: hda/realtek - Set principled PC Beep configuration for ALC256
	media: ti-vpe: cal: fix disable_irqs to only the intended target
	acpi/x86: ignore unspecified bit positions in the ACPI global lock field
	thermal: devfreq_cooling: inline all stubs for CONFIG_DEVFREQ_THERMAL=n
	nvme-fc: Revert "add module to ops template to allow module references"
	PCI/ASPM: Clear the correct bits when enabling L1 substates
	PCI: endpoint: Fix for concurrent memory allocation in OB address region
	KEYS: reaching the keys quotas correctly
	irqchip/versatile-fpga: Apply clear-mask earlier
	MIPS: OCTEON: irq: Fix potential NULL pointer dereference
	ath9k: Handle txpower changes even when TPC is disabled
	signal: Extend exec_id to 64bits
	x86/entry/32: Add missing ASM_CLAC to general_protection entry
	KVM: nVMX: Properly handle userspace interrupt window request
	KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
	KVM: s390: vsie: Fix delivery of addressing exceptions
	KVM: x86: Allocate new rmap and large page tracking when moving memslot
	KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support
	KVM: VMX: fix crash cleanup when KVM wasn't used
	CIFS: Fix bug which the return value by asynchronous read is error
	btrfs: drop block from cache on error in relocation
	crypto: mxs-dcp - fix scatterlist linearization for hash
	ALSA: hda: Initialize power_state field properly
	net: rtnl_configure_link: fix dev flags changes arg to __dev_notify_flags
	powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init()
	x86/speculation: Remove redundant arch_smt_update() invocation
	tools: gpio: Fix out-of-tree build regression
	mm: Use fixed constant in page_frag_alloc instead of size + 1
	dm verity fec: fix memory leak in verity_fec_dtr
	scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point
	arm64: armv8_deprecated: Fix undef_hook mask for thumb setend
	rtc: omap: Use define directive for PIN_CONFIG_ACTIVE_HIGH
	NFS: Fix a page leak in nfs_destroy_unlinked_subrequests()
	ext4: fix a data race at inode->i_blocks
	fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
	ocfs2: no need try to truncate file beyond i_size
	perf tools: Support Python 3.8+ in Makefile
	s390/diag: fix display of diagnose call statistics
	Input: i8042 - add Acer Aspire 5738z to nomux list
	kmod: make request_module() return an error when autoloading is disabled
	cpufreq: powernv: Fix use-after-free
	hfsplus: fix crash and filesystem corruption when deleting files
	libata: Return correct status in sata_pmp_eh_recover_pm() when ATA_DFLAG_DETACH is set
	powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle
	powerpc/64/tm: Don't let userspace set regs->trap via sigreturn
	powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries
	powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs
	powerpc/kprobes: Ignore traps that happened in real mode
	scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
	powerpc: Add attributes for setjmp/longjmp
	powerpc: Make setjmp/longjmp signature standard
	Btrfs: fix crash during unmount due to race with delayed inode workers
	btrfs: use nofs allocations for running delayed items
	dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone()
	crypto: caam - update xts sector size for large input length
	drm/dp_mst: Fix clearing payload state on topology disable
	drm: Remove PageReserved manipulation from drm_pci_alloc
	ftrace/kprobe: Show the maxactive number on kprobe_events
	ipmi: fix hung processes in __get_guid()
	powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
	misc: echo: Remove unnecessary parentheses and simplify check for zero
	mfd: dln2: Fix sanity checking for endpoints
	amd-xgbe: Use __napi_schedule() in BH context
	hsr: check protocol version in hsr_newlink()
	net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin
	net: ipv6: do not consider routes via gateways for anycast address check
	net: qrtr: send msgs from local of same id as broadcast
	net: revert default NAPI poll timeout to 2 jiffies
	net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes
	scsi: ufs: Fix ufshcd_hold() caused scheduling while atomic
	jbd2: improve comments about freeing data buffers whose page mapping is NULL
	pwm: pca9685: Fix PWM/GPIO inter-operation
	ext4: fix incorrect group count in ext4_fill_super error message
	ext4: fix incorrect inodes per group in error message
	ASoC: Intel: mrfld: fix incorrect check on p->sink
	ASoC: Intel: mrfld: return error codes when an error occurs
	ALSA: usb-audio: Don't override ignore_ctl_error value from the map
	tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation
	btrfs: check commit root generation in should_ignore_root
	mac80211_hwsim: Use kstrndup() in place of kasprintf()
	ext4: do not zeroout extents beyond i_disksize
	dm flakey: check for null arg_name in parse_features()
	kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
	scsi: target: remove boilerplate code
	scsi: target: fix hang when multiple threads try to destroy the same iscsi session
	x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE
	x86/intel_rdt: Enumerate L2 Code and Data Prioritization (CDP) feature
	x86/intel_rdt: Add two new resources for L2 Code and Data Prioritization (CDP)
	x86/intel_rdt: Enable L2 CDP in MSR IA32_L2_QOS_CFG
	x86/resctrl: Preserve CDP enable over CPU hotplug
	x86/resctrl: Fix invalid attempt at removing the default resource group
	mm/vmalloc.c: move 'area->pages' after if statement
	objtool: Fix switch table detection in .text.unlikely
	scsi: sg: add sg_remove_request in sg_common_write
	ext4: use non-movable memory for superblock readahead
	arm, bpf: Fix bugs with ALU64 {RSH, ARSH} BPF_K shift by 0
	netfilter: nf_tables: report EOPNOTSUPP on unsupported flags/object type
	irqchip/mbigen: Free msi_desc on device teardown
	ALSA: hda: Don't release card at firmware loading error
	lib/raid6: use vdupq_n_u8 to avoid endianness warnings
	video: fbdev: sis: Remove unnecessary parentheses and commented code
	drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem
	clk: Fix debugfs_create_*() usage
	Revert "gpio: set up initial state from .get_direction()"
	arm64: perf: remove unsupported events for Cortex-A73
	arm64: traps: Don't print stack or raw PC/LR values in backtraces
	arch_topology: Fix section miss match warning due to free_raw_capacity()
	wil6210: increase firmware ready timeout
	wil6210: fix temperature debugfs
	scsi: ufs: make sure all interrupts are processed
	scsi: ufs: ufs-qcom: remove broken hci version quirk
	wil6210: rate limit wil_rx_refill error
	rpmsg: glink: use put_device() if device_register fail
	rtc: pm8xxx: Fix issue in RTC write path
	rpmsg: glink: Fix missing mutex_init() in qcom_glink_alloc_channel()
	rpmsg: glink: smem: Ensure ordering during tx
	wil6210: fix PCIe bus mastering in case of interface down
	wil6210: add block size checks during FW load
	wil6210: fix length check in __wmi_send
	wil6210: abort properly in cfg suspend
	soc: qcom: smem: Use le32_to_cpu for comparison
	of: fix missing kobject init for !SYSFS && OF_DYNAMIC config
	rbd: avoid a deadlock on header_rwsem when flushing notifies
	rbd: call rbd_dev_unprobe() after unwatching and flushing notifies
	of: unittest: kmemleak in of_unittest_platform_populate()
	clk: at91: usb: continue if clk_hw_round_rate() return zero
	power: supply: bq27xxx_battery: Silence deferred-probe error
	clk: tegra: Fix Tegra PMC clock out parents
	soc: imx: gpc: fix power up sequencing
	rtc: 88pm860x: fix possible race condition
	NFSv4/pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid()
	NFS: direct.c: Fix memory leak of dreq when nfs_get_lock_context fails
	s390/cpuinfo: fix wrong output when CPU0 is offline
	powerpc/maple: Fix declaration made after definition
	ext4: do not commit super on read-only bdev
	include/linux/swapops.h: correct guards for non_swap_entry()
	percpu_counter: fix a data race at vm_committed_as
	compiler.h: fix error in BUILD_BUG_ON() reporting
	KVM: s390: vsie: Fix possible race when shadowing region 3 tables
	x86: ACPI: fix CPU hotplug deadlock
	drm/amdkfd: kfree the wrong pointer
	NFS: Fix memory leaks in nfs_pageio_stop_mirroring()
	iommu/vt-d: Fix mm reference leak
	ext2: fix empty body warnings when -Wextra is used
	ext2: fix debug reference to ext2_xattr_cache
	libnvdimm: Out of bounds read in __nd_ioctl()
	iommu/amd: Fix the configuration of GCR3 table root pointer
	net: dsa: bcm_sf2: Fix overflow checks
	fbdev: potential information leak in do_fb_ioctl()
	tty: evh_bytechan: Fix out of bounds accesses
	locktorture: Print ratio of acquisitions, not failures
	mtd: lpddr: Fix a double free in probe()
	mtd: phram: fix a double free issue in error path
	KEYS: Use individual pages in big_key for crypto buffers
	KEYS: Don't write out to userspace while holding key semaphore
	Linux 4.14.177

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5eb89921eb63ee9e92a031fc6f3a10d9e2616358
2020-04-24 08:41:10 +02:00
Paul E. McKenney
7c1449d4e8 locktorture: Print ratio of acquisitions, not failures
commit 80c503e0e68fbe271680ab48f0fe29bc034b01b7 upstream.

The __torture_print_stats() function in locktorture.c carefully
initializes local variable "min" to statp[0].n_lock_acquired, but
then compares it to statp[i].n_lock_fail.  Given that the .n_lock_fail
field should normally be zero, and given the initialization, it seems
reasonable to display the maximum and minimum number acquisitions
instead of miscomputing the maximum and minimum number of failures.
This commit therefore switches from failures to acquisitions.

And this turns out to be not only a day-zero bug, but entirely my
own fault.  I hate it when that happens!

Fixes: 0af3fe1efa ("locktorture: Add a lock-torture kernel module")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24 08:01:23 +02:00
Xiao Yang
7054f86f26 tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation
commit 0bbe7f719985efd9adb3454679ecef0984cb6800 upstream.

Traced event can trigger 'snapshot' operation(i.e. calls snapshot_trigger()
or snapshot_count_trigger()) when register_snapshot_trigger() has completed
registration but doesn't allocate buffer for 'snapshot' event trigger.  In
the rare case, 'snapshot' operation always detects the lack of allocated
buffer so make register_snapshot_trigger() allocate buffer first.

trigger-snapshot.tc in kselftest reproduces the issue on slow vm:
-----------------------------------------------------------
cat trace
...
ftracetest-3028  [002] ....   236.784290: sched_process_fork: comm=ftracetest pid=3028 child_comm=ftracetest child_pid=3036
     <...>-2875  [003] ....   240.460335: tracing_snapshot_instance_cond: *** SNAPSHOT NOT ALLOCATED ***
     <...>-2875  [003] ....   240.460338: tracing_snapshot_instance_cond: *** stopping trace here!   ***
-----------------------------------------------------------

Link: http://lkml.kernel.org/r/20200414015145.66236-1-yangx.jy@cn.fujitsu.com

Cc: stable@vger.kernel.org
Fixes: 93e31ffbf4 ("tracing: Add 'snapshot' event trigger command")
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24 08:00:56 +02:00
Masami Hiramatsu
abd1348c09 ftrace/kprobe: Show the maxactive number on kprobe_events
[ Upstream commit 6a13a0d7b4d1171ef9b80ad69abc37e1daa941b3 ]

Show maxactive parameter on kprobe_events.
This allows user to save the current configuration and
restore it without losing maxactive parameter.

Link: http://lkml.kernel.org/r/4762764a-6df7-bc93-ed60-e336146dce1f@gmail.com
Link: http://lkml.kernel.org/r/158503528846.22706.5549974121212526020.stgit@devnote2

Cc: stable@vger.kernel.org
Fixes: 696ced4fb1 ("tracing/kprobes: expose maxactive for kretprobe in kprobe_events")
Reported-by: Taeung Song <treeze.taeung@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24 08:00:50 +02:00
Eric Biggers
ad2259b81d kmod: make request_module() return an error when autoloading is disabled
commit d7d27cfc5cf0766a26a8f56868c5ad5434735126 upstream.

Patch series "module autoloading fixes and cleanups", v5.

This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via
'echo > /proc/sys/kernel/modprobe'.

It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u)
bydocumenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().

This patch (of 4):

It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.

This can be preferable to setting it to a nonexistent file since it
avoids the overhead of an attempted execve(), avoids potential
deadlocks, and avoids the call to security_kernel_module_request() and
thus on SELinux-based systems eliminates the need to write SELinux rules
to dontaudit module_request.

However, when module autoloading is disabled in this way,
request_module() returns 0.  This is broken because callers expect 0 to
mean that the module was successfully loaded.

Apparently this was never noticed because this method of disabling
module autoloading isn't used much, and also most callers don't use the
return value of request_module() since it's always necessary to check
whether the module registered its functionality or not anyway.

But improperly returning 0 can indeed confuse a few callers, for example
get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:

	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
		fs = __get_fs_type(name, len);
		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
	}

This is easily reproduced with:

	echo > /proc/sys/kernel/modprobe
	mount -t NONEXISTENT none /

It causes:

	request_module fs-NONEXISTENT succeeded, but still no fs?
	WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
	[...]

This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails.  So let's make it return -ENOENT, which matches the error when
the modprobe binary doesn't exist.

I've also sent patches to document and test this case.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Ben Hutchings <benh@debian.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24 08:00:44 +02:00
Zhenzhong Duan
1733d2a94f x86/speculation: Remove redundant arch_smt_update() invocation
commit 34d66caf251df91ff27b24a3a786810d29989eca upstream.

With commit a74cfffb03b7 ("x86/speculation: Rework SMT state change"),
arch_smt_update() is invoked from each individual CPU hotplug function.

Therefore the extra arch_smt_update() call in the sysfs SMT control is
redundant.

Fixes: a74cfffb03b7 ("x86/speculation: Rework SMT state change")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <konrad.wilk@oracle.com>
Cc: <dwmw@amazon.co.uk>
Cc: <bp@suse.de>
Cc: <srinivas.eeda@oracle.com>
Cc: <peterz@infradead.org>
Cc: <hpa@zytor.com>
Link: https://lkml.kernel.org/r/e2e064f2-e8ef-42ca-bf4f-76b612964752@default
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24 08:00:41 +02:00
Eric W. Biederman
28c63ef17d signal: Extend exec_id to 64bits
commit d1e7fd6462ca9fc76650fbe6ca800e35b24267da upstream.

Replace the 32bit exec_id with a 64bit exec_id to make it impossible
to wrap the exec_id counter.  With care an attacker can cause exec_id
wrap and send arbitrary signals to a newly exec'd parent.  This
bypasses the signal sending checks if the parent changes their
credentials during exec.

The severity of this problem can been seen that in my limited testing
of a 32bit exec_id it can take as little as 19s to exec 65536 times.
Which means that it can take as little as 14 days to wrap a 32bit
exec_id.  Adam Zabrocki has succeeded wrapping the self_exe_id in 7
days.  Even my slower timing is in the uptime of a typical server.
Which means self_exec_id is simply a speed bump today, and if exec
gets noticably faster self_exec_id won't even be a speed bump.

Extending self_exec_id to 64bits introduces a problem on 32bit
architectures where reading self_exec_id is no longer atomic and can
take two read instructions.  Which means that is is possible to hit
a window where the read value of exec_id does not match the written
value.  So with very lucky timing after this change this still
remains expoiltable.

I have updated the update of exec_id on exec to use WRITE_ONCE
and the read of exec_id in do_notify_parent to use READ_ONCE
to make it clear that there is no locking between these two
locations.

Link: https://lore.kernel.org/kernel-hardening/20200324215049.GA3710@pi3.com.pl
Fixes: 2.3.23pre2
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-24 08:00:38 +02:00
Boqun Feng
105d88c02f locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps()
[ Upstream commit 25016bd7f4caf5fc983bbab7403d08e64cba3004 ]

Qian Cai reported a bug when PROVE_RCU_LIST=y, and read on /proc/lockdep
triggered a warning:

  [ ] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
  ...
  [ ] Call Trace:
  [ ]  lock_is_held_type+0x5d/0x150
  [ ]  ? rcu_lockdep_current_cpu_online+0x64/0x80
  [ ]  rcu_read_lock_any_held+0xac/0x100
  [ ]  ? rcu_read_lock_held+0xc0/0xc0
  [ ]  ? __slab_free+0x421/0x540
  [ ]  ? kasan_kmalloc+0x9/0x10
  [ ]  ? __kmalloc_node+0x1d7/0x320
  [ ]  ? kvmalloc_node+0x6f/0x80
  [ ]  __bfs+0x28a/0x3c0
  [ ]  ? class_equal+0x30/0x30
  [ ]  lockdep_count_forward_deps+0x11a/0x1a0

The warning got triggered because lockdep_count_forward_deps() call
__bfs() without current->lockdep_recursion being set, as a result
a lockdep internal function (__bfs()) is checked by lockdep, which is
unexpected, and the inconsistency between the irq-off state and the
state traced by lockdep caused the warning.

Apart from this warning, lockdep internal functions like __bfs() should
always be protected by current->lockdep_recursion to avoid potential
deadlocks and data inconsistency, therefore add the
current->lockdep_recursion on-and-off section to protect __bfs() in both
lockdep_count_forward_deps() and lockdep_count_backward_deps()

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200312151258.128036-1-boqun.feng@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24 08:00:29 +02:00
Alexander Sverdlin
b8efdd0c43 genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy()
[ Upstream commit 87f2d1c662fa1761359fdf558246f97e484d177a ]

irq_domain_alloc_irqs_hierarchy() has 3 call sites in the compilation unit
but only one of them checks for the pointer which is being dereferenced
inside the called function. Move the check into the function. This allows
for catching the error instead of the following crash:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
PC is at 0x0
LR is at gpiochip_hierarchy_irq_domain_alloc+0x11f/0x140
...
[<c06c23ff>] (gpiochip_hierarchy_irq_domain_alloc)
[<c0462a89>] (__irq_domain_alloc_irqs)
[<c0462dad>] (irq_create_fwspec_mapping)
[<c06c2251>] (gpiochip_to_irq)
[<c06c1c9b>] (gpiod_to_irq)
[<bf973073>] (gpio_irqs_init [gpio_irqs])
[<bf974048>] (gpio_irqs_exit+0xecc/0xe84 [gpio_irqs])
Code: bad PC value

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200306174720.82604-1-alexander.sverdlin@nokia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24 08:00:28 +02:00
Michael Wang
964a86d291 sched: Avoid scale real weight down to zero
[ Upstream commit 26cf52229efc87e2effa9d788f9b33c40fb3358a ]

During our testing, we found a case that shares no longer
working correctly, the cgroup topology is like:

  /sys/fs/cgroup/cpu/A		(shares=102400)
  /sys/fs/cgroup/cpu/A/B	(shares=2)
  /sys/fs/cgroup/cpu/A/B/C	(shares=1024)

  /sys/fs/cgroup/cpu/D		(shares=1024)
  /sys/fs/cgroup/cpu/D/E	(shares=1024)
  /sys/fs/cgroup/cpu/D/E/F	(shares=1024)

The same benchmark is running in group C & F, no other tasks are
running, the benchmark is capable to consumed all the CPUs.

We suppose the group C will win more CPU resources since it could
enjoy all the shares of group A, but it's F who wins much more.

The reason is because we have group B with shares as 2, since
A->cfs_rq.load.weight == B->se.load.weight == B->shares/nr_cpus,
so A->cfs_rq.load.weight become very small.

And in calc_group_shares() we calculate shares as:

  load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
  shares = (tg_shares * load) / tg_weight;

Since the 'cfs_rq->load.weight' is too small, the load become 0
after scale down, although 'tg_shares' is 102400, shares of the se
which stand for group A on root cfs_rq become 2.

While the se of D on root cfs_rq is far more bigger than 2, so it
wins the battle.

Thus when scale_load_down() scale real weight down to 0, it's no
longer telling the real story, the caller will have the wrong
information and the calculation will be buggy.

This patch add check in scale_load_down(), so the real weight will
be >= MIN_SHARES after scale, after applied the group C wins as
expected.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/38e8e212-59a1-64b2-b247-b6d0b52d8dc1@linux.alibaba.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24 08:00:26 +02:00
Kelly Rossmoyer
f20eb0ba96 ANDROID: fix wakeup reason findings
The 0-day test bot found three minor issues in the wakeup_reason
enhancements patch, including two undeclared functions that should have
been static, an allegedly uninitialized pointer (which is actually set
in the line immediately prior to cppcheck's complaint), and a type
mismatch when printing timespec64 fields on a 32-bit build.

These changes address those findings.

Fixes: 8c29afa601 ("ANDROID: power: wakeup_reason: wake reason
enhancements")
Bug: 153727431
Reported-by: kbuild test robot <lkp@intel.com>
Change-Id: I9194f85d0ca7921461866b73dc24e1783b1da6c6
Signed-off-by: Kelly Rossmoyer <krossmo@google.com>
2020-04-23 15:55:02 -07:00
Masahiro Yamada
4c13d6ace5 UPSTREAM: kheaders: include only headers into kheaders_data.tar.xz
Currently, kheaders_data.tar.xz contains some build scripts as well as
headers. None of them is needed in the header archive.

For ARCH=x86, this commit excludes the following from the archive:

  arch/x86/include/asm/Kbuild
  arch/x86/include/uapi/asm/Kbuild
  include/asm-generic/Kbuild
  include/config/auto.conf
  include/config/kernel.release
  include/config/tristate.conf
  include/uapi/asm-generic/Kbuild
  include/uapi/Kbuild
  kernel/gen_kheaders.sh

This change is actually motivated for the planned header compile-testing
because it will generate more build artifacts, which should not be
included in the archive.

Change-Id: I688e041842740216cace0373ca9f358bc7704809
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
(cherry picked from commit 7199ff7d74003b5aad1e6328bf6128cd8ceea735)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
2020-04-16 15:37:40 -07:00
Masahiro Yamada
bb7cd18b32 UPSTREAM: kheaders: remove meaningless -R option of 'ls'
The -R option of 'ls' is supposed to be used for directories.

   -R, --recursive
          list subdirectories recursively

Since 'find ... -type f' only matches to regular files, we do not
expect directories passed to the 'ls' command here.

Giving -R is harmless at least, but unneeded.

Change-Id: I73588f18e40824ccecc4149fbc467015b5c5e142
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
(cherry picked from commit b60b7c2ea9b7f854d457fefd592c77f621a86580)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
2020-04-16 15:37:39 -07:00
Greg Kroah-Hartman
341ba4f053 Merge 4.14.176 into android-4.14-stable
Changes in 4.14.176
	ipv4: fix a RCU-list lock in fib_triestat_seq_show
	net, ip_tunnel: fix interface lookup with no key
	sctp: fix refcount bug in sctp_wfree
	sctp: fix possibly using a bad saddr with a given dst
	drm/bochs: downgrade pci_request_region failure from error to warning
	initramfs: restore default compression behavior
	tools/power turbostat: Fix gcc build warnings
	drm/etnaviv: replace MMU flush marker with flush sequence
	blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter
	blk-mq: Allow blocking queue tag iter callbacks
	misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices
	coresight: do not use the BIT() macro in the UAPI header
	padata: always acquire cpu_hotplug_lock before pinst->lock
	mm: mempolicy: require at least one nodeid for MPOL_PREFERRED
	ipv6: don't auto-add link-local address to lag ports
	net: dsa: bcm_sf2: Ensure correct sub-node is parsed
	net: phy: micrel: kszphy_resume(): add delay after genphy_resume() before accessing PHY registers
	net: stmmac: dwmac1000: fix out-of-bounds mac address reg setting
	slcan: Don't transmit uninitialized stack data in padding
	mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_VLAN_MANGLE
	random: always use batched entropy for get_random_u{32,64}
	tools/accounting/getdelays.c: fix netlink attribute length
	hwrng: imx-rngc - fix an error path
	ASoC: jz4740-i2s: Fix divider written at incorrect offset in register
	IB/hfi1: Call kobject_put() when kobject_init_and_add() fails
	IB/hfi1: Fix memory leaks in sysfs registration and unregistration
	ceph: remove the extra slashes in the server path
	ceph: canonicalize server path in place
	Bluetooth: RFCOMM: fix ODEBUG bug in rfcomm_dev_ioctl
	RDMA/cm: Update num_paths in cma_resolve_iboe_route error flow
	fbcon: fix null-ptr-deref in fbcon_switch
	acpi/nfit: Fix bus command validation
	clk: qcom: rcg: Return failure for RCG update
	drm/msm: stop abusing dma_map/unmap for cache
	arm64: Fix size of __early_cpu_boot_status
	rpmsg: glink: Remove chunk size word align warning
	usb: dwc3: don't set gadget->is_otg flag
	drm_dp_mst_topology: fix broken drm_dp_sideband_parse_remote_dpcd_read()
	rpmsg: glink: smem: Support rx peak for size less than 4 bytes
	drm/msm: Use the correct dma_sync calls in msm_gem
	Linux 4.14.176

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I15dbb5a2b6b015683408249990a95894658f611a
2020-04-13 13:04:53 +02:00
Daniel Jordan
2d8260be1c padata: always acquire cpu_hotplug_lock before pinst->lock
commit 38228e8848cd7dd86ccb90406af32de0cad24be3 upstream.

lockdep complains when padata's paths to update cpumasks via CPU hotplug
and sysfs are both taken:

  # echo 0 > /sys/devices/system/cpu/cpu1/online
  # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask

  ======================================================
  WARNING: possible circular locking dependency detected
  5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted
  ------------------------------------------------------
  bash/205 is trying to acquire lock:
  ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120

  but task is already holding lock:
  ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120

  which lock already depends on the new lock.

padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent
order.  Which should be first?  CPU hotplug calls into padata with
cpu_hotplug_lock already held, so it should have priority.

Fixes: 6751fb3c0e ("padata: Use get_online_cpus/put_online_cpus")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13 10:34:27 +02:00
Kelly Rossmoyer
8c29afa601 ANDROID: power: wakeup_reason: wake reason enhancements
These changes build upon the existing Android kernel wakeup reason code
to:
* improve the positioning of suspend abort logging calls in suspend flow
* add logging of abnormal wakeup reasons like unexpected HW IRQs and
  IRQs configured as both wake-enabled and no-suspend
* add support for capturing deferred-processing threaded nested IRQs as
  wakeup reasons rather than their synchronously-processed parents

Bug: 150970830
Bug: 140217217

Signed-off-by: Kelly Rossmoyer <krossmo@google.com>
Change-Id: I903b811a0fe11a605a25815c3a341668a23de700
2020-04-09 15:28:36 +00:00
Eric Biggers
64029e7ca4 FROMLIST: kmod: make request_module() return an error when autoloading is disabled
It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.  This can be preferable
to setting it to a nonexistent file since it avoids the overhead of an
attempted execve(), avoids potential deadlocks, and avoids the call to
security_kernel_module_request() and thus on SELinux-based systems
eliminates the need to write SELinux rules to dontaudit module_request.

However, when module autoloading is disabled in this way,
request_module() returns 0.  This is broken because callers expect 0 to
mean that the module was successfully loaded.

Apparently this was never noticed because this method of disabling
module autoloading isn't used much, and also most callers don't use the
return value of request_module() since it's always necessary to check
whether the module registered its functionality or not anyway.  But
improperly returning 0 can indeed confuse a few callers, for example
get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:

	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
		fs = __get_fs_type(name, len);
		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
	}

This is easily reproduced with:

	echo > /proc/sys/kernel/modprobe
	mount -t NONEXISTENT none /

It causes:

	request_module fs-NONEXISTENT succeeded, but still no fs?
	WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
	[...]

This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails.  So let's make it return -ENOENT, which matches the error when
the modprobe binary doesn't exist.

I've also sent patches to document and test this case.

Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: NeilBrown <neilb@suse.com>
Link: https://lore.kernel.org/r/20200318230515.171692-2-ebiggers@kernel.org
Bug: 151589316
Change-Id: I5e04f85e12a4f85da23e53bc11da1ade565abcd6
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-04-06 10:45:02 -07:00
Greg Kroah-Hartman
fae4e1d295 Merge 4.14.175 into android-4.14
Changes in 4.14.175
	spi: qup: call spi_qup_pm_resume_runtime before suspending
	powerpc: Include .BTF section
	ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes
	spi: pxa2xx: Add CS control clock quirk
	spi/zynqmp: remove entry that causes a cs glitch
	drm/exynos: dsi: propagate error value and silence meaningless warning
	drm/exynos: dsi: fix workaround for the legacy clock name
	drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer
	altera-stapl: altera_get_note: prevent write beyond end of 'key'
	dm bio record: save/restore bi_end_io and bi_integrity
	xenbus: req->body should be updated before req->state
	xenbus: req->err should be updated before req->state
	block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group()
	parse-maintainers: Mark as executable
	USB: Disable LPM on WD19's Realtek Hub
	usb: quirks: add NO_LPM quirk for RTL8153 based ethernet adapters
	USB: serial: option: add ME910G1 ECM composition 0x110b
	usb: host: xhci-plat: add a shutdown
	USB: serial: pl2303: add device-id for HP LD381
	usb: xhci: apply XHCI_SUSPEND_DELAY to AMD XHCI controller 1022:145c
	ALSA: line6: Fix endless MIDI read loop
	ALSA: seq: virmidi: Fix running status after receiving sysex
	ALSA: seq: oss: Fix running status after receiving sysex
	ALSA: pcm: oss: Avoid plugin buffer overflow
	ALSA: pcm: oss: Remove WARNING from snd_pcm_plug_alloc() checks
	iio: trigger: stm32-timer: disable master mode when stopping
	iio: magnetometer: ak8974: Fix negative raw values in sysfs
	mmc: sdhci-of-at91: fix cd-gpios for SAMA5D2
	staging: rtl8188eu: Add device id for MERCUSYS MW150US v2
	staging/speakup: fix get_word non-space look-ahead
	intel_th: Fix user-visible error codes
	intel_th: pci: Add Elkhart Lake CPU support
	rtc: max8907: add missing select REGMAP_IRQ
	xhci: Do not open code __print_symbolic() in xhci trace events
	memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
	mm: slub: be more careful about the double cmpxchg of freelist
	mm, slub: prevent kmalloc_node crashes and memory leaks
	page-flags: fix a crash at SetPageError(THP_SWAP)
	x86/mm: split vmalloc_sync_all()
	USB: cdc-acm: fix close_delay and closing_wait units in TIOCSSERIAL
	USB: cdc-acm: fix rounding error in TIOCSSERIAL
	iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels
	iio: adc: at91-sama5d2_adc: fix differential channels in triggered mode
	kbuild: Disable -Wpointer-to-enum-cast
	futex: Fix inode life-time issue
	futex: Unbreak futex hashing
	Revert "vrf: mark skb for multicast or link-local as enslaved to VRF"
	Revert "ipv6: Fix handling of LLA with VRF and sockets bound to VRF"
	ALSA: hda/realtek: Fix pop noise on ALC225
	arm64: smp: fix smp_send_stop() behaviour
	arm64: smp: fix crash_smp_send_stop() behaviour
	drm/bridge: dw-hdmi: fix AVI frame colorimetry
	staging: greybus: loopback_test: fix potential path truncation
	staging: greybus: loopback_test: fix potential path truncations
	Revert "drm/dp_mst: Skip validating ports during destruction, just ref"
	hsr: fix general protection fault in hsr_addr_is_self()
	macsec: restrict to ethernet devices
	net: dsa: Fix duplicate frames flooded by learning
	net: mvneta: Fix the case where the last poll did not process all rx
	net/packet: tpacket_rcv: avoid a producer race condition
	net: qmi_wwan: add support for ASKEY WWHC050
	net_sched: cls_route: remove the right filter from hashtable
	net_sched: keep alloc_hash updated after hash allocation
	net: stmmac: dwmac-rk: fix error path in rk_gmac_probe
	NFC: fdp: Fix a signedness bug in fdp_nci_send_patch()
	slcan: not call free_netdev before rtnl_unlock in slcan_open
	bnxt_en: fix memory leaks in bnxt_dcbnl_ieee_getets()
	net: dsa: mt7530: Change the LINK bit to reflect the link status
	vxlan: check return value of gro_cells_init()
	hsr: use rcu_read_lock() in hsr_get_node_{list/status}()
	hsr: add restart routine into hsr_get_node_list()
	hsr: set .netnsok flag
	net: ipv4: don't let PMTU updates increase route MTU
	cgroup-v1: cgroup_pidlist_next should update position index
	cpupower: avoid multiple definition with gcc -fno-common
	drivers/of/of_mdio.c:fix of_mdiobus_register()
	cgroup1: don't call release_agent when it is ""
	dt-bindings: net: FMan erratum A050385
	arm64: dts: ls1043a: FMan erratum A050385
	fsl/fman: detect FMan erratum A050385
	scsi: ipr: Fix softlockup when rescanning devices in petitboot
	mac80211: Do not send mesh HWMP PREQ if HWMP is disabled
	dpaa_eth: Remove unnecessary boolean expression in dpaa_get_headroom
	sxgbe: Fix off by one in samsung driver strncpy size arg
	arm64: ptrace: map SPSR_ELx<->PSR for compat tasks
	arm64: compat: map SPSR_ELx<->PSR for signals
	ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare()
	i2c: hix5hd2: add missed clk_disable_unprepare in remove
	Input: synaptics - enable RMI on HP Envy 13-ad105ng
	Input: avoid BIT() macro usage in the serio.h UAPI header
	ARM: dts: dra7: Add bus_dma_limit for L3 bus
	ARM: dts: omap5: Add bus_dma_limit for L3 bus
	perf probe: Do not depend on dwfl_module_addrsym()
	tools: Let O= makes handle a relative path with -C option
	scripts/dtc: Remove redundant YYLOC global declaration
	scsi: sd: Fix optimal I/O size for devices that change reported values
	mac80211: mark station unauthorized before key removal
	gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk
	gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option
	gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model
	RDMA/core: Ensure security pkey modify is not lost
	genirq: Fix reference leaks on irq affinity notifiers
	xfrm: handle NETDEV_UNREGISTER for xfrm device
	vti[6]: fix packet tx through bpf_redirect() in XinY cases
	RDMA/mlx5: Block delay drop to unprivileged users
	xfrm: fix uctx len check in verify_sec_ctx_len
	xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire
	xfrm: policy: Fix doulbe free in xfrm_policy_timer
	netfilter: nft_fwd_netdev: validate family and chain type
	vti6: Fix memory leak of skb if input policy check fails
	Input: raydium_i2c_ts - use true and false for boolean values
	Input: raydium_i2c_ts - fix error codes in raydium_i2c_boot_trigger()
	afs: Fix some tracing details
	USB: serial: option: add support for ASKEY WWHC050
	USB: serial: option: add BroadMobi BM806U
	USB: serial: option: add Wistron Neweb D19Q1
	USB: cdc-acm: restore capability check order
	USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback
	usb: musb: fix crash with highmen PIO and usbmon
	media: flexcop-usb: fix endpoint sanity check
	media: usbtv: fix control-message timeouts
	staging: rtl8188eu: Add ASUS USB-N10 Nano B1 to device table
	staging: wlan-ng: fix ODEBUG bug in prism2sta_disconnect_usb
	staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback
	libfs: fix infoleak in simple_attr_read()
	media: ov519: add missing endpoint sanity checks
	media: dib0700: fix rc endpoint lookup
	media: stv06xx: add missing descriptor sanity checks
	media: xirlink_cit: add missing descriptor sanity checks
	mac80211: Check port authorization in the ieee80211_tx_dequeue() case
	mac80211: fix authentication with iwlwifi/mvm
	vt: selection, introduce vc_is_sel
	vt: ioctl, switch VT_IS_IN_USE and VT_BUSY to inlines
	vt: switch vt_dont_switch to bool
	vt: vt_ioctl: remove unnecessary console allocation checks
	vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use virtual console
	vt: vt_ioctl: fix use-after-free in vt_in_use()
	platform/x86: pmc_atom: Add Lex 2I385SW to critclk_systems DMI table
	bpf: Explicitly memset the bpf_attr structure
	bpf: Explicitly memset some bpf info structures declared on the stack
	gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model
	net: ks8851-ml: Fix IO operations, again
	arm64: alternative: fix build with clang integrated assembler
	perf map: Fix off by one in strncpy() size argument
	ARM: dts: oxnas: Fix clear-mask property
	ARM: bcm2835-rpi-zero-w: Add missing pinctrl name
	arm64: dts: ls1043a-rdb: correct RGMII delay mode to rgmii-id
	arm64: dts: ls1046ardb: set RGMII interfaces to RGMII_ID mode
	Linux 4.14.175

Change-Id: If2c2cb5b3745ed6fbc5cb77737cfb1758fea4cb9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2020-04-03 08:18:27 +02:00
Alistair Delva
c5d2219efd ANDROID: Fix wq fp check for CFI builds
A previous change added a test on the wrong config flag; rename
CFI to CFI_CLANG.

Bug: 145210207
Change-Id: Id8aead2eb2c75ad6442d10165f6cb86ccfb9c2f9
Signed-off-by: Alistair Delva <adelva@google.com>
2020-04-02 21:48:26 +00:00
Greg Kroah-Hartman
7855a721d9 bpf: Explicitly memset some bpf info structures declared on the stack
commit 5c6f25887963f15492b604dd25cb149c501bbabf upstream.

Trying to initialize a structure with "= {};" will not always clean out
all padding locations in a structure. So be explicit and call memset to
initialize everything for a number of bpf information structures that
are then copied from userspace, sometimes from smaller memory locations
than the size of the structure.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:37 +02:00
Greg Kroah-Hartman
ba1ebf3aef bpf: Explicitly memset the bpf_attr structure
commit 8096f229421f7b22433775e928d506f0342e5907 upstream.

For the bpf syscall, we are relying on the compiler to properly zero out
the bpf_attr union that we copy userspace data into. Unfortunately that
doesn't always work properly, padding and other oddities might not be
correctly zeroed, and in some tests odd things have been found when the
stack is pre-initialized to other values.

Fix this by explicitly memsetting the structure to 0 before using it.

Reported-by: Maciej Żenczykowski <maze@google.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Alexander Potapenko <glider@google.com>
Reported-by: Alistair Delva <adelva@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://android-review.googlesource.com/c/kernel/common/+/1235490
Link: https://lore.kernel.org/bpf/20200320094813.GA421650@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:37 +02:00
Edward Cree
cd195db914 genirq: Fix reference leaks on irq affinity notifiers
commit df81dfcfd6991d547653d46c051bac195cd182c1 upstream.

The handling of notify->work did not properly maintain notify->kref in two
 cases:
1) where the work was already scheduled, another irq_set_affinity_locked()
   would get the ref and (no-op-ly) schedule the work.  Thus when
   irq_affinity_notify() ran, it would drop the original ref but not the
   additional one.
2) when cancelling the (old) work in irq_set_affinity_notifier(), if there
   was outstanding work a ref had been got for it but was never put.
Fix both by checking the return values of the work handling functions
 (schedule_work() for (1) and cancel_work_sync() for (2)) and put the
 extra ref if the return value indicates preexisting work.

Fixes: cd7eab44e9 ("genirq: Add IRQ affinity notifiers")
Fixes: 59c39840f5ab ("genirq: Prevent use-after-free and work list corruption")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Link: https://lkml.kernel.org/r/24f5983f-2ab5-e83a-44ee-a45b5f9300f5@solarflare.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:31 +02:00
Tycho Andersen
ef76f1e9da cgroup1: don't call release_agent when it is ""
[ Upstream commit 2e5383d7904e60529136727e49629a82058a5607 ]

Older (and maybe current) versions of systemd set release_agent to "" when
shutting down, but do not set notify_on_release to 0.

Since 64e90a8acb ("Introduce STATIC_USERMODEHELPER to mediate
call_usermodehelper()"), we filter out such calls when the user mode helper
path is "". However, when used in conjunction with an actual (i.e. non "")
STATIC_USERMODEHELPER, the path is never "", so the real usermode helper
will be called with argv[0] == "".

Let's avoid this by not invoking the release_agent when it is "".

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-02 16:34:27 +02:00
Vasily Averin
e244073116 cgroup-v1: cgroup_pidlist_next should update position index
[ Upstream commit db8dd9697238be70a6b4f9d0284cd89f59c0e070 ]

if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.

 # mount | grep cgroup
 # dd if=/mnt/cgroup.procs bs=1  # normal output
...
1294
1295
1296
1304
1382
584+0 records in
584+0 records out
584 bytes copied

dd: /mnt/cgroup.procs: cannot skip to specified offset
83  <<< generates end of last line
1383  <<< ... and whole last line once again
0+1 records in
0+1 records out
8 bytes copied

dd: /mnt/cgroup.procs: cannot skip to specified offset
1386  <<< generates last line anyway
0+1 records in
0+1 records out
5 bytes copied

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-02 16:34:26 +02:00
Thomas Gleixner
25963a6606 futex: Unbreak futex hashing
commit 8d67743653dce5a0e7aa500fcccb237cde7ad88e upstream.

The recent futex inode life time fix changed the ordering of the futex key
union struct members, but forgot to adjust the hash function accordingly,

As a result the hashing omits the leading 64bit and even hashes beyond the
futex key causing a bad hash distribution which led to a ~100% performance
regression.

Hand in the futex key pointer instead of a random struct member and make
the size calculation based of the struct offset.

Fixes: 8019ad13ef7f ("futex: Fix inode life-time issue")
Reported-by: Rong Chen <rong.a.chen@intel.com>
Decoded-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Rong Chen <rong.a.chen@intel.com>
Link: https://lkml.kernel.org/r/87h7yy90ve.fsf@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:21 +02:00
Peter Zijlstra
e52694b56e futex: Fix inode life-time issue
commit 8019ad13ef7f64be44d4f892af9c840179009254 upstream.

As reported by Jann, ihold() does not in fact guarantee inode
persistence. And instead of making it so, replace the usage of inode
pointers with a per boot, machine wide, unique inode identifier.

This sequence number is global, but shared (file backed) futexes are
rare enough that this should not become a performance issue.

Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:21 +02:00
Joerg Roedel
c3cf92fd9b x86/mm: split vmalloc_sync_all()
commit 763802b53a427ed3cbd419dbba255c414fdd9e7c upstream.

Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path.  While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.

Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap().  But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.

To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:

	* vmalloc_sync_mappings(), and
	* vmalloc_sync_unmappings()

Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized.  The only exception is the new call-site added in the
above mentioned commit.

Shile Zhang directed us to a report of an 80% regression in reaim
throughput.

Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[GHES]
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org
Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:20 +02:00
Greg Kroah-Hartman
bb8c715f7f UPSTREAM: bpf: Explicitly memset some bpf info structures declared on the stack
Trying to initialize a structure with "= {};" will not always clean out
all padding locations in a structure. So be explicit and call memset to
initialize everything for a number of bpf information structures that
are then copied from userspace, sometimes from smaller memory locations
than the size of the structure.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com
(cherry picked from commit 269efb7fc478563a7e7b22590d8076823f4ac82a)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I52a2cab20aa310085ec104bd811ac4f2b83657b6
2020-03-23 16:56:52 +01:00
Greg Kroah-Hartman
9affadc76e UPSTREAM: bpf: Explicitly memset the bpf_attr structure
For the bpf syscall, we are relying on the compiler to properly zero out
the bpf_attr union that we copy userspace data into. Unfortunately that
doesn't always work properly, padding and other oddities might not be
correctly zeroed, and in some tests odd things have been found when the
stack is pre-initialized to other values.

Fix this by explicitly memsetting the structure to 0 before using it.

Reported-by: Maciej Żenczykowski <maze@google.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Alexander Potapenko <glider@google.com>
Reported-by: Alistair Delva <adelva@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://android-review.googlesource.com/c/kernel/common/+/1235490
Link: https://lore.kernel.org/bpf/20200320094813.GA421650@kroah.com
(cherry picked from commit 8096f229421f7b22433775e928d506f0342e5907)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2dc28cd45024da5cc6861ff4a9b25fae389cc6d8
2020-03-23 16:46:27 +01:00
Greg Kroah-Hartman
32bc956bc2 Merge 4.14.174 into android-4.14
Changes in 4.14.174
	phy: Revert toggling reset changes.
	net: phy: Avoid multiple suspends
	cgroup, netclassid: periodically release file_lock on classid updating
	gre: fix uninit-value in __iptunnel_pull_header
	ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface
	ipvlan: add cond_resched_rcu() while processing muticast backlog
	ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()
	netlink: Use netlink header as base to calculate bad attribute offset
	net: macsec: update SCI upon MAC address change.
	net: nfc: fix bounds checking bugs on "pipe"
	net/packet: tpacket_rcv: do not increment ring index on drop
	r8152: check disconnect status after long sleep
	sfc: detach from cb_page in efx_copy_channel()
	bnxt_en: reinitialize IRQs when MTU is modified
	cgroup: memcg: net: do not associate sock with unrelated cgroup
	net: memcg: late association of sock to memcg
	net: memcg: fix lockdep splat in inet_csk_accept()
	fib: add missing attribute validation for tun_id
	nl802154: add missing attribute validation
	nl802154: add missing attribute validation for dev_type
	can: add missing attribute validation for termination
	macsec: add missing attribute validation for port
	net: fq: add missing attribute validation for orphan mask
	team: add missing attribute validation for port ifindex
	team: add missing attribute validation for array index
	nfc: add missing attribute validation for SE API
	nfc: add missing attribute validation for vendor subcommand
	net: phy: fix MDIO bus PM PHY resuming
	bonding/alb: make sure arp header is pulled before accessing it
	slip: make slhc_compress() more robust against malicious packets
	net: fec: validate the new settings in fec_enet_set_coalesce()
	macvlan: add cond_resched() during multicast processing
	inet_diag: return classid for all socket types
	ipvlan: do not add hardware address of master to its unicast filter list
	ipvlan: egress mcast packets are not exceptional
	ipvlan: don't deref eth hdr before checking it's set
	cgroup: cgroup_procs_next should increase position index
	cgroup: Iterate tasks that did not finish do_exit()
	iwlwifi: mvm: Do not require PHY_SKU NVM section for 3168 devices
	virtio-blk: fix hw_queue stopped on arbitrary error
	iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint
	workqueue: don't use wq_select_unbound_cpu() for bound works
	drm/amd/display: remove duplicated assignment to grph_obj_type
	ktest: Add timeout for ssh sync testing
	cifs_atomic_open(): fix double-put on late allocation failure
	gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
	KVM: x86: clear stale x86_emulate_ctxt->intercept value
	ARC: define __ALIGN_STR and __ALIGN symbols for ARC
	efi: Fix a race and a buffer overflow while reading efivars via sysfs
	x86/mce: Fix logic and comments around MSR_PPIN_CTL
	iommu/dma: Fix MSI reservation allocation
	iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
	iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
	pinctrl: meson-gxl: fix GPIOX sdio pins
	pinctrl: core: Remove extra kref_get which blocks hogs being freed
	nl80211: add missing attribute validation for critical protocol indication
	nl80211: add missing attribute validation for beacon report scanning
	nl80211: add missing attribute validation for channel switch
	netfilter: cthelper: add missing attribute validation for cthelper
	netfilter: nft_payload: add missing attribute validation for payload csum flags
	iommu/vt-d: Fix the wrong printing in RHSA parsing
	iommu/vt-d: Ignore devices with out-of-spec domain number
	i2c: acpi: put device when verifying client fails
	ipv6: restrict IPV6_ADDRFORM operation
	net/smc: check for valid ib_client_data
	efi: Add a sanity check to efivar_store_raw()
	batman-adv: Avoid spurious warnings from bat_v neigh_cmp implementation
	batman-adv: Always initialize fragment header priority
	batman-adv: Fix check of retrieved orig_gw in batadv_v_gw_is_eligible
	batman-adv: Fix lock for ogm cnt access in batadv_iv_ogm_calc_tq
	batman-adv: Fix internal interface indices types
	batman-adv: update data pointers after skb_cow()
	batman-adv: Avoid race in TT TVLV allocator helper
	batman-adv: Fix TT sync flags for intermediate TT responses
	batman-adv: prevent TT request storms by not sending inconsistent TT TLVLs
	batman-adv: Fix debugfs path for renamed hardif
	batman-adv: Fix debugfs path for renamed softif
	batman-adv: Fix duplicated OGMs on NETDEV_UP
	batman-adv: Avoid free/alloc race when handling OGM2 buffer
	batman-adv: Avoid free/alloc race when handling OGM buffer
	batman-adv: Don't schedule OGM for disabled interface
	perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag
	ACPI: watchdog: Allow disabling WDAT at boot
	HID: apple: Add support for recent firmware on Magic Keyboards
	HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override
	cfg80211: check reg_rule for NULL in handle_channel_custom()
	scsi: libfc: free response frame from GPN_ID
	net: usb: qmi_wwan: restore mtu min/max values after raw_ip switch
	net: ks8851-ml: Fix IRQ handling and locking
	mac80211: rx: avoid RCU list traversal under mutex
	signal: avoid double atomic counter increments for user accounting
	slip: not call free_netdev before rtnl_unlock in slip_open
	hinic: fix a bug of setting hw_ioctxt
	net: rmnet: fix NULL pointer dereference in rmnet_newlink()
	jbd2: fix data races at struct journal_head
	ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional()
	ARM: 8958/1: rename missed uaccess .fixup section
	mm: slub: add missing TID bump in kmem_cache_alloc_bulk()
	ipv4: ensure rcu_read_lock() in cipso_v4_error()
	Linux 4.14.174

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8594f155b5b3df71510fdb5dc034c80fb2332c91
2020-03-20 12:19:46 +01:00
Linus Torvalds
d8a4a55bdc signal: avoid double atomic counter increments for user accounting
[ Upstream commit fda31c50292a5062332fa0343c084bd9f46604d9 ]

When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, on that machine for some reason seems to be much worse when
the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

 "It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-20 10:54:25 +01:00
Hillf Danton
48c336253b workqueue: don't use wq_select_unbound_cpu() for bound works
commit aa202f1f56960c60e7befaa0f49c72b8fa11b0a8 upstream.

wq_select_unbound_cpu() is designed for unbound workqueues only, but
it's wrongly called when using a bound workqueue too.

Fixing this ensures work queued to a bound workqueue with
cpu=WORK_CPU_UNBOUND always runs on the local CPU.

Before, that would happen only if wq_unbound_cpumask happened to include
it (likely almost always the case), or was empty, or we got lucky with
forced round-robin placement.  So restricting
/sys/devices/virtual/workqueue/cpumask to a small subset of a machine's
CPUs would cause some bound work items to run unexpectedly there.

Fixes: ef55718044 ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Hillf Danton <hdanton@sina.com>
[dj: massage changelog]
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-20 10:54:16 +01:00
Michal Koutný
713f26696c cgroup: Iterate tasks that did not finish do_exit()
commit 9c974c77246460fa6a92c18554c3311c8c83c160 upstream.

PF_EXITING is set earlier than actual removal from css_set when a task
is exitting. This can confuse cgroup.procs readers who see no PF_EXITING
tasks, however, rmdir is checking against css_set membership so it can
transitionally fail with EBUSY.

Fix this by listing tasks that weren't unlinked from css_set active
lists.
It may happen that other users of the task iterator (without
CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This
is equal to the state before commit c03cd7738a83 ("cgroup: Include dying
leaders with live threads in PROCS iterations") but it may be reviewed
later.

Reported-by: Suren Baghdasaryan <surenb@google.com>
Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-20 10:54:14 +01:00
Vasily Averin
b58120a61b cgroup: cgroup_procs_next should increase position index
commit 2d4ecb030dcc90fb725ecbfc82ce5d6c37906e0e upstream.

If seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output:

1) dd bs=1 skip output of each 2nd elements
$ dd if=/sys/fs/cgroup/cgroup.procs bs=8 count=1
2
3
4
5
1+0 records in
1+0 records out
8 bytes copied, 0,000267297 s, 29,9 kB/s
[test@localhost ~]$ dd if=/sys/fs/cgroup/cgroup.procs bs=1 count=8
2
4 <<< NB! 3 was skipped
6 <<<    ... and 5 too
8 <<<    ... and 7
8+0 records in
8+0 records out
8 bytes copied, 5,2123e-05 s, 153 kB/s

 This happen because __cgroup_procs_start() makes an extra
 extra cgroup_procs_next() call

2) read after lseek beyond end of file generates whole last line.
3) read after lseek into middle of last line generates
expected rest of last line and unexpected whole line once again.

Additionally patch removes an extra position index changes in
__cgroup_procs_start()

Cc: stable@vger.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-20 10:54:14 +01:00
Shakeel Butt
944f720534 cgroup: memcg: net: do not associate sock with unrelated cgroup
[ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ]

We are testing network memory accounting in our setup and noticed
inconsistent network memory usage and often unrelated cgroups network
usage correlates with testing workload. On further inspection, it
seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in
irq context specially for cgroup v1.

mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context
and kind of assumes that this can only happen from sk_clone_lock()
and the source sock object has already associated cgroup. However in
cgroup v1, where network memory accounting is opt-in, the source sock
can be unassociated with any cgroup and the new cloned sock can get
associated with unrelated interrupted cgroup.

Cgroup v2 can also suffer if the source sock object was created by
process in the root cgroup or if sk_alloc() is called in irq context.
The fix is to just do nothing in interrupt.

WARNING: Please note that about half of the TCP sockets are allocated
from the IRQ context, so, memory used by such sockets will not be
accouted by the memcg.

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:

CPU: 70 PID: 12720 Comm: ssh Tainted:  5.6.0-smp-DEV #1
Hardware name: ...
Call Trace:
 <IRQ>
 dump_stack+0x57/0x75
 mem_cgroup_sk_alloc+0xe9/0xf0
 sk_clone_lock+0x2a7/0x420
 inet_csk_clone_lock+0x1b/0x110
 tcp_create_openreq_child+0x23/0x3b0
 tcp_v6_syn_recv_sock+0x88/0x730
 tcp_check_req+0x429/0x560
 tcp_v6_rcv+0x72d/0xa40
 ip6_protocol_deliver_rcu+0xc9/0x400
 ip6_input+0x44/0xd0
 ? ip6_protocol_deliver_rcu+0x400/0x400
 ip6_rcv_finish+0x71/0x80
 ipv6_rcv+0x5b/0xe0
 ? ip6_sublist_rcv+0x2e0/0x2e0
 process_backlog+0x108/0x1e0
 net_rx_action+0x26b/0x460
 __do_softirq+0x104/0x2a6
 do_softirq_own_stack+0x2a/0x40
 </IRQ>
 do_softirq.part.19+0x40/0x50
 __local_bh_enable_ip+0x51/0x60
 ip6_finish_output2+0x23d/0x520
 ? ip6table_mangle_hook+0x55/0x160
 __ip6_finish_output+0xa1/0x100
 ip6_finish_output+0x30/0xd0
 ip6_output+0x73/0x120
 ? __ip6_finish_output+0x100/0x100
 ip6_xmit+0x2e3/0x600
 ? ipv6_anycast_cleanup+0x50/0x50
 ? inet6_csk_route_socket+0x136/0x1e0
 ? skb_free_head+0x1e/0x30
 inet6_csk_xmit+0x95/0xf0
 __tcp_transmit_skb+0x5b4/0xb20
 __tcp_send_ack.part.60+0xa3/0x110
 tcp_send_ack+0x1d/0x20
 tcp_rcv_state_process+0xe64/0xe80
 ? tcp_v6_connect+0x5d1/0x5f0
 tcp_v6_do_rcv+0x1b1/0x3f0
 ? tcp_v6_do_rcv+0x1b1/0x3f0
 __release_sock+0x7f/0xd0
 release_sock+0x30/0xa0
 __inet_stream_connect+0x1c3/0x3b0
 ? prepare_to_wait+0xb0/0xb0
 inet_stream_connect+0x3b/0x60
 __sys_connect+0x101/0x120
 ? __sys_getsockopt+0x11b/0x140
 __x64_sys_connect+0x1a/0x20
 do_syscall_64+0x51/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
Fixes: 2d75807383 ("mm: memcontrol: consolidate cgroup socket tracking")
Fixes: d979a39d72 ("cgroup: duplicate cgroup reference when cloning sockets")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-20 10:54:09 +01:00
Michal Koutný
67ea9300c7 UPSTREAM: cgroup: Iterate tasks that did not finish do_exit()
PF_EXITING is set earlier than actual removal from css_set when a task
is exitting. This can confuse cgroup.procs readers who see no PF_EXITING
tasks, however, rmdir is checking against css_set membership so it can
transitionally fail with EBUSY.

Fix this by listing tasks that weren't unlinked from css_set active
lists.
It may happen that other users of the task iterator (without
CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This
is equal to the state before commit c03cd7738a83 ("cgroup: Include dying
leaders with live threads in PROCS iterations") but it may be reviewed
later.

Reported-by: Suren Baghdasaryan <surenb@google.com>
Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
(cherry picked from commit 9c974c77246460fa6a92c18554c3311c8c83c160)
Bug: 141213848
Bug: 146758430
Test: test_cgcore_destroy from linux-kselftest
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Iac57661b931129ed1e44b89675f8115bb89084ff
(cherry picked from commit 21ee296526c70d6dc3c64639406f156f39b80fd0)
2020-03-14 22:40:51 +00:00
Greg Kroah-Hartman
5a81c7e39a Merge 4.14.173 into android-4.14
Changes in 4.14.173
	iwlwifi: pcie: fix rb_allocator workqueue allocation
	netfilter: nf_conntrack: resolve clash for matching conntracks
	ext4: fix potential race between online resizing and write operations
	ext4: fix potential race between s_flex_groups online resizing and access
	ext4: fix potential race between s_group_info online resizing and access
	ipmi:ssif: Handle a possible NULL pointer reference
	drm/msm: Set dma maximum segment size for mdss
	dax: pass NOWAIT flag to iomap_apply
	mac80211: consider more elements in parsing CRC
	cfg80211: check wiphy driver existence for drvinfo report
	qmi_wwan: re-add DW5821e pre-production variant
	qmi_wwan: unconditionally reject 2 ep interfaces
	net: ena: fix potential crash when rxfh key is NULL
	net: ena: fix uses of round_jiffies()
	net: ena: add missing ethtool TX timestamping indication
	net: ena: fix incorrect default RSS key
	net: ena: rss: fix failure to get indirection table
	net: ena: rss: store hash function as values and not bits
	net: ena: fix incorrectly saving queue numbers when setting RSS indirection table
	net: ena: ethtool: use correct value for crc32 hash
	net: ena: ena-com.c: prevent NULL pointer dereference
	cifs: Fix mode output in debugging statements
	cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE
	sysrq: Restore original console_loglevel when sysrq disabled
	sysrq: Remove duplicated sysrq message
	net: fib_rules: Correctly set table field when table number exceeds 8 bits
	net: phy: restore mdio regs in the iproc mdio driver
	nfc: pn544: Fix occasional HW initialization failure
	sctp: move the format error check out of __sctp_sf_do_9_1_abort
	ipv6: Fix nlmsg_flags when splitting a multipath route
	ipv6: Fix route replacement with dev-only route
	qede: Fix race between rdma destroy workqueue and link change event
	net: sched: correct flower port blocking
	ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
	audit: fix error handling in audit_data_to_entry()
	ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro
	ACPI: watchdog: Fix gas->access_width usage
	KVM: VMX: check descriptor table exits on instruction emulation
	HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock
	HID: core: fix off-by-one memset in hid_report_raw_event()
	HID: core: increase HID report buffer size to 8KiB
	tracing: Disable trace_printk() on post poned tests
	Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
	HID: hiddev: Fix race in in hiddev_disconnect()
	MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'
	i2c: altera: Fix potential integer overflow
	i2c: jz4780: silence log flood on txabrt
	drm/i915/gvt: Separate display reset from ALL_ENGINES reset
	usb: charger: assign specific number for enum value
	ecryptfs: Fix up bad backport of fe2e082f5da5b4a0a92ae32978f81507ef37ec66
	include/linux/bitops.h: introduce BITS_PER_TYPE
	net: netlink: cap max groups which will be considered in netlink_bind()
	net: atlantic: fix potential error handling
	net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE
	namei: only return -ECHILD from follow_dotdot_rcu()
	mwifiex: drop most magic numbers from mwifiex_process_tdls_action_frame()
	KVM: SVM: Override default MMIO mask if memory encryption is enabled
	KVM: Check for a bad hva before dropping into the ghc slow path
	tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
	drivers: net: xgene: Fix the order of the arguments of 'alloc_etherdev_mqs()'
	kprobes: Set unoptimized flag after unoptimizing code
	perf hists browser: Restore ESC as "Zoom out" of DSO/thread/etc
	mm/huge_memory.c: use head to check huge zero page
	mm, thp: fix defrag setting if newline is not used
	Revert "char/random: silence a lockdep splat with printk()"
	audit: always check the netlink payload length in audit_receive_msg()
	vhost: Check docket sk_family instead of call getname
	x86/mce: Handle varying MCA bank counts
	EDAC/amd64: Set grain per DIMM
	net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec
	RDMA/core: Fix pkey and port assignment in get_new_pps
	RDMA/core: Fix use of logical OR in get_new_pps
	kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic
	serial: ar933x_uart: set UART_CS_{RX,TX}_READY_ORIDE
	selftests: fix too long argument
	usb: gadget: composite: Support more than 500mA MaxPower
	usb: gadget: ffs: ffs_aio_cancel(): Save/restore IRQ flags
	usb: gadget: serial: fix Tx stall after buffer overflow
	drm/msm/mdp5: rate limit pp done timeout warnings
	drm: msm: Fix return type of dsi_mgr_connector_mode_valid for kCFI
	drm/msm/dsi: save pll state before dsi host is powered off
	net: ks8851-ml: Remove 8-bit bus accessors
	net: ks8851-ml: Fix 16-bit data access
	net: ks8851-ml: Fix 16-bit IO operation
	watchdog: da9062: do not ping the hw during stop()
	s390/cio: cio_ignore_proc_seq_next should increase position index
	x86/boot/compressed: Don't declare __force_order in kaslr_64.c
	nvme: Fix uninitialized-variable warning
	x86/xen: Distribute switch variables for initialization
	net: thunderx: workaround BGX TX Underflow issue
	cifs: don't leak -EAGAIN for stat() during reconnect
	usb: storage: Add quirk for Samsung Fit flash
	usb: quirks: add NO_LPM quirk for Logitech Screen Share
	usb: core: hub: fix unhandled return by employing a void function
	usb: core: hub: do error out if usb_autopm_get_interface() fails
	usb: core: port: do error out if usb_autopm_get_interface() fails
	vgacon: Fix a UAF in vgacon_invert_region
	mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa
	fat: fix uninit-memory access for partial initialized inode
	arm: dts: dra76x: Fix mmc3 max-frequency
	tty:serial:mvebu-uart:fix a wrong return
	serial: 8250_exar: add support for ACCES cards
	vt: selection, close sel_buffer race
	vt: selection, push console lock down
	vt: selection, push sel_lock up
	x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes
	dmaengine: tegra-apb: Fix use-after-free
	dmaengine: tegra-apb: Prevent race conditions of tasklet vs free list
	dm cache: fix a crash due to incorrect work item cancelling
	ARM: dts: ls1021a: Restore MDIO compatible to gianfar
	ASoC: topology: Fix memleak in soc_tplg_link_elems_load()
	ASoC: intel: skl: Fix pin debug prints
	ASoC: intel: skl: Fix possible buffer overflow in debug outputs
	ASoC: pcm: Fix possible buffer overflow in dpcm state sysfs output
	ASoC: pcm512x: Fix unbalanced regulator enable call in probe error path
	ASoC: dapm: Correct DAPM handling of active widgets during shutdown
	RDMA/iwcm: Fix iwcm work deallocation
	RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
	IB/hfi1, qib: Ensure RCU is locked when accessing list
	ARM: imx: build v7_cpu_resume() unconditionally
	hwmon: (adt7462) Fix an error return in ADT7462_REG_VOLT()
	dmaengine: coh901318: Fix a double lock bug in dma_tc_handle()
	powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems
	dm integrity: fix a deadlock due to offloading to an incorrect workqueue
	xhci: handle port status events for removed USB3 hcd
	ASoC: topology: Fix memleak in soc_tplg_manifest_load()
	Linux 4.14.173

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic06bd3eb90ee58f3fd96bff8969ebf6d9db4cb8d
2020-03-11 19:30:15 +01:00
Masami Hiramatsu
53647b8201 kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic
[ Upstream commit e4add247789e4ba5e08ad8256183ce2e211877d4 ]

optimize_kprobe() and unoptimize_kprobe() cancels if a given kprobe
is on the optimizing_list or unoptimizing_list already. However, since
the following commit:

  f66c0447cca1 ("kprobes: Set unoptimized flag after unoptimizing code")

modified the update timing of the KPROBE_FLAG_OPTIMIZED, it doesn't
work as expected anymore.

The optimized_kprobe could be in the following states:

- [optimizing]: Before inserting jump instruction
  op.kp->flags has KPROBE_FLAG_OPTIMIZED and
  op->list is not empty.

- [optimized]: jump inserted
  op.kp->flags has KPROBE_FLAG_OPTIMIZED and
  op->list is empty.

- [unoptimizing]: Before removing jump instruction (including unused
  optprobe)
  op.kp->flags has KPROBE_FLAG_OPTIMIZED and
  op->list is not empty.

- [unoptimized]: jump removed
  op.kp->flags doesn't have KPROBE_FLAG_OPTIMIZED and
  op->list is empty.

Current code mis-expects [unoptimizing] state doesn't have
KPROBE_FLAG_OPTIMIZED, and that can cause incorrect results.

To fix this, introduce optprobe_queued_unopt() to distinguish [optimizing]
and [unoptimizing] states and fixes the logic in optimize_kprobe() and
unoptimize_kprobe().

[ mingo: Cleaned up the changelog and the code a bit. ]

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bristot@redhat.com
Fixes: f66c0447cca1 ("kprobes: Set unoptimized flag after unoptimizing code")
Link: https://lkml.kernel.org/r/157840814418.7181.13478003006386303481.stgit@devnote2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-11 18:02:57 +01:00
Paul Moore
c7cba03b2b audit: always check the netlink payload length in audit_receive_msg()
[ Upstream commit 756125289285f6e55a03861bf4b6257aa3d19a93 ]

This patch ensures that we always check the netlink payload length
in audit_receive_msg() before we take any action on the payload
itself.

Cc: stable@vger.kernel.org
Reported-by: syzbot+399c44bf1f43b8747403@syzkaller.appspotmail.com
Reported-by: syzbot+e4b12d8d202701f08b6d@syzkaller.appspotmail.com
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-11 18:02:55 +01:00
Masami Hiramatsu
b996b668da kprobes: Set unoptimized flag after unoptimizing code
commit f66c0447cca1281116224d474cdb37d6a18e4b5b upstream.

Set the unoptimized flag after confirming the code is completely
unoptimized. Without this fix, when a kprobe hits the intermediate
modified instruction (the first byte is replaced by an INT3, but
later bytes can still be a jump address operand) while unoptimizing,
it can return to the middle byte of the modified code, which causes
an invalid instruction exception in the kernel.

Usually, this is a rare case, but if we put a probe on the function
call while text patching, it always causes a kernel panic as below:

 # echo p text_poke+5 > kprobe_events
 # echo 1 > events/kprobes/enable
 # echo 0 > events/kprobes/enable

invalid opcode: 0000 [#1] PREEMPT SMP PTI
 RIP: 0010:text_poke+0x9/0x50
 Call Trace:
  arch_unoptimize_kprobe+0x22/0x28
  arch_unoptimize_kprobes+0x39/0x87
  kprobe_optimizer+0x6e/0x290
  process_one_work+0x2a0/0x610
  worker_thread+0x28/0x3d0
  ? process_one_work+0x610/0x610
  kthread+0x10d/0x130
  ? kthread_park+0x80/0x80
  ret_from_fork+0x3a/0x50

text_poke() is used for patching the code in optprobes.

This can happen even if we blacklist text_poke() and other functions,
because there is a small time window during which we show the intermediate
code to other CPUs.

 [ mingo: Edited the changelog. ]

Tested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bristot@redhat.com
Fixes: 6274de4984 ("kprobes: Support delayed unoptimizing")
Link: https://lkml.kernel.org/r/157483422375.25881.13508326028469515760.stgit@devnote2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11 18:02:54 +01:00
Steven Rostedt (VMware)
581695e615 tracing: Disable trace_printk() on post poned tests
commit 78041c0c9e935d9ce4086feeff6c569ed88ddfd4 upstream.

The tracing seftests checks various aspects of the tracing infrastructure,
and one is filtering. If trace_printk() is active during a self test, it can
cause the filtering to fail, which will disable that part of the trace.

To keep the selftests from failing because of trace_printk() calls,
trace_printk() checks the variable tracing_selftest_running, and if set, it
does not write to the tracing buffer.

As some tracers were registered earlier in boot, the selftest they triggered
would fail because not all the infrastructure was set up for the full
selftest. Thus, some of the tests were post poned to when their
infrastructure was ready (namely file system code). The postpone code did
not set the tracing_seftest_running variable, and could fail if a
trace_printk() was added and executed during their run.

Cc: stable@vger.kernel.org
Fixes: 9afecfbb95 ("tracing: Postpone tracer start-up tests till the system is more robust")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11 18:02:50 +01:00
Paul Moore
edde9fcd5f audit: fix error handling in audit_data_to_entry()
commit 2ad3e17ebf94b7b7f3f64c050ff168f9915345eb upstream.

Commit 219ca39427 ("audit: use union for audit_field values since
they are mutually exclusive") combined a number of separate fields in
the audit_field struct into a single union.  Generally this worked
just fine because they are generally mutually exclusive.
Unfortunately in audit_data_to_entry() the overlap can be a problem
when a specific error case is triggered that causes the error path
code to attempt to cleanup an audit_field struct and the cleanup
involves attempting to free a stored LSM string (the lsm_str field).
Currently the code always has a non-NULL value in the
audit_field.lsm_str field as the top of the for-loop transfers a
value into audit_field.val (both .lsm_str and .val are part of the
same union); if audit_data_to_entry() fails and the audit_field
struct is specified to contain a LSM string, but the
audit_field.lsm_str has not yet been properly set, the error handling
code will attempt to free the bogus audit_field.lsm_str value that
was set with audit_field.val at the top of the for-loop.

This patch corrects this by ensuring that the audit_field.val is only
set when needed (it is cleared when the audit_field struct is
allocated with kcalloc()).  It also corrects a few other issues to
ensure that in case of error the proper error code is returned.

Cc: stable@vger.kernel.org
Fixes: 219ca39427 ("audit: use union for audit_field values since they are mutually exclusive")
Reported-by: syzbot+1f4d90ead370d72e450b@syzkaller.appspotmail.com
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11 18:02:48 +01:00
Greg Kroah-Hartman
4f546b14ce Merge 4.14.172 into android-4.14
Changes in 4.14.172
	KVM: x86: emulate RDPID
	iommu/qcom: Fix bogus detach logic
	ALSA: hda: Use scnprintf() for printing texts for sysfs/procfs
	ASoC: sun8i-codec: Fix setting DAI data format
	ecryptfs: fix a memory leak bug in parse_tag_1_packet()
	ecryptfs: fix a memory leak bug in ecryptfs_init_messaging()
	Input: synaptics - switch T470s to RMI4 by default
	Input: synaptics - enable SMBus on ThinkPad L470
	Input: synaptics - remove the LEN0049 dmi id from topbuttonpad list
	ALSA: usb-audio: Apply sample rate quirk for Audioengine D1
	arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly
	arm64: ptrace: nofpsimd: Fail FP/SIMD regset operations
	arm64: nofpsimd: Handle TIF_FOREIGN_FPSTATE flag cleanly
	ARM: 8723/2: always assume the "unified" syntax for assembly code
	ext4: don't assume that mmp_nodename/bdevname have NUL
	ext4: fix support for inode sizes > 1024 bytes
	ext4: fix checksum errors with indexed dirs
	ext4: improve explanation of a mount failure caused by a misconfigured kernel
	Btrfs: fix race between using extent maps and merging them
	btrfs: print message when tree-log replay starts
	btrfs: log message when rw remount is attempted with unclean tree-log
	arm64: ssbs: Fix context-switch when SSBS is present on all CPUs
	KVM: nVMX: Use correct root level for nested EPT shadow page tables
	perf/x86/amd: Add missing L2 misses event spec to AMD Family 17h's event map
	padata: Remove broken queue flushing
	serial: imx: ensure that RX irqs are off if RX is off
	serial: imx: Only handle irqs that are actually enabled
	IB/hfi1: Close window for pq and request coliding
	RDMA/core: Fix protection fault in get_pkey_idx_qp_list
	s390/time: Fix clk type in get_tod_clock
	perf/x86/intel: Fix inaccurate period in context switch for auto-reload
	hwmon: (pmbus/ltc2978) Fix PMBus polling of MFR_COMMON definitions.
	jbd2: move the clearing of b_modified flag to the journal_unmap_buffer()
	jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer
	scsi: qla2xxx: fix a potential NULL pointer dereference
	Revert "KVM: nVMX: Use correct root level for nested EPT shadow page tables"
	Revert "KVM: VMX: Add non-canonical check on writes to RTIT address MSRs"
	KVM: nVMX: Use correct root level for nested EPT shadow page tables
	drm/gma500: Fixup fbdev stolen size usage evaluation
	cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order
	brcmfmac: Fix use after free in brcmf_sdio_readframes()
	leds: pca963x: Fix open-drain initialization
	ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT
	ALSA: ctl: allow TLV read operation for callback type of element in locked case
	gianfar: Fix TX timestamping with a stacked DSA driver
	pinctrl: sh-pfc: sh7264: Fix CAN function GPIOs
	pxa168fb: Fix the function used to release some memory in an error handling path
	media: i2c: mt9v032: fix enum mbus codes and frame sizes
	powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE number
	gpio: gpio-grgpio: fix possible sleep-in-atomic-context bugs in grgpio_irq_map/unmap()
	char/random: silence a lockdep splat with printk()
	media: sti: bdisp: fix a possible sleep-in-atomic-context bug in bdisp_device_run()
	pinctrl: baytrail: Do not clear IRQ flags on direct-irq enabled pins
	efi/x86: Map the entire EFI vendor string before copying it
	MIPS: Loongson: Fix potential NULL dereference in loongson3_platform_init()
	sparc: Add .exit.data section.
	uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()
	usb: gadget: udc: fix possible sleep-in-atomic-context bugs in gr_probe()
	usb: dwc2: Fix IN FIFO allocation
	clocksource/drivers/bcm2835_timer: Fix memory leak of timer
	kselftest: Minimise dependency of get_size on C library interfaces
	jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal
	x86/sysfb: Fix check for bad VRAM size
	tracing: Fix tracing_stat return values in error handling paths
	tracing: Fix very unlikely race of registering two stat tracers
	ext4, jbd2: ensure panic when aborting with zero errno
	nbd: add a flush_workqueue in nbd_start_device
	KVM: s390: ENOTSUPP -> EOPNOTSUPP fixups
	kconfig: fix broken dependency in randconfig-generated .config
	clk: qcom: rcg2: Don't crash if our parent can't be found; return an error
	drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table
	regulator: rk808: Lower log level on optional GPIOs being not available
	net/wan/fsl_ucc_hdlc: reject muram offsets above 64K
	PCI/IOV: Fix memory leak in pci_iov_add_virtfn()
	NFC: port100: Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().
	arm64: dts: qcom: msm8996: Disable USB2 PHY suspend by core
	ARM: dts: imx6: rdu2: Disable WP for USDHC2 and USDHC3
	media: v4l2-device.h: Explicitly compare grp{id,mask} to zero in v4l2_device macros
	reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling
	fore200e: Fix incorrect checks of NULL pointer dereference
	ALSA: usx2y: Adjust indentation in snd_usX2Y_hwdep_dsp_status
	b43legacy: Fix -Wcast-function-type
	ipw2x00: Fix -Wcast-function-type
	iwlegacy: Fix -Wcast-function-type
	rtlwifi: rtl_pci: Fix -Wcast-function-type
	orinoco: avoid assertion in case of NULL pointer
	ACPICA: Disassembler: create buffer fields in ACPI_PARSE_LOAD_PASS1
	scsi: ufs: Complete pending requests in host reset and restore path
	scsi: aic7xxx: Adjust indentation in ahc_find_syncrate
	drm/mediatek: handle events when enabling/disabling crtc
	ARM: dts: r8a7779: Add device node for ARM global timer
	dmaengine: Store module owner in dma_device struct
	x86/vdso: Provide missing include file
	PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency
	pinctrl: sh-pfc: sh7269: Fix CAN function GPIOs
	RDMA/rxe: Fix error type of mmap_offset
	clk: sunxi-ng: add mux and pll notifiers for A64 CPU clock
	ALSA: sh: Fix unused variable warnings
	ALSA: sh: Fix compile warning wrt const
	tools lib api fs: Fix gcc9 stringop-truncation compilation error
	drm: remove the newline for CRC source name.
	usbip: Fix unsafe unaligned pointer usage
	udf: Fix free space reporting for metadata and virtual partitions
	IB/hfi1: Add software counter for ctxt0 seq drop
	soc/tegra: fuse: Correct straps' address for older Tegra124 device trees
	efi/x86: Don't panic or BUG() on non-critical error conditions
	rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls
	Input: edt-ft5x06 - work around first register access error
	wan: ixp4xx_hss: fix compile-testing on 64-bit
	ASoC: atmel: fix build error with CONFIG_SND_ATMEL_SOC_DMA=m
	tty: synclinkmp: Adjust indentation in several functions
	tty: synclink_gt: Adjust indentation in several functions
	driver core: platform: Prevent resouce overflow from causing infinite loops
	driver core: Print device when resources present in really_probe()
	vme: bridges: reduce stack usage
	drm/nouveau/secboot/gm20b: initialize pointer in gm20b_secboot_new()
	drm/nouveau/gr/gk20a,gm200-: add terminators to method lists read from fw
	drm/nouveau: Fix copy-paste error in nouveau_fence_wait_uevent_handler
	drm/vmwgfx: prevent memory leak in vmw_cmdbuf_res_add
	usb: musb: omap2430: Get rid of musb .set_vbus for omap2430 glue
	iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
	f2fs: free sysfs kobject
	scsi: iscsi: Don't destroy session if there are outstanding connections
	arm64: fix alternatives with LLVM's integrated assembler
	watchdog/softlockup: Enforce that timestamp is valid on boot
	f2fs: fix memleak of kobject
	x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
	pwm: omap-dmtimer: Remove PWM chip in .remove before making it unfunctional
	cmd64x: potential buffer overflow in cmd64x_program_timings()
	ide: serverworks: potential overflow in svwks_set_pio_mode()
	pwm: Remove set but not set variable 'pwm'
	btrfs: fix possible NULL-pointer dereference in integrity checks
	btrfs: safely advance counter when looking up bio csums
	btrfs: device stats, log when stats are zeroed
	remoteproc: Initialize rproc_class before use
	irqchip/mbigen: Set driver .suppress_bind_attrs to avoid remove problems
	ALSA: hda/hdmi - add retry logic to parse_intel_hdmi()
	x86/decoder: Add TEST opcode to Group3-2
	s390/ftrace: generate traced function stack frame
	driver core: platform: fix u32 greater or equal to zero comparison
	ALSA: hda - Add docking station support for Lenovo Thinkpad T420s
	powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOV
	jbd2: switch to use jbd2_journal_abort() when failed to submit the commit record
	jbd2: make sure ESHUTDOWN to be recorded in the journal superblock
	ARM: 8951/1: Fix Kexec compilation issue.
	hostap: Adjust indentation in prism2_hostapd_add_sta
	iwlegacy: ensure loop counter addr does not wrap and cause an infinite loop
	cifs: fix NULL dereference in match_prepath
	ceph: check availability of mds cluster on mount after wait timeout
	irqchip/gic-v3: Only provision redistributors that are enabled in ACPI
	drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
	ftrace: fpid_next() should increase position index
	trigger_next should increase position index
	radeon: insert 10ms sleep in dce5_crtc_load_lut
	ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
	lib/scatterlist.c: adjust indentation in __sg_alloc_table
	reiserfs: prevent NULL pointer dereference in reiserfs_insert_item()
	bcache: explicity type cast in bset_bkey_last()
	irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL
	iwlwifi: mvm: Fix thermal zone registration
	microblaze: Prevent the overflow of the start
	brd: check and limit max_part par
	help_next should increase position index
	virtio_balloon: prevent pfn array overflow
	mlxsw: spectrum_dpipe: Add missing error path
	selinux: ensure we cleanup the internal AVC counters on error in avc_update()
	enic: prevent waking up stopped tx queues over watchdog reset
	net: dsa: tag_qca: Make sure there is headroom for tag
	net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS
	net/sched: flower: add missing validation of TCA_FLOWER_FLAGS
	net/smc: fix leak of kernel memory to user space
	thunderbolt: Prevent crash if non-active NVMem file is read
	USB: misc: iowarrior: add support for 2 OEMed devices
	USB: misc: iowarrior: add support for the 28 and 28L devices
	USB: misc: iowarrior: add support for the 100 device
	floppy: check FDC index for errors before assigning it
	vt: selection, handle pending signals in paste_selection
	staging: android: ashmem: Disallow ashmem memory from being remapped
	staging: vt6656: fix sign of rx_dbm to bb_pre_ed_rssi.
	xhci: Force Maximum Packet size for Full-speed bulk devices to valid range.
	xhci: fix runtime pm enabling for quirky Intel hosts
	usb: host: xhci: update event ring dequeue pointer on purpose
	usb: uas: fix a plug & unplug racing
	USB: Fix novation SourceControl XL after suspend
	USB: hub: Don't record a connect-change event during reset-resume
	USB: hub: Fix the broken detection of USB3 device in SMSC hub
	staging: rtl8188eu: Fix potential security hole
	staging: rtl8188eu: Fix potential overuse of kernel memory
	staging: rtl8723bs: Fix potential security hole
	staging: rtl8723bs: Fix potential overuse of kernel memory
	x86/mce/amd: Publish the bank pointer only after setup has succeeded
	x86/mce/amd: Fix kobject lifetime
	tty/serial: atmel: manage shutdown in case of RS485 or ISO7816 mode
	tty: serial: imx: setup the correct sg entry for tx dma
	serdev: ttyport: restore client ops on deregistration
	MAINTAINERS: Update drm/i915 bug filing URL
	Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"
	mm/vmscan.c: don't round up scan size for online memory cgroup
	drm/amdgpu/soc15: fix xclk for raven
	KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI
	xhci: apply XHCI_PME_STUCK_QUIRK to Intel Comet Lake platforms
	VT_RESIZEX: get rid of field-by-field copyin
	vt: vt_ioctl: fix race in VT_RESIZEX
	serial: 8250: Check UPF_IRQ_SHARED in advance
	lib/stackdepot.c: fix global out-of-bounds in stack_slabs
	KVM: nVMX: Don't emulate instructions in guest mode
	ext4: fix a data race in EXT4_I(inode)->i_disksize
	ext4: add cond_resched() to __ext4_find_entry()
	ext4: fix mount failure with quota configured as module
	ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
	ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
	KVM: nVMX: Refactor IO bitmap checks into helper function
	KVM: nVMX: Check IO instruction VM-exit conditions
	KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
	KVM: apic: avoid calculating pending eoi from an uninitialized val
	btrfs: fix bytes_may_use underflow in prealloc error condtition
	btrfs: do not check delayed items are empty for single transaction cleanup
	Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents
	scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout"
	scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session"
	usb: gadget: composite: Fix bMaxPower for SuperSpeedPlus
	staging: rtl8723bs: fix copy of overlapping memory
	staging: greybus: use after free in gb_audio_manager_remove_all()
	ecryptfs: replace BUG_ON with error handling code
	iommu/vt-d: Fix compile warning from intel-svm.h
	genirq/proc: Reject invalid affinity masks (again)
	ALSA: rawmidi: Avoid bit fields for state flags
	ALSA: seq: Avoid concurrent access to queue flags
	ALSA: seq: Fix concurrent access to queue current tick/time
	netfilter: xt_hashlimit: limit the max size of hashtable
	ata: ahci: Add shutdown to freeze hardware resources of ahci
	xen: Enable interrupts when calling _cond_resched()
	s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in storage_key_init_range
	Linux 4.14.172

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia229dbad24bf3cb8a718d73fc9eb86a053985985
2020-03-03 07:38:15 +01:00
Thomas Gleixner
d3daa3edcf genirq/proc: Reject invalid affinity masks (again)
commit cba6437a1854fde5934098ec3bd0ee83af3129f5 upstream.

Qian Cai reported that the WARN_ON() in the x86/msi affinity setting code,
which catches cases where the affinity setting is not done on the CPU which
is the current target of the interrupt, triggers during CPU hotplug stress
testing.

It turns out that the warning which was added with the commit addressing
the MSI affinity race unearthed yet another long standing bug.

If user space writes a bogus affinity mask, i.e. it contains no online CPUs,
then it calls irq_select_affinity_usr(). This was introduced for ALPHA in

  eee45269b0 ("[PATCH] Alpha: convert to generic irq framework (generic part)")

and subsequently made available for all architectures in

  1840475676 ("genirq: Expose default irq affinity mask (take 3)")

which introduced the circumvention of the affinity setting restrictions for
interrupt which cannot be moved in process context.

The whole exercise is bogus in various aspects:

  1) If the interrupt is already started up then there is absolutely
     no point to honour a bogus interrupt affinity setting from user
     space. The interrupt is already assigned to an online CPU and it
     does not make any sense to reassign it to some other randomly
     chosen online CPU.

  2) If the interupt is not yet started up then there is no point
     either. A subsequent startup of the interrupt will invoke
     irq_setup_affinity() anyway which will chose a valid target CPU.

So the only correct solution is to just return -EINVAL in case user space
wrote an affinity mask which does not contain any online CPUs, except for
ALPHA which has it's own magic sauce for this.

Fixes: 1840475676 ("genirq: Expose default irq affinity mask (take 3)")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Qian Cai <cai@lca.pw>
Link: https://lkml.kernel.org/r/878sl8xdbm.fsf@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:36:15 +01:00
Vasily Averin
c156943230 trigger_next should increase position index
[ Upstream commit 6722b23e7a2ace078344064a9735fb73e554e9ef ]

if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.

Without patch:
 # dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger
 dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset
 n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
 # Available triggers:
 # traceon traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
 6+1 records in
 6+1 records out
 206 bytes copied, 0.00027916 s, 738 kB/s

Notice the printing of "# Available triggers:..." after the line.

With the patch:
 # dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger
 dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset
 n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
 2+1 records in
 2+1 records out
 88 bytes copied, 0.000526867 s, 167 kB/s

It only prints the end of the file, and does not restart.

Link: http://lkml.kernel.org/r/3c35ee24-dd3a-8119-9c19-552ed253388a@virtuozzo.com

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 16:36:07 +01:00
Vasily Averin
9df00bc555 ftrace: fpid_next() should increase position index
[ Upstream commit e4075e8bdffd93a9b6d6e1d52fabedceeca5a91b ]

if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.

Without patch:
 # dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid
 dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset
 id
 no pid
 2+1 records in
 2+1 records out
 10 bytes copied, 0.000213285 s, 46.9 kB/s

Notice the "id" followed by "no pid".

With the patch:
 # dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid
 dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset
 id
 0+1 records in
 0+1 records out
 3 bytes copied, 0.000202112 s, 14.8 kB/s

Notice that it only prints "id" and not the "no pid" afterward.

Link: http://lkml.kernel.org/r/4f87c6ad-f114-30bb-8506-c32274ce2992@virtuozzo.com

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 16:36:07 +01:00
Thomas Gleixner
434f42546a watchdog/softlockup: Enforce that timestamp is valid on boot
[ Upstream commit 11e31f608b499f044f24b20be73f1dcab3e43f8a ]

Robert reported that during boot the watchdog timestamp is set to 0 for one
second which is the indicator for a watchdog reset.

The reason for this is that the timestamp is in seconds and the time is
taken from sched clock and divided by ~1e9. sched clock starts at 0 which
means that for the first second during boot the watchdog timestamp is 0,
i.e. reset.

Use ULONG_MAX as the reset indicator value so the watchdog works correctly
right from the start. ULONG_MAX would only conflict with a real timestamp
if the system reaches an uptime of 136 years on 32bit and almost eternity
on 64bit.

Reported-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/87o8v3uuzl.fsf@nanos.tec.linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 16:36:05 +01:00
Steven Rostedt (VMware)
cf6eb046e5 tracing: Fix very unlikely race of registering two stat tracers
[ Upstream commit dfb6cd1e654315168e36d947471bd2a0ccd834ae ]

Looking through old emails in my INBOX, I came across a patch from Luis
Henriques that attempted to fix a race of two stat tracers registering the
same stat trace (extremely unlikely, as this is done in the kernel, and
probably doesn't even exist). The submitted patch wasn't quite right as it
needed to deal with clean up a bit better (if two stat tracers were the
same, it would have the same files).

But to make the code cleaner, all we needed to do is to keep the
all_stat_sessions_mutex held for most of the registering function.

Link: http://lkml.kernel.org/r/1410299375-20068-1-git-send-email-luis.henriques@canonical.com

Fixes: 002bb86d8d ("tracing/ftrace: separate events tracing and stats tracing engine")
Reported-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 16:35:58 +01:00
Luis Henriques
8f16da1dcd tracing: Fix tracing_stat return values in error handling paths
[ Upstream commit afccc00f75bbbee4e4ae833a96c2d29a7259c693 ]

tracing_stat_init() was always returning '0', even on the error paths.  It
now returns -ENODEV if tracing_init_dentry() fails or -ENOMEM if it fails
to created the 'trace_stat' debugfs directory.

Link: http://lkml.kernel.org/r/1410299381-20108-1-git-send-email-luis.henriques@canonical.com

Fixes: ed6f1c996b ("tracing: Check return value of tracing_init_dentry()")
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
[ Pulled from the archeological digging of my INBOX ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 16:35:58 +01:00