555 Commits

Author SHA1 Message Date
Bruno Martins
9162135978 Merge branch 'deprecated/android-4.9-q' of https://android.googlesource.com/kernel/common into HEAD
Conflicts:
	arch/arm/Makefile
	arch/arm/include/asm/unistd.h
	arch/arm/kernel/calls.S
	arch/arm64/include/asm/assembler.h
	arch/arm64/include/asm/cputype.h
	arch/arm64/kernel/bpi.S
	arch/arm64/kernel/cpu_errata.c
	arch/arm64/kernel/setup.c
	arch/arm64/kernel/vdso.c
	arch/arm64/mm/proc.S
	arch/mips/include/uapi/asm/Kbuild
	arch/powerpc/include/uapi/asm/Kbuild
	drivers/char/Kconfig
	drivers/char/random.c
	drivers/clk/qcom/clk-rcg2.c
	drivers/gpu/drm/drm_edid.c
	drivers/irqchip/irq-gic.c
	drivers/md/dm-table.c
	drivers/media/dvb-core/dmxdev.c
	drivers/mmc/core/core.c
	drivers/mmc/core/host.c
	drivers/mmc/core/mmc.c
	drivers/mmc/host/sdhci.c
	drivers/net/usb/lan78xx.c
	drivers/scsi/ufs/ufs_quirks.h
	drivers/scsi/ufs/ufshcd.c
	drivers/staging/android/ion/ion-ioctl.c
	drivers/staging/android/ion/ion.c
	drivers/staging/android/ion/ion_priv.h
	drivers/staging/android/ion/ion_system_heap.c
	drivers/tty/tty_io.c
	drivers/usb/core/hub.c
	drivers/usb/core/usb.h
	drivers/usb/dwc3/core.c
	drivers/usb/dwc3/gadget.c
	drivers/usb/gadget/composite.c
	drivers/usb/gadget/configfs.c
	drivers/usb/gadget/function/f_accessory.c
	drivers/usb/gadget/function/rndis.c
	drivers/usb/gadget/function/rndis.h
	fs/eventpoll.c
	fs/ext4/namei.c
	fs/fat/fatent.c
	fs/gfs2/acl.c
	include/linux/random.h
	include/uapi/drm/Kbuild
	include/uapi/linux/Kbuild
	include/uapi/linux/cifs/Kbuild
	include/uapi/linux/genwqe/Kbuild
	kernel/cpu.c
	kernel/exit.c
	kernel/sched/cpufreq_schedutil.c
	lib/Makefile
	lib/string.c
	mm/memory.c
	mm/page-writeback.c
	mm/page_alloc.c
	net/ipv4/udp.c
	net/ipv6/datagram.c
	net/ipv6/ip6_output.c
	net/netfilter/nf_conntrack_irc.c
	net/netfilter/xt_quota2.c
	net/netlink/genetlink.c
	security/selinux/avc.c
	security/selinux/include/objsec.h
	sound/core/compress_offload.c

Change-Id: I41982a5a8e22a21b72ec5dfa61a3680be66213f4
2023-03-26 12:09:00 +01:00
Greg Kroah-Hartman
5a227d815c Merge 4.9.325 into android-4.9-q
Changes in 4.9.325
	security,selinux,smack: kill security_task_wait hook
	xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
	misc: rtsx_usb: fix use of dma mapped buffer for usb bulk transfer
	misc: rtsx_usb: use separate command and response buffers
	misc: rtsx_usb: set return value in rsp_buf alloc err path
	xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup()
	power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe
	perf/core: Fix data race between perf_event_set_output() and perf_mmap_close()
	ip: Fix a data-race around sysctl_fwmark_reflect.
	tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept.
	tcp: Fix a data-race around sysctl_tcp_probe_threshold.
	i2c: cadence: Change large transfer count reset logic to be unconditional
	igmp: Fix data-races around sysctl_igmp_llm_reports.
	igmp: Fix a data-race around sysctl_igmp_max_memberships.
	tcp: Fix a data-race around sysctl_tcp_notsent_lowat.
	be2net: Fix buffer overflow in be_get_module_eeprom
	Revert "Revert "char/random: silence a lockdep splat with printk()""
	mm/mempolicy: fix uninit-value in mpol_rebind_policy()
	bpf: Make sure mac_header was set before using it
	ALSA: memalloc: Align buffer allocations in page size
	tty: drivers/tty/, stop using tty_schedule_flip()
	tty: the rest, stop using tty_schedule_flip()
	tty: drop tty_schedule_flip()
	tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push()
	tty: use new tty_insert_flip_string_and_push_buffer() in pty_write()
	net: usb: ax88179_178a needs FLAG_SEND_ZLP
	Linux 4.9.325

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia0b4189557d82a2402819decbcf4eb3c7a204d86
2022-07-29 18:58:19 +02:00
Stephen Smalley
ab83798bd5 security,selinux,smack: kill security_task_wait hook
commit 3a2f5a59a695a73e0cde9a61e0feae5fa730e936 upstream.

As reported by yangshukui, a permission denial from security_task_wait()
can lead to a soft lockup in zap_pid_ns_processes() since it only expects
sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can
in general lead to zombies; in the absence of some way to automatically
reparent a child process upon a denial, the hook is not useful.  Remove
the security hook and its implementations in SELinux and Smack.  Smack
already removed its check from its hook.

Reported-by: yangshukui <yangshukui@huawei.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexander Grund <theflamefire89@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-29 17:05:44 +02:00
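The lockup described in the commit above can be illustrated with a small userspace model. This is a hedged sketch, not kernel code: `model_ns`, `model_wait` and `model_zap_terminates` are hypothetical names, `-EACCES` is only an assumed stand-in for the error a denying hook would produce, and the iteration cap exists so the model can report a loop that would otherwise spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PID namespace being torn down. */
struct model_ns {
	int reapable;  /* children that can still be reaped */
	int final_rc;  /* what the wait call returns once none are left */
};

/* Stand-in for the wait call: 0 for each reaped child, then final_rc. */
int model_wait(struct model_ns *ns)
{
	assert(ns != NULL);
	if (ns->reapable > 0) {
		ns->reapable--;
		return 0;
	}
	return ns->final_rc;
}

/*
 * Reap loop that treats every result other than -ECHILD as "keep going"
 * (an assumption of this model). Returns true if the loop stopped within
 * max_iters, false if it was still spinning when the cap was reached.
 */
bool model_zap_terminates(struct model_ns *ns, int max_iters)
{
	for (int i = 0; i < max_iters; i++) {
		if (model_wait(ns) == -ECHILD)
			return true;
	}
	return false;
}
```

With `final_rc` set to `-ECHILD` the loop ends once the children are reaped. With a persistent denial it never sees the value it is waiting for, which is the situation the commit avoids by removing the hook.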
Greg Kroah-Hartman
2a2b02a000 Merge 4.9.255 into android-4.9-q
Changes in 4.9.255
	ACPI: sysfs: Prefer "compatible" modalias
	wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
	net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family
	y2038: futex: Move compat implementation into futex.c
	futex: Move futex exit handling into futex code
	futex: Replace PF_EXITPIDONE with a state
	exit/exec: Seperate mm_release()
	futex: Split futex_mm_release() for exit/exec
	futex: Set task::futex_state to DEAD right after handling futex exit
	futex: Mark the begin of futex exit explicitly
	futex: Sanitize exit state handling
	futex: Provide state handling for exec() as well
	futex: Add mutex around futex exit
	futex: Provide distinct return value when owner is exiting
	futex: Prevent exit livelock
	KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[]
	KVM: x86: get smi pending status correctly
	leds: trigger: fix potential deadlock with libata
	mt7601u: fix kernel crash unplugging the device
	mt7601u: fix rx buffer refcounting
	ARM: imx: build suspend-imx6.S with arm instruction set
	netfilter: nft_dynset: add timeout extension to template
	xfrm: Fix oops in xfrm_replay_advance_bmp
	RDMA/cxgb4: Fix the reported max_recv_sge value
	iwlwifi: pcie: use jiffies for memory read spin time limit
	iwlwifi: pcie: reschedule in long-running memory reads
	mac80211: pause TX while changing interface type
	can: dev: prevent potential information leak in can_fill_info()
	iommu/vt-d: Gracefully handle DMAR units with no supported address widths
	iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
	NFC: fix resource leak when target index is invalid
	NFC: fix possible resource leak
	Linux 4.9.255

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1ead684216d7f27b8209f4d680f40b3619d16e3a
2021-02-03 23:44:54 +01:00
Thomas Gleixner
32d782808b futex: Mark the begin of futex exit explicitly
commit 18f694385c4fd77a09851fd301236746ca83f3cb upstream.

Instead of relying on PF_EXITING use an explicit state for the futex exit
and set it in the futex exit function. This moves the smp barrier and the
lock/unlock serialization into the futex code.

As with the DEAD state this is restricted to the exit path as exec
continues to use the same task struct.

This allows that logic to be simplified in a later step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.539409004@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:19:49 +01:00
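The explicit exit state described in the commit above, together with the DEAD state it refers to, can be pictured as a small state machine. The following is a minimal userspace sketch under stated assumptions: every `model_` name is hypothetical, and the barriers and locking that the commit moves into the futex code are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; only the DEAD state is named in the commit text. */
enum model_futex_state {
	MODEL_FUTEX_OK,       /* task is running normally */
	MODEL_FUTEX_EXITING,  /* futex exit cleanup has begun */
	MODEL_FUTEX_DEAD      /* futex exit cleanup has finished */
};

struct model_task {
	enum model_futex_state futex_state;
	int cleanups;  /* number of futex cleanups performed */
};

/* Exit path: mark the start explicitly, clean up, then mark DEAD. */
void model_futex_exit_release(struct model_task *tsk)
{
	assert(tsk != NULL);
	tsk->futex_state = MODEL_FUTEX_EXITING;
	tsk->cleanups++;
	tsk->futex_state = MODEL_FUTEX_DEAD;
}

/* Exec path: same cleanup, but the task struct stays in active use. */
void model_futex_exec_release(struct model_task *tsk)
{
	assert(tsk != NULL);
	tsk->cleanups++;
}

/* A waiter can classify the owner from the state alone, without PF_ flags. */
int model_owner_is_gone(const struct model_task *owner)
{
	assert(owner != NULL);
	return owner->futex_state == MODEL_FUTEX_DEAD;
}
```

The point of the sketch is that the state is set inside the futex helpers rather than by the caller, so the exit and exec paths can differ without the caller knowing about futex internals.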
Thomas Gleixner
c2fd4e1198 futex: Set task::futex_state to DEAD right after handling futex exit
commit f24f22435dcc11389acc87e5586239c1819d217c upstream.

Setting task::futex_state in do_exit() is rather arbitrarily placed for no
reason. Move it into the futex code.

Note, this is only done for the exit cleanup as the exec cleanup cannot set
the state to FUTEX_STATE_DEAD because the task struct is still in active
use.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.439511191@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:19:49 +01:00
Thomas Gleixner
394ff1207f exit/exec: Seperate mm_release()
commit 4610ba7ad877fafc0a25a30c6c82015304120426 upstream.

mm_release() contains the futex exit handling. mm_release() is called from
do_exit()->exit_mm() and from exec()->exec_mm().

In the exit_mm() case PF_EXITING and the futex state are updated. In the
exec_mm() case these states are not touched.

As the futex exit code needs further protections against exit races, this
needs to be split into two functions.

Preparatory only, no functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.240518241@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:19:49 +01:00
Thomas Gleixner
2c11689578 futex: Replace PF_EXITPIDONE with a state
commit 3d4775df0a89240f671861c6ab6e8d59af8e9e41 upstream.

The futex exit handling relies on PF_ flags. That's suboptimal as it
requires a smp_mb() and an ugly lock/unlock of the exiting tasks pi_lock in
the middle of do_exit() to enforce the observability of PF_EXITING in the
futex code.

Add a futex_state member to task_struct and convert the PF_EXITPIDONE logic
over to the new state. The PF_EXITING dependency will be cleaned up in a
later step.

This prepares for handling various futex exit issues later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.149449274@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:19:49 +01:00
Greg Kroah-Hartman
a3ba0ea9cb Merge 4.9.245 into android-4.9-q
Changes in 4.9.245
	powerpc/64s: Define MASKABLE_RELON_EXCEPTION_PSERIES_OOL
	powerpc/64s: move some exception handlers out of line
	powerpc/64s: flush L1D on kernel entry
	powerpc: Add a framework for user access tracking
	powerpc: Implement user_access_begin and friends
	powerpc: Fix __clear_user() with KUAP enabled
	powerpc/uaccess: Evaluate macro arguments once, before user access is allowed
	powerpc/64s: flush L1D after user accesses
	i2c: imx: use clk notifier for rate changes
	i2c: imx: Fix external abort on interrupt in exit paths
	i2c: mux: pca954x: Add missing pca9546 definition to chip_desc
	powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
	Input: sunkbd - avoid use-after-free in teardown paths
	mac80211: always wind down STA state
	KVM: x86: clflushopt should be treated as a no-op by emulation
	ACPI: GED: fix -Wformat
	Linux 4.9.245

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I688b066e99eeb16270414e0c4cb4dc3bb244486c
2020-11-22 10:31:19 +01:00
Al Viro
951cb4f231 don't dump the threads that had been already exiting when zapped.
commit 77f6ab8b7768cf5e6bdd0e72499270a0671506ee upstream.

Coredump logic needs to report not only the registers of the dumping
thread, but (since 2.5.43) those of other threads getting killed.

Doing that might require extra state saved on the stack in asm glue at
kernel entry; signal delivery logic does that (we need to be able to
save sigcontext there, at the very least) and so does seccomp.

That covers all callers of do_coredump().  Secondary threads get hit with
SIGKILL and caught as soon as they reach exit_mm(), which normally happens
in signal delivery, so those are also fine most of the time.  Unfortunately,
it is possible to end up with a secondary thread zapped when it has already
entered exit(2) (or, worse yet, is oopsing).  In those cases we reach exit_mm()
when mm->core_state is already set, but the stack contents are not what
we would have in signal delivery.

On at least two architectures (alpha and m68k) it leads to infoleaks - we
end up with a chunk of kernel stack written into the coredump, with the contents
consisting of normal C stack frames of the call chain leading to exit_mm()
instead of the expected copy of userland registers.  In the case of alpha we
leak 312 bytes of stack.  Other architectures (including the regset-using
ones) might have similar problems - the normal user of regsets is ptrace
and the state of tracee at the time of such calls is special in the same
way signal delivery is.

Note that had the zapper gotten to the exiting thread slightly later,
it wouldn't have been included in the coredump anyway - we skip the threads
that have already cleared their ->mm.  So let's pretend that the zapper always
loses the race.  IOW, have exit_mm() only insert into the dumper list if
we'd gotten there from handling a fatal signal[*].

As a result, the callers of do_exit() that have *not* gone through get_signal()
are not seen by the coredump logic as secondary threads, which excludes voluntary
exit()/oopsen/traps/etc.  The dumper thread itself is unaffected by that,
so seccomp is fine.

[*] originally I intended to add a new flag in tsk->flags, but ebiederman pointed
out that PF_SIGNALED is already doing just what we need.

Cc: stable@vger.kernel.org
Fixes: d89f3847def4 ("[PATCH] thread-aware coredumps, 2.5.43-C3")
History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-18 18:26:28 +01:00
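The rule this commit introduces can be summarised as a single predicate: an exiting thread joins the dumper list only if a coredump is in progress and the thread got there by handling a fatal signal. The sketch below is a userspace illustration under stated assumptions. `model_thread`, `model_joins_dumper_list` and the bit value of `MODEL_PF_SIGNALED` are hypothetical and do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PF_SIGNALED 0x1u  /* hypothetical bit, not the kernel's value */

struct model_thread {
	unsigned int flags;
	bool core_state_set;  /* a coredump of this mm is in progress */
};

/*
 * True only when a dump is in progress AND the thread reached exit via a
 * fatal signal. Voluntary exits and oopses leave the flag clear, so their
 * stack contents are never treated as saved userland registers.
 */
bool model_joins_dumper_list(const struct model_thread *t)
{
	assert(t != NULL);
	if (!t->core_state_set)
		return false;
	return (t->flags & MODEL_PF_SIGNALED) != 0;
}
```

Reusing an existing flag, as the footnote in the commit message explains, avoids adding a new bit to the task flags.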
Greg Kroah-Hartman
0f1687ebb5 Merge 4.9.228 into android-4.9-q
Changes in 4.9.228
	ipv6: fix IPV6_ADDRFORM operation logic
	vxlan: Avoid infinite loop when suppressing NS messages with invalid options
	scsi: return correct blkprep status code in case scsi_init_io() fails.
	crypto: talitos - fix ECB and CBC algs ivsize
	ARM: 8977/1: ptrace: Fix mask for thumb breakpoint hook
	sched/fair: Don't NUMA balance for kthreads
	drivers/net/ibmvnic: Update VNIC protocol version reporting
	ath9k_htc: Silence undersized packet warnings
	x86_64: Fix jiffies ODR violation
	x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs
	x86/speculation: Prevent rogue cross-process SSBD shutdown
	x86/reboot/quirks: Add MacBook6,1 reboot quirk
	efi/efivars: Add missing kobject_put() in sysfs entry creation error path
	ALSA: es1688: Add the missed snd_card_free()
	ALSA: usb-audio: Fix inconsistent card PM state after resume
	ACPI: sysfs: Fix reference count leak in acpi_sysfs_add_hotplug_profile()
	ACPI: CPPC: Fix reference count leak in acpi_cppc_processor_probe()
	ACPI: GED: add support for _Exx / _Lxx handler methods
	ACPI: PM: Avoid using power resources if there are none for D0
	cgroup, blkcg: Prepare some symbols for module and !CONFIG_CGROUP usages
	nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
	spi: bcm2835aux: Fix controller unregister order
	spi: bcm-qspi: when tx/rx buffer is NULL set to 0
	ALSA: pcm: disallow linking stream to itself
	x86/speculation: Change misspelled STIPB to STIBP
	x86/speculation: Add support for STIBP always-on preferred mode
	x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
	x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
	spi: dw: fix possible race condition
	spi: dw: Fix controller unregister order
	spi: No need to assign dummy value in spi_unregister_controller()
	spi: Fix controller unregister order
	spi: pxa2xx: Fix controller unregister order
	spi: bcm2835: Fix controller unregister order
	ovl: initialize error in ovl_copy_xattr
	proc: Use new_inode not new_inode_pseudo
	video: fbdev: w100fb: Fix a potential double free.
	KVM: nSVM: leave ASID aside in copy_vmcb_control_area
	KVM: nVMX: Consult only the "basic" exit reason when routing nested exit
	KVM: MIPS: Define KVM_ENTRYHI_ASID to cpu_asid_mask(&boot_cpu_data)
	KVM: MIPS: Fix VPN2_MASK definition for variable cpu_vmbits
	KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
	ath9k: Fix use-after-free Read in ath9k_wmi_ctrl_rx
	ath9k: Fix use-after-free Write in ath9k_htc_rx_msg
	ath9x: Fix stack-out-of-bounds Write in ath9k_hif_usb_rx_cb
	ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb
	Smack: slab-out-of-bounds in vsscanf
	mm/slub: fix a memory leak in sysfs_slab_add()
	fat: don't allow to mount if the FAT length == 0
	perf: Add cond_resched() to task_function_call()
	agp/intel: Reinforce the barrier after GTT updates
	can: kvaser_usb: kvaser_usb_leaf: Fix some info-leaks to USB devices
	media: dvb_frontend: ensure that inital front end status initialized
	ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
	media: si2157: Better check for running tuner in init
	objtool: Ignore empty alternatives
	net: ena: fix error returning in ena_com_get_hash_function()
	spi: dw: Zero DMA Tx and Rx configurations on stack
	Bluetooth: Add SCO fallback for invalid LMP parameters error
	kgdb: Prevent infinite recursive entries to the debugger
	spi: dw: Enable interrupts in accordance with DMA xfer mode
	clocksource: dw_apb_timer_of: Fix missing clockevent timers
	btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums
	ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE
	x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
	net: vmxnet3: fix possible buffer overflow caused by bad DMA value in vmxnet3_get_rss()
	staging: android: ion: use vmap instead of vm_map_ram
	e1000: Distribute switch variables for initialization
	dt-bindings: display: mediatek: control dpi pins mode to avoid leakage
	media: dvb: return -EREMOTEIO on i2c transfer failure.
	media: platform: fcp: Set appropriate DMA parameters
	MIPS: Make sparse_init() using top-down allocation
	netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supported
	lib/mpi: Fix 64-bit MIPS build with Clang
	exit: Move preemption fixup up, move blocking operations down
	net: lpc-enet: fix error return code in lpc_mii_init()
	net: allwinner: Fix use correct return type for ndo_start_xmit()
	powerpc/spufs: fix copy_to_user while atomic
	MIPS: Truncate link address into 32bit for 32bit kernel
	mips: cm: Fix an invalid error code of INTVN_*_ERR
	kgdb: Fix spurious true from in_dbg_master()
	md: don't flush workqueue unconditionally in md_open
	rtlwifi: Fix a double free in _rtl_usb_tx_urb_setup()
	mwifiex: Fix memory corruption in dump_station
	x86/boot: Correct relocation destination on old linkers
	mips: Add udelay lpj numbers adjustment
	x86/mm: Stop printing BRK addresses
	m68k: mac: Don't call via_flush_cache() on Mac IIfx
	macvlan: Skip loopback packets in RX handler
	PCI: Don't disable decoding when mmio_always_on is set
	MIPS: Fix IRQ tracing when call handle_fpe() and handle_msa_fpe()
	staging: greybus: sdio: Respect the cmd->busy_timeout from the mmc core
	ixgbe: fix signed-integer-overflow warning
	mmc: sdhci-esdhc-imx: fix the mask for tuning start point
	spi: dw: Return any value retrieved from the dma_transfer callback
	cpuidle: Fix three reference count leaks
	btrfs: send: emit file capabilities after chown
	mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()
	ima: Fix ima digest hash table key calculation
	ima: Directly assign the ima_default_policy pointer to ima_rules
	evm: Fix possible memory leak in evm_calc_hmac_or_hash()
	ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_max
	ext4: fix race between ext4_sync_parent() and rename()
	btrfs: fix error handling when submitting direct I/O bio
	blk-mq: move blk_mq_update_nr_hw_queues synchronize_rcu call
	PCI: Program MPS for RCiEP devices
	e1000e: Relax condition to trigger reset for ME workaround
	carl9170: remove P2P_GO support
	media: go7007: fix a miss of snd_card_free
	b43legacy: Fix case where channel status is corrupted
	b43: Fix connection problem with WPA3
	b43_legacy: Fix connection problem with WPA3
	igb: Report speed and duplex as unknown when device is runtime suspended
	power: vexpress: add suppress_bind_attrs to true
	pinctrl: samsung: Save/restore eint_mask over suspend for EINT_TYPE GPIOs
	sparc32: fix register window handling in genregs32_[gs]et()
	sparc64: fix misuses of access_process_vm() in genregs32_[sg]et()
	kernel/cpu_pm: Fix uninitted local in cpu_pm
	ARM: tegra: Correct PL310 Auxiliary Control Register initialization
	drivers/macintosh: Fix memleak in windfarm_pm112 driver
	kbuild: force to build vmlinux if CONFIG_MODVERSION=y
	sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
	sunrpc: clean up properly in gss_mech_unregister()
	mtd: rawnand: brcmnand: fix hamming oob layout
	mtd: rawnand: pasemi: Fix the probe error path
	w1: omap-hdq: cleanup to add missing newline for some dev_dbg
	perf probe: Do not show the skipped events
	perf symbols: Fix debuginfo search for Ubuntu
	Linux 4.9.228

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I26fadb036b8aab801c0ba5e7e5ed99170cf0f783
2020-06-20 12:50:27 +02:00
Jann Horn
1e587ce792 exit: Move preemption fixup up, move blocking operations down
[ Upstream commit 586b58cac8b4683eb58a1446fbc399de18974e40 ]

With CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_CGROUPS=y, kernel oopses in
non-preemptible context look untidy; after the main oops, the kernel prints
a "sleeping function called from invalid context" report because
exit_signals() -> cgroup_threadgroup_change_begin() -> percpu_down_read()
can sleep, and that happens before the preempt_count_set(PREEMPT_ENABLED)
fixup.

It looks like the same thing applies to profile_task_exit() and
kcov_task_exit().

Fix it by moving the preemption fixup up and the calls to
profile_task_exit() and kcov_task_exit() down.

Fixes: 1dc0fffc48 ("sched/core: Robustify preemption leak checks")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200305220657.46800-1-jannh@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-20 10:24:16 +02:00
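The ordering problem fixed by this commit can be shown with a toy model: a step that may sleep produces a report if it runs while the task is still marked atomic, so running the fixup first removes the report. This is a hedged sketch with hypothetical names (`model_ctx`, `model_might_sleep`, `model_preempt_fixup`). A single sleeping step stands in for the exit_signals() path named in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_ctx {
	bool atomic;  /* true while preemption is still wrongly disabled */
	int reports;  /* count of "sleeping function called" style reports */
};

/* A step that may sleep: complains if the context is still atomic. */
void model_might_sleep(struct model_ctx *c)
{
	assert(c != NULL);
	if (c->atomic)
		c->reports++;
}

/* Stand-in for the preemption fixup performed during exit. */
void model_preempt_fixup(struct model_ctx *c)
{
	assert(c != NULL);
	c->atomic = false;
}

/* Old order: a sleeping step runs before the fixup. */
int model_exit_old_order(bool oops_in_atomic)
{
	struct model_ctx c = { oops_in_atomic, 0 };

	model_might_sleep(&c);
	model_preempt_fixup(&c);
	return c.reports;
}

/* New order: the fixup runs first, then the sleeping step. */
int model_exit_new_order(bool oops_in_atomic)
{
	struct model_ctx c = { oops_in_atomic, 0 };

	model_preempt_fixup(&c);
	model_might_sleep(&c);
	return c.reports;
}
```

Both orders behave the same when the exit does not start in atomic context, which matches the commit's description of the issue as cosmetic noise after an oops.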
jianzhou
dca3398ea7 Merge android-4.9.190 (476e7ea) into msm-4.9
* refs/heads/tmp-476e7ea:
  Linux 4.9.190
  bonding: Add vlan tx offload to hw_enc_features
  team: Add vlan tx offload to hw_enc_features
  net/mlx5e: Use flow keys dissector to parse packets for ARFS
  net/mlx5e: Only support tx/rx pause setting for port owner
  xen/netback: Reset nr_frags before freeing skb
  sctp: fix the transport error_count check
  net/packet: fix race in tpacket_snd()
  bnx2x: Fix VF's VLAN reconfiguration in reload.
  iommu/amd: Move iommu_init_pci() to .init section
  Input: psmouse - fix build error of multiple definition
  netfilter: conntrack: Use consistent ct id hash calculation
  arm64: compat: Allow single-byte watchpoints on all addresses
  bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
  asm-generic: fix -Wtype-limits compiler warnings
  USB: serial: option: Add Motorola modem UARTs
  USB: serial: option: add the BroadMobi BM818 card
  USB: serial: option: Add support for ZTE MF871A
  USB: serial: option: add D-Link DWM-222 device ID
  USB: CDC: fix sanity checks in CDC union parser
  usb: cdc-acm: make sure a refcount is taken early enough
  USB: core: Fix races in character device registration and deregistraion
  staging: comedi: dt3000: Fix rounding up of timer divisor
  staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
  ocfs2: remove set but not used variable 'last_hash'
  IB/mad: Fix use-after-free in ib mad completion handling
  IB/core: Add mitigation for Spectre V1
  arm64/mm: fix variable 'pud' set but not used
  arm64/efi: fix variable 'si' set but not used
  kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
  ata: libahci: do not complain in case of deferred probe
  scsi: hpsa: correct scsi command status issue after reset
  libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
  perf header: Fix use of unitialized value warning
  perf header: Fix divide by zero error if f_header.attr_size==0
  irqchip/irq-imx-gpcv2: Forward irq type to parent
  xen/pciback: remove set but not used variable 'old_state'
  net: usb: pegasus: fix improper read if get_registers() fail
  Input: iforce - add sanity checks
  Input: kbtab - sanity check for endpoint type
  HID: hiddev: do cleanup in failure of opening a device
  HID: hiddev: avoid opening a disconnected device
  HID: holtek: test for sanity of intfdata
  ALSA: hda - Let all conexant codec enter D3 when rebooting
  ALSA: hda - Add a generic reboot_notify
  ALSA: hda - Fix a memory leak bug
  xtensa: add missing isync to the cpu_reset TLB code
  netfilter: ctnetlink: don't use conntrack/expect object addresses as id
  inet: switch IP ID generator to siphash
  siphash: implement HalfSipHash1-3 for hash tables
  siphash: add cryptographically secure PRF
  vhost: scsi: add weight support
  vhost_net: fix possible infinite loop
  vhost: introduce vhost_exceeds_weight()
  vhost_net: introduce vhost_exceeds_weight()
  vhost_net: use packet weight for rx handler, too
  vhost-net: set packet weight of tx polling to 2 * vq size
  bpf: add bpf_jit_limit knob to restrict unpriv allocations
  bpf: restrict access to core bpf sysctls
  bpf: get rid of pure_initcall dependency to enable jits
  mm/memcontrol.c: fix use after free in mem_cgroup_iter()
  mm/usercopy: use memory range to be accessed for wraparound check
  sh: kernel: hw_breakpoint: Fix missing break in switch statement
  scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
  iwlwifi: don't unmap as page memory that was mapped as single
  mwifiex: fix 802.11n/WPA detection
  smb3: send CAP_DFS capability during session setup
  SMB3: Fix deadlock in validate negotiate hits reconnect
  mac80211: don't WARN on short WMM parameters from AP
  ALSA: hda - Don't override global PCM hw info flag
  ALSA: firewire: fix a memory leak bug
  hwmon: (nct7802) Fix wrong detection of in4 presence
  can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
  can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
  perf/core: Fix creating kernel counters for PMUs that override event->cpu
  tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
  scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG
  scsi: ibmvfc: fix WARN_ON during event pool release
  scsi: megaraid_sas: fix panic on loading firmware crashdump
  ARM: davinci: fix sleep.S build error on ARMv4
  ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id()
  drbd: dynamically allocate shash descriptor
  perf probe: Avoid calling freeing routine multiple times for same pointer
  ALSA: compress: Be more restrictive about when a drain is allowed
  ALSA: compress: Don't allow paritial drain operations on capture streams
  ALSA: compress: Prevent bypasses of set_params
  ALSA: compress: Fix regression on compressed capture streams
  s390/qdio: add sanity checks to the fast-requeue path
  cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
  hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
  mac80211: don't warn about CW params when not using them
  iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
  netfilter: nfnetlink: avoid deadlock due to synchronous request_module
  can: peak_usb: fix potential double kfree_skb()
  usb: yurex: Fix use-after-free in yurex_delete
  perf record: Fix module size on s390
  perf db-export: Fix thread__exec_comm()
  perf record: Fix wrong size in perf_record_mmap for last kernel module
  mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
  x86/mm: Sync also unmappings in vmalloc_sync_all()
  x86/mm: Check for pfn instead of page in vmalloc_sync_one()
  sound: fix a memory leak bug
  usb: iowarrior: fix deadlock on disconnect
  usb: usbfs: fix double-free of usb memory upon submiturb error
  BACKPORT: arch: add pidfd and io_uring syscalls everywhere
  ANDROID: arch: add missing pidfd_open definitions for arm32
  ANDROID: fix kernelci build-break in lowmemorykiller
  f2fs: fix build error on android tracepoints
  ANDROID: Avoid taking multiple locks in handle_lmk_event
  UPSTREAM: net/ipv6: allow sysctl to change link-local address generation mode
  ANDROID: fix binder change in merge of 4.9.188
  UPSTREAM: pidfd: fix a poll race when setting exit_state
  BACKPORT: arch: wire-up pidfd_open()
  BACKPORT: pid: add pidfd_open()
  UPSTREAM: pidfd: add polling support
  UPSTREAM: signal: improve comments
  BACKPORT: fork: do not release lock that wasn't taken
  BACKPORT: signal: support CLONE_PIDFD with pidfd_send_signal
  BACKPORT: clone: add CLONE_PIDFD
  UPSTREAM: Make anon_inodes unconditional
  UPSTREAM: signal: use fdget() since we don't allow O_PATH
  UPSTREAM: signal: don't silently convert SI_USER signals to non-current pidfd
  BACKPORT: signal: add pidfd_send_signal() syscall

Conflicts:
	drivers/staging/android/lowmemorykiller.c
	include/linux/ipv6.h
	net/ipv6/addrconf.c
	sound/core/compress_offload.c

Change-Id: I18be309a1a2fd17077b949c7b7113f407a9033a8
Signed-off-by: jianzhou <jianzhou@codeaurora.org>
2019-10-23 10:32:31 +08:00
Suren Baghdasaryan
42cafda296 UPSTREAM: pidfd: fix a poll race when setting exit_state
There is a race between reading task->exit_state in pidfd_poll and
writing it after do_notify_parent calls do_notify_pidfd. Expected
sequence of events is:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
  tsk->exit_state = EXIT_DEAD
                                  pidfd_poll
                                     if (tsk->exit_state)

However nothing prevents the following sequence:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
                                   pidfd_poll
                                      if (tsk->exit_state)
  tsk->exit_state = EXIT_DEAD

This causes a polling task to wait forever, since poll blocks because
exit_state is 0 and the waiting task is not notified again. A stress
test continuously doing pidfd poll and process exits uncovered this bug.

To fix it, we make sure that the task's exit_state is always set before
calling do_notify_pidfd.

Fixes: b53b0b9d9a6 ("pidfd: add polling support")
Cc: kernel-team@android.com
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org
[christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie]
Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit b191d6491be67cef2b3fa83015561caca1394ab9)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: Ia9419ceac08497523c4d830160df49f582075070
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:51:35 -07:00
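The two interleavings in the commit message above can be replayed in a single-threaded model, taking the worst case where the poller samples exit_state at the instant it is notified. This is a minimal sketch under stated assumptions: all `model_` names are hypothetical, and the real code involves wait queues and concurrency that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_RUNNING = 0, MODEL_EXIT_DEAD = 1 };

struct model_task {
	int exit_state;     /* 0 while the task is alive */
	bool poller_ready;  /* what the poller concluded when it was woken */
};

/* The poller is woken once and samples exit_state at that moment. */
void model_notify(struct model_task *t)
{
	assert(t != NULL);
	t->poller_ready = (t->exit_state != MODEL_RUNNING);
}

/* Racy order: notify first, publish the exit state afterwards. */
bool model_exit_notify_then_set(struct model_task *t)
{
	assert(t != NULL);
	model_notify(t);
	t->exit_state = MODEL_EXIT_DEAD;
	return t->poller_ready;
}

/* Fixed order: publish the exit state, then notify. */
bool model_exit_set_then_notify(struct model_task *t)
{
	assert(t != NULL);
	t->exit_state = MODEL_EXIT_DEAD;
	model_notify(t);
	return t->poller_ready;
}
```

In the racy order the poller sees a zero exit_state and, because no second notification arrives, would keep waiting. In the fixed order the state is always visible by the time the poller is woken.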
Suren Baghdasaryan
0fc8665ea6 UPSTREAM: pidfd: fix a poll race when setting exit_state
There is a race between reading task->exit_state in pidfd_poll and
writing it after do_notify_parent calls do_notify_pidfd. Expected
sequence of events is:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
  tsk->exit_state = EXIT_DEAD
                                  pidfd_poll
                                     if (tsk->exit_state)

However nothing prevents the following sequence:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
                                   pidfd_poll
                                      if (tsk->exit_state)
  tsk->exit_state = EXIT_DEAD

This causes a polling task to wait forever, since poll blocks because
exit_state is 0 and the waiting task is not notified again. A stress
test continuously doing pidfd poll and process exits uncovered this bug.

To fix it, we make sure that the task's exit_state is always set before
calling do_notify_pidfd.
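
The two orderings above can be replayed with a small user-space model. This
is only an illustrative sketch, not kernel code: the names struct model,
enum step and poller_ready() are hypothetical. It assumes that a wake-up
simply makes a sleeping poller re-check exit_state, and that a poller which
finds exit_state == 0 goes to sleep until the next wake-up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event names; they mirror the steps in the diagrams above. */
enum step {
	NOTIFY,          /* do_notify_pidfd(): wake any sleeping poller */
	SET_EXIT_STATE,  /* tsk->exit_state becomes non-zero            */
	POLL_CHECK       /* pidfd_poll(): if (tsk->exit_state)          */
};

/* Minimal model state (an assumption of this sketch, not a kernel type). */
struct model {
	int exit_state;
	bool poller_sleeping;
	bool poller_ready;
};

/*
 * Replay three events in the given order and report whether the poller
 * ends up seeing the exit. A result of false models the "waits forever"
 * outcome: the poller sleeps and no further wake-up arrives.
 */
static bool poller_ready(const enum step order[3])
{
	struct model m = { 0, false, false };

	assert(order);

	for (int i = 0; i < 3; i++) {
		switch (order[i]) {
		case SET_EXIT_STATE:
			m.exit_state = 1;
			break;
		case NOTIFY:
			/* A woken poller re-checks exit_state. */
			if (m.poller_sleeping && m.exit_state) {
				m.poller_sleeping = false;
				m.poller_ready = true;
			}
			break;
		case POLL_CHECK:
			if (m.exit_state)
				m.poller_ready = true;
			else
				m.poller_sleeping = true;
			break;
		}
	}
	return m.poller_ready;
}
```

In this model the order { NOTIFY, POLL_CHECK, SET_EXIT_STATE } leaves the
poller asleep, while every order that places SET_EXIT_STATE before NOTIFY
lets the poller observe the exit, which is the ordering the fix enforces.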

Fixes: b53b0b9d9a6 ("pidfd: add polling support")
Cc: kernel-team@android.com
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org
[christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie]
Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit b191d6491be67cef2b3fa83015561caca1394ab9)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: Ia9419ceac08497523c4d830160df49f582075070
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:34:02 -04:00
jianzhou
2f8eb1ca38 Merge android-4.9.155 (32e6695) into msm-4.9
* refs/heads/tmp-32e6695:
  Linux 4.9.155
  fanotify: fix handling of events on child sub-directory
  fs: don't scan the inode cache before SB_BORN is set
  drivers: core: Remove glue dirs from sysfs earlier
  cifs: Always resolve hostname before reconnecting
  mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
  mm: hwpoison: use do_send_sig_info() instead of force_sig()
  mm, oom: fix use-after-free in oom_kill_process
  kernel/exit.c: release ptraced tasks before zap_pid_ns_processes
  mmc: sdhci-iproc: handle mmc_of_parse() errors during probe
  platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes
  platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK
  gfs2: Revert "Fix loop in gfs2_rbm_find"
  arm64: hibernate: Clean the __hyp_text to PoC after resume
  arm64: hyp-stub: Forbid kprobing of the hyp-stub
  arm64: kaslr: ensure randomized quantities are clean also when kaslr is off
  ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment
  fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()
  CIFS: Do not count -ENODATA as failure for query directory
  ipvlan, l3mdev: fix broken l3s mode wrt local routes
  l2tp: fix reading optional fields of L2TPv3
  l2tp: remove l2specific_len dependency in l2tp_core
  net/mlx5e: Allow MAC invalidation while spoofchk is ON
  ucc_geth: Reset BQL queue when stopping device
  net/rose: fix NULL ax25_cb kernel panic
  netrom: switch to sock timer API
  net/mlx4_core: Add masking for a few queries on HCA caps
  l2tp: copy 4 more bytes to linear part if necessary
  ipv6: Consider sk_bound_dev_if when binding a socket to an address
  fs: add the fsnotify call to vfs_iter_write
  Fix "net: ipv4: do not handle duplicate fragments as overlapping"
  BACKPORT: net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
  UPSTREAM: binder: filter out nodes when showing binder procs
  UPSTREAM: xfrm: Make set-mark default behavior backward compatible
  ANDROID: cuttlefish_defconfig: Enable CONFIG_RTC_HCTOSYS
  ANDROID: zram: fix incorrect assignment for access time
  UPSTREAM: zram: idle writeback fixes and cleanup
  UPSTREAM: zram: writeback throttle
  UPSTREAM: zram: add bd_stat statistics
  BACKPORT: zram: support idle/huge page writeback
  UPSTREAM: zram: introduce ZRAM_IDLE flag
  BACKPORT: zram: refactor flags and writeback stuff
  UPSTREAM: zram: fix double free backing device
  UPSTREAM: zram: fix lockdep warning of free block handling
  Linux 4.9.154
  btrfs: dev-replace: go back to suspended state if target device is missing
  btrfs: fix error handling in btrfs_dev_replace_start
  f2fs: read page index before freeing
  nvmet-rdma: fix null dereference under heavy load
  nvmet-rdma: Add unlikely for response allocated check
  s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU
  irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size
  perf unwind: Take pgoff into account when reporting elf to libdwfl
  perf unwind: Unwind with libdw doesn't take symfs into account
  vt: invoke notifier on screen size change
  can: bcm: check timer values before ktime conversion
  can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing it
  x86/kaslr: Fix incorrect i8254 outb() parameters
  x86/selftests/pkeys: Fork() to check for state being preserved
  KVM: x86: Fix single-step debugging
  dm thin: fix passdown_double_checking_shared_status()
  acpi/nfit: Fix command-supported detection
  acpi/nfit: Block function zero DSMs
  Input: uinput - fix undefined behavior in uinput_validate_absinfo()
  compiler.h: enable builtin overflow checkers and add fallback code
  Input: xpad - add support for SteelSeries Stratus Duo
  CIFS: Fix possible hang during async MTU reads and writes
  tty/n_hdlc: fix __might_sleep warning
  uart: Fix crash in uart_write and uart_put_char
  tty: Handle problem if line discipline does not have receive_buf
  staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1
  char/mwave: fix potential Spectre v1 vulnerability
  s390/smp: fix CPU hotplug deadlock with CPU rescan
  s390/early: improve machine detection
  ARC: perf: map generic branches to correct hardware condition
  ARCv2: lib: memeset: fix doing prefetchw outside of buffer
  ASoC: rt5514-spi: Fix potential NULL pointer dereference
  ASoC: atom: fix a missing check of snd_pcm_lib_malloc_pages
  USB: serial: pl2303: add new PID to support PL2303TB
  USB: serial: simple: add Motorola Tetra TPG2200 device id
  ipfrag: really prevent allocation on netns exit
  net_sched: refetch skb protocol for each filter
  net: ipv4: Fix memory leak in network namespace dismantle
  vhost: log dirty page correctly
  openvswitch: Avoid OOB read when parsing flow nlattrs
  net: Fix usage of pskb_trim_rcsum
  net: bridge: Fix ethernet header pointer before check skb forwardable
  Linux 4.9.153
  locking/qspinlock: Pull in asm/byteorder.h to ensure correct endianness
  ipmi:ssif: Fix handling of multi-part return messages
  mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
  mm/page-writeback.c: don't break integrity writeback on ->writepage() error
  ocfs2: fix panic due to unrecovered local alloc
  scsi: megaraid: fix out-of-bound array accesses
  scsi: smartpqi: correct lun reset issues
  sysfs: Disable lockdep for driver bind/unbind files
  ALSA: bebob: fix model-id of unit for Apogee Ensemble
  dm snapshot: Fix excessive memory usage and workqueue stalls
  tools lib subcmd: Don't add the kernel sources to the include path
  dm kcopyd: Fix bug causing workqueue stalls
  perf parse-events: Fix unchecked usage of strncpy()
  perf svghelper: Fix unchecked usage of strncpy()
  perf intel-pt: Fix error with config term "pt=0"
  tty/serial: do not free trasnmit buffer page under port lock
  mmc: atmel-mci: do not assume idle after atmci_request_end
  kconfig: fix memory leak when EOF is encountered in quotation
  kconfig: fix file name and line number of warn_ignored_character()
  clk: imx6q: reset exclusive gates on init
  scsi: target: use consistent left-aligned ASCII INQUIRY data
  net: call sk_dst_reset when set SO_DONTROUTE
  media: firewire: Fix app_info parameter type in avc_ca{,_app}_info
  powerpc/pseries/cpuidle: Fix preempt warning
  powerpc/xmon: Fix invocation inside lock region
  pstore/ram: Do not treat empty buffers as valid
  jffs2: Fix use of uninitialized delayed_work, lockdep breakage
  rxe: IB_WR_REG_MR does not capture MR's iova field
  selinux: always allow mounting submounts
  arm64: perf: set suppress_bind_attrs flag to true
  MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur
  ALSA: oxfw: add support for APOGEE duet FireWire
  serial: set suppress_bind_attrs flag only if builtin
  writeback: don't decrement wb->refcnt if !wb->bdi
  e1000e: allow non-monotonic SYSTIM readings
  platform/x86: asus-wmi: Tell the EC the OS will handle the display off hotkey
  ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses
  ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address
  r8169: Add support for new Realtek Ethernet
  ANDROID: cfi: fix shadow rebasing
  UPSTREAM: dm: do not allow readahead to limit IO size
  UPSTREAM: readahead: stricter check for bdi io_pages
  UPSTREAM: mm: don't cap request size based on read-ahead setting
  Revert "UPSTREAM: dm: do not allow readahead to limit IO size"
  UPSTREAM: dm: do not allow readahead to limit IO size
  UPSTREAM: ppp: Move PFC decompression to PPP generic layer
  UPSTREAM: l2tp: Add protocol field decompression
  BACKPORT: l2tp: remove ->recv_payload_hook

Change-Id: Ied9b99c5d4cec558b44c3cb720257458d7e3f40e
Signed-off-by: jianzhou <jianzhou@codeaurora.org>
2019-02-12 16:32:37 +08:00
Greg Kroah-Hartman
32e6695e35 Merge 4.9.155 into android-4.9
Changes in 4.9.155
	Fix "net: ipv4: do not handle duplicate fragments as overlapping"
	fs: add the fsnotify call to vfs_iter_write
	ipv6: Consider sk_bound_dev_if when binding a socket to an address
	l2tp: copy 4 more bytes to linear part if necessary
	net/mlx4_core: Add masking for a few queries on HCA caps
	netrom: switch to sock timer API
	net/rose: fix NULL ax25_cb kernel panic
	ucc_geth: Reset BQL queue when stopping device
	net/mlx5e: Allow MAC invalidation while spoofchk is ON
	l2tp: remove l2specific_len dependency in l2tp_core
	l2tp: fix reading optional fields of L2TPv3
	ipvlan, l3mdev: fix broken l3s mode wrt local routes
	CIFS: Do not count -ENODATA as failure for query directory
	fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()
	ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment
	arm64: kaslr: ensure randomized quantities are clean also when kaslr is off
	arm64: hyp-stub: Forbid kprobing of the hyp-stub
	arm64: hibernate: Clean the __hyp_text to PoC after resume
	gfs2: Revert "Fix loop in gfs2_rbm_find"
	platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK
	platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes
	mmc: sdhci-iproc: handle mmc_of_parse() errors during probe
	kernel/exit.c: release ptraced tasks before zap_pid_ns_processes
	mm, oom: fix use-after-free in oom_kill_process
	mm: hwpoison: use do_send_sig_info() instead of force_sig()
	mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
	cifs: Always resolve hostname before reconnecting
	drivers: core: Remove glue dirs from sysfs earlier
	fs: don't scan the inode cache before SB_BORN is set
	fanotify: fix handling of events on child sub-directory
	Linux 4.9.155

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-02-07 09:14:43 +01:00
Andrei Vagin
44ccc0cce1 kernel/exit.c: release ptraced tasks before zap_pid_ns_processes
commit 8fb335e078378c8426fabeed1ebee1fbf915690c upstream.

Currently, exit_ptrace() adds all ptraced tasks in a dead list, then
zap_pid_ns_processes() waits on all tasks in a current pidns, and only
then are tasks from the dead list released.

zap_pid_ns_processes() can get stuck on waiting tasks from the dead
list.  In this case, we will have one unkillable process with one or
more dead children.

Thanks to Oleg for the advice to release tasks in find_child_reaper().

Link: http://lkml.kernel.org/r/20190110175200.12442-1-avagin@gmail.com
Fixes: 7c8bd2322c ("exit: ptrace: shift "reap dead" code from exit_ptrace() to forget_original_parent()")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-06 17:33:29 +01:00
Blagovest Kolenichev
f56989ba37 Merge android-4.9.114 (dbcb748) into msm-4.9
* refs/heads/tmp-dbcb748:
  ANDROID: sdcardfs: Check stacked filesystem depth
  ANDROID: verity: really fix android-verity Kconfig
  tcp: detect malicious patterns in tcp_collapse_ofo_queue()
  tcp: avoid collapses in tcp_prune_queue() if possible
  tcp: free batches of packets in tcp_prune_ofo_queue()
  x86_64_cuttlefish_defconfig: Enable android-verity
  x86_64_cuttlefish_defconfig: enable verity cert
  ANDROID: android-verity: Fix broken parameter handling.
  ANDROID: android-verity: Make it work with newer kernels
  ANDROID: android-verity: Add API to verify signature with builtin keys.
  ANDROID: verity: fix android-verity Kconfig dependencies
  Linux 4.9.114
  string: drop __must_check from strscpy() and restore strscpy() usages in cgroup
  arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID
  arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests
  arm64: KVM: Add ARCH_WORKAROUND_2 support for guests
  arm64: KVM: Add HYP per-cpu accessors
  arm64: ssbd: Add prctl interface for per-thread mitigation
  arm64: ssbd: Introduce thread flag to control userspace mitigation
  arm64: ssbd: Restore mitigation status on CPU resume
  arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
  arm64: ssbd: Add global mitigation state accessor
  arm64: Add 'ssbd' command-line option
  arm64: Add ARCH_WORKAROUND_2 probing
  arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
  arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
  arm/arm64: smccc: Add SMCCC-specific return codes
  KVM: arm64: Avoid storing the vcpu pointer on the stack
  KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  arm64: alternatives: Add dynamic patching feature
  KVM: arm64: Stop save/restoring host tpidr_el1 on VHE
  arm64: alternatives: use tpidr_el2 on VHE hosts
  KVM: arm64: Change hyp_panic()s dependency on tpidr_el2
  KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocation
  KVM: arm64: Store vcpu on the stack during __guest_enter()
  arm64: assembler: introduce ldr_this_cpu
  net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
  rds: avoid unenecessary cong_update in loop transport
  netfilter: ipv6: nf_defrag: drop skb dst before queueing
  KEYS: DNS: fix parsing multiple options
  reiserfs: fix buffer overflow with long warning messages
  netfilter: ebtables: reject non-bridge targets
  net: lan78xx: Fix race in tx pending skb size calculation
  rtlwifi: rtl8821ae: fix firmware is not ready to run
  net: cxgb3_main: fix potential Spectre v1
  net/mlx5: Fix command interface race in polling mode
  net/packet: fix use-after-free
  vhost_net: validate sock before trying to put its fd
  tcp: prevent bogus FRTO undos with non-SACK flows
  tcp: fix Fast Open key endianness
  r8152: napi hangup fix after disconnect
  qmi_wwan: add support for the Dell Wireless 5821e module
  qed: Limit msix vectors in kdump kernel to the minimum required count.
  qed: Fix use of incorrect size in memcpy call.
  net: sungem: fix rx checksum support
  net_sched: blackhole: tell upper qdisc about dropped packets
  net/mlx5: Fix wrong size allocation for QoS ETC TC regitster
  net/mlx5: Fix incorrect raw command length parsing
  net: dccp: switch rx_tstamp_last_feedback to monotonic clock
  net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
  ipvlan: fix IFLA_MTU ignored on NEWLINK
  atm: zatm: Fix potential Spectre v1
  crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
  crypto: crypto4xx - remove bad list_del
  bcm63xx_enet: do not write to random DMA channel on BCM6345
  bcm63xx_enet: correct clock usage
  mtd: m25p80: consider max message size in m25p80_read
  ocfs2: ip_alloc_sem should be taken in ocfs2_get_block()
  ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent
  x86/paravirt: Make native_save_fl() extern inline
  x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>
  compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations
  compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled
  compiler, clang: properly override 'inline' for clang
  compiler, clang: suppress warning for unused static inline functions
  MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
  ANDROID: Fix massive cpufreq_times memory leaks
  ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES
  treewide: Use array_size in f2fs_kvzalloc()
  treewide: Use array_size() in f2fs_kzalloc()
  treewide: Use array_size() in f2fs_kmalloc()
  overflow.h: Add allocation size calculation helpers
  f2fs: fix to clear FI_VOLATILE_FILE correctly
  f2fs: let sync node IO interrupt async one
  f2fs: don't change wbc->sync_mode
  f2fs: fix to update mtime correctly
  fs: f2fs: insert space around that ':' and ', '
  fs: f2fs: add missing blank lines after declarations
  fs: f2fs: changed variable type of offset "unsigned" to "loff_t"
  f2fs: clean up symbol namespace
  f2fs: make set_de_type() static
  f2fs: make __f2fs_write_data_pages() static
  f2fs: fix to avoid accessing cross the boundary
  f2fs: fix to let caller retry allocating block address
  disable loading f2fs module on PAGE_SIZE > 4KB
  f2fs: fix error path of move_data_page
  f2fs: don't drop dentry pages after fs shutdown
  f2fs: fix to avoid race during access gc_thread pointer
  f2fs: clean up with clear_radix_tree_dirty_tag
  f2fs: fix to don't trigger writeback during recovery
  f2fs: clear discard_wake earlier
  f2fs: let discard thread wait a little longer if dev is busy
  f2fs: avoid stucking GC due to atomic write
  f2fs: introduce sbi->gc_mode to determine the policy
  f2fs: keep migration IO order in LFS mode
  f2fs: fix to wait page writeback during revoking atomic write
  f2fs: Fix deadlock in shutdown ioctl
  f2fs: detect synchronous writeback more earlier
  mm: remove nr_pages argument from pagevec_lookup_{,range}_tag()
  ceph: use pagevec_lookup_range_nr_tag()
  mm: add variant of pagevec_lookup_range_tag() taking number of pages
  mm: use pagevec_lookup_range_tag() in write_cache_pages()
  mm: use pagevec_lookup_range_tag() in __filemap_fdatawait_range()
  nilfs2: use pagevec_lookup_range_tag()
  gfs2: use pagevec_lookup_range_tag()
  f2fs: use find_get_pages_tag() for looking up single page
  f2fs: simplify page iteration loops
  f2fs: use pagevec_lookup_range_tag()
  ext4: use pagevec_lookup_range_tag()
  ceph: use pagevec_lookup_range_tag()
  btrfs: use pagevec_lookup_range_tag()
  mm: implement find_get_pages_range_tag()
  f2fs: clean up with is_valid_blkaddr()
  f2fs: fix to initialize min_mtime with ULLONG_MAX
  f2fs: fix to let checkpoint guarantee atomic page persistence
  f2fs: fix to initialize i_current_depth according to inode type
  Revert "f2fs: add ovp valid_blocks check for bg gc victim to fg_gc"
  f2fs: don't drop any page on f2fs_cp_error() case
  f2fs: fix spelling mistake: "extenstion" -> "extension"
  f2fs: enhance sanity_check_raw_super() to avoid potential overflows
  f2fs: treat volatile file's data as hot one
  f2fs: introduce release_discard_addr() for cleanup
  f2fs: fix potential overflow
  f2fs: rename dio_rwsem to i_gc_rwsem
  f2fs: move mnt_want_write_file after range check
  f2fs: fix missing clear FI_NO_PREALLOC in some error case
  f2fs: enforce fsync_mode=strict for renamed directory
  f2fs: sanity check for total valid node blocks
  f2fs: sanity check on sit entry
  f2fs: avoid bug_on on corrupted inode
  f2fs: give message and set need_fsck given broken node id
  f2fs: clean up commit_inmem_pages()
  f2fs: do not check F2FS_INLINE_DOTS in recover
  f2fs: remove duplicated dquot_initialize and fix error handling
  f2fs: stop issue discard if something wrong with f2fs
  f2fs: fix return value in f2fs_ioc_commit_atomic_write
  f2fs: allocate hot_data for atomic write more strictly
  f2fs: check if inmem_pages list is empty correctly
  f2fs: fix race in between GC and atomic open
  f2fs: change le32 to le16 of f2fs_inode->i_extra_size
  f2fs: check cur_valid_map_mir & raw_sit block count when flush sit entries
  f2fs: correct return value of f2fs_trim_fs
  f2fs: fix to show missing bits in FS_IOC_GETFLAGS
  f2fs: remove unneeded F2FS_PROJINHERIT_FL
  f2fs: don't use GFP_ZERO for page caches
  f2fs: issue all big range discards in umount process
  f2fs: remove redundant block plug
  f2fs: remove unmatched zero_user_segment when convert inline dentry
  f2fs: introduce private inode status mapping
  fscrypt: log the crypto algorithm implementations
  fscrypt: add Speck128/256 support
  fscrypt: only derive the needed portion of the key
  fscrypt: separate key lookup from key derivation
  fscrypt: use a common logging function
  fscrypt: remove internal key size constants
  fscrypt: remove unnecessary check for non-logon key type
  fscrypt: make fscrypt_operations.max_namelen an integer
  fscrypt: drop empty name check from fname_decrypt()
  fscrypt: drop max_namelen check from fname_decrypt()
  fscrypt: don't special-case EOPNOTSUPP from fscrypt_get_encryption_info()
  fscrypt: don't clear flags on crypto transform
  fscrypt: remove stale comment from fscrypt_d_revalidate()
  fscrypt: remove error messages for skcipher_request_alloc() failure
  fscrypt: remove unnecessary NULL check when allocating skcipher
  fscrypt: clean up after fscrypt_prepare_lookup() conversions
  ext4: switch to fscrypt_prepare_lookup()
  fscrypt: use unbound workqueue for decryption
  f2fs: run fstrim asynchronously if runtime discard is on
  f2fs: turn down IO priority of discard from background
  f2fs: don't split checkpoint in fstrim
  f2fs: issue discard commands proactively in high fs utilization
  f2fs: add fsync_mode=nobarrier for non-atomic files
  f2fs: let fstrim issue discard commands in lower priority
  f2fs: avoid fsync() failure caused by EAGAIN in writepage()
  f2fs: clear PageError on writepage - part 2
  f2fs: check cap_resource only for data blocks
  Revert "f2fs: introduce f2fs_set_page_dirty_nobuffer"
  f2fs: clear PageError on writepage
  f2fs: call unlock_new_inode() before d_instantiate()
  f2fs: refactor read path to allow multiple postprocessing steps
  fscrypt: allow synchronous bio decryption
  f2fs: remain written times to update inode during fsync
  f2fs: make assignment of t->dentry_bitmap more readable
  f2fs: truncate preallocated blocks in error case
  f2fs: fix a wrong condition in f2fs_skip_inode_update
  f2fs: reserve bits for fs-verity
  f2fs: Add a segment type check in inplace write
  f2fs: no need to initialize zero value for GFP_F2FS_ZERO
  f2fs: don't track new nat entry in nat set
  f2fs: clean up with F2FS_BLK_ALIGN
  f2fs: check blkaddr more accuratly before issue a bio
  f2fs: Set GF_NOFS in read_cache_page_gfp while doing f2fs_quota_read
  f2fs: introduce a new mount option test_dummy_encryption
  f2fs: introduce F2FS_FEATURE_LOST_FOUND feature
  f2fs: release locks before return in f2fs_ioc_gc_range()
  f2fs: align memory boundary for bitops
  f2fs: remove unneeded set_cold_node()
  f2fs: add nowait aio support
  f2fs: wrap all options with f2fs_sb_info.mount_opt
  f2fs: Don't overwrite all types of node to keep node chain
  f2fs: introduce mount option for fsync mode
  f2fs: fix to restore old mount option in ->remount_fs
  f2fs: wrap sb_rdonly with f2fs_readonly
  f2fs: avoid selinux denial on CAP_SYS_RESOURCE
  f2fs: support hot file extension
  f2fs: fix to avoid race in between atomic write and background GC
  f2fs: do gc in greedy mode for whole range if gc_urgent mode is set
  f2fs: issue discard aggressively in the gc_urgent mode
  f2fs: set readdir_ra by default
  f2fs: add auto tuning for small devices
  f2fs: add mount option for segment allocation policy
  f2fs: don't stop GC if GC is contended
  f2fs: expose extension_list sysfs entry
  f2fs: fix to set KEEP_SIZE bit in f2fs_zero_range
  f2fs: introduce sb_lock to make encrypt pwsalt update exclusive
  f2fs: remove redundant initialization of pointer 'p'
  f2fs: flush cp pack except cp pack 2 page at first
  f2fs: clean up f2fs_sb_has_xxx functions
  f2fs: remove redundant check of page type when submit bio
  f2fs: fix to handle looped node chain during recovery
  f2fs: handle quota for orphan inodes
  f2fs: support passing down write hints to block layer with F2FS policy
  f2fs: support passing down write hints given by users to block layer
  f2fs: fix to clear CP_TRIMMED_FLAG
  f2fs: support large nat bitmap
  f2fs: fix to check extent cache in f2fs_drop_extent_tree
  f2fs: restrict inline_xattr_size configuration
  f2fs: fix heap mode to reset it back
  f2fs: fix potential corruption in area before F2FS_SUPER_OFFSET
  fscrypt: fix build with pre-4.6 gcc versions
  fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()
  fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
  fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
  ext4: switch to fscrypt ->symlink() helper functions
  ext4: switch to fscrypt_get_symlink()
  fscrypt: calculate NUL-padding length in one place only
  fscrypt: move fscrypt_symlink_data to fscrypt_private.h
  fscrypt: remove fscrypt_fname_usr_to_disk()
  f2fs: switch to fscrypt_get_symlink()
  f2fs: switch to fscrypt ->symlink() helper functions
  fscrypt: new helper function - fscrypt_get_symlink()
  fscrypt: new helper functions for ->symlink()
  fscrypt: trim down fscrypt.h includes
  fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
  fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
  fscrypt: move fscrypt_operations declaration to fscrypt_supp.h
  fscrypt: split fscrypt_dummy_context_enabled() into supp/notsupp versions
  fscrypt: move fscrypt_ctx declaration to fscrypt_supp.h
  fscrypt: move fscrypt_info_cachep declaration to fscrypt_private.h
  fscrypt: move fscrypt_control_page() to supp/notsupp headers
  fscrypt: move fscrypt_has_encryption_key() to supp/notsupp headers
  f2fs: don't put dentry page in pagecache into highmem
  f2fs: support inode creation time
  f2fs: rebuild sit page from sit info in mem
  f2fs: stop issuing discard if fs is readonly
  f2fs: clean up duplicated assignment in init_discard_policy
  f2fs: use GFP_F2FS_ZERO for cleanup
  f2fs: allow to recover node blocks given updated checkpoint
  f2fs: recover some i_inline flags
  f2fs: correct removexattr behavior for null valued extended attribute
  f2fs: drop page cache after fs shutdown
  f2fs: stop gc/discard thread after fs shutdown
  f2fs: hanlde error case in f2fs_ioc_shutdown
  f2fs: split need_inplace_update
  f2fs: fix to update last_disk_size correctly
  f2fs: kill F2FS_INLINE_XATTR_ADDRS for cleanup
  f2fs: clean up error path of fill_super
  f2fs: avoid hungtask when GC encrypted block if io_bits is set
  f2fs: allow quota to use reserved blocks
  f2fs: fix to drop all inmem pages correctly
  f2fs: speed up defragment on sparse file
  f2fs: support F2FS_IOC_PRECACHE_EXTENTS
  f2fs: add an ioctl to disable GC for specific file
  f2fs: prevent newly created inode from being dirtied incorrectly
  f2fs: support FIEMAP_FLAG_XATTR
  f2fs: fix to cover f2fs_inline_data_fiemap with inode_lock
  f2fs: check node page again in write end io
  f2fs: fix to caclulate required free section correctly
  f2fs: handle newly created page when revoking inmem pages
  f2fs: add resgid and resuid to reserve root blocks
  f2fs: implement cgroup writeback support
  f2fs: remove unused pend_list_tag
  f2fs: avoid high cpu usage in discard thread
  f2fs: make local functions static
  f2fs: add reserved blocks for root user
  f2fs: check segment type in __f2fs_replace_block
  f2fs: update inode info to inode page for new file
  f2fs: show precise # of blocks that user/root can use
  f2fs: clean up unneeded declaration
  f2fs: continue to do direct IO if we only preallocate partial blocks
  f2fs: enable quota at remount from r to w
  f2fs: skip stop_checkpoint for user data writes
  f2fs: fix missing error number for xattr operation
  f2fs: recover directory operations by fsync
  f2fs: return error during fill_super
  f2fs: fix an error case of missing update inode page
  f2fs: fix potential hangtask in f2fs_trace_pid
  f2fs: no need return value in restore summary process
  f2fs: use unlikely for release case
  f2fs: don't return value in truncate_data_blocks_range
  f2fs: clean up f2fs_map_blocks
  f2fs: clean up hash codes
  f2fs: fix error handling in fill_super
  f2fs: spread f2fs_k{m,z}alloc
  f2fs: inject fault to kvmalloc
  f2fs: inject fault to kzalloc
  f2fs: remove a redundant conditional expression
  f2fs: apply write hints to select the type of segment for direct write
  f2fs: switch to fscrypt_prepare_setattr()
  f2fs: switch to fscrypt_prepare_lookup()
  f2fs: switch to fscrypt_prepare_rename()
  f2fs: switch to fscrypt_prepare_link()
  f2fs: switch to fscrypt_file_open()
  f2fs: remove repeated f2fs_bug_on
  f2fs: remove an excess variable
  f2fs: fix lock dependency in between dio_rwsem & i_mmap_sem
  f2fs: remove unused parameter
  f2fs: still write data if preallocate only partial blocks
  f2fs: introduce sysfs readdir_ra to readahead inode block in readdir
  f2fs: fix concurrent problem for updating free bitmap
  f2fs: remove unneeded memory footprint accounting
  f2fs: no need to read nat block if nat_block_bitmap is set
  f2fs: reserve nid resource for quota sysfile
  fscrypt: move to generic async completion
  crypto: introduce crypto wait for async op
  fscrypt: lock mutex before checking for bounce page pool
  fscrypt: new helper function - fscrypt_prepare_setattr()
  fscrypt: new helper function - fscrypt_prepare_lookup()
  fscrypt: new helper function - fscrypt_prepare_rename()
  fscrypt: new helper function - fscrypt_prepare_link()
  fscrypt: new helper function - fscrypt_file_open()
  fscrypt: new helper function - fscrypt_require_key()
  fscrypt: remove unneeded empty fscrypt_operations structs
  fscrypt: remove ->is_encrypted()
  fscrypt: switch from ->is_encrypted() to IS_ENCRYPTED()
  fs, fscrypt: add an S_ENCRYPTED inode flag
  fscrypt: clean up include file mess
  fscrypt: fix dereference of NULL user_key_payload
  fscrypt: make ->dummy_context() return bool
  f2fs: deny accessing encryption policy if encryption is off
  f2fs: inject fault in inc_valid_node_count
  f2fs: fix to clear FI_NO_PREALLOC
  f2fs: expose quota information in debugfs
  f2fs: separate nat entry mem alloc from nat_tree_lock
  f2fs: validate before set/clear free nat bitmap
  f2fs: avoid opened loop codes in __add_ino_entry
  f2fs: apply write hints to select the type of segments for buffered write
  f2fs: introduce scan_curseg_cache for cleanup
  f2fs: optimize the way of traversing free_nid_bitmap
  f2fs: keep scanning until enough free nids are acquired
  f2fs: trace checkpoint reason in fsync()
  f2fs: keep isize once block is reserved cross EOF
  f2fs: avoid race in between GC and block exchange
  f2fs: save a multiplication for last_nid calculation
  f2fs: fix summary info corruption
  f2fs: remove dead code in update_meta_page
  f2fs: remove unneeded semicolon
  f2fs: don't bother with inode->i_version
  f2fs: check curseg space before foreground GC
  f2fs: use rw_semaphore to protect SIT cache
  f2fs: support quota sys files
  f2fs: add quota_ino feature infra
  f2fs: optimize __update_nat_bits
  f2fs: modify for accurate fggc node io stat
  Revert "f2fs: handle dirty segments inside refresh_sit_entry"
  f2fs: add a function to move nid
  f2fs: export SSR allocation threshold
  f2fs: give correct trimmed blocks in fstrim
  f2fs: support bio allocation error injection
  f2fs: support get_page error injection
  f2fs: add missing sysfs description
  f2fs: support soft block reservation
  f2fs: handle error case when adding xattr entry
  f2fs: support flexible inline xattr size
  f2fs: show current cp state
  f2fs: add missing quota_initialize
  f2fs: show # of dirty segments via sysfs
  f2fs: stop all the operations by cp_error flag
  f2fs: remove several redundant assignments
  f2fs: avoid using timespec
  f2fs: fix to correct no_fggc_candidate
  Revert "f2fs: return wrong error number on f2fs_quota_write"
  f2fs: remove obsolete pointer for truncate_xattr_node
  f2fs: retry ENOMEM for quota_read|write
  f2fs: limit # of inmemory pages
  f2fs: update ctx->pos correctly when hitting hole in directory
  f2fs: relocate readahead codes in readdir()
  f2fs: allow readdir() to be interrupted
  f2fs: trace f2fs_readdir
  f2fs: trace f2fs_lookup
  f2fs: skip searching non-exist range in truncate_hole
  f2fs: expose some sectors to user in inline data or dentry case
  f2fs: avoid stale fi->gdirty_list pointer
  f2fs/crypto: drop crypto key at evict_inode only
  f2fs: fix to avoid race when accessing last_disk_size
  f2fs: Fix bool initialization/comparison
  f2fs: give up CP_TRIMMED_FLAG if it drops discards
  f2fs: trace f2fs_remove_discard
  f2fs: reduce cmd_lock coverage in __issue_discard_cmd
  f2fs: split discard policy
  f2fs: wrap discard policy
  f2fs: support issuing/waiting discard in range
  f2fs: fix to flush multiple device in checkpoint
  f2fs: enhance multiple device flush
  f2fs: fix to show ino management cache size correctly
  f2fs: drop FI_UPDATE_WRITE tag after f2fs_issue_flush
  f2fs: obsolete ALLOC_NID_LIST list
  f2fs: convert inline data for direct I/O & FI_NO_PREALLOC
  f2fs: allow readpages with NULL file pointer
  f2fs: show flush list status in sysfs
  f2fs: introduce read_xattr_block
  f2fs: introduce read_inline_xattr
  Revert "f2fs: reuse nids more aggressively"
  Revert "f2fs: node segment is prior to data segment selected victim"
  f2fs: fix potential panic during fstrim
  f2fs: hurry up to issue discard after io interruption
  f2fs: fix to show correct discard_granularity in sysfs
  f2fs: detect dirty inode in evict_inode
  f2fs: clear radix tree dirty tag of pages whose dirty flag is cleared
  f2fs: speed up gc_urgent mode with SSR
  f2fs: better to wait for fstrim completion
  f2fs: avoid race in between read xattr & write xattr
  f2fs: make get_lock_data_page to handle encrypted inode
  f2fs: use generic terms used for encrypted block management
  f2fs: introduce f2fs_encrypted_file for clean-up
  Revert "f2fs: add a new function get_ssr_cost"
  f2fs: constify super_operations
  f2fs: fix to wake up all sleeping flusher
  f2fs: avoid race in between atomic_read & atomic_inc
  f2fs: remove unneeded parameter of change_curseg
  f2fs: update i_flags correctly
  f2fs: don't check inode's checksum if it was dirtied or writebacked
  f2fs: don't need to update inode checksum for recovery
  f2fs: trigger fdatasync for non-atomic_write file
  f2fs: fix to avoid race in between aio and gc
  f2fs: wake up discard_thread iff there is a candidate
  f2fs: return error when accessing insane flie offset
  f2fs: trigger normal fsync for non-atomic_write file
  f2fs: clear FI_HOT_DATA correctly
  f2fs: fix out-of-order execution in f2fs_issue_flush
  f2fs: issue discard commands if gc_urgent is set
  f2fs: introduce discard_granularity sysfs entry
  f2fs: remove unused function overprovision_sections
  f2fs: check hot_data for roll-forward recovery
  f2fs: add tracepoint for f2fs_gc
  f2fs: retry to revoke atomic commit in -ENOMEM case
  f2fs: let fill_super handle roll-forward errors
  f2fs: merge equivalent flags F2FS_GET_BLOCK_[READ|DIO]
  f2fs: support journalled quota
  f2fs: fix potential overflow when adjusting GC cycle
  f2fs: avoid unneeded sync on quota file
  f2fs: introduce gc_urgent mode for background GC
  f2fs: use IPU for cold files
  f2fs: fix the size value in __check_sit_bitmap
  f2fs: add app/fs io stat
  f2fs: do not change the valid_block value if cur_valid_map was wrongly set or cleared
  f2fs: update cur_valid_map_mir together with cur_valid_map
  f2fs: use printk_ratelimited for f2fs_msg
  f2fs: expose features to sysfs entry
  f2fs: support inode checksum
  f2fs: return wrong error number on f2fs_quota_write
  f2fs: provide f2fs_balance_fs to __write_node_page
  f2fs: introduce f2fs_statfs_project
  f2fs: don't need to wait for node writes for atomic write
  f2fs: avoid naming confusion of sysfs init
  f2fs: support project quota
  f2fs: record quota during dot{,dot} recovery
  f2fs: enhance on-disk inode structure scalability
  f2fs: make max inline size changeable
  f2fs: add ioctl to expose current features
  f2fs: make background threads of f2fs being aware of freezing
  f2fs: don't give partially written atomic data from process crash
  f2fs: give a try to do atomic write in -ENOMEM case
  f2fs: preserve i_mode if __f2fs_set_acl() fails
  f2fs: alloc new nids for xattr block in recovery
  f2fs: spread struct f2fs_dentry_ptr for inline path
  f2fs: remove unused input parameter
  f2fs: avoid cpu lockup
  f2fs: include seq_file.h for sysfs.c
  f2fs: Don't clear SGID when inheriting ACLs
  f2fs: remove extra inode_unlock() in error path
  fscrypt: add support for AES-128-CBC
  fscrypt: inline fscrypt_free_filename()
  f2fs: make more close to v4.13-rc1
  f2fs: support plain user/group quota
  f2fs: avoid deadlock caused by lock order of page and lock_op
  f2fs: use spin_{,un}lock_irq{save,restore}
  f2fs: relax migratepage for atomic written page
  f2fs: don't count inode block in in-memory inode.i_blocks
  Revert "f2fs: fix to clean previous mount option when remount_fs"
  f2fs: do not set LOST_PINO for renamed dir
  f2fs: do not set LOST_PINO for newly created dir
  f2fs: skip ->writepages for {mete,node}_inode during recovery
  f2fs: introduce __check_sit_bitmap
  f2fs: stop gc/discard thread in prior during umount
  f2fs: introduce reserved_blocks in sysfs
  f2fs: avoid redundant f2fs_flush after remount
  f2fs: report # of free inodes more precisely
  f2fs: add ioctl to do gc with target block address
  f2fs: don't need to check encrypted inode for partial truncation
  f2fs: measure inode.i_blocks as generic filesystem
  f2fs: set CP_TRIMMED_FLAG correctly
  f2fs: require key for truncate(2) of encrypted file
  f2fs: move sysfs code from super.c to fs/f2fs/sysfs.c
  f2fs: clean up sysfs codes
  f2fs: fix wrong error number of fill_super
  f2fs: fix to show injection rate in ->show_options
  f2fs: Fix a return value in case of error in 'f2fs_fill_super'
  f2fs: use proper variable name
  f2fs: fix to avoid panic when encountering corrupt node
  f2fs: don't track newly allocated nat entry in list
  f2fs: add f2fs_bug_on in __remove_discard_cmd
  f2fs: introduce __wait_one_discard_bio
  f2fs: dax: fix races between page faults and truncating pages
  f2fs: simplify the way of calulating next nat address
  f2fs: sanity check size of nat and sit cache
  f2fs: fix a panic caused by NULL flush_cmd_control
  f2fs: remove the unnecessary cast for PTR_ERR
  f2fs: remove false-positive bug_on
  f2fs: Do not issue small discards in LFS mode
  f2fs: don't bother checking for encryption key in ->write_iter()
  f2fs: don't bother checking for encryption key in ->mmap()
  f2fs: wait discard IO completion without cmd_lock held
  f2fs: wake up all waiters in f2fs_submit_discard_endio
  f2fs: show more info if fail to issue discard
  f2fs: introduce io_list for serialize data/node IOs
  f2fs: split wio_mutex
  f2fs: combine huge num of discard rb tree consistence checks
  f2fs: fix a bug caused by NULL extent tree
  f2fs: try to freeze in gc and discard threads
  f2fs: add a new function get_ssr_cost
  f2fs: declare load_free_nid_bitmap static
  f2fs: avoid f2fs_lock_op for IPU writes
  f2fs: split bio cache
  f2fs: use fio instead of multiple parameters
  f2fs: remove unnecessary read cases in merged IO flow
  f2fs: use f2fs_submit_page_bio for ra_meta_pages
  f2fs: make sure f2fs_gc returns consistent errno
  f2fs: load inode's flag from disk
  f2fs: sanity check checkpoint segno and blkoff
  f2fs/fscrypt: catch up to v4.12
  KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()

Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/cpucaps.h
	arch/arm64/include/asm/thread_info.h
	fs/crypto/keyinfo.c
	fs/f2fs/data.c
	kernel/fork.c

Extra changes were required in below files due to:

  Hard conflicts against HW based FBE

	fs/crypto/fscrypt_ice.c
	fs/crypto/fscrypt_ice.h
	fs/crypto/keyinfo.c
	fs/f2fs/data.c

  Compilation errors

    Move inclusion of psci and arm-smccc headers to beginning of
    source:

	arch/arm64/kernel/cpu_errata.c

    Upstream change [1] fixes compilation errors in below files
    and it was amended into the merge:

	arch/arm64/include/asm/assembler.h
	arch/arm64/kernel/entry.S

[1] 0137ea2 ANDROID: arm64: Fix 4.9.114 merge
    https://android-review.googlesource.com/c/kernel/common/+/724266

Change-Id: Id2345a06d926b99117d785a24cec2ce8624cb106
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2018-09-28 03:19:19 -07:00
Sultan Alsawaf
47bbcd6bf8 ANDROID: Fix massive cpufreq_times memory leaks
Every time _cpu_up() is called for a CPU, idle_thread_get() is called
which then re-initializes a CPU's idle thread that was already
previously created and cached in a global variable in
smpboot.c. idle_thread_get() calls init_idle() which then calls
__sched_fork(). __sched_fork() is where cpufreq_task_times_init() is,
and cpufreq_task_times_init() allocates memory for the task struct's
time_in_state array.

Since idle_thread_get() reuses a task struct instance that was already
previously created, this means that every time it calls init_idle(),
cpufreq_task_times_init() allocates this array again and overwrites
the existing allocation that the idle thread already had.

This causes memory to be leaked every time a CPU is onlined. In order
to fix this, move allocation of time_in_state into _do_fork to avoid
allocating it at all for idle threads. The cpufreq times interface is
intended to be used for tracking userspace tasks, so we can safely
remove it from the kernel's idle threads without killing any
functionality.

But that's not all!

Task structs can be freed outside of release_task(), which creates
another memory leak because a task struct can be freed without having
its cpufreq times allocation freed. To fix this, free the cpufreq
times allocation at the same time that task struct allocations are
freed, in free_task().

Since free_task() can also be called in error paths of copy_process()
after dup_task_struct(), set time_in_state to NULL immediately after
calling dup_task_struct() to avoid possible double free.
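
The ownership rules described above can be sketched in userspace C. This is only an illustrative model, not the kernel code: `struct task`, `dup_task()`, `task_times_alloc()` and this `free_task()` are hypothetical stand-ins for the kernel's task struct, dup_task_struct(), the allocation moved into the fork path, and free_task().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's task struct. */
struct task {
    unsigned long long *time_in_state; /* per-frequency time counters */
    size_t nr_states;
};

/* Models dup_task_struct(): a shallow copy, so the child would
 * otherwise alias the parent's time_in_state buffer. */
static struct task *dup_task(const struct task *parent)
{
    struct task *child;

    assert(parent != NULL);
    child = malloc(sizeof(*child));
    if (!child)
        return NULL;
    memcpy(child, parent, sizeof(*child));
    /* Clear the alias right away, so an early error path that calls
     * free_task() cannot free the parent's buffer. */
    child->time_in_state = NULL;
    child->nr_states = 0;
    return child;
}

/* Models allocating only in the fork path, never when an existing
 * idle task is re-initialized. */
static int task_times_alloc(struct task *t, size_t nr_states)
{
    assert(t != NULL);
    t->time_in_state = calloc(nr_states, sizeof(*t->time_in_state));
    if (!t->time_in_state)
        return -1;
    t->nr_states = nr_states;
    return 0;
}

/* Models free_task(): the buffer is released wherever the task is. */
static void free_task(struct task *t)
{
    free(t->time_in_state); /* free(NULL) is a no-op */
    free(t);
}
```

Because the pointer is cleared before any error path can run, freeing a half-constructed child leaves the parent's counters intact.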

Bug description and fix adapted from patch submitted by
Sultan Alsawaf <sultanxda@gmail.com> at
https://android-review.googlesource.com/c/kernel/msm/+/700134

Bug: 110044919
Test: Hikey960 builds, boots & reports /proc/<pid>/time_in_state
correctly
Change-Id: I12fe7611fc88eb7f6c39f8f7629ad27b6ec4722c
Signed-off-by: Connor O'Brien <connoro@google.com>
2018-07-18 13:22:08 +00:00
Connor O'Brien
23a1412b82 ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES
Add empty versions of functions to cpufreq_times.h to cut down on use
of #ifdef in .c files.
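
A minimal sketch of the header pattern, assuming a hypothetical `CONFIG_DEMO_FREQ_TIMES` switch and hypothetical `demo_*` names (the real functions in cpufreq_times.h are not shown in this log):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for the real config option;
 * it is deliberately left undefined in this sketch. */
/* #define CONFIG_DEMO_FREQ_TIMES 1 */

struct demo_task {
    int times_ready;
};

#ifdef CONFIG_DEMO_FREQ_TIMES
/* Real implementation would live in a .c file. */
void demo_task_times_init(struct demo_task *t);
#else
/* Empty version: callers compile unchanged and the call vanishes. */
static inline void demo_task_times_init(struct demo_task *t)
{
    (void)t;
}
#endif

/* The caller needs no #ifdef of its own. */
static void demo_fork(struct demo_task *t)
{
    assert(t != NULL);
    t->times_ready = 0;
    demo_task_times_init(t);
}
```

The conditional lives in one place (the header), and every call site stays free of preprocessor checks.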

Test: kernel builds with and without CONFIG_CPU_FREQ_TIMES=y
Change-Id: I49ac364fac3d42bba0ca1801e23b15081094fb12
Signed-off-by: Connor O'Brien <connoro@google.com>
2018-07-18 13:21:52 +00:00
Tingting Yang
1a538b3997 msm: move printk out of spin lock low_water_lock
cpu3 spent too long in printk while holding the spin lock low_water_lock,
so cpu0 failed to acquire the lock and the system crashed.
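
The general technique can be sketched in userspace C11: copy what the message needs while holding the lock, release the lock, then print. Everything here is hypothetical (`demo_low_water_lock`, `demo_update_low_water()`); it models the idea rather than the msm driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's low_water_lock spin lock. */
static atomic_flag demo_low_water_lock = ATOMIC_FLAG_INIT;

static void demo_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_low_water_lock,
                                             memory_order_acquire))
        ; /* spin until the flag is released */
}

static void demo_unlock(void)
{
    atomic_flag_clear_explicit(&demo_low_water_lock, memory_order_release);
}

/* Records a new low-water mark; returns 1 if the mark was lowered. */
static int demo_update_low_water(unsigned long *low_water,
                                 unsigned long free_now)
{
    unsigned long snapshot = 0;
    int lowered = 0;

    assert(low_water != NULL);

    demo_lock();
    if (free_now < *low_water) {
        *low_water = free_now;
        snapshot = free_now; /* copy what the message needs */
        lowered = 1;
    }
    demo_unlock();

    /* Slow console output happens only after the lock is released. */
    if (lowered)
        printf("new low water mark: %lu\n", snapshot);
    return lowered;
}
```

Other CPUs contending for the lock then wait only for the short update, not for console output.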

CRs-Fixed: 969097
Change-Id: I75356a4b4171ae2888ce6cce792f569b5ca8cdcf
Signed-off-by: Tingting Yang <tingting@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Minming Qi <mqi@codeaurora.org>
2018-06-20 22:38:09 -07:00
Blagovest Kolenichev
ff75e7b48e Merge android-4.9.101 (aef17a58) into msm-4.9
* refs/heads/tmp-aef17a58:
  Linux 4.9.101
  kernel/exit.c: avoid undefined behaviour when calling wait4()
  futex: futex_wake_op, fix sign_extend32 sign bits
  proc: do not access cmdline nor environ from file-backed areas
  nfp: TX time stamp packets before HW doorbell is rung
  l2tp: revert "l2tp: fix missing print session offset info"
  Revert "ARM: dts: imx6qdl-wandboard: Fix audio channel swap"
  lockd: lost rollback of set_grace_period() in lockd_down_net()
  xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
  futex: Remove duplicated code and fix undefined behaviour
  serial: sccnxp: Fix error handling in sccnxp_probe()
  sctp: delay the authentication for the duplicated cookie-echo chunk
  sctp: fix the issue that the cookie-ack with auth can't get processed
  tcp: ignore Fast Open on repair mode
  bonding: send learning packets for vlans on slave
  net/mlx5: Avoid cleaning flow steering table twice during error flow
  bonding: do not allow rlb updates to invalid mac
  tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
  tcp_bbr: fix to zero idle_restart only upon S/ACKed data
  sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
  sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
  sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
  r8169: fix powering up RTL8168h
  qmi_wwan: do not steal interfaces from class drivers
  openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
  net: support compat 64-bit time in {s,g}etsockopt
  net_sched: fq: take care of throttled flows before reuse
  net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
  net/mlx4_en: Verify coalescing parameters are in range
  net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
  net: ethernet: sun: niu set correct packet size in skb
  llc: better deal with too small mtu
  ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
  dccp: fix tasklet usage
  bridge: check iface upper dev when setting master via ioctl
  8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
  ANDROID: sdcardfs: Don't d_drop in d_revalidate
  FROMLIST: brcmfmac: fix initialization of struct cfg80211_inform_bss variable
  FROMLIST: brcmfmac: reports boottime_ns while informing bss

Change-Id: Idfe62af1b38254bed44364aa6ef001c38a5ad285
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2018-05-29 07:45:26 -07:00
Greg Kroah-Hartman
aef17a58e8 Merge 4.9.101 into android-4.9
Changes in 4.9.101
	8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
	bridge: check iface upper dev when setting master via ioctl
	dccp: fix tasklet usage
	ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
	llc: better deal with too small mtu
	net: ethernet: sun: niu set correct packet size in skb
	net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
	net/mlx4_en: Verify coalescing parameters are in range
	net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
	net_sched: fq: take care of throttled flows before reuse
	net: support compat 64-bit time in {s,g}etsockopt
	openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
	qmi_wwan: do not steal interfaces from class drivers
	r8169: fix powering up RTL8168h
	sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
	sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
	sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
	tcp_bbr: fix to zero idle_restart only upon S/ACKed data
	tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
	bonding: do not allow rlb updates to invalid mac
	net/mlx5: Avoid cleaning flow steering table twice during error flow
	bonding: send learning packets for vlans on slave
	tcp: ignore Fast Open on repair mode
	sctp: fix the issue that the cookie-ack with auth can't get processed
	sctp: delay the authentication for the duplicated cookie-echo chunk
	serial: sccnxp: Fix error handling in sccnxp_probe()
	futex: Remove duplicated code and fix undefined behaviour
	xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
	lockd: lost rollback of set_grace_period() in lockd_down_net()
	Revert "ARM: dts: imx6qdl-wandboard: Fix audio channel swap"
	l2tp: revert "l2tp: fix missing print session offset info"
	nfp: TX time stamp packets before HW doorbell is rung
	proc: do not access cmdline nor environ from file-backed areas
	futex: futex_wake_op, fix sign_extend32 sign bits
	kernel/exit.c: avoid undefined behaviour when calling wait4()
	Linux 4.9.101

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-19 14:06:17 +02:00
zhongjiang
04103c29b6 kernel/exit.c: avoid undefined behaviour when calling wait4()
commit dd83c161fbcc5d8be637ab159c0de015cbff5ba4 upstream.

wait4(-2147483648, 0x20, 0, 0xdd0000) triggers:
UBSAN: Undefined behaviour in kernel/exit.c:1651:9

The related calltrace is as follows:

  negation of -2147483648 cannot be represented in type 'int':
  CPU: 9 PID: 16482 Comm: zj Tainted: G    B          ---- -------   3.10.0-327.53.58.71.x86_64+ #66
  Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285          /BC11BTSA              , BIOS CTSAV036 04/27/2011
  Call Trace:
    dump_stack+0x19/0x1b
    ubsan_epilogue+0xd/0x50
    __ubsan_handle_negate_overflow+0x109/0x14e
    SyS_wait4+0x1cb/0x1e0
    system_call_fastpath+0x16/0x1b

Exclude the overflow to avoid the UBSAN warning.

Link: http://lkml.kernel.org/r/1497264618-20212-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang <zhongjiang@huawei.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-19 10:27:01 +02:00
Blagovest Kolenichev
f4b8243182 Merge android-4.9.93 (05baf14) into msm-4.9
* refs/heads/tmp-05baf14:
  Linux 4.9.93
  spi: davinci: fix up dma_mapping_error() incorrect patch
  Revert "ip6_vti: adjust vti mtu according to mtu of lower device"
  Revert "mtip32xx: use runtime tag to initialize command header"
  Revert "spi: bcm-qspi: shut up warning about cfi header inclusion"
  Revert "ARM: dts: omap3-n900: Fix the audio CODEC's reset pin"
  Revert "ARM: dts: am335x-pepper: Fix the audio CODEC's reset pin"
  Fix slab name "biovec-(1<<(21-12))"
  net: hns: Fix ethtool private flags
  md/raid10: reset the 'first' at the end of loop
  ARM: dts: am57xx-idk-common: Add overide powerhold property
  ARM: dts: am57xx-beagle-x15-common: Add overide powerhold property
  ARM: dts: dra7: Add power hold and power controller properties to palmas
  Documentation: pinctrl: palmas: Add ti,palmas-powerhold-override property definition
  vt: change SGR 21 to follow the standards
  Input: i8042 - enable MUX on Sony VAIO VGN-CS series to fix touchpad
  Input: i8042 - add Lenovo ThinkPad L460 to i8042 reset list
  Input: ALPS - fix TrackStick detection on Thinkpad L570 and Latitude 7370
  staging: comedi: ni_mio_common: ack ai fifo error interrupts.
  crypto: x86/cast5-avx - fix ECB encryption when long sg follows short one
  crypto: ahash - Fix early termination in hash walk
  parport_pc: Add support for WCH CH382L PCI-E single parallel port card.
  media: usbtv: prevent double free in error case
  mei: remove dev_err message on an unsupported ioctl
  USB: serial: cp210x: add ELDAT Easywave RX09 id
  USB: serial: ftdi_sio: add support for Harman FirmwareHubEmulator
  USB: serial: ftdi_sio: add RT Systems VX-8 cable
  arm64: idmap: Use "awx" flags for .idmap.text .pushsection directives
  arm64: entry: Reword comment about post_ttbr_update_workaround
  arm64: Force KPTI to be disabled on Cavium ThunderX
  arm64: kpti: Add ->enable callback to remap swapper using nG mappings
  arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()
  arm64: Turn on KPTI only on CPUs that need it
  arm64: cputype: Add MIDR values for Cavium ThunderX2 CPUs
  arm64: capabilities: Handle duplicate entries for a capability
  arm64: Allow checking of a CPU-local erratum
  arm64: Take into account ID_AA64PFR0_EL1.CSV3
  arm64: Kconfig: Reword UNMAP_KERNEL_AT_EL0 kconfig entry
  arm64: Kconfig: Add CONFIG_UNMAP_KERNEL_AT_EL0
  arm64: use RET instruction for exiting the trampoline
  arm64: kaslr: Put kernel vectors address in separate data page
  arm64: entry: Add fake CPU feature for unmapping the kernel at EL0
  arm64: tls: Avoid unconditional zeroing of tpidrro_el0 for native tasks
  arm64: entry: Hook up entry trampoline to exception vectors
  arm64: entry: Explicitly pass exception level to kernel_ventry macro
  arm64: mm: Map entry trampoline into trampoline and kernel page tables
  arm64: entry: Add exception trampoline page for exceptions from EL0
  module: extend 'rodata=off' boot cmdline parameter to module mappings
  arm64: factor out entry stack manipulation
  arm64: mm: Invalidate both kernel and user ASIDs when performing TLBI
  arm64: mm: Add arm64_kernel_unmapped_at_el0 helper
  arm64: mm: Allocate ASIDs in pairs
  arm64: mm: Move ASID from TTBR0 to TTBR1
  arm64: mm: Use non-global mappings for kernel space
  usb: dwc2: Improve gadget state disconnection handling
  scsi: virtio_scsi: always read VPD pages for multiqueue too
  llist: clang: introduce member_address_is_nonnull()
  Bluetooth: Fix missing encryption refresh on Security Request
  netfilter: x_tables: add and use xt_check_proc_name
  netfilter: bridge: ebt_among: add more missing match size checks
  xfrm: Refuse to insert 32 bit userspace socket policies on 64 bit systems
  net: xfrm: use preempt-safe this_cpu_read() in ipcomp_alloc_tfms()
  RDMA/ucma: Introduce safer rdma_addr_size() variants
  RDMA/ucma: Check that device exists prior to accessing it
  RDMA/ucma: Check that device is connected prior to access it
  RDMA/ucma: Ensure that CM_ID exists prior to access it
  RDMA/ucma: Fix use-after-free access in ucma_close
  RDMA/ucma: Check AF family prior resolving address
  xfrm_user: uncoditionally validate esn replay attribute struct
  mm/vmscan.c: fix unsequenced modification and access warning
  selinux: Remove redundant check for unknown labeling behavior
  arm64: avoid overflow in VA_START and PAGE_OFFSET
  btrfs: Remove extra parentheses from condition in copy_items()
  mac80211: ibss: Fix channel type enum in ieee80211_sta_join_ibss()
  mac80211: Fix clang warning about constant operand in logical operation
  netfilter: ctnetlink: Make some parameters integer to avoid enum mismatch
  HID: sony: Use LED_CORE_SUSPENDRESUME
  cfg80211: Fix array-bounds warning in fragment copy
  nl80211: Fix enum type of variable in nl80211_put_sta_rate()
  xgene_enet: remove bogus forward declarations
  usb: gadget: remove redundant self assignment
  frv: declare jiffies to be located in the .data section
  jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp
  fs: compat: Remove warning from COMPATIBLE_IOCTL
  selinux: Remove unnecessary check of array base in selinux_set_mapping()
  cpumask: Add helper cpumask_available()
  genirq: Use cpumask_available() for check of cpumask variable
  netfilter: nf_nat_h323: fix logical-not-parentheses warning
  Input: mousedev - fix implicit conversion warning
  dm ioctl: remove double parentheses
  PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant
  kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
  partitions/msdos: Unable to mount UFS 44bsd partitions
  powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs
  powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
  ipc/shm.c: add split function to shm_vm_ops
  ceph: only dirty ITER_IOVEC pages for direct read
  perf/hwbp: Simplify the perf-hwbp code, fix documentation
  ALSA: pcm: potential uninitialized return values
  ALSA: pcm: Use dma_bytes as size parameter in dma_mmap_coherent()
  ALSA: usb-audio: Add native DSD support for TEAC UD-301
  mtd: jedec_probe: Fix crash in jedec_read_mfr()
  ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
  ANDROID: fuse: Add null terminator to path in canonical path to avoid issue
  ANDROID: sdcardfs: Fix sdcardfs to stop creating cases-sensitive duplicate entries.
  ANDROID: cpufreq: times: skip printing invalid frequencies
  ANDROID: cpufreq: Add time_in_state to /proc/uid directories
  ANDROID: proc: Add /proc/uid directory
  ANDROID: cpufreq: times: track per-uid time in state
  ANDROID: cpufreq: track per-task time in state
  arm64: fix show_data fallout from KERN_CONT changes
  arm: fix show_data fallout from KERN_CONT changes

Conflicts:
	arch/arm64/include/asm/assembler.h
	arch/arm64/include/asm/cputype.h
	arch/arm64/include/asm/sysreg.h
	arch/arm64/kernel/cpufeature.c
	kernel/sched/core.c

Change-Id: If39e1c5577a1c9345b1b2739f4a5368422cef135
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2018-05-11 03:06:00 -07:00
Matt Wagantall
72633f8045 exit: Add PANIC_ON_RECURSIVE_FAULT Kconfig option
If a recursive fault is detected during do_exit(), tasks are left
to sit and wait in an un-interruptible sleep until the system
reboots (typically manually). Add Kconfig option to change this
behaviour and force a panic.

This is particularly important if a critical system task encounters
a recursive fault (e.g. a kworker). Otherwise, the system may be
unusable, but since the scheduler is still running, system watchdogs
may continue to be pet.

Change-Id: Ifc26fc79d6066f05a3b2c4d27f78bf4f8d2bd640
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
2018-04-16 17:49:42 -07:00
Connor O'Brien
6e7b83d80b ANDROID: cpufreq: track per-task time in state
Add time in state data to task structs, and create
/proc/<pid>/time_in_state files to show how long each individual task
has run at each frequency.
Create a CONFIG_CPU_FREQ_TIMES option to enable/disable this tracking.

Signed-off-by: Connor O'Brien <connoro@google.com>
Bug: 72339335
Bug: 70951257
Test: Read /proc/<pid>/time_in_state
Change-Id: Ia6456754f4cb1e83b2bc35efa8fbe9f8696febc8
2018-04-03 11:15:30 -07:00
Liam Mark
4f1cdd2baf android/lowmemorykiller: Ignore tasks with freed mm
A killed task can stay in the task list long after its
memory has been returned to the system, therefore
ignore any tasks whose mm struct has been freed.

Change-Id: I76394b203b4ab2312437c839976f0ecb7b6dde4e
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2017-07-20 11:17:28 +05:30
Syed Rameez Mustafa
d5b9f97f3c sched: Call sched_exit() when a task is exiting
The scheduler needs to do some book-keeping when a task exits.
Call the scheduler hook sched_exit() that takes care of all of
that necessary book-keeping.

Change-Id: I551aead70248b06f9d918e6d032075d6ecaa7fed
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2017-03-24 15:57:31 -07:00
Patrick Bellasi
d248900606 ANDROID: FIXUP: sched/tune: fix accounting for runnable tasks
Contains:

sched/tune: fix accounting for runnable tasks (1/5)

The accounting for tasks into boost groups of different CPUs is currently
broken mainly because:
a) we do not properly track the change of boost group of a RUNNABLE task
b) there are race conditions between migration code and accounting code

This patch provides a fix to ensure enqueue/dequeue
accounting for throttled tasks as well.

Without this patch, a task can be enqueued into a throttled RQ and
thus not be accounted for in the boosting of the corresponding RQ.
We could argue that a throttled task should not boost a CPU, however:
a) properly implementing CPU boosting that considers throttled tasks
   would greatly increase the complexity of the solution
b) it's not easy to quantify the benefits introduced by such a more
   complex solution

Since task throttling requires the usage of the CFS bandwidth controller,
which is not widely used on mobile systems (at least not by Android kernels
so far), for the time being we go for the simple solution and boost also
for throttled RQs.

sched/tune: fix accounting for runnable tasks (2/5)

This patch provides the code required to enforce proper locking.
A per-boost-group spinlock has been added to guarantee atomic
accounting of tasks, as well as to serialise enqueue/dequeue operations
triggered by task migrations with cgroup attach/detach operations.

sched/tune: fix accounting for runnable tasks (3/5)

This patch adds cgroups {allow,can,cancel}_attach callbacks.

Since a task can be migrated between boost groups while it's running,
the cgroup attach callbacks have been added to properly migrate
boost contributions of RUNNABLE tasks.

The RQ's lock is used to serialise enqueue/dequeue operations triggered
by task migrations with cgroup attach/detach operations, while the
SchedTune CPU lock is used to guarantee atomicity of the accounting
within the CPU.

NOTE: the current implementation does not allow a concurrent CPU migration
      and CGroups change.

sched/tune: fix accounting for runnable tasks (4/5)

This fixes accounting for exiting tasks by adding a dedicated call early
in do_exit(), which disables SchedTune accounting as soon as a
task is flagged PF_EXITING.

This flag is set before the multiple dequeue/enqueue dance triggered
by cgroup_exit(), which only injects useless task movements and thus
increases the chance of race conditions with the migration code.
The schedtune_exit_task() call does the last dequeue of a task from its
current boost group. This solution is more aligned with what happens in
mainline kernels (>v4.4), where the exit_cgroup no longer moves a dying
task to the root control group.

sched/tune: fix accounting for runnable tasks (5/5)

To avoid accounting issues at startup, this patch disables the SchedTune
accounting until the required data structures have been properly
initialized.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:56 -08:00
Oleg Nesterov
8e5bfa8c1f sched/autogroup: Do not use autogroup->tg in zombie threads
Exactly because for_each_thread() in autogroup_move_group() can't see it
and update its ->sched_task_group before _put() and possibly free().

So the exiting task needs another sched_move_task() before exit_notify()
and we need to re-introduce the PF_EXITING (or similar) check removed by
the previous change for another reason.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hartsjc@redhat.com
Cc: vbendel@redhat.com
Cc: vlovejoy@redhat.com
Link: http://lkml.kernel.org/r/20161114184612.GA15968@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:33:43 +01:00
Tetsuo Handa
38531201c1 mm, oom: enforce exit_oom_victim on current task
There are no users of exit_oom_victim on !current task anymore so enforce
the API to always work on the current.

Link: http://lkml.kernel.org/r/1472119394-11342-8-git-send-email-mhocko@kernel.org
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:28 -07:00
Peter Zijlstra
9af6528ee9 sched/core: Optimize __schedule()
Oleg noted that by making do_exit() use __schedule() for the TASK_DEAD
context switch, we can avoid the TASK_DEAD special case currently in
__schedule() because that avoids the extra preempt_disable() from
schedule().

In order to facilitate this, create a do_task_dead() helper which we
place in the scheduler code, such that it can access __schedule().

Also add some __noreturn annotations to the functions, there's no
coming back from do_exit().

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Cheng Chao <cs.os.kernel@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: chris@chris-wilson.co.uk
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/20160913163729.GB5012@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 14:53:45 +02:00
David Rientjes
c11600e4fe mm, mempolicy: task->mempolicy must be NULL before dropping final reference
KASAN allocates memory from the page allocator as part of
kmem_cache_free(), and that can reference current->mempolicy through any
number of allocation functions.  It needs to be NULL'd out before the
final reference is dropped to prevent a use-after-free bug:

	BUG: KASAN: use-after-free in alloc_pages_current+0x363/0x370 at addr ffff88010b48102c
	CPU: 0 PID: 15425 Comm: trinity-c2 Not tainted 4.8.0-rc2+ #140
	...
	Call Trace:
		dump_stack
		kasan_object_err
		kasan_report_error
		__asan_report_load2_noabort
		alloc_pages_current	<-- use after free
		depot_save_stack
		save_stack
		kasan_slab_free
		kmem_cache_free
		__mpol_put		<-- free
		do_exit

This patch sets current->mempolicy to NULL before dropping the final
reference.
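
The ordering can be illustrated outside the kernel with a minimal
user-space sketch.  Every name below (struct policy, current_task,
free_path_hook, policy_put) is a hypothetical stand-in rather than a kernel
API; free_path_hook() plays the role of the allocation that KASAN performs
inside kmem_cache_free().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct mempolicy and task_struct. */
struct policy {
	int refcount;
};

struct task {
	struct policy *policy;
};

struct task current_task;	/* plays the role of "current" */
int hook_saw_dying_policy;	/* set if the free path sees the dying object */

/*
 * Stand-in for work done inside the free path (in the commit: KASAN saving
 * a stack trace, which allocates and consults current->mempolicy).
 */
void free_path_hook(const struct policy *dying)
{
	if (current_task.policy == dying)
		hook_saw_dying_policy = 1;
}

/* Drop one reference; the last put runs the hook and frees the object. */
void policy_put(struct policy *p)
{
	if (p && --p->refcount == 0) {
		free_path_hook(p);
		free(p);
	}
}

/* Exit path: clear the task's pointer first, then drop the reference. */
void task_exit_policy(void)
{
	struct policy *p = current_task.policy;

	current_task.policy = NULL;
	policy_put(p);
}

/* Returns 0 when the free path never observed the dying policy, -1 on OOM. */
int demo_exit_ordering(void)
{
	struct policy *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->refcount = 1;
	current_task.policy = p;
	hook_saw_dying_policy = 0;
	task_exit_policy();
	return hook_saw_dying_policy;
}
```

Swapping the two statements in task_exit_policy() would let the hook see a
pointer to the object that is about to be freed, which is the situation
the KASAN report above describes.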

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1608301442180.63329@chino.kir.corp.google.com
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>	[4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-01 17:52:01 -07:00
Anton Blanchard
627393d448 kernel/exit.c: quieten greatest stack depth printk
Many targets enable CONFIG_DEBUG_STACK_USAGE, and while the information
is useful, it isn't worthy of pr_warn().  Reduce it to pr_info().

Link: http://lkml.kernel.org/r/1466982072-29836-1-git-send-email-anton@ozlabs.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:23 -04:00
Linus Torvalds
cca08cd66c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - introduce and use task_rcu_dereference()/try_get_task_struct() to fix
   and generalize task_struct handling (Oleg Nesterov)

 - do various per entity load tracking (PELT) fixes and optimizations
   (Peter Zijlstra)

 - cputime virt-steal time accounting enhancements/fixes (Wanpeng Li)

 - introduce consolidated cputime output file cpuacct.usage_all and
   related refactorings (Zhao Lei)

 - ... plus misc fixes and enhancements

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Panic on scheduling while atomic bugs if kernel.panic_on_warn is set
  sched/cpuacct: Introduce cpuacct.usage_all to show all CPU stats together
  sched/cpuacct: Use loop to consolidate code in cpuacct_stats_show()
  sched/cpuacct: Merge cpuacct_usage_index and cpuacct_stat_index enums
  sched/fair: Rework throttle_count sync
  sched/core: Fix sched_getaffinity() return value kerneldoc comment
  sched/fair: Reorder cgroup creation code
  sched/fair: Apply more PELT fixes
  sched/fair: Fix PELT integrity for new tasks
  sched/cgroup: Fix cpu_cgroup_fork() handling
  sched/fair: Fix PELT integrity for new groups
  sched/fair: Fix and optimize the fork() path
  sched/cputime: Add steal time support to full dynticks CPU time accounting
  sched/cputime: Fix prev steal time accouting during CPU hotplug
  KVM: Fix steal clock warp during guest CPU hotplug
  sched/debug: Always show 'nr_migrations'
  sched/fair: Use task_rcu_dereference()
  sched/api: Introduce task_rcu_dereference() and try_get_task_struct()
  sched/idle: Optimize the generic idle loop
  sched/fair: Fix the wrong throttled clock time for cfs_rq_clock_task()
2016-07-25 13:59:34 -07:00
Peter Zijlstra
be3e784498 locking/spinlock: Update spin_unlock_wait() users
With the modified semantics of spin_unlock_wait() a number of
explicit barriers can be removed. Also update the comment for the
do_exit() use case, as that was somewhat stale/obscure.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:55:15 +02:00
Oleg Nesterov
150593bf86 sched/api: Introduce task_rcu_dereference() and try_get_task_struct()
Generally task_struct is only protected by RCU if it was found on a
RCU protected list (say, for_each_process() or find_task_by_vpid()).

As Kirill pointed out, rq->curr isn't protected by RCU: the scheduler
drops the (potentially) last reference without an RCU grace period.
This means that we need to fix the code which uses foreign_rq->curr
under rcu_read_lock().

Add a new helper which can be used to dereference rq->curr or any
other pointer to task_struct assuming that it should be cleared or
updated before the final put_task_struct(). It returns non-NULL
only if this task can't go away before rcu_read_unlock().

( Also add try_get_task_struct() to make it easier to use this API
  correctly. )

Suggested-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[ Updated comments; added try_get_task_struct()]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Link: http://lkml.kernel.org/r/20160518170218.GY3192@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03 09:18:57 +02:00
Oleg Nesterov
91c4e8ea8f wait: allow sys_waitid() to accept __WNOTHREAD/__WCLONE/__WALL
I see no reason why waitid() can't support other linux-specific flags
allowed in sys_wait4().
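
As an illustration, a caller might pass __WALL to waitid() roughly as
follows.  This is a sketch, not code from the patch: the helper name
wait_exit_code_with_wall() and the fallback for kernels that still reject
the flag with EINVAL are assumptions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __WALL
#define __WALL 0x40000000	/* Linux value, in case libc hides it */
#endif

/*
 * Hypothetical helper: fork a child that exits with `code`, then reap it
 * with waitid() and __WALL.  Returns the child's exit status, or -1 on error.
 */
int wait_exit_code_with_wall(int code)
{
	siginfo_t info;
	int options = WEXITED | __WALL;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);

	for (;;) {
		memset(&info, 0, sizeof(info));
		if (waitid(P_PID, (id_t)pid, &info, options) == 0)
			break;
		if (errno == EINTR)
			continue;
		/* Kernels without this change reject __WALL with EINVAL. */
		if (errno == EINVAL && (options & __WALL)) {
			options &= ~__WALL;
			continue;
		}
		return -1;
	}

	return info.si_code == CLD_EXITED ? info.si_status : -1;
}
```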

In particular this change can help if we reconsider the previous change
("wait/ptrace: assume __WALL if the child is traced") which adds the
"automagical" __WALL for debugger.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-23 17:04:14 -07:00
Oleg Nesterov
bf959931dd wait/ptrace: assume __WALL if the child is traced
The following program (a simplified version of one generated by syzkaller)

	#include <pthread.h>
	#include <unistd.h>
	#include <sys/ptrace.h>
	#include <stdio.h>
	#include <signal.h>

	void *thread_func(void *arg)
	{
		ptrace(PTRACE_TRACEME, 0,0,0);
		return 0;
	}

	int main(void)
	{
		pthread_t thread;

		if (fork())
			return 0;

		while (getppid() != 1)
			;

		pthread_create(&thread, NULL, thread_func, NULL);
		pthread_join(thread, NULL);
		return 0;
	}

creates an unreapable zombie if /sbin/init doesn't use __WALL.

This is not a kernel bug, at least in the sense that everything works as
expected: the debugger should reap a traced sub-thread before it can reap
the leader, but without __WALL/__WCLONE do_wait() ignores sub-threads.

Unfortunately, it seems that /sbin/init in most (all?) distributions
doesn't use it and we have to change the kernel to avoid the problem.
Note also that most init's use sys_waitid() which doesn't allow __WALL, so
the necessary user-space fix is not that trivial.

This patch just adds the "ptrace" check into eligible_child().  To some
degree this matches the "tsk->ptrace" in exit_notify(), ->exit_signal is
mostly ignored when the tracee reports to debugger.  Or WSTOPPED, the
tracer doesn't need to set this flag to wait for the stopped tracee.

This obviously means a user-visible change: __WCLONE and __WALL no
longer have any meaning for the debugger.  And I can only hope that this
won't break something, but at least strace/gdb won't suffer.

We could make a more conservative change.  Say, we can take __WCLONE into
account, or !thread_group_leader().  But it would be nice to not
complicate these historical/confusing checks.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-23 17:04:14 -07:00
Jiri Slaby
e64646946e exit_thread: accept a task parameter to be exited
We need to call exit_thread from copy_process in a failure path.  So make
it accept task_struct as a parameter.

[v2]
* s390: exit_thread_runtime_instr doesn't make sense to be called for
  non-current tasks.
* arm: fix the comment in vfp_thread_copy
* change 'me' to 'tsk' for task_struct
* now we can change only archs that actually have exit_thread

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Michal Hocko
36324a990c oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space
When oom_reaper manages to unmap all the eligible vmas, the oom victim
should no longer hold much freeable memory, so it makes sense to clear
the TIF_MEMDIE flag for the victim and allow the OOM killer to select
another task.

The lack of TIF_MEMDIE also means that the victim can no longer access
memory reserves, but that shouldn't be a problem because it would regain
access if it needs to allocate and hits the OOM killer again, due to the
fatal_signal_pending or PF_EXITING check.  We can safely hide the task
from the OOM killer because it is clearly not a good candidate anymore,
as everything reclaimable has been torn down already.

This patch allows capping the time an OOM victim can keep TIF_MEMDIE
and thus hold off further global OOM killer actions, provided the oom
reaper is able to take mmap_sem for the associated mm struct.  This is
not guaranteed now, but further steps should make sure that taking
mmap_sem for write blocks killably, which will help to reduce such lock
contention.  This is not done by this patch.

Note that exit_oom_victim might be called on a remote task from
__oom_reap_task now so we have to check and clear the flag atomically
otherwise we might race and underflow oom_victims or wake up waiters too
early.
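
The check-and-clear requirement can be sketched in user space with C11
atomics.  The kernel uses its own thread-flag helpers; the names here
(VICTIM_FLAG, struct victim, victim_count, mark_victim, exit_victim) are
hypothetical and only illustrate why the test and the clear must be one
atomic step.

```c
#include <stdatomic.h>

#define VICTIM_FLAG 0x1u	/* hypothetical stand-in for TIF_MEMDIE */

struct victim {
	atomic_uint flags;
};

atomic_int victim_count;	/* stand-in for the oom_victims counter */

/* Set the flag; count the task only on the 0 -> 1 transition. */
void mark_victim(struct victim *v)
{
	unsigned int old = atomic_fetch_or(&v->flags, VICTIM_FLAG);

	if (!(old & VICTIM_FLAG))
		atomic_fetch_add(&victim_count, 1);
}

/*
 * Clear the flag and test its old value in one atomic step.  Only the
 * caller that actually cleared it decrements the counter, so two racing
 * callers cannot both decrement.  Returns 1 if this call cleared the flag.
 */
int exit_victim(struct victim *v)
{
	unsigned int old = atomic_fetch_and(&v->flags, ~VICTIM_FLAG);

	if (!(old & VICTIM_FLAG))
		return 0;
	atomic_fetch_sub(&victim_count, 1);
	return 1;
}

/* Mark once, exit twice (as a remote and a local caller might). */
int demo_double_exit(void)
{
	struct victim v;

	atomic_init(&v.flags, 0);
	mark_victim(&v);
	exit_victim(&v);
	exit_victim(&v);	/* second call is a no-op */
	return atomic_load(&victim_count);
}
```

With a separate "test, then clear" sequence, both callers could observe
the flag as set and both decrement, which is the underflow the note above
warns about.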

Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Argangeli <andrea@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-25 16:37:42 -07:00
Dmitry Vyukov
5c9a8750a6 kernel: add kcov code coverage
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing).  Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system.  A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/).  However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.

kcov does not aim to collect as much coverage as possible.  It aims to
collect more or less stable coverage that is a function of syscall
inputs.  To achieve this goal it does not collect coverage in soft/hard
interrupts, and instrumentation of some inherently non-deterministic or
non-interesting parts of the kernel is disabled (e.g. scheduler, locking).

Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes.  Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch).  I've
dropped the second mode for simplicity.

This patch adds the necessary support on the kernel side.  The
complementary compiler support was added in gcc revision 231296.

We've used this support to build the syzkaller system call fuzzer, which
has found 90 kernel bugs in just 2 months:

  https://github.com/google/syzkaller/wiki/Found-Bugs

We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation".  For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.

Why not gcov?  A typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
typical coverage can be just a dozen basic blocks (e.g. an invalid
input).  In such a context gcov becomes prohibitively expensive, as the
reset/collect coverage steps depend on the total number of basic
blocks/edges in the program (in the case of the kernel it is about 2M).
The cost of kcov depends only on the number of executed basic
blocks/edges.  On top of that, the kernel requires per-thread coverage
because there are always background threads and unrelated processes that
also produce coverage.  With inlined gcov instrumentation per-thread
coverage is not possible.
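
One iteration of that reset/execute/collect loop might look like the
user-space sketch below.  It follows the shape of the example in the
kernel's kcov documentation; the helper name kcov_trace_one_syscall() and
its "return -1 when kcov is unavailable" convention are assumptions, and
the debugfs path and ioctl numbers should be checked against the running
kernel.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define KCOV_INIT_TRACE	_IOR('c', 1, unsigned long)
#define KCOV_ENABLE	_IO('c', 100)
#define KCOV_DISABLE	_IO('c', 101)
#define COVER_SIZE	(64UL << 10)	/* buffer size in unsigned longs */

/*
 * Trace one syscall and return the number of PCs recorded,
 * or -1 if kcov is unavailable or any step fails.
 */
long kcov_trace_one_syscall(void)
{
	unsigned long *cover;
	long n = -1;
	int fd = open("/sys/kernel/debug/kcov", O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
		goto out_close;
	cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
		     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (cover == MAP_FAILED)
		goto out_close;
	if (ioctl(fd, KCOV_ENABLE, 0))
		goto out_unmap;

	/* (1) reset coverage: cover[0] holds the number of recorded PCs */
	__atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
	/* (2) execute a bit of code: a deliberately invalid read() */
	if (read(-1, NULL, 0) < 0) {
		/* expected to fail; only the kernel path taken matters */
	}
	/* (3) collect coverage: PCs are in cover[1..n] */
	n = (long)__atomic_load_n(&cover[0], __ATOMIC_RELAXED);

	ioctl(fd, KCOV_DISABLE, 0);
out_unmap:
	munmap(cover, COVER_SIZE * sizeof(unsigned long));
out_close:
	close(fd);
	return n;
}
```

The reset in step (1) is a single store and the collection in step (3) is
a single load, independent of how many basic blocks the kernel contains.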

kcov exposes kernel PCs and control flow to user-space which is
insecure.  But debugfs should not be mapped as user accessible.

Based on a patch by Quentin Casasnovas.

[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Dmitry Safonov
c428fbdbf3 exit: remove unneeded declaration of exit_mm()
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Oleg Nesterov
570ac9337b ptrace: task_stopped_code(ptrace => true) can't see TASK_STOPPED task
task_stopped_code()->task_is_stopped_or_traced() doesn't look right, the
traced task must never be TASK_STOPPED.

We cannot add WARN_ON(task_is_stopped(p)), but this is only because
do_wait() can race with PTRACE_ATTACH from another thread.

[akpm@linux-foundation.org: teeny cleanup]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Roland McGrath <roland@hack.frob.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Linus Torvalds
53528695ff Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "The main changes in this cycle were:

   - sched/fair load tracking fixes and cleanups (Byungchul Park)

   - Make load tracking frequency scale invariant (Dietmar Eggemann)

   - sched/deadline updates (Juri Lelli)

   - stop machine fixes, cleanups and enhancements for bugs triggered by
     CPU hotplug stress testing (Oleg Nesterov)

   - scheduler preemption code rework: remove PREEMPT_ACTIVE and related
     cleanups (Peter Zijlstra)

   - Rework the sched_info::run_delay code to fix races (Peter Zijlstra)

   - Optimize per entity utilization tracking (Peter Zijlstra)

   - ... misc other fixes, cleanups and smaller updates"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  sched: Don't scan all-offline ->cpus_allowed twice if !CONFIG_CPUSETS
  sched: Move cpu_active() tests from stop_two_cpus() into migrate_swap_stop()
  sched: Start stopper early
  stop_machine: Kill cpu_stop_threads->setup() and cpu_stop_unpark()
  stop_machine: Kill smp_hotplug_thread->pre_unpark, introduce stop_machine_unpark()
  stop_machine: Change cpu_stop_queue_two_works() to rely on stopper->enabled
  stop_machine: Introduce __cpu_stop_queue_work() and cpu_stop_queue_two_works()
  stop_machine: Ensure that a queued callback will be called before cpu_stop_park()
  sched/x86: Fix typo in __switch_to() comments
  sched/core: Remove a parameter in the migrate_task_rq() function
  sched/core: Drop unlikely behind BUG_ON()
  sched/core: Fix task and run queue sched_info::run_delay inconsistencies
  sched/numa: Fix task_tick_fair() from disabling numa_balancing
  sched/core: Add preempt_count invariant check
  sched/core: More notrace annotations
  sched/core: Kill PREEMPT_ACTIVE
  sched/core, sched/x86: Kill thread_info::saved_preempt_count
  sched/core: Simplify preempt_count tests
  sched/core: Robustify preemption leak checks
  sched/core: Stop setting PREEMPT_ACTIVE
  ...
2015-11-03 18:03:50 -08:00
Paul E. McKenney
49f5903b47 rcu: Move preemption disabling out of __srcu_read_lock()
Currently, __srcu_read_lock() cannot be invoked from restricted
environments because it contains calls to preempt_disable() and
preempt_enable(), both of which can invoke lockdep, which is a bad
idea in some restricted execution modes.  This commit therefore moves
the preempt_disable() and preempt_enable() from __srcu_read_lock()
to srcu_read_lock().  It also inserts the preempt_disable() and
preempt_enable() around the call to __srcu_read_lock() in do_exit().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:15:43 -07:00
Peter Zijlstra
1dc0fffc48 sched/core: Robustify preemption leak checks
When we warn about a preempt_count leak, reset the preempt_count to
the known good value such that the problem does not ripple forward.

This is most important on x86, which has a per-CPU preempt_count that is
not saved/restored (after this series). So if you schedule with an
invalid (!= 2*PREEMPT_DISABLE_OFFSET) preempt_count, the next task is
messed up too.

Enforcing this invariant limits the borkage to just the one task.
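
The warn-and-reset idea can be shown with a small user-space sketch.  The
helper check_and_repair_count() is hypothetical and is not the scheduler's
code; it only demonstrates restoring a known-good value at the point
where the leak is detected.

```c
#include <stdio.h>

/*
 * Hypothetical helper: if *count differs from the expected value, warn and
 * reset it so later users start from a known-good state.  Returns 1 if a
 * leak was detected and repaired, 0 otherwise.
 */
int check_and_repair_count(int *count, int expected)
{
	if (*count == expected)
		return 0;

	fprintf(stderr, "count leak: %d, expected %d\n", *count, expected);
	*count = expected;
	return 1;
}
```

After one repaired leak, a second check against the same expected value
finds nothing to fix, so the damage stays confined to the task that
leaked.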

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:08:17 +02:00
Frans Klaver
3da56d1663 kernel: exit: fix typo in comment
s,critiera,criteria,

While at it, add a comma, because it makes sense grammatically.

Signed-off-by: Frans Klaver <fransklaver@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2015-08-07 13:59:49 +02:00