0e020c19bbbc7bf0520dde9a4802ebb03f8a03ef
53427 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0e020c19bb |
BACKPORT: clone: add CLONE_PIDFD
This patchset makes it possible to retrieve pid file descriptors at process creation time by introducing the new flag CLONE_PIDFD to the clone() system call. Linus originally suggested to implement this as a new flag to clone() instead of making it a separate system call. As spotted by Linus, there is exactly one bit for clone() left. CLONE_PIDFD creates file descriptors based on the anonymous inode implementation in the kernel that will also be used to implement the new mount api. They serve as a simple opaque handle on pids. Logically, this makes it possible to interpret a pidfd differently, narrowing or widening the scope of various operations (e.g. signal sending). Thus, a pidfd cannot just refer to a tgid, but also a tid, or in theory - given appropriate flag arguments in relevant syscalls - a process group or session. A pidfd does not represent a privilege. This does not imply it cannot ever be that way but for now this is not the case. A pidfd comes with additional information in fdinfo if the kernel supports procfs. The fdinfo file contains the pid of the process in the callers pid namespace in the same format as the procfs status file, i.e. "Pid:\t%d". As suggested by Oleg, with CLONE_PIDFD the pidfd is returned in the parent_tidptr argument of clone. This has the advantage that we can give back the associated pid and the pidfd at the same time. To remove worries about missing metadata access this patchset comes with a sample program that illustrates how a combination of CLONE_PIDFD, and pidfd_send_signal() can be used to gain race-free access to process metadata through /proc/<pid>. The sample program can easily be translated into a helper that would be suitable for inclusion in libc so that users don't have to worry about writing it themselves. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <christian@brauner.io> Co-developed-by: Jann Horn <jannh@google.com> Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Howells <dhowells@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit b3e5838252665ee4cfa76b82bdf1198dca81e5be) Conflicts: kernel/fork.c (1. Replaced proc_pid_ns() with its direct implementation.) Bug: 135608568 Test: test program using syscall(__NR_sys_pidfd_open,..) and poll() Change-Id: I3c804a92faea686e5bf7f99df893fe3a5d87ddf7 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
|
|
cf9f829523 |
BACKPORT: signal: add pidfd_send_signal() syscall
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push to address this problem [1].
This patch uses file descriptors (fd) from proc/<pid> as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall pidfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (pidfd).
/* prototype and argument /*
long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags);
/* syscall number 424 */
The syscall number was chosen to be 424 to align with Arnd's rework in his
y2038 to minimize merge conflicts (cf. [25]).
In addition to the pidfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it
is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.
/* pidfd_send_signal() replaces multiple pid-based syscalls */
The pidfd_send_signal() syscall currently takes on the job of
rt_sigqueueinfo(2) and parts of the functionality of kill(2), Namely, when a
positive pid is passed to kill(2). It will however be possible to also
replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended.
/* sending signals to threads (tid) and process groups (pgid) */
Specifically, the pidfd_send_signal() syscall does currently not operate on
process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signal to threads and
process groups appropriately named flags (e.g. PIDFD_TYPE_PGID, and
PIDFD_TYPE_TID) should be added. This implies that the flags argument will
determine what is signaled and not the file descriptor itself. Put in other
words, grouping in this api is a property of the flags argument not a
property of the file descriptor (cf. [13]). Clarification for this has been
requested by Eric (cf. [19]).
When appropriate extensions through the flags argument are added then
pidfd_send_signal() can additionally replace the part of kill(2) which
operates on process groups as well as the tgkill(2) and
rt_tgsigqueueinfo(2) syscalls.
How such an extension could be implemented has been very roughly sketched
in [14], [15], and [16]. However, this should not be taken as a commitment
to a particular implementation. There might be better ways to do it.
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]).
/* naming */
The syscall had various names throughout iterations of this patchset:
- procfd_signal()
- procfd_send_signal()
- taskfd_send_signal()
In the last round of reviews it was pointed out that given that if the
flags argument decides the scope of the signal instead of different types
of fds it might make sense to either settle for "procfd_" or "pidfd_" as
prefix. The community was willing to accept either (cf. [17] and [18]).
Given that one developer expressed strong preference for the "pidfd_"
prefix (cf. [13]) and with other developers less opinionated about the name
we should settle for "pidfd_" to avoid further bikeshedding.
The "_send_signal" suffix was chosen to reflect the fact that the syscall
takes on the job of multiple syscalls. It is therefore intentional that the
name is not reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the
fomer because it might imply that pidfd_send_signal() is a replacement for
kill(2), and not the latter because it is a hassle to remember the correct
spelling - especially for non-native speakers - and because it is not
descriptive enough of what the syscall actually does. The name
"pidfd_send_signal" makes it very clear that its job is to send signals.
/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]). However,
this can be added as an extension through the @flags argument if the need
ever arises.
/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).
/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls
(cf. [7]). The compat syscall handling is now done in kernel/signal.c
itself by adding __copy_siginfo_from_user_generic() which lets us avoid
compat syscalls (cf. [8]). It should be noted that the addition of
__copy_siginfo_from_user_any() is caused by a bug in the original
implementation of rt_sigqueueinfo(2) (cf. 12).
With upcoming rework for syscall handling things might improve
significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain
any additional callers.
/* testing */
This patch was tested on x64 and x86.
/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
unsigned int flags)
{
#ifdef __NR_pidfd_send_signal
return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
#else
return -ENOSYS;
#endif
}
int main(int argc, char *argv[])
{
int fd, ret, saved_errno, sig;
if (argc < 3)
exit(EXIT_FAILURE);
fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
if (fd < 0) {
printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
exit(EXIT_FAILURE);
}
sig = atoi(argv[2]);
printf("Sending signal %d to process %s\n", sig, argv[1]);
ret = do_pidfd_send_signal(fd, sig, NULL, 0);
saved_errno = errno;
close(fd);
errno = saved_errno;
if (ret < 0) {
printf("%s - Failed to send signal %d to process %s\n",
strerror(errno), sig, argv[1]);
exit(EXIT_FAILURE);
}
exit(EXIT_SUCCESS);
}
/* Q&A
* Given that it seems the same questions get asked again by people who are
* late to the party it makes sense to add a Q&A section to the commit
* message so it's hopefully easier to avoid duplicate threads.
*
* For the sake of progress please consider these arguments settled unless
* there is a new point that desperately needs to be addressed. Please make
* sure to check the links to the threads in this commit message whether
* this has not already been covered.
*/
Q-01: (Florian Weimer [20], Andrew Morton [21])
What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).
Q-02: (Andrew Morton [21])
Is the task_struct pinned by the fd?
A-02: No. A reference to struct pid is kept. struct pid - as far as I
understand - was created exactly for the reason to not require to
pin struct task_struct (cf. [22]).
Q-03: (Andrew Morton [21])
Does the entire procfs directory remain visible? Just one entry
within it?
A-03: The same thing that happens right now when you hold a file descriptor
to /proc/<pid> open (cf. [22]).
Q-04: (Andrew Morton [21])
Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not
recycled (cf. [22]).
Q-05: (Andrew Morton [21])
Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.
Q-06: (Andrew Morton [22])
Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs
so this is something that we will need to support anyway. Hence,
there's no immediate need to add another syscalls just to make
pidfd_send_signal() not dependent on the presence of procfs. However,
adding a syscalls to get such file descriptors is planned for a
future patchset (cf. [22]).
Q-07: (Andrew Morton [21] and others)
This fd-for-a-process sounds like a handy thing and people may well
think up other uses for it in the future, probably unrelated to
signals. Are the code and the interface designed to permit such
future applications?
A-07: Yes (cf. [22]).
Q-08: (Andrew Morton [21] and others)
Now I think about it, why a new syscall? This thing is looking
rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is
preferred for a variety or reasons. Here are just a few taken from
prior threads. Syscalls are safer than ioctl()s especially when
signaling to fds. Processes are a core kernel concept so a syscall
seems more appropriate. The layout of the syscall with its four
arguments would require the addition of a custom struct for the
ioctl() thereby causing at least the same amount or even more
complexity for userspace than a simple syscall. The new syscall will
replace multiple other pid-based syscalls (see description above).
The file-descriptors-for-processes concept introduced with this
syscall will be extended with other syscalls in the future. See also
[22], [23] and various other threads already linked in here.
Q-09: (Florian Weimer [24])
What happens if you use the new interface with an O_PATH descriptor?
A-09:
pidfds opened as O_PATH fds cannot be used to send signals to a
process (cf. [2]). Signaling processes through pidfds is the
equivalent of writing to a file. Thus, this is not an operation that
operates "purely at the file descriptor level" as required by the
open(2) manpage. See also [4].
/* References */
[1]: https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/
[2]: https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/
[3]: https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/
[4]: https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/
[5]: https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/
[6]: https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/
[7]: https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/
[8]: https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/
[9]: https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy
[11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/
[12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/
[13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/
[14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/
[15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/
[16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/
[17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/
[18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/
[19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/
[20]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/
[22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/
[23]: https://lwn.net/Articles/773459/
[24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>
(cherry picked from commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad)
Conflicts:
arch/x86/entry/syscalls/syscall_32.tbl - trivial manual merge
arch/x86/entry/syscalls/syscall_64.tbl - trivial manual merge
include/linux/proc_fs.h - trivial manual merge
include/linux/syscalls.h - trivial manual merge
include/uapi/asm-generic/unistd.h - trivial manual merge
kernel/signal.c - struct kernel_siginfo does not exist in 4.9
kernel/sys_ni.c - cond_syscall is used instead of COND_SYSCALL
arch/x86/entry/syscalls/syscall_32.tbl
arch/x86/entry/syscalls/syscall_64.tbl
(1. manual merges because of 4.9 differences
2. change prepare_kill_siginfo() to use struct siginfo instead of
kernel_siginfo
3. exclude kill() changes to avoid struct kernel_siginfo usage
4. exclude copy_siginfo_from_user_any() to avoid struct kernel_siginfo usage
5. use copy_from_user() instead of copy_siginfo_from_user() in copy_siginfo_from_user_any()
6. replaced COND_SYSCALL with cond_syscall
7. Removed __ia32_sys_pidfd_send_signal in arch/x86/entry/syscalls/syscall_32.tbl.
8. Replaced __x64_sys_pidfd_send_signal with sys_pidfd_send_signal in arch/x86/entry/syscalls/syscall_64.tbl.)
Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I00f1c618b2e9dbafae4d4113ad4d8a1a44b6957c
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
|
||
|
|
9595aa8719 |
Merge 4.9.190 into android-4.9-q
Changes in 4.9.190 usb: usbfs: fix double-free of usb memory upon submiturb error usb: iowarrior: fix deadlock on disconnect sound: fix a memory leak bug x86/mm: Check for pfn instead of page in vmalloc_sync_one() x86/mm: Sync also unmappings in vmalloc_sync_all() mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy() perf record: Fix wrong size in perf_record_mmap for last kernel module perf db-export: Fix thread__exec_comm() perf record: Fix module size on s390 usb: yurex: Fix use-after-free in yurex_delete can: peak_usb: fix potential double kfree_skb() netfilter: nfnetlink: avoid deadlock due to synchronous request_module iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND mac80211: don't warn about CW params when not using them hwmon: (nct6775) Fix register address and added missed tolerance for nct6106 cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init() s390/qdio: add sanity checks to the fast-requeue path ALSA: compress: Fix regression on compressed capture streams ALSA: compress: Prevent bypasses of set_params ALSA: compress: Don't allow paritial drain operations on capture streams ALSA: compress: Be more restrictive about when a drain is allowed perf probe: Avoid calling freeing routine multiple times for same pointer drbd: dynamically allocate shash descriptor ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id() ARM: davinci: fix sleep.S build error on ARMv4 scsi: megaraid_sas: fix panic on loading firmware crashdump scsi: ibmvfc: fix WARN_ON during event pool release scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop perf/core: Fix creating kernel counters for PMUs that override event->cpu can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices hwmon: (nct7802) Fix wrong detection of in4 presence ALSA: firewire: fix a memory leak bug ALSA: hda - Don't override global PCM hw info flag mac80211: don't WARN on short WMM parameters from AP SMB3: Fix deadlock in validate negotiate hits reconnect smb3: send CAP_DFS capability during session setup mwifiex: fix 802.11n/WPA detection iwlwifi: don't unmap as page memory that was mapped as single scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA sh: kernel: hw_breakpoint: Fix missing break in switch statement mm/usercopy: use memory range to be accessed for wraparound check mm/memcontrol.c: fix use after free in mem_cgroup_iter() bpf: get rid of pure_initcall dependency to enable jits bpf: restrict access to core bpf sysctls bpf: add bpf_jit_limit knob to restrict unpriv allocations vhost-net: set packet weight of tx polling to 2 * vq size vhost_net: use packet weight for rx handler, too vhost_net: introduce vhost_exceeds_weight() vhost: introduce vhost_exceeds_weight() vhost_net: fix possible infinite loop vhost: scsi: add weight support siphash: add cryptographically secure PRF siphash: implement HalfSipHash1-3 for hash tables inet: switch IP ID generator to siphash netfilter: ctnetlink: don't use conntrack/expect object addresses as id xtensa: add missing isync to the cpu_reset TLB code ALSA: hda - Fix a memory leak bug ALSA: hda - Add a generic reboot_notify ALSA: hda - Let all conexant codec enter D3 when rebooting HID: holtek: test for sanity of intfdata HID: hiddev: avoid opening a disconnected device HID: hiddev: do cleanup in failure of opening a device Input: kbtab - sanity check for endpoint type Input: iforce - add sanity checks net: usb: pegasus: fix improper read if get_registers() fail xen/pciback: remove set but not used variable 'old_state' irqchip/irq-imx-gpcv2: Forward irq type to parent perf header: Fix divide by zero error if f_header.attr_size==0 perf header: Fix use of unitialized value warning libata: zpodd: Fix small read overflow in zpodd_get_mech_type() scsi: hpsa: correct scsi command status issue after reset ata: libahci: do not complain in case of deferred probe kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules arm64/efi: fix variable 'si' set but not used arm64/mm: fix variable 'pud' set but not used IB/core: Add mitigation for Spectre V1 IB/mad: Fix use-after-free in ib mad completion handling ocfs2: remove set but not used variable 'last_hash' staging: comedi: dt3000: Fix signed integer overflow 'divider * base' staging: comedi: dt3000: Fix rounding up of timer divisor USB: core: Fix races in character device registration and deregistraion usb: cdc-acm: make sure a refcount is taken early enough USB: CDC: fix sanity checks in CDC union parser USB: serial: option: add D-Link DWM-222 device ID USB: serial: option: Add support for ZTE MF871A USB: serial: option: add the BroadMobi BM818 card USB: serial: option: Add Motorola modem UARTs asm-generic: fix -Wtype-limits compiler warnings bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K arm64: compat: Allow single-byte watchpoints on all addresses netfilter: conntrack: Use consistent ct id hash calculation Input: psmouse - fix build error of multiple definition iommu/amd: Move iommu_init_pci() to .init section bnx2x: Fix VF's VLAN reconfiguration in reload. net/packet: fix race in tpacket_snd() sctp: fix the transport error_count check xen/netback: Reset nr_frags before freeing skb net/mlx5e: Only support tx/rx pause setting for port owner net/mlx5e: Use flow keys dissector to parse packets for ARFS team: Add vlan tx offload to hw_enc_features bonding: Add vlan tx offload to hw_enc_features Linux 4.9.190 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
6c1dc8f96b |
bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ]
Michael and Sandipan report:
Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
and is adjusted again at init time if MODULES_VADDR is defined.
For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
the compile-time default at boot-time, which is 0x9c400000 when
using 64K page size. This overflows the signed 32-bit bpf_jit_limit
value:
root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
-1673527296
and can cause various unexpected failures throughout the network
stack. In one case `strace dhclient eth0` reported:
setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
16) = -1 ENOTSUPP (Unknown error 524)
and similar failures can be seen with tools like tcpdump. This doesn't
always reproduce however, and I'm not sure why. The more consistent
failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
host would time out on systemd/netplan configuring a virtio-net NIC
with no noticeable errors in the logs.
Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().
Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.
Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
b97a2f3d58 |
inet: switch IP ID generator to siphash
commit df453700e8d81b1bdafdf684365ee2b9431fb702 upstream. According to Amit Klein and Benny Pinkas, IP ID generation is too weak and might be used by attackers. Even with recent net_hash_mix() fix (netns: provide pure entropy for net_hash_mix()) having 64bit key and Jenkins hash is risky. It is time to switch to siphash and its 128bit keys. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Amit Klein <aksecurity@gmail.com> Reported-by: Benny Pinkas <benny@pinkas.net> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
175a407ce4 |
siphash: implement HalfSipHash1-3 for hash tables
commit 1ae2324f732c9c4e2fa4ebd885fa1001b70d52e1 upstream. HalfSipHash, or hsiphash, is a shortened version of SipHash, which generates 32-bit outputs using a weaker 64-bit key. It has *much* lower security margins, and shouldn't be used for anything too sensitive, but it could be used as a hashtable key function replacement, if the output is never exposed, and if the security requirement is not too high. The goal is to make this something that performance-critical jhash users would be willing to use. On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so we alias SipHash1-3 to HalfSipHash1-3 on those systems. 64-bit x86_64: [ 0.509409] test_siphash: SipHash2-4 cycles: 4049181 [ 0.510650] test_siphash: SipHash1-3 cycles: 2512884 [ 0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920 [ 0.512904] test_siphash: JenkinsHash cycles: 978267 So, we map hsiphash() -> SipHash1-3 32-bit x86: [ 0.509868] test_siphash: SipHash2-4 cycles: 14812892 [ 0.513601] test_siphash: SipHash1-3 cycles: 9510710 [ 0.515263] test_siphash: HalfSipHash1-3 cycles: 3856157 [ 0.515952] test_siphash: JenkinsHash cycles: 1148567 So, we map hsiphash() -> HalfSipHash1-3 hsiphash() is roughly 3 times slower than jhash(), but comes with a considerable security improvement. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9 to avoid regression for WireGuard with only half the siphash API present] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
53e054b3cd |
siphash: add cryptographically secure PRF
commit 2c956a60778cbb6a27e0c7a8a52a91378c90e1d1 upstream. SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then preceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. While SipHash is extremely fast for a cryptographically secure function, it is likely a bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 or SHA1 for creating secure sequence numbers, syn cookies, port numbers, or fast random numbers. SipHash is a faster and more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 and SHA1 with SipHash for these uses is obvious and straight-forward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch-up. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9 as dependency of commits df453700e8d8 "inet: switch IP ID generator to siphash" and 3c79107631db "netfilter: ctnetlink: don't use conntrack/expect object addresses as id"] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
c98446e1ba |
bpf: add bpf_jit_limit knob to restrict unpriv allocations
commit ede95a63b5e84ddeea6b0c473b36ab8bfd8c6ce3 upstream.
Rick reported that the BPF JIT could potentially fill the entire module
space with BPF programs from unprivileged users which would prevent later
attempts to load normal kernel modules or privileged BPF programs, for
example. If JIT was enabled but unsuccessful to generate the image, then
before commit 290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
we would always fall back to the BPF interpreter. Nowadays in the case
where the CONFIG_BPF_JIT_ALWAYS_ON could be set, then the load will abort
with a failure since the BPF interpreter was compiled out.
Add a global limit and enforce it for unprivileged users such that in case
of BPF interpreter compiled out we fail once the limit has been reached
or we fall back to BPF interpreter earlier w/o using module mem if latter
was compiled in. In a next step, fair share among unprivileged users can
be resolved in particular for the case where we would fail hard once limit
is reached.
Fixes: 290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
Fixes:
|
||
|
|
1524ee00a2 |
UPSTREAM: net/ipv6: allow sysctl to change link-local address generation mode
The address generation mode for IPv6 link-local can only be configured by netlink messages. This patch adds the ability to change the address generation mode via sysctl. v1 -> v2 Removed the rtnl lock and switch to use RCU lock to iterate through the netdev list. v2 -> v3 Removed the addrgenmode variable from the idev structure and use the systcl storage for the flag. Simplifed the logic for sysctl handling by removing the supported for all operation. Added support for more types of tunnel interfaces for link-local address generation. Based the patches from net-next. v3 -> v4 Removed unnecessary whitespace changes. Signed-off-by: Felix Jia <felix.jia@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit d35a00b8e33dab7385f724e713ae71c8be0a49f4) Bug: 138428295 Signed-off-by: Maciej Żenczykowski <maze@google.com> Change-Id: Ia526e6c1de55e51e35b4b83d95749121fee5d0d1 |
||
|
|
94c1fcc8e4 |
Merge 4.9.189 into android-4.9-q
Changes in 4.9.189
scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
ARM: dts: Add pinmuxing for i2c2 and i2c3 for LogicPD SOM-LV
ARM: dts: Add pinmuxing for i2c2 and i2c3 for LogicPD torpedo
ARM: dts: logicpd-som-lv: Fix Audio Mute
arm64: cpufeature: Fix CTR_EL0 field definitions
arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
tcp: be more careful in tcp_fragment()
HID: wacom: fix bit shift for Cintiq Companion 2
HID: Add quirk for HP X1200 PIXART OEM mouse
RDMA: Directly cast the sockaddr union to sockaddr
IB: directly cast the sockaddr union to aockaddr
objtool: Add machine_real_restart() to the noreturn list
objtool: Add rewind_stack_do_exit() to the noreturn list
libceph: use kbasename() and kill ceph_file_part()
atm: iphase: Fix Spectre v1 vulnerability
net: bridge: delete local fdb on device init failure
net: bridge: mcast: don't delete permanent entries when fast leave is enabled
net: fix ifindex collision during namespace removal
net/mlx5: Use reversed order when unregister devices
net: sched: Fix a possible null-pointer dereference in dequeue_func()
tipc: compat: allow tipc commands without arguments
compat_ioctl: pppoe: fix PPPOEIOCSFWD handling
ip6_tunnel: fix possible use-after-free on xmit
ife: error out when nla attributes are empty
bnx2x: Disable multi-cos feature.
block: blk_init_allocated_queue() set q->fq as NULL in the fail case
spi: bcm2835: Fix 3-wire mode if DMA is enabled
x86: cpufeatures: Sort feature word 7
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
x86/speculation: Enable Spectre v1 swapgs mitigations
x86/entry/64: Use JMP instead of JMPQ
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Linux 4.9.189
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
00a8794f63 |
compat_ioctl: pppoe: fix PPPOEIOCSFWD handling
[ Upstream commit 055d88242a6046a1ceac3167290f054c72571cd9 ]
Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
linux-2.5.69 along with hundreds of other commands, but was always broken
sincen only the structure is compatible, but the command number is not,
due to the size being sizeof(size_t), or at first sizeof(sizeof((struct
sockaddr_pppox)), which is different on 64-bit architectures.
Guillaume Nault adds:
And the implementation was broken until 2016 (see
|
||
|
|
22395a3e46 |
libceph: use kbasename() and kill ceph_file_part()
commit 6f4dbd149d2a151b89d1a5bbf7530ee5546c7908 upstream. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
4ebd29edaf |
Merge 4.9.188 into android-4.9-q
Changes in 4.9.188 ARM: riscpc: fix DMA ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200 ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend ftrace: Enable trampoline when rec count returns back to one kernel/module.c: Only return -EEXIST for modules that have finished loading MIPS: lantiq: Fix bitfield masking dmaengine: rcar-dmac: Reject zero-length slave DMA requests fs/adfs: super: fix use-after-free bug btrfs: fix minimum number of chunk errors for DUP ceph: fix improper use of smp_mb__before_atomic() ceph: return -ERANGE if virtual xattr value didn't fit in buffer scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized ACPI: fix false-positive -Wuninitialized warning be2net: Signal that the device cannot transmit during reconfiguration x86/apic: Silence -Wtype-limits compiler warnings x86: math-emu: Hide clang warnings for 16-bit overflow mm/cma.c: fail if fixed declaration can't be honored coda: add error handling for fget coda: fix build using bare-metal toolchain uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings ipc/mqueue.c: only perform resource calculation if user valid x86/kvm: Don't call kvm_spurious_fault() from .fixup x86, boot: Remove multiple copy of static function sanitize_boot_params() kbuild: initialize CLANG_FLAGS correctly in the top Makefile Btrfs: fix incremental send failure after deduplication mmc: dw_mmc: Fix occasional hang after tuning on eMMC gpiolib: fix incorrect IRQ requesting of an active-low lineevent selinux: fix memory leak in policydb_init() s390/dasd: fix endless loop after read unit address configuration drivers/perf: arm_pmu: Fix failure path in PM notifier xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping infiniband: fix race condition between infiniband mlx4, mlx5 driver and core dumping coredump: fix race condition between collapse_huge_page() and core dumping eeprom: at24: make spd world-readable again Backport minimal compiler_attributes.h to support GCC 9 include/linux/module.h: copy __init/__exit attrs to init/cleanup_module objtool: Support GCC 9 cold subfunction naming scheme x86, mm, gup: prevent get_page() race with munmap in paravirt guest Linux 4.9.188 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
2c34c215c1 |
include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
commit a6e60d84989fa0e91db7f236eda40453b0e44afa upstream. The upcoming GCC 9 release extends the -Wmissing-attributes warnings (enabled by -Wall) to C and aliases: it warns when particular function attributes are missing in the aliases but not in their target. In particular, it triggers for all the init/cleanup_module aliases in the kernel (defined by the module_init/exit macros), ending up being very noisy. These aliases point to the __init/__exit functions of a module, which are defined as __cold (among other attributes). However, the aliases themselves do not have the __cold attribute. Since the compiler behaves differently when compiling a __cold function as well as when compiling paths leading to calls to __cold functions, the warning is trying to point out the possibly-forgotten attribute in the alias. In order to keep the warning enabled, we decided to silence this case. Ideally, we would mark the aliases directly as __init/__exit. However, there are currently around 132 modules in the kernel which are missing __init/__exit in their init/cleanup functions (either because they are missing, or for other reasons, e.g. the functions being called from somewhere else); and a section mismatch is a hard error. A conservative alternative was to mark the aliases as __cold only. However, since we would like to eventually enforce __init/__exit to be always marked, we chose to use the new __copy function attribute (introduced by GCC 9 as well to deal with this). With it, we copy the attributes used by the target functions into the aliases. This way, functions that were not marked as __init/__exit won't have their aliases marked either, and therefore there won't be a section mismatch. Note that the warning would go away marking either the extern declaration, the definition, or both. However, we only mark the definition of the alias, since we do not want callers (which only see the declaration) to be compiled as if the function was __cold (and therefore the paths leading to those calls would be assumed to be unlikely). Link: https://lore.kernel.org/lkml/259986242.BvXPX32bHu@devpool35/ Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/lkml/20190123173707.GA16603@gmail.com/ Link: https://lore.kernel.org/lkml/20190206175627.GA20399@gmail.com/ Suggested-by: Martin Sebor <msebor@gcc.gnu.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
fe5844365e |
Backport minimal compiler_attributes.h to support GCC 9
This adds support for __copy to v4.9.y so that we can use it in init/exit_module to avoid -Werror=missing-attributes errors on GCC 9. Link: https://lore.kernel.org/lkml/259986242.BvXPX32bHu@devpool35/ Cc: <stable@vger.kernel.org> Suggested-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
d4fc64c927 |
coredump: fix race condition between collapse_huge_page() and core dumping
commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream.
When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.
If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.
khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.
collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem to the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().
Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().
So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump... as long as the coredump
can invoke page faults without holding the mmap_sem for reading.
This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().
Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes:
|
||
|
|
16903f1a5b |
coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream.
The core dumping code has always run without holding the mmap_sem for
writing, despite that is the only way to ensure that the entire vma
layout will not change from under it. Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier. For example in Hugh's post from Jul 2017:
https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
"Not strictly relevant here, but a related note: I was very surprised
to discover, only quite recently, how handle_mm_fault() may be called
without down_read(mmap_sem) - when core dumping. That seems a
misguided optimization to me, which would also be nice to correct"
In particular because the growsdown and growsup can move the
vm_start/vm_end the various loops the core dump does around the vma will
not be consistent if page faults can happen concurrently.
Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.
Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and to replace them with get_dump_page() for all binary formats
which is not suitable as a short term fix.
For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs. Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.
Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.
In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"
however the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2 so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.
Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm(). The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.
Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes:
|
||
|
|
317fc4dd52 |
uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
[ Upstream commit f90fb3c7e2c13ae829db2274b88b845a75038b8a ] Only users of upc_req in kernel side fs/coda/psdev.c and fs/coda/upcall.c already include linux/coda_psdev.h. Suggested by Jan Harkes <jaharkes@cs.cmu.edu> in https://lore.kernel.org/lkml/20150531111913.GA23377@cs.cmu.edu/ Fixes these include/uapi/linux/coda_psdev.h compilation errors in userspace: linux/coda_psdev.h:12:19: error: field `uc_chain' has incomplete type struct list_head uc_chain; ^ linux/coda_psdev.h:13:2: error: unknown type name `caddr_t' caddr_t uc_data; ^ linux/coda_psdev.h:14:2: error: unknown type name `u_short' u_short uc_flags; ^ linux/coda_psdev.h:15:2: error: unknown type name `u_short' u_short uc_inSize; /* Size is at most 5000 bytes */ ^ linux/coda_psdev.h:16:2: error: unknown type name `u_short' u_short uc_outSize; ^ linux/coda_psdev.h:17:2: error: unknown type name `u_short' u_short uc_opcode; /* copied from data to save lookup */ ^ linux/coda_psdev.h:19:2: error: unknown type name `wait_queue_head_t' wait_queue_head_t uc_sleep; /* process' wait queue */ ^ Link: http://lkml.kernel.org/r/9f99f5ce6a0563d5266e6cf7aa9585aac2cae971.1558117389.git.jaharkes@cs.cmu.edu Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi> Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: David Howells <dhowells@redhat.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Sam Protsenko <semen.protsenko@linaro.org> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: Zhouyang Jia <jiazhouyang09@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
90320a506c |
coda: fix build using bare-metal toolchain
[ Upstream commit b2a57e334086602be56b74958d9f29b955cd157f ] The kernel is self-contained project and can be built with bare-metal toolchain. But bare-metal toolchain doesn't define __linux__. Because of this u_quad_t type is not defined when using bare-metal toolchain and codafs build fails. This patch fixes it by defining u_quad_t type unconditionally. Link: http://lkml.kernel.org/r/3cbb40b0a57b6f9923a9d67b53473c0b691a3eaa.1558117389.git.jaharkes@cs.cmu.edu Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org> Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: David Howells <dhowells@redhat.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Mikko Rapeli <mikko.rapeli@iki.fi> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: Zhouyang Jia <jiazhouyang09@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
0153dbcbd4 |
ACPI: fix false-positive -Wuninitialized warning
[ Upstream commit dfd6f9ad36368b8dbd5f5a2b2f0a4705ae69a323 ]
clang gets confused by an uninitialized variable in what looks
to it like a never executed code path:
arch/x86/kernel/acpi/boot.c:618:13: error: variable 'polarity' is uninitialized when used here [-Werror,-Wuninitialized]
polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH;
^~~~~~~~
arch/x86/kernel/acpi/boot.c:606:32: note: initialize the variable 'polarity' to silence this warning
int rc, irq, trigger, polarity;
^
= 0
arch/x86/kernel/acpi/boot.c:617:12: error: variable 'trigger' is uninitialized when used here [-Werror,-Wuninitialized]
trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
^~~~~~~
arch/x86/kernel/acpi/boot.c:606:22: note: initialize the variable 'trigger' to silence this warning
int rc, irq, trigger, polarity;
^
= 0
This is unfortunately a design decision in clang and won't be fixed.
Changing the acpi_get_override_irq() macro to an inline function
reliably avoids the issue.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
0eb90dd8f7 |
Merge 4.9.187 into android-4.9-q
Changes in 4.9.187
MIPS: ath79: fix ar933x uart parity mode
MIPS: fix build on non-linux hosts
arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly
dmaengine: imx-sdma: fix use-after-free on probe error path
ath10k: Do not send probe response template for mesh
ath9k: Check for errors when reading SREV register
ath6kl: add some bounds checking
ath: DFS JP domain W56 fixed pulse type 3 RADAR detection
batman-adv: fix for leaked TVLV handler.
media: dvb: usb: fix use after free in dvb_usb_device_exit
crypto: talitos - fix skcipher failure due to wrong output IV
media: marvell-ccic: fix DMA s/g desc number calculation
media: vpss: fix a potential NULL pointer dereference
media: media_device_enum_links32: clean a reserved field
net: stmmac: dwmac1000: Clear unused address entries
net: stmmac: dwmac4/5: Clear unused address entries
signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig
af_key: fix leaks in key_pol_get_resp and dump_sp.
xfrm: Fix xfrm sel prefix length validation
media: mc-device.c: don't memset __user pointer contents
media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails.
net: phy: Check against net_device being NULL
crypto: talitos - properly handle split ICV.
crypto: talitos - Align SEC1 accesses to 32 bits boundaries.
tua6100: Avoid build warnings.
locking/lockdep: Fix merging of hlocks with non-zero references
media: wl128x: Fix some error handling in fm_v4l2_init_video_device()
cpupower : frequency-set -r option misses the last cpu in related cpu list
net: fec: Do not use netdev messages too early
net: axienet: Fix race condition causing TX hang
s390/qdio: handle PENDING state for QEBSM devices
perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode
perf test 6: Fix missing kvm module load for s390
gpio: omap: fix lack of irqstatus_raw0 for OMAP4
gpio: omap: ensure irq is enabled before wakeup
regmap: fix bulk writes on paged registers
bpf: silence warning messages in core
rcu: Force inlining of rcu_read_lock()
blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration
xfrm: fix sa selector validation
perf evsel: Make perf_evsel__name() accept a NULL argument
vhost_net: disable zerocopy by default
ipoib: correcly show a VF hardware address
EDAC/sysfs: Fix memory leak when creating a csrow object
ipsec: select crypto ciphers for xfrm_algo
media: i2c: fix warning same module names
ntp: Limit TAI-UTC offset
timer_list: Guard procfs specific code
acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
media: coda: fix mpeg2 sequence number handling
media: coda: increment sequence offset for the last returned frame
mt7601u: do not schedule rx_tasklet when the device has been disconnected
x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c
mt7601u: fix possible memory leak when the device is disconnected
ath10k: fix PCIE device wake up failed
perf tools: Increase MAX_NR_CPUS and MAX_CACHES
libata: don't request sense data on !ZAC ATA devices
clocksource/drivers/exynos_mct: Increase priority over ARM arch timer
rslib: Fix decoding of shortened codes
rslib: Fix handling of of caller provided syndrome
ixgbe: Check DDM existence in transceiver before access
crypto: asymmetric_keys - select CRYPTO_HASH where needed
EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec
bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush()
iwlwifi: mvm: Drop large non sta frames
net: usb: asix: init MAC address buffers
gpiolib: Fix references to gpiod_[gs]et_*value_cansleep() variants
Bluetooth: hci_bcsp: Fix memory leak in rx_skb
Bluetooth: 6lowpan: search for destination address in all peers
Bluetooth: Check state in l2cap_disconnect_rsp
Bluetooth: validate BLE connection interval updates
gtp: fix Illegal context switch in RCU read-side critical section.
gtp: fix use-after-free in gtp_newlink()
xen: let alloc_xenballooned_pages() fail if not enough memory free
scsi: NCR5380: Reduce goto statements in NCR5380_select()
scsi: NCR5380: Always re-enable reselection interrupt
scsi: mac_scsi: Increase PIO/PDMA transfer length threshold
crypto: ghash - fix unaligned memory access in ghash_setkey()
crypto: arm64/sha1-ce - correct digest for empty data in finup
crypto: arm64/sha2-ce - correct digest for empty data in finup
crypto: chacha20poly1305 - fix atomic sleep when using async algorithm
crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe
Input: gtco - bounds check collection indent level
regulator: s2mps11: Fix buck7 and buck8 wrong voltages
arm64: tegra: Update Jetson TX1 GPU regulator timings
iwlwifi: pcie: don't service an interrupt that was masked
tracing/snapshot: Resize spare buffer if size changed
NFSv4: Handle the special Linux file open access mode
lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
ALSA: seq: Break too long mutex context in the write loop
ALSA: hda/realtek: apply ALC891 headset fixup to one Dell machine
media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()
media: coda: Remove unbalanced and unneeded mutex unlock
KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
arm64: tegra: Fix AGIC register range
fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.
drm/nouveau/i2c: Enable i2c pads & busses during preinit
padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
9p/virtio: Add cleanup path in p9_virtio_init
PCI: Do not poll for PME if the device is in D3cold
Btrfs: add missing inode version, ctime and mtime updates when punching hole
libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
take floppy compat ioctls to sodding floppy.c
floppy: fix div-by-zero in setup_format_params
floppy: fix out-of-bounds read in next_valid_format
floppy: fix invalid pointer dereference in drive_name
floppy: fix out-of-bounds read in copy_buffer
coda: pass the host file in vma->vm_file on mmap
gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM
crypto: ccp - Validate the the error value used to index error messages
PCI: hv: Delete the device earlier from hbus->children for hot-remove
PCI: hv: Fix a use-after-free bug in hv_eject_device_work()
crypto: caam - limit output IV to CBC to work around CTR mode DMA issue
um: Allow building and running on older hosts
um: Fix FP register size for XSTATE/XSAVE
parisc: Ensure userspace privilege for ptraced processes in regset functions
parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1
powerpc/32s: fix suspend/resume when IBATs 4-7 are used
powerpc/watchpoint: Restore NV GPRs while returning from exception
eCryptfs: fix a couple type promotion bugs
intel_th: msu: Fix single mode with disabled IOMMU
Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug
usb: Handle USB3 remote wakeup for LPM enabled devices correctly
dm bufio: fix deadlock with loop device
compiler.h, kasan: Avoid duplicating __read_once_size_nocheck()
compiler.h: Add read_word_at_a_time() function.
lib/strscpy: Shut up KASAN false-positives in strscpy()
ext4: allow directory holes
bnx2x: Prevent load reordering in tx completion processing
bnx2x: Prevent ptp_task to be rescheduled indefinitely
caif-hsi: fix possible deadlock in cfhsi_exit_module()
igmp: fix memory leak in igmpv3_del_delrec()
ipv4: don't set IPv6 only flags to IPv4 addresses
net: bcmgenet: use promisc for unsupported filters
net: dsa: mv88e6xxx: wait after reset deactivation
net: neigh: fix multiple neigh timer scheduling
net: openvswitch: fix csum updates for MPLS actions
nfc: fix potential illegal memory access
rxrpc: Fix send on a connected, but unbound socket
sky2: Disable MSI on ASUS P6T
vrf: make sure skb->data contains ip header to make routing
macsec: fix use-after-free of skb during RX
macsec: fix checksumming after decryption
netrom: fix a memory leak in nr_rx_frame()
netrom: hold sock when setting skb->destructor
bonding: validate ip header before check IPPROTO_IGMP
tcp: Reset bytes_acked and bytes_received when disconnecting
net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling
net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query
net: bridge: stp: don't cache eth dest pointer before skb pull
perf/x86/amd/uncore: Rename 'L2' to 'LLC'
perf/x86/amd/uncore: Get correct number of cores sharing last level cache
perf/events/amd/uncore: Fix amd_uncore_llc ID to use pre-defined cpu_llc_id
NFSv4: Fix open create exclusive when the server reboots
nfsd: increase DRC cache limit
nfsd: give out fewer session slots as limit approaches
nfsd: fix performance-limiting session calculation
nfsd: Fix overflow causing non-working mounts on 1 TB machines
drm/panel: simple: Fix panel_simple_dsi_probe
usb: core: hub: Disable hub-initiated U1/U2
tty: max310x: Fix invalid baudrate divisors calculator
pinctrl: rockchip: fix leaked of_node references
tty: serial: cpm_uart - fix init when SMC is relocated
drm/bridge: tc358767: read display_props in get_modes()
drm/bridge: sii902x: pixel clock unit is 10kHz instead of 1kHz
memstick: Fix error cleanup path of memstick_init
tty/serial: digicolor: Fix digicolor-usart already registered warning
tty: serial: msm_serial: avoid system lockup condition
serial: 8250: Fix TX interrupt handling condition
drm/virtio: Add memory barriers for capset cache.
phy: renesas: rcar-gen2: Fix memory leak at error paths
drm/rockchip: Properly adjust to a true clock in adjusted_mode
tty: serial_core: Set port active bit in uart_port_activate
usb: gadget: Zero ffs_io_data
powerpc/pci/of: Fix OF flags parsing for 64bit BARs
PCI: sysfs: Ignore lockdep for remove attribute
kbuild: Add -Werror=unknown-warning-option to CLANG_FLAGS
PCI: xilinx-nwl: Fix Multi MSI data programming
iio: iio-utils: Fix possible incorrect mask calculation
recordmcount: Fix spurious mcount entries on powerpc
mfd: core: Set fwnode for created devices
mfd: arizona: Fix undefined behavior
mfd: hi655x-pmic: Fix missing return value check for devm_regmap_init_mmio_clk
um: Silence lockdep complaint about mmap_sem
powerpc/4xx/uic: clear pending interrupt after irq type/pol change
RDMA/i40iw: Set queue pair state when being queried
serial: sh-sci: Terminate TX DMA during buffer flushing
serial: sh-sci: Fix TX DMA buffer flushing and workqueue races
kallsyms: exclude kasan local symbols on s390
perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning
RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
f2fs: avoid out-of-range memory access
mailbox: handle failed named mailbox channel request
powerpc/eeh: Handle hugepages in ioremap space
sh: prevent warnings when using iounmap
mm/kmemleak.c: fix check for softirq context
9p: pass the correct prototype to read_cache_page
mm/mmu_notifier: use hlist_add_head_rcu()
locking/lockdep: Fix lock used or unused stats error
locking/lockdep: Hide unused 'class' variable
usb: wusbcore: fix unbalanced get/put cluster_id
usb: pci-quirks: Correct AMD PLL quirk detection
x86/sysfb_efi: Add quirks for some devices with swapped width and height
x86/speculation/mds: Apply more accurate check on hypervisor platform
hpet: Fix division by zero in hpet_time_div()
ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1
ALSA: hda - Add a conexant codec entry to let mute led work
powerpc/tm: Fix oops on sigreturn on systems without TM
access: avoid the RCU grace period for the temporary subjective credentials
ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt
tcp: reset sk_send_head in tcp_write_queue_purge
arm64: dts: marvell: Fix A37xx UART0 register size
i2c: qup: fixed releasing dma without flush operation completion
arm64: compat: Provide definition for COMPAT_SIGMINSTKSZ
ISDN: hfcsusb: checking idx of ep configuration
media: au0828: fix null dereference in error path
media: cpia2_usb: first wake up, then free in disconnect
media: radio-raremono: change devm_k*alloc to k*alloc
Bluetooth: hci_uart: check for missing tty operations
sched/fair: Don't free p->numa_faults with concurrent readers
drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
ceph: hold i_ceph_lock when removing caps for freeing inode
Linux 4.9.187
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
837ffc9723 |
sched/fair: Don't free p->numa_faults with concurrent readers
commit 16d51a590a8ce3befb1308e0e7ab77f3b661af33 upstream.
When going through execve(), zero out the NUMA fault statistics instead of
freeing them.
During execve, the task is reachable through procfs and the scheduler. A
concurrent /proc/*/sched reader can read data from a freed ->numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.
Another way to fix this would be to make ->numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes:
|
||
|
|
50810015e0 |
access: avoid the RCU grace period for the temporary subjective credentials
commit d7852fbd0f0423937fa287a598bfde188bb68c22 upstream. It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU work because it installs a temporary credential that gets allocated and freed for each system call. The allocation and freeing overhead is mostly benign, but because credentials can be accessed under the RCU read lock, the freeing involves a RCU grace period. Which is not a huge deal normally, but if you have a lot of access() calls, this causes a fair amount of seconday damage: instead of having a nice alloc/free patterns that hits in hot per-CPU slab caches, you have all those delayed free's, and on big machines with hundreds of cores, the RCU overhead can end up being enormous. But it turns out that all of this is entirely unnecessary. Exactly because access() only installs the credential as the thread-local subjective credential, the temporary cred pointer doesn't actually need to be RCU free'd at all. Once we're done using it, we can just free it synchronously and avoid all the RCU overhead. So add a 'non_rcu' flag to 'struct cred', which can be set by users that know they only use it in non-RCU context (there are other potential users for this). We can make it a union with the rcu freeing list head that we need for the RCU case, so this doesn't need any extra storage. Note that this also makes 'get_current_cred()' clear the new non_rcu flag, in case we have filesystems that take a long-term reference to the cred and then expect the RCU delayed freeing afterwards. It's not entirely clear that this is required, but it makes for clear semantics: the subjective cred remains non-RCU as long as you only access it synchronously using the thread-local accessors, but you _can_ use it as a generic cred if you want to. It is possible that we should just remove the whole RCU markings for ->cred entirely. Only ->real_cred is really supposed to be accessed through RCU, and the long-term cred copies that nfs uses might want to explicitly re-enable RCU freeing if required, rather than have get_current_cred() do it implicitly. But this is a "minimal semantic changes" change for the immediate problem. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jan Glauber <jglauber@marvell.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com> Cc: Greg KH <greg@kroah.com> Cc: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
4b5d4bdfd1 |
compiler.h: Add read_word_at_a_time() function.
[ Upstream commit 7f1e541fc8d57a143dd5df1d0a1276046e08c083 ] Sometimes we know that it's safe to do potentially out-of-bounds access because we know it won't cross a page boundary. Still, KASAN will report this as a bug. Add read_word_at_a_time() function which is supposed to be used in such cases. In read_word_at_a_time() KASAN performs relaxed check - only the first byte of access is validated. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
229b670e66 |
compiler.h, kasan: Avoid duplicating __read_once_size_nocheck()
[ Upstream commit bdb5ac801af3d81d36732c2f640d6a1d3df83826 ] Instead of having two identical __read_once_size_nocheck() functions with different attributes, consolidate all the difference in new macro __no_kasan_or_inline and use it. No functional changes. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
df5b05868d |
clocksource/drivers/exynos_mct: Increase priority over ARM arch timer
[ Upstream commit 6282edb72bed5324352522d732080d4c1b9dfed6 ] Exynos SoCs based on CA7/CA15 have 2 timer interfaces: custom Exynos MCT (Multi Core Timer) and standard ARM Architected Timers. There are use cases, where both timer interfaces are used simultanously. One of such examples is using Exynos MCT for the main system timer and ARM Architected Timers for the KVM and virtualized guests (KVM requires arch timers). Exynos Multi-Core Timer driver (exynos_mct) must be however started before ARM Architected Timers (arch_timer), because they both share some common hardware blocks (global system counter) and turning on MCT is needed to get ARM Architected Timer working properly. To ensure selecting Exynos MCT as the main system timer, increase MCT timer rating. To ensure proper starting order of both timers during suspend/resume cycle, increase MCT hotplug priority over ARM Archictected Timers. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
8151383a17 |
rcu: Force inlining of rcu_read_lock()
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ #1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] #1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] #2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
83cf3cfdb2 |
Merge 4.9.186 into android-4.9-q
Changes in 4.9.186 crypto: talitos - rename alternative AEAD algos. Input: elantech - enable middle button support on 2 ThinkPads samples, bpf: fix to change the buffer size for read() staging:iio:ad7150: fix threshold mode config bit mac80211: mesh: fix RCU warning mac80211: free peer keys before vif down in mesh mwifiex: Fix possible buffer overflows at parsing bss descriptor netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments netfilter: ipv6: nf_defrag: accept duplicate fragments again dt-bindings: can: mcp251x: add mcp25625 support can: mcp251x: add support for mcp25625 Input: imx_keypad - make sure keyboard can always wake up system KVM: arm/arm64: vgic: Fix kvm_device leak in vgic_its_destroy mlxsw: spectrum: Disallow prio-tagged packets when PVID is removed ARM: davinci: da850-evm: call regulator_has_full_constraints() ARM: davinci: da8xx: specify dma_coherent_mask for lcdc mac80211: only warn once on chanctx_conf being NULL md: fix for divide error in status_resync bnx2x: Check if transceiver implements DDM before access ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL net :sunrpc :clnt :Fix xps refcount imbalance on the error path udf: Fix incorrect final NOT_ALLOCATED (hole) extent length x86/ptrace: Fix possible spectre-v1 in ptrace_get_debugreg() x86/tls: Fix possible spectre-v1 in do_get_thread_area() mwifiex: Abort at too short BSS descriptor element mwifiex: Fix heap overflow in mwifiex_uap_parse_tail_ies() fscrypt: don't set policy for a dead directory mwifiex: Don't abort on small, spec-compliant vendor IEs USB: serial: ftdi_sio: add ID for isodebug v1 USB: serial: option: add support for GosunCn ME3630 RNDIS mode Revert "serial: 8250: Don't service RX FIFO if interrupts are disabled" p54usb: Fix race between disconnect and firmware loading usb: gadget: ether: Fix race between gether_disconnect and rx_submit usb: renesas_usbhs: add a workaround for a race condition of workqueue staging: comedi: dt282x: fix a null pointer deref on interrupt staging: comedi: amplc_pci230: fix null pointer deref on interrupt carl9170: fix misuse of device driver API VMCI: Fix integer overflow in VMCI handle arrays MIPS: Remove superfluous check for __linux__ Revert "e1000e: fix cyclic resets at link up with active tx" e1000e: start network tx queue only when link is up nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header arm64: crypto: remove accidentally backported files perf/core: Fix perf_sample_regs_user() mm check ARM: omap2: remove incorrect __init annotation be2net: fix link failure after ethtool offline test ppp: mppe: Add softdep to arc4 sis900: fix TX completion ARM: dts: imx6ul: fix PWM[1-4] interrupts dm verity: use message limit for data block corruption message ARC: hide unused function unw_hdr_alloc s390: fix stfle zero padding s390/qdio: (re-)initialize tiqdio list entries s390/qdio: don't touch the dsci in tiqdio_add_input_queues() Linux 4.9.186 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
6f56992a4e |
VMCI: Fix integer overflow in VMCI handle arrays
commit 1c2eb5b2853c9f513690ba6b71072d8eb65da16a upstream. The VMCI handle array has an integer overflow in vmci_handle_arr_append_entry when it tries to expand the array. This can be triggered from a guest, since the doorbell link hypercall doesn't impose a limit on the number of doorbell handles that a VM can create in the hypervisor, and these handles are stored in a handle array. In this change, we introduce a mandatory max capacity for handle arrays/lists to avoid excessive memory usage. Signed-off-by: Vishnu Dasa <vdasa@vmware.com> Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
39eed54804 |
Merge 4.9.185 into android-4.9-q
Changes in 4.9.185
tracing: Silence GCC 9 array bounds warning
gcc-9: silence 'address-of-packed-member' warning
scsi: ufs: Avoid runtime suspend possibly being blocked forever
usb: chipidea: udc: workaround for endpoint conflict issue
IB/hfi1: Silence txreq allocation warnings
Input: uinput - add compat ioctl number translation for UI_*_FF_UPLOAD
apparmor: enforce nullbyte at end of tag string
ARC: fix build warnings with !CONFIG_KPROBES
parport: Fix mem leak in parport_register_dev_model
parisc: Fix compiler warnings in float emulation code
IB/rdmavt: Fix alloc_qpn() WARN_ON()
IB/hfi1: Insure freeze_work work_struct is canceled on shutdown
IB/{qib, hfi1, rdmavt}: Correct ibv_devinfo max_mr value
MIPS: uprobes: remove set but not used variable 'epc'
net: dsa: mv88e6xxx: avoid error message on remove from VLAN 0
net: hns: Fix loopback test failed at copper ports
sparc: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD
net: ethernet: mediatek: Use hw_feature to judge if HWLRO is supported
net: ethernet: mediatek: Use NET_IP_ALIGN to judge if HW RX_2BYTE_OFFSET is enabled
drm/arm/hdlcd: Allow a bit of clock tolerance
scripts/checkstack.pl: Fix arm64 wrong or unknown architecture
scsi: ufs: Check that space was properly alloced in copy_query_response
s390/qeth: fix VLAN attribute in bridge_hostnotify udev event
hwmon: (pmbus/core) Treat parameters as paged if on multiple pages
nvme: Fix u32 overflow in the number of namespace list calculation
btrfs: start readahead also in seed devices
can: flexcan: fix timeout when set small bitrate
can: purge socket error queue on sock destruct
powerpc/bpf: use unsigned division instruction for 64-bit operations
ARM: imx: cpuidle-imx6sx: Restrict the SW2ISO increase to i.MX6SX
Bluetooth: Align minimum encryption key size for LE and BR/EDR connections
Bluetooth: Fix regression with minimum encryption key size alignment
cfg80211: fix memory leak of wiphy device name
mac80211: drop robust management frames from unknown TA
mac80211: Do not use stack memory with scatterlist for GMAC
IB/hfi1: Avoid hardlockup with flushlist_lock
perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul
perf help: Remove needless use of strncpy()
perf header: Fix unchecked usage of strncpy()
9p/rdma: do not disconnect on down_interruptible EAGAIN
9p: acl: fix uninitialized iattr access
9p/rdma: remove useless check in cm_event_handler
9p: p9dirent_read: check network-provided name length
net/9p: include trans_common.h to fix missing prototype warning.
fs/proc/array.c: allow reporting eip/esp for all coredumping threads
fs/binfmt_flat.c: make load_flat_shared_library() work
mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()
x86/speculation: Allow guests to use SSBD even if host does not
NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
cpu/speculation: Warn on unsupported mitigations= parameter
af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
net: stmmac: fixed new system time seconds value calculation
sctp: change to hold sk after auth shkey is created successfully
tipc: change to use register_pernet_device
tipc: check msg->req data len in tipc_nl_compat_bearer_disable
tun: wake up waitqueues after IFF_UP is set
team: Always enable vlan tx offload
bonding: Always enable vlan tx offload
ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop
net: check before dereferencing netdev_ops during busy poll
bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro
bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err
tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb
Bluetooth: Fix faulty expression for minimum encryption key size check
ASoC : cs4265 : readable register too low
ASoC: soc-pcm: BE dai needs prepare when pause release after resume
spi: bitbang: Fix NULL pointer dereference in spi_unregister_master
drm/mediatek: fix unbind functions
ASoC: max98090: remove 24-bit format support if RJ is 0
usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i]
usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC
scsi: hpsa: correct ioaccel2 chaining
scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE
mm/mlock.c: change count_mm_mlocked_page_nr return type
MIPS: math-emu: do not use bools for arithmetic
MIPS: netlogic: xlr: Remove erroneous check in nlm_fmn_send()
mfd: omap-usb-tll: Fix register offsets
ARC: fix allnoconfig build warning
bug.h: work around GCC PR82365 in BUG()
ARC: handle gcc generated __builtin_trap for older compiler
clk: sunxi: fix uninitialized access
KVM: x86: degrade WARN to pr_warn_ratelimited
drm/i915/dmc: protect against reading random memory
MIPS: Workaround GCC __builtin_unreachable reordering bug
ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME
crypto: user - prevent operating on larval algorithms
ALSA: seq: fix incorrect order of dest_client/dest_ports arguments
ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages
ALSA: line6: Fix write on zero-sized buffer
ALSA: usb-audio: fix sign unintended sign extension on left shifts
lib/mpi: Fix karactx leak in mpi_powm
drm/imx: notify drm core before sending event during crtc disable
drm/imx: only send event on crtc disable if kept disabled
btrfs: Ensure replaced device doesn't have pending chunk allocation
tty: rocket: fix incorrect forward declaration of 'rp_init()'
arm64, vdso: Define vdso_{start,end} as array
KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
IB/hfi1: Close PSM sdma_progress sleep window
MIPS: Add missing EHB in mtc0 -> mfc0 sequence.
dmaengine: imx-sdma: remove BD_INTR for channel0
arm64: kaslr: keep modules inside module region when KASAN is enabled
Linux 4.9.185
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
074d0aaec0 |
bug.h: work around GCC PR82365 in BUG()
[ Upstream commit 173a3efd3edb2ef6ef07471397c5f542a360e9c1 ] Looking at functions with large stack frames across all architectures led me discovering that BUG() suffers from the same problem as fortify_panic(), which I've added a workaround for already. In short, variables that go out of scope by calling a noreturn function or __builtin_unreachable() keep using stack space in functions afterwards. A workaround that was identified is to insert an empty assembler statement just before calling the function that doesn't return. I'm adding a macro "barrier_before_unreachable()" to document this, and insert calls to that in all instances of BUG() that currently suffer from this problem. The files that saw the largest change from this had these frame sizes before, and much less with my patch: fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=] drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=] In case of ARC and CRIS, it turns out that the BUG() implementation actually does return (or at least the compiler thinks it does), resulting in lots of warnings about uninitialized variable use and leaving noreturn functions, such as: block/cfq-iosched.c: In function 'cfq_async_queue_prio': block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] include/linux/dmaengine.h: In function 'dma_maxpq': include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] This makes them call __builtin_trap() instead, which should normally dump the stack and kill the current process, like some of the other architectures already do. I tried adding barrier_before_unreachable() to panic() and fortify_panic() as well, but that had very little effect, so I'm not submitting that patch. Vineet said: : For ARC, it is double win. : : 1. Fixes 3 -Wreturn-type warnings : : | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function : [-Wreturn-type] : | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void function : [-Wreturn-type] : | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of : non-void function [-Wreturn-type] : : 2. bloat-o-meter reports code size improvements as gcc elides the : generated code for stack return. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc] Tested-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc] Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christopher Li <sparse@chrisli.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> [removed cris chunks - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
a0b21f86b2 |
Merge 4.9.183 into android-4.9-q
Changes in 4.9.183 rapidio: fix a NULL pointer dereference when create_workqueue() fails fs/fat/file.c: issue flush after the writeback of FAT sysctl: return -EINVAL if val violates minmax ipc: prevent lockup on alloc_msg and free_msg ARM: prevent tracing IPI_CPU_BACKTRACE hugetlbfs: on restore reserve error path retain subpool reservation mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE mm/cma.c: fix crash on CMA allocation if bitmap allocation fails mm/cma_debug.c: fix the break condition in cma_maxchunk_get() mm/slab.c: fix an infinite loop in leaks_show() kernel/sys.c: prctl: fix false positive in validate_prctl_map() drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER mfd: tps65912-spi: Add missing of table registration mfd: intel-lpss: Set the device in reset state when init mfd: twl6040: Fix device init errors for ACCCTL register perf/x86/intel: Allow PEBS multi-entry in watermark mode drm/bridge: adv7511: Fix low refresh rate selection objtool: Don't use ignore flag for fake jumps pwm: meson: Use the spin-lock only to protect register modifications ntp: Allow TAI-UTC offset to be set to zero f2fs: fix to avoid panic in do_recover_data() f2fs: fix to clear dirty inode in error path of f2fs_iget() f2fs: fix to do sanity check on valid block count of segment configfs: fix possible use-after-free in configfs_register_group uml: fix a boot splat wrt use of cpu_all_mask watchdog: imx2_wdt: Fix set_timeout for big timeout values watchdog: fix compile time error of pretimeout governors iommu/vt-d: Set intel_iommu_gfx_mapped correctly ALSA: hda - Register irq handler after the chip initialization nvmem: core: fix read buffer in place fuse: retrieve: cap requested size to negotiated max_write nfsd: allow fh_want_write to be called twice x86/PCI: Fix PCI IRQ routing table memory leak platform/chrome: cros_ec_proto: check for NULL transfer function soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288 ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA ARM: dts: imx7d: Specify IMX7D_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6ul: Specify IMX6UL_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA PCI: rpadlpar: Fix leaked device_node references in add/remove paths platform/x86: intel_pmc_ipc: adding error handling PCI: rcar: Fix a potential NULL pointer dereference PCI: rcar: Fix 64bit MSI message address handling video: hgafb: fix potential NULL pointer dereference video: imsttfb: fix potential NULL pointer dereferences PCI: xilinx: Check for __get_free_pages() failure gpio: gpio-omap: add check for off wake capable gpios dmaengine: idma64: Use actual device for DMA transfers pwm: tiehrpwm: Update shadow register for disabling PWMs ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa pwm: Fix deadlock warning when removing PWM device ARM: exynos: Fix undefined instruction during Exynos5422 resume Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections" ALSA: seq: Cover unsubscribe_port() in list_mutex ALSA: oxfw: allow PCM capture for Stanton SCS.1m libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node fs/ocfs2: fix race in ocfs2_dentry_attach_lock() signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO ptrace: restore smp_rmb() in __ptrace_may_access() media: v4l2-ioctl: clear fields in s_parm i2c: acorn: fix i2c warning bcache: fix stack corruption by PRECEDING_KEY() cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css() ASoC: cs42xx8: Add regcache mask dirty ASoC: fsl_asrc: Fix the issue about unsupported rate x86/uaccess, kcov: Disable stack protector ALSA: seq: Protect in-kernel ioctl calls with mutex ALSA: seq: Fix race of get-subscription call vs port-delete ioctls Revert "ALSA: seq: Protect in-kernel ioctl calls with mutex" Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var scsi: lpfc: add check for loss of ndlp when sending RRQ arm64/mm: Inhibit huge-vmap with ptdump scsi: bnx2fc: fix incorrect cast to u64 on shift operation selftests/timers: Add missing fflush(stdout) calls usbnet: ipheth: fix racing condition KVM: x86/pmu: do not mask the value that is written to fixed PMUs KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define() usb: dwc2: Fix DMA cache alignment issues USB: Fix chipmunk-like voice when using Logitech C270 for recording audio. USB: usb-storage: Add new ID to ums-realtek USB: serial: pl2303: add Allied Telesis VT-Kit3 USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode USB: serial: option: add Telit 0x1260 and 0x1261 compositions rtc: pcf8523: don't return invalid date when battery is low ax25: fix inconsistent lock state in ax25_destroy_timer be2net: Fix number of Rx queues used for flow hashing ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero lapb: fixed leak of control-blocks. neigh: fix use-after-free read in pneigh_get_next sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg Revert "staging: vc04_services: prevent integer overflow in create_pagelist()" perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints selftests: netfilter: missing error check when setting up veth interface mISDN: make sure device name is NUL terminated x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor perf/ring_buffer: Fix exposing a temporarily decreased data_head perf/ring_buffer: Add ordering to rb->nest increment gpio: fix gpio-adp5588 build errors net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE() i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr configfs: Fix use-after-free when accessing sd->s_dentry perf data: Fix 'strncat may truncate' build failure with recent gcc perf record: Fix s390 missing module symbol and warning for non-root users ia64: fix build errors by exporting paddr_to_nid() KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route() scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask scsi: libsas: delete sas port if expander discover failed mlxsw: spectrum: Prevent force of 56G Abort file_remove_privs() for non-reg. files Linux 4.9.183 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
df260f7a03 |
cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
commit 18fa84a2db0e15b02baa5d94bdb5bd509175d2f6 upstream.
A PF_EXITING task can stay associated with an offline css. If such
task calls task_get_css(), it can get stuck indefinitely. This can be
triggered by BSD process accounting which writes to a file with
PF_EXITING set when racing against memcg disable as in the backtrace
at the end.
After this change, task_get_css() may return a css which was already
offline when the function was called. None of the existing users are
affected by this change.
INFO: rcu_sched self-detected stall on CPU
INFO: rcu_sched detected stalls on CPUs/tasks:
...
NMI backtrace for cpu 0
...
Call Trace:
<IRQ>
dump_stack+0x46/0x68
nmi_cpu_backtrace.cold.2+0x13/0x57
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x9e/0xce
rcu_check_callbacks.cold.74+0x2af/0x433
update_process_times+0x28/0x60
tick_sched_timer+0x34/0x70
__hrtimer_run_queues+0xee/0x250
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x56/0x110
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:balance_dirty_pages_ratelimited+0x28f/0x3d0
...
btrfs_file_write_iter+0x31b/0x563
__vfs_write+0xfa/0x140
__kernel_write+0x4f/0x100
do_acct_process+0x495/0x580
acct_process+0xb9/0xdb
do_exit+0x748/0xa00
do_group_exit+0x3a/0xa0
get_signal+0x254/0x560
do_signal+0x23/0x5c0
exit_to_usermode_loop+0x5d/0xa0
prepare_exit_to_usermode+0x53/0x80
retint_user+0x8/0x8
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v4.2+
Fixes:
|
||
|
|
d7650c74b6 |
pwm: Fix deadlock warning when removing PWM device
[ Upstream commit 347ab9480313737c0f1aaa08e8f2e1a791235535 ] This patch fixes deadlock warning if removing PWM device when CONFIG_PROVE_LOCKING is enabled. This issue can be reproceduced by the following steps on the R-Car H3 Salvator-X board if the backlight is disabled: # cd /sys/class/pwm/pwmchip0 # echo 0 > export # ls device export npwm power pwm0 subsystem uevent unexport # cd device/driver # ls bind e6e31000.pwm uevent unbind # echo e6e31000.pwm > unbind [ 87.659974] ====================================================== [ 87.666149] WARNING: possible circular locking dependency detected [ 87.672327] 5.0.0 #7 Not tainted [ 87.675549] ------------------------------------------------------ [ 87.681723] bash/2986 is trying to acquire lock: [ 87.686337] 000000005ea0e178 (kn->count#58){++++}, at: kernfs_remove_by_name_ns+0x50/0xa0 [ 87.694528] [ 87.694528] but task is already holding lock: [ 87.700353] 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.707405] [ 87.707405] which lock already depends on the new lock. [ 87.707405] [ 87.715574] [ 87.715574] the existing dependency chain (in reverse order) is: [ 87.723048] [ 87.723048] -> #1 (pwm_lock){+.+.}: [ 87.728017] __mutex_lock+0x70/0x7e4 [ 87.732108] mutex_lock_nested+0x1c/0x24 [ 87.736547] pwm_request_from_chip.part.6+0x34/0x74 [ 87.741940] pwm_request_from_chip+0x20/0x40 [ 87.746725] export_store+0x6c/0x1f4 [ 87.750820] dev_attr_store+0x18/0x28 [ 87.754998] sysfs_kf_write+0x54/0x64 [ 87.759175] kernfs_fop_write+0xe4/0x1e8 [ 87.763615] __vfs_write+0x40/0x184 [ 87.767619] vfs_write+0xa8/0x19c [ 87.771448] ksys_write+0x58/0xbc [ 87.775278] __arm64_sys_write+0x18/0x20 [ 87.779721] el0_svc_common+0xd0/0x124 [ 87.783986] el0_svc_compat_handler+0x1c/0x24 [ 87.788858] el0_svc_compat+0x8/0x18 [ 87.792947] [ 87.792947] -> #0 (kn->count#58){++++}: [ 87.798260] lock_acquire+0xc4/0x22c [ 87.802353] __kernfs_remove+0x258/0x2c4 [ 87.806790] kernfs_remove_by_name_ns+0x50/0xa0 [ 87.811836] remove_files.isra.1+0x38/0x78 [ 87.816447] sysfs_remove_group+0x48/0x98 [ 87.820971] sysfs_remove_groups+0x34/0x4c [ 87.825583] device_remove_attrs+0x6c/0x7c [ 87.830197] device_del+0x11c/0x33c [ 87.834201] device_unregister+0x14/0x2c [ 87.838638] pwmchip_sysfs_unexport+0x40/0x4c [ 87.843509] pwmchip_remove+0xf4/0x13c [ 87.847773] rcar_pwm_remove+0x28/0x34 [ 87.852039] platform_drv_remove+0x24/0x64 [ 87.856651] device_release_driver_internal+0x18c/0x21c [ 87.862391] device_release_driver+0x14/0x1c [ 87.867175] unbind_store+0xe0/0x124 [ 87.871265] drv_attr_store+0x20/0x30 [ 87.875442] sysfs_kf_write+0x54/0x64 [ 87.879618] kernfs_fop_write+0xe4/0x1e8 [ 87.884055] __vfs_write+0x40/0x184 [ 87.888057] vfs_write+0xa8/0x19c [ 87.891887] ksys_write+0x58/0xbc [ 87.895716] __arm64_sys_write+0x18/0x20 [ 87.900154] el0_svc_common+0xd0/0x124 [ 87.904417] el0_svc_compat_handler+0x1c/0x24 [ 87.909289] el0_svc_compat+0x8/0x18 [ 87.913378] [ 87.913378] other info that might help us debug this: [ 87.913378] [ 87.921374] Possible unsafe locking scenario: [ 87.921374] [ 87.927286] CPU0 CPU1 [ 87.931808] ---- ---- [ 87.936331] lock(pwm_lock); [ 87.939293] lock(kn->count#58); [ 87.945120] lock(pwm_lock); [ 87.950599] lock(kn->count#58); [ 87.953908] [ 87.953908] *** DEADLOCK *** [ 87.953908] [ 87.959821] 4 locks held by bash/2986: [ 87.963563] #0: 00000000ace7bc30 (sb_writers#6){.+.+}, at: vfs_write+0x188/0x19c [ 87.971044] #1: 00000000287991b2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xb4/0x1e8 [ 87.978872] #2: 00000000f739d016 (&dev->mutex){....}, at: device_release_driver_internal+0x40/0x21c [ 87.988001] #3: 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.995481] [ 87.995481] stack backtrace: [ 87.999836] CPU: 0 PID: 2986 Comm: bash Not tainted 5.0.0 #7 [ 88.005489] Hardware name: Renesas Salvator-X board based on r8a7795 ES1.x (DT) [ 88.012791] Call trace: [ 88.015235] dump_backtrace+0x0/0x190 [ 88.018891] show_stack+0x14/0x1c [ 88.022204] dump_stack+0xb0/0xec [ 88.025514] print_circular_bug.isra.32+0x1d0/0x2e0 [ 88.030385] __lock_acquire+0x1318/0x1864 [ 88.034388] lock_acquire+0xc4/0x22c [ 88.037958] __kernfs_remove+0x258/0x2c4 [ 88.041874] kernfs_remove_by_name_ns+0x50/0xa0 [ 88.046398] remove_files.isra.1+0x38/0x78 [ 88.050487] sysfs_remove_group+0x48/0x98 [ 88.054490] sysfs_remove_groups+0x34/0x4c [ 88.058580] device_remove_attrs+0x6c/0x7c [ 88.062671] device_del+0x11c/0x33c [ 88.066154] device_unregister+0x14/0x2c [ 88.070070] pwmchip_sysfs_unexport+0x40/0x4c [ 88.074421] pwmchip_remove+0xf4/0x13c [ 88.078163] rcar_pwm_remove+0x28/0x34 [ 88.081906] platform_drv_remove+0x24/0x64 [ 88.085996] device_release_driver_internal+0x18c/0x21c [ 88.091215] device_release_driver+0x14/0x1c [ 88.095478] unbind_store+0xe0/0x124 [ 88.099048] drv_attr_store+0x20/0x30 [ 88.102704] sysfs_kf_write+0x54/0x64 [ 88.106359] kernfs_fop_write+0xe4/0x1e8 [ 88.110275] __vfs_write+0x40/0x184 [ 88.113757] vfs_write+0xa8/0x19c [ 88.117065] ksys_write+0x58/0xbc [ 88.120374] __arm64_sys_write+0x18/0x20 [ 88.124291] el0_svc_common+0xd0/0x124 [ 88.128034] el0_svc_compat_handler+0x1c/0x24 [ 88.132384] el0_svc_compat+0x8/0x18 The sysfs unexport in pwmchip_remove() is completely asymmetric to what we do in pwmchip_add_with_polarity() and commit |
||
|
|
8b83e43efb |
Merge 4.9.182 into android-4.9-q
Changes in 4.9.182 tcp: reduce tcp_fastretrans_alert() verbosity tcp: limit payload size of sacked skbs tcp: tcp_fragment() should apply sane memory limits tcp: add tcp_min_snd_mss sysctl tcp: enforce tcp_min_snd_mss in tcp_mtu_probing() Linux 4.9.182 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
cc1b58ccb7 |
tcp: limit payload size of sacked skbs
commit 3b4929f65b0d8249f19a50245cd88ed1a2f78cff upstream.
Jonathan Looney reported that TCP can trigger the following crash
in tcp_shifted_skb() :
BUG_ON(tcp_skb_pcount(skb) < pcount);
This can happen if the remote peer has advertized the smallest
MSS that linux TCP accepts : 48
An skb can hold 17 fragments, and each fragment can hold 32KB
on x86, or 64KB on PowerPC.
This means that the 16bit witdh of TCP_SKB_CB(skb)->tcp_gso_segs
can overflow.
Note that tcp_sendmsg() builds skbs with less than 64KB
of payload, so this problem needs SACK to be enabled.
SACK blocks allow TCP to coalesce multiple skbs in the retransmit
queue, thus filling the 17 fragments to maximal capacity.
CVE-2019-11477 -- u16 overflow of TCP_SKB_CB(skb)->tcp_gso_segs
Backport notes, provided by Joao Martins <joao.m.martins@oracle.com>
v4.15 or since commit 737ff314563 ("tcp: use sequence distance to
detect reordering") had switched from the packet-based FACK tracking and
switched to sequence-based.
v4.14 and older still have the old logic and hence on
tcp_skb_shift_data() needs to retain its original logic and have
@fack_count in sync. In other words, we keep the increment of pcount with
tcp_skb_pcount(skb) to later used that to update fack_count. To make it
more explicit we track the new skb that gets incremented to pcount in
@next_pcount, and we get to avoid the constant invocation of
tcp_skb_pcount(skb) all together.
Fixes:
|
||
|
|
77161ce013 |
Merge 4.9.181 into android-4.9-q
Changes in 4.9.181
ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
llc: fix skb leak in llc_build_and_send_ui_pkt()
net: fec: fix the clk mismatch in failed_reset path
net-gro: fix use-after-free read in napi_gro_frags()
net: stmmac: fix reset gpio free missing
usbnet: fix kernel crash after disconnect
tipc: Avoid copying bytes beyond the supplied data
bnxt_en: Fix aggregation buffer leak under OOM condition.
ipv4/igmp: fix another memory leak in igmpv3_del_delrec()
ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST
net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT
net: mvneta: Fix err code path of probe
net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
crypto: vmx - ghash: do nosimd fallback manually
xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
Revert "tipc: fix modprobe tipc failed after switch order of device registration"
tipc: fix modprobe tipc failed after switch order of device registration
sparc64: Fix regression in non-hypervisor TLB flush xcall
include/linux/bitops.h: sanitize rotate primitives
xhci: update bounce buffer with correct sg num
xhci: Use %zu for printing size_t type
xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
usb: xhci: avoid null pointer deref when bos field is NULL
usbip: usbip_host: fix BUG: sleeping function called from invalid context
usbip: usbip_host: fix stub_dev lock context imbalance regression
USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
USB: sisusbvga: fix oops in error path of sisusb_probe
USB: Add LPM quirk for Surface Dock GigE adapter
USB: rio500: refuse more than one device at a time
USB: rio500: fix memory leak in close after disconnect
media: usb: siano: Fix general protection fault in smsusb
media: usb: siano: Fix false-positive "uninitialized variable" warning
media: smsusb: better handle optional alignment
scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
Btrfs: fix race updating log root item during fsync
powerpc/perf: Fix MMCRA corruption by bhrb_filter
ALSA: hda/realtek - Set default power save node to 0
drm/nouveau/i2c: Disable i2c bus access after ->fini()
tty: serial: msm_serial: Fix XON/XOFF
tty: max310x: Fix external crystal register setup
memcg: make it work on sparse non-0-node systems
kernel/signal.c: trace_signal_deliver when signal_group_exit
docs: Fix conf.py for Sphinx 2.0
staging: vc04_services: prevent integer overflow in create_pagelist()
CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
gcc-plugins: Fix build failures under Darwin host
drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set
brcmfmac: add length checks in scheduled scan result handler
brcmfmac: assure SSID length from firmware is limited
brcmfmac: add subtype check for event handling in data path
binder: Replace "%p" with "%pK" for stable
binder: replace "%p" with "%pK"
fs: prevent page refcount overflow in pipe_buf_get
mm, gup: remove broken VM_BUG_ON_PAGE compound check for hugepages
mm, gup: ensure real head page is ref-counted when using hugepages
mm: prevent get_user_pages() from overflowing page refcount
mm: make page ref count overflow check tighter and more explicit
Revert "x86/build: Move _etext to actual end of .text"
efi/libstub: Unify command line param parsing
media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
ethtool: fix potential userspace buffer overflow
neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit
net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
net: rds: fix memory leak in rds_ib_flush_mr_pool
pktgen: do not sleep with the thread lock held.
ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
ipv6: use READ_ONCE() for inet->hdrincl as in ipv4
Revert "fib_rules: fix error in backport of e9919a24d302 ("fib_rules: return 0...")"
Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"
rcu: locking and unlocking need to always be at least barriers
parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
fuse: fallocate: fix return with locked inode
x86/power: Fix 'nosmt' vs hibernation triple fault during resume
MIPS: pistachio: Build uImage.gz by default
Revert "MIPS: perf: ath79: Fix perfcount IRQ assignment"
genwqe: Prevent an integer overflow in the ioctl
drm/gma500/cdv: Check vbt config bits when detecting lvds panels
drm/radeon: prefer lower reference dividers
drm/i915: Fix I915_EXEC_RING_MASK
TTY: serial_core, add ->install
fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
fuse: Add FOPEN_STREAM to use stream_open()
ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabled
ethtool: check the return value of get_regs_len
Linux 4.9.181
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
9c829b6e3f |
fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
commit 10dce8af34226d90fa56746a934f8da5dcdba3df upstream. Commit |
||
|
|
5bdc536ce6 |
x86/power: Fix 'nosmt' vs hibernation triple fault during resume
commit ec527c318036a65a083ef68d8ba95789d2212246 upstream.
As explained in
0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
we always, no matter what, have to bring up x86 HT siblings during boot at
least once in order to avoid first MCE bringing the system to its knees.
That means that whenever 'nosmt' is supplied on the kernel command-line,
all the HT siblings are as a result sitting in mwait or cpudile after
going through the online-offline cycle at least once.
This causes a serious issue though when a kernel, which saw 'nosmt' on its
commandline, is going to perform resume from hibernation: if the resume
from the hibernated image is successful, cr3 is flipped in order to point
to the address space of the kernel that is being resumed, which in turn
means that all the HT siblings are all of a sudden mwaiting on address
which is no longer valid.
That results in triple fault shortly after cr3 is switched, and machine
reboots.
Fix this by always waking up all the SMT siblings before initiating the
'restore from hibernation' process; this guarantees that all the HT
siblings will be properly carried over to the resumed kernel waiting in
resume_play_dead(), and acted upon accordingly afterwards, based on the
target kernel configuration.
Symmetricaly, the resumed kernel has to push the SMT siblings to mwait
again in case it has SMT disabled; this means it has to online all
the siblings when resuming (so that they come out of hlt) and offline
them again to let them reach mwait.
Cc: 4.19+ <stable@vger.kernel.org> # v4.19+
Debugged-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
0b1a236081 |
rcu: locking and unlocking need to always be at least barriers
commit 66be4e66a7f422128748e3c3ef6ee72b20a6197b upstream. Herbert Xu pointed out that commit |
||
|
|
8fca3c3646 |
efi/libstub: Unify command line param parsing
commit 60f38de7a8d4e816100ceafd1b382df52527bd50 upstream. Merge the parsing of the command line carried out in arm-stub.c with the handling in efi_parse_options(). Note that this also fixes the missing handling of CONFIG_CMDLINE_FORCE=y, in which case the builtin command line should supersede the one passed by the firmware. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bhe@redhat.com Cc: bhsharma@redhat.com Cc: bp@alien8.de Cc: eugene@hp.com Cc: evgeny.kalugin@intel.com Cc: jhugo@codeaurora.org Cc: leif.lindholm@linaro.org Cc: linux-efi@vger.kernel.org Cc: mark.rutland@arm.com Cc: roy.franz@cavium.com Cc: rruigrok@codeaurora.org Link: http://lkml.kernel.org/r/20170404160910.28115-1-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> [ardb: fix up merge conflicts with 4.9.180] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
96019c6914 |
mm: make page ref count overflow check tighter and more explicit
commit f958d7b528b1b40c44cfda5eabe2d82760d868c3 upstream. We have a VM_BUG_ON() to check that the page reference count doesn't underflow (or get close to overflow) by checking the sign of the count. That's all fine, but we actually want to allow people to use a "get page ref unless it's already very high" helper function, and we want that one to use the sign of the page ref (without triggering this VM_BUG_ON). Change the VM_BUG_ON to only check for small underflows (or _very_ close to overflowing), and ignore overflows which have strayed into negative territory. Acked-by: Matthew Wilcox <willy@infradead.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
9557090582 |
fs: prevent page refcount overflow in pipe_buf_get
commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb upstream. Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
0b45276928 |
memcg: make it work on sparse non-0-node systems
commit 3e8589963773a5c23e2f1fe4bcad0e9a90b7f471 upstream.
We have a single node system with node 0 disabled:
Scanning NUMA topology in Northbridge 24
Number of physical nodes 2
Skipping disabled node 0
Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]
This causes crashes in memcg when system boots:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
#PF error: [normal kernel read fault]
...
RIP: 0010:list_lru_add+0x94/0x170
...
Call Trace:
d_lru_add+0x44/0x50
dput.part.34+0xfc/0x110
__fput+0x108/0x230
task_work_run+0x9f/0xc0
exit_to_usermode_loop+0xf5/0x100
It is reproducible as far as 4.12. I did not try older kernels. You have
to have a new enough systemd, e.g. 241 (the reason is unknown -- was not
investigated). Cannot be reproduced with systemd 234.
The system crashes because the size of lru array is never updated in
memcg_update_all_list_lrus and the reads are past the zero-sized array,
causing dereferences of random memory.
The root cause are list_lru_memcg_aware checks in the list_lru code. The
test in list_lru_memcg_aware is broken: it assumes node 0 is always
present, but it is not true on some systems as can be seen above.
So fix this by avoiding checks on node 0. Remember the memcg-awareness by
a bool flag in struct list_lru.
Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz
Fixes:
|
||
|
|
e186b19bc3 |
include/linux/bitops.h: sanitize rotate primitives
commit ef4d6f6b275c498f8e5626c99dbeefdc5027f843 upstream.
The ror32 implementation (word >> shift) | (word << (32 - shift) has
undefined behaviour if shift is outside the [1, 31] range. Similarly
for the 64 bit variants. Most callers pass a compile-time constant
(naturally in that range), but there's an UBSAN report that these may
actually be called with a shift count of 0.
Instead of special-casing that, we can make them DTRT for all values of
shift while also avoiding UB. For some reason, this was already partly
done for rol32 (which was well-defined for [0, 31]). gcc 8 recognizes
these patterns as rotates, so for example
__u32 rol32(__u32 word, unsigned int shift)
{
return (word << (shift & 31)) | (word >> ((-shift) & 31));
}
compiles to
0000000000000020 <rol32>:
20: 89 f8 mov %edi,%eax
22: 89 f1 mov %esi,%ecx
24: d3 c0 rol %cl,%eax
26: c3 retq
Older compilers unfortunately do not do as well, but this only affects
the small minority of users that don't pass constants.
Due to integer promotions, ro[lr]8 were already well-defined for shifts
in [0, 8], and ro[lr]16 were mostly well-defined for shifts in [0, 16]
(only mostly - u16 gets promoted to _signed_ int, so if bit 15 is set,
word << 16 is undefined). For consistency, update those as well.
Link: http://lkml.kernel.org/r/20190410211906.2190-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Vadim Pasternak <vadimp@mellanox.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
3f006d43bd |
Merge 4.9.180 into android-4.9-q
Changes in 4.9.180 ext4: do not delete unlinked inode from orphan list on failed truncate KVM: x86: fix return value for reserved EFER bio: fix improper use of smp_mb__before_atomic() Revert "scsi: sd: Keep disk read-only when re-reading partition" crypto: vmx - CTR: always increment IV as quadword kvm: svm/avic: fix off-by-one in checking host APIC ID libnvdimm/namespace: Fix label tracking error arm64: Save and restore OSDLR_EL1 across suspend/resume gfs2: Fix sign extension bug in gfs2_update_stats Btrfs: do not abort transaction at btrfs_update_root() after failure to COW path Btrfs: fix race between ranged fsync and writeback of adjacent ranges btrfs: sysfs: don't leak memory when failing add fsid fbdev: fix divide error in fb_var_to_videomode hugetlb: use same fault hash key for shared and private mappings fbdev: fix WARNING in __alloc_pages_nodemask bug media: cpia2: Fix use-after-free in cpia2_exit media: vivid: use vfree() instead of kfree() for dev->bitmap_cap ssb: Fix possible NULL pointer dereference in ssb_host_pcmcia_exit at76c50x-usb: Don't register led_trigger if usb_register_driver failed perf tools: No need to include bitops.h in util.h tools include: Adopt linux/bits.h Revert "btrfs: Honour FITRIM range constraints during free space trim" gfs2: Fix lru_count going negative cxgb4: Fix error path in cxgb4_init_module mmc: core: Verify SD bus width dmaengine: tegra210-dma: free dma controller in remove() net: ena: gcc 8: fix compilation warning ASoC: hdmi-codec: unlock the device on startup errors powerpc/boot: Fix missing check of lseek() return value ASoC: imx: fix fiq dependencies spi: pxa2xx: fix SCR (divisor) calculation brcm80211: potential NULL dereference in brcmf_cfg80211_vndr_cmds_dcmd_handler() ARM: vdso: Remove dependency with the arch_timer driver internals arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable sched/cpufreq: Fix kobject memleak scsi: qla2xxx: Fix a qla24xx_enable_msix() error path iwlwifi: pcie: don't crash on invalid RX interrupt rtc: 88pm860x: prevent use-after-free on device remove w1: fix the resume command API dmaengine: pl330: _stop: clear interrupt status mac80211/cfg80211: update bss channel on channel switch ASoC: fsl_sai: Update is_slave_mode with correct value mwifiex: prevent an array overflow net: cw1200: fix a NULL pointer dereference crypto: sun4i-ss - Fix invalid calculation of hash end bcache: return error immediately in bch_journal_replay() bcache: fix failure in journal relplay bcache: add failure check to run_cache_set() for journal replay bcache: avoid clang -Wunintialized warning x86/build: Move _etext to actual end of .text smpboot: Place the __percpu annotation correctly x86/mm: Remove in_nmi() warning from 64-bit implementation of vmalloc_fault() mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions HID: logitech-hidpp: use RAP instead of FAP to get the protocol version pinctrl: pistachio: fix leaked of_node references dmaengine: at_xdmac: remove BUG_ON macro in tasklet media: coda: clear error return value before picture run media: ov6650: Move v4l2_clk_get() to ov6650_video_probe() helper media: au0828: stop video streaming only when last user stops media: ov2659: make S_FMT succeed even if requested format doesn't match audit: fix a memory leak bug media: au0828: Fix NULL pointer dereference in au0828_analog_stream_enable() media: pvrusb2: Prevent a buffer overflow powerpc/numa: improve control of topology updates sched/core: Check quota and period overflow at usec to nsec conversion sched/core: Handle overflow in cpu_shares_write_u64 USB: core: Don't unbind interfaces following device reset failure x86/irq/64: Limit IST stack overflow check to #DB stack i40e: don't allow changes to HW VLAN stripping on active port VLANs arm64: vdso: Fix clock_getres() for CLOCK_REALTIME RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure hwmon: (vt1211) Use request_muxed_region for Super-IO accesses hwmon: (smsc47m1) Use request_muxed_region for Super-IO accesses hwmon: (smsc47b397) Use request_muxed_region for Super-IO accesses hwmon: (pc87427) Use request_muxed_region for Super-IO accesses hwmon: (f71805f) Use request_muxed_region for Super-IO accesses scsi: libsas: Do discovery on empty PHY to update PHY info mmc: core: make pwrseq_emmc (partially) support sleepy GPIO controllers mmc_spi: add a status check for spi_sync_locked mmc: sdhci-of-esdhc: add erratum eSDHC5 support mmc: sdhci-of-esdhc: add erratum eSDHC-A001 and A-008358 support PM / core: Propagate dev->power.wakeup_path when no callbacks extcon: arizona: Disable mic detect if running when driver is removed s390: cio: fix cio_irb declaration cpufreq: ppc_cbe: fix possible object reference leak cpufreq/pasemi: fix possible object reference leak cpufreq: pmac32: fix possible object reference leak x86/build: Keep local relocations with ld.lld iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion iio: hmc5843: fix potential NULL pointer dereferences iio: common: ssp_sensors: Initialize calculated_time in ssp_common_process_data rtlwifi: fix a potential NULL pointer dereference mwifiex: Fix mem leak in mwifiex_tm_cmd brcmfmac: fix missing checks for kmemdup b43: shut up clang -Wuninitialized variable warning brcmfmac: convert dev_init_lock mutex to completion brcmfmac: fix race during disconnect when USB completion is in progress brcmfmac: fix Oops when bringing up interface during USB disconnect scsi: ufs: Fix regulator load and icc-level configuration scsi: ufs: Avoid configuring regulator with undefined voltage range arm64: cpu_ops: fix a leaked reference by adding missing of_node_put x86/uaccess, signal: Fix AC=1 bloat x86/ia32: Fix ia32_restore_sigcontext() AC leak chardev: add additional check for minor range overlap HID: core: move Usage Page concatenation to Main item ASoC: eukrea-tlv320: fix a leaked reference by adding missing of_node_put ASoC: fsl_utils: fix a leaked reference by adding missing of_node_put cxgb3/l2t: Fix undefined behaviour spi: tegra114: reset controller on probe media: wl128x: prevent two potential buffer overflows virtio_console: initialize vtermno value for ports tty: ipwireless: fix missing checks for ioremap x86/mce: Fix machine_check_poll() tests for error types rcutorture: Fix cleanup path for invalid torture_type strings rcuperf: Fix cleanup path for invalid perf_type strings usb: core: Add PM runtime calls to usb_hcd_platform_shutdown scsi: qla4xxx: avoid freeing unallocated dma memory dmaengine: tegra210-adma: use devm_clk_*() helpers media: m88ds3103: serialize reset messages in m88ds3103_set_frontend media: go7007: avoid clang frame overflow warning with KASAN scsi: lpfc: Fix FDMI manufacturer attribute value media: saa7146: avoid high stack usage with clang scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices spi : spi-topcliff-pch: Fix to handle empty DMA buffers spi: rspi: Fix sequencer reset during initialization spi: Fix zero length xfer bug ASoC: davinci-mcasp: Fix clang warning without CONFIG_PM drm: Wake up next in drm_read() chain if we are forced to putback the event Linux 4.9.180 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
1410277e19 |
HID: core: move Usage Page concatenation to Main item
[ Upstream commit 58e75155009cc800005629955d3482f36a1e0eec ] As seen on some USB wireless keyboards manufactured by Primax, the HID parser was using some assumptions that are not always true. In this case it's s the fact that, inside the scope of a main item, an Usage Page will always precede an Usage. The spec is not pretty clear as 6.2.2.7 states "Any usage that follows is interpreted as a Usage ID and concatenated with the Usage Page". While 6.2.2.8 states "When the parser encounters a main item it concatenates the last declared Usage Page with a Usage to form a complete usage value." Being somewhat contradictory it was decided to match Window's implementation, which follows 6.2.2.8. In summary, the patch moves the Usage Page concatenation from the local item parsing function to the main item parsing function. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Terry Junge <terry.junge@poly.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
0ce6473c39 |
iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion
[ Upstream commit df1d80aee963480c5c2938c64ec0ac3e4a0df2e0 ] For devices from the SigmaDelta family we need to keep CS low when doing a conversion, since the device will use the MISO line as a interrupt to indicate that the conversion is complete. This is why the driver locks the SPI bus and when the SPI bus is locked keeps as long as a conversion is going on. The current implementation gets one small detail wrong though. CS is only de-asserted after the SPI bus is unlocked. This means it is possible for a different SPI device on the same bus to send a message which would be wrongfully be addressed to the SigmaDelta device as well. Make sure that the last SPI transfer that is done while holding the SPI bus lock de-asserts the CS signal. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Alexandru Ardelean <Alexandru.Ardelean@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
a5f3559f30 |
smpboot: Place the __percpu annotation correctly
[ Upstream commit d4645d30b50d1691c26ff0f8fa4e718b08f8d3bb ]
The test robot reported a wrong assignment of a per-CPU variable which
it detected by using sparse and sent a report. The assignment itself is
correct. The annotation for sparse was wrong and hence the report.
The first pointer is a "normal" pointer and points to the per-CPU memory
area. That means that the __percpu annotation has to be moved.
Move the __percpu annotation to pointer which points to the per-CPU
area. This change affects only the sparse tool (and is ignored by the
compiler).
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes:
|
||
|
|
f0539c7018 |
hugetlb: use same fault hash key for shared and private mappings
commit 1b426bac66e6cc83c9f2d92b96e4e72acf43419a upstream. hugetlb uses a fault mutex hash table to prevent page faults of the same pages concurrently. The key for shared and private mappings is different. Shared keys off address_space and file index. Private keys off mm and virtual address. Consider a private mappings of a populated hugetlbfs file. A fault will map the page from the file and if needed do a COW to map a writable page. Hugetlbfs hole punch uses the fault mutex to prevent mappings of file pages. It uses the address_space file index key. However, private mappings will use a different key and could race with this code to map the file page. This causes problems (BUG) for the page cache remove code as it expects the page to be unmapped. A sample stack is: page dumped because: VM_BUG_ON_PAGE(page_mapped(page)) kernel BUG at mm/filemap.c:169! ... RIP: 0010:unaccount_page_cache_page+0x1b8/0x200 ... Call Trace: __delete_from_page_cache+0x39/0x220 delete_from_page_cache+0x45/0x70 remove_inode_hugepages+0x13c/0x380 ? __add_to_page_cache_locked+0x162/0x380 hugetlbfs_fallocate+0x403/0x540 ? _cond_resched+0x15/0x30 ? __inode_security_revalidate+0x5d/0x70 ? selinux_file_permission+0x100/0x130 vfs_fallocate+0x13f/0x270 ksys_fallocate+0x3c/0x80 __x64_sys_fallocate+0x1a/0x20 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 There seems to be another potential COW issue/race with this approach of different private and shared keys as noted in commit |