kernel_realme_sm7125

Evolution-X-Devices/kernel_realme_sm7125

Author	SHA1	Message	Date
Paul E. McKenney	6c1da3d301	BACKPORT: rcu: Defer reporting RCU-preempt quiescent states when disabled This commit defers reporting of RCU-preempt quiescent states at rcu_read_unlock_special() time when any of interrupts, softirq, or preemption are disabled. These deferred quiescent states are reported at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug offline operation. Of course, if another RCU read-side critical section has started in the meantime, the reporting of the quiescent state will be further deferred. This also means that disabling preemption, interrupts, and/or softirqs will act as an RCU-preempt read-side critical section. This is enforced by checking preempt_count() as needed. Some special cases must be handled on an ad-hoc basis, for example, context switch is a quiescent state even though both the scheduler and do_exit() disable preemption. In these cases, additional calls to rcu_preempt_deferred_qs() override the preemption disabling. Similar logic overrides disabled interrupts in rcu_preempt_check_callbacks() because in this case the quiescent state happened just before the corresponding scheduling-clock interrupt. In theory, this change lifts a long-standing restriction that required that if interrupts were disabled across a call to rcu_read_unlock() that the matching rcu_read_lock() also be contained within that interrupts-disabled region of code. Because the reporting of the corresponding RCU-preempt quiescent state is now deferred until after interrupts have been enabled, it is no longer possible for this situation to result in deadlocks involving the scheduler's runqueue and priority-inheritance locks. This may allow some code simplification that might reduce interrupt latency a bit. Unfortunately, in practice this would also defer deboosting a low-priority task that had been subjected to RCU priority boosting, so real-time-response considerations might well force this restriction to remain in place. Because RCU-preempt grace periods are now blocked not only by RCU read-side critical sections, but also by disabling of interrupts, preemption, and softirqs, it will be possible to eliminate RCU-bh and RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels. This may require some additional plumbing to provide the network denial-of-service guarantees that have been traditionally provided by RCU-bh. Once these are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh into RCU-sched. This would mean that all kernels would have but one flavor of RCU, which would open the door to significant code cleanup. Moving to a single flavor of RCU would also have the beneficial effect of reducing the NOCB kthreads by at least a factor of two. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback from Joel Fernandes. ] [ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in response to bug reports from kbuild test robot. ] [ paulmck: Fix bug located by kbuild test robot involving recursion via rcu_preempt_deferred_qs(). ] Change-Id: If76ec913be9db64e7d1e70f408407229f5291af2	2025-12-28 18:39:28 +05:30
Andrii Nakryiko	b5563a7445	UPSTREAM: btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux Expose kernel's BTF under the name vmlinux to be more uniform with using kernel module names as file names in the future. Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs") Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Change-Id: I0c478e34bd67044a37b166b60c1de07025c3ec90 Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2025-12-28 18:38:12 +05:30
Andrii Nakryiko	7b56e8ffcc	BACKPORT: btf: expose BTF info through sysfs Make .BTF section allocated and expose its contents through sysfs. /sys/kernel/btf directory is created to contain all the BTFs present inside kernel. Currently there is only kernel's main BTF, represented as /sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported, each module will expose its BTF as /sys/kernel/btf/<module-name> file. Current approach relies on a few pieces coming together: 1. pahole is used to take almost final vmlinux image (modulo .BTF and kallsyms) and generate .BTF section by converting DWARF info into BTF. This section is not allocated and not mapped to any segment, though, so is not yet accessible from inside kernel at runtime. 2. objcopy dumps .BTF contents into binary file and subsequently convert binary file into linkable object file with automatically generated symbols _binary__btf_kernel_bin_start and _binary__btf_kernel_bin_end, pointing to start and end, respectively, of BTF raw data. 3. final vmlinux image is generated by linking this object file (and kallsyms, if necessary). sysfs_btf.c then creates /sys/kernel/btf/kernel file and exposes embedded BTF contents through it. This allows, e.g., libbpf and bpftool access BTF info at well-known location, without resorting to searching for vmlinux image on disk (location of which is not standardized and vmlinux image might not be even available in some scenarios, e.g., inside qemu during testing). Alternative approach using .incbin assembler directive to embed BTF contents directly was attempted but didn't work, because sysfs_proc.o is not re-compiled during link-vmlinux.sh stage. This is required, though, to update embedded BTF data (initially empty data is embedded, then pahole generates BTF info and we need to regenerate sysfs_btf.o with updated contents, but it's too late at that point). If BTF couldn't be generated due to missing or too old pahole, sysfs_btf.c handles that gracefully by detecting that _binary__btf_kernel_bin_start (weak symbol) is 0 and not creating /sys/kernel/btf at all. v2->v3: - added Documentation/ABI/testing/sysfs-kernel-btf (Greg K-H); - created proper kobject (btf_kobj) for btf directory (Greg K-H); - undo v2 change of reusing vmlinux, as it causes extra kallsyms pass due to initially missing __binary__btf_kernel_bin_{start/end} symbols; v1->v2: - allow kallsyms stage to re-use vmlinux generated by gen_btf(); Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: Ife57356dd14b9ae0d2e3801eea7c602c6245952f Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2025-12-28 18:38:12 +05:30
Björn Töpel	8f0a19547d	UPSTREAM: xsk: new descriptor addressing scheme Currently, AF_XDP only supports a fixed frame-size memory scheme where each frame is referenced via an index (idx). A user passes the frame index to the kernel, and the kernel acts upon the data. Some NICs, however, do not have a fixed frame-size model, instead they have a model where a memory window is passed to the hardware and multiple frames are filled into that window (referred to as the "type-writer" model). By changing the descriptor format from the current frame index addressing scheme, AF_XDP can in the future be extended to support these kinds of NICs. In the index-based model, an idx refers to a frame of size frame_size. Addressing a frame in the UMEM is done by offseting the UMEM starting address by a global offset, idx * frame_size + offset. Communicating via the fill- and completion-rings are done by means of idx. In this commit, the idx is removed in favor of an address (addr), which is a relative address ranging over the UMEM. To convert an idx-based address to the new addr is simply: addr = idx * frame_size + offset. We also stop referring to the UMEM "frame" as a frame. Instead it is simply called a chunk. To transfer ownership of a chunk to the kernel, the addr of the chunk is passed in the fill-ring. Note, that the kernel will mask addr to make it chunk aligned, so there is no need for userspace to do that. E.g., for a chunk size of 2k, passing an addr of 2048, 2050 or 3000 to the fill-ring will refer to the same chunk. On the completion-ring, the addr will match that of the Tx descriptor, passed to the kernel. Changing the descriptor format to use chunks/addr will allow for future changes to move to a type-writer based model, where multiple frames can reside in one chunk. In this model passing one single chunk into the fill-ring, would potentially result in multiple Rx descriptors. This commit changes the uapi of AF_XDP sockets, and updates the documentation. Change-Id: I375aff25e625bddb164b6047cd60c4deb35ba557 Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2025-12-28 16:56:37 +05:30
Magnus Karlsson	c786206f01	BACKPORT: samples/bpf: sample application and documentation for AF_XDP sockets This is a sample application for AF_XDP sockets. The application supports three different modes of operation: rxdrop, txonly and l2fwd. To show-case a simple round-robin load-balancing between a set of sockets in an xskmap, set the RR_LB compile time define option to 1 in "xdpsock.h". v2: The entries variable was calculated twice in {umem,xq}_nb_avail. Co-authored-by: Björn Töpel <bjorn.topel@intel.com> Change-Id: I39c4d5fb7c5d4448cc44206520c3d8aa3c13da4e Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-28 16:56:36 +05:30
Al Viro	953ada34db	BACKPORT: new inode method: ->free_inode() A lot of ->destroy_inode() instances end with call_rcu() of a callback that does RCU-delayed part of freeing. Introduce a new method for doing just that, with saner signature. Rules: ->destroy_inode ->free_inode f g immediate call of f(), RCU-delayed call of g() f NULL immediate call of f(), no RCU-delayed calls NULL g RCU-delayed call of g() NULL NULL RCU-delayed default freeing IOW, NULL ->free_inode gives the same behaviour as now. Note that NULL, NULL is equivalent to NULL, free_inode_nonrcu; we could mandate the latter form, but that would have very little benefit beyond making rules a bit more symmetric. It would break backwards compatibility, require extra boilerplate and expected semantics for (NULL, NULL) pair would have no use whatsoever... Change-Id: I828dd519b381a1f25013ec3d1eae0b6835eb9c26 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-12-28 16:54:57 +05:30
Jiong Wang	b9da358a11	UPSTREAM: bpf: allocate 0x06 to new eBPF instruction class JMP32 The new eBPF instruction class JMP32 uses the reserved class number 0x6. Kernel BPF ISA documentation updated accordingly. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Change-Id: Idfa9c7ef537b67b06ffea33db60785f754f9bfea Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-28 16:52:28 +05:30
Sergey Senozhatsky	896dba6e26	BACKPORT: symbol lookup: introduce dereference_symbol_descriptor() dereference_symbol_descriptor() invokes appropriate ARCH specific function descriptor dereference callbacks: - dereference_kernel_function_descriptor() if the pointer is a kernel symbol; - dereference_module_function_descriptor() if the pointer is a module symbol. This is the last step needed to make '%pS/%ps' smart enough to handle function descriptor dereference on affected ARCHs and to retire '%pF/%pf'. To refresh it: Some architectures (ia64, ppc64, parisc64) use an indirect pointer for C function pointers - the function pointer points to a function descriptor and we need to dereference it to get the actual function pointer. Function descriptors live in .opd elf section and all affected ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and modules. So we, technically, can decide if the dereference is needed by simply looking at the pointer: if it belongs to .opd section then we need to dereference it. The kernel and modules have their own .opd sections, obviously, that's why we need to split dereference_function_descriptor() and use separate kernel and module dereference arch callbacks. Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: James Bottomley <jejb@parisc-linux.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-ia64@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Change-Id: I102222c3b164340bd683e29fe61cc963eb5106eb Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Tony Luck <tony.luck@intel.com> #ia64 Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc Tested-by: Helge Deller <deller@gmx.de> #parisc64 Signed-off-by: Petr Mladek <pmladek@suse.com>	2025-12-28 16:46:57 +05:30
Mauro Carvalho Chehab	af04ce618c	BACKPORT: MAINTAINERS & files: Canonize the e-mails I use at files From now on, I'll start using my @kernel.org as my development e-mail. As such, let's remove the entries that point to the old mchehab@s-opensource.com at MAINTAINERS file. For the files written with a copyright with mchehab@s-opensource, let's keep Samsung on their names, using mchehab+samsung@kernel.org, in order to keep pointing to my employer, with sponsors the work. For the files written before I join Samsung (on July, 4 2013), let's just use mchehab@kernel.org. For bug reports, we can simply point to just kernel.org, as this will reach my mchehab+samsung inbox anyway. [Linux4: Only keep RC related bits] Change-Id: I336d95e919b1566af32b7dac66e2da301503d0ea Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Brian Warner <brian.warner@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2025-12-28 16:39:05 +05:30
Sean Young	f9c49a0e59	UPSTREAM: media: rc: add ioctl to get the current timeout Since the kernel now modifies the timeout, make it possible to retrieve the current value. Change-Id: If925b7d4bc77cce2b1d8200251aac1db124b4d9c Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:39:00 +05:30
Sean Young	ab907c3992	UPSTREAM: media: rc: report receiver and transmitter type on device register On the raspberry pi, we might have two lirc devices; one for sending and one for receiving. This change makes it much more apparent which one is which. Change-Id: I8c8a566886f184575b47cec5ec4e2d91dc3c7dff Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:38:59 +05:30
Sean Young	3805f131cf	UPSTREAM: media: rc: no need to announce major number Since commit a60d64b15c20 ("media: lirc: lirc interface should not be a raw decoder"), the message in the documentation is incorrect as the module name is rc_core, not lirc_dev. Since the message is not useful, just make the message debug and remove it from the documentation. Change-Id: I945a6a2f99e27e8a62e62a3e8f5547b73de93c0a Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:38:50 +05:30
Sean Young	d69c495223	UPSTREAM: media: lirc: lirc daemon fails to detect raw IR device Since commit 9b6192589be7 ("media: lirc: implement scancode sending"), and commit de142c324106 ("media: lirc: implement reading scancode") the lirc features ioctl for raw IR devices advertises two modes for sending and receiving. The lirc daemon now fails to detect a raw IR device, both for transmit and receive. To fix this, do not advertise the scancode mode in the lirc features for raw IR devices (however do keep it for scancode devices). The mode can still be used via the LIRC_SET_{REC,SEND}_MODE ioctl. Change-Id: Ibdfe2d4ef9f561380a0e27e705df8cd79faa81bc Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:38:00 +05:30
Sean Young	5d5217e88e	UPSTREAM: media: lirc: when transmitting scancodes, block until transmit is done The semantics for lirc IR transmit with raw IR is that the write call should block until the IR is transmitted. Some drivers have no idea when this actually is (e.g. mceusb), so there is a wait. This is useful for userspace, as it might want to send a IR button press, a gap of a predefined number of milliseconds, and then send a repeat message. It turns out that for transmitting scancodes this feature is even more useful, as user space has no idea how long the IR is. So, maintain the existing semantics for IR scancode transmit. Change-Id: Ib8070358522cacc6afcd1516757599f2550a1bbb Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:37:55 +05:30
Mauro Carvalho Chehab	895d48ec23	UPSTREAM: media: RC docs: add enum rc_proto description at the docs This is part of the uAPI. Add it to the documentation again, and fix cross-references. Change-Id: I47aee8a3c0d080db6245b03c12299f0e191123ac Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Acked-by: Sean Young <sean@mess.org>	2025-12-28 16:37:52 +05:30
Sean Young	0a8b983fe3	UPSTREAM: media: lirc: document LIRC_MODE_SCANCODE Lirc supports a new mode which requires documentation. Change-Id: Icc626fba59fdfc2adb7d544aa77ad91d7314d2e6 Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:37:17 +05:30
Sean Young	ca725fe430	UPSTREAM: media: lirc: remove last remnants of lirc kapi rc-core has replaced the lirc kapi many years ago, and now with the last driver ported to rc-core, we can finally remove it. Note this has no effect on userspace. All future IR drivers should use the rc-core api. Change-Id: Ie00c3ecbc293b3420ba8112768ef5a706b840af3 Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:36:35 +05:30
Sean Young	705627b044	UPSTREAM: media: lirc: remove name from lirc_dev This is a duplicate of rcdev->driver_name. Change-Id: I185f4dc3df62e03e2a759b34b29d5e840a33be68 Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:36:34 +05:30
Sean Young	468895f12e	UPSTREAM: media: lirc: remove LIRCCODE and LIRC_GET_LENGTH LIRCCODE is a lirc mode where a driver produces driver-dependent codes for receive and transmit. No driver uses this any more. The LIRC_GET_LENGTH ioctl was used for this mode only. Change-Id: Ib3e2deb561c9e3fbd499e14a82beb0799da457a4 Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2025-12-28 16:36:29 +05:30
David Ahern	2451b0c17b	UPSTREAM: net/ipv6: Add support for path selection using hash of 5-tuple Some operators prefer IPv6 path selection to use a standard 5-tuple hash rather than just an L3 hash with the flow the label. To that end add support to IPv6 for multipath hash policy similar to `bf4e0a3db9` ("net: ipv4: add support for ECMP hash policy choice"). The default is still L3 which covers source and destination addresses along with flow label and IPv6 protocol. Change-Id: I7a664a2fbec460203df6aa467f2c478763235146 Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-12-28 16:28:15 +05:30
Tom Herbert	b47cfaee91	UPSTREAM: ipv6: Implement limits on Hop-by-Hop and Destination options RFC 8200 (IPv6) defines Hop-by-Hop options and Destination options extension headers. Both of these carry a list of TLVs which is only limited by the maximum length of the extension header (2048 bytes). By the spec a host must process all the TLVs in these options, however these could be used as a fairly obvious denial of service attack. I think this could in fact be a significant DOS vector on the Internet, one mitigating factor might be that many FWs drop all packets with EH (and obviously this is only IPv6) so an Internet wide attack might not be so effective (yet!). By my calculation, the worse case packet with TLVs in a standard 1500 byte MTU packet that would be processed by the stack contains 1282 invidual TLVs (including pad TLVS) or 724 two byte TLVs. I wrote a quick test program that floods a whole bunch of these packets to a host and sure enough there is substantial time spent in ip6_parse_tlv. These packets contain nothing but unknown TLVS (that are ignored), TLV padding, and bogus UDP header with zero payload length. 25.38% [kernel] [k] __fib6_clean_all 21.63% [kernel] [k] ip6_parse_tlv 4.21% [kernel] [k] __local_bh_enable_ip 2.18% [kernel] [k] ip6_pol_route.isra.39 1.98% [kernel] [k] fib6_walk_continue 1.88% [kernel] [k] _raw_write_lock_bh 1.65% [kernel] [k] dst_release This patch adds configurable limits to Destination and Hop-by-Hop options. There are three limits that may be set: - Limit the number of options in a Hop-by-Hop or Destination options extension header. - Limit the byte length of a Hop-by-Hop or Destination options extension header. - Disallow unrecognized options in a Hop-by-Hop or Destination options extension header. The limits are set in corresponding sysctls: ipv6.sysctl.max_dst_opts_cnt ipv6.sysctl.max_hbh_opts_cnt ipv6.sysctl.max_dst_opts_len ipv6.sysctl.max_hbh_opts_len If a max_*_opts_cnt is less than zero then unknown TLVs are disallowed. The number of known TLVs that are allowed is the absolute value of this number. If a limit is exceeded when processing an extension header the packet is dropped. Default values are set to 8 for options counts, and set to INT_MAX for maximum length. Note the choice to limit options to 8 is an arbitrary guess (roughly based on the fact that the stack supports three HBH options and just one destination option). These limits have being proposed in draft-ietf-6man-rfc6434-bis. Tested (by Martin Lau) I tested out 1 thread (i.e. one raw_udp process). I changed the net.ipv6.max_dst_(opts\|hbh)_number between 8 to 2048. With sysctls setting to 2048, the softirq% is packed to 100%. With 8, the softirq% is almost unnoticable from mpstat. v2; - Code and documention cleanup. - Change references of RFC2460 to be RFC8200. - Add reference to RFC6434-bis where the limits will be in standard. Change-Id: I0642f24d68966d73b9247973faf3473f4944b3bd Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-12-28 16:28:14 +05:30
Yuchung Cheng	58fd8d288b	BACKPORT: tcp: pause Fast Open globally after third consecutive timeout Prior to this patch, active Fast Open is paused on a specific destination IP address if the previous connections to the IP address have experienced recurring timeouts . But recent experiments by Microsoft (https://goo.gl/cykmn7) and Mozilla browsers indicate the isssue is often caused by broken middle-boxes sitting close to the client. Therefore it is much better user experience if Fast Open is disabled out-right globally to avoid experiencing further timeouts on connections toward other destinations. This patch changes the destination-IP disablement to global disablement if a connection experiencing recurring timeouts or aborts due to timeout. Repeated incidents would still exponentially increase the pause time, starting from an hour. This is extremely conservative but an unfortunate compromise to minimize bad experience due to broken middle-boxes. Reported-by: Dragana Damjanovic <ddamjanovic@mozilla.com> Reported-by: Patrick McManus <mcmanus@ducksong.com> Change-Id: Ia2698f86b82005d73a0744356d9787395ffb8376 Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Wei Wang <weiwan@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-12-28 16:12:13 +05:30
Shannon Nelson	dd68b265e3	xfrm: add documentation for xfrm device offload api Add a writeup on how to use the XFRM device offload API, and mention this new file in the index. Change-Id: I2b148393010d6487c45c9d1ff555a89f22a8acca Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2025-12-28 15:58:15 +05:30
Axel Rasmussen	8786bbf619	BACKPORT: FROMGIT: userfaultfd/shmem: advertise shmem minor fault support Now that the feature is fully implemented (the faulting path hooks exist so userspace is notified, and the ioctl to resolve such faults is available), advertise this as a supported feature. Link: https://lkml.kernel.org/r/20210503180737.2487560-6-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Shaohua Li <shli@fb.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Wang Qing <wangqing@vivo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> (cherry picked from commit 37aa962fe33a562fd0ac21b68938023e12041fc3 https: //git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm) Link: https://lore.kernel.org/patchwork/patch/1420971/ Conflicts: Documentation/admin-guide/mm/userfaultfd.rst (Manual rebase) Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Bug: 187930641 Change-Id: I5fbab3783ff8671c0a5aa4826aead2d63f5cbbf3	2025-12-17 21:30:48 +05:30
Peter Collingbourne	0be0d0b493	userfaultfd: do not untag user pointers commit e71e2ace5721a8b921dca18b045069e7bb411277 upstream. Patch series "userfaultfd: do not untag user pointers", v5. If a user program uses userfaultfd on ranges of heap memory, it may end up passing a tagged pointer to the kernel in the range.start field of the UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable allocator, or on Android if using the Tagged Pointers feature for MTE readiness [1]. When a fault subsequently occurs, the tag is stripped from the fault address returned to the application in the fault.address field of struct uffd_msg. However, from the application's perspective, the tagged address is the memory address, so if the application is unaware of memory tags, it may get confused by receiving an address that is, from its point of view, outside of the bounds of the allocation. We observed this behavior in the kselftest for userfaultfd [2] but other applications could have the same problem. Address this by not untagging pointers passed to the userfaultfd ioctls. Instead, let the system call fail. Also change the kselftest to use mmap so that it doesn't encounter this problem. [1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c This patch (of 2): Do not untag pointers passed to the userfaultfd ioctls. Instead, let the system call fail. This will provide an early indication of problems with tag-unaware userspace code instead of letting the code get confused later, and is consistent with how we decided to handle brk/mmap/mremap in commit dcde237319e6 ("mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()"), as well as being consistent with the existing tagged address ABI documentation relating to how ioctl arguments are handled. The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag user pointers") plus some fixups to some additional calls to validate_range that have appeared since then. [1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI") Change-Id: I2e764cbb1867cd8018d177d449a05cf4d0d51576 Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alistair Delva <adelva@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mitch Phillips <mitchp@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will@kernel.org> Cc: William McVicker <willmcvicker@google.com> Cc: <stable@vger.kernel.org> [5.4] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-17 21:30:38 +05:30
Axel Rasmussen	fd78eec505	BACKPORT: FROMGIT: userfaultfd: update documentation to describe minor fault handling Reword / reorganize things a little bit into "lists", so new features / modes / ioctls can sort of just be appended. Describe how UFFDIO_REGISTER_MODE_MINOR and UFFDIO_CONTINUE can be used to intercept and resolve minor faults. Make it clear that COPY and ZEROPAGE are used for MISSING faults, whereas CONTINUE is used for MINOR faults. Link: https://lkml.kernel.org/r/20210301222728.176417-6-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Adam Ruprecht <ruprecht@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Cannon Matthews <cannonmatthews@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: David Rientjes <rientjes@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michal Koutn" <mkoutny@suse.com> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oliver Upton <oupton@google.com> Cc: Shaohua Li <shli@fb.com> Cc: Shawn Anastasio <shawn@anastas.io> Cc: Steven Price <steven.price@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> (cherry picked from commit d08ba026886f0161e2bdd3dbd75c4da0fc62a284 https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm) Link: https://lore.kernel.org/patchwork/patch/1388137/ Conflicts: Documentation/admin-guide/mm/userfaultfd.rst (Manual rebase by removing text related to write-protect feature) Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Bug: 160737021 Bug: 169683130 Change-Id: Ib59504247d38034e4c86f692dd63f2b3706fe554	2025-12-17 21:30:26 +05:30
Lokesh Gidra	52c69bd089	UPSTREAM: userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob With this change, when the knob is set to 0, it allows unprivileged users to call userfaultfd, like when it is set to 1, but with the restriction that page faults from only user-mode can be handled. In this mode, an unprivileged user (without SYS_CAP_PTRACE capability) must pass UFFD_USER_MODE_ONLY to userfaultd or the API will fail with EPERM. This enables administrators to reduce the likelihood that an attacker with access to userfaultfd can delay faulting kernel code to widen timing windows for other exploits. The default value of this knob is changed to 0. This is required for correct functioning of pipe mutex. However, this will fail postcopy live migration, which will be unnoticeable to the VM guests. To avoid this, set 'vm.userfault = 1' in /sys/sysctl.conf. The main reason this change is desirable as in the short term is that the Android userland will behave as with the sysctl set to zero. So without this commit, any Linux binary using userfaultfd to manage its memory would behave differently if run within the Android userland. For more details, refer to Andrea's reply [1]. [1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/ Link: https://lkml.kernel.org/r/20201120030411.2690816-3-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Xu <peterx@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Daniel Colascione <dancol@dancol.org> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: <calin@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Shaohua Li <shli@fb.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Nitin Gupta <nigupta@nvidia.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit d0d4730ac2e404a5b0da9a87ef38c73e51cb1664) Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Bug: 160737021 Bug: 169683130 Change-Id: Ic46c0be47d6394d25bd3443ff524936fa568ab85	2025-12-17 21:25:46 +05:30
Peter Xu	e734c76578	BACKPORT: userfaultfd/sysctl: add vm.unprivileged_userfaultfd Userfaultfd can be misued to make it easier to exploit existing use-after-free (and similar) bugs that might otherwise only make a short window or race condition available. By using userfaultfd to stall a kernel thread, a malicious program can keep some state that it wrote, stable for an extended period, which it can then access using an existing exploit. While it doesn't cause the exploit itself, and while it's not the only thing that can stall a kernel thread when accessing a memory location, it's one of the few that never needs privilege. We can add a flag, allowing userfaultfd to be restricted, so that in general it won't be useable by arbitrary user programs, but in environments that require userfaultfd it can be turned back on. Add a global sysctl knob "vm.unprivileged_userfaultfd" to control whether userfaultfd is allowed by unprivileged users. When this is set to zero, only privileged users (root user, or users with the CAP_SYS_PTRACE capability) will be able to use the userfaultfd syscalls. Andrea said: : The only difference between the bpf sysctl and the userfaultfd sysctl : this way is that the bpf sysctl adds the CAP_SYS_ADMIN capability : requirement, while userfaultfd adds the CAP_SYS_PTRACE requirement, : because the userfaultfd monitor is more likely to need CAP_SYS_PTRACE : already if it's doing other kind of tracking on processes runtime, in : addition of userfaultfd. In other words both syscalls works only for : root, when the two sysctl are opt-in set to 1. [dgilbert@redhat.com: changelog additions] [akpm@linux-foundation.org: documentation tweak, per Mike] Link: http://lkml.kernel.org/r/20190319030722.12441-2-peterx@redhat.com Change-Id: Ied2500a773b06ac1fdc378e61fd5403a270114a6 Signed-off-by: Peter Xu <peterx@redhat.com> Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Suggested-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Cc: Maya Gokhale <gokhale2@llnl.gov> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Martin Cracauer <cracauer@cons.org> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com> Cc: Marty McFadden <mcfadden8@llnl.gov> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-12-17 21:25:45 +05:30
Wolfram Sang	e1b1f86d15	ipmi: docs: don't advertise deprecated sysfs entries [ Upstream commit 64dce81f8c373c681e62d5ffe0397c45a35d48a2 ] "i2c-adapter" class entries are deprecated since 2009. Switch to the proper location. Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Closes: https://lore.kernel.org/r/80c4a898-5867-4162-ac85-bdf7c7c68746@gmail.com Fixes: `259307074b` ("ipmi: Add SMBus interface driver (SSIF)") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Message-Id: <20240901090211.3797-2-wsa+renesas@sang-engineering.com> Signed-off-by: Corey Minyard <corey@minyard.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit e4e81788a8b83f267d25b9f3b68cb4837b71bdd9) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2025-12-17 19:43:43 +05:30
Guenter Roeck	f2949c0095	hwmon: Introduce SENSOR_DEVICE_ATTR_{RO, RW, WO} and variants [ Upstream commit a5c47c0d388b939dd578fd466aa804b7f2445390 ] Introduce SENSOR_DEVICE_ATTR_{RO,RW,WO} and SENSOR_DEVICE_ATTR_2_{RO,RW,WO} as simplified variants of SENSOR_DEVICE_ATTR and SENSOR_DEVICE_ATTR_2 to simplify the source code, improve readbility, and reduce the chance of inconsistencies. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Stable-dep-of: 1ea3fd1eb986 ("hwmon: (max6697) Fix swapped temp{1,8} critical alarms") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit eb04482acd9870b84970fe1549203fedc1bbcc79) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:36:23 +05:30
Hans de Goede	d65e4b99c7	Input: silead - add support for capactive home button found on some x86 tablets On some x86 tablets with a silead touchscreen the windows logo on the front is a capacitive home button. Touching this button results in a touch with bits 12-15 of the Y coordinates set, while normally only the lower 12 are used. Detect this and report a KEY_LEFTMETA press when this happens. Note for now we only respond to the Y coordinate bits 12-15 containing 0x01, on some tablets without a capacative button I've noticed these bits containing 0x04 when crossing the edges of the screen. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> (cherry picked from commit eca3be9b95ac7cf9442654a54962859d74f8e38a) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2025-12-17 19:35:46 +05:30
Derek Fang	011e214bdd	ASoC: dt-bindings: rt5645: add cbj sleeve gpio property [ Upstream commit 306b38e3fa727d22454a148a364123709e356600 ] Add an optional gpio property to control external CBJ circuits to avoid some electric noise caused by sleeve/ring2 contacts floating. Signed-off-by: Derek Fang <derek.fang@realtek.com> Link: https://msgid.link/r/20240408091057.14165-2-derek.fang@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 5af06b6c57a9bbfa9bd5421e28bcd5c571c5821e) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:29:27 +05:30
Akira Yokosawa	8f789bce03	docs: kernel_include.py: Cope with docutils 0.21 commit d43ddd5c91802a46354fa4c4381416ef760676e2 upstream. Running "make htmldocs" on a newly installed Sphinx 7.3.7 ends up in a build error: Sphinx parallel build error: AttributeError: module 'docutils.nodes' has no attribute 'reprunicode' docutils 0.21 has removed nodes.reprunicode, quote from release note [1]: * Removed objects: docutils.nodes.reprunicode, docutils.nodes.ensure_str() Python 2 compatibility hacks Sphinx 7.3.0 supports docutils 0.21 [2]: kernel_include.py, whose origin is misc.py of docutils, uses reprunicode. Upstream docutils removed the offending line from the corresponding file (docutils/docutils/parsers/rst/directives/misc.py) in January 2022. Quoting the changelog [3]: Deprecate `nodes.reprunicode` and `nodes.ensure_str()`. Drop uses of the deprecated constructs (not required with Python 3). Do the same for kernel_include.py. Tested against: - Sphinx 2.4.5 (docutils 0.17.1) - Sphinx 3.4.3 (docutils 0.17.1) - Sphinx 5.3.0 (docutils 0.18.1) - Sphinx 6.2.1 (docutils 0.19) - Sphinx 7.2.6 (docutils 0.20.1) - Sphinx 7.3.7 (docutils 0.21.2) Link: http://www.docutils.org/RELEASE-NOTES.html#release-0-21-2024-04-09 [1] Link: https://www.sphinx-doc.org/en/master/changes.html#release-7-3-0-released-apr-16-2024 [2] Link: https://github.com/docutils/docutils/commit/c8471ce47a24 [3] Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/faf5fa45-2a9d-4573-9d2e-3930bdc1ed65@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 4b431a786f0ca86614b2d00e17b313956d7ef035) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2025-12-17 19:29:22 +05:30
Daniel Axtens	1e2f88ad9b	bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat to deal with gso sctp skbs SCTP GSO skbs have a gso_size of GSO_BY_FRAGS, so any sort of unconditionally mangling of that will result in nonsense value and would corrupt the skb later on. Therefore, i) add two helpers skb_increase_gso_size() and skb_decrease_gso_size() that would throw a one time warning and bail out for such skbs and ii) refuse and return early with an error in those BPF helpers that are affected. We do need to bail out as early as possible from there before any changes on the skb have been performed. Fixes: `6578171a7f` ("bpf: add bpf_skb_change_proto helper") Co-authored-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Axtens <dja@axtens.net> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> (cherry picked from commit d02f51cbcf12b09ab945873e35046045875eed9a) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:22:22 +05:30
Daniel Axtens	f7e42e5863	docs: segmentation-offloads.txt: add SCTP info Most of this is extracted from `90017accff` ("sctp: Add GSO support"), with some extra text about GSO_BY_FRAGS and the need to check for it. Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit a677088922831d94d292ca3891b148a8ba0b5fa1) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:21:59 +05:30
Kim Phillips	abe4f44724	x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled commit fd470a8beed88440b160d690344fbae05a0b9b1b upstream. Unlike Intel's Enhanced IBRS feature, AMD's Automatic IBRS does not provide protection to processes running at CPL3/user mode, see section "Extended Feature Enable Register (EFER)" in the APM v2 at https://bugzilla.kernel.org/attachment.cgi?id=304652 Explicitly enable STIBP to protect against cross-thread CPL3 branch target injections on systems with Automatic IBRS enabled. Also update the relevant documentation. Fixes: e7862eda309e ("x86/cpu: Support AMD Automatic IBRS") Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230720194727.67022-1-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bb8cc9c34361714dd232700b3d5f1373055de610) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:19:36 +05:30
Kim Phillips	d472b51db8	x86/cpu: Support AMD Automatic IBRS commit e7862eda309ecfccc36bb5558d937ed3ace07f3f upstream. The AMD Zen4 core supports a new feature called Automatic IBRS. It is a "set-and-forget" feature that means that, like Intel's Enhanced IBRS, h/w manages its IBRS mitigation resources automatically across CPL transitions. The feature is advertised by CPUID_Fn80000021_EAX bit 8 and is enabled by setting MSR C000_0080 (EFER) bit 21. Enable Automatic IBRS by default if the CPU feature is present. It typically provides greater performance over the incumbent generic retpolines mitigation. Reuse the SPECTRE_V2_EIBRS spectre_v2_mitigation enum. AMD Automatic IBRS and Intel Enhanced IBRS have similar enablement. Add NO_EIBRS_PBRSB to cpu_vuln_whitelist, since AMD Automatic IBRS isn't affected by PBRSB-eIBRS. The kernel command line option spectre_v2=eibrs is used to select AMD Automatic IBRS, if available. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20230124163319.2277355-8-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a7268b3424863578814d99a1dd603f6bb5b9f977) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:18:21 +05:30
Lin Yujun	b790342b50	Documentation/hw-vuln: Update spectre doc commit 06cb31cc761823ef444ba4e1df11347342a6e745 upstream. commit 7c693f54c873691 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS") adds the "ibrs " option in Documentation/admin-guide/kernel-parameters.txt but omits it to Documentation/admin-guide/hw-vuln/spectre.rst, add it. Signed-off-by: Lin Yujun <linyujun809@huawei.com> Link: https://lore.kernel.org/r/20220830123614.23007-1-linyujun809@huawei.com Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 3e4f86cfda46ef6320f385b80496d3f65d5ed63d) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 19:18:20 +05:30
Pavel Machek	0928d780ff	Documentation: fix little inconsistencies [ Upstream commit 98bfa34462fb6af70b845d28561072f80bacdb9b ] Fix little inconsistencies in Documentation: make case and spacing match surrounding text. Signed-off-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Stable-dep-of: 4848229494a3 ("irqchip/jcore-aic: Fix missing allocation of IRQ descriptors") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2025-12-17 19:04:52 +05:30
Christoph Hellwig	ec8b974678	md: switch to ->check_events for media change notifications [ Upstream commit a564e23f0f99759f453dbefcb9160dec6d99df96 ] md is the last driver using the legacy media_changed method. Switch it over to (not so) new ->clear_events approach, which also removes the need for the ->revalidate_disk method. Signed-off-by: Christoph Hellwig <hch@lst.de> [axboe: remove unused 'bdops' variable in disk_clear_events()] Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: 9674f54e41ff ("md: Don't clear MD_CLOSING when the raid is about to stop") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit dc51c01a3d5a796e18520a186f56e13f8e70749f) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2025-12-17 19:02:04 +05:30
Breno Leitao	c7ef1e441b	net: sysfs: Fix /sys/class/net/<iface> path for statistics [ Upstream commit 5b3fbd61b9d1f4ed2db95aaf03f9adae0373784d ] The Documentation/ABI/testing/sysfs-class-net-statistics documentation is pointing to the wrong path for the interface. Documentation is pointing to /sys/class/<iface>, instead of /sys/class/net/<iface>. Fix it by adding the `net/` directory before the interface. Fixes: `6044f97006` ("net: sysfs: document /sys/class/net/statistics/*") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit e7928873d9ac5a6194f0ffc56549d4262af7e568) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 18:09:42 +05:30
Julian Wiedmann	4f2bca6148	Documentation: net-sysfs: describe missing statistics [ Upstream commit e528afb72a481977456bb18345d4e7f6b85fa7b1 ] Sync the ABI description with the interface statistics that are currently available through sysfs. CC: Jarod Wilson <jarod@redhat.com> CC: Jonathan Corbet <corbet@lwn.net> CC: linux-doc@vger.kernel.org Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 5b3fbd61b9d1 ("net: sysfs: Fix /sys/class/net/<iface> path for statistics") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit b908fdcb6bbc07a3314afb386415b616fa01732f) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 18:09:42 +05:30
Breno Leitao	c8373fb640	net: sysfs: Fix /sys/class/net/<iface> path [ Upstream commit ae3f4b44641dfff969604735a0dcbf931f383285 ] The documentation is pointing to the wrong path for the interface. Documentation is pointing to /sys/class/<iface>, instead of /sys/class/net/<iface>. Fix it by adding the `net/` directory before the interface. Fixes: `1a02ef76ac` ("net: sysfs: add documentation entries for /sys/class/<iface>/queues") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240131102150.728960-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 3dc7b3ffd5c539124ee8fc42a32a91b5df13717d) [vegard: fix trivial conflict] Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 18:09:30 +05:30
Cristian Ciocaltea	6df5db4521	ASoC: doc: Fix undefined SND_SOC_DAPM_NOPM argument [ Upstream commit 67c7666fe808c3a7af3cc6f9d0a3dd3acfd26115 ] The virtual widget example makes use of an undefined SND_SOC_DAPM_NOPM argument passed to SND_SOC_DAPM_MIXER(). Replace with the correct SND_SOC_NOPM definition. Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://lore.kernel.org/r/20231121120751.77355-1-cristian.ciocaltea@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit bbb3342c6343688fb673d7c6b51cbf8d184565d2) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2025-12-17 18:08:44 +05:30
theshaenix	37611d1485	Merge tag 'v4.14.328' of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux into mcqueen This is the 4.14.328 stable release	2025-12-17 15:46:31 +05:30
Greg Kroah-Hartman	03f9e655b4	Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group commit 4fee0915e649bd0cea56dece6d96f8f4643df33c upstream. Because the linux-distros group forces reporters to release information about reported bugs, and they impose arbitrary deadlines in having those bugs fixed despite not actually being kernel developers, the kernel security team recommends not interacting with them at all as this just causes confusion and the early-release of reported security problems. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/2023063020-throat-pantyhose-f110@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-17 14:54:34 +05:30
Dave Hansen	5846ee8871	Documentation/x86: Fix backwards on/off logic about YMM support commit 1b0fc0345f2852ffe54fb9ae0e12e2ee69ad6a20 upstream These options clearly turn off XSAVE YMM support. Correct the typo. Reported-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 553a5c03e90a ("x86/speculation: Add force option to GDS mitigation") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-17 14:47:59 +05:30
Daniel Sneddon	9651a9dca8	x86/speculation: Add force option to GDS mitigation commit 553a5c03e90a6087e88f8ff878335ef0621536fb upstream The Gather Data Sampling (GDS) vulnerability allows malicious software to infer stale data previously stored in vector registers. This may include sensitive data such as cryptographic keys. GDS is mitigated in microcode, and systems with up-to-date microcode are protected by default. However, any affected system that is running with older microcode will still be vulnerable to GDS attacks. Since the gather instructions used by the attacker are part of the AVX2 and AVX512 extensions, disabling these extensions prevents gather instructions from being executed, thereby mitigating the system from GDS. Disabling AVX2 is sufficient, but we don't have the granularity to do this. The XCR0[2] disables AVX, with no option to just disable AVX2. Add a kernel parameter gather_data_sampling=force that will enable the microcode mitigation if available, otherwise it will disable AVX on affected systems. This option will be ignored if cmdline mitigations=off. This is a big hammer. It is known to break buggy userspace that uses incomplete, buggy AVX enumeration. Unfortunately, such userspace does exist in the wild: https://www.mail-archive.com/bug-coreutils@gnu.org/msg33046.html [ dhansen: add some more ominous warnings about disabling AVX ] Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-17 14:47:57 +05:30
Daniel Sneddon	3b7d8d7d60	x86/speculation: Add Gather Data Sampling mitigation commit 8974eb588283b7d44a7c91fa09fcbaf380339f3a upstream Gather Data Sampling (GDS) is a hardware vulnerability which allows unprivileged speculative access to data which was previously stored in vector registers. Intel processors that support AVX2 and AVX512 have gather instructions that fetch non-contiguous data elements from memory. On vulnerable hardware, when a gather instruction is transiently executed and encounters a fault, stale data from architectural or internal vector registers may get transiently stored to the destination vector register allowing an attacker to infer the stale data using typical side channel techniques like cache timing attacks. This mitigation is different from many earlier ones for two reasons. First, it is enabled by default and a bit must be set to DISABLE it. This is the opposite of normal mitigation polarity. This means GDS can be mitigated simply by updating microcode and leaving the new control bit alone. Second, GDS has a "lock" bit. This lock bit is there because the mitigation affects the hardware security features KeyLocker and SGX. It needs to be enabled and STAY enabled for these features to be mitigated against GDS. The mitigation is enabled in the microcode by default. Disable it by setting gather_data_sampling=off or by disabling all mitigations with mitigations=off. The mitigation status can be checked by reading: /sys/devices/system/cpu/vulnerabilities/gather_data_sampling Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-17 14:47:56 +05:30
Stephen Hemminger	900ad1d432	Remove DECnet support from kernel commit 1202cdd665315c525b5237e96e0bedc76d7e754f upstream. DECnet is an obsolete network protocol that receives more attention from kernel janitors than users. It belongs in computer protocol history museum not in Linux kernel. It has been "Orphaned" in kernel since 2010. The iproute2 support for DECnet was dropped in 5.0 release. The documentation link on Sourceforge says it is abandoned there as well. Leave the UAPI alone to keep userspace programs compiling. This means that there is still an empty neighbour table for AF_DECNET. The table of /proc/sys/net entries was updated to match current directories and reformatted to be alphabetical. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Ahern <dsahern@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-17 14:40:54 +05:30

1 2 3 4 5 ...

34873 Commits