Changes in 5.15.82
arm64: mte: Avoid setting PG_mte_tagged if no tags cleared or restored
drm/i915: Create a dummy object for gen6 ppgtt
drm/i915/gt: Use i915_vm_put on ppgtt_create error paths
erofs: fix order >= MAX_ORDER warning due to crafted negative i_size
btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino
btrfs: free btrfs_path before copying inodes to userspace
spi: spi-imx: Fix spi_bus_clk if requested clock is higher than input clock
btrfs: move QUOTA_ENABLED check to rescan_should_stop from btrfs_qgroup_rescan_worker
btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit()
drm/display/dp_mst: Fix drm_dp_mst_add_affected_dsc_crtcs() return code
drm/amdgpu: update drm_display_info correctly when the edid is read
drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"
iio: health: afe4403: Fix oob read in afe4403_read_raw
iio: health: afe4404: Fix oob read in afe4404_[read|write]_raw
iio: light: rpr0521: add missing Kconfig dependencies
bpf, perf: Use subprog name when reporting subprog ksymbol
scripts/faddr2line: Fix regression in name resolution on ppc64le
ARM: at91: rm9200: fix usb device clock id
libbpf: Handle size overflow for ringbuf mmap
hwmon: (ltc2947) fix temperature scaling
hwmon: (ina3221) Fix shunt sum critical calculation
hwmon: (i5500_temp) fix missing pci_disable_device()
hwmon: (ibmpex) Fix possible UAF when ibmpex_register_bmc() fails
bpf: Do not copy spin lock field from user in bpf_selem_alloc
nvmem: rmem: Fix return value check in rmem_read()
of: property: decrement node refcount in of_fwnode_get_reference_args()
ixgbevf: Fix resource leak in ixgbevf_init_module()
i40e: Fix error handling in i40e_init_module()
fm10k: Fix error handling in fm10k_init_module()
iavf: remove redundant ret variable
iavf: Fix error handling in iavf_init_module()
e100: Fix possible use after free in e100_xmit_prepare
net/mlx5: DR, Rename list field in matcher struct to list_node
net/mlx5: DR, Fix uninitialized var warning
net/mlx5: Fix uninitialized variable bug in outlen_write()
net/mlx5e: Fix use-after-free when reverting termination table
can: sja1000_isa: sja1000_isa_probe(): add missing free_sja1000dev()
can: cc770: cc770_isa_probe(): add missing free_cc770dev()
can: etas_es58x: es58x_init_netdev(): free netdev when register_candev()
can: m_can: pci: add missing m_can_class_free_dev() in probe/remove methods
can: m_can: Add check for devm_clk_get
qlcnic: fix sleep-in-atomic-context bugs caused by msleep
aquantia: Do not purge addresses when setting the number of rings
wifi: cfg80211: fix buffer overflow in elem comparison
wifi: cfg80211: don't allow multi-BSSID in S1G
wifi: mac8021: fix possible oob access in ieee80211_get_rate_duration
net: phy: fix null-ptr-deref while probe() failed
net: ethernet: ti: am65-cpsw: fix error handling in am65_cpsw_nuss_probe()
net: net_netdev: Fix error handling in ntb_netdev_init_module()
net/9p: Fix a potential socket leak in p9_socket_open
net: ethernet: nixge: fix NULL dereference
net: wwan: iosm: fix kernel test robot reported error
net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type
dsa: lan9303: Correct stat name
tipc: re-fetch skb cb after tipc_msg_validate
net: hsr: Fix potential use-after-free
net: mdiobus: fix unbalanced node reference count
afs: Fix fileserver probe RTT handling
net: tun: Fix use-after-free in tun_detach()
packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE
sctp: fix memory leak in sctp_stream_outq_migrate()
net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed
hwmon: (coretemp) Check for null before removing sysfs attrs
hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()
riscv: vdso: fix section overlapping under some conditions
riscv: mm: Proper page permissions after initmem free
ALSA: dice: fix regression for Lexicon I-ONIX FW810S
error-injection: Add prompt for function error injection
tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"
nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3
pinctrl: intel: Save and restore pins in "direct IRQ" mode
v4l2: don't fall back to follow_pfn() if pin_user_pages_fast() fails
net: stmmac: Set MAC's flow control register to reflect current settings
mmc: mmc_test: Fix removal of debugfs file
mmc: core: Fix ambiguous TRIM and DISCARD arg
mmc: sdhci-esdhc-imx: correct CQHCI exit halt state check
mmc: sdhci-sprd: Fix no reset data and command after voltage switch
mmc: sdhci: Fix voltage switch delay
drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
drm/amdgpu: enable Vangogh VCN indirect sram mode
drm/i915: Fix negative value passed as remaining time
drm/i915: Never return 0 if not all requests retired
tracing/osnoise: Fix duration type
tracing: Fix race where histograms can be called before the event
tracing: Free buffers when a used dynamic event is removed
io_uring: update res mask in io_poll_check_events
io_uring: fix tw losing poll events
io_uring: cmpxchg for poll arm refs release
io_uring: make poll refs more robust
io_uring/poll: fix poll_refs race with cancelation
KVM: x86/mmu: Fix race condition in direct_page_fault
ASoC: ops: Fix bounds check for _sx controls
pinctrl: single: Fix potential division by zero
riscv: Sync efi page table's kernel mappings before switching
riscv: fix race when vmap stack overflow
riscv: kexec: Fixup irq controller broken in kexec crash path
nvme: fix SRCU protection of nvme_ns_head list
iommu/vt-d: Fix PCI device refcount leak in has_external_pci()
iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init()
mm: __isolate_lru_page_prepare() in isolate_migratepages_block()
mm: migrate: fix THP's mapcount on isolation
parisc: Increase FRAME_WARN to 2048 bytes on parisc
Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
selftests: net: add delete nexthop route warning test
selftests: net: fix nexthop warning cleanup double ip typo
ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference
ipv4: Fix route deletion when nexthop info is not specified
serial: stm32: Factor out GPIO RTS toggling into separate function
serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode
serial: stm32: Deassert Transmit Enable on ->rs485_config()
i2c: npcm7xx: Fix error handling in npcm_i2c_init()
i2c: imx: Only DMA messages with I2C_M_DMA_SAFE flag set
ACPI: HMAT: remove unnecessary variable initialization
ACPI: HMAT: Fix initiator registration for single-initiator systems
Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"
char: tpm: Protect tpm_pm_suspend with locks
Input: raydium_ts_i2c - fix memory leak in raydium_i2c_send()
ipc/sem: Fix dangling sem_array access in semtimedop race
proc: avoid integer type confusion in get_proc_long
proc: proc_skip_spaces() shouldn't think it is working on C strings
Linux 5.15.82
Change-Id: I4ce52cb5917c9036339810c816ab005a4e9489fb
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 4313e5a613049dfc1819a6dfb5f94cf2caff9452 upstream.
After 65536 dynamic events have been added and removed, the "type" field
of the event then uses the first type number that is available (not
currently used by other events). A type number is the identifier of the
binary blobs in the tracing ring buffer (known as events) to map them to
logic that can parse the binary blob.
The issue is that if a dynamic event (like a kprobe event) is traced and
is in the ring buffer, and then that event is removed (because it is
dynamic, which means it can be created and destroyed), and another dynamic
event is then created that has the same number, that new event's logic for
parsing the binary blob will be used.
To show how this can be an issue, the following can crash the kernel:
# cd /sys/kernel/tracing
# for i in `seq 65536`; do
echo 'p:kprobes/foo do_sys_openat2 $arg1:u32' > kprobe_events
# done
For every iteration of the above, the writing to the kprobe_events will
remove the old event and create a new one (with the same format) and
increase the type number to the next available one until the type number
reaches over 65535 which is the max number for the 16 bit type. After it
reaches that number, the logic to allocate a new number simply looks for
the next available number. When a dynamic event is removed, that number
is then available to be reused by the next dynamic event created. That is,
once the above reaches the max number, the number assigned to the event in
that loop will remain the same.
Now that means deleting one dynamic event and creating another will reuse
the previous event's type number. This is where bad things can happen.
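The reuse described above can be modelled outside the kernel. The following is a minimal sketch, not the kernel's allocator: the TypeAllocator class, recreate_event() and the choice of 1 as the first type number are all assumptions made for illustration. It only shows that once the counter passes the 16 bit maximum, a remove-then-create cycle keeps handing back the same number.

```python
MAX_TYPE = 65535  # largest value a 16-bit type field can hold


class TypeAllocator:
    """Toy model of event type-number assignment (hypothetical, not kernel code)."""

    def __init__(self, first_type=1):
        self.next_type = first_type
        self.in_use = set()

    def alloc(self):
        if self.next_type <= MAX_TYPE:
            # Fast path: hand out ever-increasing numbers.
            t = self.next_type
            self.next_type += 1
        else:
            # Counter exhausted: fall back to the lowest free number.
            t = next(n for n in range(1, MAX_TYPE + 1) if n not in self.in_use)
        self.in_use.add(t)
        return t

    def free(self, t):
        self.in_use.discard(t)


def recreate_event(alloc, old_type):
    """Mimic one write to kprobe_events: remove the old event, create a new one."""
    if old_type is not None:
        alloc.free(old_type)
    return alloc.alloc()


alloc = TypeAllocator()
t = None
history = []
for _ in range(MAX_TYPE + 3):
    t = recreate_event(alloc, t)
    history.append(t)

# The numbers climb until the maximum is reached, then stop changing.
print(history[MAX_TYPE - 1], history[MAX_TYPE], history[MAX_TYPE + 1])
```

In this model the last iterations all receive the same number, which is the condition under which a stale record in the ring buffer is parsed with the new event's format.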
After the above loop finishes, the kprobes/foo event reads the
do_sys_openat2 function call's first parameter as an integer.
# echo 1 > kprobes/foo/enable
# cat /etc/passwd > /dev/null
# cat trace
cat-2211 [005] .... 2007.849603: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
cat-2211 [005] .... 2007.849620: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
cat-2211 [005] .... 2007.849838: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
cat-2211 [005] .... 2007.849880: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
# echo 0 > kprobes/foo/enable
Now if we delete the kprobe and create a new one that reads a string:
# echo 'p:kprobes/foo do_sys_openat2 +0($arg2):string' > kprobe_events
And now we can read the trace:
# cat trace
sendmail-1942 [002] ..... 530.136320: foo: (do_sys_openat2+0x0/0x240) arg1= cat-2046 [004] ..... 530.930817: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
cat-2046 [004] ..... 530.930961: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
cat-2046 [004] ..... 530.934278: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
cat-2046 [004] ..... 530.934563: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
bash-1515 [007] ..... 534.299093: foo: (do_sys_openat2+0x0/0x240) arg1="kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk���������@��4Z����;Y�����U
And dmesg has:
==================================================================
BUG: KASAN: use-after-free in string+0xd4/0x1c0
Read of size 1 at addr ffff88805fdbbfa0 by task cat/2049
CPU: 0 PID: 2049 Comm: cat Not tainted 6.1.0-rc6-test+ #641
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
Call Trace:
<TASK>
dump_stack_lvl+0x5b/0x77
print_report+0x17f/0x47b
kasan_report+0xad/0x130
string+0xd4/0x1c0
vsnprintf+0x500/0x840
seq_buf_vprintf+0x62/0xc0
trace_seq_printf+0x10e/0x1e0
print_type_string+0x90/0xa0
print_kprobe_event+0x16b/0x290
print_trace_line+0x451/0x8e0
s_show+0x72/0x1f0
seq_read_iter+0x58e/0x750
seq_read+0x115/0x160
vfs_read+0x11d/0x460
ksys_read+0xa9/0x130
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fc2e972ade2
Code: c0 e9 b2 fe ff ff 50 48 8d 3d b2 3f 0a 00 e8 05 f0 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffc64e687c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fc2e972ade2
RDX: 0000000000020000 RSI: 00007fc2e980d000 RDI: 0000000000000003
RBP: 00007fc2e980d000 R08: 00007fc2e980c010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020f00
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
</TASK>
The buggy address belongs to the physical page:
page:ffffea00017f6ec0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5fdbb
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0000000 0000000000000000 ffffea00017f6ec8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88805fdbbe80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff88805fdbbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff88805fdbbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff88805fdbc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff88805fdbc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
This was found when Zheng Yejian sent a patch to convert the event type
number assignment to use IDA, which gives the next available number, and
this bug showed up in the fuzz testing by Yujie Liu and the kernel test
robot. But after further analysis, I found that this behavior is the same
as when the event type numbers go past the 16bit max (and the above shows
that).
Modules have a similar issue, but it is dealt with by setting a
"WAS_ENABLED" flag when a module event is enabled, and when the module is
freed, if any of its events were enabled, the ring buffer that holds that
event is also cleared, to prevent reading stale events. The same can be
done for dynamic events.
If any dynamic event that is being removed was enabled, then make sure the
buffers they were enabled in are now cleared.
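As a rough illustration of the approach, the sketch below tracks which buffers an event was ever enabled in and resets exactly those buffers when the event is removed. It is a toy model under stated assumptions: TraceBuffer, DynEvent, was_enabled_in and remove_event() are hypothetical names, not kernel symbols.

```python
class TraceBuffer:
    """Toy ring buffer: a list of (type_number, raw_payload) records."""

    def __init__(self):
        self.records = []

    def reset(self):
        self.records.clear()


class DynEvent:
    """Toy dynamic event that remembers where it was ever enabled."""

    def __init__(self, name, type_number):
        self.name = name
        self.type_number = type_number
        self.enabled = False
        self.was_enabled_in = []  # plays the role of the WAS_ENABLED flag

    def enable(self, buf):
        self.enabled = True
        if buf not in self.was_enabled_in:
            self.was_enabled_in.append(buf)

    def disable(self):
        self.enabled = False

    def trace(self, buf, payload):
        if self.enabled:
            buf.records.append((self.type_number, payload))


def remove_event(event):
    """Remove a dynamic event, clearing every buffer it was enabled in."""
    for buf in event.was_enabled_in:
        buf.reset()
    event.was_enabled_in = []


buf = TraceBuffer()
other = TraceBuffer()
other.records.append((2, b"unrelated"))

old = DynEvent("kprobes/foo", type_number=1)
old.enable(buf)
old.trace(buf, b"\x9c\xff\xff\xff")  # a u32 written by the old event
old.disable()
remove_event(old)

# A new event that reuses type number 1 finds no stale records to misparse.
new = DynEvent("kprobes/foo", type_number=1)
stale = [r for r in buf.records if r[0] == new.type_number]
```

Buffers the event was never enabled in are left alone, which keeps the cost of removal limited to the data that could actually be misread.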
Link: https://lkml.kernel.org/r/20221123171434.545706e3@gandalf.local.home
Link: https://lore.kernel.org/all/20221110020319.1259291-1-zhengyejian1@huawei.com/
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Depends-on: e18eb8783ec49 ("tracing: Add tracing_reset_all_online_cpus_unlocked() function")
Depends-on: 5448d44c38 ("tracing: Add unified dynamic event framework")
Depends-on: 6212dd2968 ("tracing/kprobes: Use dyn_event framework for kprobe events")
Depends-on: 065e63f951 ("tracing: Only have rmmod clear buffers that its events were active in")
Depends-on: 575380da8b ("tracing: Only clear trace buffer on module unload if event was traced")
Fixes: 77b44d1b7c ("tracing/kprobes: Rename Kprobe-tracer to kprobe-event")
Reported-by: Zheng Yejian <zhengyejian1@huawei.com>
Reported-by: Yujie Liu <yujie.liu@intel.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 4e4f6e33d6.
The file permission changes break android userspace tools that depend on
tracefs on devices with older platform releases (Android 12/12L).
The removal of other bits from tracefs directory permissions is reverted in android13 kernels, so also revert the change for file permission bits. android14 kernels will align with upstream tracefs.
Bug: 236018289
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Change-Id: If07ae097c2a45fa54163681df7790a6e2537ba31
[ Upstream commit 21ccc9cd72116289469e5519b6159c675a2fa58f ]
When building the files in the tracefs file system, do not by default set
any permissions for OTH (other). This will make it easier for admins who
want to define a group for accessing tracefs without having to first
disable all the permission bits for "other" in the file system.
As tracing can leak sensitive information, it should never by default
allow all users access. An admin can still set the permission bits for
others to have access, which may be useful for creating a honeypot and
seeing who takes advantage of it and roots the machine.
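The effect on a mode value is a simple mask. The sketch below uses Python's stat constants to show it; tracefs_default_mode() is a hypothetical helper name, and the example modes 0644 and 0755 are assumptions chosen for illustration rather than values taken from the patch.

```python
import stat


def tracefs_default_mode(requested_mode):
    """Strip all "other" permission bits (read, write, execute) from a mode."""
    return requested_mode & ~stat.S_IRWXO


file_mode = tracefs_default_mode(0o644)
dir_mode = tracefs_default_mode(0o755)
print(oct(file_mode), oct(dir_mode))  # prints: 0o640 0o750
```

Owner and group bits are untouched, so an admin only needs to assign a group to grant access.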
Link: https://lkml.kernel.org/r/20210818153038.864149276@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Delegate command parsing to each create function so that the
command syntax can be customized.
This requires changes to the kprobe/uprobe/synthetic event handling,
which are also included here.
Link: https://lkml.kernel.org/r/e488726f49cbdbc01568618f8680584306c4c79f.1612208610.git.zanussi@kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
[ zanussi@kernel.org: added synthetic event modifications ]
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
It's kind of strange to have check_arg() callbacks as part of the arg
objects themselves; it makes more sense to just pass these in when the
args are added instead.
Remove the check_arg() callbacks from those objects which also means
removing the check_arg() args from the init functions, adding them to
the add functions and fixing up existing callers.
Link: http://lkml.kernel.org/r/c7708d6f177fcbe1a36b6e4e8e150907df0fa5d2.1580506712.git.zanussi@kernel.org
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add an interface used to build up dynamic event creation commands,
such as synthetic and kprobe events. Interfaces specific to those
particular types of events and others can be built on top of this
interface.
Command creation is started by first using the dynevent_cmd_init()
function to initialize the dynevent_cmd object. Following that, args
are appended and optionally checked by the dynevent_arg_add() and
dynevent_arg_pair_add() functions, which use objects representing
arguments and pairs of arguments, initialized respectively by
dynevent_arg_init() and dynevent_arg_pair_init(). Finally, once all
args have been successfully added, the command is finalized and
actually created using dynevent_create().
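The flow (init, add args and arg pairs with optional checks, then create) can be sketched as a small string builder. This is a Python model for illustration only: the class and function names mirror the steps described above but are not the kernel API, and the pair separator, the trailing ";" and the "wakeup_latency" example command are assumptions.

```python
class DynEventCmd:
    """Toy stand-in for the dynevent_cmd object: accumulates a command string."""

    def __init__(self, maxlen):
        self.maxlen = maxlen
        self.buf = ""

    def append(self, text):
        if len(self.buf) + len(text) > self.maxlen:
            raise ValueError("command too long")
        self.buf += text


def arg_add(cmd, arg, check=None):
    """Append one argument, running the optional check first."""
    if check is not None:
        check(arg)
    cmd.append(" " + arg)


def arg_pair_add(cmd, lhs, rhs, operator=" ", check=None):
    """Append a pair of arguments, such as a field type and its name."""
    if check is not None:
        check((lhs, rhs))
    cmd.append(" " + lhs + operator + rhs + ";")


def create(cmd, created):
    """Finalize: hand the finished command string to the creation backend."""
    created.append(cmd.buf.strip())


def non_empty(arg):
    """Example check: reject empty arguments or empty halves of a pair."""
    parts = arg if isinstance(arg, tuple) else (arg,)
    if not all(parts):
        raise ValueError("empty argument")


created = []
cmd = DynEventCmd(maxlen=256)
arg_add(cmd, "wakeup_latency", check=non_empty)
arg_pair_add(cmd, "u64", "lat", check=non_empty)
arg_pair_add(cmd, "pid_t", "pid", check=non_empty)
create(cmd, created)
print(created[0])  # prints: wakeup_latency u64 lat; pid_t pid;
```

Nothing is created until every argument has been appended successfully, so a failed check leaves no half-built event behind.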
The code here for actually printing into the dyn_event->cmd buffer
using snprintf() etc was adapted from v4 of Masami's 'tracing/boot:
Add synthetic event support' patch.
Link: http://lkml.kernel.org/r/1f65fa44390b6f238f6036777c3784ced1dcc6a0.1580323897.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Currently, most files in the tracefs directory test if tracing_disabled is
set. If so, they should return -ENODEV. tracing_disabled is set when
tracing is found to be broken. Originally it was done in case the ring
buffer was found to be corrupted, and we wanted to prevent reading it from
crashing the kernel. But it's also set if a tracing selftest fails on
boot. It's a one way switch. That is, once it is triggered, tracing is
disabled until reboot.
As most tracefs files can also be used by instances in the tracefs
directory, they need to be carefully done. Each instance has a trace_array
associated to it, and when the instance is removed, the trace_array is
freed. But if an instance is opened with a reference to the trace_array,
then it requires looking up the trace_array to get its ref counter (as there
could be a race with it being deleted and the open itself). Once it is
found, a reference is added to prevent the instance from being removed (and
the trace_array associated with it freed).
Combine the two checks (tracing_disabled and trace_array_get()) into a
single helper function. This will also make it easier to add lockdown to
tracefs later.
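A minimal sketch of such a combined helper, modelled in Python: the Tracing and TraceArray classes and the check_open_get_tr() name are hypothetical, and returning -ENODEV when the instance lookup fails is an assumption of this model.

```python
import errno


class TraceArray:
    """Toy instance descriptor with a reference counter."""

    def __init__(self, name):
        self.name = name
        self.ref = 0


class Tracing:
    """Toy global tracing state."""

    def __init__(self):
        self.disabled = False   # one-way switch, set when tracing is broken
        self.instances = []     # instances that currently exist

    def check_open_get_tr(self, tr):
        """Return 0 and take a reference, or a negative errno on failure."""
        if self.disabled:
            return -errno.ENODEV
        # Look the instance up rather than trusting the caller's pointer:
        # it may have been removed while the open was in flight.
        if tr not in self.instances:
            return -errno.ENODEV
        tr.ref += 1
        return 0


tracing = Tracing()
tr = TraceArray("foo")
tracing.instances.append(tr)

ok = tracing.check_open_get_tr(tr)     # succeeds and pins the instance
tracing.instances.remove(tr)
gone = tracing.check_open_get_tr(tr)   # instance no longer exists
```

With both checks behind one call, an additional gate such as lockdown would only need to be added in a single place.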
Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
When user gives an event name to delete, delete all
matched events instead of the first one.
This means if there are several events that have the same
name but different group (subsystem) names, all of them are
removed if the user passes only the event name, e.g.
# cat kprobe_events
p:group1/testevent _do_fork
p:group2/testevent fork_idle
# echo -:testevent >> kprobe_events
# cat kprobe_events
#
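The matching rule can be sketched as follows. This is a simplified Python model, not the kernel parser: delete_events() is a hypothetical name, and events are represented as (group, name) tuples purely for illustration.

```python
def delete_events(events, spec):
    """Apply a '-:[group/]event' deletion spec; return the surviving events."""
    if not spec.startswith("-:"):
        raise ValueError("not a deletion command")
    target = spec[2:]
    # An empty separator means no group was given, so any group matches.
    group, sep, name = target.rpartition("/")
    survivors = []
    for ev_group, ev_name in events:
        same_name = ev_name == name
        same_group = (not sep) or ev_group == group
        if not (same_name and same_group):
            survivors.append((ev_group, ev_name))
    return survivors


events = [
    ("group1", "testevent"),
    ("group2", "testevent"),
    ("group1", "other"),
]

by_name = delete_events(events, "-:testevent")          # removes both testevents
by_group = delete_events(events, "-:group2/testevent")  # removes only group2's
```

Giving the group narrows the deletion to one event, while the bare name removes every match.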
Link: http://lkml.kernel.org/r/156095684958.28024.16597826267117453638.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add a generic method to remove an event from the dynamic event
list. This is the same as in other systems under ftrace. You
just need to pass the event name with '!', e.g.
# echo p:new_grp/new_event _do_fork > dynamic_events
This creates an event, and
# echo '!p:new_grp/new_event _do_fork' > dynamic_events
Or,
# echo '!p:new_grp/new_event' > dynamic_events
will remove the new_grp/new_event event.
Note that this doesn't check the event prefix (e.g. "p:")
strictly, because the "group/event" name must be unique.
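A rough model of how such a write could be interpreted is shown below. It is a sketch only: parse_dynamic_event_command() is a hypothetical function, and treating everything before the first ":" as an unchecked prefix is an assumption that follows the note above.

```python
def parse_dynamic_event_command(line):
    """Split a dynamic_events write into (action, "group/event")."""
    remove = line.startswith("!")
    if remove:
        line = line[1:]
    head = line.split()[0]              # e.g. "p:new_grp/new_event"
    _prefix, sep, name = head.partition(":")
    if not sep:
        name = head                     # no prefix was given at all
    return ("remove" if remove else "create", name)


registry = set()
for write in (
    "p:new_grp/new_event _do_fork",
    "!p:new_grp/new_event",
):
    action, name = parse_dynamic_event_command(write)
    if action == "create":
        registry.add(name)
    else:
        registry.discard(name)
```

Because only the "group/event" part is used as the key, the trailing probe definition is optional on removal.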
Link: http://lkml.kernel.org/r/154140869774.17322.8887303560398645347.stgit@devbox
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>