94de3b405c8dee0ffc8de5c06b32fbf00fc4e8f9
38873 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
78acc7dbd8 |
blktrace: fix use after free for struct blk_trace
commit 30939293262eb433c960c4532a0d59c4073b2b84 upstream.
When tracing the whole disk, 'dropped' and 'msg' will be created
under 'q->debugfs_dir' and 'bt->dir' is NULL, thus blk_trace_free()
won't remove those files. What's worse, the following UAF can be
triggered because of accessing stale 'dropped' and 'msg':
==================================================================
BUG: KASAN: use-after-free in blk_dropped_read+0x89/0x100
Read of size 4 at addr ffff88816912f3d8 by task blktrace/1188
CPU: 27 PID: 1188 Comm: blktrace Not tainted 5.17.0-rc4-next-20220217+ #469
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_address_description.constprop.0.cold+0xab/0x381
? blk_dropped_read+0x89/0x100
? blk_dropped_read+0x89/0x100
kasan_report.cold+0x83/0xdf
? blk_dropped_read+0x89/0x100
kasan_check_range+0x140/0x1b0
blk_dropped_read+0x89/0x100
? blk_create_buf_file_callback+0x20/0x20
? kmem_cache_free+0xa1/0x500
? do_sys_openat2+0x258/0x460
full_proxy_read+0x8f/0xc0
vfs_read+0xc6/0x260
ksys_read+0xb9/0x150
? vfs_write+0x3d0/0x3d0
? fpregs_assert_state_consistent+0x55/0x60
? exit_to_user_mode_prepare+0x39/0x1e0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fbc080d92fd
Code: ce 20 00 00 75 10 b8 00 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 1
RSP: 002b:00007fbb95ff9cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007fbb95ff9dc0 RCX: 00007fbc080d92fd
RDX: 0000000000000100 RSI: 00007fbb95ff9cc0 RDI: 0000000000000045
RBP: 0000000000000045 R08: 0000000000406299 R09: 00000000fffffffd
R10: 000000000153afa0 R11: 0000000000000293 R12: 00007fbb780008c0
R13: 00007fbb78000938 R14: 0000000000608b30 R15: 00007fbb780029c8
</TASK>
Allocated by task 1050:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
do_blk_trace_setup+0xcb/0x410
__blk_trace_setup+0xac/0x130
blk_trace_ioctl+0xe9/0x1c0
blkdev_ioctl+0xf1/0x390
__x64_sys_ioctl+0xa5/0xe0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 1050:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x103/0x180
kfree+0x9a/0x4c0
__blk_trace_remove+0x53/0x70
blk_trace_ioctl+0x199/0x1c0
blkdev_common_ioctl+0x5e9/0xb30
blkdev_ioctl+0x1a5/0x390
__x64_sys_ioctl+0xa5/0xe0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88816912f380
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 88 bytes inside of
96-byte region [ffff88816912f380, ffff88816912f3e0)
The buggy address belongs to the page:
page:000000009a1b4e7c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0f
flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0000200 ffffea00044f1100 dead000000000002 ffff88810004c780
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88816912f280: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88816912f300: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff88816912f380: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
^
ffff88816912f400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88816912f480: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================
Fixes:
|
||
|
|
6c3d4da8e7 |
ucounts: Fix systemd LimitNPROC with private users regression
commit 0ac983f512033cb7b5e210c9589768ad25b1e36b upstream.
Long story short, recursively enforcing RLIMIT_NPROC when it is not
enforced on the process that creates a new user namespace causes
currently working code to fail. There is no reason to enforce
RLIMIT_NPROC recursively when we don't enforce it normally so update
the code to detect this case.
I would like to simply use capable(CAP_SYS_RESOURCE) to detect when
RLIMIT_NPROC is not enforced upon the caller. Unfortunately because
RLIMIT_NPROC is charged and checked for enforcement based upon the
real uid, using capable() which is euid based is inconsistent with reality.
Come as close as possible to testing for capable(CAP_SYS_RESOURCE) by
testing for when the real uid would match the conditions when
CAP_SYS_RESOURCE would be present if the real uid was the effective
uid.
Reported-by: Etienne Dechamps <etienne@edechamps.fr>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215596
Link: https://lkml.kernel.org/r/e9589141-cfeb-90cd-2d0e-83a62787239a@edechamps.fr
Link: https://lkml.kernel.org/r/87sfs8jmpz.fsf_-_@email.froward.int.ebiederm.org
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
07058fb18d |
bpf: Fix possible race in inc_misses_counter
[ Upstream commit 0e3135d3bfa5dfb658145238d2bc723a8e30c3a3 ]
It seems inc_misses_counter() suffers from the same issue fixed in
commit d979617aa84d ("bpf: Fixes possible race in update_prog_stats()
for 32bit arches"):
As it can run while interrupts are enabled, it could
be re-entered and the u64_stats syncp could be mangled.
Fixes:
|
||
|
|
aa5040691c |
bpf: Use u64_stats_t in struct bpf_prog_stats
[ Upstream commit 61a0abaee2092eee69e44fe60336aa2f5b578938 ]
Commit
|
||
|
|
013c2af6c1 |
tracing/probes: check the return value of kstrndup() for pbuf
[ Upstream commit 1c1857d400355e96f0fe8b32adc6fa7594d03b52 ]
kstrndup() allocates memory and returns NULL when the allocation
fails. Check its return value so that the memory error is caught in
time.
Link: https://lkml.kernel.org/r/tencent_4D6E270731456EB88712ED7F13883C334906@qq.com
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes:
|
||
|
|
8a20fed48e |
tracing/uprobes: Check the return value of kstrdup() for tu->filename
[ Upstream commit 8c7224245557707c613f130431cafbaaa4889615 ]
kstrdup() returns NULL when the memory allocation fails. Check its
return value so that the memory error is caught in time.
Link: https://lkml.kernel.org/r/tencent_3C2E330722056D7891D2C83F29C802734B06@qq.com
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes:
|
||
|
|
628761fe05 |
tracing: Do not let synth_events block other dyn_event systems during create
[ Upstream commit 4f67cca70c0f615e9cfe6ac42244f3416ec60877 ]
synth_events is returning -EINVAL if the dyn_event create command does
not contain ' \t'. This prevents other systems from getting called back.
synth_events needs to return -ECANCELED in these cases when the command
is not targeting the synth_event system.
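A minimal Python model of the dispatch behaviour described above, not the kernel code. It assumes the create path offers the command to each registered system in turn and moves on only when a system returns -ECANCELED. The handler names and the `s:` and `u:` prefixes are hypothetical.

```python
import errno

def synth_create_old(cmd):
    # Before the fix: a command without ' ' or '\t' is rejected outright.
    if " " not in cmd and "\t" not in cmd:
        return -errno.EINVAL
    return 0 if cmd.startswith("s:") else -errno.ECANCELED

def synth_create_fixed(cmd):
    # After the fix: such a command is treated as "not mine" and passed on.
    if " " not in cmd and "\t" not in cmd:
        return -errno.ECANCELED
    return 0 if cmd.startswith("s:") else -errno.ECANCELED

def other_create(cmd):
    # Hypothetical second dyn_event system that accepts "u:" commands.
    return 0 if cmd.startswith("u:") else -errno.ECANCELED

def create_dyn_event(cmd, systems):
    # Only -ECANCELED lets the next system see the command.
    ret = -errno.ENODEV
    for create in systems:
        ret = create(cmd)
        if ret != -errno.ECANCELED:
            break
    return ret
```

In this model, a whitespace-free command meant for the second system stops at the old synth handler with -EINVAL, while the fixed handler lets it through.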
Link: https://lore.kernel.org/linux-trace-devel/20210930223821.11025-1-beaub@linux.microsoft.com
Fixes:
|
||
|
|
7f361266e9 |
signal: In get_signal test for signal_group_exit every time through the loop
[ Upstream commit e7f7c99ba911f56bc338845c1cd72954ba591707 ]
Recently while investigating a problem with rr and signals I noticed
that siglock is dropped in ptrace_signal and get_signal does not jump
to relock.
Looking farther to see if the problem is anywhere else I see that
do_signal_stop also returns if signal_group_exit is true. I believe
that test can now never be true, but it is a bit hard to trace
through and be certain.
Testing signal_group_exit is not expensive, so move the test for
signal_group_exit into the for loop inside of get_signal to ensure
the test is never skipped improperly.
This has been a potential problem since the test for
signal_group_exit was added.
Fixes:
|
||
|
|
33e22b6c53 |
tracing: Add ustring operation to filtering string pointers
[ Upstream commit f37c3bbc635994eda203a6da4ba0f9d05165a8d6 ]
Since referencing user space pointers is special, if the user wants to
filter on a field that is a pointer to user space, then they need to
specify it.
Add a ".ustring" attribute to the field name for filters to state that
the field is pointing to user space such that the kernel can take the
appropriate action to read that pointer.
Link: https://lore.kernel.org/all/yt9d8rvmt2jq.fsf@linux.ibm.com/
Fixes: 77360f9bbc7e ("tracing: Add test for user space strings when filtering on string pointers")
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
e0bcd6b577 |
sched/fair: Fix fault in reweight_entity
[ Upstream commit 13765de8148f71fa795e0a6607de37c49ea5915a ]
Syzbot found a GPF in reweight_entity. This has been bisected to
commit 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid
sched_task_group")
There is a race between sched_post_fork() and setpriority(PRIO_PGRP)
within a thread group that causes a null-ptr-deref in
reweight_entity() in CFS. The scenario is that the main process spawns
a number of new threads, which then call setpriority(PRIO_PGRP, 0, -20),
wait, and exit. For each of the new threads the copy_process() gets
invoked, which adds the new task_struct and calls sched_post_fork()
for it.
In the above scenario there is a possibility that
setpriority(PRIO_PGRP) and set_one_prio() will be called for a thread
in the group that is just being created by copy_process(), and for
which the sched_post_fork() has not been executed yet. This will
trigger a null pointer dereference in reweight_entity(), as it will
try to access the run queue pointer, which hasn't been set.
Before the mentioned change the cfs_rq pointer for the task has been
set in sched_fork(), which is called much earlier in copy_process(),
before the new task is added to the thread_group. Now it is done in
the sched_post_fork(), which is called after that. To fix the issue,
remove the update_load param from the set_load_weight() function
and call reweight_task() only if the task flag doesn't have the
TASK_NEW flag set.
Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: syzbot+af7a719bc92395ee41b3@syzkaller.appspotmail.com
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220203161846.1160750-1-tadeusz.struk@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
60e6d58ef9 |
tracing: Add test for user space strings when filtering on string pointers
[ Upstream commit 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 ]
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_openat/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory might not even be loaded yet, and a SEGFAULT could happen as
well. The same could be true for kernel space accesses.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicking the machine, and that can be solved later.
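A small Python model of the selection logic described above, offered as a sketch rather than the kernel implementation. The `TASK_SIZE` value, the two address maps, and the helper names are assumptions made for illustration. The substring test stands in for the real filter match operator.

```python
TASK_SIZE = 0x0000_7FFF_FFFF_F000  # assumed user/kernel boundary for this model

# Hypothetical address spaces: address -> string stored at that address.
USER_MEM = {0x7FFF_AAE8_EF60: "/sys/devices/system/cpu"}
KERNEL_MEM = {0xFFFF_8FC1_C02D_601C: "cpufreq"}

def copy_nofault(mem, addr):
    # Models the *_nofault() copy helpers: returns None instead of faulting.
    return mem.get(addr)

def fetch_string(addr):
    # Pick the user or kernel copy routine by comparing against TASK_SIZE.
    if addr < TASK_SIZE:
        return copy_nofault(USER_MEM, addr)
    return copy_nofault(KERNEL_MEM, addr)

def filter_match(addr, needle):
    s = fetch_string(addr)
    if s is None:
        return False  # unreadable string: the filter simply fails
    return needle in s
```

An address that cannot be read makes the filter fail to match instead of crashing, which is the trade-off the commit message argues for.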
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Reported-by: Pingfan Liu <kernelfans@gmail.com>
Tested-by: Pingfan Liu <kernelfans@gmail.com>
Fixes:
|
||
|
|
9200ee8c6e |
Merge remote-tracking branch into HEAD
* keystone/mirror-android13-5.15:
  ANDROID: sched: update is_cpu_allowed tracehook
  ANDROID: tracing: fix register tracing spam on memcpy
  ANDROID: dma-direct: Document disable_dma32
  ANDROID: dma-direct: Make DMA32 disablement work for CONFIG_NUMA
Signed-off-by: keystone-kernel-automerger <keystone-kernel-automerger@google.com>
Change-Id: I06cc57a24cb8a5d29f29757cb522717ef2460380
|
||
|
|
4345c3db84 |
ANDROID: sched: update is_cpu_allowed tracehook
Currently, the trace hook for is_cpu_allowed only executes if the task
is not a kthread. Modules need to be able to reject cpus regardless of
whether the task is a kthread or not. Modules also need to have the
flexibility to execute, or not, the remainder of is_cpu_allowed.
Move the tracepoint for is_cpu_allowed so that it is invoked regardless
of the task's kthread status, but do not interfere with per-cpu-kthread
cpu assignment.
Bug: 222550772
Change-Id: Ide48a82a33129448bb22be28814267b0b76535a2
Signed-off-by: Stephen Dickey <quic_dickey@quicinc.com>
|
||
|
|
24479cdbb7 |
Merge remote-tracking branch into HEAD
* keystone/mirror-android13-5.15:
  UPSTREAM: dma-buf: system_heap: Avoid warning on mid-order allocations
  FROMGIT: bpf: Add config to allow loading modules with BTF mismatches
  UPSTREAM: sched: Fix yet more sched_fork() races
  UPSTREAM: sched/fair: Fix fault in reweight_entity
  ANDROID: Update QCOM symbol list
  ANDROID: gki_defconfig: Enable powercap framework
  ANDROID: KVM: arm64: Ignore length of 0 in kvm_flush_dcache_to_poc()
Signed-off-by: keystone-kernel-automerger <keystone-kernel-automerger@google.com>
Change-Id: Id0e75935478ad96af456fa9897083a7d8222194c
|
||
|
|
4632fda82b |
ANDROID: dma-direct: Make DMA32 disablement work for CONFIG_NUMA
zone_dma32_is_empty() currently lacks the proper validation to ensure
that the NUMA node ID it receives as an argument is valid. This has no
effect on kernels with CONFIG_NUMA=n as NODE_DATA() will return the
same pglist_data on these devices, but on kernels with CONFIG_NUMA=y,
this is not the case, and the node passed to NODE_DATA must be
validated.
Rather than trying to find the node containing ZONE_DMA32, replace
calls of zone_dma32_is_empty() with zone_dma32_are_empty() (which
iterates over all nodes and returns false if one of the nodes holds
DMA32 and it is non-empty).
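The all-nodes check described above can be modelled in a few lines of Python. This is only a sketch: each node is represented as a hypothetical dict mapping zone names to page counts, which is not how the kernel stores this data.

```python
def zone_dma32_are_empty(nodes):
    """Return False as soon as any node holds a non-empty ZONE_DMA32."""
    for zones in nodes:
        # A node without ZONE_DMA32 counts as empty for this check.
        if zones.get("ZONE_DMA32", 0) > 0:
            return False
    return True
```

Because the function walks every node, the caller never has to supply a node ID, so there is nothing to validate.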
Bug: 199917449
Fixes: c3c2bb34ac8f ("ANDROID: arm64/mm: Add command line option to make ZONE_DMA32 empty")
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Change-Id: I850fb9213b71a1ef29106728bfda0cc6de46fdbb
|
||
|
|
9000406481 |
tracing: Have traceon and traceoff trigger honor the instance
commit 302e9edd54985f584cfc180098f3554774126969 upstream.
If a trigger is set on an event to disable or enable tracing within an
instance, then tracing should be disabled or enabled in the instance and
not at the top level, which is confusing to users.
Link: https://lkml.kernel.org/r/20220223223837.14f94ec3@rorschach.local.home
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
7e35b31e2c |
tracing: Dump stacktrace trigger to the corresponding instance
commit ce33c845b030c9cf768370c951bc699470b09fa7 upstream.
The stacktrace event trigger is not dumping the stacktrace to the instance
where it was enabled, but to the global "instance."
Use the private_data, pointing to the trigger file, to figure out the
corresponding trace instance, and use it in the trigger action, like
snapshot_trigger does.
Link: https://lkml.kernel.org/r/afbb0b4f18ba92c276865bc97204d438473f4ebc.1645396236.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Fixes:
|
||
|
|
8628f489b7 |
bpf: Add schedule points in batch ops
commit 75134f16e7dd0007aa474b281935c5f42e79f2c8 upstream.
syzbot reported various soft lockups caused by bpf batch operations.
INFO: task kworker/1:1:27 blocked for more than 140 seconds.
INFO: task hung in rcu_barrier
Nothing prevents batch ops from processing a huge amount of data, so
we need to add schedule points in them.
Note that maybe_wait_bpf_programs(map) calls from
generic_map_delete_batch() can be factorized by moving the call after
the loop.
This will be done later in the -next tree once we get this fix merged,
unless there is a strong opinion in favor of doing this optimization
sooner.
Fixes:
|
||
|
|
ebeb7b7357 |
cgroup-v1: Correct privileges check in release_agent writes
commit 467a726b754f474936980da793b4ff2ec3e382a7 upstream.
The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.
The commit 24f600856418 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid a possible confused deputy, the capability of the
opener must be checked.
Fixes: 24f600856418 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa@cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
ffed0bf6a6 |
cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
commit 05c7b7a92cc87ff8d7fde189d0fade250697573c upstream.
As previously discussed (https://lkml.org/lkml/2022/1/20/51),
cpuset_attach() is affected by a similar cpu hotplug race,
as in the following scenario:
    cpuset_attach()                              cpu hotplug
    ---------------------------                  ----------------------
    down_write(cpuset_rwsem)
    guarantee_online_cpus() // (load cpus_attach)
                                                 sched_cpu_deactivate
                                                   set_cpu_active()
                                                   // will change cpu_active_mask
    set_cpus_allowed_ptr(cpus_attach)
      __set_cpus_allowed_ptr_locked()
       // (if the intersection of cpus_attach and
         cpu_active_mask is empty, will return -EINVAL)
    up_write(cpuset_rwsem)
To avoid races such as described above, protect cpuset_attach() call
with cpu_hotplug_lock.
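The interleaving above can be replayed deterministically in a small Python model. This is an illustrative sketch, not kernel code: the CPU masks are plain sets, and the `hotplug_cpu` and `hold_hotplug_lock` parameters are hypothetical knobs that decide whether the deactivation lands inside the window or is deferred until the lock is released.

```python
import errno

def set_cpus_allowed(new_mask, cpu_active_mask):
    # Empty intersection with the active mask is reported as -EINVAL.
    if not (new_mask & cpu_active_mask):
        return -errno.EINVAL
    return 0

def cpuset_attach(cpuset_cpus, active, hotplug_cpu=None, hold_hotplug_lock=False):
    pending = None
    cpus_attach = cpuset_cpus & active       # load cpus_attach
    if hotplug_cpu is not None:
        if hold_hotplug_lock:
            pending = hotplug_cpu            # hotplug must wait for the lock
        else:
            active.discard(hotplug_cpu)      # deactivation runs inside the window
    ret = set_cpus_allowed(cpus_attach, active)
    if pending is not None:
        active.discard(pending)              # hotplug proceeds after unlock
    return ret
```

Without the lock the attach fails because the only CPU in `cpus_attach` has already left the active mask. With the lock held, the deactivation is ordered after the attach completes.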
Fixes:
|
||
|
|
b25a6a78d4 |
FROMGIT: bpf: Add config to allow loading modules with BTF mismatches
BTF mismatch can occur for a separately-built module even when the ABI
is otherwise compatible and nothing else would prevent successfully
loading.
Add a new Kconfig to control how mismatches are handled. By default,
preserve the current behavior of refusing to load the module. If
MODULE_ALLOW_BTF_MISMATCH is enabled, load the module but ignore its
BTF information.
Suggested-by: Yonghong Song <yhs@fb.com>
Suggested-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/CAADnVQJ+OVPnBz8z3vNu8gKXX42jCUqfuvhWAyCQDu8N_yqqwQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20220223012814.1898677-1-connoro@google.com
(cherry picked from commit 5e214f2e43e453d862ebbbd2a4f7ee3fe650f209
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master)
Bug: 218515241
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Idabf7f5e38cb58da55faeaafae56dee7262a6886
|
||
|
|
42da9cb956 |
UPSTREAM: sched: Fix yet more sched_fork() races
Where commit 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an
invalid sched_task_group") fixed a fork race vs cgroup, it opened up a
race vs syscalls by not placing the task on the runqueue before it
gets exposed through the pidhash.
Commit 13765de8148f ("sched/fair: Fix fault in reweight_entity") is
trying to fix a single instance of this, instead fix the whole class
of issues, effectively reverting this commit.
Change-Id: I4d34311eac28b23ee32e9308a21c66afe8fa8a3b
Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net
BUG: 221850698
(cherry picked from commit b1e8206582f9d680cff7d04828708c8b6ab32957)
Signed-off-by: Ashay Jaiswal <quic_ashayj@quicinc.com>
|
||
|
|
8ab19855fc |
UPSTREAM: sched/fair: Fix fault in reweight_entity
Syzbot found a GPF in reweight_entity. This has been bisected to
commit 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid
sched_task_group")
There is a race between sched_post_fork() and setpriority(PRIO_PGRP)
within a thread group that causes a null-ptr-deref in
reweight_entity() in CFS. The scenario is that the main process spawns
a number of new threads, which then call setpriority(PRIO_PGRP, 0, -20),
wait, and exit. For each of the new threads the copy_process() gets
invoked, which adds the new task_struct and calls sched_post_fork()
for it.
In the above scenario there is a possibility that
setpriority(PRIO_PGRP) and set_one_prio() will be called for a thread
in the group that is just being created by copy_process(), and for
which the sched_post_fork() has not been executed yet. This will
trigger a null pointer dereference in reweight_entity(), as it will
try to access the run queue pointer, which hasn't been set.
Before the mentioned change the cfs_rq pointer for the task has been
set in sched_fork(), which is called much earlier in copy_process(),
before the new task is added to the thread_group. Now it is done in
the sched_post_fork(), which is called after that. To fix the issue,
remove the update_load param from the set_load_weight() function
and call reweight_task() only if the task flag doesn't have the
TASK_NEW flag set.
Change-Id: I22d5b9d0b06cd85f0f02446b1e8a2389935cffa8
Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: syzbot+af7a719bc92395ee41b3@syzkaller.appspotmail.com
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220203161846.1160750-1-tadeusz.struk@linaro.org
BUG: 221850698
(cherry picked from commit 13765de8148f71fa795e0a6607de37c49ea5915a)
Signed-off-by: Ashay Jaiswal <quic_ashayj@quicinc.com>
|
||
|
|
022f403327 |
Merge remote-tracking branch into HEAD
* keystone/mirror-android13-5.15: (197 commits)
  ANDROID: dm-bow: remove dm-bow
  ANDROID: align constness of extcon_get_state parameter
  Linux 5.15.25
  lockdep: Correct lock_classes index mapping
  i2c: brcmstb: fix support for DSL and CM variants
  ice: enable parsing IPSEC SPI headers for RSS
  scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp()
  copy_process(): Move fd_install() out of sighand->siglock critical section
  dmaengine: ptdma: Fix the error handling path in pt_core_init()
  i2c: qcom-cci: don't put a device tree node before i2c_add_adapter()
  i2c: qcom-cci: don't delete an unregistered adapter
  tests: fix idmapped mount_setattr test
  dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size
  dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe
  dmaengine: sh: rcar-dmac: Check for error num after setting mask
  net: sched: limit TC_ACT_REPEAT loops
  ucounts: Move RLIMIT_NPROC handling after set_user
  rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user
  lib/iov_iter: initialize "flags" in new pipe_buffer
  ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
  ...
Signed-off-by: keystone-kernel-automerger <keystone-kernel-automerger@google.com>
Change-Id: Ie9769054e64a7f9797180107b4f0da3f21df5fc0
|
||
|
|
2ded03fd7c |
Merge 5.15.25 into android13-5.15
Changes in 5.15.25 drm/nouveau/pmu/gm200-: use alternate falcon reset sequence fs/proc: task_mmu.c: don't read mapcount for migration entry btrfs: zoned: cache reported zone during mount scsi: lpfc: Fix mailbox command failure during driver initialization HID:Add support for UGTABLET WP5540 Revert "svm: Add warning message for AVIC IPI invalid target" parisc: Show error if wrong 32/64-bit compiler is being used serial: parisc: GSC: fix build when IOSAPIC is not set parisc: Drop __init from map_pages declaration parisc: Fix data TLB miss in sba_unmap_sg parisc: Fix sglist access in ccio-dma.c mmc: block: fix read single on recovery logic mm: don't try to NUMA-migrate COW pages that have other uses HID: amd_sfh: Add illuminance mask to limit ALS max value HID: i2c-hid: goodix: Fix a lockdep splat HID: amd_sfh: Increase sensor command timeout HID: amd_sfh: Correct the structure field name PCI: hv: Fix NUMA node assignment when kernel boots with custom NUMA topology parisc: Add ioread64_lo_hi() and iowrite64_lo_hi() btrfs: send: in case of IO error log it platform/x86: touchscreen_dmi: Add info for the RWC NANOTE P8 AY07J 2-in-1 platform/x86: ISST: Fix possible circular locking dependency detected kunit: tool: Import missing importlib.abc selftests: rtc: Increase test timeout so that all tests run kselftest: signal all child processes net: ieee802154: at86rf230: Stop leaking skb's selftests/zram: Skip max_comp_streams interface on newer kernel selftests/zram01.sh: Fix compression ratio calculation selftests/zram: Adapt the situation that /dev/zram0 is being used selftests: openat2: Print also errno in failure messages selftests: openat2: Add missing dependency in Makefile selftests: openat2: Skip testcases that fail with EOPNOTSUPP selftests: skip mincore.check_file_mmap when fs lacks needed support ax25: improve the incomplete fix to avoid UAF and NPD bugs pinctrl: bcm63xx: fix unmet dependency on REGMAP for GPIO_REGMAP vfs: make freeze_super abort when 
sync_filesystem returns error quota: make dquot_quota_sync return errors from ->sync_fs scsi: pm80xx: Fix double completion for SATA devices kselftest: Fix vdso_test_abi return status scsi: core: Reallocate device's budget map on queue depth change scsi: pm8001: Fix use-after-free for aborted TMF sas_task scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task drm/amd: Warn users about potential s0ix problems nvme: fix a possible use-after-free in controller reset during load nvme-tcp: fix possible use-after-free in transport error_recovery work nvme-rdma: fix possible use-after-free in transport error_recovery work net: sparx5: do not refer to skb after passing it on drm/amd: add support to check whether the system is set to s3 drm/amd: Only run s3 or s0ix if system is configured properly drm/amdgpu: fix logic inversion in check x86/Xen: streamline (and fix) PV CPU enumeration Revert "module, async: async_synchronize_full() on module init iff async is used" gcc-plugins/stackleak: Use noinstr in favor of notrace random: wake up /dev/random writers after zap KVM: x86/xen: Fix runstate updates to be atomic when preempting vCPU KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case KVM: x86: nSVM: fix potential NULL derefernce on nested migration KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state iwlwifi: fix use-after-free drm/radeon: Fix backlight control on iMac 12,1 drm/atomic: Don't pollute crtc_state->mode_blob with error pointers drm/amd/pm: correct the sequence of sending gpu reset msg drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix. 
drm/i915/opregion: check port number bounds for SWSCI display power state drm/i915: Fix dbuf slice config lookup drm/i915: Fix mbus join config lookup vsock: remove vsock from connected table when connect is interrupted by a signal drm/cma-helper: Set VM_DONTEXPAND for mmap drm/i915/gvt: Make DRM_I915_GVT depend on X86 drm/i915/ttm: tweak priority hint selection iwlwifi: pcie: fix locking when "HW not ready" iwlwifi: pcie: gen2: fix locking when "HW not ready" iwlwifi: mvm: don't send SAR GEO command for 3160 devices selftests: netfilter: fix exit value for nft_concat_range netfilter: nft_synproxy: unregister hooks on init error path selftests: netfilter: disable rp_filter on router ipv4: fix data races in fib_alias_hw_flags_set ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt ipv6: mcast: use rcu-safe version of ipv6_get_lladdr() ipv6: per-netns exclusive flowlabel checks Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname" mac80211: mlme: check for null after calling kmemdup brcmfmac: firmware: Fix crash in brcm_alt_fw_path cfg80211: fix race in netlink owner interface destruction net: dsa: lan9303: fix reset on probe net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN net: dsa: lantiq_gswip: fix use after free in gswip_remove() net: dsa: lan9303: handle hwaccel VLAN tags net: dsa: lan9303: add VLAN IDs to master device net: ieee802154: ca8210: Fix lifs/sifs periods ping: fix the dif and sdif check in ping_lookup bonding: force carrier update when releasing slave drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit net_sched: add __rcu annotation to netdev->qdisc bonding: fix data-races around agg_select_timer libsubcmd: Fix use-after-free for realloc(..., 0) net/smc: Avoid overwriting the copies of clcsock callback functions net: phy: mediatek: remove PHY mode check on MT7531 atl1c: fix tx timeout after link flap on Mikrotik 10/25G NIC tipc: fix wrong publisher node address in link 
publications dpaa2-switch: fix default return of dpaa2_switch_flower_parse_mirror_key dpaa2-eth: Initialize mutex used in one step timestamping path net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled perf bpf: Defer freeing string after possible strlen() on it selftests/exec: Add non-regular to TEST_GEN_PROGS arm64: Correct wrong label in macro __init_el2_gicv3 ALSA: usb-audio: revert to IMPLICIT_FB_FIXED_DEV for M-Audio FastTrack Ultra ALSA: hda/realtek: Add quirk for Legion Y9000X 2019 ALSA: hda/realtek: Fix deadlock by COEF mutex ALSA: hda: Fix regression on forced probe mask option ALSA: hda: Fix missing codec probe on Shenker Dock 15 ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw() ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_range() ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_sx() ASoC: ops: Fix stereo change notifications in snd_soc_put_xr_sx() cifs: fix set of group SID via NTSD xattrs powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE powerpc/lib/sstep: fix 'ptesync' build error mtd: rawnand: gpmi: don't leak PM reference in error path smb3: fix snapshot mount option tipc: fix wrong notification node addresses scsi: ufs: Remove dead code scsi: ufs: Fix a deadlock in the error handler ASoC: tas2770: Insert post reset delay ASoC: qcom: Actually clear DMA interrupt register for HDMI block/wbt: fix negative inflight counter when remove scsi device NFS: Remove an incorrect revalidation in nfs4_update_changeattr_locked() NFS: LOOKUP_DIRECTORY is also ok with symlinks NFS: Do not report writeback errors in nfs_getattr() tty: n_tty: do not look ahead for EOL character past the end of the buffer block: fix surprise removal for drivers calling blk_set_queue_dying mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe() mtd: parsers: qcom: Fix kernel panic on skipped partition mtd: parsers: qcom: Fix missing free for pparts in cleanup mtd: phram: Prevent divide 
by zero bug in phram_setup() mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status HID: elo: fix memory leak in elo_probe mtd: rawnand: ingenic: Fix missing put_device in ingenic_ecc_get Drivers: hv: vmbus: Fix memory leak in vmbus_add_channel_kobj KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id() KVM: x86/pmu: Don't truncate the PerfEvtSeln MSR when creating a perf event KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW ARM: OMAP2+: hwmod: Add of_node_put() before break ARM: OMAP2+: adjust the location of put_device() call in omapdss_init_of phy: usb: Leave some clocks running during suspend staging: vc04_services: Fix RCU dereference check phy: phy-mtk-tphy: Fix duplicated argument in phy-mtk-tphy irqchip/sifive-plic: Add missing thead,c900-plic match string x86/bug: Merge annotate_reachable() into _BUG_FLAGS() asm netfilter: conntrack: don't refresh sctp entries in closed state ksmbd: fix same UniqueId for dot and dotdot entries ksmbd: don't align last entry offset in smb2 query directory arm64: dts: meson-gx: add ATF BL32 reserved-memory region arm64: dts: meson-g12: add ATF BL32 reserved-memory region arm64: dts: meson-g12: drop BL32 region from SEI510/SEI610 pidfd: fix test failure due to stack overflow on some arches selftests: fixup build warnings in pidfd / clone3 tests mm: io_uring: allow oom-killer from io_uring_setup ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems" kconfig: let 'shell' return enough output for deep path names ata: libata-core: Disable TRIM on M88V29 soc: aspeed: lpc-ctrl: Block error printing on probe defer cases xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create drm/rockchip: dw_hdmi: Do not leave clock enabled in error case tracing: Fix tp_printk option related with tp_printk_stop_on_boot display/amd: decrease message verbosity about watermarks table failure drm/amd/display: Cap pflip irqs per max otg number drm/amd/display: fix yellow carp wm clamping net: usb: qmi_wwan: Add 
support for Dell DW5829e net: macb: Align the dma and coherent dma masks kconfig: fix failing to generate auto.conf scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop EDAC: Fix calculation of returned address and next offset in edac_align_ptr() ucounts: Handle wrapping in is_ucounts_overlimit ucounts: In set_cred_ucounts assume new->ucounts is non-NULL ucounts: Base set_cred_ucounts changes on the real user ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1 lib/iov_iter: initialize "flags" in new pipe_buffer rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user ucounts: Move RLIMIT_NPROC handling after set_user net: sched: limit TC_ACT_REPEAT loops dmaengine: sh: rcar-dmac: Check for error num after setting mask dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size tests: fix idmapped mount_setattr test i2c: qcom-cci: don't delete an unregistered adapter i2c: qcom-cci: don't put a device tree node before i2c_add_adapter() dmaengine: ptdma: Fix the error handling path in pt_core_init() copy_process(): Move fd_install() out of sighand->siglock critical section scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp() ice: enable parsing IPSEC SPI headers for RSS i2c: brcmstb: fix support for DSL and CM variants lockdep: Correct lock_classes index mapping Linux 5.15.25 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ib129a0e11f5e82d67563329a5de1b0aef1d87928 |
||
|
|
5dcc365697 |
lockdep: Correct lock_classes index mapping
commit 28df029d53a2fd80c1b8674d47895648ad26dcfb upstream.
A kernel exception was hit when trying to dump /proc/lockdep_chains after
lockdep reported "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!":
Unable to handle kernel paging request at virtual address 00054005450e05c3
...
00054005450e05c3] address between user and kernel address ranges
...
pc : [0xffffffece769b3a8] string+0x50/0x10c
lr : [0xffffffece769ac88] vsnprintf+0x468/0x69c
...
Call trace:
string+0x50/0x10c
vsnprintf+0x468/0x69c
seq_printf+0x8c/0xd8
print_name+0x64/0xf4
lc_show+0xb8/0x128
seq_read_iter+0x3cc/0x5fc
proc_reg_read_iter+0xdc/0x1d4
The cause of the problem is that the function lock_chain_get_class()
shifts the lock_classes index by 1, but the index doesn't need to be
shifted anymore since commit
|
||
|
|
795feafb72 |
copy_process(): Move fd_install() out of sighand->siglock critical section
commit ddc204b517e60ae64db34f9832dc41dafa77c751 upstream.
I was made aware of the following lockdep splat:
[ 2516.308763] =====================================================
[ 2516.309085] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
[ 2516.309433] 5.14.0-51.el9.aarch64+debug #1 Not tainted
[ 2516.309703] -----------------------------------------------------
[ 2516.310149] stress-ng/153663 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 2516.310512] ffff0000e422b198 (&newf->file_lock){+.+.}-{2:2}, at: fd_install+0x368/0x4f0
[ 2516.310944] and this task is already holding:
[ 2516.311248] ffff0000c08140d8 (&sighand->siglock){-.-.}-{2:2}, at: copy_process+0x1e2c/0x3e80
[ 2516.311804] which would create a new lock dependency:
[ 2516.312066] (&sighand->siglock){-.-.}-{2:2} -> (&newf->file_lock){+.+.}-{2:2}
[ 2516.312446] but this new dependency connects a HARDIRQ-irq-safe lock:
[ 2516.312983] (&sighand->siglock){-.-.}-{2:2}
:
[ 2516.330700] Possible interrupt unsafe locking scenario:
[ 2516.331075]        CPU0                    CPU1
[ 2516.331328]        ----                    ----
[ 2516.331580]   lock(&newf->file_lock);
[ 2516.331790]                                local_irq_disable();
[ 2516.332231]                                lock(&sighand->siglock);
[ 2516.332579]                                lock(&newf->file_lock);
[ 2516.332922]   <Interrupt>
[ 2516.333069]     lock(&sighand->siglock);
[ 2516.333291] *** DEADLOCK ***
[ 2516.389845] stack backtrace:
[ 2516.390101] CPU: 3 PID: 153663 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-51.el9.aarch64+debug #1
[ 2516.390756] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 2516.391155] Call trace:
[ 2516.391302] dump_backtrace+0x0/0x3e0
[ 2516.391518] show_stack+0x24/0x30
[ 2516.391717] dump_stack_lvl+0x9c/0xd8
[ 2516.391938] dump_stack+0x1c/0x38
[ 2516.392247] print_bad_irq_dependency+0x620/0x710
[ 2516.392525] check_irq_usage+0x4fc/0x86c
[ 2516.392756] check_prev_add+0x180/0x1d90
[ 2516.392988] validate_chain+0x8e0/0xee0
[ 2516.393215] __lock_acquire+0x97c/0x1e40
[ 2516.393449] lock_acquire.part.0+0x240/0x570
[ 2516.393814] lock_acquire+0x90/0xb4
[ 2516.394021] _raw_spin_lock+0xe8/0x154
[ 2516.394244] fd_install+0x368/0x4f0
[ 2516.394451] copy_process+0x1f5c/0x3e80
[ 2516.394678] kernel_clone+0x134/0x660
[ 2516.394895] __do_sys_clone3+0x130/0x1f4
[ 2516.395128] __arm64_sys_clone3+0x5c/0x7c
[ 2516.395478] invoke_syscall.constprop.0+0x78/0x1f0
[ 2516.395762] el0_svc_common.constprop.0+0x22c/0x2c4
[ 2516.396050] do_el0_svc+0xb0/0x10c
[ 2516.396252] el0_svc+0x24/0x34
[ 2516.396436] el0t_64_sync_handler+0xa4/0x12c
[ 2516.396688] el0t_64_sync+0x198/0x19c
[ 2517.491197] NET: Registered PF_ATMPVC protocol family
[ 2517.491524] NET: Registered PF_ATMSVC protocol family
[ 2591.991877] sched: RT throttling activated
One way to solve this problem is to move the fd_install() call out of
the sighand->siglock critical section.
Before commit
|
||
|
|
2b2be95b60 |
ucounts: Move RLIMIT_NPROC handling after set_user
commit c923a8e7edb010da67424077cbf1a6f1396ebd2e upstream.
During set*id() which cred->ucounts to charge the current process
to is not known until after set_cred_ucounts. So move the
RLIMIT_NPROC checking into a new helper flag_nproc_exceeded and call
flag_nproc_exceeded after set_cred_ucounts.
This is very much an arbitrary subset of the places where we currently
change the RLIMIT_NPROC accounting, designed to preserve the existing
logic.
Fixing the existing logic will be the subject of another series of
changes.
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220216155832.680775-4-ebiederm@xmission.com
Fixes:
|
||
|
|
b5f949d9a9 |
rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user
commit c16bdeb5a39ffa3f32b32f812831a2092d2a3061 upstream. Solar Designer <solar@openwall.com> wrote: > I'm not aware of anyone actually running into this issue and reporting > it. The systems that I personally know use suexec along with rlimits > still run older/distro kernels, so would not yet be affected. > > So my mention was based on my understanding of how suexec works, and > code review. Specifically, Apache httpd has the setting RLimitNPROC, > which makes it set RLIMIT_NPROC: > > https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc > > The above documentation for it includes: > > "This applies to processes forked from Apache httpd children servicing > requests, not the Apache httpd children themselves. This includes CGI > scripts and SSI exec commands, but not any processes forked from the > Apache httpd parent, such as piped logs." > > In code, there are: > > ./modules/generators/mod_cgid.c: ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, > ./modules/generators/mod_cgi.c: ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, > ./modules/filters/mod_ext_filter.c: rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc); > > For example, in mod_cgi.c this is in run_cgi_child(). > > I think this means an httpd child sets RLIMIT_NPROC shortly before it > execs suexec, which is a SUID root program. suexec then switches to the > target user and execs the CGI script. > > Before |
||
|
|
2d2d92cfcd |
ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
commit 8f2f9c4d82f24f172ae439e5035fc1e0e4c229dd upstream.
Michal Koutný <mkoutny@suse.com> wrote:
> It was reported that v5.14 behaves differently when enforcing
> RLIMIT_NPROC limit, namely, it allows one more task than previously.
> This is consequence of the commit
|
||
|
|
efc853d8ff |
ucounts: Base set_cred_ucounts changes on the real user
commit a55d07294f1e9b576093bdfa95422f8119941e83 upstream. Michal Koutný <mkoutny@suse.com> wrote: > Tasks are associated to multiple users at once. Historically and as per > setrlimit(2) RLIMIT_NPROC is enforce based on real user ID. > > The commit |
||
|
|
f418bfabea |
ucounts: In set_cred_ucounts assume new->ucounts is non-NULL
commit 99c31f9feda41d0f10d030dc04ba106c93295aa2 upstream.
Any cred that is destined for use by commit_creds must have a non-NULL
cred->ucounts field. Only during credential construction is a NULL
cred->ucounts valid. Only abort_creds, put_cred, and put_cred_rcu
need to deal with a cred with a NULL ucounts. As set_cred_ucounts is
none of those cases, don't confuse people by handling something that
cannot happen.
Link: https://lkml.kernel.org/r/871r4irzds.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
d464492eb3 |
ucounts: Handle wrapping in is_ucounts_overlimit
commit 0cbae9e24fa7d6c6e9f828562f084da82217a0c5 upstream.
While examining is_ucounts_overlimit and reading the various messages
I realized that is_ucounts_overlimit fails to deal with counts that
may have wrapped.
Being wrapped should be a transitory state for counts and they should
never be wrapped for long, but it can happen so handle it.
Cc: stable@vger.kernel.org
Fixes:
|
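The wrap handling described in the entry above can be illustrated with a small userspace sketch. It is not the kernel code: `struct counter_level`, its fields, and `counter_overlimit()` are hypothetical names chosen for illustration. The assumption is that counts are kept in a signed `long`, so a count that has wrapped shows up as a negative value and should be treated as over the limit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one level of a nested accounting hierarchy. */
struct counter_level {
	long value;                    /* current count; negative means it wrapped */
	long parent_max;               /* limit that applies to the parent level */
	struct counter_level *parent;  /* NULL at the top of the hierarchy */
};

/*
 * Returns true if any level from `level` upwards exceeds its limit.
 * A negative value is reported as over the limit, because a signed
 * counter pushed past LONG_MAX appears negative.
 */
bool counter_overlimit(const struct counter_level *level, unsigned long rlimit)
{
	/* Clamp the unsigned limit so it can be compared with a signed count. */
	long max = rlimit > (unsigned long)LONG_MAX ? LONG_MAX : (long)rlimit;

	assert(level != NULL);
	for (; level != NULL; level = level->parent) {
		if (level->value < 0 || level->value > max)
			return true;
		max = level->parent_max;
	}
	return false;
}
```

Without the `value < 0` test, a wrapped count would compare as smaller than any positive limit and slip through the check.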
||
|
|
eb61dbb192 |
tracing: Fix tp_printk option related with tp_printk_stop_on_boot
[ Upstream commit 3203ce39ac0b2a57a84382ec184c7d4a0bede175 ]
The kernel parameter "tp_printk_stop_on_boot" starts with "tp_printk" which is
the same as another kernel parameter "tp_printk". If "tp_printk" setup is
called before the "tp_printk_stop_on_boot", it will override the latter
and keep it from being set.
This is similar to other kernel parameter issues, such as:
Commit
|
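A minimal sketch of the prefix problem described in the entry above, assuming that parameter handlers are matched by name prefix and are handed the remainder of the string. The table, the handler names, and the flags below are hypothetical and only model the behaviour. The guard shown, declining when the remainder starts with `_`, is one way to let the longer parameter reach its own handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool tp_printk_enabled;     /* hypothetical flag for "tp_printk" */
static bool stop_on_boot_enabled;  /* hypothetical flag for "tp_printk_stop_on_boot" */

/* Handler for "tp_printk"; `rest` is whatever followed the matched prefix. */
static bool handle_tp_printk(const char *rest)
{
	/* A leading '_' means a longer parameter name was matched: decline. */
	if (rest[0] == '_')
		return false;
	tp_printk_enabled = true;
	return true;
}

static bool handle_stop_on_boot(const char *rest)
{
	(void)rest;
	stop_on_boot_enabled = true;
	return true;
}

struct param {
	const char *name;
	bool (*handler)(const char *rest);
};

/* "tp_printk" is listed first on purpose: the ordering that exposes the problem. */
static const struct param params[] = {
	{ "tp_printk", handle_tp_printk },
	{ "tp_printk_stop_on_boot", handle_stop_on_boot },
};

/* Dispatch one argument to the first prefix-matching handler that accepts it. */
bool parse_param(const char *arg)
{
	assert(arg != NULL);
	for (size_t i = 0; i < sizeof(params) / sizeof(params[0]); i++) {
		size_t n = strlen(params[i].name);

		if (strncmp(arg, params[i].name, n) == 0 && params[i].handler(arg + n))
			return true;
	}
	return false;
}
```

If the `_` check is removed from `handle_tp_printk()`, the argument `tp_printk_stop_on_boot` is consumed by the shorter handler and the second flag is never set.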
||
|
|
effdcc2505 |
gcc-plugins/stackleak: Use noinstr in favor of notrace
[ Upstream commit dcb85f85fa6f142aae1fe86f399d4503d49f2b60 ]
While the stackleak plugin was already using notrace, objtool is now a
bit more picky. Update the notrace uses to noinstr. Silences the
following objtool warnings when building with:
CONFIG_DEBUG_ENTRY=y
CONFIG_STACK_VALIDATION=y
CONFIG_VMLINUX_VALIDATION=y
CONFIG_GCC_PLUGIN_STACKLEAK=y
vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section
Note that the plugin's addition of calls to stackleak_track_stack()
from noinstr functions is expected to be safe, as it isn't runtime
instrumentation and is self-contained.
Cc: Alexander Popov <alex.popov@linux.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
0a01326fdd |
Revert "module, async: async_synchronize_full() on module init iff async is used"
[ Upstream commit 67d6212afda218d564890d1674bab28e8612170f ] This reverts commit |
||
|
|
59c50c39bb |
Merge remote-tracking branch into HEAD
* keystone/mirror-android13-5.15: UPSTREAM: KVM: arm64: vgic: Read HW interrupt pending state from the HW ANDROID: KVM: arm64: Forward PSCI SYSTEM_RESET2 calls back to the host FROMLIST: BACKPORT: KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field FROMLIST: KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest FROMLIST: KVM: arm64: Bump guest PSCI version to 1.1 UPSTREAM: nl80211: don't kfree() ERR_PTR() value UPSTREAM: dma-buf: cma_heap: Fix mutex locking section ANDROID: Add a vendor hook that allow a module to modify the wake flag ANDROID: gki_defconfig: Enable CONFIG_RANDOM_TRUST_CPU=y ANDROID: KVM: arm64: Don't repaint PSCI SYSTEM_RESET to SYSTEM_OFF Signed-off-by: keystone-kernel-automerger <keystone-kernel-automerger@google.com> Change-Id: I9ec7fbd73f68f22d7187f0e3482749c3297f3820 |
||
|
|
f79e49085d |
ANDROID: Add a vendor hook that allow a module to modify the wake flag
android_vh_do_wake_up_sync: To modify the mode value of __wake_up_sync_key android_vh_set_wake_flags: To modify the wake flag from a module Bug: 181743516 Signed-off-by: Namkyu Kim <namkyu78.kim@samsung.com> Change-Id: I972e2469c3f139373d21f1e8c85974763388a693 (cherry picked from commit 97368fc2dcc29777e8d3d637d0afdef90e611763) (cherry picked from commit 0d0f0c5020bc425c9a51c8d17b16ca831c2598fb) |
||
|
|
9b9237d43d |
Merge remote-tracking branch into HEAD
* keystone/mirror-android13-5.15: (175 commits) ANDROID: Update comment in build.config.gki.aarch64. ANDROID: Revert "tracefs: Have tracefs directories not set OTH permission bits by default" FROMGIT: mm: fix use-after-free when anon vma name is used after vma is freed Linux 5.15.24 iommu: Fix potential use-after-free during probe perf: Fix list corruption in perf_cgroup_switch() arm64: dts: imx8mq: fix lcdif port node MIPS: octeon: Fix missed PTR->PTR_WD conversion scsi: lpfc: Reduce log messages seen after firmware download scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabled Makefile.extrawarn: Move -Wunaligned-access to W=1 x86/sgx: Silence softlockup detection when releasing large enclaves hwmon: (dell-smm) Speed up setting of fan speed bus: mhi: pci_generic: Add mru_default for Cinterion MV31-W bus: mhi: pci_generic: Add mru_default for Foxconn SDX55 s390/cio: verify the driver availability for path_event call signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE seccomp: Invalidate seccomp mode to catch death failures mm: memcg: synchronize objcg lists with a dedicated spinlock iio: buffer: Fix file related error handling in IIO_BUFFER_GET_FD_IOCTL ... Signed-off-by: keystone-kernel-automerger <keystone-kernel-automerger@google.com> Change-Id: I8be2137ade838a7ed412f2e152bbe47816b08afe |
||
|
|
287cd0232c |
Merge 5.15.24 into android13-5.15
Changes in 5.15.24 integrity: check the return value of audit_log_start() ima: fix reference leak in asymmetric_verify() ima: Remove ima_policy file before directory ima: Allow template selection with ima_template[_fmt]= after ima_hash= ima: Do not print policy rule with inactive LSM labels mmc: sdhci-of-esdhc: Check for error num after setting mask mmc: core: Wait for command setting 'Power Off Notification' bit to complete can: isotp: fix potential CAN frame reception race in isotp_rcv() can: isotp: fix error path in isotp_sendmsg() to unlock wait queue net: phy: marvell: Fix RGMII Tx/Rx delays setting in 88e1121-compatible PHYs net: phy: marvell: Fix MDI-x polarity setting in 88e1118-compatible PHYs NFS: Fix initialisation of nfs_client cl_flags field NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes NFSD: Fix ia_size underflow NFSD: Clamp WRITE offsets NFSD: Fix offset type in I/O trace points NFSD: Fix the behavior of READ near OFFSET_MAX thermal/drivers/int340x: Improve the tcc offset saving for suspend/resume thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses thermal: int340x: Limit Kconfig to 64-bit thermal/drivers/int340x: Fix RFIM mailbox write commands tracing: Propagate is_signed to expression NFS: change nfs_access_get_cached to only report the mask NFSv4 only print the label when its queried nfs: nfs4clinet: check the return value of kstrdup() NFSv4.1: Fix uninitialised variable in devicenotify NFSv4 remove zero number of fs_locations entries error check NFSv4 store server support for fs_location attribute NFSv4.1 query for fs_location attr on a new file system NFSv4 expose nfs_parse_server_name function NFSv4 handle port presence in fs_location server string SUNRPC allow for unspecified transport time in rpc_clnt_add_xprt net/sunrpc: fix reference count leaks in rpc_sysfs_xprt_state_change sunrpc: Fix potential race conditions in rpc_sysfs_xprt_state_change() irqchip/realtek-rtl: Service all pending interrupts 
perf/x86/rapl: fix AMD event handling x86/perf: Avoid warning for Arch LBR without XSAVE sched: Avoid double preemption in __cond_resched_*lock*() drm/vc4: Fix deadlock on DSI device attach error drm: panel-orientation-quirks: Add quirk for the 1Netbook OneXPlayer net: sched: Clarify error message when qdisc kind is unknown powerpc/fixmap: Fix VM debug warning on unmap scsi: target: iscsi: Make sure the np under each tpg is unique scsi: ufs: ufshcd-pltfrm: Check the return value of devm_kstrdup() scsi: qedf: Add stag_work to all the vports scsi: qedf: Fix refcount issue when LOGO is received during TMF scsi: qedf: Change context reset messages to ratelimited scsi: pm8001: Fix bogus FW crash for maxcpus=1 scsi: ufs: Use generic error code in ufshcd_set_dev_pwr_mode() scsi: ufs: Treat link loss as fatal error scsi: myrs: Fix crash in error case net: stmmac: reduce unnecessary wakeups from eee sw timer PM: hibernate: Remove register_nosave_region_late() drm/amd/display: Correct MPC split policy for DCN301 usb: dwc2: gadget: don't try to disable ep0 in dwc2_hsotg_suspend perf: Always wake the parent event nvme-pci: add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs MIPS: Fix build error due to PTR used in more places net: stmmac: dwmac-sun8i: use return val of readl_poll_timeout() KVM: eventfd: Fix false positive RCU usage warning KVM: nVMX: eVMCS: Filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER KVM: nVMX: Also filter MSR_IA32_VMX_TRUE_PINBASED_CTLS when eVMCS KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode KVM: VMX: Set vmcs.PENDING_DBG.BS on #DB in STI/MOVSS blocking shadow KVM: x86: Report deprecated x87 features in supported CPUID riscv: fix build with binutils 2.38 riscv: cpu-hotplug: clear cpu from numa map when teardown riscv: eliminate unreliable __builtin_frame_address(1) gfs2: Fix gfs2_release for non-writers regression ARM: dts: imx23-evk: Remove MX23_PAD_SSP1_DETECT from hog group ARM: dts: Fix boot regression on Skomer ARM: 
socfpga: fix missing RESET_CONTROLLER nvme-tcp: fix bogus request completion when failing to send AER ACPI/IORT: Check node revision for PMCG resources PM: s2idle: ACPI: Fix wakeup interrupts handling drm/amdgpu/display: change pipe policy for DCN 2.0 drm/rockchip: vop: Correct RK3399 VOP register fields drm/i915: Allow !join_mbus cases for adlp+ dbuf configuration drm/i915: Populate pipe dbuf slices more accurately during readout ARM: dts: Fix timer regression for beagleboard revision c ARM: dts: meson: Fix the UART compatible strings ARM: dts: meson8: Fix the UART device-tree schema validation ARM: dts: meson8b: Fix the UART device-tree schema validation phy: broadcom: Kconfig: Fix PHY_BRCM_USB config option staging: fbtft: Fix error path in fbtft_driver_module_init() ARM: dts: imx6qdl-udoo: Properly describe the SD card detect phy: xilinx: zynqmp: Fix bus width setting for SGMII phy: stm32: fix a refcount leak in stm32_usbphyc_pll_enable() ARM: dts: imx7ulp: Fix 'assigned-clocks-parents' typo arm64: dts: imx8mq: fix mipi_csi bidirectional port numbers usb: f_fs: Fix use-after-free for epfile phy: dphy: Correct clk_pre parameter gpio: aggregator: Fix calling into sleeping GPIO controllers NFS: Don't overfill uncached readdir pages NFS: Don't skip directory entries when doing uncached readdir drm/vc4: hdmi: Allow DBLCLK modes even if horz timing is odd. 
misc: fastrpc: avoid double fput() on failed usercopy net: sparx5: Fix get_stat64 crash in tcpdump netfilter: ctnetlink: disable helper autoassign arm64: dts: meson-g12b-odroid-n2: fix typo 'dio2133' arm64: dts: meson-sm1-odroid: use correct enable-gpio pin for tf-io regulator arm64: dts: meson-sm1-bananapi-m5: fix wrong GPIO domain for GPIOE_2 arm64: dts: meson-sm1-odroid: fix boot loop after reboot ixgbevf: Require large buffers for build_skb on 82599VF drm/panel: simple: Assign data from panel_dpi_probe() correctly ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE gpiolib: Never return internal error codes to user space gpio: sifive: use the correct register to read output values fbcon: Avoid 'cap' set but not used warning bonding: pair enable_port with slave_arr_updates net: dsa: mv88e6xxx: don't use devres for mdiobus net: dsa: ar9331: register the mdiobus under devres net: dsa: bcm_sf2: don't use devres for mdiobus net: dsa: felix: don't use devres for mdiobus net: dsa: mt7530: fix kernel bug in mdiobus_free() when unbinding net: dsa: lantiq_gswip: don't use devres for mdiobus ipmr,ip6mr: acquire RTNL before calling ip[6]mr_free_table() on failure path nfp: flower: fix ida_idx not being released net: do not keep the dst cache when uncloning an skb dst and its metadata net: fix a memleak when uncloning an skb dst and its metadata veth: fix races around rq->rx_notify_masked net: mdio: aspeed: Add missing MODULE_DEVICE_TABLE tipc: rate limit warning for received illegal binding update net: amd-xgbe: disable interrupts during pci removal drm/amd/pm: fix hwmon node of power1_label create issue mptcp: netlink: process IPv6 addrs in creating listening sockets dpaa2-eth: unregister the netdev before disconnecting from the PHY ice: fix an error code in ice_cfg_phy_fec() ice: fix IPIP and SIT TSO offload ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler ice: Avoid RTNL lock when re-creating auxiliary device net: mscc: ocelot: fix mutex lock error during 
ethtool stats read net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregister vt_ioctl: fix array_index_nospec in vt_setactivate vt_ioctl: add array_index_nospec to VT_ACTIVATE n_tty: wake up poll(POLLRDNORM) on receiving data eeprom: ee1004: limit i2c reads to I2C_SMBUS_BLOCK_MAX usb: dwc2: drd: fix soft connect when gadget is unconfigured Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured" net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup usb: ulpi: Move of_node_put to ulpi_dev_release usb: ulpi: Call of_node_put correctly usb: dwc3: gadget: Prevent core from processing stale TRBs usb: gadget: udc: renesas_usb3: Fix host to USB_ROLE_NONE transition USB: gadget: validate interface OS descriptor requests usb: gadget: rndis: check size of RNDIS_MSG_SET command usb: gadget: f_uac2: Define specific wTerminalType usb: raw-gadget: fix handling of dual-direction-capable endpoints USB: serial: ftdi_sio: add support for Brainboxes US-159/235/320 USB: serial: option: add ZTE MF286D modem USB: serial: ch341: add support for GW Instek USB2.0-Serial devices USB: serial: cp210x: add NCR Retail IO box id USB: serial: cp210x: add CPI Bulk Coin Recycler id speakup-dectlk: Restore pitch setting phy: ti: Fix missing sentinel for clk_div_table iio: buffer: Fix file related error handling in IIO_BUFFER_GET_FD_IOCTL mm: memcg: synchronize objcg lists with a dedicated spinlock seccomp: Invalidate seccomp mode to catch death failures signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE s390/cio: verify the driver availability for path_event call bus: mhi: pci_generic: Add mru_default for Foxconn SDX55 bus: mhi: pci_generic: Add mru_default for Cinterion MV31-W hwmon: (dell-smm) Speed up setting of fan speed x86/sgx: Silence softlockup detection when releasing large enclaves Makefile.extrawarn: Move -Wunaligned-access to W=1 scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabled scsi: lpfc: Reduce log messages seen after firmware download 
MIPS: octeon: Fix missed PTR->PTR_WD conversion arm64: dts: imx8mq: fix lcdif port node perf: Fix list corruption in perf_cgroup_switch() iommu: Fix potential use-after-free during probe Linux 5.15.24 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ibe10e24eeda28e78c35f7656bc49cf11f58d858c |
||
|
|
7969fe91c9 |
perf: Fix list corruption in perf_cgroup_switch()
commit 5f4e5ce638e6a490b976ade4a40017b40abb2da0 upstream.
There's list corruption on cgrp_cpuctx_list. This happens on the
following path:
perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
cpu_ctx_sched_in
ctx_sched_in
ctx_pinned_sched_in
merge_sched_in
perf_cgroup_event_disable: remove the event from the list
Use list_for_each_entry_safe() to allow removing an entry during
iteration.
Fixes:
|
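The iteration hazard in the entry above can be shown with a small userspace sketch of the "safe" traversal pattern: the successor is saved before the loop body runs, so unlinking the current entry cannot derail the walk. The `struct node` type, the `disabled` flag, and `remove_disabled()` are hypothetical; the kernel's own list macros are not reproduced here, and in the real path the removal happens several calls below the loop rather than in the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on a list head. */
struct node {
	struct node *prev, *next;
	int disabled;  /* hypothetical flag: entry should be dropped */
};

static void node_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	/* Clear the links so a stale traversal through n fails loudly. */
	n->next = n->prev = NULL;
}

/*
 * Walk the list and unlink disabled entries. `tmp` always holds the
 * successor of `pos`, captured before the body can unlink `pos`.
 * Returns the number of entries removed.
 */
int remove_disabled(struct node *head)
{
	struct node *pos, *tmp;
	int removed = 0;

	assert(head != NULL);
	for (pos = head->next, tmp = pos->next; pos != head;
	     pos = tmp, tmp = pos->next) {
		if (pos->disabled) {
			node_unlink(pos);
			removed++;
		}
	}
	return removed;
}
```

A plain traversal would instead read `pos->next` after the body, at which point the unlinked entry no longer points into the list.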
||
|
|
56ca18dd54 |
signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
commit 5c72263ef2fbe99596848f03758ae2dc593adf2c upstream. Fatal SIGSYS signals (i.e. seccomp RET_KILL_* syscall filter actions) were not being delivered to ptraced pid namespace init processes. Make sure the SIGNAL_UNKILLABLE doesn't get set for these cases. Reported-by: Robert Święcki <robert@swiecki.net> Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com> Fixes: 00b06da29cf9 ("signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/lkml/878rui8u4a.fsf@email.froward.int.ebiederm.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
f7a56fcca2 |
seccomp: Invalidate seccomp mode to catch death failures
commit 495ac3069a6235bfdf516812a2a9b256671bbdf9 upstream.
If seccomp tries to kill a process, it should never see that process
again. To enforce this proactively, switch the mode to something
impossible. If encountered: WARN, reject all syscalls, and attempt to
kill the process again even harder.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Fixes:
|
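The idea in the entry above, switching to an impossible mode once a kill has been requested, can be sketched as a small state machine. Everything here is hypothetical (`enum filter_mode`, `struct task_state`, `check_syscall()`) and only models the described behaviour: warn, reject all syscalls, and attempt the kill again if the task is ever seen after it should have died.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum filter_mode {
	MODE_DISABLED,
	MODE_FILTER,
	MODE_DEAD,  /* hypothetical "impossible" mode, set once a kill was requested */
};

struct task_state {
	enum filter_mode mode;
	int kill_attempts;  /* how many times a kill was issued */
	int warnings;       /* how many unexpected re-entries were seen */
};

static void kill_task(struct task_state *t)
{
	t->kill_attempts++;
}

/*
 * Returns true if the syscall may proceed. `allowed_by_filter` stands in
 * for the verdict of the installed filter.
 */
bool check_syscall(struct task_state *t, bool allowed_by_filter)
{
	assert(t != NULL);
	switch (t->mode) {
	case MODE_DISABLED:
		return true;
	case MODE_FILTER:
		if (allowed_by_filter)
			return true;
		/* Invalidate the mode first so any later syscall is caught. */
		t->mode = MODE_DEAD;
		kill_task(t);
		return false;
	case MODE_DEAD:
	default:
		/* The task should never be seen again: warn, reject, kill again. */
		t->warnings++;
		kill_task(t);
		return false;
	}
}
```

The point of the extra mode is that a failed kill becomes visible and fails closed, instead of silently letting the task keep making syscalls under its old filter.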
||
|
|
0e546bb132 |
PM: s2idle: ACPI: Fix wakeup interrupts handling
commit cb1f65c1e1424a4b5e4a86da8aa3b8fd8459c8ec upstream. After commit |
||
|
|
a3486ef99a |
perf: Always wake the parent event
[ Upstream commit 961c39121759ad09a89598ec4ccdd34ae0468a19 ]
When using per-process mode and event inheritance is set to true,
forked processes will create a new perf events via inherit_event() ->
perf_event_alloc(). But these events will not have ring buffers
assigned to them. Any call to wakeup will be dropped if it's called on
an event with no ring buffer assigned because that's the object that
holds the wakeup list.
If the child event is disabled due to a call to
perf_aux_output_begin() or perf_aux_output_end(), the wakeup is dropped
leaving userspace hanging forever on the poll.
Normally the event is explicitly re-enabled by userspace after it
wakes up to read the aux data, but in this case it does not get woken
up so the event remains disabled.
This can be reproduced when using Arm SPE and 'stress' which forks once
before running the workload. By looking at the list of aux buffers
read, it's apparent that they stop after the fork:
perf record -e arm_spe// -vvv -- stress -c 1
With this patch applied they continue to be printed. This behaviour
doesn't happen when using systemwide or per-cpu mode.
Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211206113840.130802-2-james.clark@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
dc5769c7b0 |
PM: hibernate: Remove register_nosave_region_late()
[ Upstream commit 33569ef3c754a82010f266b7b938a66a3ccf90a4 ] It is an unused wrapper forcing kmalloc allocation for registering nosave regions. Also, rename __register_nosave_region() to register_nosave_region() now that there is no need for disambiguation. Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
85008bde41 |
sched: Avoid double preemption in __cond_resched_*lock*()
[ Upstream commit 7e406d1ff39b8ee574036418a5043c86723170cf ]
For PREEMPT/DYNAMIC_PREEMPT the *_unlock() will already trigger a
preemption, no point in then calling preempt_schedule_common()
*again*.
Use _cond_resched() instead, since this is a NOP for the preemptible
configs while it provides a preemption point for the others.
Reported-by: xuhaifeng <xuhaifeng@oppo.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YcGnvDEYBwOiV0cR@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
78c28fdf16 |
tracing: Propagate is_signed to expression
commit 097f1eefedeab528cecbd35586dfe293853ffb17 upstream.
During expression parsing, a new expression field is created which
should inherit the properties of the operands, such as size and
is_signed.
is_signed propagation was missing, causing spurious errors with signed
operands. Add it in parse_expr() and parse_unary() to fix the problem.
Link: https://lkml.kernel.org/r/f4dac08742fd7a0920bf80a73c6c44042f5eaa40.1643319703.git.zanussi@kernel.org
Cc: stable@vger.kernel.org
Fixes:
|
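As a rough illustration of the entry above, the sketch below derives an expression field from its operand and copies both properties, including signedness. The types and function names (`struct field_info`, `derive_expr_field()`, `format_value()`) are hypothetical and do not reproduce the tracing code; the formatting helper is only an assumed example of a consumer that depends on the signedness flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct field_info {
	unsigned int size;  /* width in bytes */
	bool is_signed;
};

/* Build the field for a derived expression by copying the operand's properties. */
struct field_info derive_expr_field(const struct field_info *operand)
{
	struct field_info expr;

	assert(operand != NULL);
	expr.size = operand->size;
	expr.is_signed = operand->is_signed;  /* the property that was not propagated */
	return expr;
}

/* Format a raw 64-bit value according to the field's signedness. */
int format_value(const struct field_info *f, uint64_t raw, char *buf, size_t len)
{
	if (f->is_signed)
		return snprintf(buf, len, "%lld", (long long)(int64_t)raw);
	return snprintf(buf, len, "%llu", (unsigned long long)raw);
}
```

If `is_signed` were left at its default in `derive_expr_field()`, the same raw bits would be interpreted as a large unsigned number by any consumer of the derived field.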
||
|
|
8ff65d38ba |
Merge remote-tracking branch into HEAD
* keystone/mirror-android13-5.15: (68 commits) FROMLIST: kasan: improve vmalloc tests FROMGIT: kasan: documentation updates FROMGIT: arm64: select KASAN_VMALLOC for SW/HW_TAGS modes FROMGIT: kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS FROMGIT: kasan: add kasan.vmalloc command line flag FROMGIT: kasan: clean up feature flags for HW_TAGS mode FROMGIT: kasan: mark kasan_arg_stacktrace as __initdata FROMGIT: kasan, arm64: don't tag executable vmalloc allocations FROMGIT: kasan, vmalloc: only tag normal vmalloc allocations BACKPORT: FROMGIT: kasan, vmalloc: add vmalloc tagging for HW_TAGS BACKPORT: FROMGIT: kasan, page_alloc: allow skipping memory init for HW_TAGS BACKPORT: FROMGIT: kasan, page_alloc: allow skipping unpoisoning for HW_TAGS BACKPORT: FROMGIT: kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS FROMGIT: kasan, vmalloc: unpoison VM_ALLOC pages after mapping BACKPORT: FROMGIT: kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged FROMGIT: kasan, vmalloc: add vmalloc tagging for SW_TAGS FROMGIT: kasan, arm64: reset pointer tags of vmapped stacks FROMLIST: kasan, fork: reset pointer tags of vmapped stacks FROMGIT: kasan, vmalloc: reset tags in vmalloc functions FROMGIT: kasan: add wrappers for vmalloc hooks ... Signed-off-by: keystone-kernel-automerger <keystone-kernel-automerger@google.com> Change-Id: I02eec7b63e206525edd2a106d0ed211c141f81fb |
||
|
|
261a7a2ac9 |
BACKPORT: FROMGIT: kasan, vmalloc: add vmalloc tagging for HW_TAGS
(Backport: workaround kasan_populate_early_vm_area_shadow missing due to 3252b1d8309e not backported.)
Add vmalloc tagging support to HW_TAGS KASAN.
The key difference between HW_TAGS and the other two KASAN modes when it comes to vmalloc: HW_TAGS KASAN can only assign tags to physical memory. The other two modes have shadow memory covering every mapped virtual memory region.
Make __kasan_unpoison_vmalloc() for HW_TAGS KASAN:
- Skip non-VM_ALLOC mappings as HW_TAGS KASAN can only tag a single mapping of normal physical memory; see the comment in the function.
- Generate a random tag, tag the returned pointer and the allocation, and initialize the allocation at the same time.
- Propagate the tag into the page structs to allow accesses through page_address(vmalloc_to_page()).
The rest of vmalloc-related KASAN hooks are not needed:
- The shadow-related ones are fully skipped.
- __kasan_poison_vmalloc() is kept as a no-op with a comment.
Poisoning and zeroing of physical pages that are backing vmalloc() allocations are skipped via __GFP_SKIP_KASAN_UNPOISON and __GFP_SKIP_ZERO: __kasan_unpoison_vmalloc() does that instead.
Enabling CONFIG_KASAN_VMALLOC with HW_TAGS is not yet allowed.
Link: https://lkml.kernel.org/r/d19b2e9e59a9abc59d05b72dea8429dcaea739c6.1643047180.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
(cherry picked from commit c9a950bcf1d67298187050bc3179096e4ef248c1 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm)
Bug: 217222520
Change-Id: I446b0ae074938389ade70bf503784d4d32b5d09b
Signed-off-by: Andrey Konovalov <andreyknvl@google.com> |