16.0
3970 Commits
1ed9fa92d9
Revert "trace: rtb: add msm_rtb register tracing feature snapshot"
This reverts commit

---

6ef8f48172
BACKPORT: bpf: add writable context for raw tracepoints
This is an opt-in interface that allows a tracepoint to provide a safe buffer that can be written from a BPF_PROG_TYPE_RAW_TRACEPOINT program. The size of the buffer must be a compile-time constant, and is checked before allowing a BPF program to attach to a tracepoint that uses this feature.

The pointer to this buffer will be the first argument of tracepoints that opt in; the pointer is valid and can be bpf_probe_read() by both BPF_PROG_TYPE_RAW_TRACEPOINT and BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE programs that attach to such a tracepoint, but the buffer to which it points may only be written by the latter.

Change-Id: I9f96e1c0e7ae90fe32795d537a100f26e388af2d
Signed-off-by: Matt Mullins <mmullins@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

55e3564711
BACKPORT: tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
commit 9913d5745bd720c4266805c8d29952a3702e4eca upstream.

All internal use cases for tracepoint_probe_register() are set to never be called with the same function and data. If they are, it is considered a bug, as that means the accounting of handling tracepoints is corrupted. If the function and data for a tracepoint are already registered when tracepoint_probe_register() is called, it will call WARN_ON_ONCE() and return with EEXIST.

The BPF system call can end up calling tracepoint_probe_register() with the same data, which now means that this can trigger the warning because of a user space process. As WARN_ON_ONCE() should not be called because user space called a system call with bad data, there needs to be a way to register a tracepoint without triggering a warning. Enter tracepoint_probe_register_may_exist(), which can be called, but will not cause a WARN_ON() if the probe already exists. It will still error out with EEXIST, which will then be sent to the user space that performed the BPF system call.

This keeps the previous testing for issues with other users of the tracepoint code, while letting BPF call it with duplicated data and not warn about it.

Link: https://lore.kernel.org/lkml/20210626135845.4080-1-penguin-kernel@I-love.SAKURA.ne.jp/
Link: https://syzkaller.appspot.com/bug?id=41f4318cf01762389f4d1c1c459da4f542fe5153
Cc: stable@vger.kernel.org
Fixes: c4f6699dfcb85 ("bpf: introduce BPF_RAW_TRACEPOINT")
Reported-by: syzbot <syzbot+721aa903751db87aa244@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot+721aa903751db87aa244@syzkaller.appspotmail.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
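
The register-may-exist semantics described above can be modeled in a few lines of userspace C. This is a toy sketch only, not the kernel implementation: the registry, the `may_exist` flag, and the `warned` variable (standing in for WARN_ON_ONCE() firing) are all illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PROBES 8

/* Toy registry: duplicates always error with -EEXIST, but only the
 * plain registration path "warns" (a kernel bug); the may_exist path
 * stays silent (user space passed duplicate data via the bpf syscall). */
struct probe { void *func; void *data; };

static struct probe probes[MAX_PROBES];
static int nprobes;
static int warned; /* stands in for WARN_ON_ONCE() having fired */

static int probe_register(void *func, void *data, int may_exist)
{
    for (int i = 0; i < nprobes; i++) {
        if (probes[i].func == func && probes[i].data == data) {
            if (!may_exist)
                warned = 1; /* internal callers must never duplicate */
            return -EEXIST; /* duplicate errors out either way */
        }
    }
    probes[nprobes++] = (struct probe){ func, data };
    return 0;
}
```

The key point the commit makes is visible here: both paths return -EEXIST on a duplicate, and only the warning is suppressed for the new entry point.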

---

cab00d97f6
UPSTREAM: bpf: add map helper functions push, pop, peek in more BPF programs
commit f1a2e44a3aec ("bpf: add queue and stack maps") introduced new BPF
helper functions:
- BPF_FUNC_map_push_elem
- BPF_FUNC_map_pop_elem
- BPF_FUNC_map_peek_elem
but they were made available only for network BPF programs. This patch
makes them available for tracepoint, cgroup and lirc programs.
Change-Id: Id49274e5be3ab81f6eb60b6890834247d68487e6
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Cc: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
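
The semantics of the three helpers can be illustrated with a plain userspace FIFO mirroring BPF_MAP_TYPE_QUEUE behavior: push appends, peek reads the head without consuming it, pop reads and removes it. This is a minimal sketch, not the kernel map code; the capacity and names are illustrative.

```c
#include <assert.h>
#include <errno.h>

#define CAP 4

/* Toy FIFO modeling queue-map helper semantics. */
static long q[CAP];
static int head, tail, count;

static int map_push_elem(long v)
{
    if (count == CAP)
        return -1; /* full (no BPF_EXIST-style overwrite in this toy) */
    q[tail] = v;
    tail = (tail + 1) % CAP;
    count++;
    return 0;
}

static int map_peek_elem(long *v)
{
    if (!count)
        return -ENOENT; /* empty queue */
    *v = q[head];
    return 0;
}

static int map_pop_elem(long *v)
{
    if (map_peek_elem(v))
        return -ENOENT;
    head = (head + 1) % CAP; /* consume the head element */
    count--;
    return 0;
}
```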

---

c6116b006a
BACKPORT: bpf: support raw tracepoints in modules
Distributions build drivers as modules, including network and filesystem drivers which export numerous tracepoints. This enables bpf(BPF_RAW_TRACEPOINT_OPEN) to attach to those tracepoints.

Change-Id: I2ea2898f5dedf7e70aff39c0f8ae0a5d7aa1d2af
Signed-off-by: Matt Mullins <mmullins@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

481c26e010
BACKPORT: atomics/treewide: Rename __atomic_add_unless() => atomic_fetch_add_unless()
While __atomic_add_unless() was originally intended as a building-block
for atomic_add_unless(), it's now used in a number of places around the
kernel. It's the only common atomic operation named __atomic*(), rather
than atomic_*(), and for consistency it would be better named
atomic_fetch_add_unless().
This lack of consistency is slightly confusing, and gets in the way of
scripting atomics. Given that, let's clean things up and promote it to
an official part of the atomics API, in the form of
atomic_fetch_add_unless().
This patch converts definitions and invocations over to the new name,
including the instrumented version, using the following script:
----
git grep -w __atomic_add_unless | while read line; do
sed -i '{s/\<__atomic_add_unless\>/atomic_fetch_add_unless/}' "${line%%:*}";
done
git grep -w __arch_atomic_add_unless | while read line; do
sed -i '{s/\<__arch_atomic_add_unless\>/arch_atomic_fetch_add_unless/}' "${line%%:*}";
done
----
Note that we do not have atomic{64,_long}_fetch_add_unless(), which will
be introduced by later patches.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-2-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

---

b14604aea0
BACKPORT: bpf: implement bpf_get_current_cgroup_id() helper
bpf has been used extensively for tracing. For example, bcc contains an almost full set of bpf-based tools to trace kernel and user functions/events. Most tracing tools are currently either filtered based on pid or system-wide.

Containers have been used quite extensively in industry and cgroup is often used together to provide resource isolation and protection. Several processes may run inside the same container. It is often desirable to get container-level tracing results as well, e.g. syscall count, function count, I/O activity, etc.

This patch implements a new helper, bpf_get_current_cgroup_id(), which will return the cgroup id based on the cgroup within which the current task is running. The later patch will provide an example to show that userspace can get the same cgroup id so it could configure a filter or policy in the bpf program based on task cgroup id.

The helper is currently implemented for tracing. It can be added to other program types as well when needed.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

02dc4f9f34
UPSTREAM: bpf: fix context access in tracing progs on 32 bit archs
Wang reported that all the testcases for the BPF_PROG_TYPE_PERF_EVENT program type in test_verifier report the following errors on x86_32:

172/p unpriv: spill/fill of different pointers ldx FAIL
Unexpected error message!
0: (bf) r6 = r10
1: (07) r6 += -8
2: (15) if r1 == 0x0 goto pc+3
R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
3: (bf) r2 = r10
4: (07) r2 += -76
5: (7b) *(u64 *)(r6 +0) = r2
6: (55) if r1 != 0x0 goto pc+1
R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp
7: (7b) *(u64 *)(r6 +0) = r1
8: (79) r1 = *(u64 *)(r6 +0)
9: (79) r1 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (71) r0 = *(u8 *)(r1 +68)
invalid bpf_context access off=68 size=1

379/p check bpf_perf_event_data->sample_period half load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (69) r0 = *(u16 *)(r1 +68)
invalid bpf_context access off=68 size=2

380/p check bpf_perf_event_data->sample_period word load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (61) r0 = *(u32 *)(r1 +68)
invalid bpf_context access off=68 size=4

381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (79) r0 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

The reason is that struct pt_regs on x86_32 doesn't fully align to an 8-byte boundary due to its size of 68 bytes. Therefore, bpf_ctx_narrow_access_ok() bails out, since off & (size_default - 1), which is 68 & 7, doesn't cleanly align in the case of a sample_period access from struct bpf_perf_event_data; the verifier wrongly thinks we might be doing an unaligned access here, even though the underlying arch can handle it just fine. Therefore, adjust this down to machine size and check and rewrite the offset for narrow access on that basis.

We also need to fix the corresponding pe_prog_is_valid_access(), since we hit the check for off % size != 0 (e.g. 68 % 8 -> 4) in the first and last test. With that in place, progs for tracing work on x86_32.

Reported-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
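
The alignment arithmetic at the heart of this fix is easy to reproduce. The sketch below just replays the commit's numbers: the check off & (size_default - 1) rejects sample_period at offset 68 against the 8-byte default, while a 4-byte machine-word check passes. The function name is illustrative, not the kernel's.

```c
#include <assert.h>

/* Narrow-access alignment predicate as described in the commit:
 * an access at `off` is considered aligned for a given check
 * granularity when off & (size_default - 1) == 0. On x86_32,
 * sample_period sits at offset 68 (struct pt_regs is 68 bytes). */
static int narrow_ok(unsigned int off, unsigned int size_default)
{
    return (off & (size_default - 1)) == 0;
}
```

With the default of 8, 68 & 7 == 4 and the access is (wrongly) rejected; adjusted down to the 32-bit machine size, 68 & 3 == 0 and it is accepted.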

---

3df59db197
UPSTREAM: bpf: bpf_prog_array_copy() should return -ENOENT if exclude_prog not found
This makes it possible for bpf prog detach to return -ENOENT.

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

---

a24867fc2d
BACKPORT: bpf: avoid rcu_dereference inside bpf_event_mutex lock region
During perf event attaching/detaching of bpf programs,
the tp_event->prog_array change is protected by the
bpf_event_mutex lock in both the attaching and detaching
functions. Although tp_event->prog_array is an rcu
pointer, rcu_dereference is not needed to access it,
since the mutex lock guarantees ordering.
Verified through "make C=2" that the sparse
locking check is still happy with the new change.
Also change the label name in perf_event_{attach,detach}_bpf_prog
from "out" to "unlock" to reflect the code action after the label.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

---

790e430552
UPSTREAM: bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
Currently, suppose a userspace application has loaded a bpf program and attached it to a tracepoint/kprobe/uprobe, and a bpf introspection tool, e.g., bpftool, wants to show which bpf program is attached to which tracepoint/kprobe/uprobe. Such attachment information will be really useful to understand the overall bpf deployment in the system.

There is a name field (16 bytes) for each program, which could be used to encode the attachment point. There are some drawbacks to this approach. First, a bpftool user (e.g., an admin) may not really understand the association between the name and the attachment point. Second, if one program is attached to multiple places, encoding a proper name which can imply all these attachments becomes difficult.

This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY. Given a pid and fd, if the <pid, fd> is associated with a tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return to userspace:
- prog_id
- tracepoint name, or
- k[ret]probe funcname + offset or kernel addr, or
- u[ret]probe filename + offset

The user can use "bpftool prog" to find more information about the bpf program itself with prog_id.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

fc5298310c
UPSTREAM: perf/core: Implement the 'perf_uprobe' PMU
This patch adds perf_uprobe support with a similar pattern as the previous patch (for kprobe). Two functions, create_local_trace_uprobe() and destroy_local_trace_uprobe(), are created so a uprobe can be created and attached to the file descriptor created by perf_event_open().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Cc: <daniel@iogearbox.net>
Cc: <davem@davemloft.net>
Cc: <kernel-team@fb.com>
Cc: <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171206224518.3598254-7-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

---

30febf09c8
UPSTREAM: perf/core: Implement the 'perf_kprobe' PMU
A new PMU type, perf_kprobe, is added. Based on attr from perf_event_open(), perf_kprobe creates a kprobe (or kretprobe) for the perf_event. This kprobe is private to this perf_event, and thus not added to global lists, and not available in tracefs. Two functions, create_local_trace_kprobe() and destroy_local_trace_kprobe(), are added to create and destroy these local trace_kprobes.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Cc: <daniel@iogearbox.net>
Cc: <davem@davemloft.net>
Cc: <kernel-team@fb.com>
Cc: <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171206224518.3598254-6-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

---

cf168d7e20
BACKPORT: bpf: add bpf_get_stack helper
Currently, stackmap and the bpf_get_stackid helper are provided for bpf programs to get the stack trace. This approach has a limitation though. If two stack traces have the same hash, only one will get stored in the stackmap table, so some stack traces are missing from the user perspective.

This patch implements a new helper, bpf_get_stack, which sends stack traces directly to the bpf program. The bpf program is able to see all stack traces, and can then do in-kernel processing or send stack traces to user space through a shared map or bpf_perf_event_output.

Acked-by: Alexei Starovoitov <ast@fb.com>
Change-Id: I7dbdcba1a8ceda4c3626a07c436b33d9f35b3c0e
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
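
The hash-collision limitation motivating bpf_get_stack can be shown with a toy hash-indexed store. This is purely illustrative, not the kernel's stackmap: the weak hash, bucket count, and fixed trace depth are made up to force a collision.

```c
#include <assert.h>
#include <string.h>

#define NBUCKETS 4

/* Toy model of storing stack traces by hash: a colliding but
 * different trace is not stored, so it is invisible to user space. */
static unsigned long buckets[NBUCKETS][3];
static int used[NBUCKETS];

static unsigned int hash3(const unsigned long *t) /* deliberately weak */
{
    return (unsigned int)((t[0] + t[1] + t[2]) % NBUCKETS);
}

/* Returns the trace's id, or -1 if a different trace owns the slot. */
static int stackmap_store(const unsigned long *trace)
{
    unsigned int h = hash3(trace);

    if (used[h]) /* slot taken: keep the old trace */
        return memcmp(buckets[h], trace, sizeof(buckets[h])) ? -1 : (int)h;
    memcpy(buckets[h], trace, sizeof(buckets[h]));
    used[h] = 1;
    return (int)h;
}
```

bpf_get_stack sidesteps this by handing the raw trace to the program instead of deduplicating by hash.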

---

13e5fae1ef
UPSTREAM: bpf/tracing: fix a deadlock in perf_event_detach_bpf_prog
syzbot reported a possible deadlock in perf_event_detach_bpf_prog. The error details:

======================================================
WARNING: possible circular locking dependency detected
4.16.0-rc7+ #3 Not tainted
------------------------------------------------------
syz-executor7/24531 is trying to acquire lock:
(bpf_event_mutex){+.+.}, at: [<000000008a849b07>] perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854

but task is already holding lock:
(&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&mm->mmap_sem){++++}:
__might_fault+0x13a/0x1d0 mm/memory.c:4571
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694
perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891
_perf_ioctl kernel/events/core.c:4750 [inline]
perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770
vfs_ioctl fs/ioctl.c:46 [inline]
do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
SYSC_ioctl fs/ioctl.c:701 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #0 (bpf_event_mutex){+.+.}:
lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
__mutex_lock_common kernel/locking/mutex.c:756 [inline]
__mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
perf_event_free_bpf_prog kernel/events/core.c:8147 [inline]
_free_event+0xbdb/0x10f0 kernel/events/core.c:4116
put_event+0x24/0x30 kernel/events/core.c:4204
perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172
remove_vma+0xb4/0x1b0 mm/mmap.c:172
remove_vma_list mm/mmap.c:2490 [inline]
do_munmap+0x82a/0xdf0 mm/mmap.c:2731
mmap_region+0x59e/0x15a0 mm/mmap.c:1646
do_mmap+0x6c0/0xe00 mm/mmap.c:1483
do_mmap_pgoff include/linux/mm.h:2223 [inline]
vm_mmap_pgoff+0x1de/0x280 mm/util.c:355
SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0: lock(&mm->mmap_sem); ... lock(bpf_event_mutex);
CPU1: lock(bpf_event_mutex); ... lock(&mm->mmap_sem);

*** DEADLOCK ***
======================================================

The bug was introduced by commit f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp"), where copy_to_user, which requires mm->mmap_sem, is called inside the bpf_event_mutex lock. At the same time, during perf_event file descriptor close, mm->mmap_sem is held first and then the subsequent perf_event_detach_bpf_prog needs the bpf_event_mutex lock. Such a scenario caused a deadlock. As suggested by Daniel, moving copy_to_user out of the bpf_event_mutex lock fixes the problem.

Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp")
Reported-by: syzbot+dc5ca0e4c9bfafaf2bae@syzkaller.appspotmail.com
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

---

eb9a2d2132
BACKPORT: bpf: Check attach type at prog load time
== The problem ==

There are use-cases when a program of some type can be attached to multiple attach points and those attach points must have different permissions to access context or to call helpers. E.g. a context structure may have fields for both IPv4 and IPv6, but it doesn't make sense to read from / write to the IPv6 field when the attach point is somewhere in the IPv4 stack. The same applies to BPF-helpers: it may make sense to call some helper from one attach point, but not from another, for the same prog type.

== The solution ==

Introduce an `expected_attach_type` field in `struct bpf_attr` for the `BPF_PROG_LOAD` command. If the scenario described in "The problem" section is the case for some prog type, the field will be checked twice:

1) At load time the prog type is checked to see if the attach type for it must be known to validate program permissions correctly. The prog will be rejected with EINVAL if it's the case and `expected_attach_type` is not specified or has an invalid value.

2) At attach time `attach_type` is compared with `expected_attach_type`, if the prog type requires one, and, if they differ, the attach will be rejected with EINVAL.

The `expected_attach_type` is now available as part of `struct bpf_prog` in both `bpf_verifier_ops->is_valid_access()` and `bpf_verifier_ops->get_func_proto()` and can be used to check context accesses and calls to helpers correspondingly.

Initially the idea was discussed by Alexei Starovoitov <ast@fb.com> and Daniel Borkmann <daniel@iogearbox.net> here: https://marc.info/?l=linux-netdev&m=152107378717201&w=2

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
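
The two-phase check can be sketched as a toy userspace model. This is not the kernel code: the prog type numbering, the attach-type enum values, and the "requires attach type" predicate are all hypothetical stand-ins for the real per-type logic.

```c
#include <assert.h>
#include <errno.h>

enum attach_type { ATTACH_NONE, ATTACH_INET4_BIND, ATTACH_INET6_BIND };

struct prog { int type; enum attach_type expected; };

/* Hypothetical predicate: type 1 is a prog type whose permissions
 * depend on where it will be attached. */
static int type_requires_attach(int type) { return type == 1; }

/* 1) Load time: such prog types must carry a valid expected type. */
static int prog_load(const struct prog *p)
{
    if (type_requires_attach(p->type) && p->expected == ATTACH_NONE)
        return -EINVAL;
    return 0;
}

/* 2) Attach time: the actual attach point must match the expectation. */
static int prog_attach(const struct prog *p, enum attach_type at)
{
    if (type_requires_attach(p->type) && p->expected != at)
        return -EINVAL;
    return 0;
}
```

Declaring the attach point up front is what lets the verifier apply attach-point-specific permission checks at load time rather than attach time.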

---

beba6490af
UPSTREAM: bpf: introduce BPF_RAW_TRACEPOINT
Introduce BPF_PROG_TYPE_RAW_TRACEPOINT bpf program type to access
kernel internal arguments of the tracepoints in their raw form.
From the bpf program's point of view, access to the arguments looks like:
struct bpf_raw_tracepoint_args {
__u64 args[0];
};
int bpf_prog(struct bpf_raw_tracepoint_args *ctx)
{
// program can read args[N] where N depends on tracepoint
// and statically verified at program load+attach time
}
The kprobe+bpf infrastructure allows programs to access function arguments.
This feature allows programs to access raw tracepoint arguments.
Similar to proposed 'dynamic ftrace events' there are no abi guarantees
to what the tracepoints arguments are and what their meaning is.
The program needs to type cast args properly and use bpf_probe_read()
helper to access struct fields when argument is a pointer.
For every tracepoint __bpf_trace_##call function is prepared.
In assembler it looks like:
(gdb) disassemble __bpf_trace_xdp_exception
Dump of assembler code for function __bpf_trace_xdp_exception:
0xffffffff81132080 <+0>: mov %ecx,%ecx
0xffffffff81132082 <+2>: jmpq 0xffffffff811231f0 <bpf_trace_run3>
where
TRACE_EVENT(xdp_exception,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp, u32 act),
The above assembler snippet is casting 32-bit 'act' field into 'u64'
to pass into bpf_trace_run3(), while 'dev' and 'xdp' args are passed as-is.
All of ~500 of __bpf_trace_*() functions are only 5-10 byte long
and in total this approach adds 7k bytes to .text.
This approach gives the lowest possible overhead
while calling trace_xdp_exception() from kernel C code and
transitioning into bpf land.
Since tracepoint+bpf are used at speeds of 1M+ events per second
this is a valuable optimization.
The new BPF_RAW_TRACEPOINT_OPEN sys_bpf command is introduced
that returns anon_inode FD of 'bpf-raw-tracepoint' object.
The user space looks like:
// load bpf prog with BPF_PROG_TYPE_RAW_TRACEPOINT type
prog_fd = bpf_prog_load(...);
// receive anon_inode fd for given bpf_raw_tracepoint with prog attached
raw_tp_fd = bpf_raw_tracepoint_open("xdp_exception", prog_fd);
Ctrl-C of tracing daemon or cmdline tool that uses this feature
will automatically detach bpf program, unload it and
unregister tracepoint probe.
On the kernel side the __bpf_raw_tp_map section of pointers to
tracepoint definition and to __bpf_trace_*() probe function is used
to find a tracepoint with "xdp_exception" name and
corresponding __bpf_trace_xdp_exception() probe function
which are passed to tracepoint_probe_register() to connect probe
with tracepoint.
Addition of bpf_raw_tracepoint doesn't interfere with ftrace and perf
tracepoint mechanisms. perf_event_open() can be used in parallel
on the same tracepoint.
Multiple bpf_raw_tracepoint_open("xdp_exception", prog_fd) are permitted.
Each with its own bpf program. The kernel will execute
all tracepoint probes and all attached bpf programs.
In the future bpf_raw_tracepoints can be extended with
query/introspection logic.
__bpf_raw_tp_map section logic was contributed by Steven Rostedt
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

---

923cb2f5c0
UPSTREAM: trace/bpf: remove helper bpf_perf_prog_read_value from tracepoint type programs
Commit 4bebdc7a85aa ("bpf: add helper bpf_perf_prog_read_value")
added helper bpf_perf_prog_read_value so that perf_event type program
can read event counter and enabled/running time.
This commit, however, introduced a bug which allows this helper
for tracepoint type programs. This is incorrect as bpf_perf_prog_read_value
needs to access perf_event through its bpf_perf_event_data_kern type context,
which is not available for tracepoint type program.
This patch fixed the issue by separating bpf_func_proto between tracepoint
and perf_event type programs and removed bpf_perf_prog_read_value
from tracepoint func prototype.
Fixes: 4bebdc7a85aa ("bpf: add helper bpf_perf_prog_read_value")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

---

33f8a6306d
UPSTREAM: bpf: fix bpf_prog_array_copy_to_user warning from perf event prog query
syzkaller tried to perform a prog query in perf_event_query_prog_array()
where struct perf_event_query_bpf had an ids_len of 1,073,741,353 and
thus causing a warning due to failed kcalloc() allocation out of the
bpf_prog_array_copy_to_user() helper. Given we cannot attach more than
64 programs to a perf event, there's no point in allowing huge ids_len.
Therefore, allow a buffer that would fit the maximum number of ids and
also add a __GFP_NOWARN to the temporary ids buffer.
Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp")
Fixes: 0911287ce32b ("bpf: fix bpf_prog_array_copy_to_user() issues")
Reported-by: syzbot+cab5816b0edbabf598b3@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

166acbeb9c
BACKPORT: error-injection: Separate error-injection from kprobe
Since the error-injection framework is not limited to kprobes or bpf, other kernel subsystems can use it freely for checking the safety of error injection, e.g. livepatch, ftrace, etc. So this separates the error-injection framework from kprobes. Some differences have been made:
- The "kprobe" word is removed from all APIs/structures.
- BPF_ALLOW_ERROR_INJECTION() is renamed to ALLOW_ERROR_INJECTION() since it is not limited to BPF.
- CONFIG_FUNCTION_ERROR_INJECTION is the config item of this feature. It is automatically enabled if the arch supports the error injection feature for kprobe or ftrace etc.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

6dac7045d5
UPSTREAM: bpf/tracing: fix kernel/events/core.c compilation error
Commit f371b304f12e ("bpf/tracing: allow user space to
query prog array on the same tp") introduced a perf
ioctl command to query prog array attached to the
same perf tracepoint. The commit introduced a
compilation error under certain config conditions, e.g.,
(1). CONFIG_BPF_SYSCALL is not defined, or
(2). CONFIG_TRACING is defined but neither CONFIG_UPROBE_EVENTS
nor CONFIG_KPROBE_EVENTS is defined.
Error message:
kernel/events/core.o: In function `perf_ioctl':
core.c:(.text+0x98c4): undefined reference to `bpf_event_query_prog_array'
This patch fixed this error by guarding the real definition under
CONFIG_BPF_EVENTS and provided static inline dummy function
if CONFIG_BPF_EVENTS was not defined.
It renamed the function from bpf_event_query_prog_array to
perf_event_query_prog_array and moved the definition from linux/bpf.h
to linux/trace_events.h so the definition is in proximity to
other prog_array related functions.
Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

---

6ac6e790b6
BACKPORT: bpf: add a bpf_override_function helper
Error injection is sloppy and very ad-hoc. BPF could fill this niche perfectly with its kprobe functionality. We could make sure errors are only triggered in specific call chains that we care about, in very specific situations. Accomplish this with the bpf_override_function helper. This will modify the probed caller's return value to the specified value and set the PC to an override function that simply returns, bypassing the originally probed function. This gives us a nice clean way to implement systematic error injection for all of our code paths.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

---

3b0c288f84
UPSTREAM: bpf/tracing: allow user space to query prog array on the same tp
Commit e87c6bc3852b ("bpf: permit multiple bpf attachments
for a single perf event") added support to attach multiple
bpf programs to a single perf event.
Although this provides flexibility, users may want to know
what other bpf programs attached to the same tp interface.
Besides getting visibility for the underlying bpf system,
such information may also help consolidate multiple bpf programs,
understand potential performance issues due to a large array,
and debug (e.g., one bpf program which overwrites return code
may impact subsequent program results).
Commit

---

cea10be5bb
BACKPORT: bpf: set maximum number of attached progs to 64 for a single perf tp
The cgroup+bpf prog array has a maximum number of 64 programs.
Let us apply the same limit here.
Fixes: e87c6bc3852b ("bpf: permit multiple bpf attachments for a single perf event")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

---

d4147ff626
BACKPORT: bpf: Revert bpf_overrid_function() helper changes.
NACK'd by x86 maintainer.

Signed-off-by: David S. Miller <davem@davemloft.net>

---

13ac06e899
BACKPORT: bpf: add a bpf_override_function helper
Error injection is sloppy and very ad-hoc. BPF could fill this niche perfectly with its kprobe functionality. We could make sure errors are only triggered in specific call chains that we care about, in very specific situations. Accomplish this with the bpf_override_function helper. This will modify the probed caller's return value to the specified value and set the PC to an override function that simply returns, bypassing the originally probed function. This gives us a nice clean way to implement systematic error injection for all of our code paths.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

---

1a13936c7e
UPSTREAM: bpf: remove tail_call and get_stackid helper declarations from bpf.h
commit afdb09c720b6 ("security: bpf: Add LSM hooks for bpf object related
syscall") included linux/bpf.h in linux/security.h. As a result, bpf
programs including bpf_helpers.h and some other header that ends up
pulling in also security.h, such as several examples under samples/bpf,
fail to compile because bpf_tail_call and bpf_get_stackid are now
"redefined as different kind of symbol".
From bpf.h:
u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5);
u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
Whereas in bpf_helpers.h they are:
static void (*bpf_tail_call)(void *ctx, void *map, int index);
static int (*bpf_get_stackid)(void *ctx, void *map, int flags);
Fix this by removing the unused declaration of bpf_tail_call and moving
the declaration of bpf_get_stackid in bpf_trace.c, which is the only
place where it's needed.
Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

---

c02a12ef4d
BACKPORT: bpf: split verifier and program ops
struct bpf_verifier_ops contains both verifier ops and operations used later during the program's lifetime (test_run). Split the runtime ops into a different structure. BPF_PROG_TYPE() will now append ##_prog_ops or ##_verifier_ops to the names.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

---

ed7bb028f0
BACKPORT: bpf: add helper bpf_perf_prog_read_value
This patch adds the helper bpf_perf_prog_read_value for perf event based bpf programs, to read the event counter and enabled/running time. The enabled/running time is accumulated since the perf event open.

The typical use case for a perf event based bpf program is to attach itself to a single event. In such cases, if it is desirable to get the scaling factor between two bpf invocations, users can save the time values in a map, and use the value from the map and the current value to calculate the scaling factor.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
||
|
|
5cd5c6f4b2 |
BACKPORT: bpf: add helper bpf_perf_event_read_value for perf event array map
Hardware pmu counters are limited resources. When there are more pmu based perf events opened than available counters, the kernel will multiplex these events so each event gets a certain percentage (but not 100%) of the pmu time. In case that multiplexing happens, the number of samples or counter value will not reflect the case compared to no multiplexing. This makes comparison between different runs difficult. Typically, the number of samples or counter value should be normalized before comparing to other experiments. The typical normalization is done like: normalized_num_samples = num_samples * time_enabled / time_running normalized_counter_value = counter_value * time_enabled / time_running where time_enabled is the time enabled for the event and time_running is the time running for the event since the last normalization. This patch adds helper bpf_perf_event_read_value for kprobe-based perf event array maps, to read the perf counter and enabled/running time. The enabled/running time is accumulated since the perf event open. To achieve a scaling factor between two bpf invocations, users can use cpu_id as the key (which is typical for the perf array usage model) to remember the previous value and do the calculation inside the bpf program. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
456de77985 |
BACKPORT: bpf: perf event change needed for subsequent bpf helpers
This patch does not impact existing functionalities. It contains the changes in perf event area needed for subsequent bpf_perf_event_read_value and bpf_perf_prog_read_value helpers. Change-Id: I066312fce9ebb0185b02ce6904e057d728473f90 Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
3996f04715 |
Squashed revert of BPF backports
Revert "Partially revert "fixup: add back code missed during BPF picking""
This reverts commit cc477455f73d317733850a9e4818dfd90be4d33d.
Revert "bpf: lpm_trie: check left child of last leftmost node for NULL"
This reverts commit e89007b7df49292c5ae52b3d165c0d815a61cd10.
Revert "BACKPORT: bpf: Fix out-of-bounds write in trie_get_next_key()"
This reverts commit a1c4f565bb00b05ab3734a64451c08b0b965ce42.
Revert "bpf: Fix exact match conditions in trie_get_next_key()"
This reverts commit 4356a64dad3d38372147457b3004930c6e2e9c51.
Revert "bpf: fix kernel page fault in lpm map trie_get_next_key"
This reverts commit df4649b5d6cb374edbb67e5a5ecbd102a2e6c897.
Revert "bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map"
This reverts commit fe6656a5d48df6144fe9929399c648957166edd0.
Revert "bpf: allow helpers to return PTR_TO_SOCK_COMMON"
This reverts commit b24d1ae9ccbf3ebe6f4baa50d2d48c03be02bc17.
Revert "bpf: implement lookup-free direct value access for maps"
This reverts commit de1959fcd3df0629380894d9c47ebb253c920ad1.
Revert "bpf: Add bpf_verifier_vlog() and bpf_verifier_log_needed()"
This reverts commit b777824607bd3eb8c9130f4639d97d15bcac9af5.
Revert "bpf: Don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE"
This reverts commit 4cfef728c1eac6cce34f4fff1fbab3e66dc430d9.
Revert "bpf: always allocate at least 16 bytes for setsockopt hook"
This reverts commit 59817f83c964c753e93a75128ecaad4eeaa769fc.
Revert "bpf, sockmap: convert to generic sk_msg interface"
This reverts commit fe4ef742e22924b21749de333211941d0205501e.
Revert "bpf: sockmap, convert bpf_compute_data_pointers to bpf_*_sk_skb"
This reverts commit d17c8c2c2f623e087d6c297de50c173a006e6e55.
Revert "bpf: sockmap: fix typos"
This reverts commit 07e31378d7795371cdbccce06b4125b27ffce536.
Revert "sockmap: convert refcnt to an atomic refcnt"
This reverts commit c1fa11ec9da5dc0e8cae4334c550264cff77eef9.
Revert "bpf: sockmap, add hash map support"
This reverts commit 3f43379c38e329e9a7d4b5a1640670de37ba317b.
Revert "bpf: sockmap, refactor sockmap routines to work with hashmap"
This reverts commit 41a2b6e925db031978eb2484835f60908de884d7.
Revert "bpf: implement getsockopt and setsockopt hooks"
This reverts commit 9526fe6ff3e06939c12bb781e0dda01a8f3017ec.
Revert "bpf: Introduce bpf sk local storage"
This reverts commit ffedc38a46ddaca40de672fafe78c45fbfae9839.
Revert "bpf: introduce BPF_F_LOCK flag"
This reverts commit e7f5758fbcb1674e17c645837f7bff3b1febbad5.
Revert "bpf: Introduce ARG_PTR_TO_{INT,LONG} arg types"
This reverts commit e29b4e3c2bdd3b5d0d34668836ae8e5115cb31af.
Revert "bpf/verifier: add ARG_PTR_TO_UNINIT_MAP_VALUE"
This reverts commit f25c66c27cd6a774fb73769d804f91e969dd5f7b.
Revert "bpf: allow map helpers access to map values directly"
This reverts commit 7af696635219d0c5cdf1a166bb7543cae9e50328.
Revert "bpf: add writable context for raw tracepoints"
This reverts commit a546d8f0433039cee0de6ce96d5d35c4033a7b98.
Revert "bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sock"
This reverts commit 03093478c52e79c94791a04f8138d5c019119087.
Revert "bpf: Support socket lookup in CGROUP_SOCK_ADDR progs"
This reverts commit 8047013945361fbff0e449c8a212cb6fc93a5245.
Revert "bpf: Extend the sk_lookup() helper to XDP hookpoint."
This reverts commit 8315368983086e70ccc6f103d710903c63cca7df.
Revert "xdp: generic XDP handling of xdp_rxq_info"
This reverts commit 11d9514e6e6801941abf1c0485fd4ef53082d970.
Revert "xdp: move struct xdp_buff from filter.h to xdp.h"
This reverts commit a1795f54e4d99e02d5cb84a46fac0240cf29e206.
Revert "net: avoid including xdp.h in filter.h"
This reverts commit a39c59398f3ab64de44e5953ee0bd23c5136bb48.
Revert "xdp: base API for new XDP rx-queue info concept"
This reverts commit 49fb5bae77ab2041a2ad9f9f87ad7e0a6e215fdf.
Revert "net: Add asynchronous callbacks for xfrm on layer 2."
This reverts commit d0656f64d7719993d5634a9fc6600026e9a805ee.
Revert "xfrm: Separate ESP handling from segmentation for GRO packets."
This reverts commit c8afadf7f5ed8786652d307558345ef90ea91726.
Revert "net: move secpath_exist helper to sk_buff.h"
This reverts commit 0e5483057121dad47567b01845c656955e51989e.
Revert "sk_buff: add skb extension infrastructure"
This reverts commit 3a9ae74b075757495c4becf4dd1eec056d364801.
Revert "fixup: add back code missed during BPF picking"
This reverts commit 74ec8cef7051b5af72f2a6d83ca8c51c3c61c444.
Revert "bpf: undo prog rejection on read-only lock failure"
This reverts commit af2dc6e4993c4221603dbe6e81a3d0c8269f3171.
Revert "bpf: Add helper to retrieve socket in BPF"
This reverts commit 53495e3bc33cb46d9961ea122f576faded058aa1.
Revert "SQUASH! bpf: Add a bpf_sock pointer to __sk_buff and a bpf_sk_fullsock helpe"
This reverts commit 3b25fbf81c041af954d9f5ac1c7867eb07c40b07.
Revert "bpf: introduce bpf_spin_lock"
This reverts commit 0095fb54160e4f8b326fa8df103e334f90c5ab56.
Revert "bpf: enable cgroup local storage map pretty print with kind_flag"
This reverts commit 3fe92cb79b5eae557b113c37b03e78efee2280db.
Revert "bpf: btf: fix struct/union/fwd types with kind_flag"
This reverts commit 2bd4856277f459974dd6234a849cbe20fd475b8f.
Revert "bpf: add bpffs pretty print for cgroup local storage maps"
This reverts commit e07d8c8279f37cee8471846a63acc51f1ab7ce03.
Revert "bpf: pass struct btf pointer to the map_check_btf() callback"
This reverts commit 78a8140faf32710799c19495db28d71693c98030.
Revert "bpf: Define cgroup_bpf_enabled for CONFIG_CGROUP_BPF=n"
This reverts commit aada945d89950c67099e490af1c4c25eef7f31e6.
Revert "bpf: introduce per-cpu cgroup local storage"
This reverts commit d37432968663559f06c7fd7df44197a807fb84ca.
Revert "bpf: btf: Rename btf_key_id and btf_value_id in bpf_map_info"
This reverts commit 063c5a25e5f47e8b82b6c43a44ed7be851884abb.
Revert "bpf: fix a compilation error when CONFIG_BPF_SYSCALL is not defined"
This reverts commit bcf5bfaf50bb6f1f981d5c538f87e6da7aab78f2.
Revert "bpf: Create a new btf_name_by_offset() for non type name use case"
This reverts commit 52b4739d0bdd763e1b00feb50bef8a821f5c7570.
Revert "bpf: reject any prog that failed read-only lock"
This reverts commit 30d1bfec06a3bcaa773213113904580e3046a57a.
Revert "bpf: Add bpf_line_info support"
This reverts commit 50b094eeeb1ced32c62b3a10045bbf43126de760.
Revert "bpf: don't leave partial mangled prog in jit_subprogs error path"
This reverts commit a466f85be89f5daab4bd748f92915ea713d63934.
Revert "bpf: btf: support proper non-jit func info"
This reverts commit 492a556de94c502376ec3b0d5a724ec9fe9f6996.
Revert "bpf: Introduce bpf_func_info"
This reverts commit 39cade88686b0d9b7befc1f14e9d2c2cad19a769.
Revert "bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO"
This reverts commit 2010b6bacc271a48e74942506f3cf45268b6c264.
Revert "bpf: fix bpf_prog_get_info_by_fd to return 0 func_lens for unpriv"
This reverts commit a0ea14ac88a0f5529a635fc6e20277942fc6bb99.
Revert "bpf: Expose check_uarg_tail_zero()"
This reverts commit 1190aaae686534c2854838b3d642dac45d26b1f4.
Revert "bpf: Append prog->aux->name in bpf_get_prog_name()"
This reverts commit 8b82528df4a11a8501393c854978662fc218014e.
Revert "bpf: get JITed image lengths of functions via syscall"
This reverts commit 0722dbc626915fcb9acb952ebc1fcb0c4554cb07.
Revert "bpf: get kernel symbol addresses via syscall"
This reverts commit 6736ec7558dd262fef6669eec02a9797c7c4ecb7.
Revert "bpf: Add gpl_compatible flag to struct bpf_prog_info"
This reverts commit b60c7a51fd3692259c93413f3e87150078be1dac.
Revert "bpf: centre subprog information fields"
This reverts commit b5186fdf6f3e1bb38d7e4abfed5bf7dd6f85a6c3.
Revert "bpf: unify main prog and subprog"
This reverts commit e8e2ad5d9ae98bc7b85b99c0712a5dfbfc151a41.
Revert "bpf: fix maximum stack depth tracking logic"
This reverts commit 10c7127615dc2c00b724069a1620b2232d905113.
Revert "bpf, x64: fix memleak when not converging on calls"
This reverts commit 6bc867f718ef2656266f984b605151971026cc98.
Revert "bpf: decouple btf from seq bpf fs dump and enable more maps"
This reverts commit 3036e2c4384d3f43c695b88c8a1cf97b8337e3bd.
Revert "bpf: Add reference tracking to verifier"
This reverts commit 3a4900a188ac4de817dc6f114f01159d7bdd2f3e.
Revert "bpf: properly enforce index mask to prevent out-of-bounds speculation"
This reverts commit ef85925d5c07b46f7447487605da601fc7be026e.
Revert "bpf, verifier: detect misconfigured mem, size argument pair"
This reverts commit c3853ee3cb96833e907f18bf90e78040fe4cf06f.
Revert "bpf: introduce ARG_PTR_TO_MEM_OR_NULL"
This reverts commit 58560e13f545f2a079bbce17ac1b731d8b94fec7.
Revert "bpf: Macrofy stack state copy"
This reverts commit 88d98d8c2ae320ab248150eb86e1c89427e5017c.
Revert "bpf: Generalize ptr_or_null regs check"
This reverts commit d2cbc2e57b8624699a1548e67b7b3ce992b396fc.
Revert "bpf: Add iterator for spilled registers"
This reverts commit d956e1ba51a7e5ce86bb35002e26d4c1e0a2497c.
Revert "bpf/verifier: refine retval R0 state for bpf_get_stack helper"
This reverts commit ceaf6d678ccb60da107b0455da64c7bf90c5102d.
Revert "bpf: Remove struct bpf_verifier_env argument from print_bpf_insn"
This reverts commit 058fd54c07a289f9b506f2d2326434e411fa65fe.
Revert "bpf: annotate bpf_insn_print_t with __printf"
This reverts commit 9b07d2ccf07855d62446e274d817672713f15be4.
Revert "bpf: allow for correlation of maps and helpers in dump"
This reverts commit af690c2e2d177352f7270f77d8a6bc9e9f60c98c.
Revert "bpf: Add bpf_patch_call_args prototype to include/linux/bpf.h"
This reverts commit 8a2c588b3ab98916147fe4a449312ce8db70c471.
Revert "bpf: x64: add JIT support for multi-function programs"
This reverts commit 752f261e545f80942272c6becf82def1729f84be.
Revert "bpf: fix net.core.bpf_jit_enable race"
This reverts commit 4720901114c20204aa3ffa2076265d2c8cc9e81b.
Revert "bpf: add support for bpf_call to interpreter"
This reverts commit c79b2e547adc8e50dabc72244370cfd37ac6a6bd.
Revert "bpf: introduce function calls (verification)"
This reverts commit f779fda96c7d9e921525f48d67fa2e9c68b4bd48.
Revert "bpf: cleanup register_is_null()"
This reverts commit 1c81f751670b4feb3102e4de136e25fa24e303fe.
Revert "bpf: print liveness info to verifier log"
This reverts commit fdc851301b33b9d646bd1d37124cbd45cedcd62b.
Revert "bpf: also improve pattern matches for meta access"
This reverts commit 9aa150d07927b911f26e0db2af0efd6aa07b8707.
Revert "bpf: add meta pointer for direct access"
This reverts commit 94f3f502ef9ef150ed687113cfbd38e91b5edc44.
Revert "bpf: rename bpf_compute_data_end into bpf_compute_data_pointers"
This reverts commit 9573c6feb301346cd1493eea4e363c6d8345e899.
Revert "bpf: squash of log related commits"
This reverts commit b08f2111e030a72a92eec4ebd6201165d03a20b8.
Revert "bpf: move instruction printing into a separate file"
This reverts commit 8fcbd39afb58847914f3f84d9c076000e09d2fb9.
Revert "bpf: btf: Introduce BTF ID"
This reverts commit 423c40d67dfc783c3b0cb227d9da53e725e0f35c.
Revert "bpf: btf: Add pretty print support to the basic arraymap"
This reverts commit 6cd4d5bba662ca0d8980e5806ef37e0341eab929.
Revert "nsfs: clean-up ns_get_path() signature to return int"
This reverts commit ec1ce41701f411c5dee396cec2931fb651f447cc.
Revert "bpf_obj_do_pin(): switch to vfs_mkobj(), quit abusing ->mknod()"
This reverts commit 8fbcb4ebf5a751f4685cdd2757cff2264032a5d9.
Revert "bpf: offload: report device information about offloaded maps"
This reverts commit 1105e63f25a9db675671288b583a5ce2c7d10b1f.
Revert "bpf: offload: add map offload infrastructure"
This reverts commit 20cdf9df3d5bd010d799ea3c80219f625c998307.
Revert "bpf: add map_alloc_check callback"
This reverts commit 6feb4121ea083053ac9587ac426195efe9fb143d.
Revert "bpf: offload: factor out netdev checking at allocation time"
This reverts commit 1425fb5676b8fe9d761f2f6545e4be8880ce0ac8.
Revert "bpf: rename bpf_dev_offload -> bpf_prog_offload"
This reverts commit a03ae0ec508200433fd6c35b87e342df4de0b320.
Revert "bpf: offload: allow netdev to disappear while verifier is running"
This reverts commit f6cf7214fd1ff3a018009ba90c33eac1d8de21de.
Revert "bpf: offload: free program id when device disappears"
This reverts commit b12b5e56b799cfe900ab8f0ee4177c6c08a904c6.
Revert "bpf: offload: report device information for offloaded programs"
This reverts commit c73c9a0ffa332eeb49927a48780f5537597e2d42.
Revert "bpf: offload: don't require rtnl for dev list manipulation"
This reverts commit 1993f08662f07581a370899a2da209ba0c996dbb.
Revert "bpf: offload: ignore namespace moves"
This reverts commit 9fefb21d8aa2691019f9c4f0b8025fb45ba60b49.
Revert "bpf: Add PTR_TO_SOCKET verifier type"
This reverts commit 55fdbc844801cd4007237fa6c5842b46985a5c9a.
Revert "bpf: extend cgroup bpf core to allow multiple cgroup storage types"
This reverts commit a6d82e371ef32fb24d493cff32765b4607581dd4.
Revert "bpf: permit CGROUP_DEVICE programs accessing helper bpf_get_current_cgroup_id()"
This reverts commit 1bfd0a07a8317004a89d6de736e24861db8281b5.
Revert "bpf: implement bpf_get_current_cgroup_id() helper"
This reverts commit 23603ed6d7df86392701a7ea7d9a1dba66f28d4b.
Revert "bpf: introduce the bpf_get_local_storage() helper function"
This reverts commit 3d777256b1c9f34975c5230d836023ea3e0d4cfd.
Revert "bpf/verifier: introduce BPF_PTR_TO_MAP_VALUE"
This reverts commit 93c12733dc97984f7bf57a77160eacc480bfc3de.
Revert "bpf: extend bpf_prog_array to store pointers to the cgroup storage"
This reverts commit b26baff1fb34607938c9ac0e421e3f4b5fedad4d.
Revert "BACKPORT: bpf: allocate cgroup storage entries on attaching bpf programs"
This reverts commit 804605c21a3be3277c0031504dcd3fdd1be64290.
Revert "bpf: include errno.h from bpf-cgroup.h"
This reverts commit 6b4df332b357e9a5942ca4c6f985cd33dfc30e25.
Revert "bpf: pass a pointer to a cgroup storage using pcpu variable"
This reverts commit c8af92dc9fc00e49f06f6997969284ef5e5c5af5.
Revert "bpf: introduce cgroup storage maps"
This reverts commit c61c2271cb8a1e47678bddc8cdfae83035a07fec.
Revert "bpf: add ability to charge bpf maps memory dynamically"
This reverts commit 3a430745e9f675b450477fffead5568046432f29.
Revert "bpf: add helper for copying attrs to struct bpf_map"
This reverts commit 6d7be0ae93371692e564c00003ce184cbaefbb8d.
Revert "bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP"
This reverts commit 15f584d2d3d4814cfbd3059ab810db02af8773a0.
Revert "bpf/tracing: fix a deadlock in perf_event_detach_bpf_prog"
This reverts commit fc9bf5e48985f7c3a39bf34a27477a2607a5dc6d.
Revert "bpf: set maximum number of attached progs to 64 for a single perf tp"
This reverts commit 0d5fc9795d824fbca21b81c8d91748ba21313d4c.
Revert "bpf: avoid rcu_dereference inside bpf_event_mutex lock region"
This reverts commit 948e200e3173dd959de907e326f2a2c90eda4b28.
Revert "bpf: fix bpf_prog_array_copy_to_user() issues"
This reverts commit 66811698b8de9b3cf13c09730d287b6d1d5d3699.
Revert "bpf: fix pointer offsets in context for 32 bit"
This reverts commit 99661813c136c52e56b328a2a8ecd2bc0e187eba.
Revert "BACKPORT: bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data"
This reverts commit 36f0ea00dd121b13f80617e5b2eb93ba160df85a.
Revert "BACKPORT: bpf: Sysctl hook"
This reverts commit 4a543990e03b5de4a2c23777abd0f77afd61cc2d.
Revert "BACKPORT: flow_dissector: implements flow dissector BPF hook"
This reverts commit de610a8a4324170a0deaf12e2e64c2ff068785fb.
Revert "BACKPORT: bpf: Add base proto function for cgroup-bpf programs"
This reverts commit f3ac0a6cbec3472ff2e3808a436891881f3cbf87.
Revert "FROMLIST: [net-next,v2,1/2] bpf: Allow CGROUP_SKB eBPF program to access sk_buff"
This reverts commit 6d4dcc0e3de628003d91075e4b1ab1a128b8892e.
Revert "BACKPORT: bpf: introduce BPF_RAW_TRACEPOINT"
This reverts commit b2a5c6b4958c8250e58ddb6c334018a5f7ee5437.
Revert "bpf/tracing: fix kernel/events/core.c compilation error"
This reverts commit 70249d4eb7359e9dc59e044951beb99d0d8725cd.
Revert "BACKPORT: bpf/tracing: allow user space to query prog array on the same tp"
This reverts commit 08a6d8c01372940bfec78fdc6cb8a47e08c745b0.
Revert "bpf: sockmap, add sock close() hook to remove socks"
This reverts commit e6b363b8d09d9740dff309fb4dc88e7a1e90726b.
Revert "BACKPORT: bpf: remove the verifier ops from program structure"
This reverts commit 94c2f61efa741bf6a97415f42cfbfb9ec83dfd8e.
Revert "bpf, cgroup: implement eBPF-based device controller for cgroup v2"
This reverts commit 22faa9c56550a34488e607ca3aca59c68b1f7938.
Revert "BACKPORT: bpf: split verifier and program ops"
This reverts commit d2b1388504c1129d5756bb9b20af9bd64e75d015.
Revert "bpf: btf: Break up btf_type_is_void()"
This reverts commit 052989c47b68feaf381d371ec1e6a169edc26d30.
Revert "bpf: btf: refactor btf_int_bits_seq_show()"
This reverts commit 8cc3fb30656cfab91205194a8ee7661bdd95e005.
Revert "BACKPORT: bpf: fix unconnected udp hooks"
This reverts commit b108e725aa70e39cfd37296d1a1d31e8896fa7b7.
Revert "BACKPORT: bpf: enforce return code for cgroup-bpf programs"
This reverts commit 10215080915bfbdaa9f666a95ffda02cc1ef7a29.
Revert "bpf: Hooks for sys_sendmsg"
This reverts commit cd847db1be8a37e0e7e9c813b5d8f93697dc5af0.
Revert "BACKPORT: devmap: Allow map lookups from eBPF"
This reverts commit 37da95fde647e8967b362e0769136bfbebc03628.
Revert "BACKPORT: xdp: Add devmap_hash map type for looking up devices by hashed index"
This reverts commit ae6a87f44c4ef20ac290ce68c4d5b542cf46f3d7.
Revert "kernel: bpf: devmap: Create __dev_map_alloc_node"
This reverts commit 15928a97ed93cf9f606a21bf869ff421b997a2c5.
Revert "BACKPORT: bpf: Post-hooks for sys_bind"
This reverts commit c221d44e76c3ab69285c9986680e5eb726cf157b.
Revert "BACKPORT: bpf: Hooks for sys_connect"
This reverts commit 003311ea43163c77e4e0c1921b81438286925baa.
Revert "BACKPORT: net: Introduce __inet_bind() and __inet6_bind"
This reverts commit 74f1eb60012c13bd606e4dc718e63aec7f8cce8f.
Revert "BACKPORT: bpf: Hooks for sys_bind"
This reverts commit cef0bd97f2fec8363c3ef58b2cb508deaa9bc5b2.
Revert "BACKPORT: bpf: introduce BPF_PROG_QUERY command"
This reverts commit a4ef81ce48cb25843ddb4d4331dacf2742215909.
Revert "BACKPORT: bpf: Check attach type at prog load time"
This reverts commit 750a3f976c75797e572a6dfdd2e8865b8b49964a.
Revert "bpf: offload: rename the ifindex field"
This reverts commit 921e6becfb28fbe505603bf927f195d1d72a0eea.
Revert "BACKPORT: bpf: offload: add infrastructure for loading programs for a specific netdev"
This reverts commit cb1607a58d026a4ac1d9e71f6c3cd1dc23820e2f.
Revert "BACKPORT: net: bpf: rename ndo_xdp to ndo_bpf"
This reverts commit 932d47ebc5910bb1ec954002206b1ce8749a9cd6.
Revert "bpf: btf: fix truncated last_member_type_id in btf_struct_resolve"
This reverts commit e7af669fe00a8e2030913088836189a9f65a04d8.
Revert "bpf/btf: Fix BTF verification of enum members in struct/union"
This reverts commit a098516b98fe35e8f0e89709443fff8b37eb04b8.
Revert "bpf: fix BTF limits"
This reverts commit 794ad07fab9540989f96351c11b039e2229c2a8e.
Revert "bpf, btf: fix a missing check bug in btf_parse"
This reverts commit 27c4178ecc8edbb2306fa479f275ffd35f5b57c9.
Revert "bpf: btf: Fix a missing check bug"
This reverts commit 71f5a7d140aa5a37d164e217b2fefcb2d409b894.
Revert "bpf: btf: Fix end boundary calculation for type section"
This reverts commit 549615befd671b6877677acb009b66cd374408d3.
Revert "bpf: fix bpf_skb_load_bytes_relative pkt length check"
This reverts commit 5f3d68c4da18dfbcde4c02cb34c63599709fcf3c.
Revert "bpf: btf: Ensure the member->offset is in the right order"
This reverts commit 4f9d26cbc747a4728c4944b7dc9725fc2737f892.
Revert "bpf: btf: Clean up BTF_INT_BITS() in uapi btf.h"
This reverts commit 480c6f80a14431f6d680a687363dcb0d9cd1d7a8.
Revert "bpf: btf: Fix bitfield extraction for big endian"
This reverts commit 0463c259aa21e99d1bf798c8cf54da18b5906938.
Revert "bpf: btf: Ensure t->type == 0 for BTF_KIND_FWD"
This reverts commit ecc54be6970a3484eb163ac09996856c9ece5727.
Revert "bpf: btf: Check array t->size"
This reverts commit 3cda848b9be9fbb6dfa8912a425801c263bcbff7.
Revert "bpf: btf: avoid -Wreturn-type warning"
This reverts commit fd7fede5952004dcacb39f318249c4cf8e5c51e0.
Revert "bpf: btf: Avoid variable length array"
This reverts commit 2826641eb171c705d0b2db86d8834eff33945d0e.
Revert "bpf: btf: Remove unused bits from uapi/linux/btf.h"
This reverts commit 2d9e7a574f7e47a027974ec616ac812ad6a2d086.
Revert "bpf: btf: Check array->index_type"
This reverts commit f9ee68f7e8a471450536a70b43bd96d4bdfbfb81.
Revert "bpf: btf: Change how section is supported in btf_header"
This reverts commit 63a4474da4bf56c8a700d542bcf3a57a4b737ed6.
Revert "bpf: Fix compiler warning on info.map_ids for 32bit platform"
This reverts commit a4f706ea7d2b874ef739168a12a30ae5454487a6.
Revert "BACKPORT: bpf: Use char in prog and map name"
This reverts commit 8d4ad88eabb5d1500814c5f5b76a11f80346669c.
Revert "bpf: Change bpf_obj_name_cpy() to better ensure map's name is init by 0"
This reverts commit c4acfd3c9f5a97123c240676750f3e4ae2a2c24c.
Revert "BACKPORT: bpf: Add map_name to bpf_map_info"
This reverts commit 0e03a4e584eabe3f4c448f06f271753cdaae3aab.
Revert "BACKPORT: bpf: Add name, load_time, uid and map_ids to bpf_prog_info"
This reverts commit 16872f60e6c1fc6b10e905ff18c14d8aaeb4e09d.
Revert "bpf: btf: Avoid WARN_ON when CONFIG_REFCOUNT_FULL=y"
This reverts commit 0b618ec6e162e650aaa583a31f4de4c4558148bf.
Revert "BACKPORT: bpf: btf: Clean up btf.h in uapi"
This reverts commit ea0c0ad08c18ddf62dbb6c8edc814c75cbb3e8b9.
Revert "bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd"
This reverts commit f51fe1d1edb742176c622bc93301e98a1cbf2e63.
Revert "BACKPORT: bpf: btf: Add BPF_BTF_LOAD command"
This reverts commit 85db8f764069f15d1b181bea67336ce4d66a58c1.
Revert "bpf: btf: Add pretty print capability for data with BTF type info"
This reverts commit 0a8aae433c53b1f441cab70979517660fb6a6038.
Revert "bpf: btf: Check members of struct/union"
This reverts commit ce2e8103ac1a977ce32db51ec042faea6f100a3d.
Revert "bpf: btf: Validate type reference"
This reverts commit a1aa96e6dae2b4c8c0b0a4dedab3006d3f697460.
Revert "bpf: Update logging functions to work with BTF"
This reverts commit b9289460f0a6b5c261ec0b6dcafa6fcd09d4957e.
Revert "BACKPORT: bpf: btf: Introduce BPF Type Format (BTF)"
This reverts commit ceebd58f6470e8ec6d9d694ab382fe88f43b998b.
Revert "BACKPORT: bpf: Rename bpf_verifer_log"
This reverts commit 50bdc7513d966811fb418d24a0e5797ffd8c907c.
Revert "BACKPORT: bpf: encapsulate verifier log state into a structure"
This reverts commit 0bcb397bde4675fdeb977d9debed20ed213f9ecd.
Change-Id: Iecaa276b078c6d2db773a8071e7da9e6195277d6
|
||
|
|
622705f7c9 |
Merge tag 'v4.14.356-openela-rc1' of https://github.com/openela/kernel-lts
This is the 4.14.356 OpenELA-Extended LTS stable release candidate 1 Conflicts: arch/arm/include/asm/uaccess.h drivers/android/binder.c drivers/android/binder_alloc.c drivers/block/loop.c drivers/infiniband/ulp/srpt/ib_srpt.c drivers/mmc/core/mmc_test.c drivers/net/usb/usbnet.c fs/aio.c fs/f2fs/inode.c fs/f2fs/namei.c fs/f2fs/segment.c fs/f2fs/super.c fs/select.c include/linux/fs.h include/net/netns/ipv4.h kernel/power/swap.c mm/page_alloc.c net/core/filter.c net/ipv4/af_inet.c net/ipv4/sysctl_net_ipv4.c net/ipv4/tcp_ipv4.c net/ipv6/af_inet6.c net/qrtr/qrtr.c sound/usb/stream.c Change-Id: I016dabcf8f4fd90dae7083272b3465d184c07de8 |
||
|
|
1c8f7c550e |
bpf: add writable context for raw tracepoints
This is an opt-in interface that allows a tracepoint to provide a safe buffer that can be written from a BPF_PROG_TYPE_RAW_TRACEPOINT program. The size of the buffer must be a compile-time constant, and is checked before allowing a BPF program to attach to a tracepoint that uses this feature. The pointer to this buffer will be the first argument of tracepoints that opt in; the pointer is valid and can be bpf_probe_read() by both BPF_PROG_TYPE_RAW_TRACEPOINT and BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE programs that attach to such a tracepoint, but the buffer to which it points may only be written by the latter. Signed-off-by: Matt Mullins <mmullins@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
81a8cfcd3f |
bpf: implement bpf_get_current_cgroup_id() helper
bpf has been used extensively for tracing. For example, bcc contains an almost full set of bpf-based tools to trace kernel and user functions/events. Most tracing tools are currently either filtered based on pid or system-wide. Containers have been used quite extensively in industry and cgroup is often used together to provide resource isolation and protection. Several processes may run inside the same container. It is often desirable to get container-level tracing results as well, e.g. syscall count, function count, I/O activity, etc. This patch implements a new helper, bpf_get_current_cgroup_id(), which will return cgroup id based on the cgroup within which the current task is running. The later patch will provide an example to show that userspace can get the same cgroup id so it could configure a filter or policy in the bpf program based on task cgroup id. The helper is currently implemented for tracing. It can be added to other program types as well when needed. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
4d904156cc |
bpf/tracing: fix a deadlock in perf_event_detach_bpf_prog
syzbot reported a possible deadlock in perf_event_detach_bpf_prog. The error details: ====================================================== WARNING: possible circular locking dependency detected 4.16.0-rc7+ #3 Not tainted ------------------------------------------------------ syz-executor7/24531 is trying to acquire lock: (bpf_event_mutex){+.+.}, at: [<000000008a849b07>] perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854 but task is already holding lock: (&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mm->mmap_sem){++++}: __might_fault+0x13a/0x1d0 mm/memory.c:4571 _copy_to_user+0x2c/0xc0 lib/usercopy.c:25 copy_to_user include/linux/uaccess.h:155 [inline] bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694 perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891 _perf_ioctl kernel/events/core.c:4750 [inline] perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770 vfs_ioctl fs/ioctl.c:46 [inline] do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686 SYSC_ioctl fs/ioctl.c:701 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 -> #0 (bpf_event_mutex){+.+.}: lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920 __mutex_lock_common kernel/locking/mutex.c:756 [inline] __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854 perf_event_free_bpf_prog kernel/events/core.c:8147 [inline] _free_event+0xbdb/0x10f0 kernel/events/core.c:4116 put_event+0x24/0x30 kernel/events/core.c:4204 perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172 remove_vma+0xb4/0x1b0 mm/mmap.c:172 remove_vma_list mm/mmap.c:2490 [inline] do_munmap+0x82a/0xdf0 mm/mmap.c:2731 mmap_region+0x59e/0x15a0 mm/mmap.c:1646 do_mmap+0x6c0/0xe00 
mm/mmap.c:1483 do_mmap_pgoff include/linux/mm.h:2223 [inline] vm_mmap_pgoff+0x1de/0x280 mm/util.c:355 SYSC_mmap_pgoff mm/mmap.c:1533 [inline] SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491 SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(bpf_event_mutex); lock(&mm->mmap_sem); lock(bpf_event_mutex); *** DEADLOCK *** ====================================================== The bug was introduced by commit f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp") where copy_to_user, which requires mm->mmap_sem, is called inside the bpf_event_mutex lock. At the same time, during perf_event file descriptor close, mm->mmap_sem is held first and then the subsequent perf_event_detach_bpf_prog needs the bpf_event_mutex lock. Such a scenario caused a deadlock. As suggested by Daniel, moving copy_to_user out of the bpf_event_mutex lock should fix the problem. Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp") Reported-by: syzbot+dc5ca0e4c9bfafaf2bae@syzkaller.appspotmail.com Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
ae71256def |
bpf: set maximum number of attached progs to 64 for a single perf tp
cgroup+bpf prog array has a maximum number of 64 programs.
Let us apply the same limit here.
Fixes: e87c6bc3852b ("bpf: permit multiple bpf attachments for a single perf event")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
||
|
|
e72762af21 |
bpf: avoid rcu_dereference inside bpf_event_mutex lock region
During perf event attaching/detaching bpf programs,
the tp_event->prog_array change is protected by the
bpf_event_mutex lock in both the attaching and detaching
functions. Although tp_event->prog_array is an rcu
pointer, rcu_dereference is not needed to access it
since the mutex lock guarantees ordering.
Verified through "make C=2" that the sparse
locking check is still happy with the new change.
Also change the label name in perf_event_{attach,detach}_bpf_prog
from "out" to "unlock" to reflect the code action after the label.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
||
|
|
56281ddce6 |
BACKPORT: bpf: introduce BPF_RAW_TRACEPOINT
Introduce BPF_PROG_TYPE_RAW_TRACEPOINT bpf program type to access
kernel internal arguments of the tracepoints in their raw form.
From the bpf program point of view the access to the arguments looks like:
struct bpf_raw_tracepoint_args {
__u64 args[0];
};
int bpf_prog(struct bpf_raw_tracepoint_args *ctx)
{
// program can read args[N] where N depends on tracepoint
// and statically verified at program load+attach time
}
kprobe+bpf infrastructure allows programs to access function arguments.
This feature allows programs to access raw tracepoint arguments.
Similar to the proposed 'dynamic ftrace events' there are no ABI guarantees
as to what the tracepoint arguments are and what their meaning is.
The program needs to type cast args properly and use bpf_probe_read()
helper to access struct fields when argument is a pointer.
For every tracepoint __bpf_trace_##call function is prepared.
In assembler it looks like:
(gdb) disassemble __bpf_trace_xdp_exception
Dump of assembler code for function __bpf_trace_xdp_exception:
0xffffffff81132080 <+0>: mov %ecx,%ecx
0xffffffff81132082 <+2>: jmpq 0xffffffff811231f0 <bpf_trace_run3>
where
TRACE_EVENT(xdp_exception,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp, u32 act),
The above assembler snippet is casting 32-bit 'act' field into 'u64'
to pass into bpf_trace_run3(), while 'dev' and 'xdp' args are passed as-is.
All of the ~500 __bpf_trace_*() functions are only 5-10 bytes long
and in total this approach adds 7k bytes to .text.
This approach gives the lowest possible overhead
while calling trace_xdp_exception() from kernel C code and
transitioning into bpf land.
Since tracepoint+bpf are used at speeds of 1M+ events per second,
this is a valuable optimization.
The new BPF_RAW_TRACEPOINT_OPEN sys_bpf command is introduced
that returns anon_inode FD of 'bpf-raw-tracepoint' object.
The user space looks like:
// load bpf prog with BPF_PROG_TYPE_RAW_TRACEPOINT type
prog_fd = bpf_prog_load(...);
// receive anon_inode fd for given bpf_raw_tracepoint with prog attached
raw_tp_fd = bpf_raw_tracepoint_open("xdp_exception", prog_fd);
Ctrl-C of a tracing daemon or cmdline tool that uses this feature
will automatically detach the bpf program, unload it and
unregister the tracepoint probe.
On the kernel side the __bpf_raw_tp_map section of pointers to
tracepoint definition and to __bpf_trace_*() probe function is used
to find a tracepoint with "xdp_exception" name and
corresponding __bpf_trace_xdp_exception() probe function
which are passed to tracepoint_probe_register() to connect probe
with tracepoint.
Addition of bpf_raw_tracepoint doesn't interfere with ftrace and perf
tracepoint mechanisms. perf_event_open() can be used in parallel
on the same tracepoint.
Multiple bpf_raw_tracepoint_open("xdp_exception", prog_fd) calls are
permitted, each with its own bpf program. The kernel will execute
all tracepoint probes and all attached bpf programs.
In the future bpf_raw_tracepoints can be extended with
query/introspection logic.
__bpf_raw_tp_map section logic was contributed by Steven Rostedt
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
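The args[0] layout from the struct above can be modeled in plain user-space C to show how a program indexes raw 64-bit slots; this is only a memory-layout sketch with hypothetical names, while real programs run under the BPF verifier:

```c
#include <stdint.h>

/* User-space model of the bpf_raw_tracepoint_args layout shown above:
 * no header, just one raw 64-bit slot per tracepoint argument
 * (args[0] is the GNU zero-length-array idiom the kernel uses). */
struct raw_tp_args_model {
    uint64_t args[0];
};

/* A program reads args[n] and casts it back to the real type itself;
 * which n are valid is fixed per tracepoint and, in the kernel,
 * verified at program load+attach time. */
static uint64_t read_raw_arg(const struct raw_tp_args_model *ctx, int n)
{
    return ctx->args[n];
}
```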
|
||
|
|
78753a55f2 |
bpf/tracing: fix kernel/events/core.c compilation error
Commit f371b304f12e ("bpf/tracing: allow user space to
query prog array on the same tp") introduced a perf
ioctl command to query prog array attached to the
same perf tracepoint. The commit introduced a
compilation error under certain config conditions, e.g.,
(1). CONFIG_BPF_SYSCALL is not defined, or
(2). CONFIG_TRACING is defined but neither CONFIG_UPROBE_EVENTS
nor CONFIG_KPROBE_EVENTS is defined.
Error message:
kernel/events/core.o: In function `perf_ioctl':
core.c:(.text+0x98c4): undefined reference to `bpf_event_query_prog_array'
This patch fixed this error by guarding the real definition under
CONFIG_BPF_EVENTS and providing a static inline dummy function
when CONFIG_BPF_EVENTS is not defined.
It renamed the function from bpf_event_query_prog_array to
perf_event_query_prog_array and moved the definition from linux/bpf.h
to linux/trace_events.h so the definition is in proximity to
other prog_array related functions.
Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
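The guard pattern the patch describes can be sketched generically; here CONFIG_BPF_EVENTS is just an ordinary macro standing in for the kernel config option, and the stub's name and return value are illustrative:

```c
#include <errno.h>

/* Sketch of the fix: the real definition is only built when the
 * subsystem is configured in; otherwise a static inline stub lets
 * every caller link without referencing the missing symbol. */
#ifdef CONFIG_BPF_EVENTS
int perf_event_query_prog_array_model(void)
{
    return 0;                   /* real implementation would live here */
}
#else
static inline int perf_event_query_prog_array_model(void)
{
    return -EOPNOTSUPP;         /* callers get a clean error, not a link failure */
}
#endif
```

With the config macro undefined, callers compile against the stub and receive an error code at run time instead of hitting an undefined reference at link time.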
|
||
|
|
617bdda030 |
BACKPORT: bpf/tracing: allow user space to query prog array on the same tp
Commit e87c6bc3852b ("bpf: permit multiple bpf attachments
for a single perf event") added support to attach multiple
bpf programs to a single perf event.
Although this provides flexibility, users may want to know
which other bpf programs are attached to the same tp interface.
Besides getting visibility into the underlying bpf system,
such information may also help consolidate multiple bpf programs,
understand potential performance issues due to a large array,
and debug (e.g., one bpf program which overwrites the return code
may impact subsequent program results).
Commit
|
||
|
|
61d5c303c8 |
BACKPORT: bpf: split verifier and program ops
struct bpf_verifier_ops contains both verifier ops and operations
used later during the program's lifetime (test_run). Split the
runtime ops into a different structure.
BPF_PROG_TYPE() will now append ## _prog_ops or ## _verifier_ops
to the names.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
||
|
|
825559c99e |
tracing: Remove precision vsnprintf() check from print event
[ Upstream commit 5efd3e2aef91d2d812290dcb25b2058e6f3f532c ]
This reverts 60be76eeabb3d ("tracing: Add size check when printing
trace_marker output"). The only reason the precision check was added
was a bug that miscalculated the write size of the string into the
ring buffer and truncated it, removing the terminating nul byte. On
reading the trace this crashed the kernel. But that was due to a bug
in code that only existed during development and should never happen
in practice. If anything, the precision can hide bugs where the string
in the ring buffer isn't nul terminated, and it will not be checked.
Link: https://lore.kernel.org/all/C7E7AF1A-D30F-4D18-B8E5-AF1EF58004F5@linux.ibm.com/
Link: https://lore.kernel.org/linux-trace-kernel/20240227125706.04279ac2@gandalf.local.home
Link: https://lore.kernel.org/all/20240302111244.3a1674be@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240304174341.2a561d9f@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 60be76eeabb3d ("tracing: Add size check when printing trace_marker output")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit f3de4b5d1ab8139aee39cc8afbd86a2cf260ad91)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
|
||
|
|
fdfd1ef491 |
tracing: Avoid possible softlockup in tracing_iter_reset()
[ Upstream commit 49aa8a1f4d6800721c7971ed383078257f12e8f9 ]
In __tracing_open(), when a max latency tracer ran on the cpu, the
start time of its buffer would be updated, and event entries with
timestamps earlier than the start of the buffer would be skipped
(see tracing_iter_reset()).
A softlockup will occur if the kernel is non-preemptible and too many
entries are skipped in the loop that resets every cpu buffer, so add
cond_resched() to avoid it.
Cc: stable@vger.kernel.org
Fixes:
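The shape of the fix can be illustrated with a user-space analog, where sched_yield() stands in for the kernel's cond_resched() and the batch size is arbitrary:

```c
#include <sched.h>

/* User-space analog of the fix: a long skip loop that yields the CPU
 * periodically so one thread cannot monopolize it.  In the kernel the
 * yield point is cond_resched(), called once per cpu buffer reset. */
static long skip_entries(long n)
{
    long skipped = 0;

    for (long i = 0; i < n; i++) {
        skipped++;
        if ((i & 0xffff) == 0)
            sched_yield();      /* stand-in for cond_resched() */
    }
    return skipped;
}
```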
|
||
|
|
ae2112e6a0 |
ring-buffer: Rename ring_buffer_read() to ring_buffer_iter_advance()
[ Upstream commit bc1a72afdc4a91844928831cac85731566e03bc6 ]
When the ring buffer was first created, the iterator followed the
normal producer/consumer operations where it had both a peek()
operation, that just returned the event at the current location, and a
read(), that would return the event at the current location and also
increment the iterator such that the next peek() or read() will return
the next event.
The only use of the ring_buffer_read() is currently to move the
iterator to the next location and nothing now actually reads the event
it returns. Rename this function to its actual use case to
ring_buffer_iter_advance(), which also adds the "iter" part to the
name, which is more meaningful.
As the timestamp returned by ring_buffer_read() was never used, there's
no reason that this new version should bother returning it. It will
also become a void function.
Link: http://lkml.kernel.org/r/20200317213416.018928618@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Stable-dep-of: 49aa8a1f4d68 ("tracing: Avoid possible softlockup in tracing_iter_reset()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit ac8ffa21dde0c1edcd9dd98b5555a0aa4eea3b1f)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
|
||
|
|
f01fa25d84 |
Merge branch 'deprecated/android-4.14-stable' of https://android.googlesource.com/kernel/common into lineage-21.0
Change-Id: I8750f4152cf3c402ef61f9266766128541dfa05c |
||
|
|
b28271a442 |
tracing: Fix overflow in get_free_elt()
commit bcf86c01ca4676316557dd482c8416ece8c2e143 upstream.
"tracing_map->next_elt" in get_free_elt() is at risk of overflowing.
Once it overflows, new elements can still be inserted into the tracing_map
even though the maximum number of elements (`max_elts`) has been reached.
Continuing to insert elements after the overflow could result in the
tracing_map containing "tracing_map->max_size" elements, leaving no empty
entries.
If any attempt is made to insert an element into a full tracing_map using
`__tracing_map_insert()`, it will cause an infinite loop with preemption
disabled, leading to a CPU hang problem.
Fix this by preventing any further increments to "tracing_map->next_elt"
once it reaches "tracing_map->max_elts".
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes:
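A minimal model of the saturating counter described above; the names follow the commit text, not the real tracing_map code:

```c
#include <stdint.h>

/* Model of the fix: stop incrementing next_elt once it reaches
 * max_elts, so the counter can never wrap and hand out indices
 * again after the map is full. */
struct map_model {
    uint32_t next_elt;
    uint32_t max_elts;
};

/* Returns a free element index, or -1 when the map is full. */
static int get_free_elt_model(struct map_model *map)
{
    if (map->next_elt >= map->max_elts)
        return -1;              /* saturate instead of overflowing */
    return (int)map->next_elt++;
}
```

Once the map reports full, every later call keeps reporting full, which is what breaks the infinite insertion loop the commit describes.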
|
||
|
|
f5c299189c |
BACKPORT: bpf: Check attach type at prog load time
== The problem ==
There are use-cases when a program of some type can be attached to
multiple attach points and those attach points must have different
permissions to access context or to call helpers.
E.g. a context structure may have fields for both IPv4 and IPv6 but it
doesn't make sense to read from / write to the IPv6 field when the
attach point is somewhere in the IPv4 stack.
The same applies to BPF-helpers: it may make sense to call some helper
from some attach point, but not from another for the same prog type.
== The solution ==
Introduce the `expected_attach_type` field in `struct bpf_attr` for
the `BPF_PROG_LOAD` command. If the scenario described in "The problem"
section is the case for some prog type, the field will be checked twice:
1) At load time the prog type is checked to see if the attach type for
it must be known to validate program permissions correctly. The prog
will be rejected with EINVAL if it's the case and `expected_attach_type`
is not specified or has an invalid value.
2) At attach time `attach_type` is compared with `expected_attach_type`,
if the prog type requires one, and, if they differ, the attach will be
rejected with EINVAL.
The `expected_attach_type` is now available as part of `struct bpf_prog`
in both `bpf_verifier_ops->is_valid_access()` and
`bpf_verifier_ops->get_func_proto()` and can be used to check context
accesses and calls to helpers correspondingly.
Initially the idea was discussed by Alexei Starovoitov <ast@fb.com> and
Daniel Borkmann <daniel@iogearbox.net> here:
https://marc.info/?l=linux-netdev&m=152107378717201&w=2
Change-Id: Idead9c9cb4251bf5bd843b68bcb83072d5746226
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
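The two-step check can be modeled in a few lines of C; the enum and function names below are illustrative, not the kernel's:

```c
#include <errno.h>

/* Toy model of the load-time and attach-time checks described above. */
enum model_attach_type { MODEL_ATTACH_NONE, MODEL_ATTACH_INET4, MODEL_ATTACH_INET6 };

struct model_prog {
    enum model_attach_type expected_attach_type;
};

/* 1) Load time: a prog type that requires an attach type must declare one. */
static int model_prog_load(struct model_prog *prog,
                           enum model_attach_type expected,
                           int type_requires_attach_type)
{
    if (type_requires_attach_type && expected == MODEL_ATTACH_NONE)
        return -EINVAL;
    prog->expected_attach_type = expected;
    return 0;
}

/* 2) Attach time: the actual attach point must match what was declared. */
static int model_prog_attach(const struct model_prog *prog,
                             enum model_attach_type attach_type)
{
    if (prog->expected_attach_type != MODEL_ATTACH_NONE &&
        prog->expected_attach_type != attach_type)
        return -EINVAL;
    return 0;
}
```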
||
|
|
e07075f4d5 |
BACKPORT: bpf: add bpf_ktime_get_boot_ns()
On a device like a cellphone which is constantly suspending and
resuming, CLOCK_MONOTONIC is not particularly useful for keeping track
of or reacting to external network events. Instead you want to use
CLOCK_BOOTTIME.
Hence add bpf_ktime_get_boot_ns() as a mirror of bpf_ktime_get_ns()
based around CLOCK_BOOTTIME instead of CLOCK_MONOTONIC.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
(cherry picked from commit 71d19214776e61b33da48f7c1b46e522c7f78221)
Change-Id: Ifd62c410dcc5112fd1a473a7e1f70231ca514bc0
|
||
|
|
a43f1f02b3 |
ring-buffer: Fix a race between readers and resize checks
commit c2274b908db05529980ec056359fae916939fdaa upstream.
The reader code in rb_get_reader_page() swaps a new reader page into the
ring buffer by doing cmpxchg on old->list.prev->next to point it to the
new page. Following that, if the operation is successful,
old->list.next->prev gets updated too. This means the underlying
doubly-linked list is temporarily inconsistent, page->prev->next or
page->next->prev might not be equal back to page for some page in the
ring buffer.
The resize operation in ring_buffer_resize() can be invoked in parallel.
It calls rb_check_pages() which can detect the described inconsistency
and stop further tracing:
[ 190.271762] ------------[ cut here ]------------
[ 190.271771] WARNING: CPU: 1 PID: 6186 at kernel/trace/ring_buffer.c:1467 rb_check_pages.isra.0+0x6a/0xa0
[ 190.271789] Modules linked in: [...]
[ 190.271991] Unloaded tainted modules: intel_uncore_frequency(E):1 skx_edac(E):1
[ 190.272002] CPU: 1 PID: 6186 Comm: cmd.sh Kdump: loaded Tainted: G E 6.9.0-rc6-default #5 158d3e1e6d0b091c34c3b96bfd99a1c58306d79f
[ 190.272011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[ 190.272015] RIP: 0010:rb_check_pages.isra.0+0x6a/0xa0
[ 190.272023] Code: [...]
[ 190.272028] RSP: 0018:ffff9c37463abb70 EFLAGS: 00010206
[ 190.272034] RAX: ffff8eba04b6cb80 RBX: 0000000000000007 RCX: ffff8eba01f13d80
[ 190.272038] RDX: ffff8eba01f130c0 RSI: ffff8eba04b6cd00 RDI: ffff8eba0004c700
[ 190.272042] RBP: ffff8eba0004c700 R08: 0000000000010002 R09: 0000000000000000
[ 190.272045] R10: 00000000ffff7f52 R11: ffff8eba7f600000 R12: ffff8eba0004c720
[ 190.272049] R13: ffff8eba00223a00 R14: 0000000000000008 R15: ffff8eba067a8000
[ 190.272053] FS: 00007f1bd64752c0(0000) GS:ffff8eba7f680000(0000) knlGS:0000000000000000
[ 190.272057] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 190.272061] CR2: 00007f1bd6662590 CR3: 000000010291e001 CR4: 0000000000370ef0
[ 190.272070] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 190.272073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 190.272077] Call Trace:
[ 190.272098] <TASK>
[ 190.272189] ring_buffer_resize+0x2ab/0x460
[ 190.272199] __tracing_resize_ring_buffer.part.0+0x23/0xa0
[ 190.272206] tracing_resize_ring_buffer+0x65/0x90
[ 190.272216] tracing_entries_write+0x74/0xc0
[ 190.272225] vfs_write+0xf5/0x420
[ 190.272248] ksys_write+0x67/0xe0
[ 190.272256] do_syscall_64+0x82/0x170
[ 190.272363] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 190.272373] RIP: 0033:0x7f1bd657d263
[ 190.272381] Code: [...]
[ 190.272385] RSP: 002b:00007ffe72b643f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 190.272391] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1bd657d263
[ 190.272395] RDX: 0000000000000002 RSI: 0000555a6eb538e0 RDI: 0000000000000001
[ 190.272398] RBP: 0000555a6eb538e0 R08: 000000000000000a R09: 0000000000000000
[ 190.272401] R10: 0000555a6eb55190 R11: 0000000000000246 R12: 00007f1bd6662500
[ 190.272404] R13: 0000000000000002 R14: 00007f1bd6667c00 R15: 0000000000000002
[ 190.272412] </TASK>
[ 190.272414] ---[ end trace 0000000000000000 ]---
Note that ring_buffer_resize() calls rb_check_pages() only if the parent
trace_buffer has recording disabled. Recent commit d78ab792705c
("tracing: Stop current tracer when resizing buffer") made this always
the case, which makes it more likely to experience this issue.
The window to hit this race is nonetheless very small. To help
reproducing it, one can add a delay loop in rb_get_reader_page():
ret = rb_head_page_replace(reader, cpu_buffer->reader_page);
if (!ret)
goto spin;
for (unsigned i = 0; i < 1U << 26; i++) /* inserted delay loop */
__asm__ __volatile__ ("" : : : "memory");
rb_list_head(reader->list.next)->prev = &cpu_buffer->reader_page->list;
.. and then run the following commands on the target system:
echo 1 > /sys/kernel/tracing/events/sched/sched_switch/enable
while true; do
echo 16 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
echo 8 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
done &
while true; do
for i in /sys/kernel/tracing/per_cpu/*; do
timeout 0.1 cat $i/trace_pipe; sleep 0.2
done
done
To fix the problem, make sure ring_buffer_resize() doesn't invoke
rb_check_pages() concurrently with a reader operating on the same
ring_buffer_per_cpu by taking its cpu_buffer->reader_lock.
Link: https://lore.kernel.org/linux-trace-kernel/20240517134008.24529-3-petr.pavlu@suse.com
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes:
|