805988 Commits

Author SHA1 Message Date
Michael Bestas
8d3f74786c google_battery: Report battery technology
sunfish is using Li-Po battery

Change-Id: Id6da21d48739acfdc90a4f88402039c0eeb04c12
2025-12-20 17:56:48 +02:00
Michael Bestas
5cfe79f5ee max1720x_battery: Report correct battery technology
coral & flame use Li-Po batteries.

Change-Id: I8782933f029b0f984d76ac909100ae98e1b80f92
2025-12-20 17:56:42 +02:00
Nolen Johnson
e1096a63d8 arch: arm64: configs: floral/sunfish: Disable CFI
* The 5.4 BPF backports introduced the following CFI failure that causes
  crashes on boot:
  ```
   [   39.971958] c7   1177 Kernel panic - not syncing: CFI failure (target: bpf_prog_3cf034b66a4fc635_cgroupsock_inet+0x358/0x1000)
   [   39.971958] c7   1177
   [   40.020188] c7   1177 CPU: 7 PID: 1177 Comm: netd Tainted: G S       C      4.14.355-openela-gc266e5bfed05 #4
   [   40.030374] c7   1177 Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Coral (DT)
   [   40.041090] c7   1177 Call trace:
   [   40.044519] c7   1177  dump_backtrace.cfi_jt+0x0/0x4
   [   40.049632] c7   1177  dump_stack+0x88/0xc0
   [   40.053940] c7   1177  panic+0x134/0x414
   [   40.057980] c7   1177  scs_task_reset+0x0/0x10
   [   40.062553] c7   1177  name_to_dev_t+0x0/0x8a0
   [   40.067129] c7   1177  __cfi_slowpath+0x1f0/0x2b4
   [   40.071966] c7   1177  __cgroup_bpf_run_filter_sk+0x1b0/0x280
   [   40.077872] c7   1177  inet_create+0x4d0/0x528
   [   40.082445] c7   1177  __sock_create+0x190/0x2a0
   [   40.087195] c7   1177  SyS_socket+0x6c/0x150
   [   40.101918] c7   1177  el0_svc_naked+0x34/0x38
   ```
* Prioritize these devices living on, re-enable CFI later when/if
  this is figured out.

Change-Id: Iee4b07ea7f7a562bd5a3e2bec9de906fe14c2389
2025-12-15 22:00:02 -05:00
Michael Bestas
4e27b1cf9f qcacmn: Use proper clock specifier names in functions
Reference: I3463b9045bddeba00d6f9fcf78d63008459c1b9a
Change-Id: Ide613d5a6469bc12cf55979184decd9ebb3cd3d2
2025-10-10 16:09:53 +03:00
Michael Bestas
659c26dfbc Merge branch 'android-msm-pixel-4.14' into lineage-22.2
* 'android-msm-pixel-4.14':
  UPSTREAM: net: inet: do not leave a dangling sk pointer in inet_create()
  UPSTREAM: kdb: use __ktime_get_real_seconds instead of __current_kernel_time
  locking/barriers: Introduce smp_cond_load_relaxed() and atomic_cond_read_relaxed()
  iio: power: pac193x: Use proper clock specifier names in functions
  UPSTREAM: seccomp, bpf: disable preemption before calling into bpf prog
  BACKPORT: bpf: Fix 32 bit src register truncation on div/mod
  BACKPORT: perf, bpf: Introduce PERF_RECORD_BPF_EVENT
  UPSTREAM: perf, bpf: Introduce PERF_RECORD_KSYMBOL
  UPSTREAM: bpf: BPF_PROG_TYPE_CGROUP_{SKB, SOCK, SOCK_ADDR} require cgroups enabled
  UPSTREAM: net: bpfilter: disallow to remove bpfilter module while being used
  UPSTREAM: net: bpfilter: restart bpfilter_umh when error occurred
  UPSTREAM: net: bpfilter: use cleanup callback to release umh_info
  UPSTREAM: signal/bpfilter: Fix bpfilter_kernl to use send_sig not force_sig
  UPSTREAM: net: bpfilter: use get_pid_task instead of pid_task
  UPSTREAM: net: bpfilter: Fix type cast and pointer warnings
  BACKPORT: bpfilter: include bpfilter_umh in assembly instead of using objcopy
  UPSTREAM: bpfilter: fix race in pipe access
  UPSTREAM: bpfilter: fix building without CONFIG_INET
  BACKPORT: umh: add exit routine for UMH process
  UPSTREAM: umh: fix race condition
  ...

 Conflicts:
	Makefile

Change-Id: I1a03d49f21b2f18779eb498a9aeacef76fcbb12d
2025-10-10 16:09:02 +03:00
Ignat Korchagin
c74ae87ccd UPSTREAM: net: inet: do not leave a dangling sk pointer in inet_create()
[ Upstream commit 9365fa510c6f82e3aa550a09d0c5c6b44dbc78ff ]

sock_init_data() attaches the allocated sk object to the provided sock
object. If inet_create() fails later, the sk object is freed, but the
sock object retains the dangling pointer, which may create use-after-free
later.

Clear the sk pointer in the sock object on error.

Change-Id: I6015c5d12e445a1d14dfd5c2417a8410d8b48c6d
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241014153808.51894-7-ignat@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>
2025-10-10 16:05:51 +03:00
Arnd Bergmann
6c184d1605 UPSTREAM: kdb: use __ktime_get_real_seconds instead of __current_kernel_time
kdb is the only user of the __current_kernel_time() interface, which is
not y2038 safe and should be removed at some point.

The kdb code also goes to great lengths to print the time in a
human-readable format from 'struct timespec', again using a non-y2038-safe
re-implementation of the generic time_to_tm() code.

Using __current_kernel_time() here is necessary since the regular
accessors that require a sequence lock might hang when called during the
xtime update. However, this is safe in the particular case since kdb is
only interested in the tv_sec field that is updated atomically.

In order to make this y2038-safe, I'm converting the code to the generic
time64_to_tm helper, but that introduces the problem that we have no
interface like __current_kernel_time() that provides a 64-bit timestamp
in a lockless, safe and architecture-independent way. I have multiple
ideas for how to solve that:

- __ktime_get_real_seconds() is lockless, but can return
  incorrect results on 32-bit architectures in the special case that
  we are in the process of changing the time across the epoch, either
  during the timer tick that overflows the seconds in 2038, or while
  calling settimeofday.

- ktime_get_real_fast_ns() would work in this context, but does
  require a call into the clocksource driver to return a high-resolution
  timestamp. This may have undesired side-effects in the debugger,
  since we want to limit the interactions with the rest of the kernel.

- Adding a ktime_get_real_fast_seconds() based on tk_fast_mono
  plus tkr->base_real without the tk_clock_read() delta. Not sure about
  the value of adding yet another interface here.

- Changing the existing ktime_get_real_seconds() to use
  tk_fast_mono on 32-bit architectures rather than xtime_sec.  I think
  this could work, but am not entirely sure if this is an improvement.

I picked the first of those for simplicity here. It's technically
not correct but probably good enough as the time is only used for the
debugging output and the race will likely never be hit in practice.
Another downside is having to move the declaration into a public header
file.

Let me know if anyone has a different preference.

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patchwork.kernel.org/patch/9775309/
Change-Id: Ic5c968feb81a895868090a60ecde4388f3483838
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2025-10-10 16:05:45 +03:00
Will Deacon
c7f1afcb67 locking/barriers: Introduce smp_cond_load_relaxed() and atomic_cond_read_relaxed()
Whilst we currently provide smp_cond_load_acquire() and
atomic_cond_read_acquire(), there are cases where the ACQUIRE semantics
are
not required because of a subsequent fence or release operation once the
conditional loop has exited.

This patch adds relaxed versions of the conditional spinning primitives
to avoid unnecessary barrier overhead on architectures such as arm64.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boqun.feng@gmail.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: paulmck@linux.vnet.ibm.com
Link:
http: //lkml.kernel.org/r/1524738868-31318-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: atndko <z1281552865@gmail.com>
Change-Id: I292516e86abd5cf3032151fa9e6cc0d36d1ceda4
2025-10-10 06:22:07 +03:00
Michael Bestas
00cad583fb iio: power: pac193x: Use proper clock specifier names in functions
Reference: I3463b9045bddeba00d6f9fcf78d63008459c1b9a
Change-Id: If84a641c534283d9f3e3974b094ea23de2cf6260
2025-10-10 05:38:12 +03:00
Alexei Starovoitov
cd91fc001c UPSTREAM: seccomp, bpf: disable preemption before calling into bpf prog
All BPF programs must be called with preemption disabled.

Fixes: 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
Reported-by: syzbot+8bf19ee2aa580de7a2a7@syzkaller.appspotmail.com
Change-Id: Ia5fa93009d0e31261eab2890b435730f4e310c6a
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:15:18 +03:00
Daniel Borkmann
878da6fdad BACKPORT: bpf: Fix 32 bit src register truncation on div/mod
commit e88b2c6e5a4d9ce30d75391e4d950da74bb2bd90 upstream.

While reviewing a different fix, John and I noticed an oddity in one of the
BPF program dumps that stood out, for example:

  # bpftool p d x i 13
   0: (b7) r0 = 808464450
   1: (b4) w4 = 808464432
   2: (bc) w0 = w0
   3: (15) if r0 == 0x0 goto pc+1
   4: (9c) w4 %= w0
  [...]

In line 2 we noticed that the mov32 would 32 bit truncate the original src
register for the div/mod operation. While for the two operations the dst
register is typically marked unknown e.g. from adjust_scalar_min_max_vals()
the src register is not, and thus verifier keeps tracking original bounds,
simplified:

  0: R1=ctx(id=0,off=0,imm=0) R10=fp0
  0: (b7) r0 = -1
  1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0
  1: (b7) r1 = -1
  2: R0_w=invP-1 R1_w=invP-1 R10=fp0
  2: (3c) w0 /= w1
  3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP-1 R10=fp0
  3: (77) r1 >>= 32
  4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP4294967295 R10=fp0
  4: (bf) r0 = r1
  5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0
  5: (95) exit
  processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

Runtime result of r0 at exit is 0 instead of expected -1. Remove the
verifier mov32 src rewrite in div/mod and replace it with a jmp32 test
instead. After the fix, we result in the following code generation when
having dividend r1 and divisor r6:

  div, 64 bit:                             div, 32 bit:

   0: (b7) r6 = 8                           0: (b7) r6 = 8
   1: (b7) r1 = 8                           1: (b7) r1 = 8
   2: (55) if r6 != 0x0 goto pc+2           2: (56) if w6 != 0x0 goto pc+2
   3: (ac) w1 ^= w1                         3: (ac) w1 ^= w1
   4: (05) goto pc+1                        4: (05) goto pc+1
   5: (3f) r1 /= r6                         5: (3c) w1 /= w6
   6: (b7) r0 = 0                           6: (b7) r0 = 0
   7: (95) exit                             7: (95) exit

  mod, 64 bit:                             mod, 32 bit:

   0: (b7) r6 = 8                           0: (b7) r6 = 8
   1: (b7) r1 = 8                           1: (b7) r1 = 8
   2: (15) if r6 == 0x0 goto pc+1           2: (16) if w6 == 0x0 goto pc+1
   3: (9f) r1 %= r6                         3: (9c) w1 %= w6
   4: (b7) r0 = 0                           4: (b7) r0 = 0
   5: (95) exit                             5: (95) exit

x86 in particular can throw a 'divide error' exception for div
instruction not only for divisor being zero, but also for the case
when the quotient is too large for the designated register. For the
edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT
since we always zero edx (rdx). Hence really the only protection
needed is against divisor being zero.

Also add some other code missed when backporting.

Fixes: 68fda450a7df ("bpf: fix 32-bit divide by zero")
Co-developed-by: John Fastabend <john.fastabend@gmail.com>
Change-Id: I35a7f4f346bbcbc2f01003e607f2b00b7abe92ae
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-10 05:15:18 +03:00
Song Liu
59829c4c8d BACKPORT: perf, bpf: Introduce PERF_RECORD_BPF_EVENT
For better performance analysis of BPF programs, this patch introduces
PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
load/unload information to user space.

Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs.
The following example shows kernel symbols for a BPF program with 7 sub
programs:

    ffffffffa0257cf9 t bpf_prog_b07ccb89267cf242_F
    ffffffffa02592e1 t bpf_prog_2dcecc18072623fc_F
    ffffffffa025b0e9 t bpf_prog_bb7a405ebaec5d5c_F
    ffffffffa025dd2c t bpf_prog_a7540d4a39ec1fc7_F
    ffffffffa025fcca t bpf_prog_05762d4ade0e3737_F
    ffffffffa026108f t bpf_prog_db4bd11e35df90d4_F
    ffffffffa0263f00 t bpf_prog_89d64e4abf0f0126_F
    ffffffffa0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi

When a bpf program is loaded, PERF_RECORD_KSYMBOL is generated for each
of these sub programs. Therefore, PERF_RECORD_BPF_EVENT is not needed
for simple profiling.

For annotation, user space need to listen to PERF_RECORD_BPF_EVENT and
gather more information about these (sub) programs via sys_bpf.

Change-Id: I8ed02f808501c32f406108c282c853a56d0dcc25
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradeaed.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-4-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-10 05:15:18 +03:00
Song Liu
4c46c50e97 UPSTREAM: perf, bpf: Introduce PERF_RECORD_KSYMBOL
For better performance analysis of dynamically JITed and loaded kernel
functions, such as BPF programs, this patch introduces
PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
register/unregister information to user space.

The following data structure is used for PERF_RECORD_KSYMBOL.

    /*
     * struct {
     *      struct perf_event_header        header;
     *      u64                             addr;
     *      u32                             len;
     *      u16                             ksym_type;
     *      u16                             flags;
     *      char                            name[];
     *      struct sample_id                sample_id;
     * };
     */

Change-Id: I3e6901ef579878015f6a75d15699230882f79e1f
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-2-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-10 05:15:18 +03:00
Stanislav Fomichev
6251b00a32 UPSTREAM: bpf: BPF_PROG_TYPE_CGROUP_{SKB, SOCK, SOCK_ADDR} require cgroups enabled
There is no way to exercise appropriate attach points without cgroups
enabled. This lets test_verifier correctly skip tests for these
prog_types if kernel was compiled without BPF cgroup support.

Change-Id: Icadf66ef5cb1b78c43c710df0c5ac9c70859d7c3
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:15:18 +03:00
Taehee Yoo
37e113d0b1 UPSTREAM: net: bpfilter: disallow to remove bpfilter module while being used
The bpfilter.ko module can be removed while functions of the bpfilter.ko
are executing. so panic can occurred. in order to protect that, locks can
be used. a bpfilter_lock protects routines in the
__bpfilter_process_sockopt() but it's not enough because __exit routine
can be executed concurrently.

Now, the bpfilter_umh can not run in parallel.
So, the module do not removed while it's being used and it do not
double-create UMH process.
The members of the umh_info and the bpfilter_umh_ops are protected by
the bpfilter_umh_ops.lock.

test commands:
   while :
   do
	iptables -I FORWARD -m string --string ap --algo kmp &
	modprobe -rv bpfilter &
   done

splat looks like:
[  298.623435] BUG: unable to handle kernel paging request at fffffbfff807440b
[  298.628512] #PF error: [normal kernel read fault]
[  298.633018] PGD 124327067 P4D 124327067 PUD 11c1a3067 PMD 119eb2067 PTE 0
[  298.638859] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  298.638859] CPU: 0 PID: 2997 Comm: iptables Not tainted 4.20.0+ #154
[  298.638859] RIP: 0010:__mutex_lock+0x6b9/0x16a0
[  298.638859] Code: c0 00 00 e8 89 82 ff ff 80 bd 8f fc ff ff 00 0f 85 d9 05 00 00 48 8b 85 80 fc ff ff 48 bf 00 00 00 00 00 fc ff df 48 c1 e8 03 <80> 3c 38 00 0f 85 1d 0e 00 00 48 8b 85 c8 fc ff ff 49 39 47 58 c6
[  298.638859] RSP: 0018:ffff88810e7777a0 EFLAGS: 00010202
[  298.638859] RAX: 1ffffffff807440b RBX: ffff888111bd4d80 RCX: 0000000000000000
[  298.638859] RDX: 1ffff110235ff806 RSI: ffff888111bd5538 RDI: dffffc0000000000
[  298.638859] RBP: ffff88810e777b30 R08: 0000000080000002 R09: 0000000000000000
[  298.638859] R10: 0000000000000000 R11: 0000000000000000 R12: fffffbfff168a42c
[  298.638859] R13: ffff888111bd4d80 R14: ffff8881040e9a05 R15: ffffffffc03a2000
[  298.638859] FS:  00007f39e3758700(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[  298.638859] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  298.638859] CR2: fffffbfff807440b CR3: 000000011243e000 CR4: 00000000001006f0
[  298.638859] Call Trace:
[  298.638859]  ? mutex_lock_io_nested+0x1560/0x1560
[  298.638859]  ? kasan_kmalloc+0xa0/0xd0
[  298.638859]  ? kmem_cache_alloc+0x1c2/0x260
[  298.638859]  ? __alloc_file+0x92/0x3c0
[  298.638859]  ? alloc_empty_file+0x43/0x120
[  298.638859]  ? alloc_file_pseudo+0x220/0x330
[  298.638859]  ? sock_alloc_file+0x39/0x160
[  298.638859]  ? __sys_socket+0x113/0x1d0
[  298.638859]  ? __x64_sys_socket+0x6f/0xb0
[  298.638859]  ? do_syscall_64+0x138/0x560
[  298.638859]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  298.638859]  ? __alloc_file+0x92/0x3c0
[  298.638859]  ? init_object+0x6b/0x80
[  298.638859]  ? cyc2ns_read_end+0x10/0x10
[  298.638859]  ? cyc2ns_read_end+0x10/0x10
[  298.638859]  ? hlock_class+0x140/0x140
[  298.638859]  ? sched_clock_local+0xd4/0x140
[  298.638859]  ? sched_clock_local+0xd4/0x140
[  298.638859]  ? check_flags.part.37+0x440/0x440
[  298.638859]  ? __lock_acquire+0x4f90/0x4f90
[  298.638859]  ? set_rq_offline.part.89+0x140/0x140
[ ... ]

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Change-Id: I3186f2ec6cbf8263aed16d562021f741ab25bf86
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:17 +03:00
Taehee Yoo
877467a023 UPSTREAM: net: bpfilter: restart bpfilter_umh when error occurred
The bpfilter_umh will be stopped via __stop_umh() when the bpfilter
error occurred.
The bpfilter_umh() couldn't start again because there is no restart
routine.

The section of the bpfilter_umh_{start/end} is no longer .init.rodata
because these area should be reused in the restart routine. hence
the section name is changed to .bpfilter_umh.

The bpfilter_ops->start() is restart callback. it will be called when
bpfilter_umh is stopped.
The stop bit means bpfilter_umh is stopped. this bit is set by both
start and stop routine.

Before this patch,
Test commands:
   $ iptables -vnL
   $ kill -9 <pid of bpfilter_umh>
   $ iptables -vnL
   [  480.045136] bpfilter: write fail -32
   $ iptables -vnL

All iptables commands will fail.

After this patch,
Test commands:
   $ iptables -vnL
   $ kill -9 <pid of bpfilter_umh>
   $ iptables -vnL
   $ iptables -vnL

Now, all iptables commands will work.

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Change-Id: I4c1fa7fd3af172967e450f3db8fab77434b341b6
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:17 +03:00
Taehee Yoo
a36282e83e UPSTREAM: net: bpfilter: use cleanup callback to release umh_info
Now, UMH process is killed, do_exit() calls the umh_info->cleanup callback
to release members of the umh_info.
This patch makes bpfilter_umh's cleanup routine to use the
umh_info->cleanup callback.

Change-Id: I865aa89eb58d9bba883ed59e4b921fe401413853
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:17 +03:00
Eric W. Biederman
b1701ab72a UPSTREAM: signal/bpfilter: Fix bpfilter_kernl to use send_sig not force_sig
[ Upstream commit 1dfd1711de2952fd1bfeea7152bd1687a4eea771 ]

The locking in force_sig_info is not prepared to deal with
a task that exits or execs (as sighand may change).  As force_sig
is only built to handle synchronous exceptions.

Further the function force_sig_info changes the signal state if the
signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the
delivery of the signal.  The signal SIGKILL can not be ignored and can
not be blocked and SIGNAL_UNKILLABLE won't prevent it from being
delivered.

So using force_sig rather than send_sig for SIGKILL is pointless.

Because it won't impact the sending of the signal and and because
using force_sig is wrong, replace force_sig with send_sig.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Change-Id: Id62ece4eade762ecc94dd3e4a940907e25c48fab
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-10 05:15:17 +03:00
Taehee Yoo
2bb861eb81 UPSTREAM: net: bpfilter: use get_pid_task instead of pid_task
pid_task() dereferences rcu protected tasks array.
But there is no rcu_read_lock() in shutdown_umh() routine so that
rcu_read_lock() is needed.
get_pid_task() is wrapper function of pid_task. it holds rcu_read_lock()
then calls pid_task(). if task isn't NULL, it increases reference count
of task.

test commands:
   %modprobe bpfilter
   %modprobe -rv bpfilter

splat looks like:
[15102.030932] =============================
[15102.030957] WARNING: suspicious RCU usage
[15102.030985] 4.19.0-rc7+ #21 Not tainted
[15102.031010] -----------------------------
[15102.031038] kernel/pid.c:330 suspicious rcu_dereference_check() usage!
[15102.031063]
	       other info that might help us debug this:

[15102.031332]
	       rcu_scheduler_active = 2, debug_locks = 1
[15102.031363] 1 lock held by modprobe/1570:
[15102.031389]  #0: 00000000580ef2b0 (bpfilter_lock){+.+.}, at: stop_umh+0x13/0x52 [bpfilter]
[15102.031552]
               stack backtrace:
[15102.031583] CPU: 1 PID: 1570 Comm: modprobe Not tainted 4.19.0-rc7+ #21
[15102.031607] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[15102.031628] Call Trace:
[15102.031676]  dump_stack+0xc9/0x16b
[15102.031723]  ? show_regs_print_info+0x5/0x5
[15102.031801]  ? lockdep_rcu_suspicious+0x117/0x160
[15102.031855]  pid_task+0x134/0x160
[15102.031900]  ? find_vpid+0xf0/0xf0
[15102.032017]  shutdown_umh.constprop.1+0x1e/0x53 [bpfilter]
[15102.032055]  stop_umh+0x46/0x52 [bpfilter]
[15102.032092]  __x64_sys_delete_module+0x47e/0x570
[ ... ]

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Change-Id: Ideb9fdae191efb594fa2a258d5e559fb0bba39ce
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:17 +03:00
Shanthosh RK
779d3afa0e UPSTREAM: net: bpfilter: Fix type cast and pointer warnings
Fixes the following Sparse warnings:

net/bpfilter/bpfilter_kern.c:62:21: warning: cast removes address space
of expression
net/bpfilter/bpfilter_kern.c:101:49: warning: Using plain integer as
NULL pointer

Change-Id: I18ad265abd911bab943a3c84a3dac540e28a1da6
Signed-off-by: Shanthosh RK <shanthosh.rk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:16 +03:00
Masahiro Yamada
37c55622bd BACKPORT: bpfilter: include bpfilter_umh in assembly instead of using objcopy
What we want here is to embed a user-space program into the kernel.
Instead of the complex ELF magic, let's simply wrap it in the assembly
with the '.incbin' directive.

Change-Id: I8a532c9f9c5d2bc853821becb40832d56cdbb72c
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:16 +03:00
Alexei Starovoitov
bb85cafcdd UPSTREAM: bpfilter: fix race in pipe access
syzbot reported the following crash
[  338.293946] bpfilter: read fail -512
[  338.304515] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  338.311863] general protection fault: 0000 [#1] SMP KASAN
[  338.344360] RIP: 0010:__vfs_write+0x4a6/0x960
[  338.426363] Call Trace:
[  338.456967]  __kernel_write+0x10c/0x380
[  338.460928]  __bpfilter_process_sockopt+0x1d8/0x35b
[  338.487103]  bpfilter_mbox_request+0x4d/0xb0
[  338.491492]  bpfilter_ip_get_sockopt+0x6b/0x90

This can happen when multiple cpus trying to talk to user mode process
via bpfilter_mbox_request(). One cpu grabs the mutex while another goes to
sleep on the same mutex. Then former cpu sees that umh pipe is down and
shuts down the pipes. Later cpu finally acquires the mutex and crashes
on freed pipe.
Fix the race by using info.pid as an indicator that umh and pipes are healthy
and check it after acquiring the mutex.

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Reported-by: syzbot+7ade6c94abb2774c0fee@syzkaller.appspotmail.com
Change-Id: Ic3f5aa80143601b532f6ca2feb4ef97b249805f0
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:16 +03:00
Arnd Bergmann
177bcf1e2b UPSTREAM: bpfilter: fix building without CONFIG_INET
bpfilter_process_sockopt is a callback that gets called from
ip_setsockopt() and ip_getsockopt(). However, when CONFIG_INET is
disabled, it never gets called at all, and assigning a function to the
callback pointer results in a link failure:

net/bpfilter/bpfilter_kern.o: In function `__stop_umh':
bpfilter_kern.c:(.text.unlikely+0x3): undefined reference to `bpfilter_process_sockopt'
net/bpfilter/bpfilter_kern.o: In function `load_umh':
bpfilter_kern.c:(.init.text+0x73): undefined reference to `bpfilter_process_sockopt'

Since there is no caller in this configuration, I assume we can
simply make the assignment conditional.

Change-Id: Ia233a7b3be4b060f653aad9d4a3f045ba3b8a07e
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:16 +03:00
Taehee Yoo
b5e8b9574f BACKPORT: umh: add exit routine for UMH process
A UMH process which is created by the fork_usermode_blob() such as
bpfilter needs to release members of the umh_info when process is
terminated.
But the do_exit() does not release members of the umh_info. hence module
which uses UMH needs own code to detect whether UMH process is
terminated or not.
But this implementation needs extra code for checking the status of
UMH process. it eventually makes the code more complex.

The new PF_UMH flag is added and it is used to identify UMH processes.
The exit_umh() does not release members of the umh_info.
Hence umh_info->cleanup callback should release both members of the
umh_info and the private data.

Suggested-by: David S. Miller <davem@davemloft.net>
Change-Id: I280bc1d8a89c544dfb215a63e45f79e39f39c2a2
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:15:15 +03:00
Alexei Starovoitov
4b6e90408d UPSTREAM: umh: fix race condition
kasan reported use-after-free:
BUG: KASAN: use-after-free in call_usermodehelper_exec_work+0x2d3/0x310 kernel/umh.c:195
Write of size 4 at addr ffff8801d9202370 by task kworker/u4:2/50
Workqueue: events_unbound call_usermodehelper_exec_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_store4_noabort+0x17/0x20 mm/kasan/report.c:437
 call_usermodehelper_exec_work+0x2d3/0x310 kernel/umh.c:195
 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
 kthread+0x345/0x410 kernel/kthread.c:240
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

The reason is that 'sub_info' cannot be accessed out of parent task
context, since it will be freed by the child.
Instead remember the pid in the child task.

Fixes: 449325b52b7a ("umh: introduce fork_usermode_blob() helper")
Reported-by: syzbot+2c73319c406f1987d156@syzkaller.appspotmail.com
Change-Id: I1b36abbc64252ce37a871e8ad13d86a3da26bed2
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:38 +03:00
Olivier Brunel
65bb06ea65 UPSTREAM: umh: Add command line to user mode helpers
User mode helpers were spawned without a command line, and because
an empty command line is used by many tools to identify processes as
kernel threads, this could cause some issues.

Notably during killing spree on shutdown, since such helper would then
be skipped (i.e. not killed) which would result in the process remaining
alive, and thus preventing unmouting of the rootfs (as experienced with
the bpfilter umh).

Fixes: 449325b52b7a ("umh: introduce fork_usermode_blob() helper")
Change-Id: If4e6324e5826b1e5bbd9af16ddf3f24cf5f77143
Signed-off-by: Olivier Brunel <jjk@jjacky.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:38 +03:00
Stanislav Fomichev
92d69585fc BACKPORT: bpf/flow_dissector: support ipv6 flow_label and BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
Add support for exporting ipv6 flow label via bpf_flow_keys.
Export flow label from bpf_flow.c and also return early when
BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL is passed.

Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Change-Id: I6b4c3771022f19c184867fb6045351d59cfde68b
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-10 05:10:38 +03:00
Stanislav Fomichev
ea08c9239a UPSTREAM: bpf: Move skb->len == 0 checks into __bpf_redirect
[ Upstream commit 114039b342014680911c35bd6b72624180fd669a ]

To avoid potentially breaking existing users.

Both mac/no-mac cases have to be amended; mac_header >= network_header
is not enough (verified with a new test, see next patch).

Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len")
Change-Id: If63ce98321997875863f3fce255061e8a64bba5e
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20221121180340.1983627-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-10 05:10:38 +03:00
Zhengchao Shao
55db37ac28 UPSTREAM: bpf: Don't redirect packets with invalid pkt_len
commit fd1894224407c484f652ad456e1ce423e89bb3eb upstream.

Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
skbs, that is, the flow->head is null.
The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
run a bpf prog which redirects empty skbs.
So we should determine whether the length of the packet modified by bpf
prog or others like bpf_prog_test is valid before forwarding it directly.

LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html

Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
Change-Id: Ieba6ad52b2eb657fb02ef7d3e414932d7743f073
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220715115559.139691-1-shaozhengchao@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-10 05:10:38 +03:00
Stanislav Fomichev
2e3ed15e06 UPSTREAM: bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN
This will allow us to write tests for those flags.

v2:
* Swap kfree(data) and kfree(user_ctx) (Song Liu)

Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Change-Id: Ic1b6920aa27ee21945df37585af2b0dfe37dca6e
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-10 05:10:37 +03:00
Stanislav Fomichev
b1066b18b6 UPSTREAM: bpf/flow_dissector: pass input flags to BPF flow dissector program
C flow dissector supports input flags that tell it to customize parsing
by either stopping early or trying to parse as deep as possible. Pass
those flags to the BPF flow dissector so it can make the same
decisions. In the next commits I'll add support for those flags to
our reference bpf_flow.c

v3:
* Export copy of flow dissector flags instead of moving (Alexei Starovoitov)

Acked-by: Petar Penkov <ppenkov@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Change-Id: I46a68f8b2249915fff5d97a1394ea662d9a0ac46
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-10 05:10:37 +03:00
Stanislav Fomichev
90dd5c6e4a UPSTREAM: flow_dissector: remove unused FLOW_DISSECTOR_F_STOP_AT_L3 flag
This flag is not used by any caller, remove it.

Change-Id: I280df0c749d35d6259fdcf3c7f9d3d142e6289a4
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:37 +03:00
Stanislav Fomichev
950546b3a3 UPSTREAM: flow_dissector: handle no-skb use case
When called without skb, gather all required data from the
__skb_flow_dissect's arguments and use recently introduces
no-skb mode of bpf flow dissector.

Note: WARN_ON_ONCE(!net) will now trigger for eth_get_headlen users.

Change-Id: Ice7d27b1adaeb9adb02ea52fda999450a6d765af
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:37 +03:00
Stanislav Fomichev
576577e335 UPSTREAM: net: plumb network namespace into __skb_flow_dissect
This new argument will be used in the next patches for the
eth_get_headlen use case. eth_get_headlen calls flow dissector
with only data (without skb) so there is currently no way to
pull attached BPF flow dissector program. With this new argument,
we can amend the callers to explicitly pass network namespace
so we can use attached BPF program.

Change-Id: I07dbe25c77a13f8cc33b037e63c532979d290ad0
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:37 +03:00
Paolo Abeni
1a31a45a69 BACKPORT: flow_dissector: do not rely on implicit casts
This change fixes a couple of type mismatch reported by the sparse
tool, explicitly using the requested type for the offending arguments.

Change-Id: I73920d0a210f1ef2229a406c49844970208d6ada
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:36 +03:00
YueHaibing
f80c0fae3a UPSTREAM: net: skbuff.h: fix using plain integer as NULL warning
Fixes the following sparse warning:
./include/linux/skbuff.h:2365:58: warning: Using plain integer as NULL pointer

Change-Id: Ic8e19a78428b9bb31a1a47f9add52e8642872dd3
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:36 +03:00
Paolo Abeni
ef1fc7549d BACKPORT: net: core: rework basic flow dissection helper
When the core networking needs to detect the transport offset in a given
packet and parse it explicitly, a full-blown flow_keys struct is used for
storage.
This patch introduces a smaller keys store, rework the basic flow dissect
helper to use it, and apply this new helper where possible - namely in
skb_probe_transport_header(). The used flow dissector data structures
are renamed to match more closely the new role.

The above gives ~50% performance improvement in micro benchmarking around
skb_probe_transport_header() and ~30% around eth_get_headlen(), mostly due
to the smaller memset. Small, but measurable improvement is measured also
in macro benchmarking.

v1 -> v2: use the new helper in eth_get_headlen() and skb_get_poff(),
  as per DaveM suggestion

[Linux4: Apply to include/linux/virtio_net.h]

Suggested-by: David Miller <davem@davemloft.net>
Change-Id: Iebe280d3793028c9dd640b5dc59d046601f02cca
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:36 +03:00
Eric Dumazet
ec56fc209f UPSTREAM: flow_dissector: disable preemption around BPF calls
Various things in eBPF really require us to disable preemption
before running an eBPF program.

syzbot reported :

BUG: assuming atomic context at net/core/flow_dissector.c:737
in_atomic(): 0, irqs_disabled(): 0, pid: 24710, name: syz-executor.3
2 locks held by syz-executor.3/24710:
 #0: 00000000e81a4bf1 (&tfile->napi_mutex){+.+.}, at: tun_get_user+0x168e/0x3ff0 drivers/net/tun.c:1850
 #1: 00000000254afebd (rcu_read_lock){....}, at: __skb_flow_dissect+0x1e1/0x4bb0 net/core/flow_dissector.c:822
CPU: 1 PID: 24710 Comm: syz-executor.3 Not tainted 5.1.0+ #6
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 __cant_sleep kernel/sched/core.c:6165 [inline]
 __cant_sleep.cold+0xa3/0xbb kernel/sched/core.c:6142
 bpf_flow_dissect+0xfe/0x390 net/core/flow_dissector.c:737
 __skb_flow_dissect+0x362/0x4bb0 net/core/flow_dissector.c:853
 skb_flow_dissect_flow_keys_basic include/linux/skbuff.h:1322 [inline]
 skb_probe_transport_header include/linux/skbuff.h:2500 [inline]
 skb_probe_transport_header include/linux/skbuff.h:2493 [inline]
 tun_get_user+0x2cfe/0x3ff0 drivers/net/tun.c:1940
 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037
 call_write_iter include/linux/fs.h:1872 [inline]
 do_iter_readv_writev+0x5fd/0x900 fs/read_write.c:693
 do_iter_write fs/read_write.c:970 [inline]
 do_iter_write+0x184/0x610 fs/read_write.c:951
 vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015
 do_writev+0x15b/0x330 fs/read_write.c:1058
 __do_sys_writev fs/read_write.c:1131 [inline]
 __se_sys_writev fs/read_write.c:1128 [inline]
 __x64_sys_writev+0x75/0xb0 fs/read_write.c:1128
 do_syscall_64+0x103/0x670 arch/x86/entry/common.c:298
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
Change-Id: I326ce1e572ae9c2dc060c172dbd0227bb8ee9eef
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Petar Penkov <ppenkov@google.com>
Cc: Stanislav Fomichev <sdf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-10-10 05:10:36 +03:00
Matt Mullins
60d6ddbbcb UPSTREAM: selftests: bpf: test writable buffers in raw tps
This tests that:
  * a BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE cannot be attached if it
    uses either:
    * a variable offset to the tracepoint buffer, or
    * an offset beyond the size of the tracepoint buffer
  * a tracer can modify the buffer provided when attached to a writable
    tracepoint in bpf_prog_test_run

Change-Id: I3fdf650c1d9a68d64d08b170b14e46a35d180dd7
Signed-off-by: Matt Mullins <mmullins@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-10 05:10:35 +03:00
Stanislav Fomichev
24ba78cf32 UPSTREAM: bpf/flow_dissector: don't adjust nhoff by ETH_HLEN in BPF_PROG_TEST_RUN
Now that we use skb-less flow dissector let's return true nhoff and
thoff. We used to adjust them by ETH_HLEN because that's how it was
done in the skb case. For VLAN tests that looks confusing: nhoff is
pointing to vlan parts :-\

Warning, this is an API change for BPF_PROG_TEST_RUN! Feel free to drop
if you think that it's too late at this point to fix it.

Change-Id: Icae82218f62c91b7860981bc62269eba8e58c25c
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:35 +03:00
Stanislav Fomichev
197add7571 UPSTREAM: bpf: when doing BPF_PROG_TEST_RUN for flow dissector use no-skb mode
Now that we have bpf_flow_dissect which can work on raw data,
use it when doing BPF_PROG_TEST_RUN for flow dissector.

Simplifies bpf_prog_test_run_flow_dissector and allows us to
test no-skb mode.

Note, that previously, with bpf_flow_dissect_skb we used to call
eth_type_trans which pulled L2 (ETH_HLEN) header and we explicitly called
skb_reset_network_header. That means flow_keys->nhoff would be
initialized to 0 (skb_network_offset) in init_flow_keys.
Now we call bpf_flow_dissect with nhoff set to ETH_HLEN and need
to undo it once the dissection is done to preserve the existing behavior.

Change-Id: Icaf9c2a76c6176c647081535f7d48d94fff2e126
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:35 +03:00
Stanislav Fomichev
633f94f1b3 UPSTREAM: bpf: explicitly prohibit ctx_{in, out} in non-skb BPF_PROG_TEST_RUN
This should allow us later to extend BPF_PROG_TEST_RUN for non-skb case
and be sure that nobody is erroneously setting ctx_{in,out}.

Fixes: b0b9395d865e ("bpf: support input __sk_buff context in BPF_PROG_TEST_RUN")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Change-Id: Ibe85a9217cd265966ef57e1298b37a6b9baa50c4
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:35 +03:00
Bo YU
96748a8305 UPSTREAM: bpf: fix warning about using plain integer as NULL
Sparse warning below:

sudo make C=2 CF=-D__CHECK_ENDIAN__ M=net/bpf/
CHECK   net/bpf//test_run.c
net/bpf//test_run.c:19:77: warning: Using plain integer as NULL pointer
./include/linux/bpf-cgroup.h:295:77: warning: Using plain integer as NULL pointer

Fixes: 8bad74f9840f ("bpf: extend cgroup bpf core to allow multiple cgroup storage types")
Acked-by: Yonghong Song <yhs@fb.com>
Change-Id: I1dba06e69eb1e3b4a3163dc32b48c4fc6bdc755d
Signed-off-by: Bo YU <tsu.yubo@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:35 +03:00
Stanislav Fomichev
a141d3a0ae UPSTREAM: bpf/test_run: fix unkillable BPF_PROG_TEST_RUN for flow dissector
Syzbot found out that running BPF_PROG_TEST_RUN with repeat=0xffffffff
makes process unkillable. The problem is that when CONFIG_PREEMPT is
enabled, we never see need_resched() return true. This is due to the
fact that preempt_enable() (which we do in bpf_test_run_one on each
iteration) now handles resched if it's needed.

Let's disable preemption for the whole run, not per test. In this case
we can properly see whether resched is needed.
Let's also properly return -EINTR to the userspace in case of a signal
interrupt.

This is a follow up for a recently fixed issue in bpf_test_run, see
commit df1a2cb7c74b ("bpf/test_run: fix unkillable
BPF_PROG_TEST_RUN").

Reported-by: syzbot <syzkaller@googlegroups.com>
Change-Id: I38d1d9553d1ee281666de7000d06684962658397
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:34 +03:00
Stanislav Fomichev
ffcd4a79e2 UPSTREAM: bpf/test_run: fix unkillable BPF_PROG_TEST_RUN
Syzbot found out that running BPF_PROG_TEST_RUN with repeat=0xffffffff
makes process unkillable. The problem is that when CONFIG_PREEMPT is
enabled, we never see need_resched() return true. This is due to the
fact that preempt_enable() (which we do in bpf_test_run_one on each
iteration) now handles resched if it's needed.

Let's disable preemption for the whole run, not per test. In this case
we can properly see whether resched is needed.
Let's also properly return -EINTR to the userspace in case of a signal
interrupt.

See recent discussion:
http://lore.kernel.org/netdev/CAH3MdRWHr4N8jei8jxDppXjmw-Nw=puNDLbu1dQOFQHxfU2onA@mail.gmail.com

I'll follow up with the same fix bpf_prog_test_run_flow_dissector in
bpf-next.

Reported-by: syzbot <syzkaller@googlegroups.com>
Change-Id: I6744e83dc918d20f44cdf9f37cbb0f223a970aa4
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:34 +03:00
Lorenz Bauer
6435fd3f19 UPSTREAM: bpf: respect size hint to BPF_PROG_TEST_RUN if present
Use data_size_out as a size hint when copying test output to user space.
ENOSPC is returned if the output buffer is too small.
Callers which so far did not set data_size_out are not affected.

Change-Id: Ic1a42d1903e96a26a27a56489b75be05c58996ff
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-10 05:10:34 +03:00
Alan Maguire
dff662af4a BACKPORT: bpf: fix whitespace for ENCAP_L2 defines in bpf.h
replace tab after #define with space in line with rest of definitions

Change-Id: I29e1364abd94abe1e251816032890f895e0159f0
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:34 +03:00
Stanislav Fomichev
4ccc5880b5 BACKPORT: bpf: add BPF_CGROUP_SOCK_OPS callback that is executed on every RTT
Performance impact should be minimal because it's under a new
BPF_SOCK_OPS_RTT_CB_FLAG flag that has to be explicitly enabled.

Suggested-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Priyaranjan Jha <priyarjha@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Change-Id: I19814edf78101f87e6dc364343f8173d9e230850
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:34 +03:00
Stanislav Fomichev
21b246cdbf UPSTREAM: bpf: support cloning sk storage on accept()
Add new helper bpf_sk_storage_clone which optionally clones sk storage
and call it from sk_clone_lock.

Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Change-Id: Iee30d2442b76f6fd7904829314e63f3391e7d811
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2025-10-10 05:10:33 +03:00
Andrii Nakryiko
83a3335d5e UPSTREAM: bpf: Add bpf_patch_call_args prototype to include/linux/bpf.h
[ Upstream commit a643bff752dcf72a07e1b2ab2f8587e4f51118be ]

Add bpf_patch_call_args() prototype. This function is called from BPF verifier
and only if CONFIG_BPF_JIT_ALWAYS_ON is not defined. This fixes compiler
warning about missing prototype in some kernel configurations.

Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")
Reported-by: kernel test robot <lkp@intel.com>
Change-Id: Ic06a473f58bf111d74d23d5527f1b25be013db75
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210112075520.4103414-2-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-10 05:10:33 +03:00