11706 Commits

Author SHA1 Message Date
kondors1995
2d5c481f29 bpf: squash revert spoofing and some backports:
Squashed commit of the following:

commit 259593385c05a430c4685b611c0e43b4272c22f8
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Fri Dec 13 08:30:37 2024 -0500

    bpf: squash revert spoofing and some backports:

    Squashed commit of the following:

    commit 8ac5df9c8bc9575059fff6cea0c40463b96fc129
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:58:17 2024 -0500

        Revert "BACKPORT: bpf: add skb_load_bytes_relative helper"

        This reverts commit 029893dcc5d67af16fdf0723bacaae37ec567f67.

    commit dbcbceafe848744ec188f74e87e9717916d359ea
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:58:13 2024 -0500

        Revert "BACKPORT: bpf: encapsulate verifier log state into a structure"

        This reverts commit d861145b97d247cbd9fe1400df52155f48639126.

    commit 478f4dfee0406b54525e68764cc9ba48af1624fc
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:58:10 2024 -0500

        Revert "BACKPORT: bpf: Rename bpf_verifer_log"

        This reverts commit 5d088635de1bf2d6ae9ea94e3dd1c601d30c0cce.

    commit 7bc7c24beb82168b49337530cb56b5dfeeafe19a
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:58:07 2024 -0500

        Revert "BACKPORT: bpf: btf: Introduce BPF Type Format (BTF)"

        This reverts commit 93d34e26514b4d9d15fd176706f57634b2e97485.

    commit 7106457ba90a459b6241fdd44df658c1b52c0e4b
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:58:03 2024 -0500

        Revert "bpf: Update logging functions to work with BTF"

        This reverts commit 97e6c528eb2f76c58a3b6a4c1e7fbeafcd97633a.

    commit 08e68c7ba56f5e78fd1afcd5a2164716a75b0fe3
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:58:00 2024 -0500

        Revert "bpf: btf: Validate type reference"

        This reverts commit c7b7eecbc1134e5d8865af2cc0692fc7156175d5.

    commit 7763cf0831970a64ed62f9b7362fca02ab6e83f1
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:51 2024 -0500

        Revert "bpf: btf: Check members of struct/union"

        This reverts commit 9a77b51cad6f04866ca067ca0e70a89b9f59ed56.

    commit eb033235f666b5f66995f4cf89702de7ab4721f8
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:47 2024 -0500

        Revert "bpf: btf: Add pretty print capability for data with BTF type info"

        This reverts commit 745692103435221d6e39bc177811769995540525.

    commit c32995674ace91e06c591d2f63177585e81adc75
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:43 2024 -0500

        Revert "BACKPORT: bpf: btf: Add BPF_BTF_LOAD command"

        This reverts commit 4e0afd38e20e5aa2df444361309bc07251ca6b2a.

    commit 1310bc8d4aca0015c8723e7624121eddf76b3244
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:38 2024 -0500

        Revert "bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd"

        This reverts commit d4b5d76d9101b97e6fe5181bcefe7f601ed19926.

    commit 881a49445608712bdb0a0f0c959838bdbc725f62
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:34 2024 -0500

        Revert "BACKPORT: bpf: btf: Clean up btf.h in uapi"

        This reverts commit 26b661822933d41b3feb59bb284334bfbbc82af4.

    commit e2109fd858ebd5fe392c8bf579b9350fbca35a35
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:29 2024 -0500

        Revert "bpf: btf: Avoid WARN_ON when CONFIG_REFCOUNT_FULL=y"

        This reverts commit 9abf878903404e649fef4ad0b189eec1c13d29fe.

    commit 088a7d9137f03da4e0fc1d72add3901823081ccd
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:23 2024 -0500

        Revert "bpf: Fix compiler warning on info.map_ids for 32bit platform"

        This reverts commit a3a278e1f6cf167d538ac52f4ad60bb9cf8d4129.

    commit 6e14aed6b63f2b266982454d83678445c062cf39
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:13 2024 -0500

        Revert "bpf: btf: Change how section is supported in btf_header"

        This reverts commit 4b60ffd683eb623a184b46761777838d7c49e707.

    commit 151a60855c23bf0317734031481d779efb369d6c
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:08 2024 -0500

        Revert "bpf: btf: Check array->index_type"

        This reverts commit b00e10f1a073fadce178b6fb62496722e16db303.

    commit 49775e9074a54ac5f60f518e6fc5a26172996eae
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:57:01 2024 -0500

        Revert "bpf: btf: Remove unused bits from uapi/linux/btf.h"

        This reverts commit c90c6ad34f7a8f565f351d21c2d5b9706838767d.

    commit b6d6c6ab28e4b018da6ce9e64125e63f4191d3d9
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:58 2024 -0500

        Revert "bpf: btf: Avoid variable length array"

        This reverts commit fe7d1f7750242e77a73839d173ac36c3e39d4171.

    commit a45bedecb9b1175fef96f2d64fba2d61777dbf35
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:49 2024 -0500

        Revert "bpf: btf: avoid -Wreturn-type warning"

        This reverts commit 78214f1e390bf1d69d9ae4ee80072ac85c34619e.

    commit 445efb8465b9fa5706d81098417f15656265322e
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:46 2024 -0500

        Revert "bpf: btf: Check array t->size"

        This reverts commit aed532e7466f77885a362e4b863bf90c41e834ba.

    commit 8aada590d525de735cf39196d88722e727c141e9
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:42 2024 -0500

        Revert "bpf: btf: Ensure t->type == 0 for BTF_KIND_FWD"

        This reverts commit 8c8b601dcc2e62e1276b73dfee8b49e40fb65944.

    commit ed67ad09e866c9c30897488088bbb4555ea3dc80
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:38 2024 -0500

        Revert "bpf: btf: Fix bitfield extraction for big endian"

        This reverts commit b0696a226c52868d64963f01665dd1a640a92f2b.

    commit 5cc64db782daf86cdf7ac77133ca94181bb29146
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:35 2024 -0500

        Revert "bpf: btf: Clean up BTF_INT_BITS() in uapi btf.h"

        This reverts commit 0f008594540b09c667ea88fc87cf289b8db334da.

    commit 3a5c6b9010426449c08ecdcc10e758431b1e515f
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:31 2024 -0500

        Revert "bpf: btf: Ensure the member->offset is in the right order"

        This reverts commit c5e361ecd6d45a7cdbffda02e4691a7a37198bdd.

    commit bd6173c1ac458b08d6cedaf06e6e53c93e6b0cc5
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:26 2024 -0500

        Revert "bpf: fix bpf_skb_load_bytes_relative pkt length check"

        This reverts commit 9ea14969874cd7896588df435c890f6f2f547821.

    commit 0b61d26b25a65d9ded4611426c6da9c78e41567c
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:22 2024 -0500

        Revert "bpf: btf: Fix end boundary calculation for type section"

        This reverts commit 08ef221c7fb604cb60c490fa999ec7254d492f05.

    commit 72fb2b9bb5b90f60ab71915fe4e57eeee3308163
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:18 2024 -0500

        Revert "bpf: btf: Fix a missing check bug"

        This reverts commit 594687e3e01e26086f3b0173e5eda9b9f0b672f8.

    commit 575a34ceba4013ad0230038f29f6ea0b3ba41a7e
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:15 2024 -0500

        Revert "bpf, btf: fix a missing check bug in btf_parse"

        This reverts commit 6bf31bbc438663756e92fb0aad4f5a35fd730fb0.

    commit bcca98c0bc5e19b38af3ddcd0feee80ad26e1f96
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:11 2024 -0500

        Revert "bpf: fix BTF limits"

        This reverts commit e351b26ae671dfacd82f27c1c5f66cf8089d930d.

    commit f71c484e340041d8828c94b39a233ea587d8cc09
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:07 2024 -0500

        Revert "bpf/btf: Fix BTF verification of enum members in struct/union"

        This reverts commit 861e65b744c171d59850e61a01715f194f25e45c.

    commit eca310722a2624d33cd49884aa18c36d435b10f8
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:56:02 2024 -0500

        Revert "bpf: btf: fix truncated last_member_type_id in btf_struct_resolve"

        This reverts commit d6cd1eac41b10e606ec7f445162a0617c01be973.

    commit caae5c99a3ca7bed0e318b31b6aa7ca8260a1c52
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:58 2024 -0500

        Revert "BACKPORT: net: bpf: rename ndo_xdp to ndo_bpf"

        This reverts commit 2a1ddcb6a384745195d57b4e4cdda2a55d2cbe47.

    commit f90bdcdaa095a4f10268bb740470a3e0893be21b
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:54 2024 -0500

        Revert "BACKPORT: bpf: offload: add infrastructure for loading programs for a specific netdev"

        This reverts commit a9516d402726094eafccce26a99cf5110d188be9.

    commit c6e0ce9019c06d9a45c030a2bc38eed320afd45a
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:50 2024 -0500

        Revert "bpf: offload: rename the ifindex field"

        This reverts commit 36bc9c7351a1dc78b3e71571998af381e876b4cb.

    commit 88b6a4d41b69df804b846a8ebdca410517e08343
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:46 2024 -0500

        Revert "BACKPORT: bpf: Check attach type at prog load time"

        This reverts commit fe5a0d514e4970d86983458136d4a2f6caeee365.

    commit 9ccfaa66a5ea042331f0aacdb3667e23c8ed363e
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:43 2024 -0500

        Revert "BACKPORT: bpf: introduce BPF_PROG_QUERY command"

        This reverts commit a5720688858170f1054f9549b5a628db1c252a88.

    commit adab2743b3fa0853d0351b33b0a286de745025e5
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:37 2024 -0500

        Revert "BACKPORT: bpf: Hooks for sys_bind"

        This reverts commit e484887c7e7aa026521ddc1773233368a6304b24.

    commit d462e09db98ad89b3a836f9b9a925812b0d8cfe7
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:33 2024 -0500

        Revert "BACKPORT: net: Introduce __inet_bind() and __inet6_bind"

        This reverts commit 41a3131c3e94c28fd084dd6f4358baee3824fd17.

    commit cdf7f55dc65b4bdf7ecfc924be77c6a039709b3d
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:29 2024 -0500

        Revert "BACKPORT: bpf: Hooks for sys_connect"

        This reverts commit f26fe7233e2885ef489707ab5a5a5dda9f081b80.

    commit 97685d5058f76ba4ea6dd2db157f4537f3a8953d
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:23 2024 -0500

        Revert "BACKPORT: bpf: Post-hooks for sys_bind"

        This reverts commit 284ac5bc7c70dac338301445e94e1ad40fb40fdb.

    commit d03d9c05036d3109eae643f473cc5a5ad0a80721
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:19 2024 -0500

        Revert "kernel: bpf: devmap: Create __dev_map_alloc_node"

        This reverts commit db726149fa9abfd1ca9add3e2db6b1524f7e90a3.

    commit 8c34bcb3e4c6630799764871b4af2e5f9344a371
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:15 2024 -0500

        Revert "BACKPORT: xdp: Add devmap_hash map type for looking up devices by hashed index"

        This reverts commit c4d4e1d201d8433e06b2ac66041d7105095a0204.

    commit ef277c7b3a08fd59943eb2b47af64afc513de008
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:11 2024 -0500

        Revert "BACKPORT: devmap: Allow map lookups from eBPF"

        This reverts commit 24d196375871c72de0de977de79afede5a7d1780.

    commit 4fcd87869c55c28ed59bff916d640147601816d2
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:07 2024 -0500

        Revert "gen_headers_{arm, arm64}: Add btf.h to the list"

        This reverts commit 37edfe7c90bac355885ffec3327b338a34619792.

    commit b89560e0b405b58ecc5fc12c15ad4f56147760d6
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:55:03 2024 -0500

        Revert "syscall: Fake uname to 4.19 for bpfloader/netd"

        This reverts commit 186e74af61269602d0c068d98928b1f25e03eba2.

    commit fd49f8c35eb7875d6810a5a52877ebc59bfd4530
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:54:59 2024 -0500

        Revert "syscall: Fake uname to 4.19 also for netbpfload"

        This reverts commit 34b9a1ab387d7dc83ede613b2c12b3741ea08edb.

    commit b853fcf2ff892664d0ff522ca7fd530bc94c023e
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Fri Dec 13 07:54:53 2024 -0500

        Revert "syscall: Increase bpf fake uname to 5.4"

        This reverts commit 9cdc014e11b410a7f03d8c968a35ee0dd6a28fff.

    # Conflicts:
    #	net/ipv4/af_inet.c
    #	net/ipv6/af_inet6.c

commit 4a0143fa36d300485650dc447b580151a69a3be2
Author: kondors1995 <normandija1945@gmail.com>
Date:   Wed Dec 18 13:48:16 2024 +0200

    Revert "syscall: Fake uname to 4.19 for bpfloader/netd"

    This reverts commit 417f37c97f.

commit 6f512c5c7341a51d7bbc9cdd93814764cae8868f
Author: kondors1995 <normandija1945@gmail.com>
Date:   Wed Dec 18 13:48:16 2024 +0200

    Revert "syscall: Fake uname to 4.19 also for netbpfload"

    This reverts commit a4c61c3d97.

commit 41f326616251f0122d81e518082ef7faaad4b2e5
Author: kondors1995 <normandija1945@gmail.com>
Date:   Wed Dec 18 13:48:15 2024 +0200

    Revert "syscall: Increase bpf fake uname to 5.4"

    This reverts commit 4a906017d4.

commit a0d3db72a836096cf533516d56c81a43150976ed
Author: kondors1995 <normandija1945@gmail.com>
Date:   Wed Dec 18 13:46:12 2024 +0200

    Revert "bpf: Hooks for sys_sendmsg"

    This reverts commit 735c155332.

commit 246eb3d90b95e0ab5aee8d5a9e9cd639c7beb174
Author: kondors1995 <normandija1945@gmail.com>
Date:   Wed Dec 18 13:45:08 2024 +0200

    Revert "syscall: Increase fake uname to 6.6.40"

    This reverts commit 92494b9920.

commit c56eaa5b7f170f58f2ade14bb71aaad2964b9018
Author: kondors1995 <normandija1945@gmail.com>
Date:   Mon Dec 9 21:35:20 2024 +0200

    raphael_defconfig: increase sbalance pooling rate to 10s

commit 54d190b8af
Author: Sultan Alsawaf <sultan@kerneltoast.com>
Date:   Wed Dec 4 15:53:22 2024 -0800

    sbalance: Fix severe misattribution of movable IRQs to the last active CPU

    Due to a horrible omission in the big IRQ list traversal, all movable IRQs
    are misattributed to the last active CPU in the system since that's what
    `bd` is last set to in the loop prior. This horribly breaks SBalance's
    notion of balance, producing nonsensical balancing decisions and failing to
    balance IRQs even when they are heavily imbalanced.

    Fix the massive breakage by adding the missing line of code to set `bd` to
    the CPU an IRQ actually belongs to, so that it's added to the correct CPU's
    movable IRQs list.

    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

commit f2fa2db581
Author: Sultan Alsawaf <sultan@kerneltoast.com>
Date:   Wed Dec 4 14:31:52 2024 -0800

    sbalance: Don't race with CPU hotplug

    When a CPU is hotplugged, cpu_active_mask is modified without any RCU
    synchronization. As a result, the only synchronization for cpu_active_mask
    provided by the hotplug code is the CPU hotplug lock.

    Furthermore, since IRQ balance is majorly disrupted during CPU hotplug due
    to mass IRQ migration off a dying CPU, SBalance just shouldn't operate
    while a CPU hotplug is in progress.

    Take the CPU hotplug lock in balance_irqs() to prevent races and mishaps
    during CPU hotplugs.

    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

commit a4e81ff60a
Author: Sultan Alsawaf <sultan@kerneltoast.com>
Date:   Wed Dec 4 14:16:48 2024 -0800

    sbalance: Convert various IRQ counter types to unsigned ints

    These counted values are actually unsigned ints, not unsigned longs.
    Convert them to unsigned ints since there's no reason for them to be longs.

    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-12-19 17:34:31 +02:00
kondors1995
373d5574ec Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts into 15.0 2024-10-20 18:10:19 +03:00
Eric Dumazet
795faf9727 net: fix __dst_negative_advice() race
commit 92f1655aa2b2294d0b49925f3b875a634bd3b59e upstream.

__dst_negative_advice() does not enforce proper RCU rules when
sk->dst_cache must be cleared, leading to possible UAF.

RCU rules are that we must first clear sk->sk_dst_cache,
then call dst_release(old_dst).

Note that sk_dst_reset(sk) is implementing this protocol correctly,
while __dst_negative_advice() uses the wrong order.

Given that ip6_negative_advice() has special logic
against RTF_CACHE, this means each of the three ->negative_advice()
existing methods must perform the sk_dst_reset() themselves.

Note the check against NULL dst is centralized in
__dst_negative_advice(), there is no need to duplicate
it in various callbacks.

Many thanks to Clement Lecigne for tracking this issue.

This old bug became visible after the blamed commit, using UDP sockets.

Fixes: a87cb3e48e ("net: Facility to report route quality of connected sockets")
Reported-by: Clement Lecigne <clecigne@google.com>
Diagnosed-by: Clement Lecigne <clecigne@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[mheyne: contextual conflict in ip6_negative_advice due to missing
  commit c3c14da0288d ("net/ipv6: add rcu locking to ip6_negative_advice") and
  commit 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")]
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:57 +00:00
kondors1995
ebe2649ab6 Merge remote-tracking branch 'openela/linux-4.14.y' into 14.0-matrix 2024-08-11 18:42:43 +03:00
Pablo Neira Ayuso
c01fc5238f netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
[ Upstream commit 7931d32955e09d0a11b1fe0b6aac1bfa061c005c ]

register store validation for NFT_DATA_VALUE is conditional, however,
the datatype is always either NFT_DATA_VALUE or NFT_DATA_VERDICT. This
only requires a new helper function to infer the register type from the
set datatype so this conditional check can be removed. Otherwise,
pointer to chain object can be leaked through the registers.

Fixes: 96518518cc ("netfilter: add nftables")
Reported-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 40188a25a9847dbeb7ec67517174a835a677752f)
[Vegard: fixed unrelated conflict due to missing commit
 9c22bd1ab442c552e9481f1157589362887a7f47 ("netfilter: nf_tables: defer
 gc run if previous batch is still pending").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-08-08 15:52:20 +00:00
Luiz Augusto von Dentz
9ae3694d96 Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
[ Upstream commit 806a5198c05987b748b50f3d0c0cfb3d417381a4 ]

This removes the bogus check for max > hcon->le_conn_max_interval since
the later is just the initial maximum conn interval not the maximum the
stack could support which is really 3200=4000ms.

In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values
of the following fields in IXIT that would cause hci_check_conn_params
to fail:

TSPX_conn_update_int_min
TSPX_conn_update_int_max
TSPX_conn_update_peripheral_latency
TSPX_conn_update_supervision_timeout

Link: https://github.com/bluez/bluez/issues/847
Fixes: e4b019515f95 ("Bluetooth: Enforce validation on max value of connection interval")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit a1f9c8328219b9bb2c29846d6c74ea981e60b5a7)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-08-08 15:52:16 +00:00
Pablo Neira Ayuso
50bfcb0af9 netfilter: nf_tables: drop map element references from preparation phase
[ Upstream commit 628bd3e49cba1c066228e23d71a852c23e26da73 ]

set .destroy callback releases the references to other objects in maps.
This is very late and it results in spurious EBUSY errors. Drop refcount
from the preparation phase instead, update set backend not to drop
reference counter from set .destroy path.

Exceptions: NFT_TRANS_PREPARE_ERROR does not require to drop the
reference counter because the transaction abort path releases the map
references for each element since the set is unbound. The abort path
also deals with releasing reference counter for new elements added to
unbound sets.

Fixes: 591054469b ("netfilter: nf_tables: revisit chain/object refcounting from elements")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bc9f791d2593f17e39f87c6e2b3a36549a3705b1)
[Harshit: Fixed 3 different conflicts due to missing commit:
 71cc0873e0e0 ("netfilter: nf_tables: Simplify set backend selection")
 and commit: 79b174ade16d ("netfilter: nf_tables: garbage collection for
 stateful expressions") and commit: 88bae77d6606 ("netfilter: nf_tables:
 use net_generic infra for transaction data") and these can't be
 backported to 4.14.y]
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-07-15 18:30:23 +00:00
Pablo Neira Ayuso
e4382ad0d8 netfilter: nf_tables: pass ctx to nf_tables_expr_destroy()
nft_set_elem_destroy() can be called from call_rcu context. Annotate
netns and table in set object so we can populate the context object.
Moreover, pass context object to nf_tables_set_elem_destroy() from the
commit phase, since it is already available from there.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 3453c92731884bad7c4c3a0667228b964747f3d5)
[Harshit: 4.14.y had backport commit: 4e0dbab570 ("netfilter:
 nf_tables: do not allow SET_ID to refer to another table") which does
 add couple of things which this commit is supposed to add]
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
[Vegard: removed .family = set->table->family assignment in
 nft_set_elem_destroy() as we're missing commit
 36596dadf54a920d26286cf9f421fb4ef648b51f ("netfilter: nf_tables: add
 single table list for all families").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-07-15 18:30:23 +00:00
Pablo Neira Ayuso
05084047e5 netfilter: nf_tables: fix set double-free in abort path
[ Upstream commit 40ba1d9b4d19796afc9b7ece872f5f3e8f5e2c13 ]

The abort path can cause a double-free of an anonymous set.
Added-and-to-be-aborted rule looks like this:

udp dport { 137, 138 } drop

The to-be-aborted transaction list looks like this:

newset
newsetelem
newsetelem
rule

This gets walked in reverse order, so first pass disables the rule, the
set elements, then the set.

After synchronize_rcu(), we then destroy those in same order: rule, set
element, set element, newset.

Problem is that the anonymous set has already been bound to the rule, so
the rule (lookup expression destructor) already frees the set, when then
cause use-after-free when trying to delete the elements from this set,
then try to free the set again when handling the newset expression.

Rule releases the bound set in first place from the abort path, this
causes the use-after-free on set element removal when undoing the new
element transactions. To handle this, skip new element transaction if
set is bound from the abort path.

This is still causes the use-after-free on set element removal.  To
handle this, remove transaction from the list when the set is already
bound.

Joint work with Florian Westphal.

Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path")
Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1325
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 25ddad73070c8fd15528fe804db90f0bda0dc6ad)
[Vegard: fix conflicts due to commit 59c38c8076
 ("netfilter: nf_tables: use-after-free in failing rule with bound set")
 which included extra code for the NFT_MSG_NEWSETELEM case that was
 added upstream in commit 40ba1d9b4d19796afc9b7ece872f5f3e8f5e2c13
 ("netfilter: nf_tables: fix set double-free in abort path") and
 conflicts due to commit 63d921c3e52a14125f293efea5f78508c36668c1
 ("netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound
 set/chain") for the bind/true differences.]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-07-15 18:28:11 +00:00
Pablo Neira Ayuso
832e81e147 netfilter: nf_tables: add nft_set_is_anonymous() helper
Add helper function to test for the NFT_SET_ANONYMOUS flag.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 408070d6ee3490da63430bc8ce13348cf2eb47ea)
[Harshit: Adding this helper commit as it would help for future commit
backports, fix conlfict reoslution due to missing commit: af26f3e2903b
("netfilter: nf_tables: unbind set in rule from commit path") which is a
later commit but applied to 4.14.y already and I have considered using
nft_is_anonymous() helper at couple of other places where the previous
backports avoided to use it as the helper wasn't present]
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-07-15 17:44:33 +00:00
Qingfang DENG
34358bf261 neighbour: fix unaligned access to pneigh_entry
commit ed779fe4c9b5a20b4ab4fd6f3e19807445bb78c7 upstream.

After the blamed commit, the member key is longer 4-byte aligned. On
platforms that do not support unaligned access, e.g., MIPS32R2 with
unaligned_action set to 1, this will trigger a crash when accessing
an IPv6 pneigh_entry, as the key is cast to an in6_addr pointer.

Change the type of the key to u32 to make it aligned.

Fixes: 62dd93181a ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.")
Signed-off-by: Qingfang DENG <qingfang.deng@siflower.com.cn>
Link: https://lore.kernel.org/r/20230601015432.159066-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f451d1a013fd585cbf70a65ca6b9cf3548bb039f)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-07-15 17:44:33 +00:00
SK00RUPA
e0b9e449c2 Merge tag 'v4.14.347-openela' into 14.0
This is the 4.14.347 OpenELA-Extended LTS stable release
2024-06-04 17:36:34 +02:00
SK00RUPA
d4959529f7 Merge tag 'v4.14.346-openela' into 14.0
This is the 4.14.346 OpenELA-Extended LTS stable release
2024-06-04 17:36:16 +02:00
SK00RUPA
29a1ff2b6d Merge tag 'v4.14.345-openela' into 14.0
This is the 4.14.345 OpenELA-Extended LTS stable release
2024-06-04 17:31:52 +02:00
Kuniyuki Iwashima
bade56293a af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().
commit 1971d13ffa84a551d29a81fdf5b5ec5be166ac83 upstream.

syzbot reported a lockdep splat regarding unix_gc_lock and
unix_state_lock().

One is called from recvmsg() for a connected socket, and another
is called from GC for TCP_LISTEN socket.

So, the splat is false-positive.

Let's add a dedicated lock class for the latter to suppress the splat.

Note that this change is not necessary for net-next.git as the issue
is only applied to the old GC impl.

[0]:
WARNING: possible circular locking dependency detected
6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Not tainted
 -----------------------------------------------------
kworker/u8:1/11 is trying to acquire lock:
ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x40e/0xf70 net/unix/garbage.c:302

but task is already holding lock:
ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

 -> #1 (unix_gc_lock){+.+.}-{2:2}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       unix_notinflight+0x13d/0x390 net/unix/garbage.c:140
       unix_detach_fds net/unix/af_unix.c:1819 [inline]
       unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876
       skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188
       skb_release_all net/core/skbuff.c:1200 [inline]
       __kfree_skb net/core/skbuff.c:1216 [inline]
       kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1252
       kfree_skb include/linux/skbuff.h:1262 [inline]
       manage_oob net/unix/af_unix.c:2672 [inline]
       unix_stream_read_generic+0x1125/0x2700 net/unix/af_unix.c:2749
       unix_stream_splice_read+0x239/0x320 net/unix/af_unix.c:2981
       do_splice_read fs/splice.c:985 [inline]
       splice_file_to_pipe+0x299/0x500 fs/splice.c:1295
       do_splice+0xf2d/0x1880 fs/splice.c:1379
       __do_splice fs/splice.c:1436 [inline]
       __do_sys_splice fs/splice.c:1652 [inline]
       __se_sys_splice+0x331/0x4a0 fs/splice.c:1634
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

 -> #0 (&u->lock){+.+.}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       __unix_gc+0x40e/0xf70 net/unix/garbage.c:302
       process_one_work kernel/workqueue.c:3254 [inline]
       process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335
       worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
       kthread+0x2f0/0x390 kernel/kthread.c:388
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(unix_gc_lock);
                               lock(&u->lock);
                               lock(unix_gc_lock);
  lock(&u->lock);

 *** DEADLOCK ***

3 locks held by kworker/u8:1/11:
 #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline]
 #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335
 #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline]
 #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335
 #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261

stack backtrace:
CPU: 0 PID: 11 Comm: kworker/u8:1 Not tainted 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: events_unbound __unix_gc
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
 _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
 spin_lock include/linux/spinlock.h:351 [inline]
 __unix_gc+0x40e/0xf70 net/unix/garbage.c:302
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
 kthread+0x2f0/0x390 kernel/kthread.c:388
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>

Fixes: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()")
Reported-and-tested-by: syzbot+fa379358c28cc87cc307@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=fa379358c28cc87cc307
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240424170443.9832-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b29dcdd0582c00cd6ee0bd7c958d3639aa9db27f)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-06-03 12:47:31 +00:00
Kuniyuki Iwashima
40d8d26e71 af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
[ Upstream commit 97af84a6bba2ab2b9c704c08e67de3b5ea551bb2 ]

When touching unix_sk(sk)->inflight, we are always under
spin_lock(&unix_gc_lock).

Let's convert unix_sk(sk)->inflight to the normal unsigned long.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240123170856.41348-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit c8a2b1f7208b0ea0a4ad4355e0510d84f508a9ff)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-06-03 12:47:30 +00:00
Jiri Benc
fee87d3871 ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
[ Upstream commit 7633c4da919ad51164acbf1aa322cc1a3ead6129 ]

Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it
still means hlist_for_each_entry_rcu can return an item that got removed
from the list. The memory itself of such item is not freed thanks to RCU
but nothing guarantees the actual content of the memory is sane.

In particular, the reference count can be zero. This can happen if
ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry
from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all
references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough
timing, this can happen:

1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry.

2. Then, the whole ipv6_del_addr is executed for the given entry. The
   reference count drops to zero and kfree_rcu is scheduled.

3. ipv6_get_ifaddr continues and tries to increments the reference count
   (in6_ifa_hold).

4. The rcu is unlocked and the entry is freed.

5. The freed entry is returned.

Prevent increasing of the reference count in such case. The name
in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe.

[   41.506330] refcount_t: addition on 0; use-after-free.
[   41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130
[   41.507413] Modules linked in: veth bridge stp llc
[   41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa #14
[   41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
[   41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130
[   41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff
[   41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282
[   41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000
[   41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900
[   41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff
[   41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000
[   41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48
[   41.514086] FS:  00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000
[   41.514726] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0
[   41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   41.516799] Call Trace:
[   41.517037]  <TASK>
[   41.517249]  ? __warn+0x7b/0x120
[   41.517535]  ? refcount_warn_saturate+0xa5/0x130
[   41.517923]  ? report_bug+0x164/0x190
[   41.518240]  ? handle_bug+0x3d/0x70
[   41.518541]  ? exc_invalid_op+0x17/0x70
[   41.520972]  ? asm_exc_invalid_op+0x1a/0x20
[   41.521325]  ? refcount_warn_saturate+0xa5/0x130
[   41.521708]  ipv6_get_ifaddr+0xda/0xe0
[   41.522035]  inet6_rtm_getaddr+0x342/0x3f0
[   41.522376]  ? __pfx_inet6_rtm_getaddr+0x10/0x10
[   41.522758]  rtnetlink_rcv_msg+0x334/0x3d0
[   41.523102]  ? netlink_unicast+0x30f/0x390
[   41.523445]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   41.523832]  netlink_rcv_skb+0x53/0x100
[   41.524157]  netlink_unicast+0x23b/0x390
[   41.524484]  netlink_sendmsg+0x1f2/0x440
[   41.524826]  __sys_sendto+0x1d8/0x1f0
[   41.525145]  __x64_sys_sendto+0x1f/0x30
[   41.525467]  do_syscall_64+0xa5/0x1b0
[   41.525794]  entry_SYSCALL_64_after_hwframe+0x72/0x7a
[   41.526213] RIP: 0033:0x7fbc4cfcea9a
[   41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[   41.527942] RSP: 002b:00007ffcf54012a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   41.528593] RAX: ffffffffffffffda RBX: 00007ffcf5401368 RCX: 00007fbc4cfcea9a
[   41.529173] RDX: 000000000000002c RSI: 00007fbc4b9d9bd0 RDI: 0000000000000005
[   41.529786] RBP: 00007fbc4bafb040 R08: 00007ffcf54013e0 R09: 000000000000000c
[   41.530375] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   41.530977] R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007fbc4ca85d1b
[   41.531573]  </TASK>

Fixes: 5c578aedcb ("IPv6: convert addrconf hash list to RCU")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Link: https://lore.kernel.org/r/8ab821e36073a4a406c50ec83c9e8dc586c539e4.1712585809.git.jbenc@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit b4b3b69a19016d4e7fbdbd1dbcc184915eb862e1)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-05-31 12:30:38 +00:00
Eric Dumazet
5f11455435 geneve: fix header validation in geneve[6]_xmit_skb
[ Upstream commit d8a6213d70accb403b82924a1c229e733433a5ef ]

syzbot is able to trigger an uninit-value in geneve_xmit() [1]

Problem : While most ip tunnel helpers (like ip_tunnel_get_dsfield())
uses skb_protocol(skb, true), pskb_inet_may_pull() is only using
skb->protocol.

If anything else than ETH_P_IPV6 or ETH_P_IP is found in skb->protocol,
pskb_inet_may_pull() does nothing at all.

If a vlan tag was provided by the caller (af_packet in the syzbot case),
the network header might not point to the correct location, and skb
linear part could be smaller than expected.

Add skb_vlan_inet_prepare() to perform a complete mac validation.

Use this in geneve for the moment, I suspect we need to adopt this
more broadly.

v4 - Jakub reported v3 broke l2_tos_ttl_inherit.sh selftest
   - Only call __vlan_get_protocol() for vlan types.
Link: https://lore.kernel.org/netdev/20240404100035.3270a7d5@kernel.org/

v2,v3 - Addressed Sabrina comments on v1 and v2
Link: https://lore.kernel.org/netdev/Zg1l9L2BNoZWZDZG@hog/

[1]

BUG: KMSAN: uninit-value in geneve_xmit_skb drivers/net/geneve.c:910 [inline]
 BUG: KMSAN: uninit-value in geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030
  geneve_xmit_skb drivers/net/geneve.c:910 [inline]
  geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030
  __netdev_start_xmit include/linux/netdevice.h:4903 [inline]
  netdev_start_xmit include/linux/netdevice.h:4917 [inline]
  xmit_one net/core/dev.c:3531 [inline]
  dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
  __dev_queue_xmit+0x348d/0x52c0 net/core/dev.c:4335
  dev_queue_xmit include/linux/netdevice.h:3091 [inline]
  packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276
  packet_snd net/packet/af_packet.c:3081 [inline]
  packet_sendmsg+0x8bb0/0x9ef0 net/packet/af_packet.c:3113
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x30f/0x380 net/socket.c:745
  __sys_sendto+0x685/0x830 net/socket.c:2191
  __do_sys_sendto net/socket.c:2203 [inline]
  __se_sys_sendto net/socket.c:2199 [inline]
  __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199
 do_syscall_64+0xd5/0x1f0
 entry_SYSCALL_64_after_hwframe+0x6d/0x75

Uninit was created at:
  slab_post_alloc_hook mm/slub.c:3804 [inline]
  slab_alloc_node mm/slub.c:3845 [inline]
  kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
  kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
  __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668
  alloc_skb include/linux/skbuff.h:1318 [inline]
  alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504
  sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795
  packet_alloc_skb net/packet/af_packet.c:2930 [inline]
  packet_snd net/packet/af_packet.c:3024 [inline]
  packet_sendmsg+0x722d/0x9ef0 net/packet/af_packet.c:3113
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x30f/0x380 net/socket.c:745
  __sys_sendto+0x685/0x830 net/socket.c:2191
  __do_sys_sendto net/socket.c:2203 [inline]
  __se_sys_sendto net/socket.c:2199 [inline]
  __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199
 do_syscall_64+0xd5/0x1f0
 entry_SYSCALL_64_after_hwframe+0x6d/0x75

CPU: 0 PID: 5033 Comm: syz-executor346 Not tainted 6.9.0-rc1-syzkaller-00005-g928a87efa423 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024

Fixes: d13f048dd40e ("net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb")
Reported-by: syzbot+9ee20ec1de7b3168db09@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/000000000000d19c3a06152f9ee4@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Phillip Potter <phil@philpotter.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 43be590456e1f3566054ce78ae2dbb68cbe1a536)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-05-31 10:46:18 +00:00
Eric Dumazet
a15af438bc tcp: properly terminate timers for kernel sockets
[ Upstream commit 151c9c724d05d5b0dd8acd3e11cb69ef1f2dbada ]

We had various syzbot reports about tcp timers firing after
the corresponding netns has been dismantled.

Fortunately Josef Bacik could trigger the issue more often,
and could test a patch I wrote two years ago.

When TCP sockets are closed, we call inet_csk_clear_xmit_timers()
to 'stop' the timers.

inet_csk_clear_xmit_timers() can be called from any context,
including when socket lock is held.
This is the reason it uses sk_stop_timer(), aka del_timer().
This means that ongoing timers might finish much later.

For user sockets, this is fine because each running timer
holds a reference on the socket, and the user socket holds
a reference on the netns.

For kernel sockets, we risk that the netns is freed before
timer can complete, because kernel sockets do not hold
reference on the netns.

This patch adds inet_csk_clear_xmit_timers_sync() function
that using sk_stop_timer_sync() to make sure all timers
are terminated before the kernel socket is released.
Modules using kernel sockets close them in their netns exit()
handler.

Also add sock_not_owned_by_me() helper to get LOCKDEP
support : inet_csk_clear_xmit_timers_sync() must not be called
while socket lock is held.

It is very possible we can revert in the future commit
3a58f13a881e ("net: rds: acquire refcount on TCP sockets")
which attempted to solve the issue in rds only.
(net/smc/af_smc.c and net/mptcp/subflow.c have similar code)

We probably can remove the check_net() tests from
tcp_out_of_resources() and __tcp_close() in the future.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/
Fixes: 26abe14379 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Fixes: 8a68173691 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket")
Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Josef Bacik <josef@toxicpanda.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/20240322135732.1535772-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 93f0133b9d589cc6e865f254ad9be3e9d8133f50)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-05-30 09:00:42 +00:00
Geliang Tang
f113056cc5 mptcp: add sk_stop_timer_sync helper
[ Upstream commit 08b81d873126b413cda511b1ea1cbb0e99938bbd ]

This patch added a new helper sk_stop_timer_sync, it deactivates a timer
like sk_stop_timer, but waits for the handler to finish.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 151c9c724d05 ("tcp: properly terminate timers for kernel sockets")
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 9c382bc16fa8f7499b0663398437e125cf4f763b)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-05-30 09:00:42 +00:00
kondors1995
1dc18a5da0 Merge remote-tracking branch 'openela/linux-4.14.y' into 14.0 2024-05-12 12:26:07 +03:00
Neal Cardwell
5069afc313 tcp: fix excessive TLP and RACK timeouts from HZ rounding
[ Upstream commit 1c2709cfff1dedbb9591e989e2f001484208d914 ]

We discovered from packet traces of slow loss recovery on kernels with
the default HZ=250 setting (and min_rtt < 1ms) that after reordering,
when receiving a SACKed sequence range, the RACK reordering timer was
firing after about 16ms rather than the desired value of roughly
min_rtt/4 + 2ms. The problem is largely due to the RACK reorder timer
calculation adding in TCP_TIMEOUT_MIN, which is 2 jiffies. On kernels
with HZ=250, this is 2*4ms = 8ms. The TLP timer calculation has the
exact same issue.

This commit fixes the TLP transmit timer and RACK reordering timer
floor calculation to more closely match the intended 2ms floor even on
kernels with HZ=250. It does this by adding in a new
TCP_TIMEOUT_MIN_US floor of 2000 us and then converting to jiffies,
instead of the current approach of converting to jiffies and then
adding th TCP_TIMEOUT_MIN value of 2 jiffies.

Our testing has verified that on kernels with HZ=1000, as expected,
this does not produce significant changes in behavior, but on kernels
with the default HZ=250 the latency improvement can be large. For
example, our tests show that for HZ=250 kernels at low RTTs this fix
roughly halves the latency for the RACK reorder timer: instead of
mostly firing at 16ms it mostly fires at 8ms.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Fixes: bb4d991a28 ("tcp: adjust tail loss probe timeout")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231015174700.2206872-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-05-06 14:36:40 +00:00
Eric Dumazet
759b99e274 tcp: Namespace-ify sysctl_tcp_early_retrans
[ Upstream commit 2ae21cf527da0e5cf9d7ee14bd5b0909bb9d1a75 ]

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 1c2709cfff1d ("tcp: fix excessive TLP and RACK timeouts from HZ rounding")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-05-06 14:36:40 +00:00
Hangbin Liu
7fb02a7f7d bonding: fix macvlan over alb bond support
[ Upstream commit e74216b8def3803e98ae536de78733e9d7f3b109 ]

The commit 14af9963ba ("bonding: Support macvlans on top of tlb/rlb mode
bonds") aims to enable the use of macvlans on top of rlb bond mode. However,
the current rlb bond mode only handles ARP packets to update remote neighbor
entries. This causes an issue when a macvlan is on top of the bond, and
remote devices send packets to the macvlan using the bond's MAC address
as the destination. After delivering the packets to the macvlan, the macvlan
will rejects them as the MAC address is incorrect. Consequently, this commit
makes macvlan over bond non-functional.

To address this problem, one potential solution is to check for the presence
of a macvlan port on the bond device using netif_is_macvlan_port(bond->dev)
and return NULL in the rlb_arp_xmit() function. However, this approach
doesn't fully resolve the situation when a VLAN exists between the bond and
macvlan.

So let's just do a partial revert for commit 14af9963ba in rlb_arp_xmit().
As the comment said, Don't modify or load balance ARPs that do not originate
locally.

Fixes: 14af9963ba ("bonding: Support macvlans on top of tlb/rlb mode bonds")
Reported-by: susan.zheng@veritas.com
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-05-06 14:36:38 +00:00
Jakub Kicinski
a975f7d7c8 net: remove bond_slave_has_mac_rcu()
[ Upstream commit 8b0fdcdc3a7d44aff907f0103f5ffb86b12bfe71 ]

No caller since v3.16.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: e74216b8def3 ("bonding: fix macvlan over alb bond support")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-05-06 14:36:38 +00:00
Cambda Zhu
c3b84f9003 net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX
[ Upstream commit f0628c524fd188c3f9418e12478dfdfadacba815 ]

This patch changes the behavior of TCP_LINGER2 about its limit. The
sysctl_tcp_fin_timeout used to be the limit of TCP_LINGER2 but now it's
only the default value. A new macro named TCP_FIN_TIMEOUT_MAX is added
as the limit of TCP_LINGER2, which is 2 minutes.

Since TCP_LINGER2 used sysctl_tcp_fin_timeout as the default value
and the limit in the past, the system administrator cannot set the
default value for most of sockets and let some sockets have a greater
timeout. It might be a mistake that let the sysctl to be the limit of
the TCP_LINGER2. Maybe we can add a new sysctl to set the max of
TCP_LINGER2, but FIN-WAIT-2 timeout is usually no need to be too long
and 2 minutes are legal considering TCP specs.

Changes in v3:
- Remove the new socket option and change the TCP_LINGER2 behavior so
  that the timeout can be set to value between sysctl_tcp_fin_timeout
  and 2 minutes.

Changes in v2:
- Add int overflow check for the new socket option.

Changes in v1:
- Add a new socket option to set timeout greater than
  sysctl_tcp_fin_timeout.

Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 9df5335ca974 ("tcp: annotate data-races around tp->linger2")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-05-06 14:36:36 +00:00
Krzysztof Kozlowski
b6e41b5c23 nfc: constify several pointers to u8, char and sk_buff
[ Upstream commit 3df40eb3a2ea58bf404a38f15a7a2768e4762cb0 ]

Several functions receive pointers to u8, char or sk_buff but do not
modify the contents so make them const.  This allows doing the same for
local variables and in total makes the code a little bit safer.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 0d9b41daa590 ("nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-05-06 14:36:35 +00:00
kondors1995
bcaa6a3480 Merge remote-tracking branch 'openela/linux-4.14.y' into 14.0 2024-04-21 22:19:28 +03:00
Xin Long
c4d2af79e4 sock_diag: request _diag module only when the family or proto has been registered
Now when using 'ss' in iproute, kernel would try to load all _diag
modules, which also causes corresponding family and proto modules
to be loaded as well due to module dependencies.

Like after running 'ss', sctp, dccp, af_packet (if it works as a module)
would be loaded.

For example:

  $ lsmod|grep sctp
  $ ss
  $ lsmod|grep sctp
  sctp_diag              16384  0
  sctp                  323584  5 sctp_diag
  inet_diag              24576  4 raw_diag,tcp_diag,sctp_diag,udp_diag
  libcrc32c              16384  3 nf_conntrack,nf_nat,sctp

As these family and proto modules are loaded unintentionally, it
could cause some problems, like:

- Some debug tools use 'ss' to collect the socket info, which loads all
  those diag and family and protocol modules. It's noisy for identifying
  issues.

- Users usually expect to drop sctp init packet silently when they
  have no sense of sctp protocol instead of sending abort back.

- It wastes resources (especially with multiple netns), and SCTP module
  can't be unloaded once it's loaded.

...

In short, it's really inappropriate to have these family and proto
modules loaded unexpectedly when just doing debugging with inet_diag.

This patch is to introduce sock_load_diag_module() where it loads
the _diag module only when it's corresponding family or proto has
been already registered.

Note that we can't just load _diag module without the family or
proto loaded, as some symbols used in _diag module are from the
family or proto module.

v1->v2:
  - move inet proto check to inet_diag to avoid a compiling err.
v2->v3:
  - define sock_load_diag_module in sock.c and export one symbol
    only.
  - improve the changelog.

Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bf2ae2e4bf9360e07c0cdfa166bcdc0afd92f4ce)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
2024-04-16 10:30:28 +00:00
Willem de Bruijn
683e08ff6e ip: validate header length on virtual device xmit
[ Upstream commit cb9f1b783850b14cbd7f87d061d784a666dfba1f ]

KMSAN detected read beyond end of buffer in vti and sit devices when
passing truncated packets with PF_PACKET. The issue affects additional
ip tunnel devices.

Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the
inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when
accessing the inner header").

Move the check to a separate helper and call at the start of each
ndo_start_xmit function in net/ipv4 and net/ipv6.

Minor changes:
- convert dev_kfree_skb to kfree_skb on error path,
  as dev_kfree_skb calls consume_skb which is not for error paths.
- use pskb_network_may_pull even though that is pedantic here,
  as the same as pskb_may_pull for devices without llheaders.
- do not cache ipv6 hdrs if used only once
  (unsafe across pskb_may_pull, was more relevant to earlier patch)

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cde81154f86e36df0fad21c23976cc8b80c01600)
[vegard: drop changes to ip6erspan_tunnel_xmit(), which doesn't exist
 in 4.14 due to missing commit 5a963eb61b7c39e6c422b6e48619d19d04719358
 ("ip6_gre: Add ERSPAN native tunnel support") from 4.16]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-04-13 08:29:03 +00:00
Andrey Ignatov
3583a6eb3f BACKPORT: bpf: Hooks for sys_connect
== The problem ==

See description of the problem in the initial patch of this patch set.

== The solution ==

The patch provides much more reliable in-kernel solution for the 2nd
part of the problem: making outgoing connecttion from desired IP.

It adds new attach types `BPF_CGROUP_INET4_CONNECT` and
`BPF_CGROUP_INET6_CONNECT` for program type
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both
source and destination of a connection at connect(2) time.

Local end of connection can be bound to desired IP using newly
introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though,
and doesn't support binding to port, i.e. leverages
`IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this:
* looking for a free port is expensive and can affect performance
  significantly;
* there is no use-case for port.

As for remote end (`struct sockaddr *` passed by user), both parts of it
can be overridden, remote IP and remote port. It's useful if an
application inside cgroup wants to connect to another application inside
same cgroup or to itself, but knows nothing about IP assigned to the
cgroup.

Support is added for IPv4 and IPv6, for TCP and UDP.

IPv4 and IPv6 have separate attach types for same reason as sys_bind
hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields
when user passes sockaddr_in since it'd be out-of-bound.

== Implementation notes ==

The patch introduces new field in `struct proto`: `pre_connect` that is
a pointer to a function with same signature as `connect` but is called
before it. The reason is in some cases BPF hooks should be called way
before control is passed to `sk->sk_prot->connect`. Specifically
`inet_dgram_connect` autobinds socket before calling
`sk->sk_prot->connect` and there is no way to call `bpf_bind()` from
hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since
it'd cause double-bind. On the other hand `proto.pre_connect` provides a
flexible way to add BPF hooks for connect only for necessary `proto` and
call them at desired time before `connect`. Since `bpf_bind()` is
allowed to bind only to IP and autobind in `inet_dgram_connect` binds
only port there is no chance of double-bind.

bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite
of value of `bind_address_no_port` socket field.

bpf_bind() sets `with_lock` to `false` when calling to __inet_bind()
and __inet6_bind() since all call-sites, where bpf_bind() is called,
already hold socket lock.

Change-Id: I03eb513369c630b203466621d1fbdb9b29c8333c
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
2024-03-26 18:34:01 +02:00
Andrey Ignatov
36f017a281 BACKPORT: net: Introduce __inet_bind() and __inet6_bind
Refactor `bind()` code to make it ready to be called from BPF helper
function `bpf_bind()` (will be added soon). Implementation of
`inet_bind()` and `inet6_bind()` is separated into `__inet_bind()` and
`__inet6_bind()` correspondingly. These function can be used from both
`sk_prot->bind` and `bpf_bind()` contexts.

New functions have two additional arguments.

`force_bind_address_no_port` forces binding to IP only w/o checking
`inet_sock.bind_address_no_port` field. It'll allow to bind local end of
a connection to desired IP in `bpf_bind()` w/o changing
`bind_address_no_port` field of a socket. It's useful since `bpf_bind()`
can return an error and we'd need to restore original value of
`bind_address_no_port` in that case if we changed this before calling to
the helper.

`with_lock` specifies whether to lock socket when working with `struct
sk` or not. The argument is set to `true` for `sk_prot->bind`, i.e. old
behavior is preserved. But it will be set to `false` for `bpf_bind()`
use-case. The reason is all call-sites, where `bpf_bind()` will be
called, already hold that socket lock.

Change-Id: I3cd102acdb2b3c14946ef8452fd7afb763e8215f
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
2024-03-26 18:34:01 +02:00
kondors1995
e6b0b34bb0 Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts into 14.0 2024-03-26 18:32:22 +02:00
Eric Dumazet
fc4e079263 af_unix: fix lockdep positive in sk_diag_dump_icons()
[ Upstream commit 4d322dce82a1d44f8c83f0f54f95dd1b8dcf46c9 ]

syzbot reported a lockdep splat [1].

Blamed commit hinted about the possible lockdep
violation, and code used unix_state_lock_nested()
in an attempt to silence lockdep.

It is not sufficient, because unix_state_lock_nested()
is already used from unix_state_double_lock().

We need to use a separate subclass.

This patch adds a distinct enumeration to make things
more explicit.

Also use swap() in unix_state_double_lock() as a clean up.

v2: add a missing inline keyword to unix_state_lock_nested()

[1]
WARNING: possible circular locking dependency detected
6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Not tainted

syz-executor.1/2542 is trying to acquire lock:
 ffff88808b5df9e8 (rlock-AF_UNIX){+.+.}-{2:2}, at: skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863

but task is already holding lock:
 ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&u->lock/1){+.+.}-{2:2}:
        lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
        _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
        sk_diag_dump_icons net/unix/diag.c:87 [inline]
        sk_diag_fill+0x6ea/0xfe0 net/unix/diag.c:157
        sk_diag_dump net/unix/diag.c:196 [inline]
        unix_diag_dump+0x3e9/0x630 net/unix/diag.c:220
        netlink_dump+0x5c1/0xcd0 net/netlink/af_netlink.c:2264
        __netlink_dump_start+0x5d7/0x780 net/netlink/af_netlink.c:2370
        netlink_dump_start include/linux/netlink.h:338 [inline]
        unix_diag_handler_dump+0x1c3/0x8f0 net/unix/diag.c:319
       sock_diag_rcv_msg+0xe3/0x400
        netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2543
        sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280
        netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
        netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1367
        netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1908
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg net/socket.c:745 [inline]
        sock_write_iter+0x39a/0x520 net/socket.c:1160
        call_write_iter include/linux/fs.h:2085 [inline]
        new_sync_write fs/read_write.c:497 [inline]
        vfs_write+0xa74/0xca0 fs/read_write.c:590
        ksys_write+0x1a0/0x2c0 fs/read_write.c:643
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b

-> #0 (rlock-AF_UNIX){+.+.}-{2:2}:
        check_prev_add kernel/locking/lockdep.c:3134 [inline]
        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
        validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869
        __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
        lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
        skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
        unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg net/socket.c:745 [inline]
        ____sys_sendmsg+0x592/0x890 net/socket.c:2584
        ___sys_sendmsg net/socket.c:2638 [inline]
        __sys_sendmmsg+0x3b2/0x730 net/socket.c:2724
        __do_sys_sendmmsg net/socket.c:2753 [inline]
        __se_sys_sendmmsg net/socket.c:2750 [inline]
        __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&u->lock/1);
                               lock(rlock-AF_UNIX);
                               lock(&u->lock/1);
  lock(rlock-AF_UNIX);

 *** DEADLOCK ***

1 lock held by syz-executor.1/2542:
  #0: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089

stack backtrace:
CPU: 1 PID: 2542 Comm: syz-executor.1 Not tainted 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
  check_noncircular+0x366/0x490 kernel/locking/lockdep.c:2187
  check_prev_add kernel/locking/lockdep.c:3134 [inline]
  check_prevs_add kernel/locking/lockdep.c:3253 [inline]
  validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869
  __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
  lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
  skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
  unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg net/socket.c:745 [inline]
  ____sys_sendmsg+0x592/0x890 net/socket.c:2584
  ___sys_sendmsg net/socket.c:2638 [inline]
  __sys_sendmmsg+0x3b2/0x730 net/socket.c:2724
  __do_sys_sendmmsg net/socket.c:2753 [inline]
  __se_sys_sendmmsg net/socket.c:2750 [inline]
  __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f26d887cda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f26d95a60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00007f26d89abf80 RCX: 00007f26d887cda9
RDX: 000000000000003e RSI: 00000000200bd000 RDI: 0000000000000004
RBP: 00007f26d88c947a R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000008c0 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f26d89abf80 R15: 00007ffcfe081a68

Fixes: 2aac7a2cb0 ("unix_diag: Pending connections IDs NLA")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240130184235.1620738-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 875f31aaa67e306098befa5e798a049075910fa7)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-03-08 08:21:35 +00:00
Kuniyuki Iwashima
dc5870e21f llc: Drop support for ETH_P_TR_802_2.
[ Upstream commit e3f9bed9bee261e3347131764e42aeedf1ffea61 ]

syzbot reported an uninit-value bug below. [0]

llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2
(0x0011), and syzbot abused the latter to trigger the bug.

  write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16)

llc_conn_handler() initialises local variables {saddr,daddr}.mac
based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes
them to __llc_lookup().

However, the initialisation is done only when skb->protocol is
htons(ETH_P_802_2), otherwise, __llc_lookup_established() and
__llc_lookup_listener() will read garbage.

The missing initialisation existed prior to commit 211ed86510
("net: delete all instances of special processing for token ring").

It removed the part to kick out the token ring stuff but forgot to
close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv().

Let's remove llc_tr_packet_type and complete the deprecation.

[0]:
BUG: KMSAN: uninit-value in __llc_lookup_established+0xe9d/0xf90
 __llc_lookup_established+0xe9d/0xf90
 __llc_lookup net/llc/llc_conn.c:611 [inline]
 llc_conn_handler+0x4bd/0x1360 net/llc/llc_conn.c:791
 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206
 __netif_receive_skb_one_core net/core/dev.c:5527 [inline]
 __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5641
 netif_receive_skb_internal net/core/dev.c:5727 [inline]
 netif_receive_skb+0x58/0x660 net/core/dev.c:5786
 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555
 tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002
 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
 call_write_iter include/linux/fs.h:2020 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x8ef/0x1490 fs/read_write.c:584
 ksys_write+0x20f/0x4c0 fs/read_write.c:637
 __do_sys_write fs/read_write.c:649 [inline]
 __se_sys_write fs/read_write.c:646 [inline]
 __x64_sys_write+0x93/0xd0 fs/read_write.c:646
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

Local variable daddr created at:
 llc_conn_handler+0x53/0x1360 net/llc/llc_conn.c:783
 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206

CPU: 1 PID: 5004 Comm: syz-executor994 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023

Fixes: 211ed86510 ("net: delete all instances of special processing for token ring")
Reported-by: syzbot+b5ad66046b913bc04c6f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b5ad66046b913bc04c6f
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240119015515.61898-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 165ad1e22779685c3ed3dd349c6c4c632309cc62)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-03-08 08:21:28 +00:00
Luiz Augusto von Dentz
6ba5a14ad6 Bluetooth: Fix bogus check for re-auth no supported with non-ssp
[ Upstream commit d03376c185926098cb4d668d6458801eb785c0a5 ]

This reverts 19f8def031
"Bluetooth: Fix auth_complete_evt for legacy units" which seems to be
working around a bug on a broken controller rather then any limitation
imposed by the Bluetooth spec, in fact if there ws not possible to
re-auth the command shall fail not succeed.

Fixes: 19f8def031 ("Bluetooth: Fix auth_complete_evt for legacy units")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit f7f627ac761b2fb0c487e5aaff1585f1014ab9a6)
[fix trivial conflict due to missing commit
 2064ee332e4c1b7495cf68b84355c213d8fe71fd ("Bluetooth: Use bt_dev_err
 and bt_dev_info when possible")]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-02-02 11:33:42 +00:00
Jon Maxwell
12cda1d577 ipv6: remove max_size check inline with ipv4
commit af6d10345ca76670c1b7c37799f0d5576ccef277 upstream.

In ip6_dst_gc() replace:

  if (entries > gc_thresh)

With:

  if (entries > ops->gc_thresh)

Sending Ipv6 packets in a loop via a raw socket triggers an issue where a
route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly
consumes the Ipv6 max_size threshold which defaults to 4096 resulting in
these warnings:

[1]   99.187805] dst_alloc: 7728 callbacks suppressed
[2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
.
.
[300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

When this happens the packet is dropped and sendto() gets a network is
unreachable error:

remaining pkt 200557 errno 101
remaining pkt 196462 errno 101
.
.
remaining pkt 126821 errno 101

Implement David Aherns suggestion to remove max_size check seeing that Ipv6
has a GC to manage memory usage. Ipv4 already does not check max_size.

Here are some memory comparisons for Ipv4 vs Ipv6 with the patch:

Test by running 5 instances of a program that sends UDP packets to a raw
socket 5000000 times. Compare Ipv4 and Ipv6 performance with a similar
program.

Ipv4:

Before test:

MemFree:        29427108 kB
Slab:             237612 kB

ip6_dst_cache       1912   2528    256   32    2 : tunables    0    0    0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0
ip_dst_cache        2881   3990    192   42    2 : tunables    0    0    0

During test:

MemFree:        29417608 kB
Slab:             247712 kB

ip6_dst_cache       1912   2528    256   32    2 : tunables    0    0    0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0
ip_dst_cache       44394  44394    192   42    2 : tunables    0    0    0

After test:

MemFree:        29422308 kB
Slab:             238104 kB

ip6_dst_cache       1912   2528    256   32    2 : tunables    0    0    0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0
ip_dst_cache        3048   4116    192   42    2 : tunables    0    0    0

Ipv6 with patch:

Errno 101 errors are not observed anymore with the patch.

Before test:

MemFree:        29422308 kB
Slab:             238104 kB

ip6_dst_cache       1912   2528    256   32    2 : tunables    0    0    0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0
ip_dst_cache        3048   4116    192   42    2 : tunables    0    0    0

During Test:

MemFree:        29431516 kB
Slab:             240940 kB

ip6_dst_cache      11980  12064    256   32    2 : tunables    0    0    0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0
ip_dst_cache        3048   4116    192   42    2 : tunables    0    0    0

After Test:

MemFree:        29441816 kB
Slab:             238132 kB

ip6_dst_cache       1902   2432    256   32    2 : tunables    0    0    0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0
ip_dst_cache        3048   4116    192   42    2 : tunables    0    0    0

Tested-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230112012532.311021-1-jmaxwell37@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Cc: <stable@vger.kernel.org> # 4.19.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 95372b040ae689293c6863b90049f1af68410c8b)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-01-18 11:27:48 +00:00
Eric Dumazet
e0411760af ipv6: make ip6_rt_gc_expire an atomic_t
commit 9cb7c013420f98fa6fd12fc6a5dc055170c108db upstream.

Reads and Writes to ip6_rt_gc_expire always have been racy,
as syzbot reported lately [1]

There is a possible risk of under-flow, leading
to unexpected high value passed to fib6_run_gc(),
although I have not observed this in the field.

Hosts hitting ip6_dst_gc() very hard are under pretty bad
state anyway.

[1]
BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc

read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1:
 ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
 dst_alloc+0x9b/0x160 net/core/dst.c:86
 ip6_dst_alloc net/ipv6/route.c:344 [inline]
 icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
 mld_send_cr net/ipv6/mcast.c:2119 [inline]
 mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
 worker_thread+0x618/0xa70 kernel/workqueue.c:2436
 kthread+0x1a9/0x1e0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30

read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0:
 ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
 dst_alloc+0x9b/0x160 net/core/dst.c:86
 ip6_dst_alloc net/ipv6/route.c:344 [inline]
 icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
 mld_send_cr net/ipv6/mcast.c:2119 [inline]
 mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
 worker_thread+0x618/0xa70 kernel/workqueue.c:2436
 kthread+0x1a9/0x1e0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30

value changed: 0x00000bb3 -> 0x00000ba9

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: mld mld_ifc_work

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220413181333.649424-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ 4.19: context adjustment in include/net/netns/ipv6.h ]
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Cc: <stable@vger.kernel.org> # 4.19.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b4cfbeaebeb355dbaefb218470055de2e8a73020)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-01-18 11:27:47 +00:00
Eric Dumazet
2ee1663e55 net/dst: use a smaller percpu_counter batch for dst entries accounting
commit cf86a086a18095e33e0637cb78cda1fcf5280852 upstream.

percpu_counter_add() uses a default batch size which is quite big
on platforms with 256 cpus. (2*256 -> 512)

This means dst_entries_get_fast() can be off by +/- 2*(nr_cpus^2)
(131072 on servers with 256 cpus)

Reduce the batch size to something more reasonable, and
add logic to ip6_dst_gc() to call dst_entries_get_slow()
before calling the _very_ expensive fib6_run_gc() function.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Cc: <stable@vger.kernel.org> # 4.19.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9635bd0a5296e2e725c6b33e530d0ef582e2f64e)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-01-18 11:27:47 +00:00
kondors1995
1803e52ec8 Merge branch 'android-4.14-stable' of https://github.com/aosp-mirror/kernel_common into 14.0 2023-12-21 10:49:06 +02:00
Greg Kroah-Hartman
8382692884 Merge 4.14.333 into android-4.14-stable
Changes in 4.14.333
	tg3: Move the [rt]x_dropped counters to tg3_napi
	tg3: Increment tx_dropped in tg3_tso_bug()
	drm/amdgpu: correct chunk_ptr to a pointer to chunk.
	net: hns: fix fake link up on xge port
	tcp: do not accept ACK of bytes we never sent
	RDMA/bnxt_re: Correct module description string
	hwmon: (acpi_power_meter) Fix 4.29 MW bug
	tracing: Fix a warning when allocating buffered events fails
	scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
	ALSA: pcm: fix out-of-bounds in snd_pcm_state_names
	nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
	tracing: Always update snapshot buffer size
	tracing: Fix incomplete locking when disabling buffered events
	tracing: Fix a possible race when disabling buffered events
	packet: Move reference count in packet_sock to atomic_long_t
	parport: Add support for Brainboxes IX/UC/PX parallel cards
	serial: sc16is7xx: address RX timeout interrupt errata
	serial: 8250_omap: Add earlycon support for the AM654 UART controller
	KVM: s390/mm: Properly reset no-dat
	nilfs2: fix missing error check for sb_set_blocksize call
	netlink: don't call ->netlink_bind with table lock held
	genetlink: add CAP_NET_ADMIN test for multicast bind
	psample: Require 'CAP_NET_ADMIN' when joining "packets" group
	drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
	Linux 4.14.333

Change-Id: Iebcaaf9d6c5e2ef71dd23c3c6246f6cef45d296a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-12-14 08:59:48 +00:00
Ido Schimmel
d9478fe0a8 drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
commit e03781879a0d524ce3126678d50a80484a513c4b upstream.

The "NET_DM" generic netlink family notifies drop locations over the
"events" multicast group. This is problematic since by default generic
netlink allows non-root users to listen to these notifications.

Fix by adding a new field to the generic netlink multicast group
structure that when set prevents non-root users or root without the
'CAP_SYS_ADMIN' capability (in the user namespace owning the network
namespace) from joining the group. Set this field for the "events"
group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the
nature of the information that is shared over this group.

Note that the capability check in this case will always be performed
against the initial user namespace since the family is not netns aware
and only operates in the initial network namespace.

A new field is added to the structure rather than using the "flags"
field because the existing field uses uAPI flags and it is inappropriate
to add a new uAPI flag for an internal kernel check. In net-next we can
rework the "flags" field to use internal flags and fold the new field
into it. But for now, in order to reduce the amount of changes, add a
new field.

Since the information can only be consumed by root, mark the control
plane operations that start and stop the tracing as root-only using the
'GENL_ADMIN_PERM' flag.

Tested using [1].

Before:

 # capsh -- -c ./dm_repo
 # capsh --drop=cap_sys_admin -- -c ./dm_repo

After:

 # capsh -- -c ./dm_repo
 # capsh --drop=cap_sys_admin -- -c ./dm_repo
 Failed to join "events" multicast group

[1]
 $ cat dm.c
 #include <stdio.h>
 #include <netlink/genl/ctrl.h>
 #include <netlink/genl/genl.h>
 #include <netlink/socket.h>

 int main(int argc, char **argv)
 {
 	struct nl_sock *sk;
 	int grp, err;

 	sk = nl_socket_alloc();
 	if (!sk) {
 		fprintf(stderr, "Failed to allocate socket\n");
 		return -1;
 	}

 	err = genl_connect(sk);
 	if (err) {
 		fprintf(stderr, "Failed to connect socket\n");
 		return err;
 	}

 	grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events");
 	if (grp < 0) {
 		fprintf(stderr,
 			"Failed to resolve \"events\" multicast group\n");
 		return grp;
 	}

 	err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE);
 	if (err) {
 		fprintf(stderr, "Failed to join \"events\" multicast group\n");
 		return err;
 	}

 	return 0;
 }
 $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c

Fixes: 9a8afc8d39 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231206213102.1824398-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13 16:46:18 +01:00
Ido Schimmel
eddc6121ec genetlink: add CAP_NET_ADMIN test for multicast bind
This is a partial backport of upstream commit 4d54cc32112d ("mptcp:
avoid lock_fast usage in accept path"). It is only a partial backport
because the patch in the link below was erroneously squash-merged into
upstream commit 4d54cc32112d ("mptcp: avoid lock_fast usage in accept
path"). Below is the original patch description from Florian Westphal:

"
genetlink sets NL_CFG_F_NONROOT_RECV for its netlink socket so anyone can
subscribe to multicast messages.

rtnetlink doesn't allow this unconditionally,  rtnetlink_bind() restricts
bind requests to CAP_NET_ADMIN for a few groups.

This allows to set GENL_UNS_ADMIN_PERM flag on genl mcast groups to
mandate CAP_NET_ADMIN.

This will be used by the upcoming mptcp netlink event facility which
exposes the token (mptcp connection identifier) to userspace.
"

Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@linux.intel.com/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13 16:46:17 +01:00
kondors1995
3a48744561 Merge branch 'android-4.14-stable' of https://github.com/aosp-mirror/kernel_common into 13.0 2023-12-02 18:01:02 +02:00
Greg Kroah-Hartman
52d13de272 Merge 4.14.331 into android-4.14-stable
Changes in 4.14.331
	locking/ww_mutex/test: Fix potential workqueue corruption
	clocksource/drivers/timer-imx-gpt: Fix potential memory leak
	clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware
	x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
	wifi: mac80211: don't return unset power in ieee80211_get_tx_power()
	wifi: ath9k: fix clang-specific fortify warnings
	wifi: ath10k: fix clang-specific fortify warning
	net: annotate data-races around sk->sk_dst_pending_confirm
	drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7
	drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga
	selftests/efivarfs: create-read: fix a resource leak
	crypto: pcrypt - Fix hungtask for PADATA_RESET
	RDMA/hfi1: Use FIELD_GET() to extract Link Width
	fs/jfs: Add check for negative db_l2nbperpage
	fs/jfs: Add validity check for db_maxag and db_agpref
	jfs: fix array-index-out-of-bounds in dbFindLeaf
	jfs: fix array-index-out-of-bounds in diAlloc
	ALSA: hda: Fix possible null-ptr-deref when assigning a stream
	atm: iphase: Do PCI error checks on own line
	scsi: libfc: Fix potential NULL pointer dereference in fc_lport_ptp_setup()
	tty: vcc: Add check for kstrdup() in vcc_probe()
	i2c: sun6i-p2wi: Prevent potential division by zero
	media: gspca: cpia1: shift-out-of-bounds in set_flicker
	media: vivid: avoid integer overflow
	gfs2: ignore negated quota changes
	pwm: Fix double shift bug
	media: venus: hfi: add checks to perform sanity on queue pointers
	randstruct: Fix gcc-plugin performance mode to stay in group
	KVM: x86: Ignore MSR_AMD64_TW_CFG access
	audit: don't take task_lock() in audit_exe_compare() code path
	audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare()
	hvc/xen: fix error path in xen_hvc_init() to always register frontend driver
	PCI/sysfs: Protect driver's D3cold preference from user space
	mmc: vub300: fix an error code
	PM: hibernate: Use __get_safe_page() rather than touching the list
	PM: hibernate: Clean up sync_read handling in snapshot_write_next()
	mmc: meson-gx: Remove setting of CMD_CFG_ERROR
	genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware
	jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev
	mcb: fix error handling for different scenarios when parsing
	parisc: Prevent booting 64-bit kernels on PA1.x machines
	parisc/pgtable: Do not drop upper 5 address bits of physical address
	ALSA: info: Fix potential deadlock at disconnection
	net: dsa: lan9303: consequently nested-lock physical MDIO
	i2c: i801: fix potential race in i801_block_transaction_byte_by_byte
	media: sharp: fix sharp encoding
	media: venus: hfi: fix the check to handle session buffer requirement
	ext4: apply umask if ACL support is disabled
	ext4: correct offset of gdb backup in non meta_bg group to update_backups
	ext4: correct return value of ext4_convert_meta_bg
	ext4: remove gdb backup copy for meta bg in setup_new_flex_group_blocks
	scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids
	net: sched: fix race condition in qdisc_graft()
	Linux 4.14.331

Change-Id: I1a1bce75363d3b2c731f3e947543c6506bed9817
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-11-28 17:35:00 +00:00
Eric Dumazet
541184f4ab net: annotate data-races around sk->sk_dst_pending_confirm
[ Upstream commit eb44ad4e635132754bfbcb18103f1dcb7058aedd ]

This field can be read or written without socket lock being held.

Add annotations to avoid load-store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 16:45:43 +00:00
kondors1995
f62c01b1c6 Merge branch 'android-4.14-stable' of https://github.com/aosp-mirror/kernel_common into 13.0 2023-11-04 12:25:20 +02:00
Greg Kroah-Hartman
67419faf9f Merge 4.14.328 into android-4.14-stable
Changes in 4.14.328
	RDMA/cxgb4: Check skb value for failure to allocate
	HID: logitech-hidpp: Fix kernel crash on receiver USB disconnect
	drm: etvnaviv: fix bad backport leading to warning
	ieee802154: ca8210: Fix a potential UAF in ca8210_probe
	drm/vmwgfx: fix typo of sizeof argument
	ixgbe: fix crash with empty VF macvlan list
	nfc: nci: assert requested protocol is valid
	workqueue: Override implicit ordered attribute in workqueue_apply_unbound_cpumask()
	usb: xhci: xhci-ring: Use sysdev for mapping bounce buffer
	net: usb: dm9601: fix uninitialized variable use in dm9601_mdio_read
	usb: musb: Get the musb_qh poniter after musb_giveback
	usb: musb: Modify the "HWVers" register address
	iio: pressure: bmp280: Fix NULL pointer exception
	iio: pressure: ms5611: ms5611_prom_is_valid false negative bug
	mcb: remove is_added flag from mcb_device struct
	ceph: fix incorrect revoked caps assert in ceph_fill_file_size()
	Input: powermate - fix use-after-free in powermate_config_complete
	Input: xpad - add PXN V900 support
	cgroup: Remove duplicates in cgroup v1 tasks file
	pinctrl: avoid unsafe code pattern in find_pinctrl()
	usb: gadget: udc-xilinx: replace memcpy with memcpy_toio
	usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call
	x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs
	usb: hub: Guard against accesses to uninitialized BOS descriptors
	Bluetooth: hci_event: Ignore NULL link key
	Bluetooth: Reject connection with the device which has same BD_ADDR
	Bluetooth: Fix a refcnt underflow problem for hci_conn
	Bluetooth: vhci: Fix race when opening vhci device
	Bluetooth: hci_event: Fix coding style
	Bluetooth: avoid memcmp() out of bounds warning
	nfc: nci: fix possible NULL pointer dereference in send_acknowledge()
	regmap: fix NULL deref on lookup
	KVM: x86: Mask LVTPC when handling a PMI
	netfilter: nft_payload: fix wrong mac header matching
	xfrm: fix a data-race in xfrm_gen_index()
	net: ipv4: fix return value check in esp_remove_trailer
	net: ipv6: fix return value check in esp_remove_trailer
	net: rfkill: gpio: prevent value glitch during probe
	net: usb: smsc95xx: Fix an error code in smsc95xx_reset()
	i40e: prevent crash on probe if hw registers have invalid values
	ARM: dts: ti: omap: Fix noisy serial with overrun-throttle-ms for mapphone
	btrfs: initialize start_slot in btrfs_log_prealloc_extents
	i2c: mux: Avoid potential false error message in i2c_mux_add_adapter
	overlayfs: set ctime when setting mtime and atime
	gpio: timberdale: Fix potential deadlock on &tgpio->lock
	ata: libata-eh: Fix compilation warning in ata_eh_link_report()
	tracing: relax trace_event_eval_update() execution with cond_resched()
	HID: holtek: fix slab-out-of-bounds Write in holtek_kbd_input_event
	Bluetooth: Avoid redundant authentication
	Bluetooth: hci_core: Fix build warnings
	wifi: mac80211: allow transmitting EAPOL frames with tainted key
	wifi: cfg80211: avoid leaking stack data into trace
	sky2: Make sure there is at least one frag_addr available
	mmc: core: Capture correct oemid-bits for eMMC cards
	Revert "pinctrl: avoid unsafe code pattern in find_pinctrl()"
	ACPI: irq: Fix incorrect return value in acpi_register_gsi()
	USB: serial: option: add Telit LE910C4-WWX 0x1035 composition
	USB: serial: option: add entry for Sierra EM9191 with new firmware
	USB: serial: option: add Fibocom to DELL custom modem FM101R-GL
	perf: Disallow mis-matched inherited group reads
	s390/pci: fix iommu bitmap allocation
	gpio: vf610: set value before the direction to avoid a glitch
	Bluetooth: hci_sock: fix slab oob read in create_monitor_event
	Bluetooth: hci_sock: Correctly bounds check and pad HCI_MON_NEW_INDEX name
	Bluetooth: hci_event: Fix using memcmp when comparing keys
	Linux 4.14.328

Change-Id: I0ad6691640e3f75a6016e2004f005414a50dc7b9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-10-25 11:33:41 +00:00
Luiz Augusto von Dentz
d9ce7d4383 Bluetooth: hci_core: Fix build warnings
[ Upstream commit dcda165706b9fbfd685898d46a6749d7d397e0c0 ]

This fixes the following warnings:

net/bluetooth/hci_core.c: In function ‘hci_register_dev’:
net/bluetooth/hci_core.c:2620:54: warning: ‘%d’ directive output may
be truncated writing between 1 and 10 bytes into a region of size 5
[-Wformat-truncation=]
 2620 |         snprintf(hdev->name, sizeof(hdev->name), "hci%d", id);
      |                                                      ^~
net/bluetooth/hci_core.c:2620:50: note: directive argument in the range
[0, 2147483647]
 2620 |         snprintf(hdev->name, sizeof(hdev->name), "hci%d", id);
      |                                                  ^~~~~~~
net/bluetooth/hci_core.c:2620:9: note: ‘snprintf’ output between 5 and
14 bytes into a destination of size 8
 2620 |         snprintf(hdev->name, sizeof(hdev->name), "hci%d", id);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-25 11:13:32 +02:00
Eric Dumazet
849412179d xfrm: fix a data-race in xfrm_gen_index()
commit 3e4bc23926b83c3c67e5f61ae8571602754131a6 upstream.

xfrm_gen_index() mutual exclusion uses net->xfrm.xfrm_policy_lock.

This means we must use a per-netns idx_generator variable,
instead of a static one.
Alternative would be to use an atomic variable.

syzbot reported:

BUG: KCSAN: data-race in xfrm_sk_policy_insert / xfrm_sk_policy_insert

write to 0xffffffff87005938 of 4 bytes by task 29466 on cpu 0:
xfrm_gen_index net/xfrm/xfrm_policy.c:1385 [inline]
xfrm_sk_policy_insert+0x262/0x640 net/xfrm/xfrm_policy.c:2347
xfrm_user_policy+0x413/0x540 net/xfrm/xfrm_state.c:2639
do_ipv6_setsockopt+0x1317/0x2ce0 net/ipv6/ipv6_sockglue.c:943
ipv6_setsockopt+0x57/0x130 net/ipv6/ipv6_sockglue.c:1012
rawv6_setsockopt+0x21e/0x410 net/ipv6/raw.c:1054
sock_common_setsockopt+0x61/0x70 net/core/sock.c:3697
__sys_setsockopt+0x1c9/0x230 net/socket.c:2263
__do_sys_setsockopt net/socket.c:2274 [inline]
__se_sys_setsockopt net/socket.c:2271 [inline]
__x64_sys_setsockopt+0x66/0x80 net/socket.c:2271
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffffffff87005938 of 4 bytes by task 29460 on cpu 1:
xfrm_sk_policy_insert+0x13e/0x640
xfrm_user_policy+0x413/0x540 net/xfrm/xfrm_state.c:2639
do_ipv6_setsockopt+0x1317/0x2ce0 net/ipv6/ipv6_sockglue.c:943
ipv6_setsockopt+0x57/0x130 net/ipv6/ipv6_sockglue.c:1012
rawv6_setsockopt+0x21e/0x410 net/ipv6/raw.c:1054
sock_common_setsockopt+0x61/0x70 net/core/sock.c:3697
__sys_setsockopt+0x1c9/0x230 net/socket.c:2263
__do_sys_setsockopt net/socket.c:2274 [inline]
__se_sys_setsockopt net/socket.c:2271 [inline]
__x64_sys_setsockopt+0x66/0x80 net/socket.c:2271
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00006ad8 -> 0x00006b18

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 29460 Comm: syz-executor.1 Not tainted 6.5.0-rc5-syzkaller-00243-g9106536c1aa3 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023

Fixes: 1121994c80 ("netns xfrm: policy insertion in netns")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25 11:13:31 +02:00