Commit Graph

1811 Commits

Author SHA1 Message Date
Christoph Hellwig
975824c4e3 BACKPORT: mm: remove the pgprot argument to __vmalloc
The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.

Change-Id: Iae5854c7005dec82942db58215d615a10bde1f31
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com> [hyperv]
Acked-by: Gao Xiang <xiang@kernel.org> [erofs]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-09-08 17:27:09 +03:00
Christoph Hellwig
c5095805e9 BACKPORT: sysctl: pass kernel pointers to ->proc_handler
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from  userspace in common code.  This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.

As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.

Change-Id: Ic71fd778e4cea58adc51d634d9e53c1f9f90cdf2
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-09-08 17:26:43 +03:00
Greg Kroah-Hartman
4f94b88d7d Merge 4.19.324 into android-4.19-stable
Changes in 4.19.324
	arm64: dts: rockchip: Fix rt5651 compatible value on rk3399-sapphire-excavator
	ARM: dts: rockchip: fix rk3036 acodec node
	ARM: dts: rockchip: drop grf reference from rk3036 hdmi
	ARM: dts: rockchip: Fix the realtek audio codec on rk3036-kylin
	HID: core: zero-initialize the report buffer
	security/keys: fix slab-out-of-bounds in key_task_permission
	sctp: properly validate chunk size in sctp_sf_ootb()
	can: c_can: fix {rx,tx}_errors statistics
	net: hns3: fix kernel crash when uninstalling driver
	media: stb0899_algo: initialize cfr before using it
	media: dvbdev: prevent the risk of out of memory access
	media: dvb_frontend: don't play tricks with underflow values
	media: adv7604: prevent underflow condition when reporting colorspace
	ALSA: firewire-lib: fix return value on fail in amdtp_tscm_init()
	media: s5p-jpeg: prevent buffer overflows
	media: cx24116: prevent overflows on SNR calculus
	media: v4l2-tpg: prevent the risk of a division by zero
	drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read()
	drm/amdgpu: prevent NULL pointer dereference if ATIF is not supported
	dm cache: correct the number of origin blocks to match the target length
	dm cache: fix out-of-bounds access to the dirty bitset when resizing
	dm cache: optimize dirty bit checking with find_next_bit when resizing
	dm cache: fix potential out-of-bounds access on the first resume
	dm-unstriped: cast an operand to sector_t to prevent potential uint32_t overflow
	nfs: Fix KMSAN warning in decode_getfattr_attrs()
	btrfs: reinitialize delayed ref list after deleting it from the list
	bonding (gcc13): synchronize bond_{a,t}lb_xmit() types
	net: bridge: xmit: make sure we have at least eth header len bytes
	media: uvcvideo: Skip parsing frames of type UVC_VS_UNDEFINED in uvc_parse_format
	fs/proc: fix compile warning about variable 'vmcore_mmap_ops'
	usb: musb: sunxi: Fix accessing an released usb phy
	USB: serial: io_edgeport: fix use after free in debug printk
	USB: serial: qcserial: add support for Sierra Wireless EM86xx
	USB: serial: option: add Fibocom FG132 0x0112 composition
	USB: serial: option: add Quectel RG650V
	irqchip/gic-v3: Force propagation of the active state with a read-back
	ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove()
	ALSA: pcm: Return 0 when size < start_threshold in capture
	ALSA: usb-audio: Add custom mixer status quirks for RME CC devices
	ALSA: usb-audio: Support jack detection on Dell dock
	ALSA: usb-audio: Add quirks for Dell WD19 dock
	hv_sock: Initializing vsk->trans to NULL to prevent a dangling pointer
	vsock/virtio: Initialization of the dangling pointer occurring in vsk->trans
	ALSA: usb-audio: Add endianness annotations
	9p: Avoid creating multiple slab caches with the same name
	HID: multitouch: Add quirk for HONOR MagicBook Art 14 touchpad
	bpf: use kvzmalloc to allocate BPF verifier environment
	sound: Make CONFIG_SND depend on INDIRECT_IOMEM instead of UML
	powerpc/powernv: Free name on error in opal_event_init()
	fs: Fix uninitialized value issue in from_kuid and from_kgid
	net: usb: qmi_wwan: add Fibocom FG132 0x0112 composition
	9p: fix slab cache name creation for real
	Linux 4.19.324

Change-Id: Ib8e7c89304d2c2cc72aea03446ea40a8704b41ec
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-11-17 20:02:03 +00:00
Nikolay Aleksandrov
3e01fc3c66 net: bridge: xmit: make sure we have at least eth header len bytes
commit 8bd67ebb50c0145fd2ca8681ab65eb7e8cde1afc upstream.

syzbot triggered an uninit value[1] error in bridge device's xmit path
by sending a short (less than ETH_HLEN bytes) skb. To fix it check if
we can actually pull that amount instead of assuming.

Tested with dropwatch:
 drop at: br_dev_xmit+0xb93/0x12d0 [bridge] (0xffffffffc06739b3)
 origin: software
 timestamp: Mon May 13 11:31:53 2024 778214037 nsec
 protocol: 0x88a8
 length: 2
 original length: 2
 drop reason: PKT_TOO_SMALL

[1]
BUG: KMSAN: uninit-value in br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
 br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
 __netdev_start_xmit include/linux/netdevice.h:4903 [inline]
 netdev_start_xmit include/linux/netdevice.h:4917 [inline]
 xmit_one net/core/dev.c:3531 [inline]
 dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
 __dev_queue_xmit+0x34db/0x5350 net/core/dev.c:4341
 dev_queue_xmit include/linux/netdevice.h:3091 [inline]
 __bpf_tx_skb net/core/filter.c:2136 [inline]
 __bpf_redirect_common net/core/filter.c:2180 [inline]
 __bpf_redirect+0x14a6/0x1620 net/core/filter.c:2187
 ____bpf_clone_redirect net/core/filter.c:2460 [inline]
 bpf_clone_redirect+0x328/0x470 net/core/filter.c:2432
 ___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
 __bpf_prog_run512+0xb5/0xe0 kernel/bpf/core.c:2238
 bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
 __bpf_prog_run include/linux/filter.h:657 [inline]
 bpf_prog_run include/linux/filter.h:664 [inline]
 bpf_test_run+0x499/0xc30 net/bpf/test_run.c:425
 bpf_prog_test_run_skb+0x14ea/0x1f20 net/bpf/test_run.c:1058
 bpf_prog_test_run+0x6b7/0xad0 kernel/bpf/syscall.c:4269
 __sys_bpf+0x6aa/0xd90 kernel/bpf/syscall.c:5678
 __do_sys_bpf kernel/bpf/syscall.c:5767 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5765 [inline]
 __x64_sys_bpf+0xa0/0xe0 kernel/bpf/syscall.c:5765
 x64_sys_call+0x96b/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:322
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+a63a1f6a062033cf0f40@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a63a1f6a062033cf0f40
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Randy MacLeod <Randy.MacLeod@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-17 14:58:07 +01:00
Greg Kroah-Hartman
2d76dea417 Merge 4.19.323 into android-4.19-stable
Changes in 4.19.323
	staging: iio: frequency: ad9833: Get frequency value statically
	staging: iio: frequency: ad9833: Load clock using clock framework
	staging: iio: frequency: ad9834: Validate frequency parameter value
	usbnet: ipheth: fix carrier detection in modes 1 and 4
	net: ethernet: use ip_hdrlen() instead of bit shift
	net: phy: vitesse: repair vsc73xx autonegotiation
	scripts: kconfig: merge_config: config files: add a trailing newline
	arm64: dts: rockchip: override BIOS_DISABLE signal via GPIO hog on RK3399 Puma
	net/mlx5: Update the list of the PCI supported devices
	net: ftgmac100: Enable TX interrupt to avoid TX timeout
	net: dpaa: Pad packets to ETH_ZLEN
	soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps"
	selftests/vm: remove call to ksft_set_plan()
	selftests/kcmp: remove call to ksft_set_plan()
	ASoC: allow module autoloading for table db1200_pids
	pinctrl: at91: make it work with current gpiolib
	microblaze: don't treat zero reserved memory regions as error
	net: ftgmac100: Ensure tx descriptor updates are visible
	wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room()
	wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead
	ASoC: tda7419: fix module autoloading
	spi: bcm63xx: Enable module autoloading
	x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
	ocfs2: add bounds checking to ocfs2_xattr_find_entry()
	ocfs2: strict bound check before memcmp in ocfs2_xattr_find_entry()
	gpio: prevent potential speculation leaks in gpio_device_get_desc()
	USB: serial: pl2303: add device id for Macrosilicon MS3020
	ACPI: PMIC: Remove unneeded check in tps68470_pmic_opregion_probe()
	wifi: ath9k: fix parameter check in ath9k_init_debug()
	wifi: ath9k: Remove error checks when creating debugfs entries
	netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
	wifi: cfg80211: fix UBSAN noise in cfg80211_wext_siwscan()
	wifi: cfg80211: fix two more possible UBSAN-detected off-by-one errors
	wifi: mac80211: use two-phase skb reclamation in ieee80211_do_stop()
	can: bcm: Clear bo->bcm_proc_read after remove_proc_entry().
	Bluetooth: btusb: Fix not handling ZPL/short-transfer
	block, bfq: fix possible UAF for bfqq->bic with merge chain
	block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()
	block, bfq: don't break merge chain in bfq_split_bfqq()
	spi: ppc4xx: handle irq_of_parse_and_map() errors
	spi: ppc4xx: Avoid returning 0 when failed to parse and map IRQ
	ARM: versatile: fix OF node leak in CPUs prepare
	reset: berlin: fix OF node leak in probe() error path
	clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
	hwmon: (max16065) Fix overflows seen when writing limits
	mtd: slram: insert break after errors in parsing the map
	hwmon: (ntc_thermistor) fix module autoloading
	power: supply: max17042_battery: Fix SOC threshold calc w/ no current sense
	fbdev: hpfb: Fix an error handling path in hpfb_dio_probe()
	drm/stm: Fix an error handling path in stm_drm_platform_probe()
	drm/amd: fix typo
	drm/amdgpu: Replace one-element array with flexible-array member
	drm/amdgpu: properly handle vbios fake edid sizing
	drm/radeon: Replace one-element array with flexible-array member
	drm/radeon: properly handle vbios fake edid sizing
	drm/rockchip: vop: Allow 4096px width scaling
	drm/radeon/evergreen_cs: fix int overflow errors in cs track offsets
	jfs: fix out-of-bounds in dbNextAG() and diAlloc()
	drm/msm/a5xx: properly clear preemption records on resume
	drm/msm/a5xx: fix races in preemption evaluation stage
	ipmi: docs: don't advertise deprecated sysfs entries
	drm/msm: fix %s null argument error
	xen: use correct end address of kernel for conflict checking
	xen/swiotlb: simplify range_straddles_page_boundary()
	xen/swiotlb: add alignment check for dma buffers
	selftests/bpf: Fix error compiling test_lru_map.c
	xz: cleanup CRC32 edits from 2018
	kthread: add kthread_work tracepoints
	kthread: fix task state in kthread worker if being frozen
	jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers()
	ext4: clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT even mount with discard
	smackfs: Use rcu_assign_pointer() to ensure safe assignment in smk_set_cipso
	ext4: avoid negative min_clusters in find_group_orlov()
	ext4: return error on ext4_find_inline_entry
	ext4: avoid OOB when system.data xattr changes underneath the filesystem
	nilfs2: fix potential null-ptr-deref in nilfs_btree_insert()
	nilfs2: determine empty node blocks as corrupted
	nilfs2: fix potential oob read in nilfs_btree_check_delete()
	perf sched timehist: Fix missing free of session in perf_sched__timehist()
	perf sched timehist: Fixed timestamp error when unable to confirm event sched_in time
	perf time-utils: Fix 32-bit nsec parsing
	clk: rockchip: Set parent rate for DCLK_VOP clock on RK3228
	drivers: media: dvb-frontends/rtl2832: fix an out-of-bounds write error
	drivers: media: dvb-frontends/rtl2830: fix an out-of-bounds write error
	PCI: xilinx-nwl: Fix register misspelling
	RDMA/iwcm: Fix WARNING:at_kernel/workqueue.c:#check_flush_dependency
	pinctrl: single: fix missing error code in pcs_probe()
	clk: ti: dra7-atl: Fix leak of of_nodes
	pinctrl: mvebu: Fix devinit_dove_pinctrl_probe function
	RDMA/cxgb4: Added NULL check for lookup_atid
	ntb: intel: Fix the NULL vs IS_ERR() bug for debugfs_create_dir()
	nfsd: call cache_put if xdr_reserve_space returns NULL
	f2fs: enhance to update i_mode and acl atomically in f2fs_setattr()
	f2fs: fix typo
	f2fs: fix to update i_ctime in __f2fs_setxattr()
	f2fs: remove unneeded check condition in __f2fs_setxattr()
	f2fs: reduce expensive checkpoint trigger frequency
	coresight: tmc: sg: Do not leak sg_table
	netfilter: nf_reject_ipv6: fix nf_reject_ip6_tcphdr_put()
	net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race Condition
	tcp: introduce tcp_skb_timestamp_us() helper
	tcp: check skb is non-NULL in tcp_rto_delta_us()
	net: qrtr: Update packets cloning when broadcasting
	netfilter: ctnetlink: compile ctnetlink_label_size with CONFIG_NF_CONNTRACK_EVENTS
	crypto: aead,cipher - zeroize key buffer after use
	Remove *.orig pattern from .gitignore
	soc: versatile: integrator: fix OF node leak in probe() error path
	USB: appledisplay: close race between probe and completion handler
	USB: misc: cypress_cy7c63: check for short transfer
	firmware_loader: Block path traversal
	tty: rp2: Fix reset with non forgiving PCIe host bridges
	drbd: Fix atomicity violation in drbd_uuid_set_bm()
	drbd: Add NULL check for net_conf to prevent dereference in state validation
	ACPI: sysfs: validate return type of _STR method
	f2fs: prevent possible int overflow in dir_block_index()
	f2fs: avoid potential int overflow in sanity_check_area_boundary()
	vfs: fix race between evice_inodes() and find_inode()&iput()
	fs: Fix file_set_fowner LSM hook inconsistencies
	nfs: fix memory leak in error path of nfs4_do_reclaim
	PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
	PCI: xilinx-nwl: Fix off-by-one in INTx IRQ handler
	soc: versatile: realview: fix memory leak during device remove
	soc: versatile: realview: fix soc_dev leak during device remove
	usb: yurex: Replace snprintf() with the safer scnprintf() variant
	USB: misc: yurex: fix race between read and write
	pps: remove usage of the deprecated ida_simple_xx() API
	pps: add an error check in parport_attach
	i2c: aspeed: Update the stop sw state when the bus recovery occurs
	i2c: isch: Add missed 'else'
	usb: yurex: Fix inconsistent locking bug in yurex_read()
	mailbox: rockchip: fix a typo in module autoloading
	mailbox: bcm2835: Fix timeout during suspend mode
	ceph: remove the incorrect Fw reference check when dirtying pages
	netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
	netfilter: nf_tables: prevent nf_skb_duplicated corruption
	r8152: Factor out OOB link list waits
	net: ethernet: lantiq_etop: fix memory disclosure
	net: avoid potential underflow in qdisc_pkt_len_init() with UFO
	net: add more sanity checks to qdisc_pkt_len_init()
	ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
	sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start
	ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
	ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
	f2fs: Require FMODE_WRITE for atomic write ioctls
	wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats()
	wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit
	net: hisilicon: hip04: fix OF node leak in probe()
	net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
	net: hisilicon: hns_mdio: fix OF node leak in probe()
	ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
	ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
	ACPI: EC: Do not release locks during operation region accesses
	ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
	tipc: guard against string buffer overrun
	net: mvpp2: Increase size of queue_name buffer
	ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
	ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
	tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
	ACPICA: iasl: handle empty connection_node
	wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext()
	signal: Replace BUG_ON()s
	ALSA: asihpi: Fix potential OOB array access
	ALSA: hdsp: Break infinite MIDI input flush loop
	fbdev: pxafb: Fix possible use after free in pxafb_task()
	power: reset: brcmstb: Do not go into infinite loop if reset fails
	ata: sata_sil: Rename sil_blacklist to sil_quirks
	jfs: UBSAN: shift-out-of-bounds in dbFindBits
	jfs: Fix uaf in dbFreeBits
	jfs: check if leafidx greater than num leaves per dmap tree
	jfs: Fix uninit-value access of new_ea in ea_buffer
	drm/amd/display: Check stream before comparing them
	drm/amd/display: Fix index out of bounds in degamma hardware format translation
	drm/printer: Allow NULL data in devcoredump printer
	scsi: aacraid: Rearrange order of struct aac_srb_unit
	drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
	of/irq: Refer to actual buffer size in of_irq_parse_one()
	ext4: ext4_search_dir should return a proper error
	ext4: fix i_data_sem unlock order in ext4_ind_migrate()
	spi: s3c64xx: fix timeout counters in flush_fifo
	selftests: breakpoints: use remaining time to check if suspend succeed
	selftests: vDSO: fix vDSO symbols lookup for powerpc64
	i2c: xiic: Wait for TX empty to avoid missed TX NAKs
	spi: bcm63xx: Fix module autoloading
	perf/core: Fix small negative period being ignored
	parisc: Fix itlb miss handler for 64-bit programs
	ALSA: core: add isascii() check to card ID generator
	ext4: no need to continue when the number of entries is 1
	ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
	ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
	ext4: aovid use-after-free in ext4_ext_insert_extent()
	ext4: fix double brelse() the buffer of the extents path
	ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
	parisc: Fix 64-bit userspace syscall path
	of/irq: Support #msi-cells=<0> in of_msi_get_domain
	jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
	ocfs2: fix the la space leak when unmounting an ocfs2 volume
	ocfs2: fix uninit-value in ocfs2_get_block()
	ocfs2: reserve space for inline xattr before attaching reflink tree
	ocfs2: cancel dqi_sync_work before freeing oinfo
	ocfs2: remove unreasonable unlock in ocfs2_read_blocks
	ocfs2: fix null-ptr-deref when journal load failed.
	ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
	riscv: define ILLEGAL_POINTER_VALUE for 64bit
	aoe: fix the potential use-after-free problem in more places
	clk: rockchip: fix error for unknown clocks
	media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
	media: venus: fix use after free bug in venus_remove due to race condition
	iio: magnetometer: ak8975: Fix reading for ak099xx sensors
	tomoyo: fallback to realpath if symlink's pathname does not exist
	Input: adp5589-keys - fix adp5589_gpio_get_value()
	btrfs: wait for fixup workers before stopping cleaner kthread during umount
	gpio: davinci: fix lazy disable
	ext4: avoid ext4_error()'s caused by ENOMEM in the truncate path
	ext4: fix slab-use-after-free in ext4_split_extent_at()
	ext4: update orig_path in ext4_find_extent()
	arm64: Add Cortex-715 CPU part definition
	arm64: cputype: Add Neoverse-N3 definitions
	arm64: errata: Expand speculative SSBS workaround once more
	uprobes: fix kernel info leak via "[uprobes]" vma
	nfsd: use ktime_get_seconds() for timestamps
	nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
	rtc: at91sam9: drop platform_data support
	rtc: at91sam9: fix OF node leak in probe() error path
	ACPI: battery: Simplify battery hook locking
	ACPI: battery: Fix possible crash when unregistering a battery hook
	ext4: fix inode tree inconsistency caused by ENOMEM
	net: ethernet: cortina: Drop TSO support
	tracing: Remove precision vsnprintf() check from print event
	drm: Move drm_mode_setcrtc() local re-init to failure path
	drm/crtc: fix uninitialized variable use even harder
	virtio_console: fix misc probe bugs
	Input: synaptics-rmi4 - fix UAF of IRQ domain on driver removal
	bpf: Check percpu map value size first
	s390/facility: Disable compile time optimization for decompressor code
	s390/mm: Add cond_resched() to cmm_alloc/free_pages()
	ext4: nested locking for xattr inode
	s390/cpum_sf: Remove WARN_ON_ONCE statements
	ktest.pl: Avoid false positives with grub2 skip regex
	clk: bcm: bcm53573: fix OF node leak in init
	i2c: i801: Use a different adapter-name for IDF adapters
	PCI: Mark Creative Labs EMU20k2 INTx masking as broken
	media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put()
	usb: chipidea: udc: enable suspend interrupt after usb reset
	tools/iio: Add memory allocation failure check for trigger_name
	driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute
	fbdev: sisfb: Fix strbuf array overflow
	NFS: Remove print_overflow_msg()
	SUNRPC: Fix integer overflow in decode_rc_list()
	tcp: fix tcp_enter_recovery() to zero retrans_stamp when it's safe
	netfilter: br_netfilter: fix panic with metadata_dst skb
	Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
	gpio: aspeed: Add the flush write to ensure the write complete.
	clk: Add (devm_)clk_get_optional() functions
	clk: generalize devm_clk_get() a bit
	clk: Provide new devm_clk helpers for prepared and enabled clocks
	gpio: aspeed: Use devm_clk api to manage clock source
	igb: Do not bring the device up after non-fatal error
	net: ibm: emac: mal: fix wrong goto
	ppp: fix ppp_async_encode() illegal access
	net: ipv6: ensure we call ipv6_mc_down() at most once
	CDC-NCM: avoid overflow in sanity checking
	HID: plantronics: Workaround for an unexcepted opposite volume key
	Revert "usb: yurex: Replace snprintf() with the safer scnprintf() variant"
	usb: xhci: Fix problem with xhci resume from suspend
	usb: storage: ignore bogus device raised by JieLi BR21 USB sound chip
	net: Fix an unsafe loop on the list
	posix-clock: Fix missing timespec64 check in pc_clock_settime()
	arm64: probes: Remove broken LDR (literal) uprobe support
	arm64: probes: Fix simulate_ldr*_literal()
	PCI: Add function 0 DMA alias quirk for Glenfly Arise chip
	fat: fix uninitialized variable
	KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()
	net: dsa: mv88e6xxx: Fix out-of-bound access
	s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
	KVM: s390: Change virtual to physical address access in diag 0x258 handler
	x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
	drm/vmwgfx: Handle surface check failure correctly
	iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
	iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
	iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
	iio: light: opt3001: add missing full-scale range value
	Bluetooth: Remove debugfs directory on module init failure
	Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
	xhci: Fix incorrect stream context type macro
	USB: serial: option: add support for Quectel EG916Q-GL
	USB: serial: option: add Telit FN920C04 MBIM compositions
	parport: Proper fix for array out-of-bounds access
	x86/apic: Always explicitly disarm TSC-deadline timer
	nilfs2: propagate directory read errors from nilfs_find_entry()
	clk: Fix pointer casting to prevent oops in devm_clk_release()
	clk: Fix slab-out-of-bounds error in devm_clk_release()
	RDMA/bnxt_re: Fix incorrect AVID type in WQE structure
	RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP
	RDMA/bnxt_re: Return more meaningful error
	drm/msm/dsi: fix 32-bit signed integer extension in pclk_rate calculation
	macsec: don't increment counters for an unrelated SA
	net: ethernet: aeroflex: fix potential memory leak in greth_start_xmit_gbit()
	net: systemport: fix potential memory leak in bcm_sysport_xmit()
	usb: typec: altmode should keep reference to parent
	Bluetooth: bnep: fix wild-memory-access in proto_unregister
	arm64:uprobe fix the uprobe SWBP_INSN in big-endian
	arm64: probes: Fix uprobes for big-endian kernels
	KVM: s390: gaccess: Refactor gpa and length calculation
	KVM: s390: gaccess: Refactor access address range check
	KVM: s390: gaccess: Cleanup access to guest pages
	KVM: s390: gaccess: Check if guest address is in memslot
	udf: fix uninit-value use in udf_get_fileshortad
	jfs: Fix sanity check in dbMount
	net/sun3_82586: fix potential memory leak in sun3_82586_send_packet()
	be2net: fix potential memory leak in be_xmit()
	net: usb: usbnet: fix name regression
	posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime()
	ALSA: hda/realtek: Update default depop procedure
	drm/amd: Guard against bad data for ATIF ACPI method
	ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue
	nilfs2: fix kernel bug due to missing clearing of buffer delay flag
	hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event
	selinux: improve error checking in sel_write_load()
	arm64/uprobes: change the uprobe_opcode_t typedef to fix the sparse warning
	xfrm: validate new SA's prefixlen using SA family when sel.family is unset
	usb: dwc3: remove generic PHY calibrate() calls
	usb: dwc3: Add splitdisable quirk for Hisilicon Kirin Soc
	usb: dwc3: core: Stop processing of pending events if controller is halted
	cgroup: Fix potential overflow issue when checking max_depth
	wifi: mac80211: skip non-uploaded keys in ieee80211_iter_keys
	gtp: simplify error handling code in 'gtp_encap_enable()'
	gtp: allow -1 to be specified as file description from userspace
	net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT
	bpf: Fix out-of-bounds write in trie_get_next_key()
	net: support ip generic csum processing in skb_csum_hwoffload_help
	net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension
	netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
	firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
	net: amd: mvme147: Fix probe banner message
	misc: sgi-gru: Don't disable preemption in GRU driver
	usbip: tools: Fix detach_port() invalid port error path
	usb: phy: Fix API devm_usb_put_phy() can not release the phy
	xhci: Fix Link TRB DMA in command ring stopped completion event
	Revert "driver core: Fix uevent_show() vs driver detach race"
	wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower
	wifi: ath10k: Fix memory leak in management tx
	wifi: iwlegacy: Clear stale interrupts before resuming device
	nilfs2: fix potential deadlock with newly created symlinks
	ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow
	nilfs2: fix kernel bug due to missing clearing of checked flag
	mm: shmem: fix data-race in shmem_getattr()
	vt: prevent kernel-infoleak in con_font_get()
	Linux 4.19.323

Change-Id: I2348f834187153067ab46b3b48b8fe7da9cee1f1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-11-09 11:24:17 +00:00
Andy Roulin
f07131239a netfilter: br_netfilter: fix panic with metadata_dst skb
[ Upstream commit f9ff7665cd128012868098bbd07e28993e314fdb ]

Fix a kernel panic in the br_netfilter module when sending untagged
traffic via a VxLAN device.
This happens during the check for fragmentation in br_nf_dev_queue_xmit.

It is dependent on:
1) the br_netfilter module being loaded;
2) net.bridge.bridge-nf-call-iptables set to 1;
3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port;
4) untagged frames with size higher than the VxLAN MTU forwarded/flooded

When forwarding the untagged packet to the VxLAN bridge port, before
the netfilter hooks are called, br_handle_egress_vlan_tunnel is called and
changes the skb_dst to the tunnel dst. The tunnel_dst is a metadata type
of dst, i.e., skb_valid_dst(skb) is false, and metadata->dst.dev is NULL.

Then in the br_netfilter hooks, in br_nf_dev_queue_xmit, there's a check
for frames that needs to be fragmented: frames with higher MTU than the
VxLAN device end up calling br_nf_ip_fragment, which in turns call
ip_skb_dst_mtu.

The ip_dst_mtu tries to use the skb_dst(skb) as if it was a valid dst
with valid dst->dev, thus the crash.

This case was never supported in the first place, so drop the packet
instead.

PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data.
[  176.291791] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000110
[  176.292101] Mem abort info:
[  176.292184]   ESR = 0x0000000096000004
[  176.292322]   EC = 0x25: DABT (current EL), IL = 32 bits
[  176.292530]   SET = 0, FnV = 0
[  176.292709]   EA = 0, S1PTW = 0
[  176.292862]   FSC = 0x04: level 0 translation fault
[  176.293013] Data abort info:
[  176.293104]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  176.293488]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  176.293787]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000
[  176.294166] [0000000000000110] pgd=0000000000000000,
p4d=0000000000000000
[  176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth
br_netfilter bridge stp llc ipv6 crct10dif_ce
[  176.295923] CPU: 0 PID: 188 Comm: ping Not tainted
6.8.0-rc3-g5b3fbd61b9d1 #2
[  176.296314] Hardware name: linux,dummy-virt (DT)
[  176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[  176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
[  176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter]
[  176.297636] sp : ffff800080003630
[  176.297743] x29: ffff800080003630 x28: 0000000000000008 x27:
ffff6828c49ad9f8
[  176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24:
00000000000003e8
[  176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21:
ffff6828c3b16d28
[  176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18:
0000000000000014
[  176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15:
0000000095744632
[  176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12:
ffffb7e137926a70
[  176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 :
0000000000000000
[  176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 :
f20e0100bebafeca
[  176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 :
0000000000000000
[  176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 :
ffff6828c7f918f0
[  176.300889] Call trace:
[  176.301123]  br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
[  176.301411]  br_nf_post_routing+0x2a8/0x3e4 [br_netfilter]
[  176.301703]  nf_hook_slow+0x48/0x124
[  176.302060]  br_forward_finish+0xc8/0xe8 [bridge]
[  176.302371]  br_nf_hook_thresh+0x124/0x134 [br_netfilter]
[  176.302605]  br_nf_forward_finish+0x118/0x22c [br_netfilter]
[  176.302824]  br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter]
[  176.303136]  br_nf_forward+0x2b8/0x4e0 [br_netfilter]
[  176.303359]  nf_hook_slow+0x48/0x124
[  176.303803]  __br_forward+0xc4/0x194 [bridge]
[  176.304013]  br_flood+0xd4/0x168 [bridge]
[  176.304300]  br_handle_frame_finish+0x1d4/0x5c4 [bridge]
[  176.304536]  br_nf_hook_thresh+0x124/0x134 [br_netfilter]
[  176.304978]  br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter]
[  176.305188]  br_nf_pre_routing+0x250/0x524 [br_netfilter]
[  176.305428]  br_handle_frame+0x244/0x3cc [bridge]
[  176.305695]  __netif_receive_skb_core.constprop.0+0x33c/0xecc
[  176.306080]  __netif_receive_skb_one_core+0x40/0x8c
[  176.306197]  __netif_receive_skb+0x18/0x64
[  176.306369]  process_backlog+0x80/0x124
[  176.306540]  __napi_poll+0x38/0x17c
[  176.306636]  net_rx_action+0x124/0x26c
[  176.306758]  __do_softirq+0x100/0x26c
[  176.307051]  ____do_softirq+0x10/0x1c
[  176.307162]  call_on_irq_stack+0x24/0x4c
[  176.307289]  do_softirq_own_stack+0x1c/0x2c
[  176.307396]  do_softirq+0x54/0x6c
[  176.307485]  __local_bh_enable_ip+0x8c/0x98
[  176.307637]  __dev_queue_xmit+0x22c/0xd28
[  176.307775]  neigh_resolve_output+0xf4/0x1a0
[  176.308018]  ip_finish_output2+0x1c8/0x628
[  176.308137]  ip_do_fragment+0x5b4/0x658
[  176.308279]  ip_fragment.constprop.0+0x48/0xec
[  176.308420]  __ip_finish_output+0xa4/0x254
[  176.308593]  ip_finish_output+0x34/0x130
[  176.308814]  ip_output+0x6c/0x108
[  176.308929]  ip_send_skb+0x50/0xf0
[  176.309095]  ip_push_pending_frames+0x30/0x54
[  176.309254]  raw_sendmsg+0x758/0xaec
[  176.309568]  inet_sendmsg+0x44/0x70
[  176.309667]  __sys_sendto+0x110/0x178
[  176.309758]  __arm64_sys_sendto+0x28/0x38
[  176.309918]  invoke_syscall+0x48/0x110
[  176.310211]  el0_svc_common.constprop.0+0x40/0xe0
[  176.310353]  do_el0_svc+0x1c/0x28
[  176.310434]  el0_svc+0x34/0xb4
[  176.310551]  el0t_64_sync_handler+0x120/0x12c
[  176.310690]  el0t_64_sync+0x190/0x194
[  176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860)
[  176.315743] ---[ end trace 0000000000000000 ]---
[  176.316060] Kernel panic - not syncing: Oops: Fatal exception in
interrupt
[  176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000
[  176.316564] PHYS_OFFSET: 0xffff97d780000000
[  176.316782] CPU features: 0x0,88000203,3c020000,0100421b
[  176.317210] Memory Limit: none
[  176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal
Exception in interrupt ]---\

Fixes: 11538d039a ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20241001154400.22787-2-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-08 16:19:17 +01:00
Greg Kroah-Hartman
1b3964c5e0 Merge 4.19.322 into android-4.19-stable
Changes in 4.19.322
	net: usb: qmi_wwan: add MeiG Smart SRM825L
	usb: dwc3: st: Add of_node_put() before return in probe function
	usb: dwc3: st: add missing depopulate in probe error path
	drm/amdgpu: Fix uninitialized variable warning in amdgpu_afmt_acr
	drm/amdgpu: fix overflowed array index read warning
	drm/amdgpu: fix ucode out-of-bounds read warning
	drm/amdgpu: fix mc_data out-of-bounds read warning
	drm/amdkfd: Reconcile the definition and use of oem_id in struct kfd_topology_device
	apparmor: fix possible NULL pointer dereference
	usbip: Don't submit special requests twice
	smack: tcp: ipv4, fix incorrect labeling
	media: uvcvideo: Enforce alignment of frame and interval
	block: initialize integrity buffer to zero before writing it to media
	virtio_net: Fix napi_skb_cache_put warning
	udf: Limit file size to 4TB
	ALSA: usb-audio: Sanity checks for each pipe and EP types
	ALSA: usb-audio: Fix gpf in snd_usb_pipe_sanity_check
	sch/netem: fix use after free in netem_dequeue
	ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices
	ata: libata: Fix memory leak for error path in ata_host_alloc()
	mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K
	fuse: use unsigned type for getxattr/listxattr size truncation
	clk: qcom: clk-alpha-pll: Fix the pll post div mask
	nilfs2: fix missing cleanup on rollforward recovery error
	nilfs2: fix state management in error path of log writing function
	ALSA: hda: Add input value sanity checks to HDMI channel map controls
	smack: unix sockets: fix accept()ed socket label
	irqchip/armada-370-xp: Do not allow mapping IRQ 0 and 1
	af_unix: Remove put_pid()/put_cred() in copy_peercred().
	netfilter: nf_conncount: fix wrong variable type
	udf: Avoid excessive partition lengths
	wifi: brcmsmac: advertise MFP_CAPABLE to enable WPA3
	media: qcom: camss: Add check for v4l2_fwnode_endpoint_parse
	pcmcia: Use resource_size function on resource object
	can: bcm: Remove proc entry when dev is unregistered.
	igb: Fix not clearing TimeSync interrupts for 82580
	platform/x86: dell-smbios: Fix error path in dell_smbios_init()
	cx82310_eth: re-enable ethernet mode after router reboot
	drivers/net/usb: Remove all strcpy() uses
	net: usb: don't write directly to netdev->dev_addr
	usbnet: modern method to get random MAC
	rfkill: fix spelling mistake contidion to condition
	net: bridge: add support for sticky fdb entries
	bridge: switchdev: Allow clearing FDB entry offload indication
	net: bridge: fdb: convert is_local to bitops
	net: bridge: fdb: convert is_static to bitops
	net: bridge: fdb: convert is_sticky to bitops
	net: bridge: fdb: convert added_by_user to bitops
	net: bridge: fdb: convert added_by_external_learn to use bitops
	net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN
	net: dsa: vsc73xx: fix possible subblocks range of CAPT block
	iommu/vt-d: Handle volatile descriptor status read
	cgroup: Protect css->cgroup write under css_set_lock
	um: line: always fill *error_out in setup_one_line()
	devres: Initialize an uninitialized struct member
	pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
	hwmon: (adc128d818) Fix underflows seen when writing limit attributes
	hwmon: (lm95234) Fix underflows seen when writing limit attributes
	hwmon: (nct6775-core) Fix underflows seen when writing limit attributes
	hwmon: (w83627ehf) Fix underflows seen when writing limit attributes
	wifi: mwifiex: Do not return unused priv in mwifiex_get_priv_by_id()
	smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu()
	btrfs: replace BUG_ON with ASSERT in walk_down_proc()
	btrfs: clean up our handling of refs == 0 in snapshot delete
	PCI: Add missing bridge lock to pci_bus_lock()
	btrfs: initialize location to fix -Wmaybe-uninitialized in btrfs_lookup_dentry()
	HID: cougar: fix slab-out-of-bounds Read in cougar_report_fixup
	Input: uinput - reject requests with unreasonable number of slots
	usbnet: ipheth: race between ipheth_close and error handling
	Squashfs: sanity check symbolic link size
	of/irq: Prevent device address out-of-bounds read in interrupt map walk
	ata: pata_macio: Use WARN instead of BUG
	iio: buffer-dmaengine: fix releasing dma channel on error
	iio: fix scale application in iio_convert_raw_to_processed_unlocked
	nvmem: Fix return type of devm_nvmem_device_get() in kerneldoc
	uio_hv_generic: Fix kernel NULL pointer dereference in hv_uio_rescind
	Drivers: hv: vmbus: Fix rescind handling in uio_hv_generic
	VMCI: Fix use-after-free when removing resource in vmci_resource_remove()
	clocksource/drivers/imx-tpm: Fix return -ETIME when delta exceeds INT_MAX
	clocksource/drivers/imx-tpm: Fix next event not taking effect sometime
	uprobes: Use kzalloc to allocate xol area
	ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
	tracing: Avoid possible softlockup in tracing_iter_reset()
	nilfs2: replace snprintf in show functions with sysfs_emit
	nilfs2: protect references to superblock parameters exposed in sysfs
	netns: add pre_exit method to struct pernet_operations
	ila: call nf_unregister_net_hooks() sooner
	ACPI: processor: Return an error if acpi_processor_get_info() fails in processor_add()
	ACPI: processor: Fix memory leaks in error paths of processor_add()
	drm/i915/fence: Mark debug_fence_init_onstack() with __maybe_unused
	drm/i915/fence: Mark debug_fence_free() with __maybe_unused
	rtmutex: Drop rt_mutex::wait_lock before scheduling
	net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
	cx82310_eth: fix error return code in cx82310_bind()
	netns: restore ops before calling ops_exit_list
	Revert "parisc: Use irq_enter_rcu() to fix warning at kernel/context_tracking.c:367"
	Linux 4.19.322

Change-Id: I91163696e8593c077f8fe3d59348a68c76a2624b
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-09-12 10:36:52 +00:00
Jonas Gorski
7d9933cb99 net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN
[ Upstream commit bee2ef946d3184e99077be526567d791c473036f ]

When userspace wants to take over a fdb entry by setting it as
EXTERN_LEARNED, we set both flags BR_FDB_ADDED_BY_EXT_LEARN and
BR_FDB_ADDED_BY_USER in br_fdb_external_learn_add().

If the bridge updates the entry later because its port changed, we clear
the BR_FDB_ADDED_BY_EXT_LEARN flag, but leave the BR_FDB_ADDED_BY_USER
flag set.

If userspace then wants to take over the entry again,
br_fdb_external_learn_add() sees that BR_FDB_ADDED_BY_USER and skips
setting the BR_FDB_ADDED_BY_EXT_LEARN flags, thus silently ignores the
update.

Fix this by always allowing to set BR_FDB_ADDED_BY_EXT_LEARN regardless
if this was a user fdb entry or not.

Fixes: 710ae7287737 ("net: bridge: Mark FDB entries that were added by user as such")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20240903081958.29951-1-jonas.gorski@bisdn.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:53 +02:00
Nikolay Aleksandrov
d3bc290bdd net: bridge: fdb: convert added_by_external_learn to use bitops
[ Upstream commit b5cd9f7c42480ede119a390607a9dbe6263f6795 ]

Convert the added_by_external_learn field to a flag and use bitops.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Nikolay Aleksandrov
4b1bf0ea37 net: bridge: fdb: convert added_by_user to bitops
[ Upstream commit ac3ca6af443aa495c7907e5010ac77fbd2450eaa ]

Straight-forward convert of the added_by_user field to bitops.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Nikolay Aleksandrov
f210d06825 net: bridge: fdb: convert is_sticky to bitops
[ Upstream commit e0458d9a733ba71a2821d0c3fc0745baac697db0 ]

Straight-forward convert of the is_sticky field to bitops.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Nikolay Aleksandrov
806d9b8740 net: bridge: fdb: convert is_static to bitops
[ Upstream commit 29e63fffd666f1945756882d4b02bc7bec132101 ]

Convert the is_static to bitops, make use of the combined
test_and_set/clear_bit to simplify expressions in fdb_add_entry.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Nikolay Aleksandrov
9969873b37 net: bridge: fdb: convert is_local to bitops
[ Upstream commit 6869c3b02b596eba931a754f56875d2e2ac612db ]

The patch adds a new fdb flags field in the hole between the two cache
lines and uses it to convert is_local to bitops.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Ido Schimmel
76c1d0d1cb bridge: switchdev: Allow clearing FDB entry offload indication
[ Upstream commit e9ba0fbc7dd23a74e77960c98c988f59a1ff75aa ]

Currently, an FDB entry only ceases being offloaded when it is deleted.
This changes with VxLAN encapsulation.

Devices capable of performing VxLAN encapsulation usually have only one
FDB table, unlike the software data path which has two - one in the
bridge driver and another in the VxLAN driver.

Therefore, bridge FDB entries pointing to a VxLAN device are only
offloaded if there is a corresponding entry in the VxLAN FDB.

Allow clearing the offload indication in case the corresponding entry
was deleted from the VxLAN FDB.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Nikolay Aleksandrov
c5a0142c4d net: bridge: add support for sticky fdb entries
[ Upstream commit 435f2e7cc0b783615d7fbcf08f5f00d289f9caeb ]

Add support for entries which are "sticky", i.e. will not change their port
if they show up from a different one. A new ndm flag is introduced for that
purpose - NTF_STICKY. We allow to set it only to non-local entries.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: bee2ef946d31 ("net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:02:52 +02:00
Greg Kroah-Hartman
65e58a8638 Merge 4.19.314 into android-4.19-stable
Changes in 4.19.314
	dmaengine: pl330: issue_pending waits until WFP state
	dmaengine: Revert "dmaengine: pl330: issue_pending waits until WFP state"
	wifi: nl80211: don't free NULL coalescing rule
	drm/amdkfd: change system memory overcommit limit
	drm/amdgpu: Fix leak when GPU memory allocation fails
	net: slightly optimize eth_type_trans
	ethernet: add a helper for assigning port addresses
	ethernet: Add helper for assigning packet type when dest address does not match device address
	pinctrl: core: delete incorrect free in pinctrl_enable()
	power: rt9455: hide unused rt9455_boost_voltage_values
	pinctrl: devicetree: fix refcount leak in pinctrl_dt_to_map()
	s390/mm: Fix storage key clearing for guest huge pages
	s390/mm: Fix clearing storage keys for huge pages
	bna: ensure the copied buf is NUL terminated
	nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment().
	net l2tp: drop flow hash on forward
	net: dsa: mv88e6xxx: Add number of MACs in the ATU
	net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
	net: bridge: fix multicast-to-unicast with fraglist GSO
	tipc: fix a possible memleak in tipc_buf_append
	scsi: lpfc: Update lpfc_ramp_down_queue_handler() logic
	gfs2: Fix invalid metadata access in punch_hole
	wifi: mac80211: fix ieee80211_bss_*_flags kernel-doc
	net: mark racy access on sk->sk_rcvbuf
	scsi: bnx2fc: Remove spin_lock_bh while releasing resources after upload
	ALSA: line6: Zero-initialize message buffers
	net: bcmgenet: Reset RBUF on first open
	ata: sata_gemini: Check clk_enable() result
	firewire: ohci: mask bus reset interrupts between ISR and bottom half
	tools/power turbostat: Fix added raw MSR output
	tools/power turbostat: Fix Bzy_MHz documentation typo
	btrfs: make btrfs_clear_delalloc_extent() free delalloc reserve
	btrfs: always clear PERTRANS metadata during commit
	scsi: target: Fix SELinux error when systemd-modules loads the target module
	selftests: timers: Fix valid-adjtimex signed left-shift undefined behavior
	fs/9p: only translate RWX permissions for plain 9P2000
	fs/9p: translate O_TRUNC into OTRUNC
	9p: explicitly deny setlease attempts
	gpio: wcove: Use -ENOTSUPP consistently
	gpio: crystalcove: Use -ENOTSUPP consistently
	fs/9p: drop inodes immediately on non-.L too
	net:usb:qmi_wwan: support Rolling modules
	tcp: remove redundant check on tskb
	tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets
	tcp: Use refcount_inc_not_zero() in tcp_twsk_unique().
	Bluetooth: Fix use-after-free bugs caused by sco_sock_timeout
	Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeout
	rtnetlink: Correct nested IFLA_VF_VLAN_LIST attribute validation
	phonet: fix rtm_phonet_notify() skb allocation
	net: bridge: fix corrupted ethernet header on multicast-to-unicast
	ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
	af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
	af_unix: Fix garbage collector racing against connect()
	firewire: nosy: ensure user_length is taken into account when fetching packet contents
	usb: gadget: composite: fix OS descriptors w_value logic
	usb: gadget: f_fs: Fix a race condition when processing setup packets.
	tipc: fix UAF in error path
	dyndbg: fix old BUG_ON in >control parser
	drm/vmwgfx: Fix invalid reads in fence signaled events
	net: fix out-of-bounds access in ops_init
	af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().
	Linux 4.19.314

Change-Id: Iee5ac090f6fe369f9faa89d92ad17b66b8a41bee
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-05-17 15:28:37 +00:00
Felix Fietkau
e96b4e3e5e net: bridge: fix corrupted ethernet header on multicast-to-unicast
[ Upstream commit 86b29d830ad69eecff25b22dc96c14c6573718e6 ]

The change from skb_copy to pskb_copy unfortunately changed the data
copying to omit the ethernet header, since it was pulled before reaching
this point. Fix this by calling __skb_push/pull around pskb_copy.

Fixes: 59c878cbcdd8 ("net: bridge: fix multicast-to-unicast with fraglist GSO")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17 11:42:42 +02:00
Felix Fietkau
01386957ca net: bridge: fix multicast-to-unicast with fraglist GSO
[ Upstream commit 59c878cbcdd80ed39315573b3511d0acfd3501b5 ]

Calling skb_copy on a SKB_GSO_FRAGLIST skb is not valid, since it returns
an invalid linearized skb. This code only needs to change the ethernet
header, so pskb_copy is the right function to call here.

Fixes: 6db6f0eae6 ("bridge: multicast to unicast")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17 11:42:38 +02:00
Greg Kroah-Hartman
1b36874123 Revert "net: bridge: use DEV_STATS_INC()"
This reverts commit d2346e6beb which is
commit 44bdb313da57322c9b3c108eb66981c6ec6509f4 upstream.

It is needed to be reverted as it relies on an abi-breaking change that
also must be reverted.  It's not really needed for Android systems, but
if it is needed to come back, it can be reworked to be a CRC-abi neutral
change if so desired.

Bug: 161946584
Change-Id: I8a9f35972d5ade06a3bafd6a965ac2430ad0e63d
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-10-12 09:06:31 +00:00
Eric Dumazet
d2346e6beb net: bridge: use DEV_STATS_INC()
[ Upstream commit 44bdb313da57322c9b3c108eb66981c6ec6509f4 ]

syzbot/KCSAN reported data-races in br_handle_frame_finish() [1]
This function can run from multiple cpus without mutual exclusion.

Adopt SMP safe DEV_STATS_INC() to update dev->stats fields.

Handles updates to dev->stats.tx_dropped while we are at it.

[1]
BUG: KCSAN: data-race in br_handle_frame_finish / br_handle_frame_finish

read-write to 0xffff8881374b2178 of 8 bytes by interrupt on cpu 1:
br_handle_frame_finish+0xd4f/0xef0 net/bridge/br_input.c:189
br_nf_hook_thresh+0x1ed/0x220
br_nf_pre_routing_finish_ipv6+0x50f/0x540
NF_HOOK include/linux/netfilter.h:304 [inline]
br_nf_pre_routing_ipv6+0x1e3/0x2a0 net/bridge/br_netfilter_ipv6.c:178
br_nf_pre_routing+0x526/0xba0 net/bridge/br_netfilter_hooks.c:508
nf_hook_entry_hookfn include/linux/netfilter.h:144 [inline]
nf_hook_bridge_pre net/bridge/br_input.c:272 [inline]
br_handle_frame+0x4c9/0x940 net/bridge/br_input.c:417
__netif_receive_skb_core+0xa8a/0x21e0 net/core/dev.c:5417
__netif_receive_skb_one_core net/core/dev.c:5521 [inline]
__netif_receive_skb+0x57/0x1b0 net/core/dev.c:5637
process_backlog+0x21f/0x380 net/core/dev.c:5965
__napi_poll+0x60/0x3b0 net/core/dev.c:6527
napi_poll net/core/dev.c:6594 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6727
__do_softirq+0xc1/0x265 kernel/softirq.c:553
run_ksoftirqd+0x17/0x20 kernel/softirq.c:921
smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
kthread+0x1d7/0x210 kernel/kthread.c:388
ret_from_fork+0x48/0x60 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

read-write to 0xffff8881374b2178 of 8 bytes by interrupt on cpu 0:
br_handle_frame_finish+0xd4f/0xef0 net/bridge/br_input.c:189
br_nf_hook_thresh+0x1ed/0x220
br_nf_pre_routing_finish_ipv6+0x50f/0x540
NF_HOOK include/linux/netfilter.h:304 [inline]
br_nf_pre_routing_ipv6+0x1e3/0x2a0 net/bridge/br_netfilter_ipv6.c:178
br_nf_pre_routing+0x526/0xba0 net/bridge/br_netfilter_hooks.c:508
nf_hook_entry_hookfn include/linux/netfilter.h:144 [inline]
nf_hook_bridge_pre net/bridge/br_input.c:272 [inline]
br_handle_frame+0x4c9/0x940 net/bridge/br_input.c:417
__netif_receive_skb_core+0xa8a/0x21e0 net/core/dev.c:5417
__netif_receive_skb_one_core net/core/dev.c:5521 [inline]
__netif_receive_skb+0x57/0x1b0 net/core/dev.c:5637
process_backlog+0x21f/0x380 net/core/dev.c:5965
__napi_poll+0x60/0x3b0 net/core/dev.c:6527
napi_poll net/core/dev.c:6594 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6727
__do_softirq+0xc1/0x265 kernel/softirq.c:553
do_softirq+0x5e/0x90 kernel/softirq.c:454
__local_bh_enable_ip+0x64/0x70 kernel/softirq.c:381
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
_raw_spin_unlock_bh+0x36/0x40 kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:396 [inline]
batadv_tt_local_purge+0x1a8/0x1f0 net/batman-adv/translation-table.c:1356
batadv_tt_purge+0x2b/0x630 net/batman-adv/translation-table.c:3560
process_one_work kernel/workqueue.c:2630 [inline]
process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2703
worker_thread+0x525/0x730 kernel/workqueue.c:2784
kthread+0x1d7/0x210 kernel/kthread.c:388
ret_from_fork+0x48/0x60 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

value changed: 0x00000000000d7190 -> 0x00000000000d7191

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 14848 Comm: kworker/u4:11 Not tainted 6.6.0-rc1-syzkaller-00236-gad8a69f361b9 #0

Fixes: 1c29fc4989 ("[BRIDGE]: keep track of received multicast packets")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: bridge@lists.linux-foundation.org
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230918091351.1356153-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10 21:44:57 +02:00
Vladimir Oltean
8fa6db24bd net: bridge: keep ports without IFF_UNICAST_FLT in BR_PROMISC mode
[ Upstream commit 6ca3c005d0604e8d2b439366e3923ea58db99641 ]

According to the synchronization rules for .ndo_get_stats() as seen in
Documentation/networking/netdevices.rst, acquiring a plain spin_lock()
should not be illegal, but the bridge driver implementation makes it so.

After running these commands, I am being faced with the following
lockdep splat:

$ ip link add link swp0 name macsec0 type macsec encrypt on && ip link set swp0 up
$ ip link add dev br0 type bridge vlan_filtering 1 && ip link set br0 up
$ ip link set macsec0 master br0 && ip link set macsec0 up

  ========================================================
  WARNING: possible irq lock inversion dependency detected
  6.4.0-04295-g31b577b4bd4a #603 Not tainted
  --------------------------------------------------------
  swapper/1/0 just changed the state of lock:
  ffff6bd348724cd8 (&br->lock){+.-.}-{3:3}, at: br_forward_delay_timer_expired+0x34/0x198
  but this lock took another, SOFTIRQ-unsafe lock in the past:
   (&ocelot->stats_lock){+.+.}-{3:3}

  and interrupts could create inverse lock ordering between them.

  other info that might help us debug this:
  Chain exists of:
    &br->lock --> &br->hash_lock --> &ocelot->stats_lock

   Possible interrupt unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&ocelot->stats_lock);
                                 local_irq_disable();
                                 lock(&br->lock);
                                 lock(&br->hash_lock);
    <Interrupt>
      lock(&br->lock);

   *** DEADLOCK ***

(details about the 3 locks skipped)

swp0 is instantiated by drivers/net/dsa/ocelot/felix.c, and this
only matters to the extent that its .ndo_get_stats64() method calls
spin_lock(&ocelot->stats_lock).

Documentation/locking/lockdep-design.rst says:

| A lock is irq-safe means it was ever used in an irq context, while a lock
| is irq-unsafe means it was ever acquired with irq enabled.

(...)

| Furthermore, the following usage based lock dependencies are not allowed
| between any two lock-classes::
|
|    <hardirq-safe>   ->  <hardirq-unsafe>
|    <softirq-safe>   ->  <softirq-unsafe>

Lockdep marks br->hash_lock as softirq-safe, because it is sometimes
taken in softirq context (for example br_fdb_update() which runs in
NET_RX softirq), and when it's not in softirq context it blocks softirqs
by using spin_lock_bh().

Lockdep marks ocelot->stats_lock as softirq-unsafe, because it never
blocks softirqs from running, and it is never taken from softirq
context. So it can always be interrupted by softirqs.

There is a call path through which a function that holds br->hash_lock:
fdb_add_hw_addr() will call a function that acquires ocelot->stats_lock:
ocelot_port_get_stats64(). This can be seen below:

ocelot_port_get_stats64+0x3c/0x1e0
felix_get_stats64+0x20/0x38
dsa_slave_get_stats64+0x3c/0x60
dev_get_stats+0x74/0x2c8
rtnl_fill_stats+0x4c/0x150
rtnl_fill_ifinfo+0x5cc/0x7b8
rtmsg_ifinfo_build_skb+0xe4/0x150
rtmsg_ifinfo+0x5c/0xb0
__dev_notify_flags+0x58/0x200
__dev_set_promiscuity+0xa0/0x1f8
dev_set_promiscuity+0x30/0x70
macsec_dev_change_rx_flags+0x68/0x88
__dev_set_promiscuity+0x1a8/0x1f8
__dev_set_rx_mode+0x74/0xa8
dev_uc_add+0x74/0xa0
fdb_add_hw_addr+0x68/0xd8
fdb_add_local+0xc4/0x110
br_fdb_add_local+0x54/0x88
br_add_if+0x338/0x4a0
br_add_slave+0x20/0x38
do_setlink+0x3a4/0xcb8
rtnl_newlink+0x758/0x9d0
rtnetlink_rcv_msg+0x2f0/0x550
netlink_rcv_skb+0x128/0x148
rtnetlink_rcv+0x24/0x38

the plain English explanation for it is:

The macsec0 bridge port is created without p->flags & BR_PROMISC,
because it is what br_manage_promisc() decides for a VLAN filtering
bridge with a single auto port.

As part of the br_add_if() procedure, br_fdb_add_local() is called for
the MAC address of the device, and this results in a call to
dev_uc_add() for macsec0 while the softirq-safe br->hash_lock is taken.

Because macsec0 does not have IFF_UNICAST_FLT, dev_uc_add() ends up
calling __dev_set_promiscuity() for macsec0, which is propagated by its
implementation, macsec_dev_change_rx_flags(), to the lower device: swp0.
This triggers the call path:

dev_set_promiscuity(swp0)
-> rtmsg_ifinfo()
   -> dev_get_stats()
      -> ocelot_port_get_stats64()

with a calling context that lockdep doesn't like (br->hash_lock held).

Normally we don't see this, because even though many drivers that can be
bridge ports don't support IFF_UNICAST_FLT, we need a driver that

(a) doesn't support IFF_UNICAST_FLT, *and*
(b) it forwards the IFF_PROMISC flag to another driver, and
(c) *that* driver implements ndo_get_stats64() using a softirq-unsafe
    spinlock.

Condition (b) is necessary because the first __dev_set_rx_mode() calls
__dev_set_promiscuity() with "bool notify=false", and thus, the
rtmsg_ifinfo() code path won't be entered.

The same criteria also hold true for DSA switches which don't report
IFF_UNICAST_FLT. When the DSA master uses a spin_lock() in its
ndo_get_stats64() method, the same lockdep splat can be seen.

I think the deadlock possibility is real, even though I didn't reproduce
it, and I'm thinking of the following situation to support that claim:

fdb_add_hw_addr() runs on a CPU A, in a context with softirqs locally
disabled and br->hash_lock held, and may end up attempting to acquire
ocelot->stats_lock.

In parallel, ocelot->stats_lock is currently held by a thread B (say,
ocelot_check_stats_work()), which is interrupted while holding it by a
softirq which attempts to lock br->hash_lock.

Thread B cannot make progress because br->hash_lock is held by A. Whereas
thread A cannot make progress because ocelot->stats_lock is held by B.

When taking the issue at face value, the bridge can avoid that problem
by simply making the ports promiscuous from a code path with a saner
calling context (br->hash_lock not held). A bridge port without
IFF_UNICAST_FLT is going to become promiscuous as soon as we call
dev_uc_add() on it (which we do unconditionally), so why not be
preemptive and make it promiscuous right from the beginning, so as to
not be taken by surprise.

With this, we've broken the links between code that holds br->hash_lock
or br->lock and code that calls into the ndo_change_rx_flags() or
ndo_get_stats64() ops of the bridge port.

Fixes: 2796d0c648 ("bridge: Automatically manage port promiscuous mode.")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11 11:45:14 +02:00
Florian Westphal
1e98318af2 netfilter: ebtables: fix memory leak when blob is malformed
[ Upstream commit 62ce44c4fff947eebdf10bb582267e686e6835c9 ]

The bug fix was incomplete, it "replaced" crash with a memory leak.
The old code had an assignment to "ret" embedded into the conditional,
restore this.

Fixes: 7997eff82828 ("netfilter: ebtables: reject blobs that don't provide all entry points")
Reported-and-tested-by: syzbot+a24c5252f3e3ab733464@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28 11:02:56 +02:00
Harsh Modi
89810dbbff netfilter: br_netfilter: Drop dst references before setting.
[ Upstream commit d047283a7034140ea5da759a494fd2274affdd46 ]

The IPv6 path already drops dst in the daddr changed case, but the IPv4
path does not. This change makes the two code paths consistent.

Further, it is possible that there is already a metadata_dst allocated from
ingress that might already be attached to skbuff->dst while following
the bridge path. If it is not released before setting a new
metadata_dst, it will be leaked. This is similar to what is done in
bpf_set_tunnel_key() or ip6_route_input().

It is important to note that the memory being leaked is not the dst
being set in the bridge code, but rather memory allocated from some
other code path that is not being freed correctly before the skb dst is
overwritten.

An example of the leakage fixed by this commit found using kmemleak:

unreferenced object 0xffff888010112b00 (size 256):
  comm "softirq", pid 0, jiffies 4294762496 (age 32.012s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 80 16 f1 83 ff ff ff ff  ................
    e1 4e f6 82 ff ff ff ff 00 00 00 00 00 00 00 00  .N..............
  backtrace:
    [<00000000d79567ea>] metadata_dst_alloc+0x1b/0xe0
    [<00000000be113e13>] udp_tun_rx_dst+0x174/0x1f0
    [<00000000a36848f4>] geneve_udp_encap_recv+0x350/0x7b0
    [<00000000d4afb476>] udp_queue_rcv_one_skb+0x380/0x560
    [<00000000ac064aea>] udp_unicast_rcv_skb+0x75/0x90
    [<000000009a8ee8c5>] ip_protocol_deliver_rcu+0xd8/0x230
    [<00000000ef4980bb>] ip_local_deliver_finish+0x7a/0xa0
    [<00000000d7533c8c>] __netif_receive_skb_one_core+0x89/0xa0
    [<00000000a879497d>] process_backlog+0x93/0x190
    [<00000000e41ade9f>] __napi_poll+0x28/0x170
    [<00000000b4c0906b>] net_rx_action+0x14f/0x2a0
    [<00000000b20dd5d4>] __do_softirq+0xf4/0x305
    [<000000003a7d7e15>] __irq_exit_rcu+0xc3/0x140
    [<00000000968d39a2>] sysvec_apic_timer_interrupt+0x9e/0xc0
    [<000000009e920794>] asm_sysvec_apic_timer_interrupt+0x16/0x20
    [<000000008942add0>] native_safe_halt+0x13/0x20

Florian Westphal says: "Original code was likely fine because nothing
ever did set a skb->dst entry earlier than bridge in those days."

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Harsh Modi <harshmodi@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15 12:17:05 +02:00
Florian Westphal
358765beb8 netfilter: ebtables: reject blobs that don't provide all entry points
[ Upstream commit 7997eff82828304b780dc0a39707e1946d6f1ebf ]

Harshit Mogalapalli says:
 In ebt_do_table() function dereferencing 'private->hook_entry[hook]'
 can lead to NULL pointer dereference. [..] Kernel panic:

general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
[..]
RIP: 0010:ebt_do_table+0x1dc/0x1ce0
Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5c 16 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 6c df 08 48 8d 7d 2c 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 88
[..]
Call Trace:
 nf_hook_slow+0xb1/0x170
 __br_forward+0x289/0x730
 maybe_deliver+0x24b/0x380
 br_flood+0xc6/0x390
 br_dev_xmit+0xa2e/0x12c0

For some reason ebtables rejects blobs that provide entry points that are
not supported by the table, but what it should instead reject is the
opposite: blobs that DO NOT provide an entry point supported by the table.

t->valid_hooks is the bitmask of hooks (input, forward ...) that will see
packets.  Providing an entry point that is not support is harmless
(never called/used), but the inverse isn't: it results in a crash
because the ebtables traverser doesn't expect a NULL blob for a location
its receiving packets for.

Instead of fixing all the individual checks, do what iptables is doing and
reject all blobs that differ from the expected hooks.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-05 10:26:29 +02:00
Florian Westphal
9f45c9d712 netfilter: br_netfilter: do not skip all hooks with 0 priority
[ Upstream commit c2577862eeb0be94f151f2f1fff662b028061b00 ]

When br_netfilter module is loaded, skbs may be diverted to the
ipv4/ipv6 hooks, just like as if we were routing.

Unfortunately, bridge filter hooks with priority 0 may be skipped
in this case.

Example:
1. an nftables bridge ruleset is loaded, with a prerouting
   hook that has priority 0.
2. interface is added to the bridge.
3. no tcp packet is ever seen by the bridge prerouting hook.
4. flush the ruleset
5. load the bridge ruleset again.
6. tcp packets are processed as expected.

After 1) the only registered hook is the bridge prerouting hook, but its
not called yet because the bridge hasn't been brought up yet.

After 2), hook order is:
   0 br_nf_pre_routing // br_netfilter internal hook
   0 chain bridge f prerouting // nftables bridge ruleset

The packet is diverted to br_nf_pre_routing.
If call-iptables is off, the nftables bridge ruleset is called as expected.

But if its enabled, br_nf_hook_thresh() will skip it because it assumes
that all 0-priority hooks had been called previously in bridge context.

To avoid this, check for the br_nf_pre_routing hook itself, we need to
resume directly after it, even if this hook has a priority of 0.

Unfortunately, this still results in different packet flow.
With this fix, the eval order after in 3) is:
1. br_nf_pre_routing
2. ip(6)tables (if enabled)
3. nftables bridge

but after 5 its the much saner:
1. nftables bridge
2. br_nf_pre_routing
3. ip(6)tables (if enabled)

Unfortunately I don't see a solution here:
It would be possible to move br_nf_pre_routing to a higher priority
so that it will be called later in the pipeline, but this also impacts
ebtables evaluation order, and would still result in this very ordering
problem for all nftables-bridge hooks with the same priority as the
br_nf_pre_routing one.

Searching back through the git history I don't think this has
ever behaved in any other way, hence, no fixes-tag.

Reported-by: Radim Hrazdil <rhrazdil@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-21 21:09:30 +02:00
Andrew Lunn
b1f86c34b2 net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.
[ Upstream commit fbb3abdf2223cd0dfc07de85fe5a43ba7f435bdf ]

It is possible to stack bridges on top of each other. Consider the
following which makes use of an Ethernet switch:

       br1
     /    \
    /      \
   /        \
 br0.11    wlan0
   |
   br0
 /  |  \
p1  p2  p3

br0 is offloaded to the switch. Above br0 is a vlan interface, for
vlan 11. This vlan interface is then a slave of br1. br1 also has a
wireless interface as a slave. This setup trunks wireless lan traffic
over the copper network inside a VLAN.

A frame received on p1 which is passed up to the bridge has the
skb->offload_fwd_mark flag set to true, indicating that the switch has
dealt with forwarding the frame out ports p2 and p3 as needed. This
flag instructs the software bridge it does not need to pass the frame
back down again. However, the flag is not getting reset when the frame
is passed upwards. As a result br1 sees the flag, wrongly interprets
it, and fails to forward the frame to wlan0.

When passing a frame upwards, clear the flag. This is the Rx
equivalent of br_switchdev_frame_unmark() in br_dev_xmit().

Fixes: f1c2eddf4c ("bridge: switchdev: Use an helper to clear forward mark")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20220518005840.771575-1-andrew@lunn.ch
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-25 09:10:40 +02:00
Nikolay Aleksandrov
c13ec2c0f4 net: bridge: fix stale eth hdr pointer in br_dev_xmit
commit 823d81b0fa2cd83a640734e74caee338b5d3c093 upstream.

In br_dev_xmit() we perform vlan filtering in br_allowed_ingress() but
if the packet has the vlan header inside (e.g. bridge with disabled
tx-vlan-offload) then the vlan filtering code will use skb_vlan_untag()
to extract the vid before filtering which in turn calls pskb_may_pull()
and we may end up with a stale eth pointer. Moreover the cached eth header
pointer will generally be wrong after that operation. Remove the eth header
caching and just use eth_hdr() directly, the compiler does the right thing
and calculates it only once so we don't lose anything.

Fixes: 057658cb33 ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Eduardo Vela <Nava> <evn@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-16 12:51:45 +01:00
Nikolay Aleksandrov
ac06e167c4 net: bridge: clear bridge's private skb space on xmit
commit fd65e5a95d08389444e8591a20538b3edece0e15 upstream.

We need to clear all of the bridge private skb variables as they can be
stale due to the packet being recirculated through the stack and then
transmitted through the bridge device. Similar memset is already done on
bridge's input. We've seen cases where proxyarp_replied was 1 on routed
multicast packets transmitted through the bridge to ports with neigh
suppress which were getting dropped. Same thing can in theory happen with
the port isolation bit as well.

Fixes: 821f1b21ca ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-29 10:19:18 +01:00
Florian Westphal
519f563eca netfilter: bridge: add support for pppoe filtering
[ Upstream commit 28b78ecffea8078d81466b2e01bb5a154509f1ba ]

This makes 'bridge-nf-filter-pppoe-tagged' sysctl work for
bridged traffic.

Looking at the original commit it doesn't appear this ever worked:

 static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff **pskb,
[..]
        if (skb->protocol == htons(ETH_P_8021Q)) {
                skb_pull(skb, VLAN_HLEN);
                skb->network_header += VLAN_HLEN;
+       } else if (skb->protocol == htons(ETH_P_PPP_SES)) {
+               skb_pull(skb, PPPOE_SES_HLEN);
+               skb->network_header += PPPOE_SES_HLEN;
        }
 [..]
	NF_HOOK(... POST_ROUTING, ...)

... but the adjusted offsets are never restored.

The alternative would be to rip this code out for good,
but otoh we'd have to keep this anyway for the vlan handling
(which works because vlan tag info is in the skb, not the packet
 payload).

Reported-and-tested-by: Amish Chana <amish@3g.co.za>
Fixes: 516299d2f5 ("[NETFILTER]: bridge-nf: filter bridged IPv4/IPv6 encapsulated in pppoe traffic")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27 09:04:18 +01:00
Eric Dumazet
198d4e60e9 net: bridge: use nla_total_size_64bit() in br_get_linkxstats_size()
[ Upstream commit dbe0b88064494b7bb6a9b2aa7e085b14a3112d44 ]

bridge_fill_linkxstats() is using nla_reserve_64bit().

We must use nla_total_size_64bit() instead of nla_total_size()
for corresponding data structure.

Fixes: 1080ab95e3 ("net: bridge: add support for IGMP/MLD stats and export them via netlink")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-13 10:10:52 +02:00
Yang Yingliang
f41237f60c net: bridge: fix memleak in br_add_if()
[ Upstream commit 519133debcc19f5c834e7e28480b60bdc234fe02 ]

I got a memleak report:

BUG: memory leak
unreferenced object 0x607ee521a658 (size 240):
comm "syz-executor.0", pid 955, jiffies 4294780569 (age 16.449s)
hex dump (first 32 bytes, cpu 1):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000d830ea5a>] br_multicast_add_port+0x1c2/0x300 net/bridge/br_multicast.c:1693
[<00000000274d9a71>] new_nbp net/bridge/br_if.c:435 [inline]
[<00000000274d9a71>] br_add_if+0x670/0x1740 net/bridge/br_if.c:611
[<0000000012ce888e>] do_set_master net/core/rtnetlink.c:2513 [inline]
[<0000000012ce888e>] do_set_master+0x1aa/0x210 net/core/rtnetlink.c:2487
[<0000000099d1cafc>] __rtnl_newlink+0x1095/0x13e0 net/core/rtnetlink.c:3457
[<00000000a01facc0>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488
[<00000000acc9186c>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5550
[<00000000d4aabb9c>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504
[<00000000bc2e12a3>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<00000000bc2e12a3>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340
[<00000000e4dc2d0e>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929
[<000000000d22c8b3>] sock_sendmsg_nosec net/socket.c:654 [inline]
[<000000000d22c8b3>] sock_sendmsg+0x139/0x170 net/socket.c:674
[<00000000e281417a>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350
[<00000000237aa2ab>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404
[<000000004f2dc381>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433
[<0000000005feca6c>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47
[<000000007304477d>] entry_SYSCALL_64_after_hwframe+0x44/0xae

On error path of br_add_if(), p->mcast_stats allocated in
new_nbp() need be freed, or it will be leaked.

Fixes: 1080ab95e3 ("net: bridge: add support for IGMP/MLD stats and export them via netlink")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210809132023.978546-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-26 08:36:38 -04:00
Wolfgang Bumiller
a2281d2f76 net: bridge: sync fdb to new unicast-filtering ports
commit a019abd8022061b917da767cd1a66ed823724eab upstream.

Since commit 2796d0c648 ("bridge: Automatically manage
port promiscuous mode.")
bridges with `vlan_filtering 1` and only 1 auto-port don't
set IFF_PROMISC for unicast-filtering-capable ports.

Normally on port changes `br_manage_promisc` is called to
update the promisc flags and unicast filters if necessary,
but it cannot distinguish between *new* ports and ones
losing their promisc flag, and new ports end up not
receiving the MAC address list.

Fix this by calling `br_fdb_sync_static` in `br_add_if`
after the port promisc flags are updated and the unicast
filter was supposed to have been filled.

Fixes: 2796d0c648 ("bridge: Automatically manage port promiscuous mode.")
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28 11:13:45 +02:00
Nikolay Aleksandrov
2e70bf39b1 net: bridge: multicast: fix PIM hello router port marking race
commit 04bef83a3358946bfc98a5ecebd1b0003d83d882 upstream.

When a PIM hello packet is received on a bridge port with multicast
snooping enabled, we mark it as a router port automatically, that
includes adding that port the router port list. The multicast lock
protects that list, but it is not acquired in the PIM message case
leading to a race condition, we need to take it to fix the race.

Cc: stable@vger.kernel.org
Fixes: 91b02d3d13 ("bridge: mcast: add router port on PIM hello message")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20 16:16:15 +02:00
Nikolay Aleksandrov
84fc1c944e net: bridge: fix vlan tunnel dst refcnt when egressing
commit cfc579f9d89af4ada58c69b03bcaa4887840f3b3 upstream.

The egress tunnel code uses dst_clone() and directly sets the result
which is wrong because the entry might have 0 refcnt or be already deleted,
causing number of problems. It also triggers the WARN_ON() in dst_hold()[1]
when a refcnt couldn't be taken. Fix it by using dst_hold_safe() and
checking if a reference was actually taken before setting the dst.

[1] dmesg WARN_ON log and following refcnt errors
 WARNING: CPU: 5 PID: 38 at include/net/dst.h:230 br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge]
 Modules linked in: 8021q garp mrp bridge stp llc bonding ipv6 virtio_net
 CPU: 5 PID: 38 Comm: ksoftirqd/5 Kdump: loaded Tainted: G        W         5.13.0-rc3+ #360
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
 RIP: 0010:br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge]
 Code: e8 85 bc 01 e1 45 84 f6 74 90 45 31 f6 85 db 48 c7 c7 a0 02 19 a0 41 0f 94 c6 31 c9 31 d2 44 89 f6 e8 64 bc 01 e1 85 db 75 02 <0f> 0b 31 c9 31 d2 44 89 f6 48 c7 c7 70 02 19 a0 e8 4b bc 01 e1 49
 RSP: 0018:ffff8881003d39e8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa01902a0
 RBP: ffff8881040c6700 R08: 0000000000000000 R09: 0000000000000001
 R10: 2ce93d0054fe0d00 R11: 54fe0d00000e0000 R12: ffff888109515000
 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000401
 FS:  0000000000000000(0000) GS:ffff88822bf40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f42ba70f030 CR3: 0000000109926000 CR4: 00000000000006e0
 Call Trace:
  br_handle_vlan+0xbc/0xca [bridge]
  __br_forward+0x23/0x164 [bridge]
  deliver_clone+0x41/0x48 [bridge]
  br_handle_frame_finish+0x36f/0x3aa [bridge]
  ? skb_dst+0x2e/0x38 [bridge]
  ? br_handle_ingress_vlan_tunnel+0x3e/0x1c8 [bridge]
  ? br_handle_frame_finish+0x3aa/0x3aa [bridge]
  br_handle_frame+0x2c3/0x377 [bridge]
  ? __skb_pull+0x33/0x51
  ? vlan_do_receive+0x4f/0x36a
  ? br_handle_frame_finish+0x3aa/0x3aa [bridge]
  __netif_receive_skb_core+0x539/0x7c6
  ? __list_del_entry_valid+0x16e/0x1c2
  __netif_receive_skb_list_core+0x6d/0xd6
  netif_receive_skb_list_internal+0x1d9/0x1fa
  gro_normal_list+0x22/0x3e
  dev_gro_receive+0x55b/0x600
  ? detach_buf_split+0x58/0x140
  napi_gro_receive+0x94/0x12e
  virtnet_poll+0x15d/0x315 [virtio_net]
  __napi_poll+0x2c/0x1c9
  net_rx_action+0xe6/0x1fb
  __do_softirq+0x115/0x2d8
  run_ksoftirqd+0x18/0x20
  smpboot_thread_fn+0x183/0x19c
  ? smpboot_unregister_percpu_thread+0x66/0x66
  kthread+0x10a/0x10f
  ? kthread_mod_delayed_work+0xb6/0xb6
  ret_from_fork+0x22/0x30
 ---[ end trace 49f61b07f775fd2b ]---
 dst_release: dst:00000000c02d677a refcnt:-1
 dst_release underflow

Cc: stable@vger.kernel.org
Fixes: 11538d039a ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-30 08:48:26 -04:00
Nikolay Aleksandrov
24a6e55f17 net: bridge: fix vlan tunnel dst null pointer dereference
commit 58e2071742e38f29f051b709a5cca014ba51166f upstream.

This patch fixes a tunnel_dst null pointer dereference due to lockless
access in the tunnel egress path. When deleting a vlan tunnel the
tunnel_dst pointer is set to NULL without waiting a grace period (i.e.
while it's still usable) and packets egressing are dereferencing it
without checking. Use READ/WRITE_ONCE to annotate the lockless use of
tunnel_id, use RCU for accessing tunnel_dst and make sure it is read
only once and checked in the egress path. The dst is already properly RCU
protected so we don't need to do anything fancy than to make sure
tunnel_id and tunnel_dst are read only once and checked in the egress path.

Cc: stable@vger.kernel.org
Fixes: 11538d039a ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-30 08:48:25 -04:00
Nikolay Aleksandrov
e9e5f34400 net: bridge: when suppression is enabled exclude RARP packets
[ Upstream commit 0353b4a96b7a9f60fe20d1b3ebd4931a4085f91c ]

Recently we had an interop issue where RARP packets got suppressed with
bridge neigh suppression enabled, but the check in the code was meant to
suppress GARP. Exclude RARP packets from it which would allow some VMWare
setups to work, to quote the report:
"Those RARP packets usually get generated by vMware to notify physical
switches when vMotion occurs. vMware may use random sip/tip or just use
sip=tip=0. So the RARP packet sometimes get properly flooded by the vtep
and other times get dropped by the logic"

Reported-by: Amer Abdalamer <amer@nvidia.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-22 10:59:40 +02:00
Vladimir Oltean
7b878f6a73 net: bridge: use switchdev for port flags set through sysfs too
commit 8043c845b63a2dd88daf2d2d268a33e1872800f0 upstream.

Looking through patchwork I don't see that there was any consensus to
use switchdev notifiers only in case of netlink provided port flags but
not sysfs (as a sort of deprecation, punishment or anything like that),
so we should probably keep the user interface consistent in terms of
functionality.

http://patchwork.ozlabs.org/project/netdev/patch/20170605092043.3523-3-jiri@resnulli.us/
http://patchwork.ozlabs.org/project/netdev/patch/20170608064428.4785-3-jiri@resnulli.us/

Fixes: 3922285d96 ("net: bridge: Add support for offloading port attributes")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07 12:18:56 +01:00
Zhang Changzhong
e64cc46270 net: bridge: vlan: fix error return code in __vlan_add()
[ Upstream commit ee4f52a8de2c6f78b01f10b4c330867d88c1653a ]

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: f8ed289fab ("bridge: vlan: use br_vlan_(get|put)_master to deal with refcounts")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/1607071737-33875-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30 11:25:41 +01:00
Antoine Tenart
c9c048d4e3 netfilter: bridge: reset skb->pkt_type after NF_INET_POST_ROUTING traversal
[ Upstream commit 44f64f23bae2f0fad25503bc7ab86cd08d04cd47 ]

Netfilter changes PACKET_OTHERHOST to PACKET_HOST before invoking the
hooks as, while it's an expected value for a bridge, routing expects
PACKET_HOST. The change is undone later on after hook traversal. This
can be seen with pairs of functions updating skb>pkt_type and then
reverting it to its original value:

For hook NF_INET_PRE_ROUTING:
  setup_pre_routing / br_nf_pre_routing_finish

For hook NF_INET_FORWARD:
  br_nf_forward_ip / br_nf_forward_finish

But the third case where netfilter does this, for hook
NF_INET_POST_ROUTING, the packet type is changed in br_nf_post_routing
but never reverted. A comment says:

  /* We assume any code from br_dev_queue_push_xmit onwards doesn't care
   * about the value of skb->pkt_type. */

But when having a tunnel (say vxlan) attached to a bridge we have the
following call trace:

  br_nf_pre_routing
  br_nf_pre_routing_ipv6
     br_nf_pre_routing_finish
  br_nf_forward_ip
     br_nf_forward_finish
  br_nf_post_routing           <- pkt_type is updated to PACKET_HOST
     br_nf_dev_queue_xmit      <- but not reverted to its original value
  vxlan_xmit
     vxlan_xmit_one
        skb_tunnel_check_pmtu  <- a check on pkt_type is performed

In this specific case, this creates issues such as when an ICMPv6 PTB
should be sent back. When CONFIG_BRIDGE_NETFILTER is enabled, the PTB
isn't sent (as skb_tunnel_check_pmtu checks if pkt_type is PACKET_HOST
and returns early).

If the comment is right and no one cares about the value of
skb->pkt_type after br_dev_queue_push_xmit (which isn't true), resetting
it to its original value should be safe.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20201123174902.622102-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-08 10:18:53 +01:00
Heiner Kallweit
b1197f5d6f net: bridge: add missing counters to ndo_get_stats64 callback
[ Upstream commit 7a30ecc9237681bb125cbd30eee92bef7e86293d ]

In br_forward.c and br_input.c fields dev->stats.tx_dropped and
dev->stats.multicast are populated, but they are ignored in
ndo_get_stats64.

Fixes: 28172739f0 ("net: fix 64 bit counters on 32 bit arches")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/58ea9963-77ad-a7cf-8dfd-fc95ab95f606@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-24 13:27:17 +01:00
Thomas Martitz
13378a1d5b net: bridge: enfore alignment for ethernet address
[ Upstream commit db7202dec92e6caa2706c21d6fc359af318bde2e ]

The eth_addr member is passed to ether_addr functions that require
2-byte alignment, therefore the member must be properly aligned
to avoid unaligned accesses.

The problem is in place since the initial merge of multicast to unicast:
commit 6db6f0eae6 bridge: multicast to unicast

Fixes: 6db6f0eae6 ("bridge: multicast to unicast")
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Martitz <t.martitz@avm.de>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30 23:17:03 -04:00
Ido Schimmel
1e74500f99 bridge: Avoid infinite loop when suppressing NS messages with invalid options
[ Upstream commit 53fc685243bd6fb90d90305cea54598b78d3cbfc ]

When neighbor suppression is enabled the bridge device might reply to
Neighbor Solicitation (NS) messages on behalf of remote hosts.

In case the NS message includes the "Source link-layer address" option
[1], the bridge device will use the specified address as the link-layer
destination address in its reply.

To avoid an infinite loop, break out of the options parsing loop when
encountering an option with length zero and disregard the NS message.

This is consistent with the IPv6 ndisc code and RFC 4886 which states
that "Nodes MUST silently discard an ND packet that contains an option
with length zero" [2].

[1] https://tools.ietf.org/html/rfc4861#section-4.3
[2] https://tools.ietf.org/html/rfc4861#section-4.6

Fixes: ed842faeb2 ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alla Segal <allas@mellanox.com>
Tested-by: Alla Segal <allas@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-22 09:04:58 +02:00
Michael Braun
f7d8095579 netfilter: nft_reject_bridge: enable reject with bridge vlan
commit e9c284ec4b41c827f4369973d2792992849e4fa5 upstream.

Currently, using the bridge reject target with tagged packets
results in untagged packets being sent back.

Fix this by mirroring the vlan id as well.

Fixes: 85f5b3086a ("netfilter: bridge: add reject support")
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03 08:19:47 +02:00
Florian Westphal
909021aed8 netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule
[ Upstream commit 680f6af5337c98d116e4f127cea7845339dba8da ]

If userspace provides a rule blob with trailing data after last target,
we trigger a splat, then convert ruleset to 64bit format (with trailing
data), then pass that to do_replace_finish() which then returns -EINVAL.

Erroring out right away avoids the splat plus unneeded translation and
error unwind.

Fixes: 81e675c227 ("netfilter: ebtables: add CONFIG_COMPAT support")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:50:47 +01:00
Roopa Prabhu
221569dfed bridge: br_arp_nd_proxy: set icmp6_router if neigh has NTF_ROUTER
[ Upstream commit 7aca011f88eb57be1b17b0216247f4e32ac54e29 ]

Fixes: ed842faeb2 ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:49:55 +01:00
Hangbin Liu
8bf95f28be net: add bool confirm_neigh parameter for dst_ops.update_pmtu
[ Upstream commit bd085ef678b2cc8c38c105673dfe8ff8f5ec0c57 ]

The MTU update code is supposed to be invoked in response to real
networking events that update the PMTU. In IPv6 PMTU update function
__ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
confirmed time.

But for tunnel code, it will call pmtu before xmit, like:
  - tnl_update_pmtu()
    - skb_dst_update_pmtu()
      - ip6_rt_update_pmtu()
        - __ip6_rt_update_pmtu()
          - dst_confirm_neigh()

If the tunnel remote dst mac address changed and we still do the neigh
confirm, we will not be able to update neigh cache and ping6 remote
will failed.

So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
should not be invoking dst_confirm_neigh() as we have no evidence
of successful two-way communication at this point.

On the other hand it is also important to keep the neigh reachability fresh
for TCP flows, so we cannot remove this dst_confirm_neigh() call.

To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
to choose whether we should do neigh update or not. I will add the parameter
in this patch and set all the callers to true to comply with the previous
way, and fix the tunnel code one by one on later patches.

v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

Suggested-by: David Miller <davem@davemloft.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-04 19:13:37 +01:00
Eric Dumazet
2ad86afcd9 netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()
commit 5604285839aaedfb23ebe297799c6e558939334d upstream.

syzbot is kind enough to remind us we need to call skb_may_pull()

BUG: KMSAN: uninit-value in br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
CPU: 1 PID: 11631 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x220 lib/dump_stack.c:118
 kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
 __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
 br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
 nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
 nf_hook_slow+0x18b/0x3f0 net/netfilter/core.c:512
 nf_hook include/linux/netfilter.h:260 [inline]
 NF_HOOK include/linux/netfilter.h:303 [inline]
 __br_forward+0x78f/0xe30 net/bridge/br_forward.c:109
 br_flood+0xef0/0xfe0 net/bridge/br_forward.c:234
 br_handle_frame_finish+0x1a77/0x1c20 net/bridge/br_input.c:162
 nf_hook_bridge_pre net/bridge/br_input.c:245 [inline]
 br_handle_frame+0xfb6/0x1eb0 net/bridge/br_input.c:348
 __netif_receive_skb_core+0x20b9/0x51a0 net/core/dev.c:4830
 __netif_receive_skb_one_core net/core/dev.c:4927 [inline]
 __netif_receive_skb net/core/dev.c:5043 [inline]
 process_backlog+0x610/0x13c0 net/core/dev.c:5874
 napi_poll net/core/dev.c:6311 [inline]
 net_rx_action+0x7a6/0x1aa0 net/core/dev.c:6379
 __do_softirq+0x4a1/0x83a kernel/softirq.c:293
 do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1091
 </IRQ>
 do_softirq kernel/softirq.c:338 [inline]
 __local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:190
 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
 rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline]
 __dev_queue_xmit+0x38e8/0x4200 net/core/dev.c:3819
 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3825
 packet_snd net/packet/af_packet.c:2959 [inline]
 packet_sendmsg+0x8234/0x9100 net/packet/af_packet.c:2984
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg net/socket.c:657 [inline]
 __sys_sendto+0xc44/0xc70 net/socket.c:1952
 __do_sys_sendto net/socket.c:1964 [inline]
 __se_sys_sendto+0x107/0x130 net/socket.c:1960
 __x64_sys_sendto+0x6e/0x90 net/socket.c:1960
 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45a679
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f0a3c9e5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 000000000045a679
RDX: 000000000000000e RSI: 0000000020000200 RDI: 0000000000000003
RBP: 000000000075bf20 R08: 00000000200000c0 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0a3c9e66d4
R13: 00000000004c8ec1 R14: 00000000004dfe28 R15: 00000000ffffffff

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
 kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
 kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
 slab_alloc_node mm/slub.c:2773 [inline]
 __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
 __kmalloc_reserve net/core/skbuff.c:141 [inline]
 __alloc_skb+0x306/0xa10 net/core/skbuff.c:209
 alloc_skb include/linux/skbuff.h:1049 [inline]
 alloc_skb_with_frags+0x18c/0xa80 net/core/skbuff.c:5662
 sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2244
 packet_alloc_skb net/packet/af_packet.c:2807 [inline]
 packet_snd net/packet/af_packet.c:2902 [inline]
 packet_sendmsg+0x63a6/0x9100 net/packet/af_packet.c:2984
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg net/socket.c:657 [inline]
 __sys_sendto+0xc44/0xc70 net/socket.c:1952
 __do_sys_sendto net/socket.c:1964 [inline]
 __se_sys_sendto+0x107/0x130 net/socket.c:1960
 __x64_sys_sendto+0x6e/0x90 net/socket.c:1960
 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: c4e70a87d9 ("netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.c")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-04 19:13:28 +01:00
Florian Westphal
751e2557de netfilter: ebtables: compat: reject all padding in matches/watchers
commit e608f631f0ba5f1fc5ee2e260a3a35d13107cbfe upstream.

syzbot reported following splat:

BUG: KASAN: vmalloc-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
BUG: KASAN: vmalloc-out-of-bounds in compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
Read of size 4 at addr ffffc900004461f4 by task syz-executor267/7937

CPU: 1 PID: 7937 Comm: syz-executor267 Not tainted 5.5.0-rc1-syzkaller #0
 size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
 compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
 compat_do_replace+0x344/0x720 net/bridge/netfilter/ebtables.c:2249
 compat_do_ebt_set_ctl+0x22f/0x27e net/bridge/netfilter/ebtables.c:2333
 [..]

Because padding isn't considered during computation of ->buf_user_offset,
"total" is decremented by fewer bytes than it should.

Therefore, the first part of

if (*total < sizeof(*entry) || entry->next_offset < sizeof(*entry))

will pass, -- it should not have.  This causes oob access:
entry->next_offset is past the vmalloced size.

Reject padding and check that computed user offset (sum of ebt_entry
structure plus all individual matches/watchers/targets) is same
value that userspace gave us as the offset of the next entry.

Reported-by: syzbot+f68108fed972453a0ad4@syzkaller.appspotmail.com
Fixes: 81e675c227 ("netfilter: ebtables: add CONFIG_COMPAT support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-04 19:13:26 +01:00
Nikolay Aleksandrov
bb168ebe95 net: bridge: deny dev_set_mac_address() when unregistering
[ Upstream commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 ]

We have an interesting memory leak in the bridge when it is being
unregistered and is a slave to a master device which would change the
mac of its slaves on unregister (e.g. bond, team). This is a very
unusual setup but we do end up leaking 1 fdb entry because
dev_set_mac_address() would cause the bridge to insert the new mac address
into its table after all fdbs are flushed, i.e. after dellink() on the
bridge has finished and we call NETDEV_UNREGISTER the bond/team would
release it and will call dev_set_mac_address() to restore its original
address and that in turn will add an fdb in the bridge.
One fix is to check for the bridge dev's reg_state in its
ndo_set_mac_address callback and return an error if the bridge is not in
NETREG_REGISTERED.

Easy steps to reproduce:
 1. add bond in mode != A/B
 2. add any slave to the bond
 3. add bridge dev as a slave to the bond
 4. destroy the bridge device

Trace:
 unreferenced object 0xffff888035c4d080 (size 128):
   comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s)
   hex dump (first 32 bytes):
     41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00  A..6............
     d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00  ...^?...........
   backtrace:
     [<00000000ddb525dc>] kmem_cache_alloc+0x155/0x26f
     [<00000000633ff1e0>] fdb_create+0x21/0x486 [bridge]
     [<0000000092b17e9c>] fdb_insert+0x91/0xdc [bridge]
     [<00000000f2a0f0ff>] br_fdb_change_mac_address+0xb3/0x175 [bridge]
     [<000000001de02dbd>] br_stp_change_bridge_id+0xf/0xff [bridge]
     [<00000000ac0e32b1>] br_set_mac_address+0x76/0x99 [bridge]
     [<000000006846a77f>] dev_set_mac_address+0x63/0x9b
     [<00000000d30738fc>] __bond_release_one+0x3f6/0x455 [bonding]
     [<00000000fc7ec01d>] bond_netdev_event+0x2f2/0x400 [bonding]
     [<00000000305d7795>] notifier_call_chain+0x38/0x56
     [<0000000028885d4a>] call_netdevice_notifiers+0x1e/0x23
     [<000000008279477b>] rollback_registered_many+0x353/0x6a4
     [<0000000018ef753a>] unregister_netdevice_many+0x17/0x6f
     [<00000000ba854b7a>] rtnl_delete_link+0x3c/0x43
     [<00000000adf8618d>] rtnl_dellink+0x1dc/0x20a
     [<000000009b6395fd>] rtnetlink_rcv_msg+0x23d/0x268

Fixes: 4359881338 ("bridge: add local MAC address to forwarding table (v2)")
Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 10:57:10 +01:00
Florian Westphal
c5b0bbef43 bridge: ebtables: don't crash when using dnat target in output chains
[ Upstream commit b23c0742c2ce7e33ed79d10e451f70fdb5ca85d1 ]

xt_in() returns NULL in the output hook, skip the pkt_type change for
that case, redirection only makes sense in broute/prerouting hooks.

Reported-by: Tom Yan <tom.ty89@gmail.com>
Cc: Linus Lüssing <linus.luessing@c0d3.blue>
Fixes: cf3cb246e2 ("bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:19:41 +01:00