Changes in 5.4.285
usbnet: ipheth: fix carrier detection in modes 1 and 4
net: ethernet: use ip_hdrlen() instead of bit shift
net: phy: vitesse: repair vsc73xx autonegotiation
scripts: kconfig: merge_config: config files: add a trailing newline
arm64: dts: rockchip: override BIOS_DISABLE signal via GPIO hog on RK3399 Puma
ice: fix accounting for filters shared by multiple VSIs
net/mlx5e: Add missing link modes to ptys2ethtool_map
net: ftgmac100: Enable TX interrupt to avoid TX timeout
net: dpaa: Pad packets to ETH_ZLEN
spi: nxp-fspi: fix the KASAN report out-of-bounds bug
soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps"
selftests: breakpoints: Fix a typo of function name
ASoC: allow module autoloading for table db1200_pids
ALSA: hda/realtek - Fixed ALC256 headphone no sound
ALSA: hda/realtek - FIxed ALC285 headphone no sound
pinctrl: at91: make it work with current gpiolib
microblaze: don't treat zero reserved memory regions as error
net: ftgmac100: Ensure tx descriptor updates are visible
wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room()
wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead
ASoC: tda7419: fix module autoloading
drm: komeda: Fix an issue related to normalized zpos
spi: bcm63xx: Enable module autoloading
x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
ocfs2: add bounds checking to ocfs2_xattr_find_entry()
ocfs2: strict bound check before memcmp in ocfs2_xattr_find_entry()
gpio: prevent potential speculation leaks in gpio_device_get_desc()
inet: inet_defrag: prevent sk release while still in use
bpf: Fix DEVMAP_HASH overflow check on 32-bit arches
USB: serial: pl2303: add device id for Macrosilicon MS3020
USB: usbtmc: prevent kernel-usb-infoleak
ACPI: PMIC: Remove unneeded check in tps68470_pmic_opregion_probe()
wifi: ath9k: fix parameter check in ath9k_init_debug()
wifi: ath9k: Remove error checks when creating debugfs entries
fs: explicitly unregister per-superblock BDIs
mount: warn only once about timestamp range expiration
fs/namespace: fnic: Switch to use %ptTd
mount: handle OOM on mnt_warn_timestamp_expiry
can: j1939: use correct function name in comment
netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
netfilter: nf_tables: reject element expiration with no timeout
netfilter: nf_tables: reject expiration higher than timeout
wifi: cfg80211: fix UBSAN noise in cfg80211_wext_siwscan()
wifi: cfg80211: fix two more possible UBSAN-detected off-by-one errors
mac80211: parse radiotap header when selecting Tx queue
wifi: mac80211: use two-phase skb reclamation in ieee80211_do_stop()
wifi: wilc1000: fix potential RCU dereference issue in wilc_parse_join_bss_param
sock_map: Add a cond_resched() in sock_hash_free()
can: bcm: Clear bo->bcm_proc_read after remove_proc_entry().
Bluetooth: btusb: Fix not handling ZPL/short-transfer
net: tipc: avoid possible garbage value
block, bfq: fix possible UAF for bfqq->bic with merge chain
block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()
block, bfq: don't break merge chain in bfq_split_bfqq()
spi: ppc4xx: handle irq_of_parse_and_map() errors
spi: ppc4xx: Avoid returning 0 when failed to parse and map IRQ
ARM: dts: imx7d-zii-rmu2: fix Ethernet PHY pinctrl property
ARM: versatile: fix OF node leak in CPUs prepare
reset: berlin: fix OF node leak in probe() error path
clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
hwmon: (max16065) Fix overflows seen when writing limits
mtd: slram: insert break after errors in parsing the map
hwmon: (ntc_thermistor) fix module autoloading
power: supply: axp20x_battery: allow disabling battery charging
power: supply: axp20x_battery: Remove design from min and max voltage
power: supply: max17042_battery: Fix SOC threshold calc w/ no current sense
fbdev: hpfb: Fix an error handling path in hpfb_dio_probe()
mtd: powernv: Add check devm_kasprintf() returned value
drm/stm: Fix an error handling path in stm_drm_platform_probe()
drm/amdgpu: Replace one-element array with flexible-array member
drm/amdgpu: properly handle vbios fake edid sizing
drm/radeon: Replace one-element array with flexible-array member
drm/radeon: properly handle vbios fake edid sizing
drm/rockchip: vop: Allow 4096px width scaling
drm/rockchip: dw_hdmi: Fix reading EDID when using a forced mode
drm/radeon/evergreen_cs: fix int overflow errors in cs track offsets
jfs: fix out-of-bounds in dbNextAG() and diAlloc()
drm/msm: Fix incorrect file name output in adreno_request_fw()
drm/msm/a5xx: disable preemption in submits by default
drm/msm/a5xx: properly clear preemption records on resume
drm/msm/a5xx: fix races in preemption evaluation stage
ipmi: docs: don't advertise deprecated sysfs entries
drm/msm: fix %s null argument error
drivers:drm:exynos_drm_gsc:Fix wrong assignment in gsc_bind()
xen: use correct end address of kernel for conflict checking
xen/swiotlb: add alignment check for dma buffers
tpm: Clean up TPM space after command failure
selftests/bpf: Fix compile error from rlim_t in sk_storage_map.c
selftests/bpf: Fix compiling flow_dissector.c with musl-libc
selftests/bpf: Fix compiling tcp_rtt.c with musl-libc
selftests/bpf: Fix error compiling test_lru_map.c
xz: cleanup CRC32 edits from 2018
kthread: add kthread_work tracepoints
kthread: fix task state in kthread worker if being frozen
jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers()
ext4: clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT even mount with discard
smackfs: Use rcu_assign_pointer() to ensure safe assignment in smk_set_cipso
ext4: avoid negative min_clusters in find_group_orlov()
ext4: return error on ext4_find_inline_entry
ext4: avoid OOB when system.data xattr changes underneath the filesystem
nilfs2: fix potential null-ptr-deref in nilfs_btree_insert()
nilfs2: determine empty node blocks as corrupted
nilfs2: fix potential oob read in nilfs_btree_check_delete()
bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit
perf sched timehist: Fix missing free of session in perf_sched__timehist()
perf sched timehist: Fixed timestamp error when unable to confirm event sched_in time
perf time-utils: Fix 32-bit nsec parsing
clk: rockchip: Set parent rate for DCLK_VOP clock on RK3228
drivers: media: dvb-frontends/rtl2832: fix an out-of-bounds write error
drivers: media: dvb-frontends/rtl2830: fix an out-of-bounds write error
PCI: keystone: Fix if-statement expression in ks_pcie_quirk()
PCI: xilinx-nwl: Fix register misspelling
RDMA/iwcm: Fix WARNING:at_kernel/workqueue.c:#check_flush_dependency
pinctrl: single: fix missing error code in pcs_probe()
clk: ti: dra7-atl: Fix leak of of_nodes
pinctrl: mvebu: Fix devinit_dove_pinctrl_probe function
watchdog: imx_sc_wdt: Don't disable WDT in suspend
RDMA/hns: Optimize hem allocation performance
riscv: Fix fp alignment bug in perf_callchain_user()
RDMA/cxgb4: Added NULL check for lookup_atid
ntb: intel: Fix the NULL vs IS_ERR() bug for debugfs_create_dir()
nfsd: call cache_put if xdr_reserve_space returns NULL
nfsd: return -EINVAL when namelen is 0
f2fs: enhance to update i_mode and acl atomically in f2fs_setattr()
f2fs: fix typo
f2fs: fix to update i_ctime in __f2fs_setxattr()
f2fs: remove unneeded check condition in __f2fs_setxattr()
f2fs: reduce expensive checkpoint trigger frequency
iio: adc: ad7606: fix oversampling gpio array
iio: adc: ad7606: fix standby gpio state to match the documentation
coresight: tmc: sg: Do not leak sg_table
netfilter: nf_reject_ipv6: fix nf_reject_ip6_tcphdr_put()
net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race Condition
tcp: check skb is non-NULL in tcp_rto_delta_us()
net: qrtr: Update packets cloning when broadcasting
netfilter: ctnetlink: compile ctnetlink_label_size with CONFIG_NF_CONNTRACK_EVENTS
crypto: aead,cipher - zeroize key buffer after use
Remove *.orig pattern from .gitignore
soc: versatile: integrator: fix OF node leak in probe() error path
drm/amd/display: Round calculated vtotal
USB: appledisplay: close race between probe and completion handler
USB: misc: cypress_cy7c63: check for short transfer
USB: class: CDC-ACM: fix race between get_serial and set_serial
firmware_loader: Block path traversal
tty: rp2: Fix reset with non forgiving PCIe host bridges
drbd: Fix atomicity violation in drbd_uuid_set_bm()
drbd: Add NULL check for net_conf to prevent dereference in state validation
ACPI: sysfs: validate return type of _STR method
ACPI: resource: Add another DMI match for the TongFang GMxXGxx
wifi: rtw88: 8822c: Fix reported RX band width
debugobjects: Fix conditions in fill_pool()
f2fs: prevent possible int overflow in dir_block_index()
f2fs: avoid potential int overflow in sanity_check_area_boundary()
hwrng: mtk - Use devm_pm_runtime_enable
vfs: fix race between evice_inodes() and find_inode()&iput()
fs: Fix file_set_fowner LSM hook inconsistencies
nfs: fix memory leak in error path of nfs4_do_reclaim
ASoC: meson: axg: extract sound card utils
ASoC: meson: axg-card: fix 'use-after-free'
PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
PCI: xilinx-nwl: Fix off-by-one in INTx IRQ handler
soc: versatile: realview: fix memory leak during device remove
soc: versatile: realview: fix soc_dev leak during device remove
usb: yurex: Replace snprintf() with the safer scnprintf() variant
USB: misc: yurex: fix race between read and write
pps: remove usage of the deprecated ida_simple_xx() API
pps: add an error check in parport_attach
mm: only enforce minimum stack gap size if it's sensible
i2c: aspeed: Update the stop sw state when the bus recovery occurs
i2c: isch: Add missed 'else'
usb: yurex: Fix inconsistent locking bug in yurex_read()
mailbox: rockchip: fix a typo in module autoloading
mailbox: bcm2835: Fix timeout during suspend mode
ceph: remove the incorrect Fw reference check when dirtying pages
Minor fixes to the CAIF Transport drivers Kconfig file
drivers: net: Fix Kconfig indentation, continued
ieee802154: Fix build error
net/mlx5: Added cond_resched() to crdump collection
netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq()
netfilter: nf_tables: prevent nf_skb_duplicated corruption
Bluetooth: btmrvl_sdio: Refactor irq wakeup
Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq()
net: ethernet: lantiq_etop: fix memory disclosure
net: avoid potential underflow in qdisc_pkt_len_init() with UFO
net: add more sanity checks to qdisc_pkt_len_init()
ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start
ALSA: hda/realtek: Fix the push button function for the ALC257
ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
f2fs: Require FMODE_WRITE for atomic write ioctls
wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats()
wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit
ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node()
net: hisilicon: hip04: fix OF node leak in probe()
net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
net: hisilicon: hns_mdio: fix OF node leak in probe()
ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
net: sched: consistently use rcu_replace_pointer() in taprio_change()
wifi: rtw88: select WANT_DEV_COREDUMP
ACPI: EC: Do not release locks during operation region accesses
ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
tipc: guard against string buffer overrun
net: mvpp2: Increase size of queue_name buffer
ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
ACPICA: iasl: handle empty connection_node
proc: add config & param to block forcing mem writes
wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext()
nfp: Use IRQF_NO_AUTOEN flag in request_irq()
signal: Replace BUG_ON()s
ALSA: asihpi: Fix potential OOB array access
ALSA: hdsp: Break infinite MIDI input flush loop
x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments()
fbdev: pxafb: Fix possible use after free in pxafb_task()
power: reset: brcmstb: Do not go into infinite loop if reset fails
ata: sata_sil: Rename sil_blacklist to sil_quirks
jfs: UBSAN: shift-out-of-bounds in dbFindBits
jfs: Fix uaf in dbFreeBits
jfs: check if leafidx greater than num leaves per dmap tree
jfs: Fix uninit-value access of new_ea in ea_buffer
drm/amd/display: Check stream before comparing them
drm/amd/display: Fix index out of bounds in degamma hardware format translation
drm/amd/display: Initialize get_bytes_per_element's default to 1
drm/printer: Allow NULL data in devcoredump printer
scsi: aacraid: Rearrange order of struct aac_srb_unit
drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
of/irq: Refer to actual buffer size in of_irq_parse_one()
ext4: ext4_search_dir should return a proper error
ext4: fix i_data_sem unlock order in ext4_ind_migrate()
spi: s3c64xx: fix timeout counters in flush_fifo
selftests: breakpoints: use remaining time to check if suspend succeed
selftests: vDSO: fix vDSO symbols lookup for powerpc64
i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume
i2c: xiic: Wait for TX empty to avoid missed TX NAKs
firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp()
spi: bcm63xx: Fix module autoloading
perf/core: Fix small negative period being ignored
parisc: Fix itlb miss handler for 64-bit programs
drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS
ALSA: core: add isascii() check to card ID generator
ext4: no need to continue when the number of entries is 1
ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
ext4: aovid use-after-free in ext4_ext_insert_extent()
ext4: fix double brelse() the buffer of the extents path
ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
parisc: Fix 64-bit userspace syscall path
parisc: Fix stack start for ADDR_NO_RANDOMIZE personality
of/irq: Support #msi-cells=<0> in of_msi_get_domain
drm: omapdrm: Add missing check for alloc_ordered_workqueue
jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
mm: krealloc: consider spare memory for __GFP_ZERO
ocfs2: fix the la space leak when unmounting an ocfs2 volume
ocfs2: fix uninit-value in ocfs2_get_block()
ocfs2: reserve space for inline xattr before attaching reflink tree
ocfs2: cancel dqi_sync_work before freeing oinfo
ocfs2: remove unreasonable unlock in ocfs2_read_blocks
ocfs2: fix null-ptr-deref when journal load failed.
ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
riscv: define ILLEGAL_POINTER_VALUE for 64bit
aoe: fix the potential use-after-free problem in more places
clk: rockchip: fix error for unknown clocks
media: sun4i_csi: Implement link validate for sun4i_csi subdev
media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
media: venus: fix use after free bug in venus_remove due to race condition
iio: magnetometer: ak8975: Fix reading for ak099xx sensors
tomoyo: fallback to realpath if symlink's pathname does not exist
rtc: at91sam9: fix OF node leak in probe() error path
Input: adp5589-keys - fix adp5589_gpio_get_value()
ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
btrfs: wait for fixup workers before stopping cleaner kthread during umount
gpio: davinci: fix lazy disable
i2c: qcom-geni: Let firmware specify irq trigger flags
i2c: qcom-geni: Grow a dev pointer to simplify code
i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq()
arm64: Add Cortex-715 CPU part definition
arm64: cputype: Add Neoverse-N3 definitions
arm64: errata: Expand speculative SSBS workaround once more
uprobes: fix kernel info leak via "[uprobes]" vma
nfsd: use ktime_get_seconds() for timestamps
nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
clk: qcom: rpmh: Simplify clk_rpmh_bcm_send_cmd()
clk: qcom: clk-rpmh: Fix overflow in BCM vote
r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"
r8169: add tally counter fields added with RTL8125
ACPI: battery: Simplify battery hook locking
ACPI: battery: Fix possible crash when unregistering a battery hook
ext4: fix inode tree inconsistency caused by ENOMEM
unicode: Don't special case ignorable code points
net: ethernet: cortina: Drop TSO support
tracing: Remove precision vsnprintf() check from print event
drm/crtc: fix uninitialized variable use even harder
tracing: Have saved_cmdlines arrays all in one allocation
virtio_console: fix misc probe bugs
Input: synaptics-rmi4 - fix UAF of IRQ domain on driver removal
bpf: Check percpu map value size first
s390/facility: Disable compile time optimization for decompressor code
s390/mm: Add cond_resched() to cmm_alloc/free_pages()
ext4: nested locking for xattr inode
s390/cpum_sf: Remove WARN_ON_ONCE statements
ktest.pl: Avoid false positives with grub2 skip regex
clk: bcm: bcm53573: fix OF node leak in init
PCI: Add ACS quirk for Qualcomm SA8775P
i2c: i801: Use a different adapter-name for IDF adapters
PCI: Mark Creative Labs EMU20k2 INTx masking as broken
ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition
media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put()
usb: chipidea: udc: enable suspend interrupt after usb reset
usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario
virtio_pmem: Check device status before requesting flush
tools/iio: Add memory allocation failure check for trigger_name
driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute
fbdev: sisfb: Fix strbuf array overflow
RDMA/rxe: Fix seg fault in rxe_comp_queue_pkt
ice: fix VLAN replay after reset
SUNRPC: Fix integer overflow in decode_rc_list()
tcp: fix to allow timestamp undo if no retransmits were sent
tcp: fix tcp_enter_recovery() to zero retrans_stamp when it's safe
netfilter: br_netfilter: fix panic with metadata_dst skb
Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
gpio: aspeed: Add the flush write to ensure the write complete.
gpio: aspeed: Use devm_clk api to manage clock source
igb: Do not bring the device up after non-fatal error
net/sched: accept TCA_STAB only for root qdisc
net: ibm: emac: mal: fix wrong goto
net: annotate lockless accesses to sk->sk_ack_backlog
net: annotate lockless accesses to sk->sk_max_ack_backlog
sctp: ensure sk_state is set to CLOSED if hashing fails in sctp_listen_start
ppp: fix ppp_async_encode() illegal access
slip: make slhc_remember() more robust against malicious packets
locking/lockdep: Fix bad recursion pattern
locking/lockdep: Rework lockdep_lock
locking/lockdep: Avoid potential access of invalid memory in lock_class
lockdep: fix deadlock issue between lockdep and rcu
resource: fix region_intersects() vs add_memory_driver_managed()
CDC-NCM: avoid overflow in sanity checking
HID: plantronics: Workaround for an unexcepted opposite volume key
Revert "usb: yurex: Replace snprintf() with the safer scnprintf() variant"
usb: dwc3: core: Stop processing of pending events if controller is halted
usb: xhci: Fix problem with xhci resume from suspend
usb: storage: ignore bogus device raised by JieLi BR21 USB sound chip
hid: intel-ish-hid: Fix uninitialized variable 'rv' in ish_fw_xfer_direct_dma
net: Fix an unsafe loop on the list
nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error
posix-clock: Fix missing timespec64 check in pc_clock_settime()
arm64: probes: Remove broken LDR (literal) uprobe support
arm64: probes: Fix simulate_ldr*_literal()
tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
tracing/kprobes: Fix symbol counting logic by looking at modules as well
PCI: Add function 0 DMA alias quirk for Glenfly Arise chip
fat: fix uninitialized variable
mm/swapfile: skip HugeTLB pages for unuse_vma
wifi: mac80211: fix potential key use-after-free
KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()
s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
KVM: s390: Change virtual to physical address access in diag 0x258 handler
x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
drm/vmwgfx: Handle surface check failure correctly
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
iio: light: opt3001: add missing full-scale range value
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Bluetooth: Remove debugfs directory on module init failure
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
xhci: Fix incorrect stream context type macro
USB: serial: option: add support for Quectel EG916Q-GL
USB: serial: option: add Telit FN920C04 MBIM compositions
parport: Proper fix for array out-of-bounds access
x86/resctrl: Annotate get_mem_config() functions as __init
x86/apic: Always explicitly disarm TSC-deadline timer
nilfs2: propagate directory read errors from nilfs_find_entry()
erofs: fix lz4 inplace decompression
mac80211: Fix NULL ptr deref for injected rate info
RDMA/bnxt_re: Fix incorrect AVID type in WQE structure
ARM: dts: bcm2837-rpi-cm3-io3: Fix HDMI hpd-gpio pin
RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP
ipv4: give an IPv4 dev to blackhole_netdev
RDMA/bnxt_re: Return more meaningful error
drm/msm/dsi: fix 32-bit signed integer extension in pclk_rate calculation
macsec: don't increment counters for an unrelated SA
net: ethernet: aeroflex: fix potential memory leak in greth_start_xmit_gbit()
net: systemport: fix potential memory leak in bcm_sysport_xmit()
genetlink: hold RCU in genlmsg_mcast()
smb: client: fix OOBs when building SMB2_IOCTL request
usb: typec: altmode should keep reference to parent
Bluetooth: bnep: fix wild-memory-access in proto_unregister
arm64:uprobe fix the uprobe SWBP_INSN in big-endian
arm64: probes: Fix uprobes for big-endian kernels
KVM: s390: gaccess: Refactor gpa and length calculation
KVM: s390: gaccess: Refactor access address range check
KVM: s390: gaccess: Cleanup access to guest pages
KVM: s390: gaccess: Check if guest address is in memslot
drm/vboxvideo: Replace fake VLA at end of vbva_mouse_pointer_shape with real VLA
udf: fix uninit-value use in udf_get_fileshortad
jfs: Fix sanity check in dbMount
tracing: Consider the NULL character when validating the event length
net/sun3_82586: fix potential memory leak in sun3_82586_send_packet()
be2net: fix potential memory leak in be_xmit()
net: usb: usbnet: fix name regression
net: sched: fix use-after-free in taprio_change()
r8169: avoid unsolicited interrupts
posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime()
ALSA: firewire-lib: Avoid division by zero in apply_constraint_to_size()
ALSA: hda/realtek: Update default depop procedure
drm/amd: Guard against bad data for ATIF ACPI method
ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[]
ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue
nilfs2: fix kernel bug due to missing clearing of buffer delay flag
ALSA: hda/realtek: Add subwoofer quirk for Acer Predator G9-593
hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event
selinux: improve error checking in sel_write_load()
arm64/uprobes: change the uprobe_opcode_t typedef to fix the sparse warning
xfrm: validate new SA's prefixlen using SA family when sel.family is unset
cgroup: Fix potential overflow issue when checking max_depth
wifi: mac80211: skip non-uploaded keys in ieee80211_iter_keys
mac80211: do drv_reconfig_complete() before restarting all
mac80211: Add support to trigger sta disconnect on hardware restart
wifi: iwlwifi: mvm: disconnect station vifs if recovery failed
wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd()
ASoC: cs42l51: Fix some error handling paths in cs42l51_probe()
dt-bindings: gpu: Convert Samsung Image Rotator to dt-schema
gtp: simplify error handling code in 'gtp_encap_enable()'
gtp: allow -1 to be specified as file description from userspace
net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT
bpf: Fix out-of-bounds write in trie_get_next_key()
net: support ip generic csum processing in skb_csum_hwoffload_help
net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension
netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
drivers/misc: ti-st: Remove unneeded variable in st_tty_open
firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
net: amd: mvme147: Fix probe banner message
misc: sgi-gru: Don't disable preemption in GRU driver
usbip: tools: Fix detach_port() invalid port error path
usb: phy: Fix API devm_usb_put_phy() can not release the phy
xhci: Fix Link TRB DMA in command ring stopped completion event
Revert "driver core: Fix uevent_show() vs driver detach race"
wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower
wifi: ath10k: Fix memory leak in management tx
wifi: iwlegacy: Clear stale interrupts before resuming device
staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg()
nilfs2: fix potential deadlock with newly created symlinks
riscv: Remove unused GENERATING_ASM_OFFSETS
ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow
nilfs2: fix kernel bug due to missing clearing of checked flag
mm: shmem: fix data-race in shmem_getattr()
Revert "drm/mipi-dsi: Set the fwnode for mipi_dsi_device"
vt: prevent kernel-infoleak in con_font_get()
mac80211: always have ieee80211_sta_restart()
mm: krealloc: Fix MTE false alarm in __do_krealloc
Linux 5.4.285
Change-Id: Ie1859b6122e2fdacf18a1fe83f792b855fd0e54c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit eb6f68ec92 which is
commit 20c20bd11a0702ce4dc9300c3da58acf551d9725 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I4611eed3677738ab29469733e2b4f6734ef3d605
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 20c20bd11a0702ce4dc9300c3da58acf551d9725 ]
map is the pointer of outer map, and need_defer needs some explanation.
need_defer tells the implementation to defer the reference release of
the passed element and ensure that the element is still alive before
the bpf program, which may manipulate it, exits.
The following three cases will invoke map_fd_put_ptr() and different
need_defer values will be passed to these callers:
1) release the reference of the old element in the map during map update
or map deletion. The release must be deferred, otherwise the bpf
program may incur use-after-free problem, so need_defer needs to be
true.
2) release the reference of the to-be-added element in the error path of
map update. The to-be-added element is not visible to any bpf
program, so it is OK to pass false for need_defer parameter.
3) release the references of all elements in the map during map release.
Any bpf program which has access to the map must have been exited and
released, so need_defer=false will be OK.
These two parameters will be used by the following patches to fix the
potential use-after-free problem for map-in-map.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20231204140425.1480317-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of version 2 of the gnu general public license as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 64 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.894819585@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Most bpf map types doing similar checks and bytes to pages
conversion during memory allocation and charging.
Let's unify these checks by moving them into bpf_map_charge_init().
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In order to unify the existing memlock charging code with the
memcg-based memory accounting, which will be added later, let's
rework the current scheme.
Currently the following design is used:
1) .alloc() callback optionally checks if the allocation will likely
succeed using bpf_map_precharge_memlock()
2) .alloc() performs actual allocations
3) .alloc() callback calculates map cost and sets map.memory.pages
4) map_create() calls bpf_map_init_memlock() which sets map.memory.user
and performs actual charging; in case of failure the map is
destroyed
<map is in use>
1) bpf_map_free_deferred() calls bpf_map_release_memlock(), which
performs uncharge and releases the user
2) .map_free() callback releases the memory
The scheme can be simplified and made more robust:
1) .alloc() calculates map cost and calls bpf_map_charge_init()
2) bpf_map_charge_init() sets map.memory.user and performs actual
charge
3) .alloc() performs actual allocations
<map is in use>
1) .map_free() callback releases the memory
2) bpf_map_charge_finish() performs uncharge and releases the user
The new scheme also allows to reuse bpf_map_charge_init()/finish()
functions for memcg-based accounting. Because charges are performed
before actual allocations and uncharges after freeing the memory,
no bogus memory pressure can be created.
In cases when the map structure is not available (e.g. it's not
created yet, or is already destroyed), on-stack bpf_map_memory
structure is used. The charge can be transferred with the
bpf_map_charge_move() function.
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Group "user" and "pages" fields of bpf_map into the bpf_map_memory
structure. Later it can be extended with "memcg" and other related
information.
The main reason for a such change (beside cosmetics) is to pass
bpf_map_memory structure to charging functions before the actual
allocation of bpf_map.
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Given we'll be reusing BPF array maps for global data/bss/rodata
sections, we need a way to associate BTF DataSec type as its map
value type. In usual cases we have this ugly BPF_ANNOTATE_KV_PAIR()
macro hack e.g. via 38d5d3b3d5 ("bpf: Introduce BPF_ANNOTATE_KV_PAIR")
to get initial map to type association going. While more use cases
for it are discouraged, this also won't work for global data since
the use of array map is a BPF loader detail and therefore unknown
at compilation time. For array maps with just a single entry we make
an exception in terms of BTF in that key type is declared optional
if value type is of DataSec type. The latter LLVM is guaranteed to
emit and it also aligns with how we regard global data maps as just
a plain buffer area reusing existing map facilities for allowing
things like introspection with existing tools.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This work adds two new map creation flags BPF_F_RDONLY_PROG
and BPF_F_WRONLY_PROG in order to allow for read-only or
write-only BPF maps from a BPF program side.
Today we have BPF_F_RDONLY and BPF_F_WRONLY, but this only
applies to system call side, meaning the BPF program has full
read/write access to the map as usual while bpf(2) calls with
map fd can either only read or write into the map depending
on the flags. BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG allows
for the exact opposite such that verifier is going to reject
program loads if write into a read-only map or a read into a
write-only map is detected. For read-only map case also some
helpers are forbidden for programs that would alter the map
state such as map deletion, update, etc. As opposed to the two
BPF_F_RDONLY / BPF_F_WRONLY flags, BPF_F_RDONLY_PROG as well
as BPF_F_WRONLY_PROG really do correspond to the map lifetime.
We've enabled this generic map extension to various non-special
maps holding normal user data: array, hash, lru, lpm, local
storage, queue and stack. Further generic map types could be
followed up in future depending on use-case. Main use case
here is to forbid writes into .rodata map values from verifier
side.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This generic extension to BPF maps allows for directly loading
an address residing inside a BPF map value as a single BPF
ldimm64 instruction!
The idea is similar to what BPF_PSEUDO_MAP_FD does today, which
is a special src_reg flag for ldimm64 instruction that indicates
that inside the first part of the double insns's imm field is a
file descriptor which the verifier then replaces as a full 64bit
address of the map into both imm parts. For the newly added
BPF_PSEUDO_MAP_VALUE src_reg flag, the idea is the following:
the first part of the double insns's imm field is again a file
descriptor corresponding to the map, and the second part of the
imm field is an offset into the value. The verifier will then
replace both imm parts with an address that points into the BPF
map value at the given value offset for maps that support this
operation. Currently supported is array map with single entry.
It is possible to support more than just single map element by
reusing both 16bit off fields of the insns as a map index, so
full array map lookup could be expressed that way. It hasn't
been implemented here due to lack of concrete use case, but
could easily be done so in future in a compatible way, since
both off fields right now have to be 0 and would correctly
denote a map index 0.
The BPF_PSEUDO_MAP_VALUE is a distinct flag as otherwise with
BPF_PSEUDO_MAP_FD we could not differ offset 0 between load of
map pointer versus load of map's value at offset 0, and changing
BPF_PSEUDO_MAP_FD's encoding into off by one to differ between
regular map pointer and map value pointer would add unnecessary
complexity and increases barrier for debugability thus less
suitable. Using the second part of the imm field as an offset
into the value does /not/ come with limitations since maximum
possible value size is in u32 universe anyway.
This optimization allows for efficiently retrieving an address
to a map value memory area without having to issue a helper call
which needs to prepare registers according to calling convention,
etc, without needing the extra NULL test, and without having to
add the offset in an additional instruction to the value base
pointer. The verifier then treats the destination register as
PTR_TO_MAP_VALUE with constant reg->off from the user passed
offset from the second imm field, and guarantees that this is
within bounds of the map value. Any subsequent operations are
normally treated as typical map value handling without anything
extra needed from verification side.
The two map operations for direct value access have been added to
array map for now. In future other types could be supported as
well depending on the use case. The main use case for this commit
is to allow for BPF loader support for global variables that
reside in .data/.rodata/.bss sections such that we can directly
load the address of them with minimal additional infrastructure
required. Loader support has been added in subsequent commits for
libbpf library.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Introduce BPF_F_LOCK flag for map_lookup and map_update syscall commands
and for map_update() helper function.
In all these cases take a lock of existing element (which was provided
in BTF description) before copying (in or out) the rest of map value.
Implementation details that are part of uapi:
Array:
The array map takes the element lock for lookup/update.
Hash:
hash map also takes the lock for lookup/update and tries to avoid the bucket lock.
If old element exists it takes the element lock and updates the element in place.
If element doesn't exist it allocates new one and inserts into hash table
while holding the bucket lock.
In rare case the hashmap has to take both the bucket lock and the element lock
to update old value in place.
Cgroup local storage:
It is similar to array. update in place and lookup are done with lock taken.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let
bpf program serialize access to other variables.
Example:
struct hash_elem {
int cnt;
struct bpf_spin_lock lock;
};
struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
if (val) {
bpf_spin_lock(&val->lock);
val->cnt++;
bpf_spin_unlock(&val->lock);
}
Restrictions and safety checks:
- bpf_spin_lock is only allowed inside HASH and ARRAY maps.
- BTF description of the map is mandatory for safety analysis.
- bpf program can take one bpf_spin_lock at a time, since two or more can
cause dead locks.
- only one 'struct bpf_spin_lock' is allowed per map element.
It drastically simplifies implementation yet allows bpf program to use
any number of bpf_spin_locks.
- when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not allowed.
- bpf program must bpf_spin_unlock() before return.
- bpf program can access 'struct bpf_spin_lock' only via
bpf_spin_lock()/bpf_spin_unlock() helpers.
- load/store into 'struct bpf_spin_lock lock;' field is not allowed.
- to use bpf_spin_lock() helper the BTF description of map value must be
a struct and have 'struct bpf_spin_lock anyname;' field at the top level.
Nested lock inside another struct is not allowed.
- syscall map_lookup doesn't copy bpf_spin_lock field to user space.
- syscall map_update and program map_update do not update bpf_spin_lock field.
- bpf_spin_lock cannot be on the stack or inside networking packet.
bpf_spin_lock can only be inside HASH or ARRAY map value.
- bpf_spin_lock is available to root only and to all program types.
- bpf_spin_lock is not allowed in inner maps of map-in-map.
- ld_abs is not allowed inside spin_lock-ed region.
- tracing progs and socket filter progs cannot use bpf_spin_lock due to
insufficient preemption checks
Implementation details:
- cgroup-bpf class of programs can nest with xdp/tc programs.
Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
Other solutions to avoid nested bpf_spin_lock are possible.
Like making sure that all networking progs run with softirq disabled.
spin_lock_irqsave is the simplest and doesn't add overhead to the
programs that don't use it.
- arch_spinlock_t is used when its implemented as queued_spin_lock
- archs can force their own arch_spinlock_t
- on architectures where queued_spin_lock is not available and
sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used.
- presence of bpf_spin_lock inside map value could have been indicated via
extra flag during map_create, but specifying it via BTF is cleaner.
It provides introspection for map key/value and reduces user mistakes.
Next steps:
- allow bpf_spin_lock in other map types (like cgroup local storage)
- introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
to request kernel to grab bpf_spin_lock before rewriting the value.
That will serialize access to map elements.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If key_type or value_type are of non-trivial data types
(e.g. structure or typedef), it's not possible to check them without
the additional information, which can't be obtained without a pointer
to the btf structure.
So, let's pass btf pointer to the map_check_btf() callbacks.
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Added bpffs pretty print for program array map. For a particular
array index, if the program array points to a valid program,
the "<index>: <prog_id>" will be printed out like
0: 6
which means bpf program with id "6" is installed at index "0".
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Added bpffs pretty print for percpu arraymap, percpu hashmap
and percpu lru hashmap.
For each map <key, value> pair, the format is:
<key_value>: {
cpu0: <value_on_cpu0>
cpu1: <value_on_cpu1>
...
cpun: <value_on_cpun>
}
For example, on my VM, there are 4 cpus, and
for test_btf test in the next patch:
cat /sys/fs/bpf/pprint_test_percpu_hash
You may get:
...
43602: {
cpu0: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
cpu1: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
cpu2: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
cpu3: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
}
72847: {
cpu0: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
cpu1: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
cpu2: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
cpu3: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
}
...
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit a26ca7c982 ("bpf: btf: Add pretty print support to
the basic arraymap") and 699c86d6ec ("bpf: btf: add pretty
print for hash/lru_hash maps") enabled support for BTF and
dumping via BPF fs for array and hash/lru map. However, both
can be decoupled from each other such that regular BPF maps
can be supported for attaching BTF key/value information,
while not all maps necessarily need to dump via map_seq_show_elem()
callback.
The basic sanity check which is a prerequisite for all maps
is that key/value size has to match in any case, and some maps
can have extra checks via map_check_btf() callback, e.g.
probing certain types or indicating no support in general. With
that we can also enable retrieving BTF info for per-cpu map
types and lpm.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
This patch introduces a new map type BPF_MAP_TYPE_REUSEPORT_SOCKARRAY.
To unleash the full potential of a bpf prog, it is essential for the
userspace to be capable of directly setting up a bpf map which can then
be consumed by the bpf prog to make decision. In this case, decide which
SO_REUSEPORT sk to serve the incoming request.
By adding BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, the userspace has total control
and visibility on where a SO_REUSEPORT sk should be located in a bpf map.
The later patch will introduce BPF_PROG_TYPE_SK_REUSEPORT such that
the bpf prog can directly select a sk from the bpf map. That will
raise the programmability of the bpf prog attached to a reuseport
group (a group of sk serving the same IP:PORT).
For example, in UDP, the bpf prog can peek into the payload (e.g.
through the "data" pointer introduced in the later patch) to learn
the application level's connection information and then decide which sk
to pick from a bpf map. The userspace can tightly couple the sk's location
in a bpf map with the application logic in generating the UDP payload's
connection information. This connection info contact/API stays within the
userspace.
Also, when used with map-in-map, the userspace can switch the
old-server-process's inner map to a new-server-process's inner map
in one call "bpf_map_update_elem(outer_map, &index, &new_reuseport_array)".
The bpf prog will then direct incoming requests to the new process instead
of the old process. The old process can finish draining the pending
requests (e.g. by "accept()") before closing the old-fds. [Note that
deleting a fd from a bpf map does not necessary mean the fd is closed]
During map_update_elem(),
Only SO_REUSEPORT sk (i.e. which has already been added
to a reuse->socks[]) can be used. That means a SO_REUSEPORT sk that is
"bind()" for UDP or "bind()+listen()" for TCP. These conditions are
ensured in "reuseport_array_update_check()".
A SO_REUSEPORT sk can only be added once to a map (i.e. the
same sk cannot be added twice even to the same map). SO_REUSEPORT
already allows another sk to be created for the same IP:PORT.
There is no need to re-create a similar usage in the BPF side.
When a SO_REUSEPORT is deleted from the "reuse->socks[]" (e.g. "close()"),
it will notify the bpf map to remove it from the map also. It is
done through "bpf_sk_reuseport_detach()" and it will only be called
if >=1 of the "reuse->sock[]" has ever been added to a bpf map.
The map_update()/map_delete() has to be in-sync with the
"reuse->socks[]". Hence, the same "reuseport_lock" used
by "reuse->socks[]" has to be used here also. Care has
been taken to ensure the lock is only acquired when the
adding sk passes some strict tests. and
freeing the map does not require the reuseport_lock.
The reuseport_array will also support lookup from the syscall
side. It will return a sock_gen_cookie(). The sock_gen_cookie()
is on-demand (i.e. a sk's cookie is not generated until the very
first map_lookup_elem()).
The lookup cookie is 64bits but it goes against the logical userspace
expectation on 32bits sizeof(fd) (and as other fd based bpf maps do also).
It may catch user in surprise if we enforce value_size=8 while
userspace still pass a 32bits fd during update. Supporting different
value_size between lookup and update seems unintuitive also.
We also need to consider what if other existing fd based maps want
to return 64bits value from syscall's lookup in the future.
Hence, reuseport_array supports both value_size 4 and 8, and
assuming user will usually use value_size=4. The syscall's lookup
will return ENOSPC on value_size=4. It will will only
return 64bits value from sock_gen_cookie() when user consciously
choose value_size=8 (as a signal that lookup is desired) which then
requires a 64bits value in both lookup and update.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The current map_check_btf() in BPF_MAP_TYPE_ARRAY rejects
'> map->value_size' to ensure map_seq_show_elem() will not
access things beyond an array element.
Yonghong suggested that using '!=' is a more correct
check. The 8 bytes round_up on value_size is stored
in array->elem_size. Hence, using '!=' on map->value_size
is a proper check.
This patch also adds new tests to check the btf array
key type and value type. Two of these new tests verify
the btf's value_size (the change in this patch).
It also fixes two existing tests that wrongly encoded
a btf's type size (pprint_test) and the value_type_id (in one
of the raw_tests[]). However, that do not affect these two
BTF verification tests before or after this test changes.
These two tests mainly failed at array creation time after
this patch.
Fixes: a26ca7c982 ("bpf: btf: Add pretty print support to the basic arraymap")
Suggested-by: Yonghong Song <yhs@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In "struct bpf_map_info", the name "btf_id", "btf_key_id" and "btf_value_id"
could cause confusion because the "id" of "btf_id" means the BPF obj id
given to the BTF object while
"btf_key_id" and "btf_value_id" means the BTF type id within
that BTF object.
To make it clear, btf_key_id and btf_value_id are
renamed to btf_key_type_id and btf_value_type_id.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Relying on map_release hook to decrement the reference counts when a
map is removed only works if the map is not being pinned. In the
pinned case the ref is decremented immediately and the BPF programs
released. After this BPF programs may not be in-use which is not
what the user would expect.
This patch moves the release logic into bpf_map_put_uref() and brings
sockmap in-line with how a similar case is handled in prog array maps.
Fixes: 3d9e952697 ("bpf: sockmap, fix leaking maps with attached but not detached progs")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch adds pretty print support to the basic arraymap.
Support for other bpf maps can be added later.
This patch adds new attrs to the BPF_MAP_CREATE command to allow
specifying the btf_fd, btf_key_id and btf_value_id. The
BPF_MAP_CREATE can then associate the btf to the map if
the creating map supports BTF.
A BTF supported map needs to implement two new map ops,
map_seq_show_elem() and map_check_btf(). This patch has
implemented these new map ops for the basic arraymap.
It also adds file_operations, bpffs_map_fops, to the pinned
map such that the pinned map can be opened and read.
After that, the user has an intuitive way to do
"cat bpffs/pathto/a-pinned-map" instead of getting
an error.
bpffs_map_fops should not be extended further to support
other operations. Other operations (e.g. write/key-lookup...)
should be realized by the userspace tools (e.g. bpftool) through
the BPF_OBJ_GET_INFO_BY_FD, map's lookup/update interface...etc.
Follow up patches will allow the userspace to obtain
the BTF from a map-fd.
Here is a sample output when reading a pinned arraymap
with the following map's value:
struct map_value {
int count_a;
int count_b;
};
cat /sys/fs/bpf/pinned_array_map:
0: {1,2}
1: {3,4}
2: {5,6}
...
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
syszbot managed to trigger RCU detected stalls in
bpf_array_free_percpu()
It takes time to allocate a huge percpu map, but even more time to free
it.
Since we run in process context, use cond_resched() to yield cpu if
needed.
Fixes: a10423b87a ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
syzkaller recently triggered OOM during percpu map allocation;
while there is work in progress by Dennis Zhou to add __GFP_NORETRY
semantics for percpu allocator under pressure, there seems also a
missing bpf_map_precharge_memlock() check in array map allocation.
Given today the actual bpf_map_charge_memlock() happens after the
find_and_alloc_map() in syscall path, the bpf_map_precharge_memlock()
is there to bail out early before we go and do the map setup work
when we find that we hit the limits anyway. Therefore add this for
array map as well.
Fixes: 6c90598174 ("bpf: pre-allocate hash map elements")
Fixes: a10423b87a ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Arraymap was not converted to use bpf_map_init_from_attr()
to avoid merge conflicts with emergency fixes. Do it now.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Use the new callback to perform allocation checks for array maps.
The fd maps don't need a special allocation callback, they only
need a special check callback.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
syzkaller tried to alloc a map with 0xfffffffd entries out of a userns,
and thus unprivileged. With the recently added logic in b2157399cc
("bpf: prevent out-of-bounds speculation") we round this up to the next
power of two value for max_entries for unprivileged such that we can
apply proper masking into potentially zeroed out map slots.
However, this will generate an index_mask of 0xffffffff, and therefore
a + 1 will let this overflow into new max_entries of 0. This will pass
allocation, etc, and later on map access we still enforce on the original
attr->max_entries value which was 0xfffffffd, therefore triggering GPF
all over the place. Thus bail out on overflow in such case.
Moreover, on 32 bit archs roundup_pow_of_two() can also not be used,
since fls_long(max_entries - 1) can result in 32 and 1UL << 32 in 32 bit
space is undefined. Therefore, do this by hand in a 64 bit variable.
This fixes all the issues triggered by syzkaller's reproducers.
Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Reported-by: syzbot+b0efb8e572d01bce1ae0@syzkaller.appspotmail.com
Reported-by: syzbot+6c15e9744f75f2364773@syzkaller.appspotmail.com
Reported-by: syzbot+d2f5524fb46fd3b312ee@syzkaller.appspotmail.com
Reported-by: syzbot+61d23c95395cc90dbc2b@syzkaller.appspotmail.com
Reported-by: syzbot+0d363c942452cca68c01@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Under speculation, CPUs may mis-predict branches in bounds checks. Thus,
memory accesses under a bounds check may be speculated even if the
bounds check fails, providing a primitive for building a side channel.
To avoid leaking kernel data round up array-based maps and mask the index
after bounds check, so speculated load with out of bounds index will load
either valid value from the array or zero from the padded area.
Unconditionally mask index for all array types even when max_entries
are not rounded to power of 2 for root user.
When map is created by unpriv user generate a sequence of bpf insns
that includes AND operation to make sure that JITed code includes
the same 'index & index_mask' operation.
If prog_array map is created by unpriv user replace
bpf_tail_call(ctx, map, index);
with
if (index >= max_entries) {
index &= map->index_mask;
bpf_tail_call(ctx, map, index);
}
(along with roundup to power 2) to prevent out-of-bounds speculation.
There is secondary redundant 'if (index >= max_entries)' in the interpreter
and in all JITs, but they can be optimized later if necessary.
Other array-like maps (cpumap, devmap, sockmap, perf_event_array, cgroup_array)
cannot be used by unpriv, so no changes there.
That fixes bpf side of "Variant 1: bounds check bypass (CVE-2017-5753)" on
all architectures with and without JIT.
v2->v3:
Daniel noticed that attack potentially can be crafted via syscall commands
without loading the program, so add masking to those paths as well.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
There were quite a few overlapping sets of changes here.
Daniel's bug fix for off-by-ones in the new BPF branch instructions,
along with the added allowances for "data_end > ptr + x" forms
collided with the metadata additions.
Along with those three changes came veritifer test cases, which in
their final form I tried to group together properly. If I had just
trimmed GIT's conflict tags as-is, this would have split up the
meta tests unnecessarily.
In the socketmap code, a set of preemption disabling changes
overlapped with the rename of bpf_compute_data_end() to
bpf_compute_data_pointers().
Changes were made to the mv88e6060.c driver set addr method
which got removed in net-next.
The hyperv transport socket layer had a locking change in 'net'
which overlapped with a change of socket state macro usage
in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce the map read/write flags to the eBPF syscalls that returns the
map fd. The flags is used to set up the file mode when construct a new
file descriptor for bpf maps. To not break the backward capability, the
f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise
it should be O_RDONLY or O_WRONLY. When the userspace want to modify or
read the map content, it will check the file mode to see if it is
allowed to make the change.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
PCPU_MIN_UNIT_SIZE is an implementation detail of the percpu
allocator. Given we support __GFP_NOWARN now, lets just let
the allocation request fail naturally instead. The two call
sites from BPF mistakenly assumed __GFP_NOWARN would work, so
no changes needed to their actual __alloc_percpu_gfp() calls
which use the flag already.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does not impact existing functionalities.
It contains the changes in perf event area needed for
subsequent bpf_perf_event_read_value and
bpf_perf_prog_read_value helpers.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid two successive functions calls for the map in map lookup, first
is the bpf_map_lookup_elem() helper call, and second the callback via
map->ops->map_lookup_elem() to get to the map in map implementation.
Implementation inlines array and htab flavor for map in map lookups.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current map creation API does not allow to provide the numa-node
preference. The memory usually comes from where the map-creation-process
is running. The performance is not ideal if the bpf_prog is known to
always run in a numa node different from the map-creation-process.
One of the use case is sharding on CPU to different LRU maps (i.e.
an array of LRU maps). Here is the test result of map_perf_test on
the INNER_LRU_HASH_PREALLOC test if we force the lru map used by
CPU0 to be allocated from a remote numa node:
[ The machine has 20 cores. CPU0-9 at node 0. CPU10-19 at node 1 ]
># taskset -c 10 ./map_perf_test 512 8 1260000 8000000
5:inner_lru_hash_map_perf pre-alloc 1628380 events per sec
4:inner_lru_hash_map_perf pre-alloc 1626396 events per sec
3:inner_lru_hash_map_perf pre-alloc 1626144 events per sec
6:inner_lru_hash_map_perf pre-alloc 1621657 events per sec
2:inner_lru_hash_map_perf pre-alloc 1621534 events per sec
1:inner_lru_hash_map_perf pre-alloc 1620292 events per sec
7:inner_lru_hash_map_perf pre-alloc 1613305 events per sec
0:inner_lru_hash_map_perf pre-alloc 1239150 events per sec #<<<
After specifying numa node:
># taskset -c 10 ./map_perf_test 512 8 1260000 8000000
5:inner_lru_hash_map_perf pre-alloc 1629627 events per sec
3:inner_lru_hash_map_perf pre-alloc 1628057 events per sec
1:inner_lru_hash_map_perf pre-alloc 1623054 events per sec
6:inner_lru_hash_map_perf pre-alloc 1616033 events per sec
2:inner_lru_hash_map_perf pre-alloc 1614630 events per sec
4:inner_lru_hash_map_perf pre-alloc 1612651 events per sec
7:inner_lru_hash_map_perf pre-alloc 1609337 events per sec
0:inner_lru_hash_map_perf pre-alloc 1619340 events per sec #<<<
This patch adds one field, numa_node, to the bpf_attr. Since numa node 0
is a valid node, a new flag BPF_F_NUMA_NODE is also added. The numa_node
field is honored if and only if the BPF_F_NUMA_NODE flag is set.
Numa node selection is not supported for percpu map.
This patch does not change all the kmalloc. F.e.
'htab = kzalloc()' is not changed since the object
is small enough to stay in the cache.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS.
The lookup returns a prog-id or map-id to the userspace.
The userspace can then use the BPF_PROG_GET_FD_BY_ID
or BPF_MAP_GET_FD_BY_ID to get a fd.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all
perf_event types, including HW_CACHE, RAW, and dynamic pmu events.
Only tracepoint/kprobe events are treated differently which require
BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly.
Also add support for reading all event counters using
bpf_perf_event_read() helper.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. Latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and if not, bails out, which is currently the case for lpm
since it exposes always 0 as flags.
Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure to not
miss to copy them over at a later point in time when we add actual
flags for them to use.
Fixes: b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When iterating through a map, we need to find a key that does not exist
in the map so map_get_next_key will give us the first key of the map.
This often requires a lot of guessing in production systems.
This patch makes map_get_next_key return the first key when the key
pointer in the parameter is NULL.
Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to have struct bpf_map_type_list since
it just contains a list_head, the type, and the ops
pointer. Since the types are densely packed and not
actually dynamically registered, it's much easier and
smaller to have an array of type->ops pointer. Also
initialize this array statically to remove code needed
to initialize it.
In order to save duplicating the list, move it to the
types header file added by the previous patch and
include it in the same fashion.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a few helper funcs to enable map-in-map
support (i.e. outer_map->inner_map). The first outer_map type
BPF_MAP_TYPE_ARRAY_OF_MAPS is also added in this patch.
The next patch will introduce a hash of maps type.
Any bpf map type can be acted as an inner_map. The exception
is BPF_MAP_TYPE_PROG_ARRAY because the extra level of
indirection makes it harder to verify the owner_prog_type
and owner_jited.
Multi-level map-in-map is not supported (i.e. map->map is ok
but not map->map->map).
When adding an inner_map to an outer_map, it currently checks the
map_type, key_size, value_size, map_flags, max_entries and ops.
The verifier also uses those map's properties to do static analysis.
map_flags is needed because we need to ensure BPF_PROG_TYPE_PERF_EVENT
is using a preallocated hashtab for the inner_hash also. ops and
max_entries are needed to generate inlined map-lookup instructions.
For simplicity reason, a simple '==' test is used for both map_flags
and max_entries. The equality of ops is implied by the equality of
map_type.
During outer_map creation time, an inner_map_fd is needed to create an
outer_map. However, the inner_map_fd's life time does not depend on the
outer_map. The inner_map_fd is merely used to initialize
the inner_map_meta of the outer_map.
Also, for the outer_map:
* It allows element update and delete from syscall
* It allows element lookup from bpf_prog
The above is similar to the current fd_array pattern.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix in verifier:
For the same bpf_map_lookup_elem() instruction (i.e. "call 1"),
a broken case is "a different type of map could be used for the
same lookup instruction". For example, an array in one case and a
hashmap in another. We have to resort to the old dynamic call behavior
in this case. The fix is to check for collision on insn_aux->map_ptr.
If there is collision, don't inline the map lookup.
Please see the "do_reg_lookup()" in test_map_in_map_kern.c in the later
patch for how-to trigger the above case.
Simplifications on array_map_gen_lookup():
1. Calculate elem_size from map->value_size. It removes the
need for 'struct bpf_array' which makes the later map-in-map
implementation easier.
2. Remove the 'elem_size == 1' test
Fixes: 81ed18ab30 ("bpf: add helper inlining infra and optimize map_array lookup")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Optimize bpf_call -> bpf_map_lookup_elem() -> array_map_lookup_elem()
into a sequence of bpf instructions.
When JIT is on the sequence of bpf instructions is the sequence
of native cpu instructions with significantly faster performance
than indirect call and two function's prologue/epilogue.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
All map types and prog types are registered to the BPF core through
bpf_register_map_type() and bpf_register_prog_type() during init and
remain unchanged thereafter. As by design we don't (and never will)
have any pluggable code that can register to that at any later point
in time, lets mark all the existing bpf_{map,prog}_type_list objects
in the tree as __ro_after_init, so they can be moved to read-only
section from then onwards.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds two helpers, bpf_map_area_alloc() and bpf_map_area_free(),
that are to be used for map allocations. Using kmalloc() for very large
allocations can cause excessive work within the page allocator, so i) fall
back earlier to vmalloc() when the attempt is considered costly anyway,
and even more importantly ii) don't trigger OOM killer with any of the
allocators.
Since this is based on a user space request, for example, when creating
maps with element pre-allocation, we really want such requests to fail
instead of killing other user space processes.
Also, don't spam the kernel log with warnings should any of the allocations
fail under pressure. Given that, we can make backend selection in
bpf_map_area_alloc() generic, and convert all maps over to use this API
for spots with potentially large allocation requests.
Note, replacing the one kmalloc_array() is fine as overflow checks happen
earlier in htab_map_alloc(), since it must also protect the multiplication
for vmalloc() should kmalloc_array() fail.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 01b3f52157 ("bpf: fix allocation warnings in bpf maps and
integer overflow") has added checks for the maximum allocateable size.
It (ab)used KMALLOC_SHIFT_MAX for that purpose.
While this is not incorrect it is not very clean because we already have
KMALLOC_MAX_SIZE for this very reason so let's change both checks to use
KMALLOC_MAX_SIZE instead.
The original motivation for using KMALLOC_SHIFT_MAX was to work around
an incorrect KMALLOC_MAX_SIZE which could lead to allocation warnings
but it is no longer needed since "slab: make sure that KMALLOC_MAX_SIZE
will fit into MAX_ORDER".
Link: http://lkml.kernel.org/r/20161220130659.16461-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a bpf helper that's similar to the skb_in_cgroup helper to check
whether the probe is currently executing in the context of a specific
subset of the cgroupsv2 hierarchy. It does this based on membership test
for a cgroup arraymap. It is invalid to call this in an interrupt, and
it'll return an error. The helper is primarily to be used in debugging
activities for containers, where you may have multiple programs running in
a given top-level "container".
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Should have been obvious, only called from bpf() syscall via map_update_elem()
that calls bpf_fd_array_map_update_elem() under RCU read lock and thus this
must also be in GFP_ATOMIC, of course.
Fixes: 3b1efb196e ("bpf, maps: flush own entries on perf map release")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>